
Check storage performance with dd

This article includes some example commands to show you how to get a rough estimate of hard drive and RAID array performance using the dd command. Accurate measurements would have to take into account things like write amplification and system call overhead, which this guide does not attempt to do. For a tool that might give more accurate results, you might want to consider using hdparm.

To factor out performance issues related to the file system, these examples show how to test the performance of your drives and arrays at the block level by reading and writing directly to/from their block devices. WARNING: The write tests will destroy any data on the block devices against which they are run. Do not run them against any device that contains data you want to keep!

Four tests

Below are four example dd commands that can be used to test the performance of a block device:

  1. One process reading from $MY_DISK:
    # dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache
  2. One process writing to $MY_DISK:
    # dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct
  3. Two processes reading concurrently from $MY_DISK:
    # (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache &); (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache skip=200 &)
  4. Two processes writing concurrently to $MY_DISK:
    # (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct &); (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct skip=200 &)

– The iflag=nocache and oflag=direct parameters are important when performing the read and write tests (respectively) because without them the dd command will sometimes report the speed of transferring the data to/from RAM rather than to/from the hard drive (see the short demonstration after these notes).

– The values for the bs and count parameters are somewhat arbitrary and what I have chosen should be large enough to provide a decent average in most cases for current hardware.

– The null and zero devices are used for the destination and source (respectively) in the read and write tests because they are fast enough that they will not be the limiting factor in the performance tests.

– The skip=200 parameter on the second dd command in the concurrent read and write tests is to ensure that the two copies of dd are operating on different areas of the hard drive.
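
If you want to see the caching effect for yourself, read the same area of the disk twice without the iflag=nocache parameter; the repeat run will often report the speed of your system's RAM rather than your drive. Dropping the kernel's page cache brings the numbers back down (an optional demonstration; it assumes $MY_DISK is already set and that you are running as root):

# dd if=$MY_DISK of=/dev/null bs=1MiB count=200
# dd if=$MY_DISK of=/dev/null bs=1MiB count=200
# sync
# echo 3 > /proc/sys/vm/drop_caches
# dd if=$MY_DISK of=/dev/null bs=1MiB count=200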

16 examples

Below are demonstrations showing the results of running each of the above four tests against each of the following four block devices:

  1. MY_DISK=/dev/sda2 (used in examples 1-X)
  2. MY_DISK=/dev/sdb2 (used in examples 2-X)
  3. MY_DISK=/dev/md/stripped (used in examples 3-X)
  4. MY_DISK=/dev/md/mirrored (used in examples 4-X)

A video demonstration of these tests being run on a PC is provided at the end of this guide.

Begin by putting your computer into rescue mode to reduce the chances that disk I/O from background services might randomly affect your test results. WARNING: This will shut down all non-essential programs and services. Be sure to save your work before running these commands. You will need to know your root password to get into rescue mode. The passwd command, when run as the root user, will prompt you to (re)set your root account password.

$ sudo -i
# passwd
# setenforce 0
# systemctl rescue
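
When you have finished testing, you can return to normal operation either by rebooting (the restore section below ends with a reboot) or by switching back to your default target:

# systemctl default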

You might also want to temporarily disable logging to disk:

# sed -r -i.bak 's/^#?Storage=.*/Storage=none/' /etc/systemd/journald.conf
# systemctl restart systemd-journald.service

If you have a RAID1 swap device (these examples assume one named /dev/md/swap), it can be temporarily disabled and its member partitions used to perform the following tests:

# swapoff -a
# MY_DEVS=$(mdadm --detail /dev/md/swap | grep active | grep -o "/dev/sd.*")
# mdadm --stop /dev/md/swap
# mdadm --zero-superblock $MY_DEVS
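
Because the write tests will overwrite whatever is on these devices, it is worth double-checking exactly which partitions ended up in MY_DEVS before continuing:

# echo $MY_DEVS
# lsblk $MY_DEVS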

Example 1-1 (reading from sda)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 1)
# dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.7003 s, 123 MB/s

Example 1-2 (writing to sda)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 1)
# dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.67117 s, 125 MB/s

Example 1-3 (reading concurrently from sda)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 1)
# (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache &); (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.42875 s, 61.2 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.52614 s, 59.5 MB/s

Example 1-4 (writing concurrently to sda)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 1)
# (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct &); (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.2435 s, 64.7 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.60872 s, 58.1 MB/s

Example 2-1 (reading from sdb)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 2)
# dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.67285 s, 125 MB/s

Example 2-2 (writing to sdb)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 2)
# dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.67198 s, 125 MB/s

Example 2-3 (reading concurrently from sdb)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 2)
# (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache &); (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.52808 s, 59.4 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.57736 s, 58.6 MB/s

Example 2-4 (writing concurrently to sdb)

# MY_DISK=$(echo $MY_DEVS | cut -d ' ' -f 2)
# (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct &); (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.7841 s, 55.4 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 3.81475 s, 55.0 MB/s

Example 3-1 (reading from RAID0)

# mdadm --create /dev/md/stripped --homehost=any --metadata=1.0 --level=0 --raid-devices=2 $MY_DEVS
# MY_DISK=/dev/md/stripped
# dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 0.837419 s, 250 MB/s

Example 3-2 (writing to RAID0)

# MY_DISK=/dev/md/stripped
# dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 0.823648 s, 255 MB/s

Example 3-3 (reading concurrently from RAID0)

# MY_DISK=/dev/md/stripped
# (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache &); (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.31025 s, 160 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.80016 s, 116 MB/s

Example 3-4 (writing concurrently to RAID0)

# MY_DISK=/dev/md/stripped
# (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct &); (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.65026 s, 127 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.81323 s, 116 MB/s

Example 4-1 (reading from RAID1)

# mdadm --stop /dev/md/stripped
# mdadm --create /dev/md/mirrored --homehost=any --metadata=1.0 --level=1 --raid-devices=2 --assume-clean $MY_DEVS
# MY_DISK=/dev/md/mirrored
# dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.74963 s, 120 MB/s

Example 4-2 (writing to RAID1)

# MY_DISK=/dev/md/mirrored
# dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.74625 s, 120 MB/s

Example 4-3 (reading concurrently from RAID1)

# MY_DISK=/dev/md/mirrored
# (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache &); (dd if=$MY_DISK of=/dev/null bs=1MiB count=200 iflag=nocache skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.67171 s, 125 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 1.67685 s, 125 MB/s

Example 4-4 (writing concurrently to RAID1)

# MY_DISK=/dev/md/mirrored
# (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct &); (dd if=/dev/zero of=$MY_DISK bs=1MiB count=200 oflag=direct skip=200 &)
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 4.09666 s, 51.2 MB/s
200+0 records in
200+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 4.1067 s, 51.1 MB/s

Restore your swap device and journald configuration

# mdadm --stop /dev/md/stripped /dev/md/mirrored
# mdadm --create /dev/md/swap --homehost=any --metadata=1.0 --level=1 --raid-devices=2 $MY_DEVS
# mkswap /dev/md/swap
# swapon -a
# mv /etc/systemd/journald.conf.bak /etc/systemd/journald.conf
# systemctl restart systemd-journald.service
# reboot

Interpreting the results

Examples 1-1, 1-2, 2-1, and 2-2 show that each of my drives read and write at about 125 MB/s.

Examples 1-3, 1-4, 2-3, and 2-4 show that when two reads or two writes are done in parallel on the same drive, each process gets about half the drive’s bandwidth (roughly 60 MB/s).

The 3-x examples show the performance benefit of putting the two drives together in a RAID0 (data striping) array. The numbers, in all cases, show that the RAID0 array performs about twice as fast as either drive can on its own. The trade-off is that you are twice as likely to lose everything, because each drive contains only half the data. A three-drive array would perform three times as fast as a single drive (all drives being equal), but it would be thrice as likely to suffer a catastrophic failure.

The 4-x examples show that the performance of the RAID1 (data mirroring) array is similar to that of a single disk, except when multiple processes are reading concurrently (example 4-3). In that case, the performance of the RAID1 array is similar to that of the RAID0 array. This means that you will see a performance benefit with RAID1, but only when processes are reading concurrently: for example, when a process accesses a large number of files in the background while you are using a web browser or email client in the foreground. The main benefit of RAID1 is that your data is unlikely to be lost if a drive fails.

Video demo

Testing storage throughput using dd

Troubleshooting

If the above tests aren’t performing as you expect, you might have a bad or failing drive. Most modern hard drives have built-in Self-Monitoring, Analysis and Reporting Technology (SMART). If your drive supports it, the smartctl command can be used to query your hard drive for its internal statistics:

# smartctl --health /dev/sda
# smartctl --log=error /dev/sda
# smartctl -x /dev/sda
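
You can also ask the drive to run its built-in short self-test and review the result a few minutes later (this assumes your drive supports SMART self-tests):

# smartctl -t short /dev/sda
# smartctl -l selftest /dev/sda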

Another way that you might be able to tune your PC for better performance is by changing your I/O scheduler. Linux systems support several I/O schedulers and the current default for Fedora systems is the multiqueue variant of the deadline scheduler. The default performs very well overall and scales extremely well for large servers with many processors and large disk arrays. There are, however, a few more specialized schedulers that might perform better under certain conditions.

To view which I/O scheduler your drives are using, issue the following command:

$ for i in /sys/block/sd?/queue/scheduler; do echo "$i: $(<$i)"; done

You can change the scheduler for a drive by writing the name of the desired scheduler to the /sys/block/<device name>/queue/scheduler file:

# echo bfq > /sys/block/sda/queue/scheduler

You can make your changes permanent by creating a udev rule for your drive. The following example shows how to create a udev rule that will set all rotational drives to use the BFQ I/O scheduler:

# cat << END > /etc/udev/rules.d/60-ioscheduler-rotational.rules
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
END

Here is another example that sets all solid-state drives to use the none (no-op) I/O scheduler:

# cat << END > /etc/udev/rules.d/60-ioscheduler-solid-state.rules
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
END
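
The rules will be applied automatically the next time the devices are detected (for example, at boot). To apply them immediately, you can ask udev to reload its rules and re-trigger your block devices:

# udevadm control --reload
# udevadm trigger --subsystem-match=block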

Changing your I/O scheduler won’t affect the raw throughput of your devices, but it might make your PC seem more responsive by prioritizing the bandwidth for the foreground tasks over the background tasks or by eliminating unnecessary block reordering.




Mirror your System Drive using Software RAID

Nothing lasts forever. When it comes to the hardware in your PC, most of it can easily be replaced. There is, however, one special-case hardware component in your PC that is not as easy to replace as the rest — your hard disk drive.

Drive Mirroring

Your hard drive stores your personal data. Some of your data can be backed up automatically by scheduled backup jobs. But those jobs only scan selected files for changes, and trying to scan an entire drive that way would be very resource intensive. Also, anything that you’ve changed since your last backup will be lost if your drive fails. Drive mirroring is a better way to maintain a secondary copy of your entire hard drive. With drive mirroring, a secondary copy of all the data on your hard drive is maintained in real time.

An added benefit of live mirroring your hard drive to a secondary drive is that it can increase your computer’s performance, because read requests can be serviced by either drive. Since disk I/O is one of your computer’s main performance bottlenecks, the improvement can be quite significant.

Note that a mirror is not a backup. It only protects your data from being lost if one of your physical drives fails. Types of failures that drive mirroring, by itself, does not protect against include:

Some of the above can be addressed by other file system features that can be used in conjunction with drive mirroring. File system features that address the above types of failures include:

This guide will demonstrate one method of mirroring your system drive using the Multiple Disk and Device Administration (mdadm) toolset. Just for fun, this guide will show how to do the conversion without using any extra boot media (CDs, USB drives, etc). For more about the concepts and terminology related to the multiple device driver, you can skim the md man page:

$ man md

The Procedure

  1. Use sgdisk to (re)partition the extra drive that you have added to your computer:
    $ sudo -i
    # MY_DISK_1=/dev/sdb
    # sgdisk --zap-all $MY_DISK_1
    # test -d /sys/firmware/efi/efivars || sgdisk -n 0:0:+1MiB -t 0:ef02 -c 0:grub_1 $MY_DISK_1
    # sgdisk -n 0:0:+1GiB -t 0:ea00 -c 0:boot_1 $MY_DISK_1
    # sgdisk -n 0:0:+4GiB -t 0:fd00 -c 0:swap_1 $MY_DISK_1
    # sgdisk -n 0:0:0 -t 0:fd00 -c 0:root_1 $MY_DISK_1

    – If the drive that you will be using for the second half of the mirror in step 12 is smaller than this drive, then you will need to adjust down the size of the last partition so that the total size of all the partitions is not greater than the size of your second drive.
    – A few of the commands in this guide are prefixed with a test for the existence of an efivars directory. This is necessary because those commands are slightly different depending on whether your computer is BIOS-based or UEFI-based.
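    – If you are not sure which type of computer you have, you can run the same test on its own (a quick check built from the commands above):

    # test -d /sys/firmware/efi/efivars && echo UEFI || echo BIOS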

  2. Use mdadm to create RAID devices that use the new partitions to store their data:
    # mdadm --create /dev/md/boot --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/disk/by-partlabel/boot_1 missing
    # mdadm --create /dev/md/swap --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/disk/by-partlabel/swap_1 missing
    # mdadm --create /dev/md/root --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/disk/by-partlabel/root_1 missing
    # cat << END > /etc/mdadm.conf
    MAILADDR root
    AUTO +all
    DEVICE partitions
    END
    # mdadm --detail --scan >> /etc/mdadm.conf

    – The missing parameter tells mdadm to create an array with a missing member. You will add the other half of the mirror in step 14.
    – You should configure sendmail so you will be notified if a drive fails.
    – You can configure Evolution to monitor a local mail spool.
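    – At this point you can verify that the three arrays were created and are running; each will show as degraded because its second member is still missing:

    # cat /proc/mdstat
    # mdadm --detail /dev/md/root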

  3. Use dracut to update the initramfs:
    # dracut -f --add mdraid --add-drivers xfs

    – Dracut will include the /etc/mdadm.conf file you created in the previous section in your initramfs unless you build your initramfs with the hostonly option set to no. If you build your initramfs with the hostonly option set to no, then you should either manually include the /etc/mdadm.conf file, manually specify the UUID’s of the RAID arrays to assemble at boot time with the rd.md.uuid kernel parameter, or specify the rd.auto kernel parameter to have all RAID arrays automatically assembled and started at boot time. This guide will demonstrate the rd.auto option since it is the most generic.
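    – You can check that the mdraid support and your mdadm.conf made it into the new initramfs with dracut’s lsinitrd tool (an optional check; the exact output will vary):

    # lsinitrd | grep -E 'mdadm|mdraid'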

  4. Format the RAID devices:
    # mkfs -t vfat /dev/md/boot
    # mkswap /dev/md/swap
    # mkfs -t xfs /dev/md/root

    – The new Boot Loader Specification states “if the OS is installed on a disk with GPT disk label, and no ESP partition exists yet, a new suitably sized (let’s say 500MB) ESP should be created and should be used as $BOOT” and “$BOOT must be a VFAT (16 or 32) file system”.
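    – You can confirm that the new file systems were created with:

    # lsblk -f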

  5. Reboot and set the rd.auto, rd.break and single kernel parameters:
    # reboot

    – You may need to set your root password before rebooting so that you can get into single-user mode in step 7.
    – See “Making Temporary Changes to a GRUB 2 Menu” for directions on how to set kernel parameters on computers that use the GRUB 2 boot loader.

  6. Use the dracut shell to copy the root file system:
    # mkdir /newroot
    # mount /dev/md/root /newroot
    # shopt -s dotglob
    # cp -ax /sysroot/* /newroot
    # rm -rf /newroot/boot/*
    # umount /newroot
    # exit

    – The dotglob flag is set for this bash session so that the wildcard character will match hidden files.
    – Files are removed from the boot directory because they will be copied to a separate partition in the next step.
    – This copy operation is being done from the dracut shell to ensure that no processes are accessing the files while they are being copied.

  7. Use single-user mode to copy the non-root file systems:
    # mkdir /newroot
    # mount /dev/md/root /newroot
    # mount /dev/md/boot /newroot/boot
    # shopt -s dotglob
    # cp -Lr /boot/* /newroot/boot
    # test -d /newroot/boot/efi/EFI && mv /newroot/boot/efi/EFI/* /newroot/boot/efi && rmdir /newroot/boot/efi/EFI
    # test -d /sys/firmware/efi/efivars && ln -sfr /newroot/boot/efi/fedora/grub.cfg /newroot/etc/grub2-efi.cfg
    # cp -ax /home/* /newroot/home
    # exit

    – It is OK to run these commands in the dracut shell shown in the previous section instead of doing it from single-user mode. I’ve demonstrated using single-user mode to avoid having to explain how to mount the non-root partitions from the dracut shell.
    – The parameters being passed to the cp command for the boot directory are a little different because the VFAT file system doesn’t support symbolic links or Unix-style file permissions.
    – In rare cases, the rd.auto parameter is known to cause LVM to fail to assemble due to a race condition. If you see errors about your swap or home partition failing to mount when entering single-user mode, simply try again by repeating step 5 but omitting the rd.break parameter so that you will go directly to single-user mode.

  8. Update fstab on the new drive:
    # cat << END > /newroot/etc/fstab
    /dev/md/root / xfs defaults 0 0
    /dev/md/boot /boot vfat defaults 0 0
    /dev/md/swap swap swap defaults 0 0
    END

  9. Configure the boot loader on the new drive:
    # NEW_GRUB_CMDLINE_LINUX=$(cat /etc/default/grub | sed -n 's/^GRUB_CMDLINE_LINUX="\(.*\)"/\1/ p')
    # NEW_GRUB_CMDLINE_LINUX=${NEW_GRUB_CMDLINE_LINUX//rd.lvm.*([^ ])}
    # NEW_GRUB_CMDLINE_LINUX=${NEW_GRUB_CMDLINE_LINUX//resume=*([^ ])}
    # NEW_GRUB_CMDLINE_LINUX+=" selinux=0 rd.auto"
    # sed -i "/^GRUB_CMDLINE_LINUX=/s/=.*/=\"$NEW_GRUB_CMDLINE_LINUX\"/" /newroot/etc/default/grub

    – You can re-enable selinux after this procedure is complete. But you will have to relabel your file system first.
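    – The substitutions above rely on bash’s extended globbing (the *([^ ]) patterns). If the rd.lvm or resume parameters are not being removed, enable extglob, re-run the substitutions, and then verify the resulting line:

    # shopt -s extglob
    # grep ^GRUB_CMDLINE_LINUX= /newroot/etc/default/grub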

  10. Install the boot loader on the new drive:
    # sed -i '/^GRUB_DISABLE_OS_PROBER=.*/d' /newroot/etc/default/grub
    # echo "GRUB_DISABLE_OS_PROBER=true" >> /newroot/etc/default/grub
    # MY_DISK_1=$(mdadm --detail /dev/md/boot | grep active | grep -m 1 -o "/dev/sd.")
    # for i in dev dev/pts proc sys run; do mount -o bind /$i /newroot/$i; done
    # chroot /newroot env MY_DISK_1=$MY_DISK_1 bash --login
    # test -d /sys/firmware/efi/efivars || MY_GRUB_DIR=/boot/grub2
    # test -d /sys/firmware/efi/efivars && MY_GRUB_DIR=$(find /boot/efi -type d -name 'fedora' -print -quit)
    # test -e /usr/sbin/grub2-switch-to-blscfg && grub2-switch-to-blscfg --grub-directory=$MY_GRUB_DIR
    # grub2-mkconfig -o $MY_GRUB_DIR/grub.cfg
    # test -d /sys/firmware/efi/efivars && test /boot/grub2/grubenv -nt $MY_GRUB_DIR/grubenv && cp /boot/grub2/grubenv $MY_GRUB_DIR/grubenv
    # test -d /sys/firmware/efi/efivars || grub2-install "$MY_DISK_1"
    # logout
    # for i in run sys proc dev/pts dev; do umount /newroot/$i; done
    # test -d /sys/firmware/efi/efivars && efibootmgr -c -d "$MY_DISK_1" -p 1 -l "$(find /newroot/boot -name shimx64.efi -printf '/%P\n' -quit | sed 's!/!\\!g')" -L "Fedora RAID Disk 1"

    – The grub2-switch-to-blscfg command is optional. It is only supported on Fedora 29+.
    – The cp command above should not be necessary, but there appears to be a bug in the current version of grub which causes it to write to $BOOT/grub2/grubenv instead of $BOOT/efi/fedora/grubenv on UEFI systems.
    – You can use the following command to verify the contents of the grub.cfg file right after running the grub2-mkconfig command above:

    # sed -n '/BEGIN .*10_linux/,/END .*10_linux/ p' $MY_GRUB_DIR/grub.cfg

    – You should see references to mdraid and mduuid in the output from the above command if the RAID array was detected properly.
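    – On UEFI systems, you can also confirm that the new firmware boot entry was registered after running the efibootmgr command above:

    # efibootmgr -v | grep -i "Fedora RAID"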

  11. Boot off of the new drive:
    # reboot

    – How to select the new drive is system-dependent. It usually requires pressing one of the F12, F10, Esc or Del keys when you hear the System OK BIOS beep code.
    – On UEFI systems the boot loader on the new drive should be labeled “Fedora RAID Disk 1”.

  12. Remove all the volume groups and partitions from your old drive:
    # MY_DISK_2=/dev/sda
    # MY_VOLUMES=$(pvs | grep $MY_DISK_2 | awk '{print $2}' | tr "\n" " ")
    # test -n "$MY_VOLUMES" && vgremove $MY_VOLUMES
    # sgdisk --zap-all $MY_DISK_2

    WARNING: You want to make certain that everything is working properly on your new drive before you do this. A good way to verify that your old drive is no longer being used is to try booting your computer once without the old drive connected.
    – You can add another new drive to your computer instead of erasing your old one if you prefer.
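    – Before zapping the old drive, you can double-check that none of its partitions are still mounted or otherwise in use:

    # lsblk $MY_DISK_2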

  13. Create new partitions on your old drive to match the ones on your new drive:
    # test -d /sys/firmware/efi/efivars || sgdisk -n 0:0:+1MiB -t 0:ef02 -c 0:grub_2 $MY_DISK_2
    # sgdisk -n 0:0:+1GiB -t 0:ea00 -c 0:boot_2 $MY_DISK_2
    # sgdisk -n 0:0:+4GiB -t 0:fd00 -c 0:swap_2 $MY_DISK_2
    # sgdisk -n 0:0:0 -t 0:fd00 -c 0:root_2 $MY_DISK_2

    – It is important that the partitions match in size and type. I prefer to use the parted command to display the partition table because it supports setting the display unit:

    # parted /dev/sda unit MiB print
    # parted /dev/sdb unit MiB print

  14. Use mdadm to add the new partitions to the RAID devices:
    # mdadm --manage /dev/md/boot --add /dev/disk/by-partlabel/boot_2
    # mdadm --manage /dev/md/swap --add /dev/disk/by-partlabel/swap_2
    # mdadm --manage /dev/md/root --add /dev/disk/by-partlabel/root_2
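
    – The arrays will immediately begin copying data onto the newly added partitions. You can watch the progress of the synchronization with:

    # watch cat /proc/mdstat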

  15. Install the boot loader on your old drive:
    # test -d /sys/firmware/efi/efivars || grub2-install "$MY_DISK_2"
    # test -d /sys/firmware/efi/efivars && efibootmgr -c -d "$MY_DISK_2" -p 1 -l "$(find /boot -name shimx64.efi -printf "/%P\n" -quit | sed 's!/!\\!g')" -L "Fedora RAID Disk 2"

  16. Use mdadm to test that email notifications are working:
    # mdadm --monitor --scan --oneshot --test

As soon as your drives have finished synchronizing, you should be able to select either drive when restarting your computer and you will receive the same live-mirrored operating system. If either drive fails, mdmonitor will send an email notification. Recovering from a drive failure is now simply a matter of swapping out the bad drive with a new one and running a few sgdisk and mdadm commands to re-create the mirrors (steps 13 through 15). You will no longer have to worry about losing any data if a drive fails!

Video Demonstrations

Converting a UEFI PC to RAID1
Converting a BIOS PC to RAID1
  • TIP: Set the quality to 720p on the above videos for best viewing.

Managing RAID arrays with mdadm

Mdadm stands for Multiple Disk and Device Administration. It is a command line tool that can be used to manage software RAID arrays on your Linux PC. This article outlines the basics you need to get started with it.

The following five commands allow you to make use of mdadm’s most basic features:

  1. Create a RAID array:
    # mdadm --create /dev/md/test --homehost=any --metadata=1.0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  2. Assemble (and start) a RAID array:
    # mdadm --assemble /dev/md/test /dev/sda1 /dev/sdb1
  3. Stop a RAID array:
    # mdadm --stop /dev/md/test
  4. Delete a RAID array:
    # mdadm --zero-superblock /dev/sda1 /dev/sdb1
  5. Check the status of all assembled RAID arrays:
    # cat /proc/mdstat

Notes on features

mdadm --create

The create command shown above includes the following four parameters in addition to the create parameter itself and the device names:

  1. --homehost:
    By default, mdadm stores your computer’s name as an attribute of the RAID array. If your computer name does not match the stored name, the array will not automatically assemble. This feature is useful in server clusters that share hard drives because file system corruption usually occurs if multiple servers attempt to access the same drive at the same time. The name any is reserved and disables the homehost restriction.
  2. --metadata:
    mdadm reserves a small portion of each RAID device to store information about the RAID array itself. The metadata parameter specifies the format and location of the information. The value 1.0 indicates to use version-1 formatting and store the metadata at the end of the device.
  3. --level:
    The level parameter specifies how the data should be distributed among the underlying devices. Level 1 indicates each device should contain a complete copy of all the data. This level is also known as disk mirroring.
  4. --raid-devices:
    The raid-devices parameter specifies the number of devices that will be used to create the RAID array.

By using level=1 (mirroring) in combination with metadata=1.0 (store the metadata at the end of the device), you create a RAID1 array whose underlying devices appear normal if accessed without the aid of the mdadm driver. This is useful in the case of disaster recovery, because you can access the device even if the new system doesn’t support mdadm arrays. It’s also useful in case a program needs read-only access to the underlying device before mdadm is available. For example, the UEFI firmware in a computer may need to read the bootloader from the ESP before mdadm is started.
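
As a quick illustration of that last point, with the array stopped you could mount one of its member partitions directly and read your files from it. Keep such a mount read-only so the mirrors don’t fall out of sync (a sketch only; /mnt is used here as a convenient mount point):

# mdadm --stop /dev/md/test
# mount -o ro /dev/sda1 /mnt
# umount /mnt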

mdadm --assemble

The assemble command above fails if a member device is missing or corrupt. To force the RAID array to assemble and start when one of its members is missing, use the following command:

# mdadm --assemble --run /dev/md/test /dev/sda1

Other important notes

Avoid writing directly to any devices that underlie an mdadm RAID1 array. That causes the devices to become out of sync, and mdadm won’t know that they are. If you access a RAID1 array with a device that’s been modified out-of-band, you can cause file system corruption. If you modify a RAID1 device out-of-band and need to force the array to re-synchronize, delete the mdadm metadata from the device to be overwritten and then re-add it to the array as demonstrated below:

# mdadm --zero-superblock /dev/sdb1
# mdadm --assemble --run /dev/md/test /dev/sda1
# mdadm /dev/md/test --add /dev/sdb1

These commands completely overwrite the contents of sdb1 with the contents of sda1.

To have specific RAID arrays automatically activated when your computer starts, create an /etc/mdadm.conf configuration file.
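
A minimal way to generate that file is to capture the details of your currently assembled arrays (the same command used in the mirroring procedure above):

# mdadm --detail --scan >> /etc/mdadm.conf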

For the most up-to-date and detailed information, check the man pages:

$ man mdadm 
$ man mdadm.conf

The next article in this series will be a step-by-step guide on how to convert an existing single-disk Linux installation to a mirrored-disk installation that will continue running even if one of its hard drives suddenly stops working!