I recently started using DRBD (Distributed Replicated Block Device) on Debian Linux to have a setup in which there are two servers: One server which hosts some virtual machines, and its hot-standby companion which holds exactly the same data and can take over if the hardware of the master server dies.
Now, it is clear how to setup DRBD for the RAID array which holds all the data — DRBD’s documentation is really good. What remained unclear to me, though, is how I can also use DRBD for the root file system. Otherwise, I’d need to put in some extra effort to remember to replicate all root filesystem changes, which makes the whole setup much more complex to use.
I suspect people are deploying machines like this with root filesystems that are centrally managed by puppet or similar.
Instead, I decided to also use DRBD for the root device. While that setup is largely undocumented and not recommended on the DRBD mailing list, for experienced Linux administrators, it is not THAT complex. Essentially, you need to shrink the existing root filesystem, create the DRBD metadata and then change the initramfs so that it will start DRBD before mounting the root filesystem.
Shrinking the root filesystem
To calculate the size to which you have to shrink the existing filesystem, you can use the following script which performs the calculation documented in the DRBD manual:
#!/bin/bash which bc >/dev/null 2>&1 if [ ! $? -eq 0 ]; then echo "Error: bc is not installed" exit 1 fi if [ $# -lt 1 ]; then echo "Error: Please supply block device path" echo "Eg. /dev/vg1/backups" exit 1 fi DEVICE=$1 SECTOR_SIZE=$( blockdev --getss $DEVICE ) SECTORS=$( blockdev --getsz $DEVICE ) MD_SIZE=$( echo "((($SECTORS + (2^18)-1) / 262144 * 8) + 72)" | bc ) FS_SIZE=$( echo "$SECTORS - $MD_SIZE" | bc ) MD_SIZE_MB=$( echo "($MD_SIZE / 4 / $SECTOR_SIZE) + 1" | bc ) FS_SIZE_MB=$( echo "($FS_SIZE / 4 / $SECTOR_SIZE)" | bc ) echo "Filesystem: $FS_SIZE_MB MiB" echo "Filesystem: $FS_SIZE Sectors" echo "Meta Data: $MD_SIZE_MB MiB" echo "Meta Data: $MD_SIZE Sectors" echo "--" echo "Resize commands: resize2fs -p "$DEVICE $FS_SIZE_MB"M"
You might need to boot the system using a live system so that you can shrink the filesystem. ext4 for example does not support online shrinking.
Configure the DRBD resource
After rebooting into the system with the shrinked root filesystem, you need to configure the DRBD resource itself. Here is what I use:
cat > /etc/drbd.d/root.res <<'EOT' resource root { handlers { pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh"; pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh"; local-io-error "/usr/lib/drbd/notify-io-error.sh"; } startup { become-primary-on master; # Wait 10 seconds on boot until the peer connects. wfc-timeout 10; } net { # While DRBD uses TCP, it might not detect all errors when # checksum offloading is enabled. CRC32 is computationally # cheap enough to just turn it on. data-integrity-alg crc32c; } syncer { rate 100M; verify-alg crc32c; } on master { device /dev/drbd0; disk /dev/vda2; address 192.168.1.10:7789; meta-disk internal; } on slave { device /dev/drbd0; disk /dev/vda2; address 192.168.1.20:7789; meta-disk internal; } } EOT
Configuring the initramfs hook
The first script we create is the one which will be placed in the initramfs itself. It needs to set the correct hostname, setup the ethernet interface, possibly start mdadm, then create the DRBD devices and finally mount the root filesystem:
cat > /usr/share/initramfs-tools/scripts/drbd <<'EOT' # vim:ts=4:sw=4:noet # DRBD mounting retry_nr=0 do_drbdmount() { configure_networking [ "$quiet" != "y" ] && log_begin_msg "Running /scripts/drbd-premount" run_scripts /scripts/drbd-premount [ "$quiet" != "y" ] && log_end_msg ifconfig eth0 up ifconfig eth0 192.168.1.10 netmask 255.255.255.0 hostname master # In case you are using mdraid: #mdadm --assemble --scan /sbin/drbdadm up all /sbin/drbdadm sh-b-pri all for x in $(cat /proc/cmdline); do case $x in drbdroot=*) DRBDROOT="${x#drbdroot=}" ;; esac done mount -t ext4 ${DRBDROOT} ${rootmnt} } mountroot() { [ "$quiet" != "y" ] && log_begin_msg "Running /scripts/drbd-top" run_scripts /scripts/drbd-top [ "$quiet" != "y" ] && log_end_msg modprobe drbd # For DHCP modprobe af_packet wait_for_udev 10 # Default delay is around 180s delay=${ROOTDELAY:-180} # loop until nfsmount succeeds do_drbdmount while [ ${retry_nr} -lt ${delay} ] && [ ! -e ${rootmnt}${init} ]; do [ "$quiet" != "y" ] && log_begin_msg "Retrying drbd mount" /bin/sleep 1 do_drbdmount retry_nr=$(( ${retry_nr} + 1 )) [ "$quiet" != "y" ] && log_end_msg done [ "$quiet" != "y" ] && log_begin_msg "Running /scripts/drbd-bottom" run_scripts /scripts/drbd-bottom [ "$quiet" != "y" ] && log_end_msg } EOT
After reading the script, it should be clear to you why such a script is not normally included in distributions nor recommended: The dependencies are hard to set up in a generic way (e.g. configuring the network, starting RAID arrays, etc.).
The second script will run every time we generate a new initramfs and include the necessary tools and files.
cat > /usr/share/initramfs-tools/hooks/drbd <<'EOT' #!/bin/sh PREREQ="" prereqs() { echo "$PREREQ" } case $1 in prereqs) prereqs exit 0 ;; esac . /usr/share/initramfs-tools/hook-functions copy_exec /sbin/drbdadm copy_exec /sbin/drbdmeta copy_exec /sbin/drbdsetup cp -R /etc/drbd.* "${DESTDIR}/etc/" mkdir -p "${DESTDIR}/var/lib/drbd" cp -p /var/lib/drbd/node_id "${DESTDIR}/var/lib/drbd/node_id" exit 0 EOt
Afterwards, use update-initramfs -u
to generate a new initramfs.
You can verify that the new files are included in the initramfs by using
lsinitramfs /boot/initrd.img-$(uname -r)
.
Without any further changes, nothing will change when you reboot.
Creating the metadata (once)
An easy way to create the metadata is to stop booting in the initramfs and use the
provided shell. Reboot the machine, then, in grub, add the parameters
break=premount boot=drbd drbdroot=/dev/drbd0
, then run the
following commands in the resulting shell:
modprobe drbd ifconfig eth0 up ifconfig eth0 192.168.1.10 netmask 255.255.255.0 hostname master drbdadm up root drbdadm -- --overwrite-data-of-peer primary root mount -t ext4 /dev/drbd0 /root exit
Afterwards, your system should boot normally.
Boot parameters
To make the changes persistent, modify GRUB_CMDLINE_LINUX_DEFAULT
in /etc/default/grub
to include boot=drbd
drbdroot=/dev/drbd0
. Afterwards, run update-grub
.
That’s it. Enjoy your root on DRBD :-).
I run a blog since 2005, spreading knowledge and experience for almost 20 years! :)
If you want to support my work, you can buy me a coffee.
Thank you for your support! ❤️