I decided to put together this howto because I spent practically an entire day at work trying to get this to work without needing a full resync after it was all said and done. So a quick run through of the setup and then off we go.
Note: This HOWTO describes shrinking a volume but will work exactly the same … no-resync necessary … for growing a DRBD volume backed by LVM 😀
I have 3 CentOS 5.1 domains (1x dom0 and 2x domU). The dom0 is running kernel 2.6.18-53.1.21.el5xen and Xen 3.0.3, and the 2 domUs are running kernel 2.6.18-53.1.19.el5xen and DRBD 8.2.5-1.el5. All of this came from the standard CentOS 5 yum repos. Each domU has two VBDs, namely /dev/xvda and /dev/xvdb. The OS is on /dev/xvda and the backing for DRBD is /dev/xvdb. Both of these are mapped to individual LVs on the dom0: /dev/vg0/node1 -> /dev/xvda and /dev/vg0/node1-drbd -> /dev/xvdb on node1, and /dev/vg0/node2 -> /dev/xvda and /dev/vg0/node2-drbd -> /dev/xvdb on node2. The filesystem on /dev/drbd0 is ext3 and stays at 512MB for the duration of this howto. The DRBD device itself is currently 3GB and we want to shrink it to 2GB.
Here’s my drbd.conf for reference:
resource res {
  protocol C;
  startup {
    wfc-timeout      0;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
  }
  syncer {
    rate 100M;   # we're using 1Gbps crossover link
  }
  on node1.domain.com {
    device    /dev/drbd0;
    disk      /dev/xvdb;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on node2.domain.com {
    device    /dev/drbd0;
    disk      /dev/xvdb;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
Whew … ready, steady go!
!!! READ ME — YOU ARE PROCEEDING BEYOND THIS POINT AT YOUR OWN RISK !!! THE IT DEPARTMENT CAN NOT AND WILL NOT BE HELD RESPONSIBLE / LIABLE / ACCOUNTABLE FOR THE RESULTS OF ANY OF YOUR ACTIONS. THIS IS PURELY AN EDUCATIONAL WRITING AND SHOULD NOT BE USED IN PRODUCTION — READ ME !!!
Now that I’ve scared the $h!t out of you … please let us continue.
0. Ensure your filesystem is smaller than the target size of the resource
In this example, our filesystem is 512MB, which is already smaller than our target resource size of 2GB. Whenever you attempt this, simply resize your filesystem down below your target (say, 1GB or so smaller than your target size); when everything is said and done, you can simply grow the filesystem back up to its maximum possible size.
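As a sketch of what step 0 might look like on the current Primary (the mount point and the 1GB margin are my assumptions, not anything DRBD requires; adjust to your setup):

```shell
# Helper: filesystem size in kB for a given target resource size in GB,
# leaving a 1 GB safety margin below the target (assumption, not a DRBD rule).
fs_kb_for_target() {
  echo $(( $1 * 1024 * 1024 - 1024 * 1024 ))
}
fs_kb_for_target 2    # 2 GB target -> prints 1048576 kB (1 GB) for the filesystem

# On the current Primary (hypothetical mount point /mnt/drbd):
# umount /mnt/drbd
# e2fsck -f /dev/drbd0
# resize2fs /dev/drbd0 1048576K
```

ext3 cannot be shrunk while mounted, which is why the umount and the forced fsck come before the resize2fs.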
1. Take down the resource on each node
[root@node1 ~]# drbdadm down res
[root@node2 ~]# drbdadm down res
2. Dump the resource metadata to disk on each node
[root@node1 ~]# drbdadm dump-md res > /tmp/metadata
[root@node2 ~]# drbdadm dump-md res > /tmp/metadata
Do not dump the metadata on one node and simply copy the dump file to the peer; this will not work. You must dump the resource metadata on each node!
3. Detach the VBDs from the domUs
To resize the LV without restarting the domU, you have to detach the VBD from the domU, resize the LV in the dom0, then reattach the VBD to the domU. To detach a VBD, you need its dev-id, which can be found using xm block-list:
[root@dom0 ~]# xm block-list node1
Vdev   BE  handle  state  evt-ch  ring-ref  BE-path
51713  0   0       4      6       8         /local/domain/0/backend/vbd/57/51713
51728  0   0       4      8       1377      /local/domain/0/backend/vbd/57/51728
[root@dom0 ~]# cat /etc/xen/node1 | grep disk
disk = [ "phy:/dev/vg0/node1,xvda,w", "phy:/dev/vg0/node1-drbd,xvdb,w" ]
From what I can tell, the ordering is linear with your domU configuration: the first device in the xm block-list output corresponds to the first device in your configuration, the second corresponds to the second, and so forth. Therefore, in my configuration, domU node1’s VBD /dev/xvda has dev-id 51713 and its VBD /dev/xvdb has dev-id 51728.
Same thing goes for domU node2:
[root@dom0 ~]# xm block-list node2
Vdev   BE  handle  state  evt-ch  ring-ref  BE-path
51713  0   0       4      6       8         /local/domain/0/backend/vbd/57/51713
51728  0   0       4      8       1377      /local/domain/0/backend/vbd/57/51728
[root@dom0 ~]# cat /etc/xen/node2 | grep disk
disk = [ "phy:/dev/vg0/node2,xvda,w", "phy:/dev/vg0/node2-drbd,xvdb,w" ]
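If you’d rather compute the dev-id than eyeball the ordering, Xen encodes /dev/xvdN devices under block major 202 with 16 minors per disk. A small sketch based on that assumption:

```shell
# dev-id for /dev/xvdN: 202*256 + 16*index, where xvda has index 0.
xvd_devid() {
  idx=$(( $(printf '%d' "'$1") - 97 ))   # drive letter -> 0-based index (a=0, b=1, ...)
  echo $(( 202 * 256 + 16 * idx ))
}
xvd_devid b    # prints 51728, matching the xvdb entry in xm block-list
```

Note that the formula gives 51712 for xvda while the listings above show 51713 (possibly because that Vdev points at a partition), so when the two disagree, trust the actual xm block-list output.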
Now you can detach…
[root@dom0 ~]# xm block-detach node1 51728
[root@dom0 ~]# xm block-detach node2 51728
4. Resize the LVs which are the backing for the VBDs
In my case, I have /dev/vg0/node1-drbd and /dev/vg0/node2-drbd. In this example, I am shrinking my DRBD device down to 2GB:
[root@dom0 ~]# lvresize -L2G /dev/vg0/node1-drbd
  WARNING: Reducing active logical volume to 2.00 GB
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce node1-drbd? [y/n]: y
  Reducing logical volume node1-drbd to 2.00 GB
  Logical volume node1-drbd successfully resized

[root@dom0 ~]# lvresize -L2G /dev/vg0/node2-drbd
  WARNING: Reducing active logical volume to 2.00 GB
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce node2-drbd? [y/n]: y
  Reducing logical volume node2-drbd to 2.00 GB
  Logical volume node2-drbd successfully resized
Answer y when it asks whether you “really want to reduce”, even though you “MAY DESTROY YOUR DATA” … the force of “The IT Department” is with you.
5. Attach the VBDs back to the domUs
This step is simple enough:
[root@dom0 ~]# xm block-attach node1 phy:/dev/vg0/node1-drbd xvdb w
[root@dom0 ~]# xm block-attach node2 phy:/dev/vg0/node2-drbd xvdb w
6. Restore the resource metadata
This is probably the most complicated step. The gist of it is that you have to create new metadata, then overwrite it with the old metadata you dumped earlier, with a slight modification. You’ll see …
First create new metadata on each node
[root@node1 ~]# drbdadm create-md res
md_offset 2147479552
al_offset 2147446784
bm_offset 2147381248

Found ext3 filesystem which uses 524288 kB
current configuration leaves usable 2097052 kB

==> This might destroy existing data! <==

Do you want to proceed?
[need to type 'yes' to confirm] yes

[root@node2 ~]# drbdadm create-md res
md_offset 2147479552
al_offset 2147446784
bm_offset 2147381248

Found ext3 filesystem which uses 524288 kB
current configuration leaves usable 2097052 kB

==> This might destroy existing data! <==

Do you want to proceed?
[need to type 'yes' to confirm] yes
If you get a warning about there already being a v08 style flexible-size internal meta data block …
v07 Magic number not found
v07 Magic number not found
You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/xvdb at byte offset 2147479552
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes
Go ahead and type ‘yes’ to overwrite it … again “The IT Department” is with you
Now, do you see the line where DRBD says “current configuration leaves usable 2097052 kB”? Well, we need that number. That number is the answer to life … err, the amount of data the device can use minus the size of the internal metadata DRBD needs. The only problem is it’s in kB and we need it in sectors. You’re right … simple conversion: x kB / 1024 = y MB, then y MB × 2048 = z sectors.
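Since 1 kB is two 512-byte sectors, the whole conversion also collapses to multiplying by 2. A one-liner sketch to compute the new sector count from the usable size that create-md reported:

```shell
usable_kb=2097052                  # "current configuration leaves usable ... kB"
la_size_sect=$(( usable_kb * 2 ))  # 1 kB = two 512-byte sectors
echo "$la_size_sect"               # prints 4194104
```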
Therefore: 2097052 kB / 1024 = 2047.90234375 MB, and 2047.90234375 MB × 2048 = 4194104 sectors. Now, before we can restore the original metadata dump, we have to update it with the new size. Then we can restore it:
[root@node1 ~]# sed -i -e 's/la-size-sect.*/la-size-sect 4194104;/g' /tmp/metadata
[root@node1 ~]# drbdmeta_cmd=$(drbdadm -d dump-md res)
[root@node1 ~]# ${drbdmeta_cmd/dump-md/restore-md} /tmp/metadata
Valid meta-data in place, overwrite?
[need to type 'yes' to confirm] yes
reinitialising
Successfully restored meta data

[root@node2 ~]# sed -i -e 's/la-size-sect.*/la-size-sect 4194104;/g' /tmp/metadata
[root@node2 ~]# drbdmeta_cmd=$(drbdadm -d dump-md res)
[root@node2 ~]# ${drbdmeta_cmd/dump-md/restore-md} /tmp/metadata
Valid meta-data in place, overwrite?
[need to type 'yes' to confirm] yes
reinitialising
Successfully restored meta data
7. Bring up the resource on each node
If all goes well … you should simply be able to issue a drbdadm up on each node:

[root@node1 ~]# drbdadm up res
[root@node2 ~]# drbdadm up res
And then verify everything is connected and UpToDate…
[root@node1 ~]# service drbd status
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by buildsvn@c5-i386-build, 2008-05-11 03:43:50
m:res  cs         st                   ds                 p  mounted  fstype
0:res  Connected  Secondary/Secondary  UpToDate/UpToDate  C

[root@node2 ~]# service drbd status
version: 8.2.5 (api:88/proto:86-88)
GIT-hash: 9faf052fdae5ef0c61b4d03890e2d2eab550610c build by buildsvn@c5-i386-build, 2008-05-11 03:43:50
m:res  cs         st                   ds                 p  mounted  fstype
0:res  Connected  Secondary/Secondary  UpToDate/UpToDate  C
Everything is a-ok and no need for a full-resync!!!
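At this point you can promote one node and grow the filesystem back up to fill the device, as mentioned in step 0. A sketch using the names from this howto (the DRY_RUN wrapper just prints the commands instead of running them, so you can review before executing):

```shell
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run drbdadm primary res     # promote one node to Primary
run resize2fs /dev/drbd0    # no size argument: grow ext3 to the maximum
```

Set DRY_RUN=0 on the node you promoted to actually run the commands.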
8. Conclusion
This was a simple example with a filesystem of 512MB, and we only shrank the resource from 3GB to 2GB, but this will work for larger configurations (I personally did it from 50GB down to 25GB earlier today). Maybe one day this won’t be as complicated, but for now this works and I have no complaints.