Saturday 13 November 2010

Disk management with Logical Volume Manager (LVM)

There is a lot of documentation online on how to use Logical Volume Manager (LVM), but I'd like to go over how I've been using it to illustrate some of its strengths and weaknesses.

The initial driving issue which made LVM a killer app for me was handling large disks. One system had an older SCSI RAID attached which only supported 2TB drives (a limitation of 32-bit LBA, I think), but the sum of the disks (14 x 300GB) was, well, bigger. The equipment basically let me carve the array into 2TB disks. Using LVM, I can add those disks as Physical Volumes (PVs) to a Volume Group (VG) and create Logical Volumes (LVs) of any size desired including, ultimately, the total capacity of the RAID.
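As a quick sanity check on that 2TB figure, here is a small sketch of the arithmetic. It assumes 512-byte sectors, which I have not verified for that particular RAID, and the function name is just made up for illustration.

```shell
# Largest disk addressable with a given LBA width, in TiB.
# Assumption: 512-byte sectors unless a second argument says otherwise.
lba_limit_tib() {
    bits=$1
    sector_bytes=${2:-512}
    echo $(( (1 << $bits) * $sector_bytes / (1 << 40) ))
}

lba_limit_tib 32    # prints 2
```

With 32 bits of sector addresses and 512-byte sectors you top out at 2 TiB, which matches what the equipment allowed.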

Another great feature of LVM is snapshots. Generally, a snapshot gives you a view of the filesystem frozen at a point in time, while general use continues unimpeded because subsequent changes are stored separately. So I can take a snapshot and then back up the snapshot, which assures that the filesystem (in the snapshot) is consistent from the time the backup starts to the time the backup finishes. Snapshots can also be used to simply roll back files to a previous state. For example, I take a snapshot, run a test application which modifies a file, then restore that file from the snapshot to revert.
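The backup case might look roughly like the sketch below. The VG and LV names, the 20G snapshot size, the mount point and the tarball path are all placeholders I picked for illustration, and the run helper is my own wrapper: by default it only prints the commands, so nothing touches your disks until you set DRY_RUN=0 and run it as root.

```shell
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Snapshot an LV, tar up the frozen view, then drop the snapshot.
# Arguments: volume group, logical volume, snapshot size, tarball path.
snapshot_backup() {
    vg=$1; lv=$2; snap_size=$3; dest=$4
    snap="${lv}_snap"
    mnt="/mnt/${snap}"

    # Reserve copy-on-write space from the VG's free extents.
    run lvcreate --snapshot --size "$snap_size" --name "$snap" "/dev/$vg/$lv" || return 1

    run mkdir -p "$mnt" &&
        run mount -o ro "/dev/$vg/$snap" "$mnt" &&
        run tar -czf "$dest" -C "$mnt" .
    status=$?

    # Clean up even if the backup itself failed.
    run umount "$mnt"
    run lvremove -f "/dev/$vg/$snap"
    return $status
}

snapshot_backup myvgblah mylv 20G /backup/bigfs.tar.gz
```

Rolling back a single file is the same idea: mount the snapshot read-only and copy the file back out of it.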

However, LVM snapshots aren't as elegant as they are on some platforms. To create a snapshot, you must first have some unallocated space in your VG. You then allocate that space to the "snapshot", where disk changes since the snapshot can be stored. The bummer, man, is that this is a fixed amount of space you have to have on hand, and if it fills up, your "snapshot" device fails. If you had, say, a long backup running, you have to restart that backup. Even with this limitation, however, snapshots are still pretty useful. You can sort of figure out the minimum size you need for a snapshot, and ultimately, if you have snapshot space roughly equal to the live system space, your snapshot should never fill up.
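One rough way to "sort of figure out" a snapshot size is to multiply how fast the live volume changes by how long the backup runs, then pad it. This is only a back-of-the-envelope heuristic of mine, not anything LVM provides, and the write rate, duration and headroom numbers below are invented.

```shell
# Estimate snapshot space in MB.
# Arguments: write rate (MB/min), backup duration (min),
#            headroom (percent), origin LV size (MB).
snapshot_size_mb() {
    rate=$1; minutes=$2; headroom=$3; origin=$4
    need=$(( $rate * $minutes * (100 + $headroom) / 100 ))
    # There is no point reserving more than the origin itself holds.
    if [ "$need" -gt "$origin" ]; then need=$origin; fi
    echo "$need"
}

snapshot_size_mb 50 120 25 2000000    # prints 7500
```

While a snapshot exists, lvs reports how full it is, so you can check whether your guess was generous enough.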

The last feature I'd like to rave about is online filesystem resizing. Now this is just absolutely great and very useful, especially in concert with handling large volumes and managing snapshots. First of all, if you have a hardware RAID controller which lets you add drives and expand existing arrays as an online operation, LVM is the layer which will let you expand your volumes to suit. There are two ways of doing this. The first is to expand an existing block device (e.g. grow your sda from 1TB to 1.5TB), which means modifying the partition table and then telling LVM about the bigger PV (pvresize). This is slightly tricky but can be done online. The other way is to add additional devices. Some RAID controllers (good ones) will let you add a second "logical disk" (or "virtual disk", depending on your vendor's jargon). If you add that additional disk, you simply initialize it as a new PV, add it to your VG, and then extend your LV by however much you want.
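For the first way (the device itself got bigger), the LVM side might look like this sketch once the partition has been enlarged. The device, LV path and +500G are placeholders, the partition-table edit itself is not shown, and the run helper is my own wrapper that only prints the commands unless DRY_RUN=0. It also assumes an ext3 filesystem, hence resize2fs.

```shell
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Arguments: PV device, LV path, amount to add to the LV (e.g. 500G).
grow_in_place() {
    pv=$1; lv_path=$2; amount=$3
    run pvresize "$pv" &&                     # PV picks up the new partition size
        run lvextend -L "+$amount" "$lv_path" &&  # hand some of the new extents to the LV
        run resize2fs "$lv_path"              # grow the filesystem to fill the LV
}

grow_in_place /dev/sda1 /dev/myvgblah/mylv 500G
```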

Take the first example I had, where the equipment would only allow 2TB devices. First, you put all your disks in an array, and because you've got a lot of disks, maybe reserve one as a hot spare. So your total capacity is (14 disks - 1 hot spare - 1 for RAID-5 parity) * 300GB = 3600GB. You carve out your first LD and it's 2TB and appears in the OS as /dev/sda. Now generally, you should be putting a partition on your drives. To my knowledge it's not required, but it's generally accepted that most disk tools behave more sanely if they see a partition. Anyhow, you've got /dev/sda1, so you initialize it (pvcreate /dev/sda1), create a volume group (vgcreate myvgblah /dev/sda1), and spin out your first LV (lvcreate -l 100%FREE -n mylv myvgblah). Hooray, you create your filesystem (mke2fs -j -L bigfs /dev/myvgblah/mylv) and mount it for regular use.

Now sometime later you fill up that 2TB and realize that there's a pile of unused space. Well, you carve out another LD with the remaining 1.6TB, which appears to the OS as /dev/sdb. Generally, I would expect this device to just show up, no rebooting or any crap like that. So you throw a partition on there, initialize the PV (pvcreate /dev/sdb1), and add it to the existing volume group (vgextend myvgblah /dev/sdb1). With this free space, you can either add it all (lvextend -l +100%FREE /dev/myvgblah/mylv) or add it incrementally (lvextend -L +100G /dev/myvgblah/mylv), reserving free space for snapshots, additional LVs, and future growth. Either way, finish by growing the filesystem to fill the larger LV (resize2fs /dev/myvgblah/mylv), which ext3 can generally do while mounted.
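Here is the same walkthrough as a sketch you can read top to bottom. The device names and sizes are the ones from the example, partitioning and mounting are left out, and run is my own wrapper that only prints each command unless you set DRY_RUN=0 (as root, on disks you don't care about).

```shell
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Usable RAID-5 capacity in GB: total disks, hot spares, size per disk.
raid5_usable_gb() {
    echo $(( ($1 - $2 - 1) * $3 ))
}

# First 2TB logical disk: PV, VG, one LV using everything, ext3 on top.
initial_setup() {
    run pvcreate /dev/sda1 &&
        run vgcreate myvgblah /dev/sda1 &&
        run lvcreate -l 100%FREE -n mylv myvgblah &&
        run mke2fs -j -L bigfs /dev/myvgblah/mylv
}

# Later: second logical disk joins the VG and the LV grows by 100G.
add_second_disk() {
    run pvcreate /dev/sdb1 &&
        run vgextend myvgblah /dev/sdb1 &&       # vgextend adds a PV to an existing VG
        run lvextend -L +100G /dev/myvgblah/mylv &&
        run resize2fs /dev/myvgblah/mylv         # grow ext3 to match the LV
}

raid5_usable_gb 14 1 300    # prints 3600
initial_setup
add_second_disk
```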

Very handy to have all your disks in a pool (your VG) and be able to add logical drives (LVs), snapshot your drives, and incrementally expand your drives.

- Arch
