Thursday, August 6, 2009

Debian Raid 5 With LVM

I'm mostly posting this here for myself, but I sure hope someone else finds it useful.

It took us like three days to finally get this whole thing working, so I'll post what I did here in the hopes of saving someone else the trouble. There is a lot of conflicting and straight up wrong information out there in articles all over the interwebs.

In our scenario, we originally had two 1TB drives sitting in a box to hold multimedia stuff, but it quickly filled up. Our solution was to do a software raid-5 on Debian. There were multiple reasons for this.

Obviously fault tolerance was the reason for raid-5 rather than raid-0.

Our choice of Linux was because, even on Server 2008 DataCenter SuperAwesome Edition, you can't grow your raid-5. Actually, you can, but you have to back up your data, break the raid, and then re-create it. With 2TB of data, going out and buying hard drives just to hold a backup wasn't feasible. And in the future when I have 3TB, I would need to buy even MORE drives. Really dumb. Seriously, Microsoft failed at this.

I've heard you can grow your raid without breaking it with some hardware raid cards, but we felt doing a software raid for our situation, i.e. just a home file server, would be much more appropriate. So, we used Linux, and Debian has a particularly nice interface for doing this, and doing everything on the command-line is very h4x0r-ish.

So let's define some terms to get them out of the way.

Raid 5: Requires a minimum of three drives. One drive's worth of space goes to parity rather than storage, but if any one drive fails, you can rebuild your data (pretty easily, actually) when you add a replacement drive. The raid will actually continue to function, albeit slowly, with one drive missing. Each drive can only contribute as much space as the smallest drive in the array. That means if you have two 1TB drives and you decide to add a 1.5TB drive, only 1TB of that 1.5 will be used. The other 500GB is completely wasted. I don't think there is a good way around this.
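The capacity math above is easy to sketch in a few lines of shell. The sizes here are just made-up example numbers in GB (two 1TB drives plus one 1.5TB drive), not our actual setup:

```shell
# RAID-5 usable space = (number of drives - 1) * size of the smallest drive.
drives="1000 1000 1500"

n=0; smallest=""
for d in $drives; do
  n=$((n + 1))
  # Track the smallest drive, since every drive only contributes that much:
  if [ -z "$smallest" ] || [ "$d" -lt "$smallest" ]; then
    smallest=$d
  fi
done

usable=$(( (n - 1) * smallest ))
wasted=$(( 1500 - smallest ))
echo "usable: ${usable}GB, wasted on the big drive: ${wasted}GB"
# prints "usable: 2000GB, wasted on the big drive: 500GB"
```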

LVM: Logical Volume Manager. This is a nice diagram of what LVM does by itself, without a raid present. It's essentially like raid-0, but you don't lose all your data if one drive fails; you just lose that drive's data. The idea is that if you have x number of drives, LVM will make them look like y number of drives, of however much space you specify.

Volume Group: In LVM, this is a "container" for Logical Volumes.

Logical Volume: This is the top part of that picture. It is the logical version of whatever you specify. The OS sees it as a drive, but it doesn't know that it's actually pulling space from x number of drives.

Setting Up The Raid

First thing you need to do is set up your raid. You can do this on the command-line, but Debian's install CD has a really nice interface for doing this. I generally followed this article:

This guy is doing it a _little_ differently. He's setting up a raid-1 or something, but you can read it and get the idea of what he's doing. Basically you partition the three (or more) disks you want on the raid and specify them to be used as "physical volume for RAID". Then you configure your raid to be Raid 5. Then on the Raid 5 device that gets created, you select Use as: physical volume for LVM. Then you create your volume group and logical volume (in our case we just had one logical volume that spanned the whole size of the raid). Then on top of THAT, you choose Use as: and pick a filesystem. We used XFS, but you could use ext3 or whatever else you wanted.

So basically you are taking all your drives and raiding them together so the OS sees it as one big drive. Then, LVM lays on top of that and sees that raid as if it were one big physical disk, and gives you the ability to expose it as multiple logical volumes. In the picture above, instead of 3 disks, LVM is pointing to one big one, and the raid is pointing to the three smaller disks below.
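If you'd rather skip the installer and build the same stack from a shell, it goes something like this. This is just a sketch, not something we ran verbatim: the device names (/dev/sdb, /dev/sdc, /dev/sdd) and the volume group and logical volume names (vg0, lv0) are placeholders, so adjust them for your machine.

```shell
# Build the raid-5 out of three disks (partitions marked type fd work too):
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd

# Layer LVM on top of the raid device:
pvcreate /dev/md0                  # mark the raid as an LVM physical volume
vgcreate vg0 /dev/md0              # volume group containing just that PV
lvcreate -l 100%FREE -n lv0 vg0    # one logical volume spanning the whole group

# Put a filesystem on the logical volume and mount it:
mkfs.xfs /dev/vg0/lv0              # or mkfs.ext3, whatever you like
mount /dev/vg0/lv0 /raid
```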

So that's it! Your raid is created. Now comes the tricky part: growing the raid without blowing up your data.

Growing Your Raid 5

In our scenario we originally had (before the raid setup) 1.25TB of data and 2TB of space. I bought two more 1TB drives: one to be in the initial raid setup, and one to back up (most of) the data while we made the raid (the rest of it went on our own computers).

Later after the raid was created, we copied all the data from the spare drive to the raid.

Before you grow your raid, you need to wait for it to finish resyncing. When you first create the raid, it will resync in the background. You can watch its progress with:

cat /proc/mdstat

Or if you really want to WATCH it, you can do:

watch -n 0.1 cat /proc/mdstat

Anyway, when this finishes, you can continue. It took a little over 10 hours to finish building our 3TB array (of which 2TB was usable; see the raid 5 definition above).

First you have to add the drive to the raid like this:

mdadm --add /dev/md0 /dev/sde

/dev/md0 in our case points to the raid itself. If you have multiple arrays, or if you want to make sure it's pointing to the right one, you can do

mdadm --detail /dev/md0

To make sure. After you do --add, the drive is seen by the raid as a spare. This means if one drive failed, the spare would automagically take over and the raid would start rebuilding itself without you having to do anything.
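You can also spot the spare in /proc/mdstat: a spare device shows up with an (S) after its slot number. Here's a quick sketch of what that looks like and how you might grep for it. The mdstat line is a made-up sample, not output from our box:

```shell
# Sample /proc/mdstat device line after --add: sde sits in slot 3 as a spare.
mdstat_line='md0 : active raid5 sde[3](S) sdd[2] sdc[1] sdb[0]'

# Count how many devices carry the spare flag:
spares=$(printf '%s\n' "$mdstat_line" | grep -o '(S)' | wc -l)
echo "spares: $spares"
```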

But, since we want to add that drive to the array, not use it as a spare, you have to grow the array, like this:

mdadm --grow /dev/md0 -n 4

-n specifies the new number of devices in the array. If you had three and wanted to add two, it would be -n 5.

Now the raid needs to rebuild again. Our now 4TB array (of which 3TB is usable) took about 20 hours to rebuild. Seriously. Again, you can check its progress with cat /proc/mdstat.
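Side note: if a rebuild is crawling, the md driver's speed limits are tunable through /proc. Raising the minimum makes the resync less polite to other I/O, so it finishes sooner at the cost of a laggier server. These tunables are real, but the 50000 figure below is just an arbitrary example; pick your own:

```shell
# Current rebuild speed limits, in KB/sec per device:
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# Raise the floor so the resync doesn't throttle down under normal load
# (run as root; reverts to the default on reboot):
echo 50000 > /proc/sys/dev/raid/speed_limit_min
```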

By the way, this can all be done without even unmounting the raid. All the data is still accessible while all this is being done.

Once it's finished rebuilding, now you need to extend the logical volume on top of it. You can do this with the following commands:

pvresize /dev/md0
lvextend -L 3T /dev/mapper/vgn-lvn
xfs_growfs /raid

/dev/mapper/vgn-lvn is where your logical volume lives; vgn and lvn are whatever names you gave your volume group and logical volume during setup. pvresize tells LVM the underlying raid device got bigger, lvextend grows the logical volume, and xfs_growfs (which takes the mount point, /raid in our case) grows the filesystem to fill it, all while mounted.
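If you don't feel like doing the size arithmetic, lvextend can also be told to grab all of the free space instead of an absolute size like -L 3T. A sketch using the same vgn/lvn names as above:

```shell
pvresize /dev/md0                          # let LVM see the now-bigger raid device
lvextend -l +100%FREE /dev/mapper/vgn-lvn  # grow the LV into every free extent
xfs_growfs /raid                           # XFS grows online, given the mount point
```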