I have a 2005-vintage server (dual 3GHz Xeons, an LSI53C1030T RAID/SCSI controller with 256MB cache, 8GB RAM) that I'm repurposing for some light VM storage duty.
My first attempt was to put 4x300GB drives into a hardware RAID5 and then install Openfiler's LVM and iSCSI on top of it. That resulted in very inconsistent read speeds (anywhere from 20MB/sec to 2GB/sec, though the high end is probably just caching) and a horrible but consistent 8MB/sec write speed. I measured these results both with local dd runs and with an actual large file transfer over the network, and both methods gave similar numbers.
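For reference, the local tests were along these lines; the mount point, file size and block size below are illustrative assumptions, not the exact commands from the original runs:

    # Write ~4GB in 64kB blocks, bypassing the page cache (hypothetical mount point):
    dd if=/dev/zero of=/mnt/test/bigfile bs=64k count=65536 oflag=direct
    # Read it back the same way:
    dd if=/mnt/test/bigfile of=/dev/null bs=64k iflag=direct

Using oflag=direct/iflag=direct (or conv=fdatasync for the write) keeps the page cache from inflating the numbers, which matters given the 2GB/sec "reads" seen above.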
After much reading, I learned that this LSI controller isn't great at hardware RAID, so I disabled the RAID functionality on the channel with the 4x300GB drives, built the array with mdadm software RAID instead, and put LVM on top of it. Further tests showed an improvement (20MB/sec writes), but that is still rather horrible. I spent another day aligning partitions, tuning chunk size, stripe-width and stride, and playing with ext4 and journaling options, all without much observable improvement.
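A rough sketch of that setup, with hypothetical device names and a 64kB chunk assumed rather than the exact values used:

    # 4-drive RAID5 with a 64kB chunk (device names are hypothetical):
    mdadm --create /dev/md0 --level=5 --raid-devices=4 --chunk=64 \
        /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
    # LVM on top of the array:
    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -l 100%FREE -n lv0 vg0
    # ext4 aligned to the array: stride = chunk / block size = 64k / 4k = 16,
    # stripe-width = stride * data disks = 16 * 3 = 48
    mkfs.ext4 -b 4096 -E stride=16,stripe-width=48 /dev/vg0/lv0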
Another experiment I ran was hdparm -tT on /dev/md0 vs /dev/mapper/vg0-lv0 (which was simply a mapping of the entire md0), and I saw a 2x slowdown when going through LVM. I've read that LVM can introduce some speed penalty, but cutting the throughput in half is not acceptable.
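The comparison was simply this (note that hdparm -tT times cached and buffered reads only, so it says nothing about write speed):

    hdparm -tT /dev/md0
    hdparm -tT /dev/mapper/vg0-lv0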
Since none of this was making sense, I went back to basics: a single partition on a single drive, no LVM, no RAID, just plain old Ultra320 SCSI, and ran some tests on it. I got ~75MB/sec read and ~55MB/sec write across multiple runs and multiple programs.
So if one drive can do 75MB/sec read and 55MB/sec write, why does a RAID5 (hardware or software!) of 3 of them get such horrible speeds? What am I doing wrong? What else should I try?
UPDATE 1:
While continuing with the experiments, I noticed that one of the disks sometimes refused to be partitioned; parted and fdisk would simply decline to actually write the partition table to it. I tried the same commands on all the other disks to make sure it wasn't a systemic problem, and the issue appeared to be isolated to that one disk. I ran smartctl's health tests on it, and everything checked out fine. dmesg was the only source of any indication that something might be wrong with the drive, albeit with rather cryptic and not particularly helpful messages. Out of sheer curiosity, I pulled the drive, rebooted, and redid everything I had done so far for software RAID5, without LVM but with ext4 on it. On the first try I got 200MB/sec reads and 120MB/sec writes from a five-drive array (I had found two more 300GB drives in the meantime), testing with dd dumping 4.2GB files in 64kB blocks onto the new partition. Apparently the drive, while not completely dead, wasn't particularly cooperative, and with it out of the equation everything ran MUCH better.
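The health checks were roughly of this form; /dev/sdc stands in for whichever drive was misbehaving:

    # Overall SMART status plus a short self-test (drive name is hypothetical):
    smartctl -H /dev/sdc
    smartctl -t short /dev/sdc
    smartctl -l selftest /dev/sdc
    # Kernel messages were the only real hint of trouble:
    dmesg | grep -i sdc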
I feel saner now; 8MB/sec just didn't feel right, no matter the RAID level.
Tomorrow: testing with LVM and maybe going back to hardware RAID.
Answer
RAID5 is notoriously bad for write performance, especially for small writes. The reason is that every write must keep the stripe's parity block up to date: for a small write, the array has to read the old data block and the old parity block, compute the new parity (new parity = old parity XOR old data XOR new data), and then write both the data and the parity back, turning one logical write into four disk operations. (Alternatively, the controller can read the remaining data blocks in the stripe and recompute the parity from them.)
Either way, this takes far longer than simply writing a single block.
If you want fast writes, a mirrored configuration is better, such as RAID1 or RAID10.
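As an illustration only (device names and member count are assumptions), a four-drive RAID10 with mdadm would be created like this:

    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
        /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/se1

With RAID10, each write simply goes to the two mirrored members; there is no parity to read or recompute, which is why write throughput holds up much better than with RAID5.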