Tuesday, June 23, 2015

performance - Fiberchannel SAN RAID is slow (DS4100)



I have set up two DS4100 SANs linked with Fibre Channel. I've set up multiple RAID configurations: a RAID10 spanning 4 disks in each SAN, and a few RAID5 arrays (hardware RAID handled by the DS4100).




I'm running 32-bit Windows 2008 Datacenter edition. I've connected the server to the SAN through Fibre Channel. I've disabled "write caching with mirroring".



Now, whatever I do, I can't seem to get more than 40MB/s out of the SAN storage. I would expect more from 4 striped disks (RAID10, 8 total) or 6 disks in RAID5.



What performance can I expect from 6x400GB SATA-disks in RAID5?
Any idea what I can check to find why it is so slow?


Answer



Let's talk about I/O Operations Per Second (IOPS).



If you assume that you have these disks, their average rotational latency is 4.17ms (the time for half a revolution at 7200RPM). You also need to know the average read/write seek time to really calculate IOPS. This site claims the average seek time is 12ms (which is horrible, and will be the cause of your problems, as we'll see...)
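As a quick check on the 4.17ms figure, here is a minimal sketch of the usual half-revolution estimate for rotational latency. The function name is made up for illustration, and the only input is the 7200RPM spindle speed mentioned above.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in milliseconds.

    On average the platter has to turn half a revolution before the
    wanted sector passes under the head.
    """
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


if __name__ == "__main__":
    print(f"{avg_rotational_latency_ms(7200):.2f} ms")  # prints "4.17 ms"
```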




Determining IOPS is pretty imprecise because, to get it right, you need to know your read percentage versus your write percentage (writes are slower than reads because, apparently, the head needs to be more precisely placed).



The calculation for IOPS is 1 / (avg latency + avg seek time), so each drive would be capable of 1 / (0.00417s + 0.012s), or 1 / 0.01617s, which is about 62 IOPS. Call it 60 to keep the numbers round.
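The same per-drive estimate as a small sketch. The function name is hypothetical; the two inputs are the latency and seek figures quoted above, and the result is only as good as those figures.

```python
def drive_iops(avg_latency_s: float, avg_seek_s: float) -> float:
    """Rough random IOPS for one spinning disk: 1 / (latency + seek)."""
    return 1.0 / (avg_latency_s + avg_seek_s)


if __name__ == "__main__":
    # 4.17ms rotational latency and the claimed 12ms average seek time.
    iops = drive_iops(0.00417, 0.012)
    print(f"{iops:.1f} IOPS per drive")  # roughly 62, rounded to 60 in the text
```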



So that's one drive. But you've got several!



You mentioned an 8-disk RAID-10 array. That's great, because while you've got to write the data twice, you can read from all 8 at once.



Assuming a 100% read workload, 60 IOPS X 8 drives = 480 IOPS.
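To see how quickly that best case erodes once writes enter the mix, here is a rough sketch. It assumes the common rule of thumb that each logical write on RAID-10 costs two physical writes (the "write the data twice" part); the function name and the 50% read example are illustrative, not measurements from this array.

```python
def raid10_effective_iops(per_drive_iops: float, drives: int,
                          read_fraction: float) -> float:
    """Rule-of-thumb host-visible IOPS for a RAID-10 array.

    Assumption: reads can be served by any drive, and every logical
    write turns into two physical writes (one per mirror side).
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    raw_iops = per_drive_iops * drives
    write_fraction = 1.0 - read_fraction
    write_penalty = 2  # mirrored write
    return raw_iops / (read_fraction + write_penalty * write_fraction)


if __name__ == "__main__":
    print(raid10_effective_iops(60, 8, 1.0))  # 100% reads: 480.0
    print(raid10_effective_iops(60, 8, 0.5))  # half reads, half writes: 320.0
```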




How do IOPS relate to throughput, though? Well, we have to go back to the "imprecise" part, because it depends on what percentage of your disk I/O is random.



On a 100% random workload, you can kind of assume that one operation gives you one block. So then, how big is the block size?



According to this PDF, the DS4100 had a 16k block size.



We can use that to calculate the sheer amount of output you can get.



At around 480 IOPS, each of which fetches 16KB, you'd be pulling 7.68MB/s with a purely random workload. Because your workload isn't purely random, the 40MB/s you're seeing is about 5.2x that figure.
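The throughput arithmetic, as a sketch you can rerun with your own numbers. It assumes one 16KB block per operation and decimal units (1000KB per MB), which is what the 7.68MB/s figure implies; the names are made up for illustration.

```python
def random_throughput_mb_s(iops: float, block_kb: float) -> float:
    """Throughput of a purely random workload: one block per operation."""
    return iops * block_kb / 1000.0  # decimal MB, matching the text


if __name__ == "__main__":
    random_mb_s = random_throughput_mb_s(480, 16)
    observed_mb_s = 40.0  # the figure reported in the question
    print(f"purely random: {random_mb_s:.2f} MB/s")                  # 7.68 MB/s
    print(f"observed / random: {observed_mb_s / random_mb_s:.1f}x")  # 5.2x
```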







Look, I seriously doubt that you're ever going to be pulling great numbers with these drives. 12ms seek time is almost criminal, if that's really the case (find out), plus you're probably writing to the array, too, even if you don't know you are.



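My advice: learn what your I/O profile looks like. Minimize the number of writes (on a Linux filesystem you would mount it noatime if you don't care about POSIX compatibility; on NTFS the comparable setting is disabling last-access-time updates). Get better disks.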

