Friday, April 26, 2019

kvm virtualization - KVM + NFS poor disk performance



Situation: We have an Ubuntu server that hosts three VMs using KVM. All guests as well as the host need to access the same files in a certain subfolder of /var. Thus, the subfolder is exported via NFS. Our problem is that the guests can read from/write to the directory at only half the speed of the host. The export table looks like this:



alice@host:~$ cat /etc/exports

/home/videos 192.168.10.0/24(rw,sync,no_root_squash)


where the host has the IP 192.168.10.2 and the VMs have 192.168.10.1{1..3}. /home/videos is a symlink to the subfolder in /var mentioned above, namely /var/videos/genvids.
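
(Side note: after editing /etc/exports, the export list has to be re-read before changes take effect. Assuming a standard nfs-kernel-server setup, this can be done and verified roughly like this; the commands are only a sketch, not taken from our session.)

alice@host:~$ sudo exportfs -ra    # re-read /etc/exports and re-export everything
alice@host:~$ sudo exportfs -v     # list active exports with all effective options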



This is the relevant line from the VM's fstab:



192.168.10.2:/home/videos /mnt/nfs nfs auto,noatime,rsize=4096,wsize=4096  0 0
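
(To double-check which options the client actually negotiated, the active mount parameters can be inspected from within the guest; the following is just a sketch, and nfsstat -m requires the nfs-common tools.)

bob@guest:~$ nfsstat -m             # per-mount NFS flags, including the effective rsize/wsize
bob@guest:~$ grep nfs /proc/mounts  # alternative that works without extra tools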



The hard disk has a sustained data rate of ~155 MB/s, which is confirmed by the output of hdparm -tT as well as dd:



alice@host:~$ dd if=/home/videos/4987_1359358478.mp4 of=/dev/null bs=1024k count=300
300+0 records in
300+0 records out
314572800 bytes (315 MB) copied, 2.04579 s, 154 MB/s
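
(When repeating such dd runs, the Linux page cache can skew the numbers, since a second read of the same file may be served from RAM. Dropping the cache first, roughly as shown below, keeps the measurements comparable; this is just a sketch, not part of the transcript above.)

alice@host:~$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null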


From within a VM things look different:




bob@guest:~$ dd if=/mnt/nfs/4959_3184629068.mp4 of=/dev/null bs=1024k count=300
300+0 records in
300+0 records out
314572800 bytes (315 MB) copied, 4.60858 s, 68.3 MB/s


Matching the block size to the file system's page size had no satisfying effect either:



bob@guest:~$ dd if=/mnt/nfs/4925_1385624470.mp4 of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 5.77247 s, 71.0 MB/s


I consulted various pages on NFS performance, most relevantly the NFS FAQ, Part B, and the respective Performance Tuning Howto. Most of the hints do not apply, and the others did not improve the results. There are threads here that deal with disk performance and KVM; however, they do not cover the NFS aspect. This thread does, but network speed does not seem to be the limiting factor in our case.
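
(For reference, the raw TCP throughput between guest and host can be checked with iperf, assuming it is installed on both sides; the commands below are only a sketch.)

alice@host:~$ iperf -s                 # start the server side on the host
bob@guest:~$ iperf -c 192.168.10.2     # measure throughput from the guest to the host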



To give a complete picture, this is the content of the exports table (etab) with symlinks resolved and all active export options shown:



alice@host:~$ cat /var/lib/nfs/etab
/var/videos/genvids 192.168.10.0/24(rw,sync,wdelay,hide,nocrossmnt,secure,
no_root_squash,no_all_squash,no_subtree_check,secure_locks,acl,
anonuid=65534,anongid=65534)


What also bothers me in this context - and what I do not understand - is nfsd's proc file output:



alice@host:~$ cat /proc/net/rpc/nfsd
...
th 8 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
...



For the third column and beyond I would have expected non-zero values after reading from the disk within the VMs. However, nfsstat tells me that there were indeed read operations:



alice@host:~$ nfsstat
...
Server nfs v3:
null getattr ...
9 0% 15106 3% ...
read write ...
411971 95% 118 0% ...
...


So, the topic is quite complex and I'd like to know where else to look or whether there is an easy solution for this.


Answer



As it turns out, the problem was easier to resolve than expected. Tuning the rsize and wsize options in the VM's fstab did the trick. The respective line is now:



192.168.10.2:/home/videos /mnt/nfs nfs auto,noatime,rsize=32768,wsize=32768  0 0



For me this was not obvious, since I had expected the best performance if the values for rsize and wsize matched the disk's block size (4096) and did not exceed the NIC's MTU (9000). Apparently, this assumption was wrong.
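
(For anyone who wants to find a good value empirically: different rsize/wsize settings can be compared without editing fstab each time, by unmounting and remounting with explicit options. The loop below is only a sketch to run on the guest; the file name is taken from the examples above, and the page cache is dropped before each run so the numbers stay comparable.)

# compare different rsize/wsize values by remounting with explicit options
for size in 4096 8192 16384 32768 65536; do
    sudo umount /mnt/nfs
    sudo mount -t nfs -o noatime,rsize=$size,wsize=$size 192.168.10.2:/home/videos /mnt/nfs
    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null   # drop the guest's page cache
    echo "rsize/wsize=$size:"
    dd if=/mnt/nfs/4959_3184629068.mp4 of=/dev/null bs=1024k count=300 2>&1 | tail -n 1   # dd prints its summary on stderr
done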



It is notable that the exact sustained disk data rate depends on the particular file: for two similar files of 9 GB each, I observed rates between 155 MB/s (file 1) and 140 MB/s (file 2). So a reduced data rate with one file may still mean full data rate with another.
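
(To compare such per-file rates on the host without the page cache getting in the way, dd can bypass it with O_DIRECT; the file name below is just a placeholder.)

alice@host:~$ dd if=/var/videos/genvids/somefile.mp4 of=/dev/null bs=1024k iflag=direct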

