
Slow throughput coupled with high system CPU time in KVM guests accessing host NFS share #12

Open
nefilim opened this issue Feb 11, 2013 · 4 comments

nefilim commented Feb 11, 2013

Scenario:

The SmartOS host NFS-exports /zones/content.
Linux KVM guests (CentOS 6, Ubuntu 12.04, Fedora 14), running on the same physical server as the SmartOS host, mount this share over NFS. I normalized the mount options:

mount -t nfs4 -o rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.100.2,local_lock=none 192.168.100.22:/zones/content /content

to keep things consistent.

Throughput is extremely slow:

root@plex-ubuntu:/content/tv/weeds/season2# dd if=tpz-weeds208.avi of=/dev/null bs=4096k
58+1 records in
58+1 records out
244094976 bytes (244 MB) copied, 93.9973 s, 2.6 MB/s
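As a sanity check on dd's reported rate, the figure can be reproduced from the byte count and elapsed time in the output above (dd uses 1 MB = 10^6 bytes here):

```shell
# 244094976 bytes in 93.9973 s, as reported by dd above
awk 'BEGIN { printf "%.1f MB/s\n", 244094976 / 93.9973 / 1e6 }'
# prints: 2.6 MB/s
```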

coupled with high system CPU time in both the host and the guest. Here's vmstat from the Ubuntu guest (1 vCPU):

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 409072  42076 517372    0    0     0     0  105   14  0 100  0  0
 4  0      0 408660  42076 517372    0    0     0     0   67    8  0 100  0  0
 2  1      0 375860  42076 548092    0    0     0     0   67   15  0 74  0 26
 4  0      0 375984  42076 548092    0    0     0     0  111    7  0 100  0  0
 2  1      0 375960  42076 548092    0    0     0     0   84   10  0 100  0  0
 3  1      0 376132  42076 548092    0    0     0     0   97   12  0 86  0 14
 3  0      0 376496  42076 548092    0    0     0     0  273   28  0 97  0  3

and from the CentOS guest (2 vCPUs):

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0  21608 488124  29940 374448    0    0     0     0   90  187  0  0 100  0  0   
 0  0  21608 488124  29940 374448    0    0     0     0   89  182  0  0 100  0  0   
 9  1  21608 472616  29940 389808    0    0     0     0  157   92  1 32 56 10  0    
 2  0  21608 472336  29940 389808    0    0     0     0  173  153  2 29 59  9  0    
 3  1  21608 471724  29940 389808    0    0     0     0  138  162  0 33 26 41  0    
 4  1  21608 452404  29940 405108    0    0     0     0   99  110  0 25 18 56  0    
 1  1  21608 453024  29940 405168    0    0     0     0  117  132  1 30 48 21  0    
 7  0  21608 453396  29940 405168    0    0     0     0  156  137  1 31 69  0  0    
 2  1  21608 437400  29940 420468    0    0     0     0   90  133  0 23 26 51  0    
 1  1  21608 437524  29940 420468    0    0     0     0  101  147  0 26  0 74  0    
 4  1  21608 437524  29940 420528    0    0     0     0  160  137  1 27  7 65  0    
 1  1  21608 437276  29940 420528    0    0     0     0   88  146  2  6  1 91  0    
 1  1  21608 437276  29940 420528    0    0     0     0  130  148  0 28  5 67  0    
 1  1  21608 437648  29940 420528    0    0     0     0   72  128  2 13  0 85  0    
 2  0  21608 437616  29940 420528    0    0     0     0  118  128  0  8 74 18  0    
 5  0  21608 437756  29940 420528    0    0     0     0   96  113  0 31 69  0  0    
 1  0  21608 437896  29940 420528    0    0     0     0   84  134  0 19 81  0  0
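To put a number on "high system CPU": averaging the sy column from the Ubuntu sample (values transcribed from the table above) shows the single vCPU spending almost all of its time in the kernel, despite essentially no block I/O being reported:

```shell
# sy values from the 1-vCPU Ubuntu vmstat rows above
awk 'BEGIN {
  n = split("100 100 74 100 100 86 97", sy, " ")
  for (i = 1; i <= n; i++) sum += sy[i]
  printf "avg sy: %.0f%%\n", sum / n
}'
# prints: avg sy: 94%
```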

When I mount the same NFS share on a separate, dedicated physical Linux host (Fedora 18) connected to the SmartOS host via gigE, I get significantly better results:

[root@gatekeeper season2]# dd if=tpz-weeds211.avi of=/dev/null bs=4096k
58+1 records in
58+1 records out
244111360 bytes (244 MB) copied, 2.20326 s, 111 MB/s

Just as a sanity check, on the SmartOS host itself:

[root@smartos /zones/content]# dd if=tpz-weeds212.avi of=/dev/null bs=4096k
58+1 records in
58+1 records out
243603456 bytes (244 MB) copied, 1.24322 s, 196 MB/s

I will try to repeat this test using guests running under a separate physical SmartOS host.

nefilim commented Feb 13, 2013

I have created a SmartOS installation on the other physical Linux server; it does not support KVM (no EPT support, Q6600 CPU). I exported some content from the zones filesystem via zfs set sharenfs, as before.

I mounted this export in one of the KVM VMs used above. The two SmartOS hosts are connected via gigE. I repeated the test:

[root@nexus-centos stuff]# dd if=file.mkv of=/dev/null bs=4096k 
154+1 records in
154+1 records out
647099382 bytes (647 MB) copied, 14.1884 s, 45.6 MB/s

This problem appears to be limited to access between a guest and its own host.
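Comparing the three dd runs so far (byte counts and times taken from the outputs above) makes the asymmetry concrete; only the guest-to-same-host path is pathological:

```shell
awk 'BEGIN {
  guest = 244094976 / 93.9973 / 1e6   # guest -> NFS on its own host
  cross = 647099382 / 14.1884 / 1e6   # guest -> NFS on the other SmartOS host
  phys  = 244111360 / 2.20326 / 1e6   # physical Fedora 18 client -> host
  printf "same-host guest:  %5.1f MB/s (%.0fx slower than physical)\n", guest, phys / guest
  printf "cross-host guest: %5.1f MB/s (%.1fx slower)\n", cross, phys / cross
}'
```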

nefilim commented Feb 13, 2013

Please let me know if there's any other information you need or anything else you need me to try.

nefilim commented Feb 13, 2013

I forgot to include the accompanying vmstat from the KVM guest during the transfer (in this case reading from the other physical SmartOS host):

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0   6024  23580 414536    0    0     0     0   14   23  0  0 100  0  0   
 0  0      0   6024  23580 414536    0    0     0     0   52  145  1  2 97  0  0    
 0  1      0   6216  23580 412936    0    0     0    40  322   78  0  1 88 11  0    
 0  1      0   6944  23580 408944    0    0     0     0 6999  233  0 22  0 78  0    
 0  1      0   6876  23580 409132    0    0     0     0 5542  161  0  8  0 92  0    
 0  1      0   7792  23580 408708    0    0     0     0 7411  179  0 14  0 86  0    
 0  1      0   6760  23580 408876    0    0     0     0 4419  157  0  7  0 93  0    
 3  0      0   4800  23572 402156    0    0     0     0 4746  161  0 13  0 87  0    
 0  1      0   6768  23572 409200    0    0     0     0 3396  105  0 10  0 90  0    
 0  1      0   7244  23572 408800    0    0     0     0 5014  146  0  5  0 95  0    
 0  1      0   6824  23572 409080    0    0     0     0 5125  163  0  8  0 92  0    
 0  1      0   6864  23172 409268    0    0     0     0 5799  174  0 12  0 88  0    
 0  1      0   6592  23172 409588    0    0     0     0 5646  160  0 11  0 89  0    
 0  1      0   7120  23172 409520    0    0     0     0 5413  149  0  4  0 96  0
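One contrast with the same-host case stands out in this sample: the guest is taking thousands of interrupts per second while data actually flows, versus roughly 100/s in the stalled same-host runs. Averaging the in column over the active rows transcribed from the table above:

```shell
# "in" (interrupts/s) values from the active vmstat rows above
awk 'BEGIN {
  n = split("6999 5542 7411 4419 4746 3396 5014 5125 5799 5646 5413", irq, " ")
  for (i = 1; i <= n; i++) sum += irq[i]
  printf "avg in: %.0f/s\n", sum / n
}'
# prints: avg in: 5410/s
```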

UX-admin commented:

On a Solaris 10 KVM guest reading from a Solaris 10 KVM NFS server, reads with the mount option "proto=tcp" are okay, with throughput between the server and the client roughly ~40 MB/s (a Solaris 10 bare-metal NFS server to a SmartOS hypervisor NFS client was observed at ~340 MB/s, peaking at 539 MB/s, so there is an almost ninefold read-performance penalty).

Writes to the NFS server using the bg,hard,intr,vers=[2-4],proto=tcp mount options write out exactly 2580 bytes and then hang, whereby the shell stops responding to SIGINT or SIGTSTP.

The work-around for Solaris 10 KVMs is to mount with "proto=udp". Performance as measured between the aforementioned KVMs then falls to 2-5 MB/s.
