Saturday, January 2, 2016

linux - Latency in TCP/IP-over-Ethernet networks



What resources (books, Web pages etc) would you recommend that:




  • explain the causes of latency in TCP/IP-over-Ethernet networks;

  • mention tools for looking out for things that cause latency (e.g. certain entries in netstat -s);

  • suggest ways to tweak the Linux TCP stack to reduce TCP latency (Nagle, socket buffers etc).




The closest I am aware of is this document, but it's rather brief.



Alternatively, you're welcome to answer the above questions directly.



Edit: To be clear, the question isn't just about "abnormal" latency, but about latency in general. Additionally, it is specifically about TCP/IP-over-Ethernet and not about other protocols (even if they have better latency characteristics).


Answer



As far as kernel tunables for latency go, one comes to mind:



echo 1 > /proc/sys/net/ipv4/tcp_low_latency



From the documentation:




If set, the TCP stack makes decisions that prefer lower
latency as opposed to higher throughput. By default, this
option is not set meaning that higher throughput is preferred.
An example of an application where this default should be
changed would be a Beowulf compute cluster.

Default: 0
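
Before relying on this tunable it is worth checking what the running kernel reports for it. As far as I know, kernels from 4.14 onward document tcp_low_latency as a legacy option that no longer has any effect, so treat that as something to verify on your own system. Below is a minimal sketch of reading the value from C; read_sysctl_int is a hypothetical helper name, not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper (illustration only): read one integer from a
 * /proc/sys-style file. Returns 0 on success, -1 if the file cannot
 * be opened or does not start with an integer. */
int read_sysctl_int(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For example, read_sysctl_int("/proc/sys/net/ipv4/tcp_low_latency", &v) should leave 0 or 1 in v when the file exists, and return -1 on systems where it does not.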




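You can also disable Nagle's algorithm in your application. Nagle's algorithm holds back small writes until either a full segment (maximum segment size) has accumulated or previously sent data has been acknowledged. To turn it off, set TCP_NODELAY on the socket with something like: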



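#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int optval = 1;
int mysock;

/* Print the error for the failing call and exit. */
void errmsg(const char *msg) { perror(msg); exit(1); }

int main(void) {
    if ((mysock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
        errmsg("socket failed");
    }

    /* TCP_NODELAY is a TCP-level option, so the level is IPPROTO_TCP, not SOL_SOCKET. */
    if (setsockopt(mysock, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval)) < 0) {
        errmsg("setsockopt failed");
    }

    /* Some more code here ... */

    close(mysock);
    return 0;
}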
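
To confirm that the option actually took effect, it can be read back with getsockopt. The sketch below assumes Linux and passes the option at the IPPROTO_TCP level; set_nodelay and get_nodelay are hypothetical helper names used only for illustration.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn Nagle's algorithm off (on = 1) or back on
 * (on = 0) for a TCP socket. Returns 0 on success, -1 on failure. */
int set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: read the option back. Returns 1 if TCP_NODELAY
 * is set, 0 if it is clear, -1 if getsockopt fails. */
int get_nodelay(int fd)
{
    int on = 0;
    socklen_t len = sizeof(on);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, &len) < 0)
        return -1;
    return on != 0;
}
```

Calling set_nodelay(fd, 1) right after socket() and then checking get_nodelay(fd) is a cheap sanity check before you start measuring latency.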



The "opposite" of this option is TCP_CORK, which will "re-Nagle" packets: it holds back partial segments until the option is cleared again. Beware, however, as TCP_NODELAY might not always do what you expect, and in some cases can hurt performance. For example, if you are sending bulk data, you will want to maximize the payload carried per packet, so set TCP_CORK. If you have an application that requires immediate interactivity (or where the response is much larger than the request, negating the overhead), use TCP_NODELAY. On another note, this behavior is Linux-specific and BSD is likely different, so caveat administrator.
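
As a rough illustration of the bulk-data case, the sketch below corks a connected socket, writes a header and a body, then clears the option so the queued data goes out in as few segments as possible. It assumes Linux (TCP_CORK is not portable) and an already connected TCP socket; set_cork and send_corked are hypothetical helper names, and short writes are treated as errors only to keep the example small.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set (on = 1) or clear (on = 0) TCP_CORK.
 * Returns 0 on success, -1 on failure. Linux-only. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Hypothetical helper: send a header and a body as one corked burst
 * on a connected TCP socket. Returns 0 on success, -1 on failure. */
int send_corked(int fd, const void *hdr, size_t hdr_len,
                const void *body, size_t body_len)
{
    int rc = 0;

    if (set_cork(fd, 1) < 0)
        return -1;

    /* While corked, the kernel does not send partial segments. */
    if (write(fd, hdr, hdr_len) != (ssize_t)hdr_len ||
        write(fd, body, body_len) != (ssize_t)body_len)
        rc = -1;

    /* Clearing the option flushes any partial segment still queued. */
    if (set_cork(fd, 0) < 0)
        rc = -1;

    return rc;
}
```

Whether this beats plain writes with Nagle left on depends on your traffic pattern, so measure both.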



Make sure you do thorough testing with your application and infrastructure.

