Friday, April 17, 2015

networking - How do I know if my linux server can't keep up with network traffic and what to do about it?




Suppose I have a web-server serving html-pages under high load. Let's assume that for some reason the bottleneck is not cpu, not ram and not disk but rather the network itself. How can I tell that a Linux server simply either sends or receives too much traffic and won't keep up? How do I know if network bandwidth is, let's say, over 60% of its capacity? If it's over capacity how do I scale it?


Answer



In general, analyze the entire system to find where the limits are. For example, the USE method checks every resource for utilization, saturation, and errors.



Every environment can collect basic, easy-to-measure performance metrics such as CPU utilization and interface bandwidth utilization. On Linux, tools like netdata or perf can show many more metrics in fine detail.
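To answer the "over 60% of its capacity" part directly, here is a minimal Python sketch that turns the kernel's byte counters into a utilization percentage. It assumes a Linux host, an interface named eth0 (a placeholder, yours may differ), and that /sys/class/net/<iface>/speed reports a usable link speed in Mbit/s, which is often not the case on virtual NICs. The function names are made up for illustration.

```python
import time
from pathlib import Path


def parse_proc_net_dev(text):
    """Return {iface: (rx_bytes, tx_bytes)} from the contents of /proc/net/dev."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def utilization_percent(bytes_before, bytes_after, seconds, link_mbit):
    """Percent of link capacity used in one direction over the interval."""
    bits_per_second = (bytes_after - bytes_before) * 8 / seconds
    return 100.0 * bits_per_second / (link_mbit * 1_000_000)


def sample(iface="eth0", interval=1.0):
    """Measure (rx, tx) utilization of iface over `interval` seconds.

    Assumption: the speed file holds a positive number of Mbit/s.
    On some virtual interfaces it is -1 or cannot be read at all.
    """
    link_mbit = int(Path(f"/sys/class/net/{iface}/speed").read_text())
    before = parse_proc_net_dev(Path("/proc/net/dev").read_text())[iface]
    time.sleep(interval)
    after = parse_proc_net_dev(Path("/proc/net/dev").read_text())[iface]
    rx = utilization_percent(before[0], after[0], interval, link_mbit)
    tx = utilization_percent(before[1], after[1], interval, link_mbit)
    return rx, tx
```

Calling sample("eth0", 5.0) returns the receive and transmit utilization averaged over five seconds. Keep in mind that the NIC's link speed is only an upper bound: a slower switch or uplink further along the path lowers the real ceiling.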



A deep understanding of your environment helps you find the bottleneck. Bandwidth that tops out at 95 Mbit/s may point to an old 100 Mbit/s switch in the path, or to an Internet service that is limited to 100 Mbit/s. Or the storage system may be slow. Or the NICs may be reporting overruns because packet buffers aren't being emptied fast enough.
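For the errors and saturation side, one rough approach is to watch whether the per-interface error and drop counters keep growing. The sketch below reads a few of the counter files that Linux exposes under /sys/class/net/<iface>/statistics/. Which counters a driver actually fills in varies, so the list here is an assumed starting set, and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed starting set; not every driver populates every counter.
COUNTERS = (
    "rx_errors",
    "tx_errors",
    "rx_dropped",
    "tx_dropped",
    "rx_fifo_errors",
    "rx_missed_errors",
)


def read_counters(stats_dir):
    """Read the counters that exist in a statistics directory into a dict."""
    stats_dir = Path(stats_dir)
    result = {}
    for name in COUNTERS:
        path = stats_dir / name
        if path.exists():
            result[name] = int(path.read_text())
    return result


def growing(before, after):
    """Return only the counters that increased, with the size of the increase."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }
```

Take one reading with read_counters("/sys/class/net/eth0/statistics"), wait a while under load, take a second reading, and pass both to growing(). A non-empty result during peak traffic suggests the host or NIC is dropping packets, which is worth investigating even when bandwidth utilization looks moderate.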



Where you can, try scaling out the web servers to more hosts on different hardware. The combined resources of more than one VM may help. You can change one host at a time and keep another unchanged as a control. As a bonus, load balancing can also serve as a high availability feature.

