Sunday, February 19, 2017

apache 2.2 - How to track down an I/O-bound bottleneck

I'm currently optimizing a web server, but I'm stuck on one particular problem.
I'm using JMeter to simulate load. JMeter is configured as follows:




  • 400 threads

  • Ramp-up 30 seconds

  • Loop count 1

  • Each thread visits 17 different pages on the server with a 1 - 5 second delay between each request.
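To put rough numbers on this test plan, here is a back-of-the-envelope sketch using Little's law for a closed-loop load test. It assumes the 1 - 5 second delay averages 3 seconds (the distribution is not stated above), ignores the ramp-up and the fact that each thread stops after 17 pages, and uses made-up response times; `steady_state` is a hypothetical helper name.

```python
def steady_state(threads, think_time_s, response_time_s):
    """Closed-loop estimate: every thread alternates between thinking and waiting.

    Returns (requests per second, requests in flight on the server).
    """
    cycle = think_time_s + response_time_s      # one think + one request
    throughput = threads / cycle                # requests per second
    in_flight = throughput * response_time_s    # Little's law: L = lambda * W
    return throughput, in_flight


if __name__ == "__main__":
    # Response times below are illustrative assumptions, not measurements.
    for response_time in (0.5, 3.0, 10.0):
        rate, busy = steady_state(400, 3.0, response_time)
        print(f"response {response_time:4.1f}s: "
              f"{rate:6.1f} req/s, {busy:6.1f} requests in flight")
```

The point of the estimate is that the number of requests in flight is not fixed by the thread count alone: as responses get slower, more of the 400 threads are waiting on the server at the same moment, so concurrency can climb sharply even though the request rate falls.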




What I'm seeing is that up to 350 threads, everything seems to work as it should.
The load and CPU usage increase, and the site becomes noticeably slower but is still usable.



Somewhere between 350 and 400 threads, however, something happens. The load drops to nearly nothing,
the CPU is about 75 - 85% idle, and the site hangs for several minutes for everyone.



What I have ruled out:





  • The server does not swap; at least, no swapping shows up in top or in the collectd graphs.

  • There are no MySQL queries waiting to finish (as reported by MySQL Administrator), although I'm seeing a lot of open connections.

  • max_connections in MySQL is 1600 (1 MySQL connection per request, so this limit is far from reached)

  • I/O wait is practically nonexistent in the CPU graphs (collectd)

  • We are using memcached, but the timeout is set to 1 second.

  • memcached is run on the same server so network latency should not be an issue.

  • MaxClients and ServerLimit are not reached in Apache
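One way to double-check the MaxClients point during the hang is to look at what the Apache workers are actually doing. The sketch below assumes mod_status is enabled and that its machine-readable output (normally served at /server-status?auto) has been saved as text; neither is mentioned above. `summarize_scoreboard` is a hypothetical helper, and the sample status text is invented for illustration.

```python
from collections import Counter

# Scoreboard characters as documented for mod_status.
SCOREBOARD_KEYS = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def summarize_scoreboard(status_text):
    """Count worker states from the 'Scoreboard:' line of mod_status output."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            counts = Counter(board)
            # Unknown characters are kept under their raw symbol.
            return {SCOREBOARD_KEYS.get(ch, ch): n for ch, n in counts.items()}
    raise ValueError("no Scoreboard line found")


if __name__ == "__main__":
    # Invented sample, not output from the server in question.
    sample = "BusyWorkers: 3\nIdleWorkers: 2\nScoreboard: _W_KW.."
    for state, count in sorted(summarize_scoreboard(sample).items()):
        print(f"{count:4d}  {state}")
```

If most workers sit in "sending reply" while the CPU is idle, they are probably blocked on something outside Apache (a backend connection, a lock, a socket); if most are in "keepalive (read)", the slots may be held by idle client connections instead.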




I'm running out of ideas for how to track this issue down.
Any tips, tricks or ideas to help pin down the cause?



Thanks
