Tuesday, December 27, 2016

performance - How can I tell which page is creating a high-CPU-load httpd process?



I have a LAMP server (CentOS-based MediaTemple (DV) Extreme with 2GB RAM) running a customized WordPress+bbPress combination.




At about 30k pageviews per day the server is starting to groan. It stumbled earlier today for about 5 minutes when there was an influx of traffic. Even under normal conditions I can see that the virtual server is sometimes at 90%+ CPU load. Using top I can often see 5-7 httpd processes that are each using 15-30% (and sometimes even 50%) CPU.



Before we do a big optimization pass (our use of MySQL is probably the culprit) I would love to find the pages that are the main offenders and deal with them first. Is there a way that I can find out which specific requests were responsible for the most CPU-hungry httpd processes? I have found a lot of info on optimization in general, but nothing on this specific question.



Secondly, I know there are a million variables, but if you have any insight on whether we should be at the boundaries of performance with a single dedicated virtual server with a site of this size, then I would love to hear your opinion. Should we be thinking about moving to a more powerful server, or should we be focused on optimization on the current server?


Answer



strace is a good way to start debugging this kind of problem. Attach strace to the PID of one of the Apache processes that is consuming the most CPU:



strace -f -t -o strace.output -p PID



This will show you the system calls made by that process (and, because of "-f", by any children it forks). Take a look at strace.output and see what the process was doing; it may show you where the process is hanging. The "-t" flag is very important here, as it prefixes each line of the strace output with the time of day. Look for a large jump between consecutive timestamps.
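As a rough sketch of the two manual steps above (choosing a PID and scanning the output for jumps), the helpers below may save some time. The function names busiest_pids and find_gaps are made up for this example. They assume the procps version of ps, that Apache runs under the process name httpd, and that each line of strace.output starts with "PID HH:MM:SS", which is what the -f, -t and -o flags together should produce.

```shell
# Print the PIDs of the busiest processes with a given command name,
# highest %CPU first. Assumes the procps ps found on CentOS.
busiest_pids() {
  name=${1:-httpd}   # Apache's process name on CentOS (assumption)
  count=${2:-3}      # how many PIDs to print
  ps -C "$name" -o pid= --sort=-pcpu | head -n "$count" | tr -d ' '
}

# Report lines in an strace -f -t -o output file that start at least MIN
# seconds after the previous line from the same PID.
# Assumes every line starts with "PID HH:MM:SS"; other lines are skipped.
find_gaps() {
  awk -v min="${2:-2}" '
    $2 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
      split($2, t, ":")
      now = t[1] * 3600 + t[2] * 60 + t[3]
      if (($1 in last) && now - last[$1] >= min)
        printf "%ds gap before: %s\n", now - last[$1], $0
      last[$1] = now
    }' "$1"
}
```

For example: strace -f -t -o strace.output -p "$(busiest_pids httpd 1)", followed later by find_gaps strace.output 2. Note that ps reports %CPU averaged over the lifetime of the process, so its ranking can differ from what top shows at a given moment. A gap means either that the previous system call blocked or that the process spent the time computing in user space; a trace that crosses midnight is not handled.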



On the other hand, since you suspect MySQL is the culprit, I'd enable the slow query log, take a look at it, and try to optimize those queries. The MySQL documentation has more information about the slow query log.
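A minimal sketch of how that could look from the shell, assuming MySQL 5.1 or newer (where slow_query_log and long_query_time can be set at runtime) and an account allowed to change global variables. The function names and the 1-second default threshold are arbitrary choices for illustration, and the log path is left as a parameter because it differs between installations.

```shell
# Turn on the slow query log at runtime. Assumes MySQL 5.1 or newer.
# The threshold (in seconds) defaults to 1, an arbitrary starting point.
# The change is lost on restart unless it is also added to my.cnf, and
# the new threshold only applies to connections opened afterwards.
enable_slow_log() {
  mysql -e "SET GLOBAL slow_query_log = 'ON'; SET GLOBAL long_query_time = ${1:-1};"
}

# Summarize the log with mysqldumpslow, which ships with MySQL:
# sort by query time (-s t) and show the top 10 patterns (-t 10).
summarize_slow_log() {
  mysqldumpslow -s t -t 10 "$1"
}
```

After enable_slow_log has been in effect for a while under normal traffic, summarize_slow_log with the path to the log file gives a short list of query patterns to look at first.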



Also, don't forget to take a look at your web server's log files.
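One quick way to get an overview from the access log is to count requests per URL. The sketch below assumes Apache's common or combined log format, where the request path is the seventh whitespace-separated field; top_requests is a made-up name, and the log location depends on your setup.

```shell
# Print the N most frequently requested paths in an Apache access log,
# most frequent first. Assumes the common/combined log format, in which
# field 7 is the request path.
top_requests() {
  awk '{ print $7 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

Request counts are not the same thing as CPU cost, so treat the result as a list of candidates rather than a verdict. If the LogFormat is extended with %D (time taken to serve the request, in microseconds), the same approach could sum time per path instead of counting hits.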



Regarding your second question, I think it's hard to tell with only this info. Separating the frontend (webserver) from the backend (database) is always a good practice if you have the budget for it. On the other hand, I think that before adding more hardware, one should focus on trying to optimize the performance using the current hardware. Otherwise, the problem is probably just being postponed.




Hope this helps.

