Our production web server has gone down a few times over the last six months. Each time, we've had to contact the web host and have them restart it, since I can't even SSH in. This appears to affect only the web server and not the MySQL database server, which is a separate machine. When it happens, all hosted websites time out.
I'd like to examine web server optimization/corrections to get to the root of this issue. Any recommendations on how to proceed with that? I'm sure log files would play a role. I'm able to find my way around a Linux-based server and make needed changes, but would be interested in any tips I may not have thought of yet. It may be best for us to speak with an outside consultant as another option.
Thanks.
Answer
This sounds like a classic case of swapping. If you have any metrics/monitoring system available at all, check its memory reports (sar, Cacti, Munin, etc.). If not, it's time to pick one and set it up.
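If no monitoring is in place yet, a quick spot check of swap usage can at least hint at whether this theory holds. The following is a minimal sketch, assuming a Linux box that exposes `/proc/meminfo` with the usual `SwapTotal` and `SwapFree` fields; the function names are made up for illustration, and it is no substitute for a real monitoring system that records history.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_fraction(meminfo):
    """Return the fraction of swap in use, or 0.0 if no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - meminfo.get("SwapFree", 0)) / total


if __name__ == "__main__":
    # Assumes a Linux system; /proc/meminfo does not exist elsewhere.
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print("Swap in use: {:.0%}".format(swap_used_fraction(info)))
```

A single reading says little on its own. Heavy swap use that grows as traffic grows, shortly before the box becomes unreachable, is the pattern to look for.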
Odds are it's the simple case of (number of Apache children) x (average memory size of an Apache child) > available memory. You can attack this in several ways. First, see if you can trim down your PHP scripts. Don't go nuts, but if there are some simple include/require/classloader fixes you can make, you might be able to cut their footprint in half with a quick afternoon's worth of profiling work. After that, take whatever your average Apache child size is, work out how many children would fill up all available RAM, then back off ~20% and make that your MaxClients setting.
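The arithmetic above can be sketched in a few lines. This is only an illustration: the function name and the figures in the usage example are hypothetical, and you would plug in the memory actually available to Apache and the average child size you measure on your own server. (In Apache 2.4 the MaxClients directive was renamed MaxRequestWorkers.)

```python
import math


def max_clients(available_mb, avg_child_mb, headroom=0.20):
    """Estimate a MaxClients value.

    available_mb: RAM (in MB) that Apache children may use in total.
    avg_child_mb: measured average size (in MB) of one Apache child.
    headroom:     fraction to back off from the theoretical maximum (~20%).
    """
    if avg_child_mb <= 0:
        raise ValueError("avg_child_mb must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    # Children that would fill all available RAM, reduced by the headroom.
    return math.floor(available_mb * (1 - headroom) / avg_child_mb)


if __name__ == "__main__":
    # Hypothetical figures: 4096 MB for Apache, 50 MB per child.
    print("Suggested MaxClients:", max_clients(4096, 50))
```

Rounding down is deliberate, since overshooting by even one child defeats the purpose of the safety margin.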