Wednesday, November 9, 2016

apache 2.2 - Website performance sluggish - High CPU and Disk Usage

Over the past couple of days my website's performance has been very sluggish, with queries taking a long time to execute. CPU usage hit around 100% four times this week. Here is the output of top during one of those spikes:




top - 00:08:03 up 3 days, 21:47,  2 users,  load average: 6.06, 1.95, 0.84
Tasks: 92 total, 2 running, 90 sleeping, 0 stopped, 0 zombie
%Cpu(s): 86.1 us, 12.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st
KiB Mem: 1017948 total, 773520 used, 244428 free, 107200 buffers
KiB Swap: 0 total, 0 used, 0 free. 257228 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28433 www-data 20 0 854660 69288 5608 S 98.7 6.8 0:47.36 apache2
28469 www-data 20 0 529692 7692 3012 S 0.7 0.8 0:00.13 apache2
28514 root 20 0 24820 1488 1064 R 0.7 0.1 0:00.08 top
25 root 20 0 0 0 0 S 0.3 0.0 1:00.70 kworker/0:1
28518 postgres 20 0 370016 6984 4276 S 0.3 0.7 0:00.01 postgres
1 root 20 0 33384 1288 0 S 0.0 0.1 0:11.70 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.13 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:09.40 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root 20 0 0 0 0 S 0.0 0.0 0:45.06 rcu_sched
8 root 20 0 0 0 0 R 0.0 0.0 1:54.47 rcuos/0
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/0
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
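
To make the summary line easier to read, here is a minimal Python sketch that splits top's `%Cpu(s)` line into its labelled percentages (`us` is user time, `sy` is system time, `wa` is I/O wait). The function name `parse_cpu_line` is made up for illustration, and the regular expression assumes the line layout shown above.

```python
import re

def parse_cpu_line(line):
    """Map top's two-letter CPU state labels (us, sy, id, wa, ...) to percentages."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)\s+([a-z]{2})", line)
    return {label: float(value) for value, label in pairs}

# The %Cpu(s) line copied from the top output above.
cpu = parse_cpu_line(
    "%Cpu(s): 86.1 us, 12.9 sy, 0.0 ni, 0.0 id, 0.0 wa, 1.0 hi, 0.0 si, 0.0 st"
)

busy = cpu["us"] + cpu["sy"]   # time spent running user and kernel code
io_wait = cpu["wa"]            # time spent idle while waiting on disk I/O
print("user+system: %.1f%%, I/O wait: %.1f%%" % (busy, io_wait))
```

In this one snapshot almost all of the time is user and system time and none is I/O wait, which suggests the spike is CPU work rather than the disk, although a single sample cannot rule out disk pressure at other moments.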


Apache seems to be using a lot of CPU, but I have no idea why. It was working perfectly until a couple of days ago. I've optimized Apache by removing unused modules and tuned it to keep only a small number of spare children running, but that doesn't seem to have made a difference. I've also installed mod_evasive and mod_qos to protect against DDoS attacks. Here is my Apache configuration:



Timeout 30
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5



StartServers 1
MinSpareServers 1
MaxSpareServers 3
MaxClients 10
MaxRequestsPerChild 3000



StartServers 1

MinSpareThreads 5
MaxSpareThreads 15
ThreadLimit 25
ThreadsPerChild 5
MaxClients 25
MaxRequestsPerChild 200



StartServers 1

MinSpareThreads 5
MaxSpareThreads 15
ThreadLimit 25
ThreadsPerChild 5
MaxClients 25
MaxRequestsPerChild 200



MS_METHODS POST,PUT,OPTIONS,CONNECT

MS_WhiteList /etc/spamhaus.wl
MS_CacheSize 256



Here is my VirtualHost configuration:





RewriteEngine On

RewriteCond %{HTTP_HOST} ^example.com [nocase]
RewriteRule ^(.*) http://www.example.com$1 [last,redirect=301]

ServerName example.com
ServerAlias www.example.com
ServerAdmin admin@example.com

WSGIDaemonProcess example python-path=/home/abc/example:/home/abc/example/env/lib/python2.7/site-packages
WSGIProcessGroup example
WSGIApplicationGroup %{GLOBAL}

WSGIScriptAlias / /home/abc/example/wsgi.py

DocumentRoot /home/abc/example


Require all granted


Alias /static/ /home/abc/example/static/



Order deny,allow
Allow from all


Alias /media/ /home/abc/example/media/


Order deny,allow
Allow from all



ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined




Here is my .htaccess file:





Header set Cache-Control "max-age=31536000, public"



Header set Cache-Control "max-age=604800, public"



AddType application/javascript js

AddType application/vnd.ms-fontobject eot
AddType application/x-font-ttf ttf ttc
AddType font/opentype otf
AddType application/x-font-woff woff
AddType image/svg+xml svg svgz
AddEncoding gzip svgz



AddOutputFilterByType DEFLATE text/html text/plain text/css application/json

AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE text/xml application/xml text/x-component
AddOutputFilterByType DEFLATE application/xhtml+xml application/rss+xml application/atom+xml
AddOutputFilterByType DEFLATE image/x-icon image/svg+xml application/vnd.ms-fontobject application/x-font-ttf font/opentype



I'm using memcached to cache most of the queries. Pages with only a few basic queries are faster (though still not as fast as before), while pages with complex queries take a long time. The server response time for such pages has increased from 0.2 seconds to 4 seconds (measured using Google PageSpeed Insights).
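
One way to find out which pages account for the jump from 0.2 to 4 seconds is to time each request inside the application itself. The following is a rough sketch of a WSGI wrapper that records slow requests. `TimingMiddleware`, its `threshold` parameter, and `demo_app` are hypothetical names, not part of Django or mod_wsgi, and the timer only covers the call into the application, not the streaming of the response body.

```python
import logging
import time

logger = logging.getLogger("slow_requests")

class TimingMiddleware(object):
    """Wrap a WSGI app and record requests that take at least `threshold` seconds."""

    def __init__(self, app, threshold=1.0):
        self.app = app
        self.threshold = threshold
        self.slow = []  # (path, seconds) pairs kept for later inspection

    def __call__(self, environ, start_response):
        start = time.time()
        try:
            return self.app(environ, start_response)
        finally:
            elapsed = time.time() - start
            if elapsed >= self.threshold:
                path = environ.get("PATH_INFO", "")
                self.slow.append((path, elapsed))
                logger.warning("%s took %.3fs", path, elapsed)

# Stand-in application that simulates a slow page (illustration only).
def demo_app(environ, start_response):
    time.sleep(0.05)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]

wrapped = TimingMiddleware(demo_app, threshold=0.01)
body = wrapped({"PATH_INFO": "/demo/"}, lambda status, headers: None)
print(wrapped.slow)
```

In a Django project the same wrapper could, in principle, be applied in wsgi.py around the object returned by `get_wsgi_application()`, so that the slow paths appear in the Apache error log.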



I'm using a PostgreSQL 9.3 database. Here is my postgresql.conf, tuned using PgTune:




default_statistics_target = 50
maintenance_work_mem = 60MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 704MB
work_mem = 6MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 240MB

max_connections = 80
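
As a rough sanity check on these values against the 1 GB of RAM, the arithmetic below estimates how much memory PostgreSQL could claim if every allowed connection ran one `work_mem`-sized sort or hash at the same time. This is only a sketch under that assumption: a single complex query can use `work_mem` several times over, and the total comes from the `KiB Mem` line of the top output above.

```python
KIB_PER_MIB = 1024

# Values copied from the postgresql.conf above.
shared_buffers_kib = 240 * KIB_PER_MIB
work_mem_kib = 6 * KIB_PER_MIB
max_connections = 80

# "KiB Mem: 1017948 total" from the top output.
total_ram_kib = 1017948

# Assumption: one work_mem-sized operation per connection, all at once.
estimate_kib = shared_buffers_kib + max_connections * work_mem_kib

print("PostgreSQL estimate: %d MiB" % (estimate_kib // KIB_PER_MIB))
print("Total RAM:           %d MiB" % (total_ram_kib // KIB_PER_MIB))
print("Share of RAM:        %.0f%%" % (100.0 * estimate_kib / total_ram_kib))
```

Under that assumption the database alone could take most of the machine, which leaves little for Apache, the Python processes, and memcached on a box with no swap.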


Here are the graphs of CPU, disk, and bandwidth usage over the past month:



Monthly CPU Usage
Monthly Disk Usage
Monthly Bandwidth Usage



Although bandwidth shows an increase over the past week or so, actual traffic hasn't increased. I have been getting an average of 1500 visitors per day for the past 15-20 days. The increased bandwidth usage could be due to an increase in bot activity.
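
The bot theory can be checked against the access log, which the VirtualHost above writes in Apache's combined format. Here is a minimal sketch that sums response bytes per User-Agent. The sample log lines are invented for illustration, `bytes_by_agent` is a hypothetical helper, and the pattern does not handle user agents that contain escaped quotes.

```python
import re
from collections import Counter

# Apache "combined" format: host ident user [time] "request" status bytes "referer" "agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def bytes_by_agent(lines):
    """Sum the response size per User-Agent; a size of "-" counts as 0 bytes."""
    totals = Counter()
    for line in lines:
        match = COMBINED.match(line)
        if not match:
            continue  # skip lines that are not in combined format
        size = match.group("size")
        totals[match.group("agent")] += 0 if size == "-" else int(size)
    return totals

# Invented sample lines; real input would be the lines of access.log.
sample = [
    '203.0.113.5 - - [09/Nov/2016:00:08:03 +0000] "GET / HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [09/Nov/2016:00:08:04 +0000] "GET /static/app.js HTTP/1.1" 200 2048 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [09/Nov/2016:00:08:05 +0000] "GET / HTTP/1.1" 304 - "-" "Mozilla/5.0"',
]

totals = bytes_by_agent(sample)
for agent, size in totals.most_common():
    print(agent, size)
```

If a handful of crawler user agents dominate the totals for the past week, that would support the bot explanation for the bandwidth increase.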




My website is a Django application hosted on a droplet with the following configuration: 1 GB RAM, 30 GB SSD, Ubuntu 14.04 x64. I have tried everything I could think of and cannot for the life of me figure out what's wrong. I am not very good at managing servers, and the only thing I can think of now is to switch from Apache to nginx and from PostgreSQL to MySQL. Any suggestions that help me figure out how to fix this would be greatly appreciated.
