My server has been showing a very high load average for no obvious reason. Could anyone explain the cause, or how to debug it further?
One Minute - 22.9
Five Minutes - 17.98
Fifteen Minutes - 10.02
top - 20:34:28 up 22 days, 7:51, 0 users, load average: 22.55, 22.49, 14.51
Tasks: 131 total, 3 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.0%sy, 0.0%ni, 98.6%id, 1.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097152k total, 596576k used, 1500576k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11854 root 18 0 2444 980 720 R 2.0 0.0 0:00.01 top
11856 root 18 0 2444 988 720 R 2.0 0.0 0:00.01 top
1 root 15 0 2156 592 564 S 0.0 0.0 0:10.26 init
3393 apache 18 0 50276 33m 1888 S 0.0 1.6 0:00.00 httpd
3445 pegpro 18 0 17872 3304 2368 D 0.0 0.2 0:00.03 php-cgi
3446 root 18 0 5040 1056 852 S 0.0 0.1 0:00.00 crond
3723 apache 15 0 50276 33m 1896 S 0.0 1.6 0:00.01 httpd
3735 pegpro 18 0 17872 3308 2368 D 0.0 0.2 0:00.03 php-cgi
3752 root 18 0 9152 2068 1740 S 0.0 0.1 0:00.01 dataskq
3956 root 18 0 5040 1128 852 S 0.0 0.1 0:00.00 crond
5138 root 18 0 20380 15m 1712 S 0.0 0.8 0:00.05 lfd
5279 root 18 0 9152 2084 1752 S 0.0 0.1 0:00.05 dataskq
5331 root 18 0 5040 1108 852 S 0.0 0.1 0:00.00 crond
5496 admin 18 0 17872 3308 2368 D 0.0 0.2 0:00.01 php-cgi
5637 root 18 0 9152 2080 1752 S 0.0 0.1 0:00.01 dataskq
5641 apache 16 0 50276 33m 1896 S 0.0 1.6 0:00.03 httpd
5648 root 18 0 49988 33m 2036 S 0.0 1.6 0:00.67 httpd
5702 apache 18 0 50280 33m 1820 S 0.0 1.6 0:00.03 httpd
5851 admin 18 0 17872 3304 2368 D 0.0 0.2 0:00.01 php-cgi
7256 mail 16 0 10364 2700 2176 D 0.0 0.1 0:00.02 exim
7287 apache 15 0 50276 33m 1876 S 0.0 1.6 0:00.00 httpd
7379 root 18 0 5040 1128 860 S 0.0 0.1 0:00.02 crond
7474 apache 16 0 50280 33m 1836 S 0.0 1.6 0:00.00 httpd
One Minute - 22.9
Five Minutes - 17.98
Fifteen Minutes - 10.02
top - 20:34:28 up 22 days, 7:51, 0 users, load average: 22.51, 22.49, 14.55
Tasks: 131 total, 3 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.0%sy, 0.0%ni, 98.6%id, 1.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2097152k total, 596576k used, 1500576k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11856 root 18 0 2444 988 720 R 2.0 0.0 0:00.01 top
1 root 15 0 2156 592 564 S 0.0 0.0 0:10.26 init
3393 apache 18 0 50276 33m 1888 S 0.0 1.6 0:00.00 httpd
3445 pegpro 18 0 17872 3304 2368 D 0.0 0.2 0:00.03 php-cgi
3446 root 18 0 5040 1056 852 S 0.0 0.1 0:00.00 crond
3723 apache 15 0 50276 33m 1896 S 0.0 1.6 0:00.01 httpd
3735 pegpro 18 0 17872 3308 2368 D 0.0 0.2 0:00.03 php-cgi
3752 root 18 0 9152 2068 1740 S 0.0 0.1 0:00.01 dataskq
3956 root 18 0 5040 1128 852 S 0.0 0.1 0:00.00 crond
5138 root 18 0 20380 15m 1712 S 0.0 0.8 0:00.05 lfd
5279 root 18 0 9152 2084 1752 S 0.0 0.1 0:00.05 dataskq
5331 root 18 0 5040 1108 852 S 0.0 0.1 0:00.00 crond
5496 admin 18 0 17872 3308 2368 D 0.0 0.2 0:00.01 php-cgi
5637 root 18 0 9152 2080 1752 S 0.0 0.1 0:00.01 dataskq
5641 apache 16 0 50276 33m 1896 S 0.0 1.6 0:00.03 httpd
5648 root 18 0 49988 33m 2036 S 0.0 1.6 0:00.67 httpd
5702 apache 18 0 50280 33m 1820 S 0.0 1.6 0:00.03 httpd
5851 admin 18 0 17872 3304 2368 D 0.0 0.2 0:00.01 php-cgi
7256 mail 16 0 10364 2700 2176 D 0.0 0.1 0:00.02 exim
7287 apache 15 0 50276 33m 1876 S 0.0 1.6 0:00.00 httpd
7379 root 18 0 5040 1128 860 S 0.0 0.1 0:00.02 crond
7474 apache 16 0 50280 33m 1836 S 0.0 1.6 0:00.00 httpd
7550 apache 18 0 50276 33m 1924 S 0.0 1.6 0:00.00 httpd
Answer
If you look at both top outputs, you'll notice a fair number of processes in state 'D' (the S column), namely the php-cgi processes and exim. 'D' is uninterruptible sleep, which almost always means the process is blocked waiting for I/O, usually disk. On Linux, the load average counts both runnable processes (running or waiting for CPU) and processes in uninterruptible sleep, so the load can be high even while the CPU is almost entirely idle, as it is here (98.6% id). It looks as if the demand placed on your server has probably saturated the available I/O subsystem. You can verify this with iostat, which is part of the sysstat package (install it if it is not already on your system). Then run:
# iostat -x 1
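and watch the %util column, which shows the percentage of time the device was busy. Values that stay close to 100% suggest the device is saturated.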
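To see which processes are stuck in state 'D' at a given moment, without waiting for them to appear in top, something like the following may help. It is only a minimal sketch: the function names count_by_state and list_blocked are made up for illustration, and it assumes a procps-style ps and a POSIX awk are available.

```shell
# Count processes by the first letter of their state (R, S, D, ...).
# Reads lines whose first field is a ps STAT value, e.g. from `ps -eo stat=`.
count_by_state() {
  awk '{ s = substr($1, 1, 1); n[s]++ } END { for (s in n) print s, n[s] }' | sort
}

# List processes currently in uninterruptible sleep (state D), keeping the
# header line. wchan is the kernel function the process is sleeping in,
# which can hint at what kind of I/O it is blocked on.
list_blocked() {
  ps -eo stat,pid,user,wchan,comm | awk 'NR == 1 || $1 ~ /^D/'
}
```

For example, `ps -eo stat= | count_by_state` prints one line per state with a count, and `list_blocked` shows the blocked processes themselves. If the number of 'D' processes stays roughly in line with the load average, that supports the I/O explanation.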