We use EC2 Auto Scaling and recently decided to switch our instance type from m2.2xlarge to c1.xlarge (High-Memory to High-CPU), because average RAM usage per instance is about 2 GB, so we don't need the 34 GB that m2.2xlarge provides, and getting the extra CPU power of c1.xlarge for the same price seemed like a good idea.
But after switching to c1.xlarge, we ran into the following issues:
- Load average climbed to 50, while CPU Utilization dropped from 70% to 60%.
- Scaling in from 6 instances to 4 doesn't affect the CPU Utilization CloudWatch metric.
- Response times became very slow, and instances were constantly being replaced by Auto Scaling because they failed the ELB health check.
- Auto Scaling reduced the number of instances from 8 to 4 because CPU Utilization dropped.
Can you explain what might be causing this behavior and what I can do about it?
EC2 Instance Types Info:
High-Memory Double Extra Large Instance
34.2 GB of memory
13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
850 GB of instance storage
64-bit platform
I/O Performance: High
API name: m2.2xlarge
High-CPU Extra Large Instance
7 GB of memory
20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
1690 GB of instance storage
64-bit platform
I/O Performance: High
API name: c1.xlarge
EDIT:
$ iostat -x
Linux 2.6.38-13-virtual 02/17/2012 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
1.34 0.00 0.13 0.02 0.29 98.23
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
xvdap1 0.04 0.09 0.08 0.13 1.50 0.87 22.99 0.01 36.59 23.42 44.75 4.04 0.08
xvdb 0.00 0.00 0.01 0.00 0.03 0.00 9.37 0.00 1.04 0.95 15.00 1.04 0.00
$ iostat
Linux 2.6.38-13-virtual 02/17/2012 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
1.45 0.00 0.14 0.02 0.31 98.08
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
xvdap1 0.21 1.50 0.87 93689 54728
xvdb 0.01 0.03 0.00 1575 8
$ top
top - 05:30:08 up 17:20, 3 users, load average: 15.13, 10.24, 9.66
Tasks: 166 total, 20 running, 146 sleeping, 0 stopped, 0 zombie
Cpu(s): 65.3%us, 4.7%sy, 0.0%ni, 13.5%id, 0.0%wa, 0.0%hi, 0.7%si, 15.8%st
Mem: 7130236k total, 463440k used, 6666796k free, 19100k buffers
Swap: 0k total, 0k used, 0k free, 95136k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6457 ubuntu 20 0 257m 11m 4820 S 24 0.2 0:16.73 apache2
6416 ubuntu 20 0 257m 11m 4820 R 23 0.2 0:17.36 apache2
6375 ubuntu 20 0 257m 11m 4820 R 22 0.2 0:17.62 apache2
6402 ubuntu 20 0 257m 11m 4820 R 22 0.2 0:16.85 apache2
6472 ubuntu 20 0 257m 11m 4820 S 22 0.2 0:08.95 apache2
6311 ubuntu 20 0 257m 11m 4820 S 21 0.2 0:24.91 apache2
6446 ubuntu 20 0 257m 11m 4820 R 21 0.2 0:16.91 apache2
6372 ubuntu 20 0 257m 11m 4820 R 21 0.2 0:17.89 apache2
6460 ubuntu 20 0 257m 11m 4820 R 21 0.2 0:16.73 apache2
6379 ubuntu 20 0 257m 11m 4820 R 20 0.2 0:16.24 apache2
6380 ubuntu 20 0 257m 11m 4820 S 20 0.2 0:17.20 apache2
6450 ubuntu 20 0 257m 11m 4820 S 20 0.2 0:16.89 apache2
6426 ubuntu 20 0 257m 11m 4820 R 20 0.2 0:16.96 apache2
6432 ubuntu 20 0 257m 11m 4820 S 20 0.2 0:17.78 apache2
6433 ubuntu 20 0 257m 11m 4820 R 20 0.2 0:14.37 apache2
6476 ubuntu 20 0 257m 11m 4816 R 20 0.2 0:02.92 apache2
6386 ubuntu 20 0 257m 11m 4824 S 20 0.2 0:17.94 apache2
6475 ubuntu 20 0 257m 11m 4820 S 19 0.2 0:03.41 apache2
6355 ubuntu 20 0 257m 11m 4820 S 19 0.2 0:24.39 apache2
6417 ubuntu 20 0 257m 11m 4820 R 18 0.2 0:16.66 apache2
6455 ubuntu 20 0 257m 11m 4820 R 18 0.2 0:16.27 apache2
6393 ubuntu 20 0 257m 11m 4820 S 18 0.2 0:16.60 apache2
6325 ubuntu 20 0 257m 11m 4820 R 18 0.2 0:25.66 apache2
6403 ubuntu 20 0 257m 11m 4820 S 18 0.2 0:15.61 apache2
6474 ubuntu 20 0 257m 11m 4812 S 18 0.2 0:04.37 apache2
6477 ubuntu 20 0 257m 11m 4800 S 18 0.2 0:01.43 apache2
6315 ubuntu 20 0 257m 11m 4820 S 17 0.2 0:25.27 apache2
6376 ubuntu 20 0 257m 11m 4820 R 17 0.2 0:17.53 apache2
6478 ubuntu 20 0 257m 11m 4800 S 15 0.2 0:00.45 apache2
6359 ubuntu 20 0 257m 11m 4820 R 15 0.2 0:23.60 apache2
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.9G 1.4G 6.1G 19% /
none 3.4G 112K 3.4G 1% /dev
none 3.4G 0 3.4G 0% /dev/shm
none 3.4G 72K 3.4G 1% /var/run
none 3.4G 0 3.4G 0% /var/lock
/dev/xvdb 414G 199M 393G 1% /mnt
XXXX.compute.internal:/share_0
99G 28G 66G 30% /data_0
XXXX.compute.internal:/share_17
99G 30G 64G 33% /data_17
XXXX.compute.internal:/share_13
99G 30G 64G 33% /data_13
XXXX.compute.internal:/share_18
99G 31G 64G 33% /data_18
XXXX.compute.internal:/share_15
99G 28G 66G 30% /data_15
XXXX.compute.internal:/share_10
99G 28G 67G 30% /data_10
XXXX.compute.internal:/share_16
99G 30G 64G 32% /data_16
XXXX.internal:/share_3
99G 29G 66G 31% /data_3
XXXX.compute.internal:/share_11
99G 30G 64G 32% /data_11
XXXX.compute.internal:/share_7
99G 28G 66G 30% /data_7
XXXX.compute.internal:/share
99G 58G 37G 62% /share
XXXX.compute.internal:/share_2
99G 28G 66G 30% /data_2
XXXX.compute.internal:/share_8
99G 28G 67G 30% /data_8
XXXX.compute.internal:/share_19
99G 28G 66G 30% /data_19
XXXX.compute.internal:/share_14
99G 31G 64G 33% /data_14
XXXX.compute.internal:/share_5
99G 28G 66G 30% /data_5
XXXX.compute.internal:/share_6
99G 28G 67G 30% /data_6
XXXX.compute.internal:/share_1
99G 28G 66G 30% /data_1
XXXX.compute.internal:/share_12
99G 31G 64G 33% /data_12
XXXX.compute.internal:/share_4
99G 29G 66G 31% /data_4
XXXX.compute.internal:/share_9
99G 28G 66G 30% /data_9
$ free -g
total used free shared buffers cached
Mem: 6 0 6 0 0 0
-/+ buffers/cache: 0 6
Swap: 0 0 0
$ sar 1
Linux 2.6.38-13-virtual 02/17/2012 _x86_64_ (8 CPU)
05:33:02 AM CPU %user %nice %system %iowait %steal %idle
05:33:03 AM all 69.27 0.00 5.90 0.00 13.83 11.00
05:33:04 AM all 70.88 0.00 7.62 0.00 16.50 5.01
05:33:05 AM all 64.41 0.00 5.35 0.00 17.90 12.34
05:33:06 AM all 66.41 0.00 9.16 0.00 13.09 11.34
05:33:07 AM all 74.55 0.00 7.06 0.00 11.21 7.17
05:33:08 AM all 62.31 0.00 7.49 0.00 13.38 16.81
05:33:09 AM all 73.65 0.00 5.61 0.00 16.04 4.70
05:33:10 AM all 76.79 0.00 8.20 0.00 9.70 5.31
05:33:11 AM all 70.91 0.00 5.86 0.00 14.21 9.02
05:33:12 AM all 73.95 0.00 6.37 0.00 12.51 7.17
05:33:13 AM all 63.50 0.00 6.03 0.00 17.52 12.95
05:33:14 AM all 61.92 0.00 4.42 0.00 17.66 16.00
05:33:15 AM all 63.56 0.00 6.42 0.00 15.11 14.91
05:33:16 AM all 72.63 0.00 7.51 0.00 14.90 4.97
05:33:17 AM all 60.68 0.00 6.17 0.00 15.09 18.06
$ sar -w 1
Linux 2.6.38-13-virtual 02/17/2012 _x86_64_ (8 CPU)
09:34:23 AM proc/s cswch/s
09:34:24 AM 0.00 4795.00
09:34:25 AM 0.00 4174.00
09:34:26 AM 0.00 4194.23
09:34:27 AM 1.00 3645.00
09:34:28 AM 0.00 4564.00
09:34:29 AM 0.00 4473.00
09:34:30 AM 0.00 4225.00
09:34:31 AM 0.00 4064.36
09:34:32 AM 0.00 4740.00
09:34:33 AM 0.00 4589.22
09:34:34 AM 0.00 3887.00
09:34:35 AM 0.00 4579.00
09:34:36 AM 0.00 4408.00
09:34:37 AM 1.00 4390.00
09:34:38 AM 0.00 4628.00
Answer
Please add sar -w 1 output. I suspect the number of context switches per second is killing your performance, because there are far more runnable processes than available processors, and context switches on a virtual machine are expensive.
If that's the case, there are some kernel tunables that can help you reduce the number of context switches:
- Check the current value with sysctl kernel.sched_min_granularity_ns. Double it with a command similar to sysctl kernel.sched_min_granularity_ns=2000000. Retest. Double it again. Retest. Repeat. Try to find a value that doesn't cripple interactivity too much but doesn't allow too many context switches, then write it to /etc/sysctl.conf so it is set at startup.
- Set the apache2 scheduling policy to SCHED_BATCH by starting it with chrt -b 0 apache2.