Saturday, January 10, 2015

php - MySQL server experiencing high stress/max connections

The Short:
We're hitting max connections and seeing very poor performance from MySQL during peak business hours and cannot determine if it is a load problem or a configuration problem.



The Long:
To begin, here’s our server specs hosted on Digital Ocean:



Cores: 20




Memory:



total used free shared buff/cache available
Mem: 62G 48G 4.7G 216M 10G 14G



MySQL WorkBench Dashboard Snapshot:



MySQL Workbench Dashboard



We typically have over 200 entries in processlist, almost all of which are "Opening tables", "closing tables", and, slightly less frequent, "cleaning up". The cleaning up entries have COMMAND killed and NULL INFO (SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST).

The only queries we really see taking very long are some INSERTS into tables with heavy use, and UPDATE commands with the primary key in the WHERE for tables with heavy us. Their state is “update” in the processlist.



MySQL .cnf settings:



key_buffer_size=1G
myisam_sort_buffer_size=1073741824
max_length_for_sort_data=8388608
max_sort_length=8388608
sort_buffer_size=1073741824
join_buffer_size=1073741824

preload_buffer_size=1073741824
read_buffer_size=1073741824
read_rnd_buffer_size=1073741824
thread_cache_size=450
max_allowed_packet=100M
max_connections=400
lock_wait_timeout=50
wait_timeout=120
table_open_cache_instances=10
table_open_cache=6000

table_definition_cache=6000
innodb_flush_method=O_DIRECT
innodb_buffer_pool_size=46G
innodb_buffer_pool_instances=46


We have 800+ databases on our single live server, and many thousands of users using an app that makes frequent server requests with lots of individual record updates or inserts that need to sync to multiple users.



The Question:




Given the .cnf settings and our server specs, in relation to the MySQL WorkBench dashboard snapshot, should we be experiencing the kind of slowdown/connections we're seeing with this configuration?
We constantly hit our Max Connections and get SQLSTATE[HY000] [2002] Resource temporarily unavailable errors in PHP from PDO.



Does anyone have any suggestions to improve the performance? Different config settings? We know long term solutions, like putting large clients on separate servers, among others—but for the short term, is there any way to optimize our server? Our app and website are barely usable during the rush hours.



Additional Stat:



Linux 4.0.4-301.fc22.x86_64
04/25/2016 _x86_64_ (20 CPU)




avg-cpu: %user %nice %system %iowait %steal %idle



8.51 0.01 0.67 0.28 0.35 90.18



Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
vda
269.78 1681.90 2180.52 37679232421 48849706700

No comments:

Post a Comment

linux - How to SSH to ec2 instance in VPC private subnet via NAT server

I have created a VPC in aws with a public subnet and a private subnet. The private subnet does not have direct access to external network. S...