Sunday, July 7, 2019

linux - Very high RAM buffers usage following instance resize/reboot

Yesterday afternoon, we resized one of our Linode instances (CentOS 5.7, 64-bit) from a 4GB instance to 12GB. Immediately following that reboot, I noticed that memory usage for buffers was insanely high - higher than I've ever seen on any machine I've touched. Even on my most heavily-used servers, I rarely see buffer usage exceed ~200MB. On this server, the current buffer usage is two orders of magnitude higher than before we resized and rebooted.




Here is a Munin memory graph with data from before and after the migration:



[Image: Munin memory graph]



Data that munin is displaying is corroborated by the output of "free":



[erik@host ~]$ free -m
             total       used       free     shared    buffers     cached
Mem:         11967      10146       1820          0       7374       1132
-/+ buffers/cache:       1639      10327
Swap:          255          0        255
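For reference, the "-/+ buffers/cache" line can be reproduced from the "Mem:" line by treating buffers and cache as reclaimable. Below is a minimal sketch in Python using the numbers above. The function name is hypothetical, and the 1 MB differences from the real output are presumably rounding, since free -m converts each value to megabytes separately.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Return (used, free) with buffers and cache counted as free memory.

    All arguments are in the same unit (MB here, as printed by `free -m`).
    """
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Values copied from the "Mem:" line of the output above.
app_used, app_free = buffers_cache_line(used=10146, free=1820, buffers=7374, cached=1132)
print(app_used, app_free)  # 1640 10326 (free reported 1639 and 10327)
```

In other words, applications are using only about 1.6GB; nearly all of the remaining "used" memory is the 7374MB of buffers.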


Now, I'm well aware that the kernel uses otherwise unused memory for cache, but my understanding is that buffers are different: they're used to temporarily store writes until they've been committed to disk. Is that a correct understanding? This server has very little disk I/O (it's an Apache/PHP webserver and the DB is elsewhere, so the only I/O of substance is writing the access logs), and as such, I'd expect buffer usage to be quite low.



Here is a network traffic graph for the same time period:



[Image: network traffic graph]



As you can see, there is no substantive change in traffic before and after the resize.




During the reboot, three things changed that I know of:




  1. We picked up 4 additional cores that Linode gave out earlier this week, bringing the total cores to 8.

  2. We're on the "Latest 64 bit" kernel, which is now 3.7.10-x86_64-linode30. Previously we were on 3.0.18 I believe.

  3. We went from 4GB RAM to 12GB.



Of these changes, my hunch is that the new kernel is causing the increased buffer usage. Unfortunately, at the moment, we can't take another downtime hit to downgrade to an earlier kernel, though that may end up being necessary if I can't get this buffer usage sorted out.




With that, I have a couple questions:




  1. Are any of you running the 3.7.10 kernel and if so, have you seen a similar change?

  2. What tools are available to inspect the kernel buffers and their sizes?

  3. I assume that, like cache, the kernel will release this memory when other applications need it. Is this correct?
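
As a starting point for question 2, the raw counters behind free are exposed in /proc/meminfo. Below is a minimal sketch in Python, assuming the standard "Buffers:" and "Cached:" fields reported in kB. The helper names are hypothetical, and this only shows the sizes, not what the buffers actually contain.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Most fields are in kB; a few (such as HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(rest.split()[0])
    return fields


def buffers_and_cached_mb(fields):
    """Return (buffers, cached) in whole megabytes."""
    return fields["Buffers"] // 1024, fields["Cached"] // 1024


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

On the server itself, buffers_and_cached_mb(read_meminfo()) should return roughly the same buffers and cached values that free -m prints.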
