This is a headless server with 8GB RAM (kernel 3.12). After only a few days of uptime it runs low on memory; in fact, it OOMed a few days ago. Something is eating memory, but I can't tell where.
See the output below. In short:
- 64bit system & OS
- not a hypervisor nor a virtual machine
- low free mem
- swap in use
- low cache
- low buffer
- inactive + active adds up to only about 1GB (?)
- low ipcs
- low shm
- low slab
- ~400MB tmpfs usage (393M on /run, Shmem is 419812 kB)
- total RSS of all processes is only 262MB
- and the summed HWM (peak RSS) of all processes is less than 600MB
- so more than 6GB is unaccounted for
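A side note on the RSS figure: it comes from summing VmRSS over /proc/*/status (the command is in the output below), and VmRSS counts a shared page once for every process that maps it, so the sum can only overestimate. A cross-check is to sum Pss from /proc/*/smaps instead, which splits shared pages between the processes mapping them. The following is a minimal sketch; the function name sum_pss_mb is made up for illustration, and it needs root to read every process.

```shell
#!/bin/sh
# Sum the Pss lines of the given smaps files and print the total in MB.
# With no arguments, every /proc/<pid>/smaps is scanned.
sum_pss_mb() {
    [ "$#" -gt 0 ] || set -- /proc/[0-9]*/smaps
    # Processes may exit while we read; ignore files that vanish or are unreadable.
    cat "$@" 2>/dev/null |
        awk '/^Pss:/ { s += $2 } END { printf "%.1f\n", s / 1024 }'
}
```

Running sum_pss_mb as root prints a single number in MB. Since the RSS sum is already far below 8GB, this will not explain the gap, but it rules out the processes themselves more reliably.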
[root@localhost ~]# cat /proc/meminfo
MemTotal: 8186440 kB
MemFree: 251188 kB
Buffers: 144 kB
Cached: 853548 kB
SwapCached: 9988 kB
Active: 480036 kB
Inactive: 529456 kB
Active(anon): 256196 kB
Inactive(anon): 333072 kB
Active(file): 223840 kB
Inactive(file): 196384 kB
Unevictable: 13656 kB
Mlocked: 0 kB
SwapTotal: 4194300 kB
SwapFree: 4092540 kB
Dirty: 356 kB
Writeback: 0 kB
AnonPages: 161576 kB
Mapped: 50116 kB
Shmem: 419812 kB
Slab: 72680 kB
SReclaimable: 50648 kB
SUnreclaim: 22032 kB
KernelStack: 1824 kB
PageTables: 10260 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8287520 kB
Committed_AS: 1883404 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 91804 kB
VmallocChunk: 34359637332 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 83180 kB
DirectMap2M: 8296448 kB
[root@localhost ~]# ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x01123bac 0 root 600 1000 8
[root@localhost ~]# df -h
Filesystem Size Used Avail Use% Mounted on
tmpfs 4.0G 393M 3.6G 10% /run
[root@localhost ~]# for i in /proc/*/status ; do grep VmRSS $i; done | awk '{ s = s + $2 } END { print s / 1024 }'
262.375
[root@localhost ~]# for i in /proc/*/status ; do grep VmHWM $i; done | awk '{ s = s + $2 } END { print s / 1024 }'
526.77
Edit: I've set vm.overcommit_memory=2 (overcommit disabled) just in case. I rebooted 2 days ago; this is the state now:
[root@localhost linux]# cat /proc/sys/vm/overcommit_memory
2
[root@localhost linux]# df -h | grep tmpfs
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 4.0G 0 4.0G 0% /dev/shm
tmpfs 4.0G 532K 4.0G 1% /run
tmpfs 4.0G 0 4.0G 0% /sys/fs/cgroup
tmpfs 4.0G 0 4.0G 0% /tmp
tmpfs 4.0G 532K 4.0G 1% /var/spool/postfix/run/saslauthd
[root@localhost linux]# for i in /proc/*/status ; do grep VmRSS $i; done | awk '{ s = s + $2 } END { print s / 1024 }'
434.188
[root@localhost linux]# for i in /proc/*/status ; do grep VmHWM $i; done | awk '{ s = s + $2 } END { print s / 1024 }'
545.551
[root@localhost linux]# cat /proc/meminfo
MemTotal: 8186440 kB
MemFree: 146576 kB
Buffers: 1728 kB
Cached: 5212588 kB
SwapCached: 0 kB
Active: 2560112 kB
Inactive: 2874464 kB
Active(anon): 94464 kB
Inactive(anon): 136528 kB
Active(file): 2465648 kB
Inactive(file): 2737936 kB
Unevictable: 9772 kB
Mlocked: 0 kB
SwapTotal: 4194300 kB
SwapFree: 4194300 kB
Dirty: 1436 kB
Writeback: 0 kB
AnonPages: 230032 kB
Mapped: 50540 kB
Shmem: 960 kB
Slab: 316804 kB
SReclaimable: 291712 kB
SUnreclaim: 25092 kB
KernelStack: 1880 kB
PageTables: 11184 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8287520 kB
Committed_AS: 1160812 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 91676 kB
VmallocChunk: 34359582672 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 91372 kB
DirectMap2M: 8288256 kB
So, of the 8GB in use:
- 5GB is cached
- 0.5MB tmpfs
- 450MB RSS
- ~1GB slab + page tables + whatever else shows up in meminfo
That still leaves about 1.5GB unaccounted for. Is this a kernel leak, or what is going on here?
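To make this bookkeeping repeatable, the sum can be scripted. The following is a rough sketch, not an exact accounting: the function name meminfo_unaccounted and the choice of fields (MemFree, Buffers, Cached, SwapCached, AnonPages, Slab, KernelStack, PageTables) are my own assumptions about what counts as "known" memory. It uses AnonPages rather than the RSS sum, so the result differs somewhat from the hand estimate.

```shell
#!/bin/sh
# Report how much of MemTotal is NOT covered by the common meminfo fields.
# Reads the file given as $1, or /proc/meminfo when called without arguments.
meminfo_unaccounted() {
    awk '
        # Field names end in ":"; strip it and remember the value (in kB).
        { sub(/:$/, "", $1); v[$1] = $2 }
        END {
            known  = v["MemFree"] + v["Buffers"] + v["Cached"] + v["SwapCached"]
            known += v["AnonPages"] + v["Slab"]
            known += v["KernelStack"] + v["PageTables"]
            printf "accounted: %d kB\n", known
            printf "unaccounted: %d kB\n", v["MemTotal"] - known
        }
    ' "${1:-/proc/meminfo}"
}
```

Fed with the first meminfo dump it reports 6825232 kB unaccounted (about 6.5GB), and with the second one 2265648 kB (about 2.2GB). Logging this once an hour would show whether the gap grows steadily, which is what a leak in the kernel or a driver would look like.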
Edit2: I see the same issue on another Atom board.
I also checked whether kmemleak reported anything, but it found nothing. I'm out of ideas.
Edit3: Updating to kernel 3.17.2 seems to have resolved the issue, but I still don't know how to trace this kind of memory leak.
Answer
LKML suggested that it might have been https://lkml.org/lkml/2014/10/15/447 , but that patch wasn't in 3.17.2, and the THP allocations don't point that way.
However, /proc/kpageflags might show what kind of allocation each page belongs to, so that might help. tools/vm/page-types.c in the kernel sources should hold some information on the structure of the kpageflags binary output.
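As a starting point that needs no compiling: /proc/kpageflags holds one 64-bit flag word per physical page, so a plain histogram of the distinct words already shows how the pages are distributed. The following is a minimal sketch, assuming GNU od (for -t x8), a kernel built with CONFIG_PROC_PAGE_MONITOR, and root access; the function name kpageflags_histogram is made up for illustration.

```shell
#!/bin/sh
# Print "<page count> <flag word in hex>" for every distinct flag word,
# most frequent first. Reads $1, or /proc/kpageflags when called without arguments.
# Some of the bits (see Documentation/vm/pagemap.txt in the kernel sources):
#   bit 5 = LRU (0x20), bit 7 = SLAB (0x80), bit 10 = BUDDY (0x400)
kpageflags_histogram() {
    od -An -v -t x8 "${1:-/proc/kpageflags}" |  # dump as 64-bit hex words, host byte order
        tr -s ' ' '\n' |                        # one word per line
        grep . |                                # drop the empty lines left by leading blanks
        sort | uniq -c | sort -rn |             # count identical words
        awk '{ print $1, $2 }'                  # normalise the spacing
}
```

Multiply the page counts by the page size (usually 4 kB) to get memory. Large groups of pages that are neither on the LRU nor in slab are the place to look further, although interpreting the words still takes some care; page-types.c decodes them properly.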