Tuesday, August 2, 2016

linux - VmRSS at only about 25%, yet the oom-killer strikes

I have a dedicated MySQL server equipped with 128 GB of RAM. MySQL has recently been getting killed by the oom-killer, even though it is configured to use at most 95 GB. While researching, I came across this:



# cat /proc/11895/status
Name: mysqld
State: S (sleeping)
Tgid: 11895
Pid: 11895
PPid: 24530
TracerPid: 0
Uid: 27 27 27 27
Gid: 27 27 27 27
Utrace: 0
FDSize: 1024
Groups: 27
VmPeak: 72188044 kB
VmSize: 72122508 kB
VmLck: 0 kB
VmHWM: 33294036 kB
VmRSS: 32829668 kB
VmData: 72076496 kB
VmStk: 88 kB
VmExe: 11800 kB
VmLib: 3608 kB
VmPTE: 73388 kB
VmSwap: 4139376 kB
Threads: 59


I'm wondering why VmHWM and VmRSS are only around 33 GB here, whereas on another server (also a slave of the same master, configured almost identically apart from the buffer pool size, but with 256 GB RAM) the output is as follows:




# cat /proc/51298/status
Name: mysqld
State: S (sleeping)
Tgid: 51298
Pid: 51298
PPid: 50443
TracerPid: 0
Uid: 27 27 27 27
Gid: 27 27 27 27
Utrace: 0
FDSize: 2048
Groups: 27
VmPeak: 243701128 kB
VmSize: 239628932 kB
VmLck: 0 kB
VmHWM: 209331200 kB
VmRSS: 205515868 kB
VmData: 239582156 kB
VmStk: 88 kB
VmExe: 11800 kB
VmLib: 3608 kB
VmPTE: 409600 kB
VmSwap: 0 kB
Threads: 281


Here about 80% of the memory is used, whereas on the oom-killed server it is only about 25% (note that these values were observed shortly before the oom-killer struck again). What could be the reason? There is no competing process. And what can I do about it?
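For reference, the Vm* figures quoted above can be sampled for any process like this (shown against /proc/self/status so the snippet runs anywhere; on the real server, substitute the mysqld pid, e.g. 11895):

```shell
# sample the same fields as the status dumps above;
# replace "self" with the mysqld pid on the server
grep -E '^(VmPeak|VmSize|VmHWM|VmRSS|VmSwap)' /proc/self/status
```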



EDIT: Here's what dmesg tells me:




mysqld invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
mysqld cpuset=/ mems_allowed=0-1
Pid: 11902, comm: mysqld Not tainted 2.6.32-573.7.1.el6.x86_64 #1
Call Trace:
[] ? cpuset_print_task_mems_allowed+0x91/0xb0
[] ? dump_header+0x90/0x1b0
[] ? security_real_capable_noaudit+0x3c/0x70
[] ? oom_kill_process+0x82/0x2a0
[] ? select_bad_process+0xe1/0x120
[] ? out_of_memory+0x220/0x3c0
[] ? __alloc_pages_nodemask+0x93c/0x950
[] ? ext4_get_block+0x0/0x120 [ext4]
[] ? alloc_pages_current+0xaa/0x110
[] ? __page_cache_alloc+0x87/0x90
[] ? find_get_page+0x1e/0xa0
[] ? filemap_fault+0x1a7/0x500
[] ? __do_fault+0x54/0x530
[] ? handle_pte_fault+0xf7/0xb20
[] ? wake_up_state+0x10/0x20
[] ? wake_futex+0x3c/0x60
[] ? handle_mm_fault+0x299/0x3d0
[] ? perf_event_task_sched_out+0x33/0x70
[] ? __do_page_fault+0x146/0x500
[] ? default_wake_function+0x0/0x20
[] ? timeout_func+0x0/0x20
[] ? do_page_fault+0x3e/0xa0
[] ? page_fault+0x25/0x30
Mem-Info:
Node 0 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
CPU 2: hi: 0, btch: 1 usd: 0
CPU 3: hi: 0, btch: 1 usd: 0
CPU 4: hi: 0, btch: 1 usd: 0
CPU 5: hi: 0, btch: 1 usd: 0
CPU 6: hi: 0, btch: 1 usd: 0
CPU 7: hi: 0, btch: 1 usd: 0
CPU 8: hi: 0, btch: 1 usd: 0
CPU 9: hi: 0, btch: 1 usd: 0
CPU 10: hi: 0, btch: 1 usd: 0
CPU 11: hi: 0, btch: 1 usd: 0
Node 0 DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 0
CPU 3: hi: 186, btch: 31 usd: 0
CPU 4: hi: 186, btch: 31 usd: 0
CPU 5: hi: 186, btch: 31 usd: 0
CPU 6: hi: 186, btch: 31 usd: 0
CPU 7: hi: 186, btch: 31 usd: 0
CPU 8: hi: 186, btch: 31 usd: 0
CPU 9: hi: 186, btch: 31 usd: 0
CPU 10: hi: 186, btch: 31 usd: 0
CPU 11: hi: 186, btch: 31 usd: 0
Node 0 Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 0
CPU 3: hi: 186, btch: 31 usd: 0
CPU 4: hi: 186, btch: 31 usd: 0
CPU 5: hi: 186, btch: 31 usd: 0
CPU 6: hi: 186, btch: 31 usd: 0
CPU 7: hi: 186, btch: 31 usd: 0
CPU 8: hi: 186, btch: 31 usd: 0
CPU 9: hi: 186, btch: 31 usd: 0
CPU 10: hi: 186, btch: 31 usd: 0
CPU 11: hi: 186, btch: 31 usd: 0
Node 1 Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 29
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 30
CPU 3: hi: 186, btch: 31 usd: 0
CPU 4: hi: 186, btch: 31 usd: 0
CPU 5: hi: 186, btch: 31 usd: 0
CPU 6: hi: 186, btch: 31 usd: 15
CPU 7: hi: 186, btch: 31 usd: 0
CPU 8: hi: 186, btch: 31 usd: 0
CPU 9: hi: 186, btch: 31 usd: 0
CPU 10: hi: 186, btch: 31 usd: 0
CPU 11: hi: 186, btch: 31 usd: 0
active_anon:8256706 inactive_anon:760868 isolated_anon:0
active_file:280 inactive_file:51 isolated_file:0
unevictable:0 dirty:166 writeback:0 unstable:0
free:87964 slab_reclaimable:6774 slab_unreclaimable:11217
mapped:138 shmem:2 pagetables:21060 bounce:0
Node 0 DMA free:15736kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15340kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 2955 64565 64565
Node 0 DMA32 free:248508kB min:2060kB low:2572kB high:3088kB active_anon:1292624kB inactive_anon:388708kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3026080kB mlocked:0kB dirty:16kB writeback:0kB mapped:76kB shmem:0kB slab_reclaimable:5776kB slab_unreclaimable:556kB kernel_stack:0kB pagetables:1824kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 0 61610 61610
Node 0 Normal free:42672kB min:42960kB low:53700kB high:64440kB active_anon:14828712kB inactive_anon:1241992kB active_file:244kB inactive_file:476kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63088640kB mlocked:0kB dirty:20kB writeback:0kB mapped:8kB shmem:8kB slab_reclaimable:11740kB slab_unreclaimable:27576kB kernel_stack:5568kB pagetables:42220kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1136 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 1 Normal free:44940kB min:45076kB low:56344kB high:67612kB active_anon:16905488kB inactive_anon:1412772kB active_file:880kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:66191360kB mlocked:0kB dirty:628kB writeback:0kB mapped:468kB shmem:0kB slab_reclaimable:9580kB slab_unreclaimable:16736kB kernel_stack:1472kB pagetables:40196kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 2*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15736kB
Node 0 DMA32: 1410*4kB 1324*8kB 1136*16kB 941*32kB 659*64kB 332*128kB 162*256kB 51*512kB 25*1024kB 1*2048kB 1*4096kB = 248520kB
Node 0 Normal: 1169*4kB 619*8kB 477*16kB 304*32kB 132*64kB 48*128kB 8*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 43628kB
Node 1 Normal: 939*4kB 637*8kB 446*16kB 296*32kB 165*64kB 53*128kB 4*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 45364kB
60924 total pagecache pages
60506 pages in swap cache
Swap cache stats: add 16616696, delete 16556190, find 18336133/18806499
Free swap = 0kB
Total swap = 4194300kB
33554431 pages RAM
527380 pages reserved
497 pages shared
9240689 pages non-shared
[ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
[ 1010] 0 1010 2842 1 2 -17 -1000 udevd
[ 3086] 0 3086 23289 42 8 -17 -1000 auditd
[ 3174] 0 3174 16556 26 1 -17 -1000 sshd
[ 3253] 0 3253 20217 24 2 0 0 master
[ 3265] 0 3265 29216 22 0 0 0 crond
[ 3474] 0 3474 249065 355 0 0 0 dsm_sa_datamgrd
[ 3684] 0 3684 73206 86 1 0 0 dsm_sa_eventmgr
[ 3685] 0 3685 115300 9 0 0 0 dsm_sa_datamgrd
[ 3712] 0 3712 109424 428 7 0 0 dsm_sa_snmpd
[ 3757] 0 3757 159830 53 2 0 0 dsm_om_shrsvcd
[ 3781] 0 3781 1016 2 2 0 0 mingetty
[ 3783] 0 3783 1016 2 8 0 0 mingetty
[ 3789] 0 3789 1016 2 9 0 0 mingetty
[ 3795] 0 3795 1016 2 11 0 0 mingetty
[ 3798] 0 3798 1016 2 6 0 0 mingetty
[ 3800] 0 3800 1016 2 2 0 0 mingetty
[ 6104] 0 6104 2841 1 0 -17 -1000 udevd
[ 6106] 0 6106 2841 1 0 -17 -1000 udevd
[24530] 0 24530 26550 7 1 0 0 mysqld_safe
[31635] 0 31635 80599 197 0 0 0 bacula-fd
[38121] 0 38121 55171 26624 0 0 0 puppetd
[ 4546] 0 4546 62368 1465 0 0 0 rsyslogd
[19221] 0 19221 1530 1 3 0 0 collectdmon
[19222] 0 19222 443447 3024 4 0 0 collectd
[ 1460] 500 1460 125934 60 1 0 0 icinga2
[ 1473] 500 1473 688892 1098 1 0 0 icinga2
[18124] 89 18124 20280 19 0 0 0 qmgr
[41035] 0 41035 25232 27 2 0 0 rhnsd
[41116] 0 41116 52038 401 0 0 0 osad
[11895] 27 11895 18030627 8921592 1 0 0 mysqld
[41068] 0 41068 24993 253 2 0 0 sshd
[41070] 0 41070 25142 405 0 0 0 sshd
[14243] 0 14243 24993 244 0 0 0 sshd
[14245] 0 14245 27110 120 1 0 0 bash
[22904] 0 22904 24993 253 2 0 0 sshd
[22906] 0 22906 27108 124 2 0 0 bash
[25586] 38 25586 7684 154 0 0 0 ntpd
[58468] 89 58468 20237 221 1 0 0 pickup
Out of memory: Kill process 11895 (mysqld) score 292 or sacrifice child
Killed process 11895, UID 27, (mysqld) total-vm:72122508kB, anon-rss:35686344kB, file-rss:48kB


Output of cat /proc/self/mountinfo:




16 21 0:3 / /proc rw,relatime - proc proc rw
17 21 0:0 / /sys rw,relatime - sysfs sysfs rw
18 21 0:5 / /dev rw,relatime - devtmpfs devtmpfs rw,size=66042096k,nr_inodes=16510524,mode=755
19 18 0:11 / /dev/pts rw,relatime - devpts devpts rw,gid=5,mode=620,ptmxmode=000
20 18 0:16 / /dev/shm rw,relatime - tmpfs tmpfs rw
21 1 253:0 / / rw,relatime - ext4 /dev/mapper/vg_ods055-lv_root rw,barrier=1,data=ordered
22 16 0:15 / /proc/bus/usb rw,relatime - usbfs /proc/bus/usb rw
23 21 8:1 / /boot rw,relatime - ext4 /dev/sda1 rw,barrier=1,data=ordered
24 21 253:3 / /home rw,relatime - ext4 /dev/mapper/vg_ods055-lv_home rw,barrier=1,data=ordered
25 21 253:2 / /var/lib/mysql rw,relatime - ext4 /dev/mapper/vg_ods055_mysql-mysql_data rw,barrier=1,data=ordered
26 16 0:17 / /proc/sys/fs/binfmt_misc rw,relatime - binfmt_misc none rw


Output of lsmod:



Module                  Size  Used by
tcp_diag 1041 0
inet_diag 8735 1 tcp_diag
vfat 10584 0
fat 54992 1 vfat
usb_storage 49228 0
mpt3sas 191659 1
mpt2sas 189883 1
scsi_transport_sas 35588 2 mpt3sas,mpt2sas
raid_class 4388 2 mpt3sas,mpt2sas
mptctl 31785 1
mptbase 93647 1 mptctl
dell_rbu 9414 0
ipv6 335525 144
ipmi_devintf 7729 2
sg 29318 0
joydev 10480 0
power_meter 9009 0
acpi_ipmi 3745 1 power_meter
ipmi_si 44751 2 acpi_ipmi
ipmi_msghandler 38701 3 ipmi_devintf,acpi_ipmi,ipmi_si
iTCO_wdt 7115 0
iTCO_vendor_support 3056 1 iTCO_wdt
tg3 161289 0
ptp 9614 1 tg3
pps_core 10690 1 ptp
dcdbas 8707 0
sb_edac 17888 0
edac_core 46645 3 sb_edac
lpc_ich 12963 0
mfd_core 1895 1 lpc_ich
shpchp 29130 0
ext4 378683 4
jbd2 93252 1 ext4
mbcache 8193 1 ext4
sd_mod 37030 5
crc_t10dif 1209 1 sd_mod
sr_mod 15049 0
cdrom 39085 1 sr_mod
megaraid_sas 109375 5
wmi 6287 0
ahci 42738 0
dm_mirror 14384 0
dm_region_hash 12085 1 dm_mirror
dm_log 9930 2 dm_mirror,dm_region_hash
dm_mod 99168 14 dm_mirror,dm_log





# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_ods055-lv_root
50G 9.7G 37G 21% /
tmpfs 63G 0 63G 0% /dev/shm
/dev/sda1 477M 57M 395M 13% /boot
/dev/mapper/vg_ods055-lv_home
81G 56M 77G 1% /home
/dev/mapper/vg_ods055_mysql-mysql_data
1.1T 700G 344G 68% /var/lib/mysql


Output of cat /proc/meminfo:




MemTotal:       132108204 kB
MemFree: 380404 kB
Buffers: 214256 kB
Cached: 16000720 kB
SwapCached: 22964 kB
Active: 26156736 kB
Inactive: 9329756 kB
Active(anon): 16971328 kB
Inactive(anon): 2300216 kB
Active(file): 9185408 kB
Inactive(file): 7029540 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 4194300 kB
SwapFree: 4039224 kB
Dirty: 816 kB
Writeback: 0 kB
AnonPages: 19261368 kB
Mapped: 17860 kB
Shmem: 20 kB
Slab: 662020 kB
SReclaimable: 617680 kB
SUnreclaim: 44340 kB
KernelStack: 7008 kB
PageTables: 43768 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 22869968 kB
Committed_AS: 69472720 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 484276 kB
VmallocChunk: 34291771192 kB
HardwareCorrupted: 0 kB
AnonHugePages: 16736256 kB
HugePages_Total: 46268
HugePages_Free: 46268
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 5056 kB
DirectMap2M: 2045952 kB
DirectMap1G: 132120576 kB


Output of cat /proc/zoneinfo:



Node 0, zone      DMA
pages free 3934
min 2

low 2
high 3
scanned 0
spanned 4095
present 3835
nr_free_pages 3934
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0

nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 0
nr_page_table_pages 0

nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
numa_hit 1
numa_miss 0

numa_foreign 0
numa_interleave 0
numa_local 0
numa_other 1
nr_anon_transparent_hugepages 0
protection: (0, 2955, 64565, 64565)
pagesets
cpu: 0
count: 0
high: 0

batch: 1
vm stats threshold: 8
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 8
cpu: 2
count: 0
high: 0

batch: 1
vm stats threshold: 8
cpu: 3
count: 0
high: 0
batch: 1
vm stats threshold: 8
cpu: 4
count: 0
high: 0

batch: 1
vm stats threshold: 8
cpu: 5
count: 0
high: 0
batch: 1
vm stats threshold: 8
cpu: 6
count: 0
high: 0

batch: 1
vm stats threshold: 8
cpu: 7
count: 0
high: 0
batch: 1
vm stats threshold: 8
cpu: 8
count: 0
high: 0

batch: 1
vm stats threshold: 8
cpu: 9
count: 0
high: 0
batch: 1
vm stats threshold: 8
cpu: 10
count: 0
high: 0

batch: 1
vm stats threshold: 8
cpu: 11
count: 0
high: 0
batch: 1
vm stats threshold: 8
all_unreclaimable: 1
prev_priority: 12
start_pfn: 1

inactive_ratio: 1
Node 0, zone DMA32
pages free 62344
min 515
low 643
high 772
scanned 0
spanned 1044480
present 756520
nr_free_pages 62344

nr_inactive_anon 86498
nr_active_anon 226241
nr_inactive_file 14165
nr_active_file 14186
nr_unevictable 0
nr_mlock 0
nr_anon_pages 5484
nr_mapped 4
nr_file_pages 28412
nr_dirty 0

nr_writeback 0
nr_slab_reclaimable 80093
nr_slab_unreclaimable 251
nr_page_table_pages 250
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 2329633
nr_writeback_temp 0
nr_isolated_anon 0

nr_isolated_file 0
nr_shmem 0
numa_hit 30968949
numa_miss 12412436
numa_foreign 0
numa_interleave 0
numa_local 30968183
numa_other 12413202
nr_anon_transparent_hugepages 600
protection: (0, 0, 61610, 61610)

pagesets
cpu: 0
count: 21
high: 186
batch: 31
vm stats threshold: 48
cpu: 1
count: 0
high: 186
batch: 31

vm stats threshold: 48
cpu: 2
count: 75
high: 186
batch: 31
vm stats threshold: 48
cpu: 3
count: 0
high: 186
batch: 31

vm stats threshold: 48
cpu: 4
count: 199
high: 186
batch: 31
vm stats threshold: 48
cpu: 5
count: 0
high: 186
batch: 31

vm stats threshold: 48
cpu: 6
count: 16
high: 186
batch: 31
vm stats threshold: 48
cpu: 7
count: 0
high: 186
batch: 31

vm stats threshold: 48
cpu: 8
count: 24
high: 186
batch: 31
vm stats threshold: 48
cpu: 9
count: 0
high: 186
batch: 31

vm stats threshold: 48
cpu: 10
count: 167
high: 186
batch: 31
vm stats threshold: 48
cpu: 11
count: 0
high: 186
batch: 31

vm stats threshold: 48
all_unreclaimable: 0
prev_priority: 12
start_pfn: 4096
inactive_ratio: 4
Node 0, zone Normal
pages free 13426
min 10740
low 13425
high 16110

scanned 0
spanned 15990784
present 15772160
nr_free_pages 13426
nr_inactive_anon 277373
nr_active_anon 2623458
nr_inactive_file 422756
nr_active_file 652474
nr_unevictable 0
nr_mlock 0

nr_anon_pages 304740
nr_mapped 2764
nr_file_pages 1079208
nr_dirty 32
nr_writeback 0
nr_slab_reclaimable 46034
nr_slab_unreclaimable 6793
nr_page_table_pages 6298
nr_kernel_stack 334
nr_unstable 0

nr_bounce 0
nr_vmscan_write 8578494
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 5
numa_hit 788900341
numa_miss 44641935
numa_foreign 291882873
numa_interleave 678929

numa_local 788587089
numa_other 44955187
nr_anon_transparent_hugepages 5067
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 37
high: 186
batch: 31
vm stats threshold: 80

cpu: 1
count: 1
high: 186
batch: 31
vm stats threshold: 80
cpu: 2
count: 30
high: 186
batch: 31
vm stats threshold: 80

cpu: 3
count: 3
high: 186
batch: 31
vm stats threshold: 80
cpu: 4
count: 114
high: 186
batch: 31
vm stats threshold: 80

cpu: 5
count: 0
high: 186
batch: 31
vm stats threshold: 80
cpu: 6
count: 161
high: 186
batch: 31
vm stats threshold: 80

cpu: 7
count: 0
high: 186
batch: 31
vm stats threshold: 80
cpu: 8
count: 85
high: 186
batch: 31
vm stats threshold: 80

cpu: 9
count: 0
high: 186
batch: 31
vm stats threshold: 80
cpu: 10
count: 184
high: 186
batch: 31
vm stats threshold: 80

cpu: 11
count: 0
high: 186
batch: 31
vm stats threshold: 80
all_unreclaimable: 0
prev_priority: 12
start_pfn: 1048576
inactive_ratio: 24
Node 1, zone Normal

pages free 15668
min 11269
low 14086
high 16903
scanned 0
spanned 16777216
present 16547840
nr_free_pages 15668
nr_inactive_anon 210225
nr_active_anon 1405678

nr_inactive_file 1312467
nr_active_file 1626153
nr_unevictable 0
nr_mlock 0
nr_anon_pages 332657
nr_mapped 1691
nr_file_pages 2940312
nr_dirty 97
nr_writeback 0
nr_slab_reclaimable 27961

nr_slab_unreclaimable 4145
nr_page_table_pages 4417
nr_kernel_stack 104
nr_unstable 0
nr_bounce 0
nr_vmscan_write 7441636
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0

numa_hit 568448792
numa_miss 291882873
numa_foreign 57054371
numa_interleave 678973
numa_local 567389682
numa_other 292941983
nr_anon_transparent_hugepages 2505
protection: (0, 0, 0, 0)
pagesets
cpu: 0

count: 44
high: 186
batch: 31
vm stats threshold: 80
cpu: 1
count: 98
high: 186
batch: 31
vm stats threshold: 80
cpu: 2

count: 171
high: 186
batch: 31
vm stats threshold: 80
cpu: 3
count: 28
high: 186
batch: 31
vm stats threshold: 80
cpu: 4

count: 55
high: 186
batch: 31
vm stats threshold: 80
cpu: 5
count: 32
high: 186
batch: 31
vm stats threshold: 80
cpu: 6

count: 100
high: 186
batch: 31
vm stats threshold: 80
cpu: 7
count: 204
high: 186
batch: 31
vm stats threshold: 80
cpu: 8

count: 30
high: 186
batch: 31
vm stats threshold: 80
cpu: 9
count: 154
high: 186
batch: 31
vm stats threshold: 80
cpu: 10

count: 30
high: 186
batch: 31
vm stats threshold: 80
cpu: 11
count: 91
high: 186
batch: 31
vm stats threshold: 80
all_unreclaimable: 0

prev_priority: 12
start_pfn: 17039360
inactive_ratio: 25

Answer



Your problem is here:



HugePages_Total:   46268
HugePages_Free: 46268
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 5056 kB


You have reserved a pool of huge pages that nothing is using: HugePages_Free equals HugePages_Total. The kernel sets this memory aside up front, and ordinary 4 kB allocations (which is all mysqld is making here) cannot touch it, so the box runs out of memory even though mysqld's RSS sits at only about 33 GB. Disable this reservation.
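The size of the reserved pool follows directly from the two /proc/meminfo fields quoted above; it is pinned whether or not anything maps it:

```shell
# HugePages_Total (46268) x Hugepagesize (2048 kB), converted to GiB
echo $((46268 * 2048 / 1024 / 1024))   # prints 90 (integer-truncated GiB)
```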



Run



sysctl -w vm.nr_hugepages=0



Then check /etc/sysctl.conf and remove the huge-page assignment you have set there, so the reservation does not come back at the next boot.
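Assuming the reservation was made persistent via sysctl, a grep along these lines should find it; it could instead come from a hugepages=N kernel boot parameter, so /proc/cmdline is worth checking too:

```shell
# find a persistent huge-page reservation; either grep may legitimately
# match nothing, hence the fallback messages
grep -rn 'nr_hugepages' /etc/sysctl.conf /etc/sysctl.d/ 2>/dev/null || echo 'not set via sysctl'
grep -o 'hugepages=[0-9]*' /proc/cmdline || echo 'not set on kernel command line'
```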



This should free up roughly 90 GiB of memory that is currently sitting idle.
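Because HugePages_Free equals HugePages_Total, nothing is using the pool and the kernel can release it immediately. After running the sysctl command, both figures should read 0:

```shell
# verify the huge-page pool is gone after "sysctl -w vm.nr_hugepages=0"
grep -E '^HugePages_(Total|Free)' /proc/meminfo
```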

