I have a problem where my Java process gets killed by the kernel OOM killer. I'm not sure why that is happening, because according to syslog I still had free swap space:
Jan 15 08:52:24 xyz-server kernel: Free swap = 3885844kB
Jan 15 08:52:24 xyz-server kernel: Total swap = 4194296kB
I have set vm.swappiness to 0. As I understand it, this means the kernel will swap only when doing so can prevent an OOM situation, so I thought it would be OK. Was that a bad idea?
I'm running CentOS 6, and the full syslog is below:
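To confirm the value the running kernel is actually using, the setting can be read back from procfs. This is only a minimal sketch: it assumes the standard Linux path /proc/sys/vm/swappiness, and the helper name read_swappiness is made up for illustration.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the vm.swappiness value stored at `path` as an integer.

    The default path is the usual procfs location on Linux; the
    parameter exists only so another file can be read instead.
    """
    return int(Path(path).read_text().strip())
```

On the server itself, calling read_swappiness() with no arguments should return 0 if the setting described above is in effect.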
Jan 15 08:52:24 xyz-server kernel: shibd invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0, oom_score_adj=0
Jan 15 08:52:24 xyz-server kernel: shibd cpuset=/ mems_allowed=0
Jan 15 08:52:24 xyz-server kernel: Pid: 18630, comm: shibd Tainted: G W --------------- 2.6.32-358.14.1.el6.x86_64 #1
Jan 15 08:52:24 xyz-server kernel: Call Trace:
Jan 15 08:52:24 xyz-server kernel: [] ? cpuset_print_task_mems_allowed+0x91/0xb0
Jan 15 08:52:24 xyz-server kernel: [] ? dump_header+0x90/0x1b0
Jan 15 08:52:24 xyz-server kernel: [] ? security_real_capable_noaudit+0x3c/0x70
Jan 15 08:52:24 xyz-server kernel: [] ? oom_kill_process+0x82/0x2a0
Jan 15 08:52:24 xyz-server kernel: [] ? select_bad_process+0xe1/0x120
Jan 15 08:52:24 xyz-server kernel: [] ? out_of_memory+0x220/0x3c0
Jan 15 08:52:24 xyz-server kernel: [] ? __alloc_pages_nodemask+0x8ac/0x8d0
Jan 15 08:52:24 xyz-server kernel: [] ? alloc_pages_current+0xaa/0x110
Jan 15 08:52:24 xyz-server kernel: [] ? __get_free_pages+0xe/0x50
Jan 15 08:52:24 xyz-server kernel: [] ? copy_process+0xe4/0x1450
Jan 15 08:52:24 xyz-server kernel: [] ? __do_page_fault+0x1ec/0x480
Jan 15 08:52:24 xyz-server kernel: [] ? do_fork+0x94/0x460
Jan 15 08:52:24 xyz-server kernel: [] ? sys_clone+0x28/0x30
Jan 15 08:52:24 xyz-server kernel: [] ? stub_clone+0x13/0x20
Jan 15 08:52:24 xyz-server kernel: [] ? system_call_fastpath+0x16/0x1b
Jan 15 08:52:24 xyz-server kernel: Mem-Info:
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA per-cpu:
Jan 15 08:52:24 xyz-server kernel: CPU 0: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 1: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 2: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 3: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA32 per-cpu:
Jan 15 08:52:24 xyz-server kernel: CPU 0: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 1: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 2: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 3: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: Node 0 Normal per-cpu:
Jan 15 08:52:24 xyz-server kernel: CPU 0: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 1: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 2: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 3: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: active_anon:3079090 inactive_anon:392870 isolated_anon:10
Jan 15 08:52:24 xyz-server kernel: active_file:51 inactive_file:131 isolated_file:0
Jan 15 08:52:24 xyz-server kernel: unevictable:0 dirty:0 writeback:2 unstable:0
Jan 15 08:52:24 xyz-server kernel: free:30217 slab_reclaimable:13388 slab_unreclaimable:11090
Jan 15 08:52:24 xyz-server kernel: mapped:81 shmem:142 pagetables:13866 bounce:0
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA free:15528kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15136kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jan 15 08:52:24 xyz-server kernel: lowmem_reserve[]: 0 3000 5020 5020
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA32 free:22312kB min:14224kB low:17780kB high:21336kB active_anon:2192960kB inactive_anon:559136kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:1228kB slab_unreclaimable:1940kB kernel_stack:616kB pagetables:716kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 15 08:52:24 xyz-server kernel: lowmem_reserve[]: 0 0 2020 2020
Jan 15 08:52:24 xyz-server kernel: Node 0 Normal free:83028kB min:53284kB low:66604kB high:79924kB active_anon:10123400kB inactive_anon:1012344kB active_file:204kB inactive_file:524kB unevictable:0kB isolated(anon):40kB isolated(file):0kB present:11505664kB mlocked:0kB dirty:0kB writeback:8kB mapped:324kB shmem:568kB slab_reclaimable:52324kB slab_unreclaimable:42420kB kernel_stack:3688kB pagetables:54748kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 15 08:52:24 xyz-server kernel: lowmem_reserve[]: 0 0 0 0
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA: 2*4kB 2*8kB 1*16kB 2*32kB 1*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15528kB
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA32: 99*4kB 114*8kB 141*16kB 108*32kB 85*64kB 49*128kB 12*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 22316kB
Jan 15 08:52:24 xyz-server kernel: Node 0 Normal: 14355*4kB 455*8kB 209*16kB 118*32kB 70*64kB 27*128kB 15*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 83028kB
Jan 15 08:52:24 xyz-server kernel: 7840 total pagecache pages
Jan 15 08:52:24 xyz-server kernel: 7519 pages in swap cache
Jan 15 08:52:24 xyz-server kernel: Swap cache stats: add 2995034, delete 2987515, find 314611560/314790727
Jan 15 08:52:24 xyz-server kernel: Free swap = 3885844kB
Jan 15 08:52:24 xyz-server kernel: Total swap = 4194296kB
Jan 15 08:52:24 xyz-server kernel: 3670000 pages RAM
Jan 15 08:52:24 xyz-server kernel: 71979 pages reserved
Jan 15 08:52:24 xyz-server kernel: 21635 pages shared
Jan 15 08:52:24 xyz-server kernel: 3528964 pages non-shared
Jan 15 08:52:24 xyz-server kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Jan 15 08:52:24 xyz-server kernel: [ 442] 0 442 2659 2 0 -17 -1000 udevd
Jan 15 08:52:24 xyz-server kernel: [ 1109] 0 1109 17425 81 3 0 0 vmtoolsd
Jan 15 08:52:24 xyz-server kernel: [ 1253] 0 1253 23299 2 1 -17 -1000 auditd
Jan 15 08:52:24 xyz-server kernel: [ 1269] 0 1269 62464 37 1 0 0 rsyslogd
Jan 15 08:52:24 xyz-server kernel: [ 1287] 32 1287 4743 1 0 0 0 rpcbind
Jan 15 08:52:24 xyz-server kernel: [ 1323] 29 1323 5836 1 0 0 0 rpc.statd
Jan 15 08:52:24 xyz-server kernel: [ 1351] 0 1351 6290 1 1 0 0 rpc.idmapd
Jan 15 08:52:24 xyz-server kernel: [ 1372] 81 1372 5895 1 1 0 0 dbus-daemon
Jan 15 08:52:24 xyz-server kernel: [ 1383] 70 1383 7434 2 3 0 0 avahi-daemon
Jan 15 08:52:24 xyz-server kernel: [ 1384] 70 1384 7434 1 1 0 0 avahi-daemon
Jan 15 08:52:24 xyz-server kernel: [ 1423] 0 1423 113067 1 0 0 0 automount
Jan 15 08:52:24 xyz-server kernel: [ 1443] 0 1443 16563 15 3 -17 -1000 sshd
Jan 15 08:52:24 xyz-server kernel: [ 1564] 0 1564 29312 8 1 0 0 crond
Jan 15 08:52:24 xyz-server kernel: [ 1572] 0 1572 6281 1 1 0 0 oddjobd
Jan 15 08:52:24 xyz-server kernel: [ 1604] 0 1604 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1606] 0 1606 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1608] 0 1608 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1610] 0 1610 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1612] 0 1612 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 2942] 0 2942 258414 1 1 0 0 console-kit-dae
Jan 15 08:52:24 xyz-server kernel: [23467] 38 23467 8059 17 1 0 0 ntpd
Jan 15 08:52:24 xyz-server kernel: [27532] 0 27532 2658 2 3 -17 -1000 udevd
Jan 15 08:52:24 xyz-server kernel: [ 2172] 0 2172 65647 620 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [ 8027] 0 8027 1014 1 1 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [18630] 498 18630 636245 14471 3 0 0 shibd
Jan 15 08:52:24 xyz-server kernel: [18719] 0 18719 49858 30 0 0 0 sssd
Jan 15 08:52:24 xyz-server kernel: [18720] 0 18720 74218 2279 0 0 0 sssd_be
Jan 15 08:52:24 xyz-server kernel: [18721] 0 18721 50428 56 3 0 0 sssd_nss
Jan 15 08:52:24 xyz-server kernel: [18722] 0 18722 48008 5 2 0 0 sssd_pam
Jan 15 08:52:24 xyz-server kernel: [18723] 0 18723 48703 1 3 0 0 sssd_ssh
Jan 15 08:52:24 xyz-server kernel: [18724] 0 18724 47528 4 1 0 0 sssd_sudo
Jan 15 08:52:24 xyz-server kernel: [18725] 0 18725 52553 1 2 0 0 sssd_pac
Jan 15 08:52:24 xyz-server kernel: [18749] 0 18749 15560 1 1 0 0 certmonger
Jan 15 08:52:24 xyz-server kernel: [18849] 0 18849 20820 14 1 0 0 master
Jan 15 08:52:24 xyz-server kernel: [18852] 89 18852 20883 2 0 0 0 qmgr
Jan 15 08:52:24 xyz-server kernel: [23143] 500 23143 4285995 3415938 2 0 0 java
Jan 15 08:52:24 xyz-server kernel: [23198] 0 23198 2658 2 0 -17 -1000 udevd
Jan 15 08:52:24 xyz-server kernel: [18831] 0 18831 8062 61 0 0 0 rotatelogs
Jan 15 08:52:24 xyz-server kernel: [18838] 0 18838 8062 62 0 0 0 rotatelogs
Jan 15 08:52:24 xyz-server kernel: [21841] 48 21841 104759 1396 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22089] 48 22089 104759 1382 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22178] 48 22178 104759 1365 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22234] 48 22234 104759 1367 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22241] 48 22241 104759 1359 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22261] 48 22261 104759 1368 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22272] 48 22272 104759 1363 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22296] 48 22296 104759 1375 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22336] 48 22336 104759 1364 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22348] 48 22348 104759 1354 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22349] 48 22349 104759 1365 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22356] 48 22356 104759 1361 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22361] 48 22361 104759 1364 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22372] 48 22372 104759 1356 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22375] 48 22375 104759 1352 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22389] 48 22389 104759 1362 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22390] 48 22390 104759 1360 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22397] 48 22397 104759 1357 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22398] 48 22398 104759 1359 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22401] 48 22401 104759 1358 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22430] 89 22430 20840 218 0 0 0 pickup
Jan 15 08:52:24 xyz-server kernel: [22435] 48 22435 104759 1354 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22441] 48 22441 104759 1348 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22457] 48 22457 104759 1345 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22461] 48 22461 104759 1337 3 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22464] 48 22464 104713 1307 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22465] 48 22465 104759 1332 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22470] 48 22470 104759 1338 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22471] 48 22471 104759 1337 3 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22472] 48 22472 104759 1347 3 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22473] 48 22473 104713 1308 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22483] 48 22483 104713 1408 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22487] 48 22487 104759 1430 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22488] 48 22488 104713 1397 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22490] 48 22490 104759 1472 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22496] 48 22496 85768 1404 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22497] 48 22497 85768 1404 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22498] 48 22498 85768 1404 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22504] 48 22504 88329 1408 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: Out of memory: Kill process 23143 (java) score 748 or sacrifice child
Jan 15 08:52:24 xyz-server kernel: Killed process 23143, UID 500, (java) total-vm:17143980kB, anon-rss:13663732kB, file-rss:16kB
Answer
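Your problem is clear from the last line of the log: anon-rss:13663732kB. The java process alone holds about 13.6 GB of anonymous memory, which is nearly all of the machine's RAM (3670000 pages, roughly 14 GB).
Whether a kernel allocation may sleep depends on the GFP (get free pages) flags, which in turn depend on who is requesting the memory. For example, if the server is short on memory and a user process requests 1 MB, the kernel can sleep and try to free memory to satisfy the request by moving the least-used pages out to swap. In your case, however, the kernel was trying to allocate two contiguous pages (order=1) while creating a process in do_fork (see copy_process in the call trace). For the kernel that is a critical path, and it cannot sleep there.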
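The figures in the log can be cross-checked with a little arithmetic. The sketch below only converts numbers copied from the syslog above; the 4 kB page size is an assumption (the usual x86_64 default), and the helper names are made up for illustration.

```python
PAGE_KB = 4  # assumption: 4 kB pages, the usual x86_64 default


def pages_to_kb(pages, page_kb=PAGE_KB):
    """Convert a page count from the OOM report into kilobytes."""
    return pages * page_kb


def order_to_pages(order):
    """An order-N allocation asks for 2**N physically contiguous pages."""
    return 1 << order


# Values copied from the syslog in the question.
total_ram_kb = pages_to_kb(3670000)       # "3670000 pages RAM"
java_rss_kb = pages_to_kb(3415938)        # rss column for pid 23143 (java)
request_pages = order_to_pages(1)         # "order=1" on the oom-killer line
request_kb = pages_to_kb(request_pages)
swap_used_kb = 4194296 - 3885844          # "Total swap" minus "Free swap"

print("total RAM (kB):   ", total_ram_kb)
print("java RSS (kB):    ", java_rss_kb)
print("java share of RAM: {:.1%}".format(java_rss_kb / total_ram_kb))
print("failed request:   ", request_pages, "pages =", request_kb, "kB")
print("swap in use (kB): ", swap_used_kb)
```

The java RSS computed from the process table agrees closely with the anon-rss value on the "Killed process" line, and the request that triggered the OOM killer was only two pages.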