Can you please tell me what caused this Oracle process to be killed? There seems to be plenty of free RAM and plenty of free swap. Several other oracle processes were killed afterwards.
The VM has 16 GB of vMem and 8 vCPUs.
Here is the log for the first oracle process that got killed:
Mar 1 20:00:58 ******* kernel: oracle invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Mar 1 20:00:58 ******* kernel: oracle cpuset=/ mems_allowed=0
Mar 1 20:00:58 ******* kernel: Pid: 2370, comm: oracle Not tainted 2.6.32-431.el6.x86_64 #1
Mar 1 20:00:58 ******* kernel: Call Trace:
Mar 1 20:00:58 ******* kernel: [] ? cpuset_print_task_mems_allowed+0x91/0xb0
Mar 1 20:00:58 ******* kernel: [] ? dump_header+0x90/0x1b0
Mar 1 20:00:58 ******* kernel: [] ? security_real_capable_noaudit+0x3c/0x70
Mar 1 20:00:58 ******* kernel: [] ? oom_kill_process+0x82/0x2a0
Mar 1 20:00:58 ******* kernel: [] ? select_bad_process+0xe1/0x120
Mar 1 20:00:58 ******* kernel: [] ? out_of_memory+0x220/0x3c0
Mar 1 20:00:58 ******* kernel: [] ? __alloc_pages_nodemask+0x8ac/0x8d0
Mar 1 20:00:58 ******* kernel: [] ? alloc_pages_vma+0x9a/0x150
Mar 1 20:00:58 ******* kernel: [] ? handle_pte_fault+0x73d/0xb00
Mar 1 20:00:58 ******* kernel: [] ? free_pgtables+0xce/0x120
Mar 1 20:00:58 ******* kernel: [] ? unmap_region+0xcd/0x130
Mar 1 20:00:58 ******* kernel: [] ? vma_prio_tree_add+0x75/0xd0
Mar 1 20:00:58 ******* kernel: [] ? handle_mm_fault+0x22a/0x300
Mar 1 20:00:58 ******* kernel: [] ? __do_page_fault+0x138/0x480
Mar 1 20:00:58 ******* kernel: [] ? do_mmap_pgoff+0x335/0x380
Mar 1 20:00:58 ******* kernel: [] ? do_page_fault+0x3e/0xa0
Mar 1 20:00:58 ******* kernel: [] ? page_fault+0x25/0x30
Mar 1 20:00:58 ******* kernel: Mem-Info:
Mar 1 20:00:58 ******* kernel: Node 0 DMA per-cpu:
Mar 1 20:00:58 ******* kernel: CPU 0: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 1: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 2: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 3: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 4: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 5: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 6: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 7: hi: 0, btch: 1 usd: 0
Mar 1 20:00:58 ******* kernel: Node 0 DMA32 per-cpu:
Mar 1 20:00:58 ******* kernel: CPU 0: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 1: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 2: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 3: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 4: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 5: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 6: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 7: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: Node 0 Normal per-cpu:
Mar 1 20:00:58 ******* kernel: CPU 0: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 1: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 2: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 3: hi: 186, btch: 31 usd: 20
Mar 1 20:00:58 ******* kernel: CPU 4: hi: 186, btch: 31 usd: 32
Mar 1 20:00:58 ******* kernel: CPU 5: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: CPU 6: hi: 186, btch: 31 usd: 184
Mar 1 20:00:58 ******* kernel: CPU 7: hi: 186, btch: 31 usd: 0
Mar 1 20:00:58 ******* kernel: active_anon:2673615 inactive_anon:368657 isolated_anon:0
Mar 1 20:00:58 ******* kernel: active_file:3541 inactive_file:3962 isolated_file:32
Mar 1 20:00:58 ******* kernel: unevictable:0 dirty:3 writeback:2770 unstable:0
Mar 1 20:00:58 ******* kernel: free:33763 slab_reclaimable:16555 slab_unreclaimable:28221
Mar 1 20:00:58 ******* kernel: mapped:1517627 shmem:1730877 pagetables:906135 bounce:0
Mar 1 20:00:58 ******* kernel: Node 0 DMA free:15132kB min:60kB low:72kB high:88kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:14740kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 1 20:00:58 ******* kernel: lowmem_reserve[]: 0 3000 16130 16130
Mar 1 20:00:58 ******* kernel: Node 0 DMA32 free:64904kB min:12556kB low:15692kB high:18832kB active_anon:2064816kB inactive_anon:516452kB active_file:492kB inactive_file:188kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB mlocked:0kB dirty:0kB writeback:0kB mapped:2319432kB shmem:2352892kB slab_reclaimable:7420kB slab_unreclaimable:3620kB kernel_stack:832kB pagetables:24672kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1 all_unreclaimable? no
Mar 1 20:00:58 ******* kernel: lowmem_reserve[]: 0 0 13130 13130
Mar 1 20:00:58 ******* kernel: Node 0 Normal free:55016kB min:54964kB low:68704kB high:82444kB active_anon:8629644kB inactive_anon:958176kB active_file:13672kB inactive_file:15660kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:13445120kB mlocked:0kB dirty:12kB writeback:11080kB mapped:3751076kB shmem:4570616kB slab_reclaimable:58800kB slab_unreclaimable:109264kB kernel_stack:5360kB pagetables:3599868kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:160 all_unreclaimable? no
Mar 1 20:00:58 ******* kernel: lowmem_reserve[]: 0 0 0 0
Mar 1 20:00:58 ******* kernel: Node 0 DMA: 3*4kB 2*8kB 2*16kB 3*32kB 2*64kB 2*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB 3*4096kB = 15132kB
Mar 1 20:00:58 ******* kernel: Node 0 DMA32: 1225*4kB 859*8kB 878*16kB 547*32kB 184*64kB 34*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 1*4096kB = 65596kB
Mar 1 20:00:58 ******* kernel: Node 0 Normal: 9165*4kB 1804*8kB 46*16kB 2*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 55924kB
Mar 1 20:00:58 ******* kernel: 1760824 total pagecache pages
Mar 1 20:00:58 ******* kernel: 22460 pages in swap cache
Mar 1 20:00:58 ******* kernel: Swap cache stats: add 6636857, delete 6614397, find 15635455/16141480
Mar 1 20:00:58 ******* kernel: Free swap = 33548340kB
Mar 1 20:00:58 ******* kernel: Total swap = 36184056kB
Mar 1 20:00:58 ******* kernel: 4194288 pages RAM
Mar 1 20:00:58 ******* kernel: 111808 pages reserved
Mar 1 20:00:58 ******* kernel: 59252583 pages shared
Mar 1 20:00:58 ******* kernel: 2502605 pages non-shared
Mar 1 20:00:58 ******* kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Mar 1 20:00:58 ******* kernel: [ 612] 0 612 2769 42 2 -17 -1000 udevd
Mar 1 20:00:58 ******* kernel: [ 1872] 0 1872 47365 204 7 0 0 vmtoolsd
Mar 1 20:00:58 ******* kernel: [ 1980] 0 1980 23294 109 6 -17 -1000 auditd
Mar 1 20:00:58 ******* kernel: [ 1996] 0 1996 62898 842 4 0 0 rsyslogd
Mar 1 20:00:58 ******* kernel: [ 2025] 0 2025 2738 93 3 0 0 irqbalance
Mar 1 20:00:58 ******* kernel: [ 2039] 32 2039 4744 68 4 0 0 rpcbind
Mar 1 20:00:58 ******* kernel: [ 2071] 29 2071 5837 61 3 0 0 rpc.statd
Mar 1 20:00:58 ******* kernel: [ 2092] 0 2092 5773 31 1 0 0 rpc.idmapd
Mar 1 20:00:58 ******* kernel: [ 2211] 0 2211 39323 127 5 0 0 pbx_exchange
Mar 1 20:00:58 ******* kernel: [ 2223] 0 2223 48106 158 5 0 0 winbindd
Mar 1 20:00:58 ******* kernel: [ 2237] 0 2237 1020 48 4 0 0 acpid
Mar 1 20:00:58 ******* kernel: [ 2323] 0 2323 49766 281 0 0 0 winbindd
Mar 1 20:00:58 ******* kernel: [ 2540] 0 2540 26827 11 5 0 0 rpc.rquotad
Mar 1 20:00:58 ******* kernel: [ 2544] 0 2544 5414 41 5 0 0 rpc.mountd
Mar 1 20:00:58 ******* kernel: [ 2580] 0 2580 1570 23 0 0 0 mcelog
Mar 1 20:00:58 ******* kernel: [ 2592] 0 2592 16651 78 5 -17 -1000 sshd
Mar 1 20:00:58 ******* kernel: [ 2600] 0 2600 5545 105 3 0 0 xinetd
Mar 1 20:00:58 ******* kernel: [ 2608] 38 2608 7147 132 5 0 0 ntpd
Mar 1 20:00:58 ******* kernel: [ 2618] 498 2618 25741 57 2 0 0 uuidd
Mar 1 20:00:58 ******* kernel: [ 2630] 0 2630 43170 139 3 0 0 vnetd
Mar 1 20:00:58 ******* kernel: [ 2638] 0 2638 52398 158 2 0 0 bpcd
Mar 1 20:00:58 ******* kernel: [ 2655] 0 2655 198335 478 4 0 0 nbdisco
Mar 1 20:00:58 ******* kernel: [ 2676] 0 2676 76958 82 2 0 0 mtstrmd
Mar 1 20:00:58 ******* kernel: [ 2707] 0 2707 22314 141 0 0 0 sendmail
Mar 1 20:00:58 ******* kernel: [ 2716] 51 2716 19658 80 4 0 0 sendmail
Mar 1 20:00:58 ******* kernel: [ 2734] 0 2734 200856 353 7 0 0 avagent.bin
Mar 1 20:00:58 ******* kernel: [ 2747] 0 2747 44287 178 3 0 0 tuned
Mar 1 20:00:58 ******* kernel: [ 2757] 0 2757 29333 103 6 0 0 crond
Mar 1 20:00:58 ******* kernel: [ 2778] 0 2778 27431 167 7 0 0 saphostexec
Mar 1 20:00:58 ******* kernel: [ 2805] 600 2805 545016 4031 5 0 0 sapstartsrv
Mar 1 20:00:58 ******* kernel: [ 2885] 834 2885 100602 294 3 0 0 sapstartsrv
Mar 1 20:00:58 ******* kernel: [ 2904] 0 2904 5385 31 6 0 0 atd
Mar 1 20:00:58 ******* kernel: [ 2928] 0 2928 26005 69 5 0 0 rhsmcertd
Mar 1 20:00:58 ******* kernel: [ 2935] 0 2935 8154 1110 0 0 0 saposcol
Mar 1 20:00:58 ******* kernel: [ 3098] 834 3098 13538 50 3 0 0 sapstart
Mar 1 20:00:58 ******* kernel: [ 3128] 834 3128 43278 119 5 0 0 jc.sapDAA_SMDA9
Mar 1 20:00:58 ******* kernel: [ 3144] 834 3144 1276839 57796 4 0 0 jstart
Mar 1 20:00:58 ******* kernel: [ 3211] 703 3211 33752 378 5 0 0 perl
Mar 1 20:00:58 ******* kernel: [ 3288] 703 3288 1181563 62355 0 0 0 java
Mar 1 20:00:58 ******* kernel: [ 3497] 0 3497 1016 34 1 0 0 mingetty
Mar 1 20:00:58 ******* kernel: [ 3499] 0 3499 1016 34 1 0 0 mingetty
Mar 1 20:00:58 ******* kernel: [ 3502] 0 3502 1016 34 1 0 0 mingetty
Mar 1 20:00:58 ******* kernel: [ 3504] 0 3504 1016 34 2 0 0 mingetty
Mar 1 20:00:58 ******* kernel: [ 3506] 0 3506 1016 34 1 0 0 mingetty
Mar 1 20:00:58 ******* kernel: [ 3508] 0 3508 1016 34 1 0 0 mingetty
Mar 1 20:00:58 ******* kernel: [ 3515] 0 3515 3098 41 2 -17 -1000 udevd
Mar 1 20:00:58 ******* kernel: [ 3516] 0 3516 3098 41 4 -17 -1000 udevd
Mar 1 20:00:58 ******* kernel: [13764] 0 13764 48089 89 7 0 0 winbindd
Mar 1 20:00:58 ******* kernel: [13765] 0 13765 48089 92 7 0 0 winbindd
Mar 1 20:00:58 ******* kernel: [13873] 703 13873 2403434 6196 5 0 0 oracle
Mar 1 20:00:58 ******* kernel: [13875] 703 13875 2402873 651 3 0 0 oracle
Mar 1 20:00:58 ******* kernel: [13880] 703 13880 2402873 423 4 0 0 oracle
.. Note: I removed a large number of oracle process entries here to stay within the character limit for this post. A total of 296 oracle processes were running.
..
Mar 1 20:00:59 ******* kernel: [18644] 0 18644 44207 371 1 0 0 bpclntcmd
Mar 1 20:00:59 ******* kernel: [18647] 703 18647 57442 240 3 0 0 oracle
Mar 1 20:00:59 ******* kernel: [18656] 703 18656 57442 185 6 0 0 oracle
Mar 1 20:00:59 ******* kernel: [18657] 54329 18657 9279 196 1 0 0 nrpe
Mar 1 20:00:59 ******* kernel: [18660] 54329 18660 9314 255 2 0 0 nrpe
Mar 1 20:00:59 ******* kernel: [18662] 0 18662 39263 289 5 0 0 crond
Mar 1 20:00:59 ******* kernel: [18663] 0 18663 5745 341 1 0 0 saposcol
Mar 1 20:00:59 ******* kernel: [18664] 54329 18664 9315 146 3 0 0 nrpe
Mar 1 20:00:59 ******* kernel: [18665] 54329 18665 5730 76 0 0 0 check_open_file
Mar 1 20:00:59 ******* kernel: [18666] 54329 18666 6611 191 4 0 0 xinetd
Mar 1 20:00:59 ******* kernel: [18667] 0 18667 8389 183 1 0 0 sapcimb
Mar 1 20:00:59 ******* kernel: [18669] 0 18669 6610 171 0 0 0 xinetd
Mar 1 20:00:59 ******* kernel: [18670] 0 18670 6610 171 0 0 0 xinetd
Mar 1 20:00:59 ******* kernel: [18677] 0 18677 6610 177 5 0 0 xinetd
Mar 1 20:00:59 ******* kernel: [18678] 703 18678 29497 275 4 0 0 perl
Mar 1 20:00:59 ******* kernel: [18682] 703 18682 29497 252 7 0 0 perl
Mar 1 20:00:59 ******* kernel: [18683] 703 18683 29497 231 0 0 0 perl
Mar 1 20:00:59 ******* kernel: [18687] 0 18687 2620 92 1 0 0 .SAPOSCOL_00000
Mar 1 20:00:59 ******* kernel: [18688] 0 18688 6610 186 5 0 0 xinetd
Mar 1 20:00:59 ******* kernel: [18689] 0 18689 6610 189 2 0 0 xinetd
Mar 1 20:00:59 ******* kernel: [18690] 0 18690 6610 191 3 0 0 xinetd
Mar 1 20:00:59 ******* kernel: [18691] 0 18691 6610 194 2 0 0 xinetd
Mar 1 20:00:59 ******* kernel: Out of memory: Kill process 13900 (oracle) score 77 or sacrifice child
Mar 1 20:00:59 ******* kernel: Killed process 13900, UID 703, (oracle) total-vm:9622308kB, anon-rss:5180kB, file-rss:4028040kB
From the above, I think these lines say I have plenty of RAM and swap, right?
Node 0 DMA free:15132kB
Node 0 DMA32 free:64904kB
Node 0 Normal free:55016kB
Free swap = 33548340kB
Total swap = 36184056kB
I am also wondering what "all_unreclaimable? yes" means for Node 0 DMA, and "all_unreclaimable? no" for Node 0 DMA32 and Node 0 Normal.
Here is some more information about the server settings:
$sudo sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
error: "net.bridge.bridge-nf-call-ip6tables" is an unknown key
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
error: "net.bridge.bridge-nf-call-arptables" is an unknown key
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.msgmni = 1024
kernel.sem = 1250 256000 100 8192
vm.max_map_count = 1000000
kernel.shmall = 1152921504606846720
fs.file-max = 19801952
net.core.rmem_default = 1048576
net.core.wmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
vm.swappiness = 0
vm.dirty_background_ratio = 3
vm.dirty_ratio = 15
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
kernel.shmmni = 4096
Answer
You don't have much free memory at all.
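To see why the "free" figures in the report are less comfortable than they look, compare each zone's free memory with its min watermark plus the lowmem reserve. The sketch below uses a simplified form of the kernel's watermark check and assumes the allocation targets the Normal zone, so the third lowmem_reserve[] entry of each zone applies. The dictionary layout and function names are illustrative only; the numbers are copied from the "Node 0" lines in the question.

```python
# Per-zone figures copied from the "Node 0 ..." lines of the OOM report.
# free_kb and min_kb are in kB; reserve_pages is the lowmem_reserve[] entry
# assumed to apply to an allocation that targets the Normal zone.
PAGE_KB = 4  # page size in kB on this x86_64 kernel

ZONES = {
    "DMA":    {"free_kb": 15132, "min_kb": 60,    "reserve_pages": 16130},
    "DMA32":  {"free_kb": 64904, "min_kb": 12556, "reserve_pages": 13130},
    "Normal": {"free_kb": 55016, "min_kb": 54964, "reserve_pages": 0},
}


def headroom_kb(zone):
    """Free memory above (min watermark + lowmem reserve), in kB."""
    threshold_kb = zone["min_kb"] + zone["reserve_pages"] * PAGE_KB
    return zone["free_kb"] - threshold_kb


def zone_headroom():
    """Return the headroom of every zone, keyed by zone name."""
    return {name: headroom_kb(zone) for name, zone in ZONES.items()}


if __name__ == "__main__":
    for name, kb in zone_headroom().items():
        print(f"{name:7s} headroom: {kb:>8d} kB")
```

Under these assumptions DMA and DMA32 are below their effective threshold and Normal is sitting right at its min watermark, so none of the zones has meaningful memory to hand out.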
First, vm.swappiness = 0.
Only set this if you are definitely sure you have enough memory. A low but non-zero value, such as 10, might prevent an out-of-memory condition and lets the kernel actually make use of your paging space.
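As a rough illustration, a check like this can be scripted against the `sysctl -p` output. The sketch below parses a few of the lines posted in the question and flags a swappiness of 0. The helper names `parse_sysctl` and `swappiness_warning` are made up for this example and are not part of any tool.

```python
# A few lines in the same format as the `sysctl -p` output in the question.
SYSCTL_OUTPUT = """\
kernel.sysrq = 0
error: "net.bridge.bridge-nf-call-iptables" is an unknown key
kernel.sem = 1250 256000 100 8192
vm.swappiness = 0
vm.dirty_ratio = 15
"""


def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, skipping error lines."""
    settings = {}
    for line in text.splitlines():
        if line.startswith("error:") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def swappiness_warning(settings):
    """Return a warning string if vm.swappiness is 0, otherwise None."""
    value = settings.get("vm.swappiness")
    if value is not None and int(value) == 0:
        return "vm.swappiness is 0: the kernel will avoid swapping for as long as it can"
    return None


if __name__ == "__main__":
    print(swappiness_warning(parse_sysctl(SYSCTL_OUTPUT)))
```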
From the system-wide summary for node 0 (the counters are in 4 kB pages), your 16 GB is roughly one fifth page tables (about 3.5 GB), two fifths shared memory (about 6.6 GB), and just under a third other anonymous program memory (about 5 GB), plus various odds and ends. Notice that the readily reclaimable file cache plus free memory comes to well under 200 MB, which is not large. The system won't be able to give you another GB or so of shared memory.
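The split can be checked by converting the page counters in the Mem-Info summary to GiB. This is a minimal sketch that assumes 4 kB pages and that shared memory (shmem) is counted inside the anonymous counters, as is usual for swap-backed pages on this kernel generation. The counter values are copied from the report; the function names are illustrative.

```python
PAGE_KB = 4                # page size in kB
KB_PER_GIB = 1024 * 1024

# Page counts copied from the Mem-Info summary in the OOM report.
PAGES = {
    "total_ram": 4194288,
    "active_anon": 2673615,
    "inactive_anon": 368657,
    "shmem": 1730877,
    "pagetables": 906135,
    "free": 33763,
    "active_file": 3541,
    "inactive_file": 3962,
}


def gib(pages):
    """Convert a count of 4 kB pages to GiB."""
    return pages * PAGE_KB / KB_PER_GIB


def breakdown(p=PAGES):
    """Group the counters into the categories discussed in the answer."""
    anon = p["active_anon"] + p["inactive_anon"]
    return {
        "page tables": p["pagetables"],
        "shared memory": p["shmem"],
        # Assumption: shmem pages are included in the anon counters.
        "other anonymous": anon - p["shmem"],
        "free + file cache": p["free"] + p["active_file"] + p["inactive_file"],
    }


if __name__ == "__main__":
    for name, pages in breakdown().items():
        share = 100 * pages / PAGES["total_ram"]
        print(f"{name:18s} {gib(pages):6.2f} GiB  {share:5.1f}%")
```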
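Page tables are eating you alive: about 3.5 GB of RAM goes to them alone, because each of the 296 oracle processes needs its own page table entries for the shared memory it maps in 4 kB pages. You may not have huge pages enabled, which both Oracle and Red Hat recommend for databases.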
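A back-of-the-envelope calculation shows where that page table memory can come from and how much huge pages could save. The sketch below rests on several assumptions: 8-byte page table entries on x86_64, every one of the 296 oracle processes touching the whole shared segment (a worst case), the shmem counter from the report standing in for the SGA size, and 2 MB huge pages. Only last-level entries are counted, so treat the result as an estimate.

```python
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024
PTE_BYTES = 8  # size of one x86_64 page-table entry

# Assumption: the shmem counter (1730877 pages of 4 kB) approximates the SGA.
SHARED_BYTES = 1730877 * PAGE_4K
PROCESSES = 296  # oracle process count stated in the question


def pte_bytes(mapped_bytes, page_size):
    """Bytes of last-level page-table entries needed to map mapped_bytes."""
    entries = -(-mapped_bytes // page_size)  # ceiling division
    return entries * PTE_BYTES


def worst_case_bytes(page_size):
    """Total entry bytes if every process maps the whole shared segment."""
    return PROCESSES * pte_bytes(SHARED_BYTES, page_size)


if __name__ == "__main__":
    for label, size in (("4 kB pages", PAGE_4K), ("2 MB pages", PAGE_2M)):
        total = worst_case_bytes(size)
        print(f"{label}: {total / 1024**2:10.1f} MiB of page-table entries")
```

With 4 kB pages the worst case is close to 4 GiB, the same order of magnitude as the pagetables figure in the report. With 2 MB pages the same mapping needs 512 times fewer entries.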