The DMP IO statistics thread may exhaust memory on Linux, invoking the OOM (Out Of Memory) killer and causing a system panic.

Article:TECH197980  |  Created: 2012-10-06  |  Updated: 2012-11-17  |  Article URL http://www.symantec.com/docs/TECH197980
Article Type: Technical Solution


Issue



The system panics after the DMP IO statistics queue is expanded. The following stack trace can be observed in syslog before the panic:

oom_kill_process+0x8a/0x2c0
select_bad_process+0xe1/0x120
out_of_memory+0x220/0x3c0
__alloc_pages_nodemask+0x89e/0x940
alloc_pages_current+0xaa/0x110
__vmalloc_area_node+0xe6/0x190
dmp_alloc+0x176/0x220 [vxdmp]
__vmalloc_node+0xa2/0xb0
dmp_alloc+0x176/0x220 [vxdmp]
vmalloc_32+0x2c/0x30
dmp_alloc+0x176/0x220
dmp_zalloc+0x1e/0x50
dmp_iostatq_add+0xef/0x690
dmp_iostatq_op+0x2cc/0x800
dmp_process_stats+0x0/0xe60
dmp_daemons_loop+0x1d7/0x260


Error



From syslog:

Sep 19 12:31:55 node4 kernel: dmp_daemon invoked oom-killer: gfp_mask=0xd4, order=0, oom_adj=0, oom_score_adj=0
Sep 19 12:31:55 node4 kernel: Pid: 12488, comm: dmp_daemon Tainted: P           ----------------   2.6.32-220.7.1.el6.x86_64 #1
Sep 19 12:31:55 node4 kernel: Call Trace:
Sep 19 12:31:55 node4 kernel: [<ffffffff810c2c61>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Sep 19 12:31:55 node4 kernel: [<ffffffff811139e0>] ? dump_header+0x90/0x1b0
Sep 19 12:31:55 node4 kernel: [<ffffffff8120d7ac>] ? security_real_capable_noaudit+0x3c/0x70
Sep 19 12:31:55 node4 kernel: [<ffffffff81113e6a>] ? oom_kill_process+0x8a/0x2c0
Sep 19 12:31:55 node4 kernel: [<ffffffff81113da1>] ? select_bad_process+0xe1/0x120
Sep 19 12:31:55 node4 kernel: [<ffffffff811142c0>] ? out_of_memory+0x220/0x3c0
Sep 19 12:31:55 node4 kernel: [<ffffffff81123fde>] ? __alloc_pages_nodemask+0x89e/0x940
Sep 19 12:31:55 node4 kernel: [<ffffffff81158b2a>] ? alloc_pages_current+0xaa/0x110
Sep 19 12:31:55 node4 kernel: [<ffffffff81149c36>] ? __vmalloc_area_node+0xe6/0x190
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c6656>] ? dmp_alloc+0x176/0x220 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81149b42>] ? __vmalloc_node+0xa2/0xb0
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c6656>] ? dmp_alloc+0x176/0x220 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81149d9c>] ? vmalloc_32+0x2c/0x30
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c6656>] ? dmp_alloc+0x176/0x220 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa07c671e>] ? dmp_zalloc+0x1e/0x50 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa07f8b5f>] ? dmp_iostatq_add+0xef/0x690 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81279000>] ? __bitmap_shift_right+0x130/0x160
Sep 19 12:31:55 node4 kernel: [<ffffffffa07fa66c>] ? dmp_iostatq_op+0x2cc/0x800 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa07fb845>] ? dmp_process_stats+0xaa5/0xe60 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff81012b59>] ? read_tsc+0x9/0x20
Sep 19 12:31:55 node4 kernel: [<ffffffff8109b310>] ? getnstimeofday+0x60/0xf0
Sep 19 12:31:55 node4 kernel: [<ffffffffa07fada0>] ? dmp_process_stats+0x0/0xe60 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffffa0801ca7>] ? dmp_daemons_loop+0x1d7/0x260 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff8100c14a>] ? child_rip+0xa/0x20
Sep 19 12:31:55 node4 kernel: [<ffffffffa0801ad0>] ? dmp_daemons_loop+0x0/0x260 [vxdmp]
Sep 19 12:31:55 node4 kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
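
The signature above can be checked for mechanically. The following is a minimal sketch, not a Symantec tool: it assumes syslog lines in the format shown in this article, and the function name and regular expressions are illustrative. It reports a match when an OOM kill invoked by dmp_daemon has both out_of_memory and dmp_iostatq_add in its call trace.

```python
import re

# Header line, e.g. "... kernel: dmp_daemon invoked oom-killer: gfp_mask=0xd4, ..."
OOM_HEADER = re.compile(r"kernel: (?P<comm>\S+) invoked oom-killer:")
# Call-trace frame, e.g. "[<ffffffffa07f8b5f>] ? dmp_iostatq_add+0xef/0x690 [vxdmp]"
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<func>[A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

REQUIRED_FRAMES = {"out_of_memory", "dmp_iostatq_add"}


def matches_dmp_iostat_oom(lines):
    """Return True if an OOM report from dmp_daemon contains the required frames."""
    in_dmp_report = False
    seen = set()
    for line in lines:
        header = OOM_HEADER.search(line)
        if header:
            # A new OOM report starts: remember who invoked it and reset the frames.
            in_dmp_report = header.group("comm") == "dmp_daemon"
            seen = set()
            continue
        if not in_dmp_report:
            continue
        frame = FRAME.search(line)
        if frame:
            seen.add(frame.group("func"))
            if REQUIRED_FRAMES <= seen:
                return True
    return False
```

A typical use would be to pass the lines of /var/log/messages to matches_dmp_iostat_oom(); the log file location is an assumption and depends on the syslog configuration.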
 


Environment



This issue has been observed only on Linux systems running:

- VxVM 5.1SP1 and later


Cause



This issue is tracked via Symantec etrack incident # 2943637.

When the DMP IO statistics queue is expanded, memory is allocated in a sleeping (blocking) way. The total size of the per-CPU memory chunks can be large on systems with many CPUs. If the Linux kernel cannot satisfy the allocation request, for example when the system is under high load, it invokes the OOM killer to kill other processes or threads and free memory, which may cause a system panic.
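
The gfp_mask=0xd4 value in the syslog excerpt is consistent with a sleeping allocation. The sketch below decodes it under one stated assumption: the flag bit values are those of include/linux/gfp.h in 2.6.32-era kernels, which differ in later kernels. Only bits the author of this sketch is sure of are listed; any other bit is reported as a raw hex value.

```python
# Flag bits assumed from include/linux/gfp.h in 2.6.32-era kernels.
# Do not reuse this table for other kernel versions.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x10: "__GFP_WAIT",  # the allocation is allowed to sleep
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}
GFP_WAIT_BIT = 0x10


def decode_gfp(mask):
    """Return the names of the known flags set in mask, lowest bit first."""
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    unknown = mask & ~sum(GFP_FLAGS)  # bits not covered by the table above
    if unknown:
        names.append(hex(unknown))
    return names


def may_sleep(mask):
    """Return True if the mask permits the allocator to sleep."""
    return bool(mask & GFP_WAIT_BIT)


print(decode_gfp(0xD4))  # flags set in the mask from the syslog excerpt
print(may_sleep(0xD4))
```

Under that assumption the mask contains __GFP_WAIT, so the allocator was allowed to block and reclaim memory, and the __GFP_DMA32 bit matches the vmalloc_32 frame in the call trace.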


Solution



Symantec has changed the code to allocate memory in a non-sleeping way when expanding the DMP IO statistics queue. If the system cannot satisfy the request, the allocation now fails quickly instead of invoking the OOM killer.

The fix is included in the VxVM 5.1SP1RP3P1 patch, which is available from the SORT website:

https://sort.symantec.com/patch/detail/6984

 

Until the patch is installed, apply the following workaround.

Workaround

To stop DMP IO statistics collection:

# vxdmpadm iostat stop
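
Where the workaround has to be applied from a script, a wrapper along the following lines could be used. This is a hypothetical sketch: the function name and the dry_run parameter are illustrative, and the only command it issues is the one shown above, with no additional options assumed.

```python
import shutil
import subprocess

# Command taken verbatim from the workaround; no extra options are assumed.
STOP_IOSTAT_CMD = ["vxdmpadm", "iostat", "stop"]


def stop_dmp_iostat(dry_run=True):
    """Return the workaround command; execute it only when dry_run is False."""
    if dry_run:
        return list(STOP_IOSTAT_CMD)
    if shutil.which(STOP_IOSTAT_CMD[0]) is None:
        raise FileNotFoundError("vxdmpadm not found on PATH")
    # Needs root privileges; raises CalledProcessError on a non-zero exit status.
    subprocess.run(STOP_IOSTAT_CMD, check=True)
    return list(STOP_IOSTAT_CMD)
```

The default is a dry run, so the command is executed only when the caller passes dry_run=False explicitly.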

 


Supplemental Materials

Source: ETrack
Value: 2943637
Description: The DMP IO statistics thread may exhaust memory, so that the Linux OOM (Out Of Memory) killer is invoked and causes a system panic.


