How to prepare for crash dump analysis on a Linux system

Article:TECH147736  |  Created: 2011-01-08  |  Updated: 2011-01-28  |  Article URL http://www.symantec.com/docs/TECH147736
Article Type
Technical Solution


Problem



The system panicked with the panic string "vx_worklist_thr".


Error



[ ERROR MESSAGES ]
At the time of panic the current process was vx_worklist_thr:

PID: 4916 TASK: ffff81187c0d1820 CPU: 13 COMMAND: "vx_worklist_thr"
#0 [ffff81012bf07c88] crash_kexec at ffffffff800acb03
#1 [ffff81012bf07d48] panic at ffffffff80090e7c
#2 [ffff81012bf07e88] gab_kill_process at ffffffff88d2f91b
#3 [ffff81012bf07ec8] gab_timerscan at ffffffff88d20483
#4 [ffff81012bf07f28] run_timer_softirq at ffffffff80096e0a
#5 [ffff81012bf07f58] __do_softirq at ffffffff8001235a
#6 [ffff81012bf07f88] call_softirq at ffffffff8005e2fc
#7 [ffff81012bf07fa0] do_softirq at ffffffff8006cb20
#8 [ffff81012bf07fb0] apic_timer_interrupt at ffffffff8005dc8e
--- <IRQ stack> ---
#9 [ffff81187d561798] apic_timer_interrupt at ffffffff8005dc8e
[exception RIP: shrink_inactive_list+2172]
RIP: ffffffff800ca128 RSP: ffff81187d561840 RFLAGS: 00000202
RAX: 0000000000000020 RBX: ffff810ca77f9bb0 RCX: ffff810009041580
RDX: 0000000000000020 RSI: ffff810c80001f88 RDI: ffff810ca77f9bb0
RBP: ffffffff800ce991 R8: ffff810c80001600 R9: 000000000000002e
R10: ffff810c80001600 R11: 000000007d561a90 R12: ffff81181289cb80
R13: ffff8107a8ead7c8 R14: ffff810cd7841ad8 R15: 000000000d4eb000
ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018
#10 [ffff81187d561a28] shrink_zone at ffffffff80013043
#11 [ffff81187d561a68] try_to_free_pages at ffffffff800ca8bb


Environment



[ VERSION  OF  OS/PACKAGE  ]
- RELEASE: 2.6.18-164.2.1.el5
- SFHA5.0MP3RP2


Cause



- Low memory (the system ran out of memory)


Solution



[ How to prepare for crash dump analysis on a Linux system ]

1. First, locate a machine with the same architecture as the customer's system.
 
2. Check the customer's vxexplorer output, specifically the "uname -a" output.
# uname -a
Linux symc-linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:54:55 EDT 2009 ia64 ia64 ia64 GNU/Linux
 
The above output indicates:
Architecture = ia64 (Itanium)
Kernel version = 2.6.18-164.el5
 
So locate a server with an Itanium processor running kernel version 2.6.18-164.el5.
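 
The two values read off in step 2 can also be pulled out of a saved "uname -a" line with a short script. This is only a sketch: the function names kernel_release and kernel_arch are illustrative, and it assumes the standard eight-field Linux "uname -a" layout shown above.

```shell
#!/bin/sh
# Sketch: read the kernel release and architecture from a saved "uname -a" line.
# kernel_release and kernel_arch are illustrative names, not part of any tool.

# The kernel release is the third whitespace-separated field.
kernel_release() {
    echo "$1" | awk '{print $3}'
}

# The last four fields are machine, processor, hardware platform and OS,
# so the machine architecture is the fourth field from the end.
kernel_arch() {
    echo "$1" | awk '{print $(NF-3)}'
}

line='Linux symc-linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:54:55 EDT 2009 ia64 ia64 ia64 GNU/Linux'
echo "release: $(kernel_release "$line")"
echo "arch:    $(kernel_arch "$line")"
```

On the analysis machine itself, "uname -r" and "uname -m" print the same two values directly, which makes it easy to compare them with the customer's output.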
 
3. Once an ia64 machine has been located, download the debug kernels for the kernel version above.
The kernel could be RHEL 5 or older (this one is RHEL 5; see the kernel version above).
 
1) The RHEL 5 debug kernels can be found at the URL below:
 
The debug kernels for RHEL 4 or earlier can be found at the following URL:
 
2) Locate the debug kernels that match the server architecture, as below:
 
3) Download the two debug kernels above with the "wget" command, as below:
 
/--------------------------------------------------------------------------\
           => `kernel-debuginfo-2.6.18-164.el5.ia64.rpm'
Resolving ftp.redhat.com... 209.132.183.61
Connecting to ftp.redhat.com|209.132.183.61|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/redhat/linux/enterprise/5Server/en/os/ia64/Debuginfo ... done.
==> SIZE kernel-debuginfo-2.6.18-164.el5.ia64.rpm ... 167591790
==> PASV ... done.    ==> RETR kernel-debuginfo-2.6.18-164.el5.ia64.rpm ... done.
Length: 167591790 (160M)
 
 1% [                                       ] 1,947,520   76.3K/s  eta 30m 38s
\--------------------------------------------------------------------------/
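 
The two package file names follow directly from the kernel release and architecture, so they can be generated instead of typed. This is a minimal sketch: debuginfo_rpms is an illustrative helper name, and BASE_URL is a placeholder that must be set to the Debuginfo directory for the release in question; no URL is assumed here.

```shell
#!/bin/sh
# Sketch: derive the two debuginfo package file names and optionally fetch them.
# debuginfo_rpms is an illustrative helper; BASE_URL is a placeholder (assumption).

debuginfo_rpms() {
    release=$1   # kernel release, e.g. 2.6.18-164.el5
    arch=$2      # architecture, e.g. ia64
    echo "kernel-debuginfo-common-${release}.${arch}.rpm"
    echo "kernel-debuginfo-${release}.${arch}.rpm"
}

# Leave BASE_URL empty to list the file names without downloading anything.
BASE_URL=${BASE_URL:-}
for rpm in $(debuginfo_rpms 2.6.18-164.el5 ia64); do
    if [ -n "$BASE_URL" ]; then
        wget "${BASE_URL}/${rpm}"
    else
        echo "would download: ${rpm}"
    fi
done
```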
 
4. Once the two debug kernels have been downloaded, install them as below on the target machine located in step 2.
The debug kernel will be placed in /usr/lib/debug/lib/modules/<version>/vmlinux
Note: Install both kernel-debuginfo-2.6.18-164.el5.ia64.rpm and kernel-debuginfo-common-2.6.18-164.el5.ia64.rpm
  
 
1) Install the debuginfo-common kernel.
 
[symc-internal-linux/]# rpm -ivh kernel-debuginfo-common-2.6.18-164.el5.ia64.rpm
warning: kernel-debuginfo-common-2.6.18-164.el5.ia64.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
        file /usr/src/debug/kernel-2.6.18/xen/common/domain.c from install of kernel-debuginfo-common-2.6.18-164.el5.ia64 conflicts with file from package kernel-debuginfo-common-2.6.18-128.el5.x86_64
        file /usr/src/debug/kernel-2.6.18/xen/common/domctl.c from install of kernel-debuginfo-common-2.6.18-164.el5.ia64 conflicts with file from package kernel-debuginfo-common-2.6.18-128.el5.x86_64
        file /usr/src/debug/kernel-2.6.18/xen/common/event_channel.c from install of kernel-debuginfo-common-2.6.18-164.el5.ia64 conflicts with file from package kernel-debuginfo-common-2.6.18-128.el5.x86_64
        file /usr/src/debug/kernel-2.6.18/xen/common/keyhandler.c from install of kernel-debuginfo-common-2.6.18-164.el5.ia64 conflicts with file from package kernel-debuginfo-common-2.6.18-128.el5.x86_64
 
Note: If this debug kernel version conflicts with one that is already installed, force the install as below:
 
[symc-internal-linux/]# rpm -ivh --force kernel-debuginfo-common-2.6.18-164.el5.ia64.rpm
warning: kernel-debuginfo-common-2.6.18-164.el5.ia64.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:kernel-debuginfo-common########################################### [100%]
 
2) Now install the debuginfo kernel.
 
[symc-internal-linux/]# rpm -ivh kernel-debuginfo-2.6.18-164.el5.ia64.rpm
warning: kernel-debuginfo-2.6.18-164.el5.ia64.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:kernel-debuginfo       ########################################### [100%]
  
3) Confirm that the debug kernel files were installed, as below:
 
[root@symc-internal-linux]# pwd
/usr/lib/debug/lib/modules/2.6.18-164.el5
 
[root@symc-internal-linux]# ls
kernel  vmlinux
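 
The same confirmation can be scripted so that a missing debug kernel is noticed before crash is started. This is a sketch only: vmlinux_path and check_vmlinux are illustrative names, and the path layout is the one shown in step 4.

```shell
#!/bin/sh
# Sketch: check that the debug vmlinux for a kernel release is installed.
# vmlinux_path and check_vmlinux are illustrative names.

# Build the expected location of the debug kernel for a release.
vmlinux_path() {
    echo "/usr/lib/debug/lib/modules/$1/vmlinux"
}

# Print whether the file exists; return non-zero when it is missing.
check_vmlinux() {
    path=$(vmlinux_path "$1")
    if [ -f "$path" ]; then
        echo "found: $path"
    else
        echo "missing: $path"
        return 1
    fi
}

check_vmlinux 2.6.18-164.el5 || echo "install the kernel-debuginfo packages first"
```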
 
 
5. Start crash dump analysis.
   The map file is not required for 2.6.* kernels, but is required for 2.4.* kernels.
 
[symc-internal-linux127.0.0.1-2010-05-08-06:10:30]# crash /usr/lib/debug/lib/modules/2.6.18-164.el5/vmlinux ./vmcore
 
crash 4.0-7.2.3
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "ia64-unknown-linux-gnu"...                                     <-----------------
 
 
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-164.el5/vmlinux
    DUMPFILE: ./vmcore  [PARTIAL DUMP]
        CPUS: 32
        DATE: Sat May  8 07:08:15 2010
      UPTIME: 37 days, 18:05:45
LOAD AVERAGE: 0.89, 0.89, 0.69
       TASKS: 1259
    NODENAME: symc-linux
     RELEASE: 2.6.18-164.el5
     VERSION: #1 SMP Tue Aug 18 15:54:55 EDT 2009
     MACHINE: ia64  (1598 Mhz)
      MEMORY: 127.6 GB
       PANIC: " <0>Kernel panic - not syncing: Fatal exception"
         PID: 9504
     COMMAND: "vx_naio_worker"
        TASK: e000001057728000  [THREAD_INFO: e000001057729040]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)
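 
Before going further, it is worth checking that the RELEASE and MACHINE values in the banner match the customer's "uname -a" output from step 2. The sketch below reads those fields from a saved copy of the banner; banner_field is an illustrative helper, not a crash command, and the sample text is copied from the output above.

```shell
#!/bin/sh
# Sketch: read "NAME: value" fields from a saved crash banner.
# banner_field is an illustrative helper, not part of the crash utility.

banner_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)              # drop the right-alignment padding
            if (index(line, key ":") == 1) {      # the field name must start the line
                sub(/^[^:]*:[ \t]*/, "", line)    # strip "NAME:" and the blanks after it
                print line
                exit
            }
        }'
}

banner='      KERNEL: /usr/lib/debug/lib/modules/2.6.18-164.el5/vmlinux
     RELEASE: 2.6.18-164.el5
     MACHINE: ia64  (1598 Mhz)'

release=$(echo "$banner" | banner_field RELEASE)
machine=$(echo "$banner" | banner_field MACHINE | awk '{print $1}')
echo "dump was taken on $release / $machine"
```

If either value differs from the customer's system, the wrong debug kernel was probably installed in step 4.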
 
 
Note: For more detailed crash dump analysis, refer to TECH147735 (How to do a basic Linux system crash dump analysis prior to backline escalation).



