Steps to be followed before escalating a case for application core analysis

Article:HOWTO31291  |  Created: 2010-09-08  |  Updated: 2012-07-21  |  Article URL http://www.symantec.com/docs/HOWTO31291
Article Type
How To


Details:

Analyzing an application core file requires an application environment identical to the one on the customer's system, so it is imperative that we gather every shared library associated with the application binary. The example below shows why this matters. It uses vxconfigd; substitute whichever binary actually produced the core dump. For example, if "had" caused the dump, replace vxconfigd with had when collecting the information.



Customer's system:

==============

/var/tmp ->ps -eaf | grep -i vxconfigd | grep -v grep

   root    14     1  0   Oct 05 ?        0:17 vxconfigd -x syslog -m boot

/var/tmp ->gcore 14

gcore: core.14 dumped

/var/tmp ->pstack core.14

core 'core.14' of 14:   vxconfigd -x syslog -m boot

ff19caa4 poll     (624b28, 17, 493e0)

ff21d738 poll     (624b28, 17, 493e0, 0, 3b9aca00, 0) + 5c

00233850 vold_getrequest (34f000, 0, 56c718, ffffffff, 6c46c0, 0) + 3b0

001a1018 request_loop (357384, ffbffc9c, 303354, 6f670000, 6c6f6700, 0) + 108

0015beec main     (5, ffbffc9c, ffbffcb4, 349c00, 0, 0) + 1484

00045cb0 _start   (0, 0, 0, 0, 0, 0) + b8

/var/tmp ->pmap core.14

core 'core.14' of 14:   vxconfigd -x syslog -m boot

00010000    3224K r-x--  /sbin/vxconfigd

00344000     128K rwx--  /sbin/vxconfigd

00364000    5520K rwx--    [ heap ]

FAB7A000       8K rwx--

FAC7A000       8K rwx--

FB0C0000       8K r-x--  /etc/vx/lib/discovery.d/libvxsena.so.1

FB0D0000       8K rwx--  /etc/vx/lib/discovery.d/libvxsena.so.1

FB0E0000       8K r-x--  /etc/vx/lib/discovery.d/libvxibmsvc.so.1

..

FB300000       8K r-x--  /etc/vx/lib/discovery.d/libvxengenio.so.1

FB310000       8K rwx--  /etc/vx/lib/discovery.d/libvxengenio.so.1

FB320000      16K r-x--  /etc/vx/lib/discovery.d/libvxemc.so.1

FB332000       8K rwx--  /etc/vx/lib/discovery.d/libvxemc.so.1

FB340000       8K r-x--  /etc/vx/lib/discovery.d/libvxcscovrts.so.1

FB350000       8K rwx--  /etc/vx/lib/discovery.d/libvxcscovrts.so.1

FB360000       8K r-x--  /etc/vx/lib/discovery.d/libvxCLARiiON.so.1

FB370000       8K rwx--  /etc/vx/lib/discovery.d/libvxCLARiiON.so.1

..

FF000000      16K rw---

FF010000      24K r-x--  /etc/vx/slib/libnvpair.so.1

FF026000       8K rwx--  /etc/vx/slib/libnvpair.so.1

FF030000      16K r-x--  /etc/vx/slib/libdevice.so.1

..

FF3F6000      16K rwx--  /etc/lib/ld.so.1

FFBF6000      40K rwx--    [ stack ]

total     11952K

/var/tmp ->



Lab system:

=========

# pstack core.14

core 'core.14' of 14:   vxconfigd -x syslog -m boot

ff19caa4 poll     (624b28, 17, 493e0)

ff21d738 pause    (624b28, 17, 493e0, 0, 3b9aca00, 0) + 38

00233850 e4d_sccs (34f000, 0, 56c718, ffffffff, 6c46c0, 0) + 18

001a1018 lic_symbol_537 (357384, ffbffc9c, 303354, 6f670000, 6c6f6700, 0) + 90

0015beec wwn_to_phys_path (5, ffbffc9c, ffbffcb4, 349c00, 0, 0) + 304

00045cb0 vold_change_common (0, 0, 0, 0, 0, 0) + 2eb4

# pmap core.14

core 'core.14' of 14:   vxconfigd -x syslog -m boot

00010000   3224K read/exec         /sbin/vxconfigd

00344000    128K read/write/exec

00364000   5520K read/write/exec     [ heap ]

FAB7A000      8K read/write/exec

..

FF010000     24K read/exec         /etc/vx/slib/libnvpair.so.1

FF026000      8K read/write/exec   /etc/vx/slib/libnvpair.so.1

FF030000     16K read/exec         /etc/vx/slib/libdevice.so.1

FF044000      8K read/write/exec   /etc/vx/slib/libdevice.so.1

..

FFBF6000     40K read/write/exec     [ stack ]

total    11952K

#



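It is the same core file, but because the runtime environment on the lab system does not match the customer's environment, the same addresses resolve to the wrong symbols (for example, request_loop appears as lic_symbol_537 and main as wwn_to_phys_path), and the analysis could go badly off track. We therefore need all the associated libraries, which can be listed by running the following command: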



# ->ldd /sbin/vxconfigd
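
As a rough illustration of what the ldd listing provides, the sketch below pulls the resolved library paths out of ldd-style output. The function name list_lib_paths is hypothetical and is not part of any Symantec tool; it assumes entries of the form "name => /path" and silently skips entries that ldd could not resolve.

```shell
#!/bin/sh
# list_lib_paths (hypothetical helper): read ldd-style output on stdin and
# print each resolved absolute library path once, sorted.
list_lib_paths() {
    awk '
        # Resolved entries look like "libc.so.1 => /lib/libc.so.1".
        # Unresolved ones ("=> (file not found)") have no leading slash.
        $2 == "=>" && $3 ~ /^\// { print $3 }
    ' | sort -u
}

# Example use on a live system (not executed here):
#   ldd /sbin/vxconfigd | list_lib_paths
```

Every path printed this way is a file that has to exist, at the same version, on the system where the core is analyzed.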



Bryan Wood has created a ksh script, grab_ldd_libraries.sh, that gathers all the required information for an application core file. The script is reproduced in the SCRIPT section below.



How to run the script:

================

/var/tmp ->./grab_ldd_libraries.sh /sbin/vxconfigd core.14

2663362135      4492015 vxconfigd.grab_ldd_libraries.sh.27307.tar.Z

-rw-r--r--   1 anand    sysadmin 4492015 Oct 25 11:54 /tmp/vxconfigd.grab_ldd_libraries.sh.27307.tar.Z

please send this console output

along with /tmp/vxconfigd.grab_ldd_libraries.sh.27307.tar.Z

to your support representative

/var/tmp ->
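
The first line of the console output above is the cksum result for the tarball (CRC, then size in bytes, then file name), which is why the customer is asked to send the console output as well. A minimal sketch of checking the received file against those values follows; the helper name verify_cksum is an assumption for illustration and is not part of grab_ldd_libraries.sh.

```shell
#!/bin/sh
# verify_cksum (hypothetical helper): compare the CRC and byte count printed
# on the customer's console against the file actually received.
# Usage: verify_cksum EXPECTED_CRC EXPECTED_SIZE FILE
verify_cksum() {
    expected_crc=$1
    expected_size=$2
    file=$3
    # cksum prints "<crc> <size>"; reading from stdin omits the file name.
    set -- $(cksum < "$file")
    [ "$1" = "$expected_crc" ] && [ "$2" = "$expected_size" ]
}

# Example use with the values from the console output above (not executed here):
#   verify_cksum 2663362135 4492015 vxconfigd.grab_ldd_libraries.sh.27307.tar.Z \
#       && echo "tarball matches console output"
```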



Steps to be followed after receiving the above tar ball:

=======================================

·              Uncompress and untar the tarball in evidence.

·              The most important step of all is to rename the core file. Most of the time the core file is simply named "core". If gdb crashes while opening the core file for analysis, it dumps its own core in the current working directory, which overwrites the customer's core file. Rename the customer's core to something else before starting.

·              Verify the core file using gdb or dbx. Confirm that the stack of the core file matches the pstack output that is available in the output directory of the tarball. For example:

# pwd

/var/tmp/vxconfigd.grab_ldd_libraries.sh.27307/output

# ls pstack*

pstack.corefile.out

# more pstack*

core 'core.14' of 14:   vxconfigd -x syslog -m boot

ff19caa4 poll     (624b28, 17, 493e0)

ff21d738 poll     (624b28, 17, 493e0, 0, 3b9aca00, 0) + 5c

00233850 vold_getrequest (34f000, 0, 56c718, ffffffff, 6c46c0, 0) + 3b0

001a1018 request_loop (357384, ffbffc9c, 303354, 6f670000, 6c6f6700, 0) + 108

0015beec main     (5, ffbffc9c, ffbffcb4, 349c00, 0, 0) + 1484

00045cb0 _start   (0, 0, 0, 0, 0, 0) + b8

#
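
The comparison described in the last step can also be done mechanically. The sketch below, under the assumption that both stacks have been saved as pstack-style text files, extracts the function name from each frame and checks that the two sequences agree. The helper names stack_symbols and stacks_match are hypothetical.

```shell
#!/bin/sh
# stack_symbols (hypothetical helper): print the function name of each frame
# from pstack-style output on stdin. The "core '...' of PID:" header is
# skipped because its first field is not a hexadecimal address.
stack_symbols() {
    awk '$1 ~ /^[0-9a-fA-F]+$/ && NF >= 2 { print $2 }'
}

# stacks_match (hypothetical helper): succeed only if two saved pstack
# outputs resolve to the same sequence of function names.
# Usage: stacks_match CUSTOMER_PSTACK_FILE LAB_PSTACK_FILE
stacks_match() {
    [ "$(stack_symbols < "$1")" = "$(stack_symbols < "$2")" ]
}

# Example use (file names are placeholders, not executed here):
#   stacks_match output/pstack.corefile.out /tmp/lab_pstack.out \
#       || echo "stack mismatch: check the library environment"
```

A mismatch in function names at identical addresses is the same symptom shown in the lab system example earlier, and suggests the libraries in use differ from the customer's.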


SCRIPT:
#!/bin/ksh
#
# 1.0 - 2006/11/12 - initial alpha version
#
# Usage: grab_ldd_libraries.sh /path/to/binary [ /path/to/corefile ]
# Collects the binary, the optional core file, and the shared libraries
# they reference into a compressed tarball under /tmp.


BINFILE=${1:-none_specified}
COREFILE=${2:-none_specified}
ADB=/usr/bin/adb
PSTACK=/usr/bin/pstack
PMAP=/usr/bin/pmap
export BINFILE COREFILE ADB PSTACK PMAP

if [ -f ${BINFILE} ];then
 OUTPUTDIR=/tmp/$(basename ${BINFILE}).$(basename $0).$$
 export OUTPUTDIR
 mkdir -p ${OUTPUTDIR}/output
 if [ $? -eq 0 ];then
   # keep a copy of this script and of the binary with the evidence
   cp -p $0 ${OUTPUTDIR}
   cp -p ${BINFILE} ${OUTPUTDIR}
   rm -rf ${OUTPUTDIR}/output/cmds_file 2>/dev/null
   if [ -f ${COREFILE} ];then
     # capture the core file plus pstack, pmap and adb views of it
     cp -p ${COREFILE} ${OUTPUTDIR}
     file ${COREFILE} > ${OUTPUTDIR}/output/file.corefile.out
     ${PSTACK} ${COREFILE} > ${OUTPUTDIR}/output/pstack.corefile.out
     ${PMAP} ${COREFILE} > ${OUTPUTDIR}/output/pmap.corefile.out
     ${PMAP} ${COREFILE} | awk '/\/.* \// {print "mkdir -p .`dirname " $(NF) "` ; cp -p " $(NF) " ." $(NF)}' >> ${OUTPUTDIR}/output/cmds_file
     echo '<sp$<stackcalls' | ${ADB} ${BINFILE} ${COREFILE} > ${OUTPUTDIR}/output/stackcalls_adb.corefile.out 2>&1
     echo '$c' | ${ADB} ${BINFILE} ${COREFILE} > ${OUTPUTDIR}/output/stacktrace_adb.corefile.out 2>&1
   else
     echo
     echo "you did not specify the full path to the corefile, so some libraries will not be captured correctly."
     echo "you could use \"gcore -o /path/to/generated/corefile pid_of_running_binary\" to create one if needed."
     echo
   fi
   cd ${OUTPUTDIR}
   # generate one "mkdir ; cp" command per library reported by ldd, then run them
   ldd $(basename ${BINFILE}) > output/ldd.out
   ldd $(basename ${BINFILE}) | awk '{print "mkdir -p .`dirname " $(NF) "` ; cp -p " $(NF) " ." $(NF)}' >> output/cmds_file
   sort -u output/cmds_file > output/cmds_file.sorted
   /bin/ksh -x ./output/cmds_file.sorted > output/cmds_file.out 2>&1
   cd /tmp
   tar cf $(basename ${OUTPUTDIR}).tar $(basename ${OUTPUTDIR})
   if [ $? -eq 0 ];then
     rm -rf ./$(basename ${OUTPUTDIR})
     compress $(basename ${OUTPUTDIR}).tar
     echo
     cksum $(basename ${OUTPUTDIR}).tar.Z
     ls -la /tmp/$(basename ${OUTPUTDIR}).tar.Z
     echo
     echo please send this console output
     echo along with /tmp/$(basename ${OUTPUTDIR}).tar.Z
     echo to your support representative
     echo
   else
     echo "problem creating /tmp/$(basename ${OUTPUTDIR}).tar, some temporary files may still exist"
     exit 1
   fi
 else
   echo "problem creating directory ${OUTPUTDIR}/output"
   exit 1
 fi
else
 echo
 echo "binary file ${BINFILE} does not exist, usage:"
 echo "  $0 /path/to/binary [ /path/to/corefile ]"
 echo
 exit 1
fi
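
To make the library-copy step in the script easier to follow, the sketch below wraps the same awk program in a function and shows what it emits for one line of input. The wrapper name gen_copy_cmds and the sample input line are assumptions for illustration; the awk program itself is taken from the script. It relies on the library path being the last field of each line, as in the ldd and pmap output shown in this article.

```shell
#!/bin/sh
# gen_copy_cmds (hypothetical wrapper around the awk program used in the
# script): for each input line, emit a command that recreates the directory
# of the file named in the last field under the current directory and
# copies the file there.
gen_copy_cmds() {
    awk '{print "mkdir -p .`dirname " $(NF) "` ; cp -p " $(NF) " ." $(NF)}'
}

# Sample ldd-style line (illustrative); this only prints the generated
# command, it does not run it.
printf '\tlibnvpair.so.1 =>\t /etc/vx/slib/libnvpair.so.1\n' | gen_copy_cmds
```

The script collects such lines in output/cmds_file, removes duplicates with sort -u, and runs the result with ksh, so the tarball ends up with each library stored under its original path relative to the evidence directory.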



 

 
 
Acknowledgements
http://iwww.veritas.com/~avengada/app_core_analysis_tips.htm

