Veritas Cluster Server (VCS) Asynchronous Monitoring Framework (AMF) causing "ps" commands to hang after upgrading SuSE Linux Enterprise Server (SLES) kernel to a certain patch level

Article:TECH194083  |  Created: 2012-07-30  |  Updated: 2012-10-22  |  Article URL http://www.symantec.com/docs/TECH194083
Article Type
Technical Solution


Issue



AMF causes the "ps" command to hang after the SLES kernel is upgraded to a certain patch level.   Other commands that access the /proc filesystem are also affected.


Error



Programs that try to retrieve process information from the /proc filesystem hang.    A kernel core dump shows those programs in the uninterruptible state.

  2287      1   0  ffff8104629f77e0  UN   0.0    3564    924  ps          <<< "UN" - UNinterruptible
  2590   2528   0  ffff810432bf17a0  UN   0.0    3568    936  ps
  6423   2824   2  ffff810431cc5080  UN   0.0    3564    936  ps

The following is a sample thread stack of the above processes.

PID: 11951  TASK: ffff8106678c80c0  CPU: 3   COMMAND: "ps"
#0 [ffff810638171d08] schedule at ffffffff802ea7f8
#1 [ffff810638171de0] __down_read at ffffffff802eb937
#2 [ffff810638171e20] access_process_vm at ffffffff8013c3b0
#3 [ffff810638171e90] proc_pid_cmdline at ffffffff801bcff1
#4 [ffff810638171ed0] proc_info_read at ffffffff801bd8af
#5 [ffff810638171f10] vfs_read at ffffffff80186ba0
#6 [ffff810638171f40] sys_read at ffffffff80186f80
#7 [ffff810638171f80] system_call at ffffffff8010ae16
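
On a live system, tasks in this state appear with state "D" in /proc/<pid>/stat (the same state that the crash utility reports as "UN"). The following is a minimal Python sketch, not part of the original article, that lists such tasks. The function names are hypothetical. The stack above shows the hang in the /proc/<pid>/cmdline read path, so the sketch deliberately reads only the stat file. Whether reading stat is always safe on an affected system is an assumption that has not been verified here.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) parsed from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the LAST ')'.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every task whose state field is 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as stat_file:
                pid, comm, state = parse_stat_state(stat_file.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unparsable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Several "ps" entries that remain in this list across repeated runs would be consistent with the symptom described above, although a task can also be in the "D" state briefly during normal I/O.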
 


Environment



The problem affects SuSE Linux Enterprise Server (SLES) systems whose kernel includes a specific change and on which Veritas Cluster Server is configured with the Asynchronous Monitoring Framework (AMF).   The kernel change introduces a new kernel data structure, "proc_file_private".   Contact Novell for more information on the kernel change, or refer to the following link.

http://www.mentby.com/djalal-harouni/patch-79-proc-protect-procpidmapssmapsnumamaps.html
- [PATCH 7/9] proc: protect /proc/<pid>/{maps,smaps,numa_maps}
Protect the /proc/<pid>/{maps,smaps,numa_maps} files from reader across execve by checking its exec_id.

The kernel change is incorporated in the following SLES patches and subsequent higher version patches.

SLES 10 SP2GA kernel version 2.6.16.60-0.42.11 and after
SLES 10 SP3GA kernel version 2.6.16.60-0.74.7 and after
SLES 10 SP4GA kernel version 2.6.16.60-0.97.1 and after

Please refer to the following Novell webpage for details on the kernel versions.

http://wiki.novell.com/index.php/Kernel_versions
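
The per-service-pack thresholds listed above can be checked programmatically. The following Python sketch is illustrative only and is not a Symantec or Novell tool. The function names are hypothetical, and the sketch assumes that the kernel release string has the form "<version>-<release>-<flavor>" (for example, as printed by "uname -r") and that the service pack level is already known to the caller. The service pack must be supplied because the thresholds differ for each service pack.

```python
import re

# First kernel carrying the change, per SLES 10 service pack,
# copied from the list in this article.
FIRST_AFFECTED = {
    2: "2.6.16.60-0.42.11",
    3: "2.6.16.60-0.74.7",
    4: "2.6.16.60-0.97.1",
}


def parse_release(release):
    """Split '2.6.16.60-0.42.11-smp' into ((2, 6, 16, 60), (0, 42, 11))."""
    match = re.match(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    version = tuple(int(part) for part in match.group(1).split("."))
    patch = tuple(int(part) for part in match.group(2).split("."))
    return version, patch


def has_kernel_change(service_pack, release):
    """True if the kernel is at or after the listed level for its service pack."""
    threshold = FIRST_AFFECTED.get(service_pack)
    if threshold is None:
        raise ValueError("no threshold listed for SP%s" % service_pack)
    # Tuples of integers compare component by component, so 0.74.7
    # correctly sorts after 0.54.5 and before 0.74.10.
    return parse_release(release) >= parse_release(threshold)
```

For example, has_kernel_change(3, "2.6.16.60-0.74.7-smp") returns True, whereas has_kernel_change(3, "2.6.16.60-0.54.5-default") returns False. A True result means only that the kernel is at a listed level; the problem additionally requires VCS to be configured with AMF.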

 


Cause



The AMF kernel module was originally coded for SLES kernels without the above-mentioned kernel change.   After the kernel change was introduced, AMF no longer works correctly.    The issue is tracked through the Etrack incidents listed in the Supplemental Materials section of this article.

SLES kernels without the "proc_file_private" change are not affected.


Solution



A fix is currently available for SLES 10 SP4 and subsequent kernel patches.   The required patch is Storage Foundation High Availability for SLES 10 x86_64 version 5.1SP1RP2, which can be downloaded from the Symantec Operations Readiness Tools (SORT) website.

https://sort.symantec.com//patch/detail/5514
- sfha-sles10_x86_64-5.1SP1RP2 

The fix was first released as VRTSamf patch 5.1SP1RP1P1, which has since been obsoleted by the above patch.

https://sort.symantec.com/patch/detail/4964
- vcs-sles10_x86_64-VRTSamf-5.1SP1RP1P1  (obsoleted by the 5.1SP1RP2)

https://sort.symantec.com/patch/matrix
 

For SLES 10 SP2 and SP3, the fix is available in Storage Foundation High Availability 5.1SP1 RP3.    After the patch is installed, AMF works correctly with the following kernel versions.

SLES 10 SP2GA kernel version 2.6.16.60-0.42.11 and after
SLES 10 SP3GA kernel version 2.6.16.60-0.74.7 and after

Please download the patch from the Symantec Operations Readiness Tools (SORT) website.

https://sort.symantec.com/patch/matrix
https://sort.symantec.com/patch/detail/6818
 


Supplemental Materials

Source: ETrack
Value: 2348721
Description

When AMF is enabled, the cfsmount agent cannot start normally.  The basic event registration with the AMF driver fails.


Source: ETrack
Value: 2907684
Description

AMF: Support all SLES10 kernel versions - SLES10SP2 w/ CVE-2011-2022 VCS 5.1SP1 - Customer requests fix for AMF Etrack 2348721


