File system hangs when VxFS processes Inode Extended Operations and the VxFS kernel inode cache (sized by vxfs_ninode) is not large enough

Article:TECH186855  |  Created: 2012-04-19  |  Updated: 2012-06-22  |  Article URL http://www.symantec.com/docs/TECH186855
Article Type
Technical Solution

Product(s)

Environment

Issue



The file system hangs when VxFS (Veritas File System) tries to process Inode Extended Operations under either of the following conditions.

1. In a locally mounted VxFS environment, after a system crash or loss of storage access, the file systems are mounted again and VxFS processes the pending Inode Extended Operations as part of the mount process.

2. A CFS (Clustered File System) node crashes and the surviving CFS nodes try to process the pending Inode Extended Operations left behind by the crashed node.


Error



The following is a sample kernel thread stack when the problem happens.

crash> bt 3633
PID: 3633   TASK: ffff81042c4de7e0  CPU: 9   COMMAND: "vx_worklist_thr"
ffff81042bf2fc60:  ffff810829360c30 schedule_timeout+138
ffff81042bf2fc80:  0000000100212201 process_timeout
ffff81042bf2fd20:  ffff81080a72db80 vx_iget+456
ffff81042bf2fda0:  0000000000005850 vx_local_doextop_iau+290
ffff81042bf2fe10:  ffff810811502000 vx_doextop_fset_thread+158
ffff81042bf2fe60:  vx_worklist_hash vx_workitem_process+11
ffff81042bf2fe70:  0000000000000004 vx_worklist_process+485
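
When comparing a captured backtrace against the sample above, a small script can check for the same function names. The following is only a sketch: the helper name matches_extop_hang and the choice of which three frames count as the signature are assumptions made for illustration, not a Symantec diagnostic.

```python
import re

# Function names taken from the sample stack in this article.
# Treating these three frames as the "signature" is an assumption.
SIGNATURE = ("vx_iget", "vx_local_doextop_iau", "vx_doextop_fset_thread")


def matches_extop_hang(bt_text):
    """Return True if every signature function appears in the backtrace text."""
    # Symbols appear in the trace as name+offset, e.g. "vx_iget+456".
    symbols = set(re.findall(r"\b([A-Za-z_]\w*)\+\d+", bt_text))
    return all(name in symbols for name in SIGNATURE)


sample = """\
ffff81042bf2fc60:  ffff810829360c30 schedule_timeout+138
ffff81042bf2fd20:  ffff81080a72db80 vx_iget+456
ffff81042bf2fda0:  0000000000005850 vx_local_doextop_iau+290
ffff81042bf2fe10:  ffff810811502000 vx_doextop_fset_thread+158
"""

print(matches_extop_hang(sample))  # True
```

A match only indicates that the stack resembles the sample. It does not by itself confirm that the inode cache size is the cause.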
 


Environment



VxFS or CFS file systems on which a large number of files are being operated on when file system access is terminated, as described in the Issue section above. Such an abrupt loss of access may leave many pending Inode Extended Operations behind. These operations include file removal, file truncation, file shortening, etc. (Normal file reads and writes do not require Extended Operations.)


Cause



Inode Extended Operations are a VxFS (Veritas File System) performance feature that processes certain inode operations (e.g. file removal, file truncation) in the background. When access to a VxFS file system is terminated abruptly, pending Inode Extended Operations may be left behind to be processed during recovery. If the VxFS kernel inode cache is too small, the processing of these Extended Operations hangs.


Solution



The size of the VxFS kernel inode cache is controlled by the kernel variable vxfs_ninode.   Please refer to the Related Article section for tuning procedures on Solaris and Linux.

On HP-UX, the corresponding tunable is named vx_ninode and can be tuned with SAM.
SAM -> Kernel Configuration -> Tunables -> vx_ninode

On AIX, vxfs_ninode can be tuned by editing the /etc/vx/vxfssystem file and rebooting. Please refer to the AIX VxFS Administrator's Guide for details.
/etc/vx/vxfssystem:
vxfs_ninode <new value>

As a general rule, 1GB of kernel memory can accommodate about 655360 inodes. (The actual number may vary with the VxFS version and OS platform.) Increase vxfs_ninode to a value large enough to cover all the files that may have extended operations pending on them during VxFS file system recovery, and weigh the impact on memory usage at the same time.

VxFS generally recovers all the file systems at the same time, so the required inode cache size is the sum of the inodes with pending Inode Extended Operations across all the file systems.
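
The sizing rule above can be worked through as a short calculation. This is a rough sketch only: the per-file-system counts are hypothetical placeholder values (this article does not describe how to obtain them), and the 655360 inodes per 1GB figure is the approximate rule of thumb quoted above, which varies by VxFS version and platform.

```python
# Rule of thumb from this article: about 655360 inodes per 1GB of kernel memory.
INODES_PER_GB = 655360


def required_ninode(pending_extops_per_fs):
    """Sum pending extended-operation inodes across all file systems,
    since they are generally recovered at the same time."""
    return sum(pending_extops_per_fs.values())


def estimated_cache_gb(ninode):
    """Approximate kernel memory (in GB) needed for the given inode count."""
    return ninode / INODES_PER_GB


# Hypothetical mount points and counts, for illustration only.
pending = {"/data1": 400000, "/data2": 900000, "/data3": 10720}

total = required_ninode(pending)
print(total, round(estimated_cache_gb(total), 2))  # 1310720 2.0
```

In this example the three file systems together would need vxfs_ninode set to at least 1310720, which by the rule of thumb corresponds to roughly 2GB of kernel memory.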


Supplemental Materials

Source: ETrack
Value: 2689326
Description: On 5.1SP1RP1/RHEL5, unable to mount a 67TB filesystem despite full fsck passing OK. (The hang issue is explained in this ETrack incident.)



