Unresponsive system (hang) or possible data loss due to an adverse interoperability issue between Qlogic 2GB and 4GB HBAs and Veritas Volume Manager versions: 4.1 MP1 with 122059-02, 4.1 MP2, 5.0, or 5.0 MP1
Article: TECH54291 | Created: 2008-01-12 | Updated: 2008-01-12 | Article URL: http://www.symantec.com/docs/TECH54291
Problem
Unresponsive system (hang) or possible data loss due to an adverse interoperability issue between Qlogic 2GB and 4GB HBAs and Veritas Volume Manager versions: 4.1 MP1 with 122059-02, 4.1 MP2, 5.0, or 5.0 MP1
Error
qlc: [ID 262021 kern.warning] WARNING: qlc(0): isr, Internal Parity/Pause Error - hccr=0h, stat=428113h, count=710644882
-OR-
WARNING: /pci@3,700000/SUNW,qlc@0,1/fp@0,0/ssd@w50060e800327572c,5a (ssd138): undecodable sense information: 0x0 0x0 0x0 0x1 0x0 0x0 0x0 0x2 0xff 0xff 0xff 0xff 0x13 0x7c 0
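To check whether a host has logged either message, the system log can be scanned for both signatures. The following is a minimal sketch and not part of the original TechNote: the function name count_qlc_dmp_errors is hypothetical, and /var/adm/messages is assumed to be the log location.

```shell
# Count lines on stdin that match either error signature quoted above.
# Reading from stdin lets the same function be run against any log file.
# Note: on Solaris, /usr/bin/grep lacks -E; use /usr/xpg4/bin/grep or egrep.
count_qlc_dmp_errors() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -E 'qlc\([0-9]+\): isr, Internal Parity/Pause Error|undecodable sense information' || true
}

# Usage (log path is an assumption; adjust for your system):
# count_qlc_dmp_errors < /var/adm/messages
```

A non-zero count only indicates that the messages were logged; whether the host is affected still depends on the HBA and VxVM release conditions described in this article.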
Solution
This issue only applies if you have Qlogic 2G or 4G host bus adapters (HBAs) on Solaris 8, 9, or 10 with one of the following releases of Veritas Volume Manager (VxVM):
- Veritas Volume Manager 4.1 MP1 with patch 122059-02
- Veritas Volume Manager 4.1 MP2 or later
- Veritas Volume Manager 5.0 or later
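One way to check the VxVM side of these conditions on a Solaris host is sketched below. It is an illustration only: it assumes the VxVM package is named VRTSvxvm and that `showrev -p` lists each installed patch on a line beginning with "Patch: ". The helper name patch_listed is hypothetical.

```shell
# Succeeds if patch ID $1 (for example 122059-02) starts a "Patch:" line
# in `showrev -p` output supplied on stdin.
# Assumption: each installed patch is listed as "Patch: <id> ...".
patch_listed() {
    grep "^Patch: $1" >/dev/null
}

# Usage on a Solaris host (package name VRTSvxvm is an assumption):
# pkginfo -l VRTSvxvm | grep VERSION
# showrev -p | patch_listed 122059-02 && echo "patch 122059-02 is installed"
```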
Detailed Description:
Due to an issue in DMP Fast Recovery procedures, interaction between Qlogic 2G and 4G HBAs and VxVM may cause Solaris systems to become unresponsive (hang) under heavy load conditions during dynamic multipathing (DMP) Fast Recovery IO failure analysis.
DMP Fast Recovery was introduced in the 5.0 release and back-ported to the 4.1 release through Patch 122059-02 as well as the Maintenance Patch (MP2) patchset (117080-07).
DMP Fast Recovery functionality greatly enhances IO failure analysis by communicating directly with the HBA driver, bypassing the SCSI disk (SD) driver which handles normal IO traffic. By communicating directly with the HBA, failure analysis can be conducted much more efficiently without suffering through backlogged SD driver queues that typically accompany IO path failures during heavy load.
Incident e1123248 documents two defects:
- Incorrect Command Descriptor Block (CDB) tagging.
- Failure to reset b_resid (number of bytes not transferred) back to zero upon subsequent attempts to resubmit a given IO.
The resulting behavior is HBA driver specific. Only Qlogic 2G or 4G HBAs have been found to exhibit this adverse behavior.
Resolution for 4.1 MP(x):
A binary hot fix is available for 4.1 MP2 to fix this issue. The 4.1 MP2 patch (117080-07) is a prerequisite for the binary hot fix.
A patchadd patch (128045-01) is available for VxVM 4.1 MP2 RP2. The 4.1 MP2 (117080-07) plus RP2 (124358-04) patches are prerequisites for this patchadd solution.
Please contact Symantec Support to obtain either of these patches, referencing this TechNote (292445).
Availability of the dmp_fast_recovery tunable on 4.1 MP2:
The 4.1 MP2 Release Notes (http://support.veritas.com/docs/287682) describe a tunable to disable the DMP Fast Recovery functionality:
"The dmp_fast_recovery tunable controls whether DMP should attempt to obtain SCSI error information directly from the HBA interface. Setting the value to on can potentially provide faster error recovery, provided that the HBA interface supports the error enquiry feature. If set to off, the HBA interface is not used. The default setting is off. Before enabling this tunable, make sure the HBA firmware level is supported in the HCL. Enabling this tunable with unsupported HBA firmware levels may result in a system panic."
There are three discrepancies in that quote from the MP2 Release Notes:
- While the DMP Fast Recovery feature was included in 4.1 MP2, the dmp_fast_recovery tunable was not exposed.
- DMP Fast Recovery is "on" by default.
- The last two sentences referring to HBA firmware and risk of a system panic actually apply to the 'monitor_fabric' tunable. This tunable is 'off' by default in 4.1 MP2, specifically to protect users against those risks.
In addition to repairing the two defects outlined above, the fix for Incident e1123248 also exposes the dmp_fast_recovery tunable as documented in the 4.1 MP2 Release Notes. The 5.0 release already exposes this tunable. The default value for this tunable remains "on" for both releases.
Resolution for 5.x:
The next rolling patch for 5.0 MP1 will include a permanent fix for these issues. This patch is tentatively scheduled for release by the end of January 2008. Until the patch is released, you can disable DMP Fast Recovery as a temporary workaround, as described below:
Workaround for 5.x:
1. Install VxVM 5.0 MP1:
http://support.veritas.com/docs/288505
2. Set dmp_fast_recovery=off:
root# vxdmpadm gettune all |grep fast_recovery
dmp_fast_recovery on on
root#
root# vxdmpadm settune dmp_fast_recovery=off
Tunable value will be changed immediately
root#
root# vxdmpadm gettune all |grep fast_recovery
dmp_fast_recovery off on
root#
root# cat /etc/vx/dmppolicy.info
arraytype
#
arrayname
#
enclosure
#
Tunables
dmp_fast_recovery=off
#
root#
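The check-then-set sequence above can also be scripted. The following is a minimal sketch rather than a supported procedure: it uses only the `vxdmpadm gettune` and `vxdmpadm settune` invocations shown in the session, assumes the three-column output layout (name, current value, default value), and the function names are hypothetical.

```shell
# Print the current value of dmp_fast_recovery from `vxdmpadm gettune`
# output on stdin. Assumes columns: name, current value, default value.
current_fast_recovery() {
    awk '$1 == "dmp_fast_recovery" { print $2 }'
}

# Turn the tunable off only if it is currently on; otherwise report the
# value that was found (or "unknown" if the tunable was not listed).
disable_fast_recovery_if_on() {
    state=$(vxdmpadm gettune all | current_fast_recovery)
    if [ "$state" = "on" ]; then
        vxdmpadm settune dmp_fast_recovery=off
    else
        echo "dmp_fast_recovery is already '${state:-unknown}'"
    fi
}

# Usage (as root on a host with VxVM installed):
# disable_fast_recovery_if_on
```

Parsing is kept separate from the commands that change state so that the parsing step can be checked against captured output without touching a live system.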
| Source | ETrack |
| Value | 1123248 |
| Description | dmp_fast_recovery defects affecting Qlogic HBAs |
Legacy ID: 292445