VMDG resources fail to online in a Microsoft Windows Cluster environment

Article:TECH175874  |  Created: 2011-11-30  |  Updated: 2012-11-18  |  Article URL http://www.symantec.com/docs/TECH175874
Article Type
Technical Solution



Issue



When bringing the Volume Manager Disk Group (VMDG) resource online in a Windows cluster, the resource times out.  By default, Windows Failover Cluster (Windows 2008) applies two timeouts when a resource is brought online: the Pending Online Timeout of 180 seconds and the Deadlock Timeout of 300 seconds.  MSCS (Windows 2003 cluster server) has only the Pending Online Timeout.

The issue becomes apparent when these timeouts expire.  The Windows event logs indicate an RHS.EXE failure with vxres.dll, or, on Windows 2008, Windows Error Reporting logs WSFC Resource Deadlock messages for ONLINERESOURCE.

These messages need further investigation to determine whether there is an issue with vxres.dll or whether the underlying resource is taking too long to come online.

This document covers a scenario in which the dynamic disk group and its volumes do arrive.  The arrivals are visible in the application event log, where they are logged by the Storage Foundation for Windows providers:

<date>    <time_t1>    INFORMATION       19867(0x65154d9b)    VxSvc_vxvm    <server>   
Importing dynamic disk group <vmdg_resource_name>.

<date>    <time_t2>    INFORMATION         800(0x65100320)    VxSvc_pnp    <server>    
Device \Device\HarddiskDmVolumes\<vmdg_resource_name>\<volume> has arrived.

Assuming there are no more volumes, or all of the volumes have corresponding events logged, VxSvc_vxvm should then log that the disk group is imported, and the cluster resource should go online.

When 180 seconds have elapsed between time_t1 and time_t3, the following event is logged on Windows Server 2008 R2:

<date>    <time_t3>    INFORMATION        1001(0x000003e9)    Windows Error Reporting    <server>   
Fault bucket , type 0
Event Name: WSFC Resource Deadlock
Response: Not available
Cab Id: 0

Problem signature:
P1: <vmdg_resource_name>
P2: Volume Manager Disk Group
P3: ONLINERESOURCE

The problem is that the disk group import never finishes despite the arrival of the volumes.
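
The timing check described above can be sketched in a few lines of Python. This is only an illustration, not a Symantec or Microsoft tool: the function name `classify_online_attempt`, the result keys, and the timestamp format are assumptions made for the example, and the three timestamps must be copied by hand from the event log entries for time_t1 (import started), time_t2 (last volume arrival) and time_t3 (WSFC Resource Deadlock event). The timeout values are the defaults quoted in this article.

```python
from datetime import datetime

# Default timeouts quoted in this article, in seconds.
PENDING_ONLINE_TIMEOUT = 180
DEADLOCK_TIMEOUT = 300


def classify_online_attempt(import_started, last_volume_arrived, deadlock_logged,
                            fmt="%Y-%m-%d %H:%M:%S"):
    """Compare hand-copied event timestamps against the default timeouts.

    The timestamp format is an assumption; adjust `fmt` to match how the
    times were copied from the event log.
    """
    t1 = datetime.strptime(import_started, fmt)       # "Importing dynamic disk group"
    t2 = datetime.strptime(last_volume_arrived, fmt)  # last "... has arrived" event
    t3 = datetime.strptime(deadlock_logged, fmt)      # WSFC Resource Deadlock event

    elapsed = (t3 - t1).total_seconds()
    idle_after_volumes = (t3 - t2).total_seconds()

    return {
        "elapsed_seconds": elapsed,
        "seconds_after_last_volume": idle_after_volumes,
        "pending_timeout_exceeded": elapsed >= PENDING_ONLINE_TIMEOUT,
        "deadlock_timeout_exceeded": elapsed >= DEADLOCK_TIMEOUT,
        # Volumes arrived first, yet the online attempt still ran into the
        # timeout: consistent with an import that never finishes.
        "consistent_with_stalled_import": (
            t1 <= t2 <= t3 and elapsed >= PENDING_ONLINE_TIMEOUT
        ),
    }


if __name__ == "__main__":
    # Example values only; replace with times from the actual event log.
    result = classify_online_attempt(
        "2011-11-30 10:00:00",
        "2011-11-30 10:00:05",
        "2011-11-30 10:03:00",
    )
    for key, value in result.items():
        print(f"{key}: {value}")
```

A long gap between the last volume arrival and the deadlock event, with no disk group import completion logged in between, is the pattern this article describes. A timeout that expires while volumes are still arriving points instead to a resource that is simply slow to come online.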


Environment



Microsoft Cluster


Cause



During VMDG online, the disk group configuration is updated in the SFW VEA database.  This process will not finish if there is another process that is holding a database lock.

Note: This issue was first observed on failover when a disk was missing from the disk group.


Solution



The hotfix for this issue is available in SFW 5.1 SP2 CP7.


Supplemental Materials

Source: ETrack
Value: 2610786
Description: Unable to failover, DLL 'vxres.dll' either crashed or deadlocked


