Storage Foundation for Windows logs that reservation refresh is suspended for a disk group

Article:TECH50643  |  Created: 2007-01-26  |  Updated: 2013-04-26  |  Article URL http://www.symantec.com/docs/TECH50643
Article Type
Technical Solution


Issue



During cluster operation, Storage Foundation for Windows (SFW) may log an event in the system event log stating that it is suspending reservation refresh for the Volume Manager Disk Group (VMDG) resources that are currently online on this cluster node.


Error



Event ID: 52

Source: VXIO

Description: Cluster software communication timeout. Reservation refresh has been suspended for cluster disk group "Dgguid"


Environment



Microsoft Cluster Server and Failover Cluster


Cause



VXIO expects to receive heartbeat communications from the cluster while VMDG resources are online. While the heartbeats arrive, it maintains a SCSI reservation thread for the disks that make up the online disk groups.  The cluster service, via the VMDG resource DLL (vxres.dll), keeps a record that the disks are SCSI reserved, and the cluster monitoring cycle (LooksAlive / IsAlive) completes and shows the VMDG resources as online.

If VXIO does not hear from the cluster for a set timeout period, then it will suspend the reservation refresh.


Solution



In general, this error does not impact cluster operations.  It does not cause the VMDG resources to fail: the cluster service still has a record that the VMDG resources are online and will not attempt to fail over the service group.
 
While VMDG resources are online on a cluster node, VXIO issues a SCSI reservation request to each of the disks that make up the online disk groups.  This is done every 3 seconds.  In turn, VXIO receives a heartbeat every 3-5 seconds from the cluster resource DLL (vxres.dll) via the cluster resource monitor. If VXIO does not hear from the resource monitor for 10 minutes, SFW suspends the reservation thread.  The reservation is still held by this cluster node, and if there is no other issue relating to physical disk access, the VMDG resource will stay online and will not cause a service group failover.
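
The timing described above can be illustrated with a small model. The following Python sketch is not SFW code; it only simulates the intervals stated in this article (a reservation request every 3 seconds, suspension after 10 minutes without a heartbeat). The class and function names are hypothetical, and the choice to resume the refresh when heartbeats return is an assumption that this article does not confirm.

```python
from dataclasses import dataclass

RESERVATION_INTERVAL_S = 3      # reservation request interval stated in the article
HEARTBEAT_TIMEOUT_S = 10 * 60   # silence from the resource monitor before suspension


@dataclass
class ReservationRefreshModel:
    """Toy model of the refresh behaviour; hypothetical, not the VXIO implementation."""
    last_heartbeat_s: float = 0.0
    suspended: bool = False

    def on_heartbeat(self, now_s: float) -> None:
        # Record a heartbeat from the resource monitor.
        self.last_heartbeat_s = now_s
        # Assumption (not stated in the article): refresh resumes once heartbeats return.
        self.suspended = False

    def tick(self, now_s: float) -> str:
        # Called once per reservation interval.
        if not self.suspended and now_s - self.last_heartbeat_s >= HEARTBEAT_TIMEOUT_S:
            self.suspended = True
            return "suspend"    # the point at which Event ID 52 would be logged
        return "idle" if self.suspended else "refresh"


def simulate(heartbeat_times_s, duration_s):
    """Return the times (in seconds) at which the model suspends the refresh."""
    model = ReservationRefreshModel()
    pending = sorted(heartbeat_times_s)
    suspensions = []
    for now in range(0, duration_s + 1, RESERVATION_INTERVAL_S):
        # Deliver every heartbeat that arrived up to this tick.
        while pending and pending[0] <= now:
            model.on_heartbeat(pending.pop(0))
        if model.tick(now) == "suspend":
            suspensions.append(now)
    return suspensions
```

For example, if heartbeats arrive every 4 seconds and then stop after the one at 56 seconds, the model suspends the refresh at the first 3-second tick that falls at least 10 minutes later. With uninterrupted heartbeats it never suspends.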
 
If there were an issue with access to the disks from this cluster node, VXIO would not be able to reserve the disks and the VMDG resource would fault, or the application would not be able to read from or write to the disks and the application resource would fault.   If a service group resource faults, this node will not defend against the challenge because the reservation thread is suspended, and the service group can fail over.
 

If VXIO instead assumed that the missing communications from the resource monitor meant that the cluster software had failed, and terminated the SCSI reservation, the service group might fail over unnecessarily.

Recommendations:

  • Verify that the cluster software is running. This error is indicative of high resource utilisation on the cluster node
  • Review event logs for indications that other applications are experiencing resource shortage
  • Review event logs for cluster service messages indicating that there are issues monitoring resources
  • Consider separate resource monitors for the VMDG resources
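
As a starting point for the event log review, the sketch below builds a query for the Windows wevtutil command that selects Event ID 52 from the System log. It is a minimal example under stated assumptions: the provider name "vxio" is taken from the event source shown in this article and may be registered differently on a given system, and the function names are hypothetical.

```python
import subprocess


def build_event_query(provider="vxio", event_id=52):
    """Build an event log XPath filter. The provider name is an assumption."""
    return f"*[System[Provider[@Name='{provider}'] and (EventID={event_id})]]"


def build_wevtutil_command(query, max_events=20):
    """Return the argument list for 'wevtutil qe' against the System log."""
    return [
        "wevtutil", "qe", "System",
        f"/q:{query}",        # XPath filter
        "/f:text",            # plain-text output
        f"/c:{max_events}",   # limit the number of events returned
        "/rd:true",           # newest events first
    ]


def recent_suspension_events(max_events=20):
    """Run the query on a Windows node and return the raw text output."""
    cmd = build_wevtutil_command(build_event_query(), max_events)
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

Calling recent_suspension_events() on the affected node returns the matching events as text. Their timestamps can then be compared with cluster service messages and with signs of resource shortage in other applications.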

Further reading:

Further information about troubleshooting messages that are reported by vxio can be found in the technote that is linked in the "Related Documents" section.
 

Supplemental Materials

Value: a287196

Source: Event ID
Value: 52
Description: vxio: Cluster software communication timeout. Reservation refresh has been suspended for cluster disk group "Dgguid"

Source: UMI
Value: V-203-57349-52
Description: vxio: Cluster software communication timeout. Reservation refresh has been suspended for cluster disk group %2



Legacy ID



287196



