Disk write failure on shared LUNs during Volume Manager import. Diskgroup not importing in Sun Cluster.

Article:TECH156118  |  Created: 2011-03-21  |  Updated: 2012-11-19  |  Article URL http://www.symantec.com/docs/TECH156118
Article Type
Technical Solution



Issue



The diskgroup does not import in a Sun Cluster environment.  This has typically been seen after a system crash.  There may also be "Disk write failure" errors in the syslog (/var/adm/messages) even though the LUNs are write-enabled.  Check for stale SCSI reservation keys, which must be scrubbed before the import will succeed.

 


Error



Manual import at the command line:

 

gmacds703a# vxdg import vg04_egate_auto_gifc
VxVM vxdg ERROR V-5-1-10978 Disk group vg04_egate_auto_gifc: import failed:
No valid disk found containing disk group
 

Errors seen in syslog:

 

Nov 14 18:17:27 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d15s2: Disk write failure
Nov 14 18:17:27 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d15s2: Disk write failure
Nov 14 18:17:31 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d20s2: Disk write failure
Nov 14 18:17:31 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d20s2: Disk write failure
Nov 14 18:17:34 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d17s2: Disk write failure
Nov 14 18:17:34 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d17s2: Disk write failure
Nov 14 18:17:34 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d9s2: Disk write failure
Nov 14 18:17:34 gmacds703a vxvm:vxconfigd: [ID 702911 daemon.warning] V-5-1-10977 da_join failed, for device c5t4d9s2: Disk write failure
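
To see at a glance which devices are affected, the repeated syslog messages can be reduced to a unique device list.  The following is a minimal sketch, not a supported tool: the function name failed_devices is hypothetical, and it assumes the message text matches the "da_join failed, for device <device>: Disk write failure" form shown above.

```shell
# Print each device named in a "da_join failed ... Disk write failure"
# message exactly once. Reads syslog text on standard input.
# (failed_devices is a hypothetical helper name, not a VxVM command.)
failed_devices() {
    sed -n 's/.*da_join failed, for device \([^:]*\): Disk write failure.*/\1/p' |
        LC_ALL=C sort -u
}

# Usage (assumes the default Solaris syslog location):
#   failed_devices < /var/adm/messages
```

Each device reported here is a candidate for the reservation key check described in the Solution section.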


 


Environment



Sun Cluster and Volume Manager.  This is a generic Sun Cluster issue and could occur with any version of any Storage Foundation product.

In one instance, the Sun Cluster version was reported as:

Sun Cluster 3.2u1 for Solaris 10 sparc
Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.

 

 


Cause



There are residual SCSI keys (presumably written during Sun Cluster operations) that need to be removed from the LUNs before the Volume Manager import (vxdg import <diskgroup>) can succeed.  In order to import a diskgroup, Volume Manager must be able to write to one or more LUNs in the diskgroup.  A SCSI reservation placed on a disk by one host can prevent another host from writing to the disk.  The reservation is enforced in the LUN's controller and is handled by the SCSI layer of the OS, below Volume Manager in the I/O stack.

Removing these keys is similar to using vxfenadm to manually clear stale fencing keys when using I/O fencing in Veritas Cluster Server.


Solution



THESE ARE THE SUN CLUSTER COMMANDS THAT HAVE BEEN FOUND TO ACCOMPLISH THIS TASK.  PLEASE ENGAGE SUN CLUSTER SUPPORT RESOURCES OR ORACLE BEFORE PROCEEDING.

It is suggested that the OS device name be used in the commands below.  The OS device name can be determined and mapped to any Volume Manager disk by using the command "vxdisk -e list".

 

Example:

 

# vxdisk -e list|grep datadg
c3t2d0s2     auto:cdsdisk   c3t2d0s2     datadg      online               c3t2d0s2         lun RAID_0
c3t2d1s2     auto:sliced      c3t2d1s2     datadg      online               c3t2d1s2         lun RAID_0
 

(the OS device name, shown in the sixth column above, will often be the same as the device name but can be different)
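
The mapping from diskgroup to OS device names can also be pulled out of the listing with awk.  This is a sketch under assumptions: the function name os_names_for_dg is hypothetical, the group is taken to be in column 4 and the OS device name in column 6 as in the example output above, and the status is taken to be a single word (a multi-word status would shift the columns).  On Solaris, nawk or /usr/xpg4/bin/awk may be needed for the -v option.

```shell
# Print the OS device name (column 6) of every disk whose group
# (column 4) equals the diskgroup name passed as $1.
# Reads "vxdisk -e list" output on standard input.
# (os_names_for_dg is a hypothetical helper name.)
os_names_for_dg() {
    awk -v dg="$1" '$4 == dg { print $6 }'
}

# Usage:
#   vxdisk -e list | os_names_for_dg datadg
```

A diskgroup that is not currently imported may not be listed under its plain name, so the result should be checked against the full listing before it is used.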

 

1)  Confirm keys exist on each disk of the diskgroup

      /usr/cluster/lib/sc/scsi -c inkeys -d /dev/rdsk/<OS device name>


2)  If you need to get the diskgroup imported, you can clear the keys like this:

     /usr/cluster/lib/sc/scsi -c scrub -d /dev/rdsk/<OS device name>



3)  After clearing the key(s) from the disk(s), repeat the command in step 1 above to validate that all keys have been removed.

 

4)  The diskgroup should now import successfully.

     vxdg -C import <diskgroup> 
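
When a diskgroup has many disks, the per-disk commands from steps 1 and 2 can be generated for review rather than typed by hand.  The sketch below only prints the commands and does not run them; the function name print_key_commands and the variable SCSI_CMD are hypothetical, and the command path and the inkeys/scrub actions are taken from the steps above.

```shell
# Path of the Sun Cluster scsi utility as used in the steps above.
SCSI_CMD=/usr/cluster/lib/sc/scsi

# Print (do not execute) one scsi command per OS device name.
# $1 is the action (inkeys or scrub); remaining arguments are
# OS device names such as c5t4d9s2.
# (print_key_commands is a hypothetical helper name.)
print_key_commands() {
    action=$1
    shift
    case $action in
        inkeys|scrub) ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    for dev in "$@"; do
        echo "$SCSI_CMD -c $action -d /dev/rdsk/$dev"
    done
}

# Usage:
#   print_key_commands inkeys c5t4d9s2 c5t4d15s2
#   print_key_commands scrub  c5t4d9s2 c5t4d15s2
```

Printing first keeps the destructive scrub step under manual control: the generated lines can be reviewed, and confirmed with Sun Cluster support, before any of them is run.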

 

Example:

 

gmacds703a# vxdisk -o alldgs list|grep vg04_egate_auto_gifc
c5t4d9s2  auto:cdsdisk    vg04_egate_auto_gifc_01  vg04_egate_auto_gifc online
c5t4d15s2 auto:cdsdisk    vg04_egate_auto_gifc_02  vg04_egate_auto_gifc online
c5t4d17s2 auto:cdsdisk    vg04_egate_auto_gifc_03  vg04_egate_auto_gifc online
c5t4d20s2 auto:cdsdisk    vg04_egate_auto_gifc_04  vg04_egate_auto_gifc online
 

gmacds704a# /usr/cluster/lib/sc/scsi -c inkeys -d /dev/rdsk/c5t4d9s2
Reservation keys(1):
0x490bd41100000002
gmacds704a# /usr/cluster/lib/sc/scsi -c inkeys -d /dev/rdsk/c5t4d15s2
Reservation keys(1):
0x490bd41100000002
gmacds704a# /usr/cluster/lib/sc/scsi -c inkeys -d /dev/rdsk/c5t4d17s2
Reservation keys(1):
0x490bd41100000002
gmacds704a# /usr/cluster/lib/sc/scsi -c inkeys -d /dev/rdsk/c5t4d20s2
Reservation keys(1):
0x490bd41100000002
 

 

gmacds704a# /usr/cluster/lib/sc/scsi -c scrub -d /dev/rdsk/c5t4d9s2

gmacds704a# /usr/cluster/lib/sc/scsi -c scrub -d /dev/rdsk/c5t4d15s2

gmacds704a# /usr/cluster/lib/sc/scsi -c scrub -d /dev/rdsk/c5t4d17s2

gmacds704a# /usr/cluster/lib/sc/scsi -c scrub -d /dev/rdsk/c5t4d20s2
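
For the validation in step 3, the inkeys output can be reduced to a key count per disk.  This is a minimal sketch: the function name count_keys is hypothetical, and it assumes each key is printed on its own line beginning with 0x, as in the example output above.  The output of inkeys on a disk with no keys is not shown in this article, so the sketch counts key lines rather than parsing the "Reservation keys(N):" header.

```shell
# Count the reservation keys in "scsi -c inkeys" output read from
# standard input. A key is any line that starts with 0x followed by
# hexadecimal digits. Prints 0 when no such line is present.
# (count_keys is a hypothetical helper name.)
count_keys() {
    awk '/^0x[0-9a-fA-F]+/ { n++ } END { print n + 0 }'
}

# Usage:
#   /usr/cluster/lib/sc/scsi -c inkeys -d /dev/rdsk/c5t4d9s2 | count_keys
```

A count of 0 on every disk of the diskgroup suggests that the scrub removed the keys and that the import in step 4 can be attempted.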

 





