Veritas Volume Manager problem
Created: 12 Feb 2013 | Updated: 19 Feb 2013 | 27 comments
This issue has been solved. See solution.
We have two Sun 3510 arrays, each containing three disks. VxVM, on Solaris 10 for SPARC, has been set up to mirror each disk, i.e. Array 1 Disk 1 is mirrored to Array 2 Disk 1, and so on. We had a problem with the power to the controller of array 2, and the disks in that array ended up in a disabled and removed state.
The power has now been restored and the disks are visible to the operating system again. However, whatever we try, we cannot get VxVM to accept the disks.
It can see them but will not re-add them. We have followed the replace disk procedure, which fails with the message 'no device available to replace'.
Any suggestions would be most gratefully received.
If the disks are visible in 'vxdisk list' output then you might need to run '/etc/vx/bin/vxreattach <device>' so that the devices get attached back to the disk group.
A CLI snapshot might help better understand the issue
Hope this helps.
Greg
You mentioned that the failed disks are now visible to the OS, but I could not see them in the vxdisk list output above; I can only see the removed references.
cheers
tony
The disks in question are :
datadg04
datadg05
datadg06
All of which both Veritas and Solaris can see, but nothing I do will persuade Veritas to re-add them to the disk group datadg.
Can you post the latest "vxdisk list" output? If they are visible to VxVM, they should be listed there with the correct state. All I could see in the above are the "removed" references, not the actual disks.
cheers
tony
The latest vxdisk list output is posted above.
As the disks are showing as removed I have tried the re-add and add procedures with no result.
Posted above is the current situation. Please bear in mind that the disks showing as removed are the disks we are trying to re-add.
Hi
When a disk gets "detached" or "removed", it is still listed in that state because it is still referenced by the imported DG. If the problem disk is still accessible, it will also show up as a separate entry that is not part of a DG.
For example, I do not see Disk_9, Disk_10 and Disk_11 in that list, only the detached references.
Possibly you need to run a rescan in VxVM.
cheers
tony
Hi Tony
You were right. A re-scan of the disks gives the following output:
root@server01 # vxdisk list
DEVICE TYPE DISK GROUP STATUS
Disk_0 auto:sliced rootdg01 rootdg online
Disk_4 auto:cdsdisk datadg01 datadg online
Disk_5 auto:cdsdisk datadg02 datadg online
Disk_6 auto:cdsdisk datadg03 datadg online
Disk_7 auto:sliced rootdg02 rootdg online nohotuse
Disk_9 auto:cdsdisk - - online
Disk_10 auto:cdsdisk - - online
Disk_11 auto:cdsdisk - - online
- - datadg04 datadg removed nohotuse was:Disk_9
- - datadg06 datadg removed nohotuse was:Disk_11
- - datadg05 datadg removed was:Disk_10
But we still cannot re-add them to the datadg.
Regards
Greg
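For anyone following along, the pairing between the removed disk media records and the devices they used to live on can be read straight out of the listing above. This is only a minimal sketch: `removed_pairs` is a made-up helper name, not a VxVM command, and it assumes the column layout shown in Greg's output.

```shell
#!/bin/sh
# removed_pairs: hypothetical helper, not part of VxVM.
# Reads `vxdisk list` output on stdin and prints "dm-name device" for every
# removed disk media record that carries a was:<device> hint.
removed_pairs() {
    awk '$1 == "-" && $2 == "-" && $5 == "removed" {
        for (i = 6; i <= NF; i++)
            if ($i ~ /^was:/) {
                sub(/^was:/, "", $i)   # strip the was: prefix
                print $3, $i           # dm name, former device
            }
    }'
}
```

Usage would be along the lines of `vxdisk list | removed_pairs`, which for the listing above should pair datadg04 with Disk_9, datadg06 with Disk_11 and datadg05 with Disk_10.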
Please provide the following output:
# vxdisk list Disk_9
# vxdisk list Disk_10
# vxdisk list Disk_11
# vxprint -mg datadg | egrep -i "^dm|disk|device"
If this post has helped you, please vote or mark as solution
Hi Tony
Please see below:
Regards
Greg
root@server01 # vxdisk list Disk_9
Device: Disk_9
devicetag: Disk_9
type: auto
hostid: server01
disk: name= id=1196418828.29.server01
group: name=datadg id=1129915301.61.server01
info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags: online ready private autoconfig autoimport
pubpaths: block=/dev/vx/dmp/Disk_9s2 char=/dev/vx/rdmp/Disk_9s2
version: 3.1
iosize: min=512 (bytes) max=2048 (blocks)
public: slice=2 offset=2304 len=1140796160 disk_offset=0
private: slice=2 offset=256 len=2048 disk_offset=0
update: time=1360778965 seqno=0.52
ssb: actual_seqno=0.3
headers: 0 240
configs: count=1 len=1280
logs: count=1 len=192
Defined regions:
config priv 000048-000239[000192]: copy=01 offset=000000 enabled
config priv 000256-001343[001088]: copy=01 offset=000192 enabled
log priv 001344-001535[000192]: copy=01 offset=000000 enabled
lockrgn priv 001536-001679[000144]: part=00 offset=000000
Multipathing information:
numpaths: 1
c5t40d1s2 state=enabled
root@server01 # vxdisk list Disk_10
Device: Disk_10
devicetag: Disk_10
type: auto
hostid: server01
disk: name= id=1288191337.26.server01
group: name=datadg id=1129915301.61.server01
info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags: online ready private autoconfig autoimport
pubpaths: block=/dev/vx/dmp/Disk_10s2 char=/dev/vx/rdmp/Disk_10s2
version: 3.1
iosize: min=512 (bytes) max=2048 (blocks)
public: slice=2 offset=2304 len=1140796160 disk_offset=0
private: slice=2 offset=256 len=2048 disk_offset=0
update: time=1360778965 seqno=0.40
ssb: actual_seqno=0.1
headers: 0 240
configs: count=1 len=1280
logs: count=1 len=192
Defined regions:
config priv 000048-000239[000192]: copy=01 offset=000000 enabled
config priv 000256-001343[001088]: copy=01 offset=000192 enabled
log priv 001344-001535[000192]: copy=01 offset=000000 enabled
lockrgn priv 001536-001679[000144]: part=00 offset=000000
Multipathing information:
numpaths: 1
c5t40d0s2 state=enabled
root@server01 # vxdisk list Disk_11
Device: Disk_11
devicetag: Disk_11
type: auto
hostid: server01
disk: name= id=1196418973.33.server01
group: name=datadg id=1129915301.61.server01
info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags: online ready private autoconfig autoimport
pubpaths: block=/dev/vx/dmp/Disk_11s2 char=/dev/vx/rdmp/Disk_11s2
version: 3.1
iosize: min=512 (bytes) max=2048 (blocks)
public: slice=2 offset=2304 len=1140796160 disk_offset=0
private: slice=2 offset=256 len=2048 disk_offset=0
update: time=1360778965 seqno=0.54
ssb: actual_seqno=0.3
headers: 0 240
configs: count=1 len=1280
logs: count=1 len=192
Defined regions:
config priv 000048-000239[000192]: copy=01 offset=000000 enabled
config priv 000256-001343[001088]: copy=01 offset=000192 enabled
log priv 001344-001535[000192]: copy=01 offset=000000 enabled
lockrgn priv 001536-001679[000144]: part=00 offset=000000
Multipathing information:
numpaths: 1
c5t40d2s2 state=enabled
root@server01 # vxprint -mg datadg|egrep -i "^dm|disk|device"
diskdetpolicy=global
dm datadg01
da_name=Disk_4
device_tag=Disk_4
pub_bpath="/dev/vx/dmp/Disk_4s2
priv_bpath="/dev/vx/dmp/Disk_4s2
pub_cpath="/dev/vx/rdmp/Disk_4s2
priv_cpath="/dev/vx/rdmp/Disk_4s2
diskid=1129914971.25.server01
last_diskid=1129914971.25.server01
last_da_name=Disk_4
last_disk_offset=2304
dm datadg02
da_name=Disk_5
device_tag=Disk_5
pub_bpath="/dev/vx/dmp/Disk_5s2
priv_bpath="/dev/vx/dmp/Disk_5s2
pub_cpath="/dev/vx/rdmp/Disk_5s2
priv_cpath="/dev/vx/rdmp/Disk_5s2
diskid=1129914973.27.server01
last_diskid=1129914973.27.server01
last_da_name=Disk_5
last_disk_offset=2304
dm datadg03
da_name=Disk_6
device_tag=Disk_6
pub_bpath="/dev/vx/dmp/Disk_6s2
priv_bpath="/dev/vx/dmp/Disk_6s2
pub_cpath="/dev/vx/rdmp/Disk_6s2
priv_cpath="/dev/vx/rdmp/Disk_6s2
diskid=1129914975.29.server01
last_diskid=1129914975.29.server01
last_da_name=Disk_6
last_disk_offset=2304
dm datadg04
last_diskid=1196418828.29.server01
last_da_name=Disk_9
last_disk_offset=2304
dm datadg05
last_diskid=1288191337.26.server01
last_da_name=Disk_10
last_disk_offset=2304
dm datadg06
last_diskid=1196418973.33.server01
last_da_name=Disk_11
last_disk_offset=2304
nodevice=off
da_name=Disk_4
device_tag=Disk_4
path="/dev/vx/dmp/Disk_4s2
mkdevice=off
nodevice=off
da_name=Disk_5
device_tag=Disk_5
path="/dev/vx/dmp/Disk_5s2
mkdevice=off
nodevice=off
da_name=Disk_6
device_tag=Disk_6
path="/dev/vx/dmp/Disk_6s2
mkdevice=off
nodevice=off
device_tag=
mkdevice=off
nodevice=off
device_tag=
mkdevice=off
nodevice=off
device_tag=
mkdevice=off
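The point of asking for both outputs is to compare the disk ID stored on each device (the `disk:` line of `vxdisk list <device>`) with the `last_diskid` that the disk group remembers for each removed dm record. A rough sketch of that comparison follows; the two function names are invented for illustration, and the parsing assumes the output layout shown above.

```shell
#!/bin/sh
# device_diskid: hypothetical helper. Reads `vxdisk list <device>` output on
# stdin and prints the id= value from the "disk:" line.
device_diskid() {
    awk '$1 == "disk:" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^id=/) { sub(/^id=/, "", $i); print $i }
    }'
}

# dm_last_diskid NAME: hypothetical helper. Reads `vxprint -mg <dg>` output on
# stdin and prints the last_diskid recorded for disk media record NAME.
dm_last_diskid() {
    awk -v dm="$1" '
        $1 == "dm" { cur = $2 }                      # remember current dm record
        cur == dm && $1 ~ /^last_diskid=/ {
            sub(/^last_diskid=/, "", $1); print $1; exit
        }'
}
```

If `vxdisk list Disk_9 | device_diskid` and `vxprint -mg datadg | dm_last_diskid datadg04` print the same value, the device is the one the dm record last pointed at. In the output above that is the case for all three disks.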
Great, so now VxVM can see the disks that had the original problems.
I guess these were originally tagged as "failed was" but are now showing as "removed" because you tried the vxdiskadm replacement.
In theory it should now be a case of using vxdiskadm option 5, and the disks should be available to choose from.
cheers
tony
Hi Tony
I tried that with the following result:
Replace a failed or removed disk
Menu: VolumeManager/Disk/ReplaceDisk
Use this menu operation to specify a replacement disk for a disk
that you removed with the "Remove a disk for replacement" menu
operation, or that failed during use. You will be prompted for
a disk name to replace and a disk device to use as a replacement.
You can choose an uninitialized disk, in which case the disk will
be initialized, or you can choose a disk that you have already
initialized using the Add or initialize a disk menu operation.
Select a removed or failed disk [<disk>,list,q,?] list
Disk group: rootdg
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
Disk group: datadg
DM NAME DEVICE TYPE PRIVLEN PUBLEN STATE
dm datadg04 - - - - REMOVED
dm datadg05 - - - - REMOVED
dm datadg06 - - - - REMOVED
Select a removed or failed disk [<disk>,list,q,?] datadg04
VxVM ERROR V-5-2-1985 No devices are available as replacements for datadg04.
Select a different disk ? [y,n,q,?] (default: n)
Regards
Greg
Greg,
Would you be able to post the output of the following vxreattach (-c only checks if the disk can be reattached):
# vxreattach -c Disk_9
# vxreattach -c Disk_10
# vxreattach -c Disk_11
How were the disks removed from the dg originally (i.e. what commands were run to remove the disks)?
Hi
the output from the vxreattach commands :
root@server01 # vxreattach -c Disk_9
VxVM vxreattach ERROR V-5-2-238 No matching Volume Manager disk and device IDs found for Disk_9
root@server01 # vxreattach -c Disk_10
VxVM vxreattach ERROR V-5-2-238 No matching Volume Manager disk and device IDs found for Disk_10
root@server01 # vxreattach -c Disk_11
VxVM vxreattach ERROR V-5-2-238 No matching Volume Manager disk and device IDs found for Disk_11
As far as the person who ran the original commands can remember the vxdisk rm <volume> command was used.
https://sort.symantec.com/public/documents/sf/5.0M...
vxdisk rm <daname> removes disks from vxdisk list, it doesn't remove disks from the dg (or at least not cleanly) - so this might be the problem here (why the disks can't be reattached).
Try:
# vxdg -g datadg -k adddisk datadg04=Disk_9 datadg05=Disk_10 datadg06=Disk_11
Then provide the following vxprint output (the plexes may still need recovery, so check this before proceeding):
# vxprint -qhtrg datadg
In future, to remove disks from a diskgroup, use vxdg -g <dg> rmdisk (use -k option if disks will be replaced):
https://sort.symantec.com/public/documents/sf/5.0M...
or use vxdiskadm - option 4 (Remove a disk for replacement)
https://sort.symantec.com/public/documents/sf/5.0M...
Hi
The output from the vxdg command :
root@server01 vxdg -g datadg -k adddisk datadg04=Disk_09 datadg05=Disk_10 datadg06=Disk_11
VxVM vxdg ERROR V-5-1-639 Failed to obtain locks:
Disk_09: no such object in the configuration
The vxprint shows datadg04,05 and 06 still removed
Regards
Greg
Greg,
Apologies, made a typo in my original post, should have been Disk_9 not Disk_09:
# vxdg -g datadg -k adddisk datadg04=Disk_9 datadg05=Disk_10 datadg06=Disk_11
Try this again without the typo.
If it doesn't work, you may need to clear the import info as the vxdisk list shows the disks still imported on server01
# vxdisk clearimport Disk_9
# vxdisk clearimport Disk_10
# vxdisk clearimport Disk_11
After running the clearimport, try the vxdg adddisk again.
regards,
Grace
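A slip like Disk_09 for Disk_9 can be caught before vxdg is ever run by checking each device name against the DEVICE column of `vxdisk list`. The following is a sketch only; `check_devices` is a hypothetical helper, and it takes the same dm=device pairs that would be passed to `vxdg adddisk`.

```shell
#!/bin/sh
# check_devices: hypothetical helper, not part of VxVM.
# Reads `vxdisk list` output on stdin, takes dm=device pairs as arguments,
# reports any device missing from the DEVICE column and returns non-zero.
check_devices() {
    listing=$(cat)
    rc=0
    for pair in "$@"; do
        dev=${pair#*=}                      # text after the first "="
        if ! printf '%s\n' "$listing" |
            awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }'
        then
            echo "unknown device: $dev"
            rc=1
        fi
    done
    return $rc
}
```

For example, `vxdisk list | check_devices datadg04=Disk_9 datadg05=Disk_10 datadg06=Disk_11` would return success only if all three devices are listed.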
Hi Grace
The clear import seemed to work but the adddisk still errors (a new one this time):
VxVM vxdg ERROR V-5-1-2349 Device Disk_9 appears to be owned by disk group datadg.
I get the same error on all three disks.
Regards
Greg
Greg,
Try -f to force (normally not recommended, however we confirmed earlier that the diskids do match/are correct):
# vxdg -g datadg -k -f adddisk datadg04=Disk_9 datadg05=Disk_10 datadg06=Disk_11
Hi Grace
Sorry but same result :(
Regards
Greg
Greg,
Please provide this output to confirm the current state of the dg:
# vxprint -qhtrg datadg
In the meantime I'll put together the commands to remove the old dg config from the disks and add them in as fresh disks (since you'll have to resync the mirrors from scratch anyway) - not ideal but seems to be necessary due to the way the disks were removed.
regards,
Grace
Hi Grace
Please see below
Regards
Greg
root@server01 # vxprint -qhtrg datadg
dg datadg default default 41000 1129915301.61.dwprod01
dm datadg01 Disk_4 auto 2048 1140796160 -
dm datadg02 Disk_5 auto 2048 1140796160 -
dm datadg03 Disk_6 auto 2048 1140796160 -
dm datadg04 - - - - REMOVED
dm datadg05 - - - - REMOVED
dm datadg06 - - - - REMOVED
v vol01 - ENABLED ACTIVE 1140795392 SELECT - fsgen
pl vol01-01 vol01 ENABLED ACTIVE 1140795392 CONCAT - RW
sd datadg01-01 vol01-01 datadg01 0 1140795392 0 Disk_4 ENA
pl vol01-02 vol01 DISABLED REMOVED 1140795392 CONCAT - WO
sd datadg04-01 vol01-02 datadg04 0 1140795392 0 - RMOV
v vol02 - ENABLED ACTIVE 1140795392 SELECT - fsgen
pl vol02-01 vol02 ENABLED ACTIVE 1140795392 CONCAT - RW
sd datadg02-01 vol02-01 datadg02 0 1140795392 0 Disk_5 ENA
pl vol02-02 vol02 DISABLED REMOVED 1140795392 CONCAT - WO
sd datadg05-01 vol02-02 datadg05 0 1140795392 0 - RMOV
v vol03 - ENABLED ACTIVE 1140795392 SELECT - fsgen
pl vol03-01 vol03 ENABLED ACTIVE 1140795392 CONCAT - RW
sd datadg03-01 vol03-01 datadg03 0 1140795392 0 Disk_6 ENA
pl vol03-02 vol03 DISABLED REMOVED 1140795392 CONCAT - WO
sd datadg06-01 vol03-02 datadg06 0 1140795392 0 - RMOV
Greg,
Try on Disk_9 first
# vxdiskunsetup -C Disk_9
# vxdisksetup -i Disk_9 format=cdsdisk privlen=2048
# vxdisk list Disk_9
The fields in bold (from your previous output) should be the same. If they're not, stop here and post the vxdisk list Disk_9 output!
root@server01 # vxdisk list Disk_9
[...]
info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags: online ready private autoconfig autoimport
pubpaths: block=/dev/vx/dmp/Disk_9s2 char=/dev/vx/rdmp/Disk_9s2
version: 3.1
iosize: min=512 (bytes) max=2048 (blocks)
public: slice=2 offset=2304 len=1140796160 disk_offset=0
private: slice=2 offset=256 len=2048 disk_offset=0
[...]
Multipathing information:
numpaths: 1
c5t40d1s2 state=enabled
If they are the same, try to add this to the dg now:
# vxdg -g datadg -k adddisk datadg04=Disk_9
Repeat steps for Disk_10, Disk_11 (ensure privlen=2048 is specified in the vxdisksetup command)
If this works, then post the vxprint -qhtrg datadg output after adding the disks so the plexes can be recovered
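Since the same four commands have to be repeated for each disk, it may help to print them out per dm=device pair and review them before running anything. The sketch below only prints the commands from the post above and never executes them; `print_readd_steps` is a hypothetical helper name. Bear in mind that `vxdiskunsetup -C` wipes the existing VxVM configuration on the disk, so each `vxdisk list` output should still be checked by hand before the `adddisk` step.

```shell
#!/bin/sh
# print_readd_steps DG DM=DEVICE: hypothetical helper, not part of VxVM.
# Prints (does NOT run) the re-initialise and re-add sequence from the thread
# for one disk. privlen=2048 matches the private region length shown earlier.
print_readd_steps() {
    dg=$1
    pair=$2
    dev=${pair#*=}          # device name after the "="
    cat <<EOF
vxdiskunsetup -C $dev
vxdisksetup -i $dev format=cdsdisk privlen=2048
vxdisk list $dev
vxdg -g $dg -k adddisk $pair
EOF
}

# Print the steps for the three disks discussed in this thread.
for pair in datadg04=Disk_9 datadg05=Disk_10 datadg06=Disk_11; do
    print_readd_steps datadg "$pair"
done
```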
Grace
Output below.
Regards
Greg
root@server01 # vxprint -qhtrg datadg
dg datadg default default 41000 1129915301.61.dwprod01
dm datadg01 Disk_4 auto 2048 1140796160 -
dm datadg02 Disk_5 auto 2048 1140796160 -
dm datadg03 Disk_6 auto 2048 1140796160 -
dm datadg04 Disk_9 auto 2048 1140796160 NOHOTUSE
dm datadg05 Disk_10 auto 2048 1140796160 -
dm datadg06 Disk_11 auto 2048 1140796160 NOHOTUSE
v vol01 - ENABLED ACTIVE 1140795392 SELECT - fsgen
pl vol01-01 vol01 ENABLED ACTIVE 1140795392 CONCAT - RW
sd datadg01-01 vol01-01 datadg01 0 1140795392 0 Disk_4 ENA
pl vol01-02 vol01 DISABLED RECOVER 1140795392 CONCAT - WO
sd datadg04-01 vol01-02 datadg04 0 1140795392 0 Disk_9 ENA
v vol02 - ENABLED ACTIVE 1140795392 SELECT - fsgen
pl vol02-01 vol02 ENABLED ACTIVE 1140795392 CONCAT - RW
sd datadg02-01 vol02-01 datadg02 0 1140795392 0 Disk_5 ENA
pl vol02-02 vol02 DISABLED RECOVER 1140795392 CONCAT - WO
sd datadg05-01 vol02-02 datadg05 0 1140795392 0 Disk_10 ENA
v vol03 - ENABLED ACTIVE 1140795392 SELECT - fsgen
pl vol03-01 vol03 ENABLED ACTIVE 1140795392 CONCAT - RW
sd datadg03-01 vol03-01 datadg03 0 1140795392 0 Disk_6 ENA
pl vol03-02 vol03 DISABLED RECOVER 1140795392 CONCAT - WO
sd datadg06-01 vol03-02 datadg06 0 1140795392 0 Disk_11 ENA
Greg,
That looks good - the disks are back in the diskgroup.
This should recover the disabled plexes:
# vxrecover -g datadg -sb
Check:
# vxtask list
# vxprint -qhtrg datadg
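While the resync is running, a quick way to see which plexes still need attention is to filter the vxprint output for plex records that are not yet ENABLED and ACTIVE. This is a sketch under the assumption that the `pl` lines have the column order shown in the outputs above; `pending_plexes` is an invented name.

```shell
#!/bin/sh
# pending_plexes: hypothetical helper, not part of VxVM.
# Reads `vxprint -qhtrg <dg>` output on stdin and prints
# "plex kernel-state state" for every plex that is not ENABLED/ACTIVE.
pending_plexes() {
    awk '$1 == "pl" && !($4 == "ENABLED" && $5 == "ACTIVE") {
        print $2, $4, $5
    }'
}
```

Running `vxprint -qhtrg datadg | pending_plexes` should print nothing once all three mirrors have finished recovering.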
Hi Grace
That all worked fine and the plexes are recovering now.
May I say many thanks to you and all of the other contributors for all of your help and patience with a VxVM novice.
Once again many thanks to you all.
Regards
Greg
Greg,
Glad to hear the problem has been resolved now.
Just to reiterate: for future disk replacements or recovery, try using vxdiskadm option 4 (Remove a disk for replacement), as the recovery should be more straightforward. You should then be able to use vxdiskadm option 5, or vxreattach as suggested in the earlier posts, rather than having to reinitialise the disks to put them back in.
regards,
Grace