Drives stay down
Updated: 25 Jan 2012 | 22 comments
This issue has been solved. See solution.
Hello all,
I am fairly new to NetBackup, so bear with me. We are using NetBackup 7.0.1 running on Windows 2003. We have one robot with two drives. The drives seem to stay down. If I can get the drives up, they go right back down as soon as I start a job. I have deleted the robot and drives and reconfigured them. Like I said, bear with me; this product is new to me. If you ask for logs or such, please let me know how to get them.
Thanks in advance
Comments
There's a *slim* chance (as you've only got two drives) that the drives have been mis-configured (at least twice, as you've reconfigured them).
i.e. what you (& the library) think is drive 1 & what NetBackup thinks is drive 1 are different.
You could initially use "robtest" to confirm this:
http://www.symantec.com/business/support/index?pag...
http://www.symantec.com/business/support/index?pag...
Have you any errors for the jobs (job details in GUI) when they result in the drives going down?
Regards Andy
"It's not too late to panic ..."
OK, we'd normally require a lot more info, but we'll go slow as you're new ...
For future ref, this is a good starting point for every new thread ...
https://www-secure.symantec.com/connect/blogs/minimum-information-required-when-logging-problem-details
Anyhow, what do the details show in the details tab for a failing job in Activity Monitor?
Logs we would get are :
(Create these dirs)
On the media server :
<install path>\veritas\netbackup\logs\bptm
<install path>\veritas\volmgr\debug\tpcommand
<install path>\veritas\volmgr\debug\robots
<install path>\veritas\volmgr\debug\ltid
Create these empty files
<install path>\veritas\volmgr\ROBOT_DEBUG
<install path>\veritas\volmgr\DRIVE_DEBUG
Add VERBOSE on a new line in <install path>\veritas\volmgr\vm.conf
If the media server (with the drives going down) is not the robot control host (RCH), also apply the 'robot' items above (the robots directory and the ROBOT_DEBUG file) on the robot control host.
To find the RCH, run tpconfig -d on a media server; the RCH is shown at the bottom.
Regards,
Martin
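For anyone who has to repeat this on several media servers, the setup steps above can be scripted. This is only a rough Python sketch, not a Symantec tool: the function name enable_device_logging and the install_path argument are made up for illustration, and the directory and file names are copied from the list above.

```python
from pathlib import Path

# Relative paths copied from the post above; they sit under <install path>\veritas.
LOG_DIRS = [
    "netbackup/logs/bptm",
    "volmgr/debug/tpcommand",
    "volmgr/debug/robots",
    "volmgr/debug/ltid",
]
TOUCH_FILES = ["volmgr/ROBOT_DEBUG", "volmgr/DRIVE_DEBUG"]


def enable_device_logging(install_path: str) -> None:
    """Create the debug log directories, the empty touch files and the
    VERBOSE entry in vm.conf. install_path is a placeholder for the real
    <install path>; nothing here is an official NetBackup API."""
    root = Path(install_path) / "veritas"
    for rel in LOG_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for rel in TOUCH_FILES:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)  # empty file, left alone if already present

    vm_conf = root / "volmgr" / "vm.conf"
    existing = vm_conf.read_text() if vm_conf.exists() else ""
    lines = [ln.strip() for ln in existing.splitlines()]
    if "VERBOSE" not in lines:
        if existing and not existing.endswith("\n"):
            existing += "\n"  # make sure VERBOSE lands on its own line
        vm_conf.write_text(existing + "VERBOSE\n")
```

Calling enable_device_logging with the real install path is safe to repeat: existing directories and files are left alone and VERBOSE is only added once.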
Share some output
Can you please share the output of these three commands:
vmoprcmd -h <media server> -shmdrive
tpautoconf -t
vmglob -listall -java
What we are looking for is mismatches between your drives in Serial Number, Path and Drive Name. These three commands will show what we need and confirm whether your drives are properly configured at the OS and NBU level.
Regards.
Omar A Villa
Netbackup Expert
These are my personal views and not those of the company I work for
After discovering that we had two tapes stuck in the drives, we are still getting no drives available. But that could be because I did not configure them right. Here are the outputs you asked for. Thank you for your help.
D:\Program Files\Veritas\Volmgr\bin>vmoprcmd -h c27sadienrfk800 -shmdrive
0 -1 1 2 0 3 82 8 0 1 -1 -1 -1 0 -1 -1 -1 -1 0 0 24 0 0 0 0 0 0 0 0 0 {0,0,2,0}
*EmPt* *EmPt* *EmPt* *EmPt* *EmPt* *EmPt* HU19034P97 HP.Ultrium4-SCSI.000
HP~~~~~~Ultrium~4-SCSI~~H58W *EmPt* *EmPt*
000000000000000000000000000000000000000000000000000000000000000000000000
*EmPt* *EmPt* *EmPt* *EmPt* 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 c27sadienrfk800 *EmPt* *EmPt* *EmPt* 0 0 - *EmPt*
1 -1 1 2 0 3 82 8 0 2 -1 -1 -1 0 -1 -1 -1 -1 0 0 24 0 0 0 0 0 0 0 0 0 {0,0,3,0}
*EmPt* *EmPt* *EmPt* *EmPt* *EmPt* *EmPt* HU19044T3P HP.Ultrium4-SCSI.001
HP~~~~~~Ultrium~4-SCSI~~H58W *EmPt* *EmPt*
000000000000000000000000000000000000000000000000000000000000000000000000
*EmPt* *EmPt* *EmPt* *EmPt* 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 c27sadienrfk800 *EmPt* *EmPt* *EmPt* 0 0 - *EmPt*
D:\Program Files\Veritas\Volmgr\bin>tpautoconf -t
TPAC60 HP Ultrium 4-SCSI H58W HU19034P97 0 0 2 0 Tape0 - -
TPAC60 HP Ultrium 4-SCSI H58W HU19044T3P 0 0 3 0 Tape1 - -
D:\Program Files\Veritas\Volmgr\bin>vmglob -listall -java
VMGLOB4.5 robot ROBOT0 MXA91210A0 c27sadienrfk800 c27sadienrfk800 0 -1 TLD - 0x0 - 0 HP~~~~~~MSL~G3~Series~~~7.30 - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.001 HU19044T3P c27sadienrfk800 c27sadienrfk800 0 2 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.000 HU19034P97 c27sadienrfk800 c27sadienrfk800 0 1 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
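With only two drives the serial numbers can be compared by eye, but on a bigger library a small script helps. Below is a minimal Python sketch of that comparison. It is not part of NetBackup: the function names are invented, and the field positions (serial number as the eighth field from the end of a tpautoconf -t line, drive name and serial as the third and fourth fields of a vmglob drive line) are assumptions inferred from the sample output above, with each vmglob record joined onto one line.

```python
TPAUTOCONF = """\
TPAC60 HP Ultrium 4-SCSI H58W HU19034P97 0 0 2 0 Tape0 - -
TPAC60 HP Ultrium 4-SCSI H58W HU19044T3P 0 0 3 0 Tape1 - -
"""

VMGLOB = """\
VMGLOB4.5 robot ROBOT0 MXA91210A0 c27sadienrfk800 c27sadienrfk800 0 -1 TLD - 0x0 - 0 HP~~~~~~MSL~G3~Series~~~7.30 - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.001 HU19044T3P c27sadienrfk800 c27sadienrfk800 0 2 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.000 HU19034P97 c27sadienrfk800 c27sadienrfk800 0 1 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
"""


def os_serials(text):
    """Map serial number -> OS device name from tpautoconf -t style lines."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 9 and fields[0].startswith("TPAC"):
            # Counting from the end avoids the spaces inside the inquiry string.
            found[fields[-8]] = fields[-3]
    return found


def nbu_serials(text):
    """Map serial number -> NetBackup drive name from vmglob drive lines."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].startswith("VMGLOB") and fields[1] == "drive":
            found[fields[3]] = fields[2]
    return found


def compare(os_map, nbu_map):
    """Return human-readable problems; an empty list means the serials agree."""
    problems = []
    for serial in sorted(set(os_map) - set(nbu_map)):
        problems.append(f"{serial} ({os_map[serial]}) is seen by the OS but not configured in NetBackup")
    for serial in sorted(set(nbu_map) - set(os_map)):
        problems.append(f"{serial} ({nbu_map[serial]}) is configured in NetBackup but not seen by the OS")
    return problems


if __name__ == "__main__":
    for problem in compare(os_serials(TPAUTOCONF), nbu_serials(VMGLOB)):
        print(problem)
```

For the output posted here the problem list comes back empty, i.e. both serial numbers appear on both sides.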
One other thought here ....
How do you load and unload your library?
If you do it by opening the magazines, then it may have been loaded whilst tapes were in the drives.
If their original slots were then filled, the tapes could not be returned to them, and no matter what you do the drives won't come up.
Run robtest and use s d to show the drives.
If there are tapes in the drives, see which slots they came from (it should say).
Then do s s to show the slots and see if those slots are empty.
If they are not, make a note of 2 slots that are empty and use m d1 s20 (as an example, this moves the tape from drive 1 to slot 20) - do this for each drive.
Next, run an inventory update in NetBackup to update the location of these tapes.
The drives should then come up when you request it.
If this was the case, then in the future use the load ports if it has them, or wait until nothing is running to tape before changing tapes.
Hope this helps
Authorised Symantec Consultant
Don't forget to give a "Thumbs Up" or mark as "Solution" if someones advice has helped you.
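The slot-picking logic in the robtest steps above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not something that talks to the robot: plan_moves and its arguments are hypothetical, slots are assumed to be numbered from 1, and the only robtest syntax used is the m d<drive> s<slot> form quoted in the post.

```python
def plan_moves(loaded_drives, occupied_slots, total_slots):
    """Build robtest move commands for tapes stuck in drives.

    loaded_drives:  {drive_number: home_slot} as read from "s d"
    occupied_slots: slot numbers that "s s" shows as full
    total_slots:    number of slots in the library (assumed numbered from 1)
    """
    occupied = set(occupied_slots)
    targets = {}

    # First pass: a tape whose home slot is still empty simply goes home.
    for drive, home in sorted(loaded_drives.items()):
        if home not in occupied:
            targets[drive] = home
            occupied.add(home)

    # Second pass: the remaining tapes go to the lowest-numbered empty slots.
    free = (s for s in range(1, total_slots + 1) if s not in occupied)
    for drive in sorted(loaded_drives):
        if drive not in targets:
            slot = next(free, None)
            if slot is None:
                raise ValueError("no empty slot left; remove a tape from the library first")
            targets[drive] = slot

    return [f"m d{drive} s{slot}" for drive, slot in sorted(targets.items())]


if __name__ == "__main__":
    # Example: 42 slots, everything full except slots 20 and 21.
    full = set(range(1, 43)) - {20, 21}
    print(plan_moves({1: 5, 2: 7}, full, 42))
```

The commands still have to be typed into robtest by hand, and the inventory update afterwards is what tells NetBackup where the tapes ended up.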
Thank you very much. We did have two tapes in the drives, and they are now out. We have actually been looking for them. That solved the drives going down. Now when I run a job it stays at queued. The jobs are not failing out. I will be posting the outputs requested.
1/24/2012 10:26:13 AM - awaiting resource c27sadienrfk800-hcart-robot-tld-0-1 - No drives are available
I would do a reset of the media server then to clear all allocations and free things up:
\netbackup\bin\admincmd\nbrbutil -resetMediaServer mediaservername
I ran that and got a resetMediaServerResources() returned status=2005029. Does that mean it is finished?
I wouldn't expect any status message - it should just run with no output.
Do a View - Refresh All in your admin console and then go to device monitor to see the status of the drives and whether there are any pending requests hanging around - this is at the bottom of the screen but minimised if there aren't any
I did that but both drives are stuck on Active. I do not know what they are doing nor does it say.
Check the media server for bptm processes - if no jobs are running, kill them off.
Then do a reset on the drives (via Device Monitor - right-click and reset) - this will tell you if there are pending allocations, and if there are, try the resetMediaServer command again (you did replace mediaservername with the actual media server name, didn't you?)
Yes I did replace mediaservername with the actual name. So, both drives now say up. But both of them have a "No" under the "Ready" column. When I try to run a job it sits saying queued. No drives are available.
I would re-run the device wizard one more time then, as your earlier work may have messed something up.
The No to being Ready just means that there is not a tape in the drive.
I have tried that but no go. When I try to run a diag on the drives it fails, saying the drives are in use. There are no jobs running at this time.
I realize that I am late to this discussion.....
On a W2003 server, please confirm that the Removable Storage service is stopped and disabled.
Restart NBU Media Manager service after creating logs and adding VERBOSE entry to vm.conf as advised by Martin (mph999).
The exact reason for drives going DOWN will now be logged in Event Viewer Application log.
In addition to 'tpconfig -d' (as per Martin's request) or 'tpconfig -l', please also post the output of 'scan -changer'.
Both these commands can be found in D:\Program Files\Veritas\Volmgr\bin.
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows.
Handy NBU links
Nope, you are not late. The drives no longer say down. That was due to having two tapes in the drives with all slots full. But now it just says no drives available. The Removable Storage service is stopped. Here is the requested output. Thanks!
Device  Robot  Drive         Robot                    Drive                 Device
Type    Num    Index  Type   DrNum  Status  Comment   Name                  Path
robot   0      -      TLD    -      -       -         -                     {0,0,2,1}
drive   -      0      hcart  1      UP      -         HP.Ultrium4-SCSI.000  {0,0,2,0}
drive   -      1      hcart  2      UP      -         HP.Ultrium4-SCSI.001  {0,0,3,0}
D:\Program Files\Veritas\Volmgr\bin>scan -changer
************************************************************
*********************** SDT_CHANGER ************************
************************************************************
------------------------------------------------------------
Device Name : "Changer1"
Passthru Name: "Changer1"
Volume Header: ""
Port: 0; Bus: 0; Target: 2; LUN: 1
Inquiry : "HP MSL G3 Series 7.30"
Vendor ID : "HP "
Product ID : "MSL G3 Series "
Product Rev: "7.30"
Serial Number: "MXA91210A0"
WWN : ""
WWN Id Type : 0
Device Identifier: "HP MSL G3 Series MXA91210A0"
Device Type : SDT_CHANGER
NetBackup Robot Type: 8
Removable : Yes
Device Supports: SCSI-5
Number of Drives : 2
Number of Slots : 42
Number of Media Access Ports: 3
Drive 1 Serial Number : "HU19034P97"
Drive 2 Serial Number : "HU19044T3P"
Flags : 0x0
Reason: 0x0
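The point raised at the start of the thread (is the library's drive 1 the same physical drive as NetBackup's drive 1?) can be checked from this output. A small Python sketch follows, under stated assumptions: the regular expression is based only on the "Drive N Serial Number" lines shown above, the function names are invented, and the CONFIGURED table is typed in by hand from the vmglob output posted earlier in the thread (robot drive number 1 = HU19034P97, 2 = HU19044T3P).

```python
import re

SCAN_OUTPUT = '''\
Number of Drives : 2
Number of Slots : 42
Drive 1 Serial Number : "HU19034P97"
Drive 2 Serial Number : "HU19044T3P"
'''

# Robot drive number -> serial number as configured in NetBackup
# (copied by hand from the vmglob -listall -java output earlier in the thread).
CONFIGURED = {1: "HU19034P97", 2: "HU19044T3P"}


def robot_drive_serials(text):
    """Map robot drive number -> serial number as reported by scan -changer."""
    pattern = re.compile(r'Drive\s+(\d+)\s+Serial Number\s*:\s*"([^"]*)"')
    return {int(number): serial for number, serial in pattern.findall(text)}


def mismatched_drives(robot, configured):
    """Return the drive numbers whose serial differs or is missing on one side."""
    numbers = set(robot) | set(configured)
    return sorted(n for n in numbers if robot.get(n) != configured.get(n))


if __name__ == "__main__":
    print(mismatched_drives(robot_drive_serials(SCAN_OUTPUT), CONFIGURED))
```

An empty result means the robot and NetBackup agree on which serial number sits at which drive position.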
All the output provided is proof that device config is good. (I just needed confirmation for my own sanity). RSM should be stopped and disabled...
Now that drives are up and resource broker media server reset was done, everything should be okay... seems not....
Please show us resource broker allocation status:
D:\Program Files\Veritas\netbackup\bin\admincmd\nbrbutil -dump
Allocation Requests
(AllocationRequestSeq
index=0 (AllocationRequest: id={49DC1BEF-3DD0-4DBF-94A2-B144CA979595}
priority=0 secondPriority=26393 userid=jobid=137408
description=THE_BACKUP_JOB-137408-{49DC1BEF-3DD0-4DBF-94A2-B144CA979595}
(RequestSeq
index=0 (Request provider=MPXProvider resourcename=MpxRequest-137408
userSequence=-1 (MPXGroupRequest maxmpx=1 media=(RequestSeq
index=0 (Request provider=DriveOperationProvider resourcename=__ANY__
userSequence=0 (StorageUnitRequest: storageUnit=(StorageUnitRequest:
storageUnit=__ANY__ mediaPool=CatalogBackup retentionLevel=2
mustUseLocalMediaServer=no failOnError=no mpxRequired=no mustBeNdmp=no
getMaxFreeSpace=no minFreeSpaceKBytes=0 usageType=1 client=c27sadienrfk800
stuSubType=-1 diskGroupName= storageServerType= shareGroup=*ANY* isNdmp=false
isTirRestore=false isFlashbackupRestore=false isBlockMapRead=false
isCatalogBackup=true isGcsCatalogBackup=false isVMWare=false isLifeCycle=false
preferVtlToDirectAttachedTape=false backupCopy=-1 isGranularExchange=false
REQ_IS_HYPER_V=false REQ_IS_EXCHANGE14=false REQ_IS_MPX_NDMP=false
REQ_IS_VXVI=false) preferredMediaServer= requiredMediaServer= previousStuName=
previousStuType=0)))
))
index=1 (Request provider=NamedResourceProvider
resourcename=c27sadienrfk800.NBU_CLIENT.MAXJOBS.c27sadienrfk800 userSequence=-1
(CountedResourceRequest
resourcename=c27sadienrfk800.NBU_CLIENT.MAXJOBS.c27sadienrfk800 max=10))
index=2 (Request provider=NamedResourceProvider
resourcename=c27sadienrfk800.NBU_POLICY.MAXJOBS.NBU-Catalog userSequence=-1
(CountedResourceRequest
resourcename=c27sadienrfk800.NBU_POLICY.MAXJOBS.NBU-Catalog max=1)))
))
Allocations
(AllocationSeq
index=0 (Allocation: id={2ACF4645-82FA-4CD2-83A2-D2B7DC7FF105}
provider=NamedResourceProvider
resourcename=c27sadienrfk800.NBU_CATALOG.MAXJOBS masterserver=c27sadienrfk800
groupid={00000000-0000-0000-0000-000000000000} userSequence=-1
userid="jobid=137406" named resource allocation))
MDS allocations in EMM:
MdsAllocation: allocationKey=11686 jobType=16 mediaKey=4000056 mediaId=F0029L
driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
MdsAllocation: allocationKey=11692 jobType=16 mediaKey=4000057 mediaId=N0022L
driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
If the resource broker shows nothing, there are no bptm or bpbrm processes running on the media server, and the Removable Storage service is disabled, then you could revert to the old favorite quick fix of deleting the drives and robot and then re-running the device config wizard.
Also re-check the drives for tapes - orphaned bptm processes could have re-loaded them
Is it possible to cancel the jobs and restart them? It would also be good to restart the NetBackup services on the media server. You know, sometimes a restart helps. I may not be right.. :)
Apologies for only responding now... (Granny was tired last night!)
You have 'stuck' MDS allocations in the resource broker:
MDS allocations in EMM:
MdsAllocation: allocationKey=11686 jobType=16 mediaKey=4000056 mediaId=F0029L
driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
MdsAllocation: allocationKey=11692 jobType=16 mediaKey=4000057 mediaId=N0022L
driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
Clear them as follows:
nbrbutil -releaseMDS 11686
nbrbutil -releaseMDS 11692
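If a dump ever shows more than a couple of stuck allocations, the keys can be pulled out with a short script instead of by eye. The Python sketch below only builds the command lines shown above; it does not run them. It assumes each record starts with "MdsAllocation: allocationKey=<number>" as in this dump, and the function name release_commands is made up. Releasing an allocation that belongs to a job that is still running would presumably pull resources out from under it, so this only makes sense when nothing is active.

```python
import re

DUMP_EXCERPT = """\
MDS allocations in EMM:
MdsAllocation: allocationKey=11686 jobType=16 mediaKey=4000056 mediaId=F0029L
driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
MdsAllocation: allocationKey=11692 jobType=16 mediaKey=4000057 mediaId=N0022L
driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
"""


def release_commands(dump_text):
    """Return one 'nbrbutil -releaseMDS <key>' line per MdsAllocation record."""
    # allocationKey sits right after the record label, so console line
    # wrapping further along the record does not affect the match.
    keys = re.findall(r"MdsAllocation:\s*allocationKey=(\d+)", dump_text)
    return [f"nbrbutil -releaseMDS {key}" for key in keys]


if __name__ == "__main__":
    for command in release_commands(DUMP_EXCERPT):
        print(command)
```

Reviewing the printed commands before pasting them into a prompt keeps a person in the loop, which seems wise for anything that touches the resource broker.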
"Granny", you are awesome. That worked and backups are working now. The root cause of this was that we switched tapes while a tape was in the drive. So thank you, Mark, for helping me figure that one out. And thank you all for your help. I get to keep my job...