Drives stay down

Updated: 25 Jan 2012 | 22 comments
Posted by Jajones
This issue has been solved. See solution.

Hello all,

I am fairly new to NetBackup, so bear with me.  We are using NetBackup 7.0.1 running on Windows 2003. We have one robot with two drives.  The drives seem to stay down.  If I can get the drives up, they go right back down as soon as I start a job.  I have deleted the robot and drives and reconfigured them.  Like I said, bear with me; this product is new to me.  If you ask for logs or such, please let me know how to get them.

Thanks in advance

Comments

Andy Welburn (24 Jan 2012)

There's a *slim* chance (as you've only got two drives) that the drives have been misconfigured (twice at least, as you've reconfigured them again),

i.e. what you (& the library) think is drive 1 & what NetBackup thinks is drive 1 are different.

You could initially use "robtest" to confirm this:

http://www.symantec.com/business/support/index?pag...

http://www.symantec.com/business/support/index?pag...

 

Have you any errors for the jobs (job details in GUI) when they result in the drives going down?

Regards Andy

"It's not too late to panic ..."

mph999 (24 Jan 2012)

OK, we'd normally require a lot more info, but we'll go slow as you're new ...

For future ref, this is a good starting point for every new thread ...

https://www-secure.symantec.com/connect/blogs/minimum-information-required-when-logging-problem-details

 

Anyhow, what does the details tab show for a failing job in Activity Monitor?

Logs we would get are :

(Create these dirs)

On the media server :

<install path>\veritas\netbackup\logs\bptm

<install path>\veritas\volmgr\debug\tpcommand

<install path>\veritas\volmgr\debug\robots

<install path>\veritas\volmgr\debug\ltid

Create these empty files

<install path>\veritas\volmgr\ROBOT_DEBUG

<install path>\veritas\volmgr\DRIVE_DEBUG

Add VERBOSE on a new line in <install path>\veritas\volmgr\vm.conf

If the media server (with the drives going down) is not the robot control host (RCH), apply the 'robot' lines above (the robots directory and the ROBOT_DEBUG file) on the robot control host as well.

To find the RCH, run tpconfig -d on a media server; the RCH is shown at the bottom.
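For anyone who would rather script the setup above than click through Explorer, here is a minimal Python sketch. It only creates the directories and empty files listed above and appends VERBOSE to vm.conf if it is not already there. The function name is made up for illustration, and the root directory is an assumption: you pass in whatever <install path>\veritas is on your server. The demo at the bottom runs against a temporary directory, so it touches nothing real.

```python
from pathlib import Path
import tempfile

# Relative paths copied from the list above (below <install path>\veritas).
LOG_DIRS = [
    "netbackup/logs/bptm",
    "volmgr/debug/tpcommand",
    "volmgr/debug/robots",
    "volmgr/debug/ltid",
]
TOUCH_FILES = ["volmgr/ROBOT_DEBUG", "volmgr/DRIVE_DEBUG"]


def enable_debug_logging(veritas_root):
    """Create the debug log dirs, the two empty files, and a VERBOSE line in vm.conf.

    veritas_root is an assumption: the caller supplies <install path>/veritas.
    """
    root = Path(veritas_root)
    for rel in LOG_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for rel in TOUCH_FILES:
        (root / rel).touch(exist_ok=True)

    vm_conf = root / "volmgr" / "vm.conf"
    text = vm_conf.read_text() if vm_conf.exists() else ""
    # Only append VERBOSE if no line already consists of exactly that word.
    if not any(line.strip() == "VERBOSE" for line in text.splitlines()):
        if text and not text.endswith("\n"):
            text += "\n"
        vm_conf.write_text(text + "VERBOSE\n")


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real install path.
    demo_root = Path(tempfile.mkdtemp()) / "veritas"
    enable_debug_logging(demo_root)
    for path in sorted(demo_root.rglob("*")):
        print(path.relative_to(demo_root))
```

Running it twice is harmless: existing directories and files are left alone and VERBOSE is not duplicated. The new logging settings are only picked up after the Media Manager service is restarted.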

Regards,

Martin

Omar Villa (24 Jan 2012)

Share some output

Can you please share the output of these 3 commands:

vmoprcmd -h <media server> -shmdrive

tpautoconf -t

vmglob -listall -java

 

What we are looking for is mismatches between your drives in Serial Number, Path and Drive Name. These 3 commands will show what we need and confirm whether your drives are properly configured at OS and NBU level.

Regards.

Omar A Villa

Netbackup Expert

These are my personal views and not those of the company I work for

Jajones (24 Jan 2012)

After discovering that we had two tapes stuck in the drive, we are still getting no drives available.  But that could be because I did not configure them right.  Here are the outputs you asked for.  Thank you for your help.

D:\Program Files\Veritas\Volmgr\bin>vmoprcmd -h c27sadienrfk800 -shmdrive
0 -1 1 2 0 3 82 8 0 1 -1 -1 -1 0 -1 -1 -1 -1 0 0 24 0 0 0 0 0 0 0 0 0 {0,0,2,0}
*EmPt* *EmPt* *EmPt* *EmPt* *EmPt* *EmPt* HU19034P97 HP.Ultrium4-SCSI.000 HP~~~~
~~Ultrium~4-SCSI~~H58W *EmPt* *EmPt* 0000000000000000000000000000000000000000000
00000000000000000000000000000 *EmPt* *EmPt* *EmPt* *EmPt* 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 c27sadienrfk800 *EmPt* *EmPt* *EmPt* 0 0 - *EmPt*

1 -1 1 2 0 3 82 8 0 2 -1 -1 -1 0 -1 -1 -1 -1 0 0 24 0 0 0 0 0 0 0 0 0 {0,0,3,0}
*EmPt* *EmPt* *EmPt* *EmPt* *EmPt* *EmPt* HU19044T3P HP.Ultrium4-SCSI.001 HP~~~~
~~Ultrium~4-SCSI~~H58W *EmPt* *EmPt* 0000000000000000000000000000000000000000000
00000000000000000000000000000 *EmPt* *EmPt* *EmPt* *EmPt* 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 c27sadienrfk800 *EmPt* *EmPt* *EmPt* 0 0 - *EmPt*

D:\Program Files\Veritas\Volmgr\bin>tpautoconf -t
TPAC60 HP      Ultrium 4-SCSI  H58W HU19034P97 0 0 2 0 Tape0 - -
TPAC60 HP      Ultrium 4-SCSI  H58W HU19044T3P 0 0 3 0 Tape1 - -

D:\Program Files\Veritas\Volmgr\bin>vmglob -listall -java
VMGLOB4.5 robot ROBOT0 MXA91210A0 c27sadienrfk800 c27sadienrfk800 0 -1 TLD - 0x0
 - 0 HP~~~~~~MSL~G3~Series~~~7.30 - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.001 HU19044T3P c27sadienrfk800 c27sadienrfk800
0 2 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.000 HU19034P97 c27sadienrfk800 c27sadienrfk800
0 1 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
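The serial-number comparison that this output was requested for can also be done with a few lines of Python. This is a rough sketch, not a supported parser: the field positions are assumptions read off the two outputs above (the serial is taken as the 8th token from the end of each `tpautoconf -t` line, and as the 4th token of each `vmglob` drive record), and the function names are made up for illustration.

```python
TPAUTOCONF_OUTPUT = """\
TPAC60 HP      Ultrium 4-SCSI  H58W HU19034P97 0 0 2 0 Tape0 - -
TPAC60 HP      Ultrium 4-SCSI  H58W HU19044T3P 0 0 3 0 Tape1 - -
"""

VMGLOB_OUTPUT = """\
VMGLOB4.5 robot ROBOT0 MXA91210A0 c27sadienrfk800 c27sadienrfk800 0 -1 TLD - 0x0
 - 0 HP~~~~~~MSL~G3~Series~~~7.30 - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.001 HU19044T3P c27sadienrfk800 c27sadienrfk800
0 2 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
VMGLOB4.5 drive HP.Ultrium4-SCSI.000 HU19034P97 c27sadienrfk800 c27sadienrfk800
0 1 TLD hcart 0x0 - 0 HP~~~~~~Ultrium~4-SCSI~~H58W - - - -1 -1 -1 -1
"""


def os_drive_serials(tpautoconf_text):
    """Map serial number -> OS device name (e.g. Tape0) as seen by tpautoconf -t."""
    serials = {}
    for line in tpautoconf_text.splitlines():
        tokens = line.split()
        if len(tokens) < 8 or not tokens[0].startswith("TPAC"):
            continue
        # Count from the end because the product ID itself contains a space.
        serials[tokens[-8]] = tokens[-3]
    return serials


def nbu_drive_serials(vmglob_text):
    """Map serial number -> NetBackup drive name from vmglob -listall -java."""
    serials = {}
    tokens = vmglob_text.split()  # split() also undoes the console line wrapping
    starts = [i for i, tok in enumerate(tokens) if tok.startswith("VMGLOB")]
    for start, end in zip(starts, starts[1:] + [len(tokens)]):
        record = tokens[start:end]
        if len(record) > 3 and record[1] == "drive":
            serials[record[3]] = record[2]
    return serials


def serial_mismatches(os_serials, nbu_serials):
    """Return (serials only the OS sees, serials only NetBackup knows)."""
    only_os = sorted(set(os_serials) - set(nbu_serials))
    only_nbu = sorted(set(nbu_serials) - set(os_serials))
    return only_os, only_nbu


if __name__ == "__main__":
    os_map = os_drive_serials(TPAUTOCONF_OUTPUT)
    nbu_map = nbu_drive_serials(VMGLOB_OUTPUT)
    for serial in sorted(set(os_map) | set(nbu_map)):
        print(serial, os_map.get(serial, "-"), nbu_map.get(serial, "-"))
    print("mismatches:", serial_mismatches(os_map, nbu_map))
```

For the output posted here, both serials appear on both sides, so the serial numbers at least are consistent between the OS and NetBackup.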

Mark_Solutions (24 Jan 2012)

One other thought here ....

How do you load and unload your library?

If you do it by opening the magazines, then it may have been loaded whilst tapes were in the drives.

If their original slots were full, they could not be ejected, and no matter what you do the drives won't come up.

Run robtest and use s d to show the drives.

If there are tapes in the drives, see which slots they came from (it should say).

Then do s s to show the slots and see if those slots are empty.

If they are not, make a note of 2 slots that are empty and use m d1 s20 (as an example, this moves the tape from drive 1 to slot 20) - do this for each drive.

Next, run an inventory update in NetBackup to update the location of these tapes.

The drives should then come up when you request it.

If this was the case, then in the future use the load ports if it has them, or wait until nothing is running to tape before changing tapes.
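If it helps to write the moves down before typing them into robtest, the pairing of loaded drives to empty slots can be sketched in Python. This is only a planning aid: it does not talk to robtest, the function name is made up, and the drive and slot numbers are whatever you read off `s d` and `s s` yourself. The only syntax it relies on is the `m d1 s20` form given above.

```python
def plan_drive_unloads(loaded_drives, empty_slots):
    """Pair each loaded drive with an empty slot and return robtest move commands.

    loaded_drives: drive numbers that still hold a tape (from "s d").
    empty_slots:   slot numbers noted as empty (from "s s").
    """
    if len(empty_slots) < len(loaded_drives):
        raise ValueError("not enough empty slots for the loaded drives")
    return [f"m d{drive} s{slot}" for drive, slot in zip(loaded_drives, empty_slots)]


if __name__ == "__main__":
    # Example numbers only; slots 20 and 21 are assumed to be empty.
    for command in plan_drive_unloads([1, 2], [20, 21]):
        print(command)
```

With the example numbers this prints `m d1 s20` and `m d2 s21`, one move per drive, which is the "do this for each drive" step.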

Hope this helps

Authorised Symantec Consultant

Don't forget to give a "Thumbs Up" or mark as "Solution" if someones advice has helped you.

Jajones (24 Jan 2012)

Thank you very much.  We did have two tapes in the drive and now they are out.  We have actually been looking for them.  That solved the drives going down.  Now when I run a job it stays at queued. The jobs are not failing out.  I will be posting the outputs requested.

1/24/2012 10:26:13 AM - awaiting resource c27sadienrfk800-hcart-robot-tld-0-1 - No drives are available

Mark_Solutions (24 Jan 2012)

I would do a reset of the media server then to clear all allocations and free things up:

\netbackup\bin\admincmd\nbrbutil -resetMediaServer mediaservername

Jajones (24 Jan 2012)

I ran that and got a resetMediaServerResources() returned status=2005029. Does that mean it is finished?

Mark_Solutions (24 Jan 2012)

I wouldn't expect any status message - it should just run with no output

Do a View - Refresh All in your admin console and then go to device monitor to see the status of the drives and whether there are any pending requests hanging around - this is at the bottom of the screen but minimised if there aren't any

Jajones (24 Jan 2012)

I did that but both drives are stuck on Active.  I do not know what they are doing nor does it say.

Mark_Solutions (24 Jan 2012)

Check the media server for bptm processes - if no jobs are running, kill them off.

Then do a reset on the drives (via Device Monitor: right-click and reset) - this will tell you if there are pending allocations, and if there are, try the resetMediaServer command again (you did replace mediaservername with the actual media server name, didn't you?)

Jajones (24 Jan 2012)

Yes I did replace mediaservername with the actual name.  So, both drives now say up. But both of them have a "No" under the "Ready" column.  When I try to run a job it sits saying queued. No drives are available.

Mark_Solutions (24 Jan 2012)

I would re-run the device wizard one more time then, as your earlier work may have messed something up

The No to being Ready just means that there is not a tape in the drive.

Jajones (24 Jan 2012)

I have tried that but no go.  When I try to run a diag on the drives it fails, saying the drives are in use.  There are no jobs running at this time.

Marianne van den Berg (24 Jan 2012)

I realize that I am late to this discussion.....

On the W2003 server, please confirm that the Removable Storage service is stopped and disabled.

Restart the NBU Media Manager service after creating the log directories and adding the VERBOSE entry to vm.conf as advised by Martin (mph999).

The exact reason for drives going DOWN will now be logged in the Event Viewer Application log.

In addition to 'tpconfig -d' (as per Martin's request) or 'tpconfig -l', please also post the output of 'scan -changer'.
Both these commands can be found in D:\Program Files\Veritas\Volmgr\bin.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows.
Handy NBU links

Jajones (24 Jan 2012)

Nope, you are not late.  The drives no longer say down.  That was due to having two tapes in the drive with all slots full.  But now it just says no drives available. Removable Storage service is stopped.  Here is the requested output.  Thanks!

Device Robot Drive       Robot                    Drive                 Device
Type     Num Index  Type DrNum Status  Comment    Name                  Path
robot      0    -    TLD    -       -  -          -                     {0,0,2,1}
  drive    -    0  hcart    1      UP  -          HP.Ultrium4-SCSI.000  {0,0,2,0}
  drive    -    1  hcart    2      UP  -          HP.Ultrium4-SCSI.001  {0,0,3,0}
D:\Program Files\Veritas\Volmgr\bin>scan -changer
************************************************************
*********************** SDT_CHANGER ************************
************************************************************
------------------------------------------------------------
Device Name  : "Changer1"
Passthru Name: "Changer1"
Volume Header: ""
Port: 0; Bus: 0; Target: 2; LUN: 1
Inquiry    : "HP      MSL G3 Series   7.30"
Vendor ID  : "HP      "
Product ID : "MSL G3 Series   "
Product Rev: "7.30"
Serial Number: "MXA91210A0"
WWN          : ""
WWN Id Type  : 0
Device Identifier: "HP      MSL G3 Series   MXA91210A0"
Device Type    : SDT_CHANGER
NetBackup Robot Type: 8
Removable      : Yes
Device Supports: SCSI-5
Number of Drives : 2
Number of Slots  : 42
Number of Media Access Ports: 3
Drive 1 Serial Number      : "HU19034P97"
Drive 2 Serial Number      : "HU19044T3P"
Flags : 0x0
Reason: 0x0

Marianne van den Berg (24 Jan 2012)

All the output provided is proof that device config is good. (I just needed confirmation for my own sanity). RSM should be stopped and disabled...

Now that drives are up and resource broker media server reset was done, everything should be okay... seems not....

Please show us resource broker allocation status:

D:\Program Files\Veritas\netbackup\bin\admincmd\nbrbutil -dump

 

Jajones (24 Jan 2012)

Allocation Requests
(AllocationRequestSeq
         index=0 (AllocationRequest: id={49DC1BEF-3DD0-4DBF-94A2-B144CA979595} p
riority=0 secondPriority=26393 userid=jobid=137408 description=THE_BACKUP_JOB-13
7408-{49DC1BEF-3DD0-4DBF-94A2-B144CA979595} (RequestSeq
         index=0 (Request provider=MPXProvider resourcename=MpxRequest-137408  u
serSequence=-1 (MPXGroupRequest maxmpx=1 media=(RequestSeq
         index=0 (Request provider=DriveOperationProvider resourcename=__ANY__
userSequence=0 (StorageUnitRequest: storageUnit=(StorageUnitRequest: storageUnit
=__ANY__ mediaPool=CatalogBackup retentionLevel=2 mustUseLocalMediaServer=no fai
lOnError=no mpxRequired=no mustBeNdmp=no getMaxFreeSpace=no minFreeSpaceKBytes=0
 usageType=1 client=c27sadienrfk800 stuSubType=-1 diskGroupName= storageServerTy
pe= shareGroup=*ANY* isNdmp=false isTirRestore=false isFlashbackupRestore=false
isBlockMapRead=false isCatalogBackup=true isGcsCatalogBackup=false isVMWare=fals
e isLifeCycle=false preferVtlToDirectAttachedTape=false backupCopy=-1 isGranular
Exchange=false REQ_IS_HYPER_V=false REQ_IS_EXCHANGE14=false REQ_IS_MPX_NDMP=fals
e REQ_IS_VXVI=false) preferredMediaServer= requiredMediaServer= previousStuName=
 previousStuType=0)))
))
         index=1 (Request provider=NamedResourceProvider resourcename=c27sadienr
fk800.NBU_CLIENT.MAXJOBS.c27sadienrfk800  userSequence=-1 (CountedResourceReques
t resourcename=c27sadienrfk800.NBU_CLIENT.MAXJOBS.c27sadienrfk800 max=10))
         index=2 (Request provider=NamedResourceProvider resourcename=c27sadienr
fk800.NBU_POLICY.MAXJOBS.NBU-Catalog  userSequence=-1 (CountedResourceRequest re
sourcename=c27sadienrfk800.NBU_POLICY.MAXJOBS.NBU-Catalog max=1)))
))

Allocations
(AllocationSeq
         index=0 (Allocation: id={2ACF4645-82FA-4CD2-83A2-D2B7DC7FF105} provider
=NamedResourceProvider resourcename=c27sadienrfk800.NBU_CATALOG.MAXJOBS masterse
rver=c27sadienrfk800 groupid={00000000-0000-0000-0000-000000000000} userSequence
=-1 userid="jobid=137406" named resource allocation))

MDS allocations in EMM:

        MdsAllocation: allocationKey=11686 jobType=16 mediaKey=4000056 mediaId=F
0029L driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
 mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
        MdsAllocation: allocationKey=11692 jobType=16 mediaKey=4000057 mediaId=N
0022L driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
 mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0

Mark_Solutions (24 Jan 2012)

If the resource broker shows nothing, there are no bptm or bpbrm processes running on the media server, and the Removable Storage service is disabled, then you could revert to the old favorite quick fix of deleting the drives and robot, then re-running the device config wizard.
Also re-check the drives for tapes - orphaned bptm processes could have re-loaded them.

Amaan (24 Jan 2012)

Is it possible to cancel the jobs and restart them? It would also be good to restart the NetBackup services on the media server. Sometimes a restart helps. I may not be right.. :)

Marianne van den Berg (24 Jan 2012)

Apologies for only responding now...  (Granny was tired last night!)

You have 'stuck' MDS allocations in the resource broker:

MDS allocations in EMM:

        MdsAllocation: allocationKey=11686 jobType=16 mediaKey=4000056 mediaId=F
0029L driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
 mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
        MdsAllocation: allocationKey=11692 jobType=16 mediaKey=4000057 mediaId=N
0022L driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
 mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0

Clear them as follows:

nbrbutil -releaseMDS 11686

nbrbutil -releaseMDS 11692
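If the dump is long, the allocation keys can be pulled out with a short Python sketch like the one below. It only reads text saved from `nbrbutil -dump` and prints the matching `nbrbutil -releaseMDS` lines; it does not run anything. The function names are made up, and it assumes the console wrapping never splits the `allocationKey=` value, which holds for the output in this thread. Whether a listed allocation is really stuck (no running job owns it) is still your call before releasing it.

```python
import re

# MDS section copied from the nbrbutil -dump output earlier in this thread.
DUMP_EXCERPT = """\
MDS allocations in EMM:

        MdsAllocation: allocationKey=11686 jobType=16 mediaKey=4000056 mediaId=F
0029L driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
 mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
        MdsAllocation: allocationKey=11692 jobType=16 mediaKey=4000057 mediaId=N
0022L driveKey=0 driveName= drivePath= stuName= masterServerName=c27sadienrfk800
 mediaServerName=c27sadienrfk800 ndmpTapeServerName= diskVolumeKey=0 mountKey=0
linkKey=0 fatPipeKey=0 scsiResType=0 serverStateFlags=0
"""


def mds_allocation_keys(dump_text):
    """Return every allocationKey that appears in an MdsAllocation entry."""
    pattern = r"MdsAllocation:\s*allocationKey=(\d+)"
    return [int(key) for key in re.findall(pattern, dump_text)]


def release_commands(keys):
    """Build one nbrbutil -releaseMDS command line per allocation key."""
    return [f"nbrbutil -releaseMDS {key}" for key in keys]


if __name__ == "__main__":
    for command in release_commands(mds_allocation_keys(DUMP_EXCERPT)):
        print(command)
```

For the dump posted in this thread it prints the same two release commands as above.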

Jajones (25 Jan 2012)

"Granny" you are awesome.  That worked and backups are working now.  The root cause of this was that we switched tapes while a tape was in the drive.  So thank you Mark for helping me figure that one out.  And thank you all for your help.  I get to keep my job...