Drives continuously go down

Created: 15 Jan 2013 | 14 comments
Shekhar D's picture

We have a Windows media server on which only 1 drive is working fine; the other 5 drives intermittently go down:

From the library end, IBM verified that all the drives are working fine and also power cycled the library.

The media server was also rebooted just 5 days ago, but the drives are still down.

Zoning connectivity in Fabric Manager looks good, including the ports.

O:\Program Files\Veritas\Volmgr\bin>vmoprcmd -d

                                PENDING REQUESTS

                                     <NONE>

                                  DRIVE STATUS

Drv Type   Control  User      Label  RecMID  ExtMID  Ready   Wr.Enbl.  ReqId
  0 hcart    TLD               Yes   0822DB  0822DB   Yes     Yes        0
  1 hcart  DOWN-TLD             -                     No       -         0
  2 hcart  DOWN-TLD             -                     No       -         0
  3 hcart  DOWN-TLD             -                     No       -         0
  4 hcart    TLD                -                     No       -         0
  5 hcart  DOWN-TLD             -                     No       -         0

                             ADDITIONAL DRIVE STATUS

Drv DriveName            Shared    Assigned        Comment
  0 T006_F1_S7_D7         No       

  1 T006_F2_S1_D10        No       -
  2 T006_F2_S2_D11        No       -
  3 T006_F2_S3_D12        No       -
  4 T006_F2_S4_D13        No       -
  5 T006_F2_S5_D14        No       -

O:\Program Files\Veritas\Volmgr\bin>

What could the issue be? I suspect the drives need to be updated on the media server.

Please suggest.
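
For readers who want to pick out the down drives from a longer listing, here is a minimal Python sketch. It assumes the DRIVE STATUS column layout shown in the output above (drive index, type, then control), and the function name `down_drives` is hypothetical, not part of NetBackup.

```python
# Sample copied from the DRIVE STATUS section of the vmoprcmd -d output above.
SAMPLE = """\
Drv Type   Control  User      Label  RecMID  ExtMID  Ready   Wr.Enbl.  ReqId
  0 hcart    TLD               Yes   0822DB  0822DB   Yes     Yes        0
  1 hcart  DOWN-TLD             -                     No       -         0
  2 hcart  DOWN-TLD             -                     No       -         0
  3 hcart  DOWN-TLD             -                     No       -         0
  4 hcart    TLD                -                     No       -         0
  5 hcart  DOWN-TLD             -                     No       -         0
"""


def down_drives(status_text):
    """Return the drive indexes whose Control column starts with DOWN."""
    down = []
    for line in status_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric drive index, then type, then control.
        if len(fields) >= 3 and fields[0].isdigit() and fields[2].startswith("DOWN"):
            down.append(int(fields[0]))
    return down


print(down_drives(SAMPLE))
```

In practice the text would come from saving the command output to a file; the sketch only reads the table and does not change any drive state.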

14 Comments

Mark_Solutions's picture

Do a robtest to make sure that there are not any tapes in the drives (this usually happens when tapes are not changed correctly in the library).

Take a look in the Windows event logs - there should be entries in there that tell you exactly why the drives have gone down

Authorised Symantec Consultant

Don't forget to "Mark as Solution" if someones advice has solved your issue - and please bring back the Thumbs Up!!.

Shekhar D's picture

When I tried to run robtest on the media server, it showed the following message:

O:\Program Files\Veritas\Volmgr\bin>robtest
No locally-controlled robots with test utilities are configured

O:\Program Files\Veritas\Volmgr\bin>

Thanks,

 

Marianne's picture

Add VERBOSE entry to ...\veritas\volmgr\vm.conf on Media server.

Restart NBU Device Management Service on Media server.

UP the drive(s)

Next time a drive goes DOWN, the reason will be logged in Event Viewer Application log.

Hardware errors (e.g. HBA) will be logged in Event Viewer System log.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Mark_Solutions's picture

You need to run robtest on the robot control host. Is the library shared? Does another server control the robotics?

Shekhar D's picture

As of now we have the following entry in vm.conf:

O:\Program Files\Veritas\Volmgr>more vm.conf
MM_SERVER_NAME = Mediaserver.dadc.sbc.com

 

Marianne's picture

As per my post above - add a new line to vm.conf containing the word VERBOSE.
Save the file and restart Device Management service.
Reason for DOWN drives will now be logged in Event Viewer Application log.
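
The vm.conf edit described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name `ensure_verbose` is hypothetical, the file is assumed to be plain text with one entry per line, and the Device Management service still has to be restarted separately.

```python
from pathlib import Path


def ensure_verbose(vm_conf):
    """Append a VERBOSE line to vm.conf unless one is already present.

    Returns True if the file was changed, False if VERBOSE was already there.
    """
    path = Path(vm_conf)
    lines = path.read_text().splitlines() if path.exists() else []
    if any(line.strip() == "VERBOSE" for line in lines):
        return False
    # Keep existing entries (e.g. MM_SERVER_NAME) and add VERBOSE on its own line.
    lines.append("VERBOSE")
    path.write_text("\n".join(lines) + "\n")
    return True
```

On the media server in this thread the argument would be the vm.conf under O:\Program Files\Veritas\Volmgr. Running the helper twice is harmless because the second call finds the existing line and leaves the file alone.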

Nicolai's picture

You really need to look at the Windows Event log as Mark suggested. The event log will contain hints as to why the drives are downed.

Assumption is the mother of all mess ups.

If this post answered your question - Please mark as a solution.

Shekhar D's picture

O:\Program Files\Veritas\Volmgr\bin>robtest
Configured robots with local control supporting test utilities:
  TLD(0)     robotic path = {6,0,1,1}

Robot Selection
---------------
  1)  TLD 0
  2)  none/quit
Enter choice: 1
1

Robot selected: TLD(0)   robotic path = {6,0,1,1}

Invoking robotic test utility:
O:\Program Files\Veritas\Volmgr\bin\tldtest.exe -rn 0 -r {6,0,1,1}

Opening {6,0,1,1}
MODE_SENSE complete
Enter tld commands (? returns help information)

Seems like robtest is working from the robot control host.

Mark_Solutions's picture

So what does s d show?

Yasuhisa Ishikawa's picture

As written here again and again, check the Windows Event log (Application log and System log) first. When a drive goes down, the reason must be logged.

If you cannot find any relevant messages, then follow Marianne's suggestions - create the debug log folders, add a VERBOSE line to vm.conf, restart NetBackup, up the drives, and reproduce the issue.

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

Amaan's picture

Please perform the steps below:

1. Add VERBOSE to the vm.conf file as suggested above.

2. Create the folders ltid, robots, reqlib and daemon under the directory veritas/volmgr/debug.

3. Up the drives and wait for the issue to come up again.

4. Share the logs from the above folders and from the Event Viewer.

 

Also share the results of the s d command under robtest, and try to move a tape into a drive.
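
Step 2 above can be sketched in Python as follows. The folder names come from the post; the function name `create_debug_folders` and the idea of passing the volmgr directory as an argument are assumptions made for illustration.

```python
from pathlib import Path

# Folder names as listed in step 2 above.
DEBUG_FOLDERS = ("ltid", "robots", "reqlib", "daemon")


def create_debug_folders(volmgr_dir):
    """Create the debug log folders under <volmgr_dir>/debug and return their paths."""
    debug_root = Path(volmgr_dir) / "debug"
    created = []
    for name in DEBUG_FOLDERS:
        folder = debug_root / name
        # exist_ok=True makes the call safe to repeat if a folder already exists.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

The helper only creates directories; the services still need a restart before they start writing logs there, as described earlier in the thread.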

Shekhar D's picture

Drv DriveName            Shared    Assigned        Comment
  0 T006_F1_S7_D7         No       

  1 T006_F2_S1_D10        No       -
  2 T006_F2_S2_D11        No       -
  3 T006_F2_S3_D12        No       -
  4 T006_F2_S4_D13        No       -
  5 T006_F2_S5_D14        No       -

I would like to add that the index 0 drive is on Frame 1 and the other 5 drives are configured on Frame 2.

Backups run successfully on the Frame 1 drive (1-7), but the other 5 drives, which are configured on Frame 2, go down.

Does this make a difference, or is there a problem with Frame 2 on the library side?

We have added the VERBOSE entry and found the following log entries for the other 5 drives:

e.g. Operator/EMM server has DOWN'ed drive T006_F2_S3_D12 (device 3)
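
If many such entries are exported from the Application log, the drive name and index can be pulled out with a regular expression. The following is a minimal sketch that assumes every entry follows the wording of the single message quoted above; `parse_down_event` is a hypothetical helper name.

```python
import re

# Pattern based only on the one message quoted in this thread.
DOWN_EVENT = re.compile(r"has DOWN'ed drive (?P<name>\S+) \(device (?P<index>\d+)\)")


def parse_down_event(message):
    """Return (drive_name, drive_index) for a DOWN event, or None if no match."""
    match = DOWN_EVENT.search(message)
    if match is None:
        return None
    return match.group("name"), int(match.group("index"))


print(parse_down_event(
    "Operator/EMM server has DOWN'ed drive T006_F2_S3_D12 (device 3)"
))
```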

Thanks.

 

Shekhar D's picture

All drives are LTO Ultrium-2 and have the same firmware version, 73V1.
The frames also have the same version, 8930.

Marianne's picture

I was hoping for something more in Application log...

It might be a case of 3 I/O errors in a 12-hour period, which will cause NBU to DOWN the drive.
Are you seeing lots of status 84 errors? If so, see http://www.symantec.com/docs/TECH43243 
and http://www.symantec.com/docs/TECH169477

Do you have bptm log in place? If so, please rename the log to bptm.txt and post as File attachment.

Post <install-path>\veritas\netbackup\db\media\errors as well.
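
To check whether a drive could have hit the "3 I/O errors in a 12-hour period" condition mentioned above, the error times can be tested with a sliding window. This is a rough sketch only: the input is assumed to be (timestamp, drive name) pairs that you have already extracted yourself, the function name `drives_over_threshold` is hypothetical, and the limit and window are parameters because the values used here are simply those quoted in this post.

```python
from collections import defaultdict
from datetime import timedelta


def drives_over_threshold(errors, limit=3, window=timedelta(hours=12)):
    """Return drives with at least `limit` errors inside any span of `window`.

    `errors` is an iterable of (timestamp, drive_name) pairs, where each
    timestamp is a datetime.datetime.
    """
    by_drive = defaultdict(list)
    for when, drive in errors:
        by_drive[drive].append(when)

    flagged = set()
    for drive, times in by_drive.items():
        times.sort()
        # Compare each error with the one `limit - 1` positions after it.
        for i in range(len(times) - limit + 1):
            if times[i + limit - 1] - times[i] <= window:
                flagged.add(drive)
                break
    return sorted(flagged)
```

The sketch does not read the media errors file itself, since its layout is not shown in this thread; it only applies the counting rule to timestamps supplied by the caller.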
