
NDMP Tape Drive Issue

Created: 02 Nov 2012 • Updated: 06 Nov 2012 | 5 comments
This issue has been solved. See solution.

Hi all,

We have one master server running Solaris 10; it also acts as the media server.

The robot is an NDMP robot with 4 tape drives: 2 regular and 2 NDMP tape drives.

Everything was working fine until one drive went bad; we replaced it with the vendor's help.

The failed drive was an LTO4 NDMP tape drive, and the replacement is an LTO5 drive.

I configured the drive at the NBU level, and it showed up.

When I start backups, this drive goes down immediately.

The tape library is a Quantum i500.

 

Logs:

 

11/03/2012 00:23:04 - Error bptm (pid=18611) error requesting media, TpErrno = Robot operation failed
11/03/2012 00:23:05 - Warning bptm (pid=18611) media id E10411 load operation reported an error
11/03/2012 00:23:05 - current media E10411 complete, requesting next media Any
11/03/2012 00:23:26 - Error bptm (pid=18611) NBJM returned an extended error status: All compatible drive paths are down but media is available (2009)
11/03/2012 00:23:26 - Error ndmpagent (pid=18610) NDMP backup failed, path = UNKNOWN
11/03/2012 00:23:26 - end writing
11/03/2012 00:23:27 - Info bptm (pid=18611) EXITING with status 252 <----------
11/03/2012 00:23:27 - Info ndmpagent (pid=0) done. status: 150: termination requested by administrator
An extended error status has been encountered, check detailed status  (252)

 

 

From /var/adm/messages:

 

Nov  2 13:26:54 nasun001.wmg.com tldcd[16695]: [ID 633948 daemon.notice] TLD(0) issuing initialize_element_status, drive 1 asc=0x83, ascq=0x4
Nov  2 13:26:55 nasun001.wmg.com tldcd[16695]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
Nov  2 13:27:16 nasun001.wmg.com tldcd[16742]: [ID 633948 daemon.notice] TLD(0) issuing initialize_element_status, drive 1 asc=0x83, ascq=0x4
Nov  2 13:27:16 nasun001.wmg.com tldcd[16742]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
Nov  2 13:27:19 nasun001.wmg.com ltid[16084]: [ID 676105 daemon.error] Operator/EMM server has DOWN'ed drive HP.ULTRIUM4-SCSI.000 (device 2)
Nov  2 13:27:51 nasun001.wmg.com ltid[16084]: [ID 515287 daemon.notice] Operator/EMM server has UP'ed drive HP.ULTRIUM4-SCSI.000 (device 2)
Nov  2 13:29:00 nasun001.wmg.com tldcd[16836]: [ID 633948 daemon.notice] TLD(0) issuing initialize_element_status, drive 1 asc=0x83, ascq=0x4
Nov  2 13:29:00 nasun001.wmg.com tldcd[16836]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
Nov  2 13:29:21 nasun001.wmg.com tldcd[16881]: [ID 633948 daemon.notice] TLD(0) issuing initialize_element_status, drive 1 asc=0x83, ascq=0x4
Nov  2 13:29:21 nasun001.wmg.com tldcd[16881]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
Nov  2 13:29:24 nasun001.wmg.com ltid[16084]: [ID 676105 daemon.error] Operator/EMM server has DOWN'ed drive HP.ULTRIUM4-SCSI.000 (device 2)
Nov  2 13:30:53 nasun001.wmg.com tldcd[17301]: [ID 633948 daemon.notice] TLD(0) issuing initialize_element_status, drive 1 asc=0x83, ascq=0x4
Nov  2 13:30:53 nasun001.wmg.com tldcd[17301]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
Nov  2 13:30:54 nasun001.wmg.com ltid[16084]: [ID 372346 daemon.error] Cleaning for drive 2 failed, status = Robotic mount failure
Nov  2 13:31:14 nasun001.wmg.com tldcd[17346]: [ID 633948 daemon.notice] TLD(0) issuing initialize_element_status, drive 1 asc=0x83, ascq=0x4
Nov  2 13:31:15 nasun001.wmg.com tldcd[17346]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
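
For a long messages file, it can help to tally the repeated robot errors instead of reading them one by one. The following is a minimal sketch, not a NetBackup tool: the regular expression is inferred only from the tldcd lines shown above, and the names `CLEAR_ERR` and `summarize_clear_errors` are hypothetical.

```python
import re
from collections import Counter

# Sample lines copied from the /var/adm/messages excerpt above.
SAMPLE = """\
Nov  2 13:26:55 nasun001.wmg.com tldcd[16695]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
Nov  2 13:27:19 nasun001.wmg.com ltid[16084]: [ID 676105 daemon.error] Operator/EMM server has DOWN'ed drive HP.ULTRIUM4-SCSI.000 (device 2)
Nov  2 13:29:00 nasun001.wmg.com tldcd[16836]: [ID 495694 daemon.error] TLD(0) cannot clear drive 1 error, drive asc=0x83, ascq=0x4
"""

# Pattern assumed from the log format above; other releases may word it differently.
CLEAR_ERR = re.compile(
    r"TLD\((?P<robot>\d+)\) cannot clear drive (?P<drive>\d+) error, "
    r"drive asc=(?P<asc>0x[0-9a-fA-F]+), ascq=(?P<ascq>0x[0-9a-fA-F]+)"
)


def summarize_clear_errors(text):
    """Count 'cannot clear drive' errors per (robot, drive, asc, ascq)."""
    counts = Counter()
    for line in text.splitlines():
        m = CLEAR_ERR.search(line)
        if m:
            key = (int(m["robot"]), int(m["drive"]), m["asc"], m["ascq"])
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (robot, drive, asc, ascq), n in summarize_clear_errors(SAMPLE).items():
        print(f"TLD({robot}) drive {drive}: asc={asc} ascq={ascq} seen {n} time(s)")
```

A single asc/ascq pair repeating against the same robot drive, as it does here, points at one persistent condition rather than a series of unrelated failures.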
 

 

Also, from the NDMP host that has this tape drive:

 

Host "server103" tape device model "Hewlett-Packard LTO-5":
  Device "rst0l" attributes=(0x5) REWIND RAW
    DENSITY=LTO-3(ro)/4 4/800GB
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "nrst0l" attributes=(0x4) RAW
    DENSITY=LTO-3(ro)/4 4/800GB
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "urst0l" attributes=(0x6) UNLOAD RAW
    DENSITY=LTO-3(ro)/4 4/800GB
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "rst0m" attributes=(0x5) REWIND RAW
    DENSITY=LTO-3(ro)/4 8/1600GB cmp
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "nrst0m" attributes=(0x4) RAW
    DENSITY=LTO-3(ro)/4 8/1600GB cmp
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "urst0m" attributes=(0x6) UNLOAD RAW
    DENSITY=LTO-3(ro)/4 8/1600GB cmp
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "rst0h" attributes=(0x5) REWIND RAW
    DENSITY=LTO-5 1600GB
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "nrst0h" attributes=(0x4) RAW
    DENSITY=LTO-5 1600GB
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "urst0h" attributes=(0x6) UNLOAD RAW
    DENSITY=LTO-5 1600GB
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "rst0a" attributes=(0x5) REWIND RAW
    DENSITY=LTO-5 3200GB cmp
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "nrst0a" attributes=(0x4) RAW
    DENSITY=LTO-5 3200GB cmp
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
  Device "urst0a" attributes=(0x6) UNLOAD RAW
    DENSITY=LTO-5 3200GB cmp
    ELECTRICAL_NAME=1d.61
    SERIAL_NUMBER=C3898C7000
    WORLD_WIDE_NAME=WWN[5:003:08c389:8c7000]
    ALIAS 0=st0
 

 

 

vmoprcmd -d

                                PENDING REQUESTS

                                     <NONE>

                                  DRIVE STATUS

Drv Type   Control  User      Label  RecMID  ExtMID  Ready   Wr.Enbl.  ReqId
  0 hcart2   TLD                -                     No       -         0
  1 hcart2   TLD                -                     No       -         0
  2 hcart3 DOWN-TLD             -                     No       -         0
  3 hcart3   TLD                -                     No       -         0
 

                             ADDITIONAL DRIVE STATUS

Drv DriveName            Shared    Assigned        Comment
  0 HP.ULTRIUM5-SCSI.000  No       -
  1 HP.ULTRIUM5-SCSI.001  No       -
  2 HP.ULTRIUM4-SCSI.000  No       -
  3 HP.ULTRIUM4-SCSI.001  No       -
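
The two tables in this output are keyed by the same drive index, so joining them shows at a glance which named drive is DOWN and what density type it is configured with. This is a rough sketch that assumes the whitespace-separated layout shown above and drive names without spaces; `parse_drives` and `down_drives` are hypothetical helper names, not part of NetBackup.

```python
# Table rows copied from the vmoprcmd -d output above.
SAMPLE = """\
Drv Type   Control  User      Label  RecMID  ExtMID  Ready   Wr.Enbl.  ReqId
  0 hcart2   TLD                -                     No       -         0
  1 hcart2   TLD                -                     No       -         0
  2 hcart3 DOWN-TLD             -                     No       -         0
  3 hcart3   TLD               -                      No       -         0

Drv DriveName            Shared    Assigned        Comment
  0 HP.ULTRIUM5-SCSI.000  No       -
  1 HP.ULTRIUM5-SCSI.001  No       -
  2 HP.ULTRIUM4-SCSI.000  No       -
  3 HP.ULTRIUM4-SCSI.001  No       -
"""


def parse_drives(text):
    """Join the DRIVE STATUS and ADDITIONAL DRIVE STATUS rows on drive index."""
    drives = {}
    section = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Drv":
            # The second header column identifies which table follows.
            section = fields[1]
            continue
        if section is None or not fields[0].isdigit():
            continue
        entry = drives.setdefault(int(fields[0]), {})
        if section == "Type" and len(fields) >= 3:
            entry["type"], entry["control"] = fields[1], fields[2]
        elif section == "DriveName" and len(fields) >= 2:
            entry["name"] = fields[1]
    return drives


def down_drives(drives):
    """Return (index, name, type) for drives whose control state starts with DOWN."""
    return [
        (idx, d.get("name"), d.get("type"))
        for idx, d in sorted(drives.items())
        if d.get("control", "").startswith("DOWN")
    ]


if __name__ == "__main__":
    for idx, name, dtype in down_drives(parse_drives(SAMPLE)):
        print(f"drive {idx} ({name}, {dtype}) is DOWN")
```

For the output above, the only DOWN entry is drive 2, which NetBackup still lists under the name HP.ULTRIUM4-SCSI.000 with type hcart3.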

 

 

Please advise.


Marianne

These are hardware errors, not NBU errors:

cannot clear drive 1 error, drive asc=0x83, ascq=0x4

 error requesting media, TpErrno = Robot operation failed

 media id E10411 load operation reported an error

Get your hardware vendor back - show him these errors...

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

SOLUTION
Yasuhisa Ishikawa

According to this document on t10.org, asc=0x83, ascq=0x4 means "Data transfer element not installed". Check whether the new drive is recognized and online in the i500.
http://www.t10.org/cgi-bin/ac.pl?t=d&f=06-109r1.pdf

It seems that the devices were never reconfigured after the replacement.
Remove the replaced drive from NetBackup and run the Device Configuration Wizard again. Be sure to change the drive type to hcart3.

Authorized Symantec Consultant(ASC) Data Protection in Tokyo, Japan

ChAmp35

Hi Marianne,

I agree with you on this; it could be a HW error.

On checking, I found that the earlier tape drive was LTO4 and it was replaced with an LTO5.

We have logged a case with the vendor. They are looking into it and will most probably replace the drive.

But I have one question: if I want to use this LTO5 tape drive, is there any way to do so?

Marianne

It seems you have a mix of LTO3 and LTO5 drives.

You cannot configure LTO5 drives with the same density as LTO3 drives: an LTO5 drive can read LTO3 tapes but cannot write to them.

Your other LTO5 drives are configured as hcart2. 
The replaced tape drive should also be configured as hcart2.
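
The read/write point above follows the usual LTO compatibility rule, which can be sketched as a small function. This is an illustration only: it assumes the rule that holds for LTO-1 through LTO-7 drives (write the drive's own generation and one back, read two back), and `lto_access` is a hypothetical name.

```python
def lto_access(drive_gen, tape_gen):
    """Return 'read/write', 'read-only', or 'incompatible'.

    Assumed rule (LTO-1 through LTO-7 drives only): a drive writes its own
    generation and the previous one, and can read two generations back.
    """
    if not 1 <= drive_gen <= 7 or tape_gen < 1:
        raise ValueError("rule only modelled for LTO-1..LTO-7 drives")
    back = drive_gen - tape_gen
    if back < 0:
        return "incompatible"  # tape is newer than the drive
    if back <= 1:
        return "read/write"
    if back == 2:
        return "read-only"
    return "incompatible"  # tape is too old for this drive


if __name__ == "__main__":
    for tape in (3, 4, 5):
        print(f"LTO-5 drive, LTO-{tape} tape: {lto_access(5, tape)}")
```

This agrees with the densities the NDMP host reports for the new drive, where LTO-3 is marked (ro) and LTO-4 and LTO-5 are writable.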


ChAmp35

Thanks for the help.

The issue was resolved after replacing the LTO5 tape drive with an LTO4 tape drive.