Video Screencast Help

Backup fail with exit status 85

Created: 14 May 2013 • Updated: 15 May 2013 | 9 comments
This issue has been solved. See solution.

I have a lot of job which is fail recently with the simliar code. It can happen on different job and the tape which used before the job. So I don't think it is the problem of tape, however, sometime it can run successfully. Can anyone help to check? Thanks.

 

Error bptm(pid=4520) tapemark or blank tape encountered reading media header, drive index 2
Info bptm(pid=4520) EXITING with status 85
read failed, media read error (85)

 

5/14/2013 10:35:19 AM - requesting resource LCM_ES-group
5/14/2013 10:35:20 AM - granted resource LCM_ES-group
5/14/2013 10:35:21 AM - begin Duplicate
5/14/2013 10:35:21 AM - started process RUNCMD (744)
5/14/2013 10:35:21 AM - ended process 0 (744)
5/14/2013 10:35:21 AM - requesting resource ES-group
5/14/2013 10:35:21 AM - requesting resource 000397
5/14/2013 10:35:21 AM - reserving resource 000397
5/14/2013 10:35:22 AM - awaiting resource 000397 Reason: Drives are in use, Media Server: lh002,
  Robot Number: 1, Robot Type: TLD, Media ID: N/A, Drive Name: N/A,
  Volume Pool: N/A, Storage Unit: N/A, Drive Scan Host: N/A
 
5/14/2013 10:51:05 AM - awaiting resource ES-group - No drives are available
5/14/2013 10:54:38 AM - reserved resource 000397
5/14/2013 10:54:38 AM - granted resource 000107
5/14/2013 10:54:38 AM - granted resource HP.ULTRIUM4-SCSI.005
5/14/2013 10:54:38 AM - granted resource es001-hcart-robot-tld-3
5/14/2013 10:54:38 AM - granted resource 000397
5/14/2013 10:54:38 AM - granted resource HP.ULTRIUM4-SCSI.003
5/14/2013 10:54:40 AM - Info bptm(pid=7588) start          
5/14/2013 10:54:41 AM - started process bptm (7588)
5/14/2013 10:54:41 AM - Info bptm(pid=7588) using 65536 data buffer size      
5/14/2013 10:54:41 AM - Info bptm(pid=7588) setting receive network buffer to 263168 bytes    
5/14/2013 10:54:41 AM - Info bptm(pid=7588) using 30 data buffers       
5/14/2013 10:54:42 AM - Info bptm(pid=7588) start backup         
5/14/2013 10:54:43 AM - Info bptm(pid=4520) start          
5/14/2013 10:54:43 AM - started process bptm (4520)
5/14/2013 10:54:44 AM - Info bptm(pid=7588) setting receive network buffer to 263168 bytes    
5/14/2013 10:54:44 AM - Info bptm(pid=7588) backup child process is pid 3092.6964     
5/14/2013 10:54:44 AM - Info bptm(pid=7588) Waiting for mount of media id 000107 (copy 2) on server es001.
5/14/2013 10:54:44 AM - started process bptm (7588)
5/14/2013 10:54:44 AM - mounting 000107
5/14/2013 10:54:44 AM - Info bptm(pid=7588) INF - Waiting for mount of media id 000107 on server es001 for writing.
5/14/2013 10:54:44 AM - Info bpdm(pid=4520) reading backup image        
5/14/2013 10:54:44 AM - Info bptm(pid=4520) using 30 data buffers       
5/14/2013 10:54:44 AM - Info bptm(pid=4520) spawning a child process       
5/14/2013 10:54:44 AM - Info bptm(pid=3092) start          
5/14/2013 10:54:44 AM - started process bptm (3092)
5/14/2013 10:54:44 AM - Info bptm(pid=4520) child pid: 5312        
5/14/2013 10:54:44 AM - Info bptm(pid=4520) Waiting for mount of media id 000397 (copy 1) on server lh002.
5/14/2013 10:54:44 AM - started process bptm (4520)
5/14/2013 10:54:44 AM - mounting 000397
5/14/2013 10:54:44 AM - Info bptm(pid=4520) INF - Waiting for mount of media id 000397 on server lh002 for reading.
5/14/2013 10:54:44 AM - Info bptm(pid=5312) start          
5/14/2013 10:54:44 AM - started process bptm (5312)
5/14/2013 10:55:42 AM - Info bptm(pid=7588) media id 000107 mounted on drive index 1, drivepath {0,0,1,0}, drivename HP.ULTRIUM4-SCSI.005, copy 2
5/14/2013 10:55:46 AM - Info bptm(pid=7588) INF - Waiting for positioning of media id 000107 on server es001 for writing.
5/14/2013 11:14:35 AM - Error bptm(pid=4520) tapemark or blank tape encountered reading media header, drive index 2
5/14/2013 11:14:37 AM - Info bptm(pid=4520) EXITING with status 85 <----------      
5/14/2013 11:14:46 AM - Error bpduplicate(pid=744) host lh002 backup id alh015.xyz.com_1368210913 read failed, media read error (85).
5/14/2013 11:14:47 AM - Error bpduplicate(pid=744) host es001 backupid alh015.xyz.com_1368210913 write failed, termination requested by administrator (150).
5/14/2013 11:14:48 AM - Error bpduplicate(pid=744) Duplicate of backupid alh015.xyz.com_1368210913 failed, termination requested by administrator (150). 
5/14/2013 11:14:48 AM - Error bpduplicate(pid=744) Status = no images were successfully processed.    
5/14/2013 11:14:48 AM - end Duplicate; elapsed time: 00:39:27
no images were successfully processed(191)
 

Operating Systems:

Comments 9 CommentsJump to latest comment

John Wan's picture

Hints. I found that all those fail job are duplication job.

John Wan's picture

5/13/2013 9:07:37 PM - Info nbjm(pid=3360) requesting STANDARD_RESOURCE resources from RB for backup job (jobid=101712, request id:{25DEE511-B535-4363-8D50-E1638BFB84D4})
5/13/2013 9:07:37 PM - requesting resource LH-group
5/13/2013 9:07:37 PM - requesting resource lh001.NBU_CLIENT.MAXJOBS.lh003
5/13/2013 9:07:37 PM - requesting resource lh001.NBU_POLICY.MAXJOBS.LH003_sys
5/13/2013 9:07:37 PM - awaiting resource LH-group - No drives are available
5/13/2013 9:19:05 PM - granted resource lh001.NBU_CLIENT.MAXJOBS.lh003
5/13/2013 9:19:05 PM - granted resource lh001.NBU_POLICY.MAXJOBS.LH003_sys
5/13/2013 9:19:05 PM - granted resource 000391
5/13/2013 9:19:05 PM - granted resource HP.ULTRIUM4-SCSI.003
5/13/2013 9:19:05 PM - granted resource lh003-hcart-robot-tld-1
5/13/2013 9:19:06 PM - estimated 3205714 Kbytes needed
5/13/2013 9:19:06 PM - Info nbjm(pid=3360) resumed backup job for client lh003, policy LH003_sys, schedule Daily-Inc on storage unit lh003-hcart-robot-tld-1
5/13/2013 9:19:07 PM - started process bpbrm (1904)
5/13/2013 9:19:09 PM - Info bpbrm(pid=1904) lh003 is the host to backup data from   
5/13/2013 9:19:09 PM - Info bpbrm(pid=1904) reading file list from client      
5/13/2013 9:19:09 PM - connecting
5/13/2013 9:19:09 PM - connected; connect time: 00:00:00
5/13/2013 9:19:10 PM - Info bpbrm(pid=1904) starting bpbkar32 on client       
5/13/2013 9:19:10 PM - Info bpbkar32(pid=5180) Backup started         
5/13/2013 9:19:11 PM - Info bptm(pid=5560) start          
5/13/2013 9:19:11 PM - Info bptm(pid=5560) using 65536 data buffer size      
5/13/2013 9:19:11 PM - Info bptm(pid=5560) setting receive network buffer to 263168 bytes    
5/13/2013 9:19:11 PM - Info bptm(pid=5560) using 30 data buffers       
5/13/2013 9:19:11 PM - mounting 000391
5/13/2013 9:19:12 PM - Info bptm(pid=5560) start backup         
5/13/2013 9:19:12 PM - Info bptm(pid=5560) Waiting for mount of media id 000391 (copy 1) on server lh003.
5/13/2013 9:22:42 PM - Info bptm(pid=5560) media id 000391 mounted on drive index 3, drivepath {0,0,3,0}, drivename HP.ULTRIUM4-SCSI.003, copy 1
5/13/2013 9:22:42 PM - mounted; mount time: 00:03:31
5/13/2013 9:38:41 PM - positioning 000391 to file 1
5/13/2013 9:54:57 PM - Info bptm(pid=5560) WriteFile() failed err = 23      
5/13/2013 9:54:57 PM - Error bptm(pid=5560) write error on media id 000391, drive index 3, writing header block, 23
5/13/2013 9:54:57 PM - Info bptm(pid=5560) EXITING with status 84 <----------      
5/13/2013 9:54:58 PM - Info bpbkar32(pid=5180) bpbkar waited 0 times for empty buffer, delayed 0 times. 
5/13/2013 9:55:00 PM - Error bpbrm(pid=1904) could not send server status message     
5/13/2013 9:55:01 PM - Error bpbrm(pid=1904) cannot send mail to gsdsdataman@xyz.com      
5/13/2013 9:55:01 PM - end writing
media write error(84)
 

mph999's picture

Drive or tape failure as far as I can see, is it happening on certain tapes/ certain drives.

System event logs may show something.

NBU does not write or read to tape, these actions are perfomred by the OS, so it won'tt be NBU.

From the media servers that have suffered the issue , perhaps you can post up:

 

<install path>\veritas\netbackup\db\media\errors file (one on each media server).

Nothing in NBU is going to fix this, there are no options to fix such an issue.

I presume this was wroking previously, and there have been no changes to anything.

Martin

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805
 
SOLUTION
Marianne's picture

So media id 000397 seems to be the source media:

Info bptm(pid=4520) INF - Waiting for mount of media id 000397 on server hkbklh002 for reading.

But the tape driver reports that this a blank tape:

Error bptm(pid=4520) tapemark or blank tape encountered reading media header, drive index 2 

Seems either a media problem or else a driver problem.

What is media server OS version? 
Which tape drivers are you using?

Please check that bptm log folder exists on media server under <install-path>\veritas\netbackup logs and that VERBOSE entry exists in <install-path>\veritas\volmgr\vm.conf (followed by NBU Device Manager service restart.

The VERBOSE entry will ensure additional Media Manager logging to Event Viewer System and Application logs.

 

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

John Wan's picture

The media server OS version is windows server 2003.

how can I check tape drivers?

 

bptm log folder and vm.conf are existed.

 

There are a lot of event ID 7, sources hplto, Description "The device, \Device\TapeX, had a bad block." at System Log.

Also, Event ID 2636 , sources NetBackup, Description "Operator/EMM server has DOWN'ed drive HP.ULTRIUM4-SCSI.00X (device x)

 

John Wan's picture

I found the problem is related to the drive fail on HP.ULTRIUM4-SCSI.003. Replaced it then everything working fine now.

Marianne's picture

Since Martin (mph999) has pointed out hardware problem:

Drive or tape failure .....

please mark his post as Solution?

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

John Wan's picture

Sure~

He always give a valuable point and information!

mph999's picture

Thank you for the kind comments and solution.

M

 

Regards,  Martin
 
Setting Logs in NetBackup:
http://www.symantec.com/docs/TECH75805