media position error(86)

Created: 28 Feb 2014 • Updated: 14 Apr 2014 | 13 comments
xao's picture
This issue has been solved. See solution.

NetBackup environment: Windows Server 2008 R2 + 4 media servers,

IBM robot, EMC Celerra filers,

Master + media servers on NetBackup version 7.5.0.7

 

I am trying to back up some EMC filers to tape through NDMP, using the IBM robot. We have 3 filers; 2 are working, but this one is not.

I attached the job log. If you want any other information, please just ask (if you ask for other types of logs, please give their location).

 

Error bptm() io_ioctl_ndmp (MTFSF) failed on media id IBM037, drive index 7, return code 18 (NDMP_XDR_DECODE_ERR) (bptm.c.7061)
Error ndmpagent() connection 0000000000AA8970 ndmp_message_process_one failed, status = 18 (NDMP_XDR_DECODE_ERR)     
Error ndmpagent() NDMP backup failed, path = UNKNOWN       
Info bptm() EXITING with status 86 <----------        
Info ndmpagent() done. status: 86: media position error       
end writing
media position error(86)

 

Thank You

//George

 


Nagalla's picture

Are the tape drives connected to the filer?

If yes, please post the NDMPd logs from the filer.

Are the same tape drives attached to the other 2 filers that are working fine?

Did you check with other tapes? I am seeing the failures on IBM037.

Try other tapes too, and other drives.

 

xao's picture

Are the tape drives connected to the filer?

  • Yes, they are connected.

If yes, please post the NDMPd logs from the filer.

  • I have no access to it, but I will try.

Are the same tape drives attached to the other 2 filers that are working fine?

  • Yes, we have 5 drives, and each is connected by a drive path to each NDMP host.

Did you check with other tapes? I am seeing the failures on IBM037.

  • Yes, I did, with the same result. I even froze IBM037.

Try other tapes too, and other drives.

I enabled "Allow multiple data streams" in the policy.

The result is fantastic:

The policy has 4 CIFS shares, each of which resulted in a child job:

  • One CIFS share has finished.
  • One CIFS share has the same error posted above, with a different drive and a different tape. What can cause this error? I accessed that share and it was available! Every time, it freezes the tape; 3 tapes are frozen because of this error.
  • The other 2 CIFS shares are still running and are almost finished.

Thanks

//George

Nagalla's picture

So, it's a matter of isolation.

Take out the volume that is giving the error, run the backup for the other 3 CIFS shares without multiple data streams, and see how it goes.

If that is successful, move the one with the issue to a test or isolated policy and let the other 3 run in the existing one.

Then trigger the backup for the problem share and get the NDMPd logs from the filer admin.

xao's picture

So the job from my first test, described above, has finished:

Application Event:

TLD(0) [5288] Drive 1 (device 3) has not become ready. Last status: Data error (cyclic redundancy check).

TLD(0) [5288] Could not get tape parameters for drive 1 (device 3): Data error (cyclic redundancy check).

TLD(0) [12968] Drive 1 (device 5) has not become ready. Last status: The requested resource is in use.

I did not find anything regarding any of the tape IDs in <Install_dir>\VERITAS\NetBackup\db\media\errors

I searched <Install_dir>\VERITAS\NetBackup\logs\bptm\<1.3.2014> around the same time frame; see the attachment bptm - time frame log.txt

  • 2 CIFS shares finished successfully (with some tape errors)
  • The one CIFS share I am talking about did not write anything at all.
  • The 3rd one that was writing failed; see the attached joblog-attempt5
Attachments:
joblog-attempt5.txt (4.51 KB)
bptm - time frame log.txt (36.88 KB)
Marianne's picture

I see the following in the job log:

2014-03-01 01:55:03 - granted resource IBM068
2014-03-01 01:55:03 - granted resource IBM.ULT3580-TD5.Drive1

The tape filled up, and a new tape was mounted. For some reason, a different tape drive was chosen to carry on with the backup:

2014-03-01 08:39:16 - granted resource IBM047
2014-03-01 08:39:16 - granted resource IBM.ULT3580-TD5.Drive3

So, we have 2 new backup resources -  media as well as a different tape drive.

The error reported by ndmpagent is 'Medium error'.

2014-03-01 08:39:28 - Error ndmpagent(pid=11460) x: Medium error
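
To see at a glance when the job switched media and drive, the "granted resource" entries can be pulled out of the job log and grouped by timestamp. This is only a rough sketch: the line format is assumed from the entries quoted in this thread, and the names `JOB_LOG`, `GRANT` and `grants_by_time` are made up for the illustration.

```python
import re
from collections import defaultdict

# Job-log lines copied from this thread; a real job log has many more entries.
JOB_LOG = """\
2014-03-01 01:55:03 - granted resource IBM068
2014-03-01 01:55:03 - granted resource IBM.ULT3580-TD5.Drive1
2014-03-01 08:39:16 - granted resource IBM047
2014-03-01 08:39:16 - granted resource IBM.ULT3580-TD5.Drive3
2014-03-01 08:39:28 - Error ndmpagent(pid=11460) x: Medium error
"""

# Assumed layout: "<date> <time> - granted resource <name>"
GRANT = re.compile(r"^(\S+ \S+) - granted resource (\S+)$")

def grants_by_time(text):
    """Group 'granted resource' entries by timestamp, in order of appearance."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = GRANT.match(line.strip())
        if match:
            grouped[match.group(1)].append(match.group(2))
    return dict(grouped)

grants = grants_by_time(JOB_LOG)
for stamp, resources in grants.items():
    print(stamp, resources)
```

Each timestamp then lists the media ID and drive that were granted together, so a change of drive in the middle of a job stands out.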

 

Try to re-use this tape by adding it to a test pool and creating a small test policy for the Windows media server to back up to its own STU using this test pool.

Let us know what the result is.

PS:

The bptm log was from the wrong media server or from the wrong date.

There is no evidence of jobid 430915 or bptm PID 5420.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

xao's picture
  • I added all the attempt logs
  • the right bptm log file
  • I arranged the media resource allocation in chronological order, with all the errors related to it:

one tape drive failed with

03/01/14 08:39:29 IBM047 6 TAPE_ALERT IBM.ULT3580-TD5.Drive3 0x00100000 0x00000000

I didn't find the meaning of this value HERE:

0x00100000 0x00000000

I will test those tapes and get back with a full report.

Attachments:
attempt 1.txt (2.94 KB)
attempt 2.txt (1.21 KB)
attempt 3.txt (5.36 KB)
attempt 4.txt (32.06 KB)
attempt 5.txt (4.51 KB)
bptm - 022814.txt (11.28 MB)
chronological resource allocation.xlsx (15.55 KB)
Marianne's picture

Just WAY too many errors on different tape drives, different media....

13:28:41.887 [9328.12576] <2> io_read_block: read error on media id IBM046, drive index 8 reading header block, len = 0; No more data is on the tape. (1104)

13:28:41.887 [9328.12576] <16> io_position_for_write: cannot position media id IBM046 for write

13:28:41.887 [9328.12576] <2> send_MDS_msg: DEVICE_STATUS 1 118099786 x IBM046 4002703 IBM.ULT3580-TD5.Drive4 2000498 POSITION_ERROR 0 0


13:33:16.760 [12420.11092] <16> io_ioctl: io_ioctl_ndmp (MTFSF) failed on media id IBM037, drive index 7, return code 18 (NDMP_XDR_DECODE_ERR) (bptm.c.7061)

13:33:16.760 [12420.11092] <2> send_MDS_msg: DEVICE_STATUS 1 118108332 x IBM037 4002694 IBM.ULT3580-TD5.Drive5 2000497 POSITION_ERROR 0 0

13:33:16.775 [12420.11092] <2> log_media_error: successfully wrote to error file - 02/28/14 13:33:16 IBM037 7 POSITION_ERROR IBM.ULT3580-TD5.Drive5

13:43:59.262 [10776.12196] <2> io_read_block: read error on media id IBM069, drive index 7 reading header block, len = 0; No more data is on the tape. (1104)

13:43:59.262 [10776.12196] <16> io_position_for_write: cannot position media id IBM069 for write

13:43:59.262 [10776.12196] <2> send_MDS_msg: DEVICE_STATUS 1 118111934 x IBM069 4002726 IBM.ULT3580-TD5.Drive5 2000497 POSITION_ERROR 0 0



13:44:45.376 [8980.9836] <2> io_read_block: read error on media id XO0312, drive index 6 reading header block, len = 0; No more data is on the tape. (1104)

13:44:45.376 [8980.9836] <16> io_position_for_write: cannot position media id XO0312 for write

13:44:45.376 [8980.9836] <2> send_MDS_msg: DEVICE_STATUS 1 118099301 x XO0312 4002639 IBM.ULT3580-TD5.Drive3 2000496 POSITION_ERROR 0 0
13:51:10.822 [9340.4780] <2> send_MDS_msg: DEVICE_STATUS 1 118106543 x IBM022 4002679 IBM.ULT3580-TD5.Drive2 2000495 TAPE_ALERT 268435456 33554944

13:51:10.837 [9340.4780] <2> log_media_error: successfully wrote to error file - 02/28/14 13:51:10 IBM022 5 TAPE_ALERT IBM.ULT3580-TD5.Drive2 0x10000000 0x02000200

13:51:10.837 [9340.4780] <16> process_tapealert: TapeAlert Code: 0x04, Type: Critical, Flag: MEDIA, from drive IBM.ULT3580-TD5.Drive2 (index 5), Media Id IBM022

13:51:10.837 [9340.4780] <8> process_tapealert: TapeAlert Code: 0x27, Type: Warning, Flag: DIAGNOSTICS REQ., from drive IBM.ULT3580-TD5.Drive2 (index 5), Media Id IBM022

13:51:10.837 [9340.4780] <16> process_tapealert: TapeAlert Code: 0x37, Type: Critical, Flag: LOADING FAILURE, from drive IBM.ULT3580-TD5.Drive2 (index 5), Media Id IBM022
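
The TAPE_ALERT entry above pairs the words 0x10000000 0x02000200 with TapeAlert codes 0x04, 0x27 and 0x37. That matches reading the two words as one 64-bit mask in which flag 1 is the most significant bit. The sketch below is built on that assumption (it is inferred from these log lines, not taken from NetBackup documentation); `tapealert_flags` and `SEEN_IN_THREAD` are hypothetical names.

```python
def tapealert_flags(word1, word2):
    """Return the TapeAlert flag numbers set in two 32-bit words.

    Assumption: flag 1 is the most significant bit of the first word and
    flag 64 is the least significant bit of the second word.
    """
    mask = (word1 << 32) | word2
    return [flag for flag in range(1, 65) if mask & (1 << (64 - flag))]

# Names limited to the flags that bptm itself printed in this thread.
SEEN_IN_THREAD = {0x04: "MEDIA", 0x27: "DIAGNOSTICS REQ.", 0x37: "LOADING FAILURE"}

for words in [(0x10000000, 0x02000200), (0x00100000, 0x00000000)]:
    for flag in tapealert_flags(*words):
        name = SEEN_IN_THREAD.get(flag, "(look up in the TapeAlert flag list)")
        print("0x%08X 0x%08X -> 0x%02X %s" % (words[0], words[1], flag, name))
```

Under the same assumption, the 0x00100000 0x00000000 pair logged for IBM047 earlier in the thread decodes to a single flag, 0x0C, whose meaning can then be looked up in the TapeAlert flag list for the drive.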
 
At this point I am curious to see the ...db\media\errors file on this media server as well as the Event Viewer System log...
 
We need to see the source of the Events - hba or tape driver...
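
Once the media errors file is available, a quick tally per drive and per media ID helps show whether the errors follow one drive, one tape, or neither. This is a minimal sketch, assuming the whitespace-separated layout seen in the entries quoted in this thread (date, time, media ID, drive index, error type, drive name, optional TapeAlert words); `SAMPLE` and `tally` are hypothetical names.

```python
from collections import Counter

# Entries copied from this thread; the real file would be read from disk.
SAMPLE = """\
02/28/14 13:33:16 IBM037 7 POSITION_ERROR IBM.ULT3580-TD5.Drive5
02/28/14 13:51:10 IBM022 5 TAPE_ALERT IBM.ULT3580-TD5.Drive2 0x10000000 0x02000200
03/01/14 08:39:29 IBM047 6 TAPE_ALERT IBM.ULT3580-TD5.Drive3 0x00100000 0x00000000
"""

def tally(text):
    """Count errors per (drive, type) and per (media ID, type)."""
    by_drive, by_media = Counter(), Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or unexpected lines
        _date, _time, media_id, _index, error_type, drive = fields[:6]
        by_drive[(drive, error_type)] += 1
        by_media[(media_id, error_type)] += 1
    return by_drive, by_media

by_drive, by_media = tally(SAMPLE)
for key, count in sorted(by_drive.items()):
    print(count, *key)
```

If the counts are spread evenly over many drives and tapes, that points away from a single bad drive or cartridge and toward something shared in the data path.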
 


xao's picture

For each media error, I added the corresponding db\media\errors entry in the xlsx file, but the file itself is also attached here.

How would you like to add event logs?

Log Name:      Application
Source:        NetBackup Tape Manager
Date:          2014-03-01 08:09:17
Event ID:      0
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      x
Description:
TapeAlert Code: 0x37, Type: Critical, Flag: LOADING FAILURE, from drive IBM.ULT3580-TD5.Drive2 (index 5), Media Id IBM069
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="NetBackup Tape Manager" />
    <EventID Qualifiers="0">0</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2014-03-01T07:09:17.000000000Z" />
    <EventRecordID>8797608</EventRecordID>
    <Channel>Application</Channel>
    <Computer>x</Computer>
    <Security />
  </System>
  <EventData>
    <Data>TapeAlert Code: 0x37, Type: Critical, Flag: LOADING FAILURE, from drive IBM.ULT3580-TD5.Drive2 (index 5), Media Id IBM069</Data>
  </EventData>
</Event>
master TLD application event log.JPG
Attachments:
errors.txt (8.82 KB)
Marianne's picture

Just way too many hardware errors.

At this point in time, I cannot see that ALL of those drives can be faulty.
Probably a good idea to log a call with your hardware and server support team so that they can troubleshoot together.

Check every piece in the data path - from server hba up to the library and tape drives (including drivers).

What we can say with certainty is that NBU is merely reporting the errors, not causing them.

 

Please check Event Viewer System log as well.

**** EDIT ****

How would you like to add event logs?

Save them as text files and upload as attachments.
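
If the events are saved in the XML form pasted earlier in this thread, a few lines of Python can pull out the source (Provider), time and message of each event, which makes it easier to tell NetBackup's own events from HBA or tape driver events. This is only a sketch: `EVENT_XML` is a trimmed copy of the event posted above, and `NS` and `summarize` are names made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the pasted event.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event posted earlier in this thread.
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="NetBackup Tape Manager" />
    <TimeCreated SystemTime="2014-03-01T07:09:17.000000000Z" />
    <Channel>Application</Channel>
  </System>
  <EventData>
    <Data>TapeAlert Code: 0x37, Type: Critical, Flag: LOADING FAILURE, from drive IBM.ULT3580-TD5.Drive2 (index 5), Media Id IBM069</Data>
  </EventData>
</Event>"""

def summarize(xml_text):
    """Extract source, UTC time, channel and message from one event."""
    root = ET.fromstring(xml_text)
    return {
        "provider": root.find("e:System/e:Provider", NS).get("Name"),
        "time_utc": root.find("e:System/e:TimeCreated", NS).get("SystemTime"),
        "channel": root.findtext("e:System/e:Channel", namespaces=NS),
        "message": root.findtext("e:EventData/e:Data", namespaces=NS),
    }

info = summarize(EVENT_XML)
print(info["time_utc"], info["channel"], info["provider"])
print(info["message"])
```

Note that SystemTime is in UTC, while the Date line of the same event shows 08:09:17, so the two differ by the server's local offset when lining events up against the bptm log.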


SOLUTION
xao's picture

Thank you all for the replies. I took this problem to IBM and Symantec support.

 

//George

Nagalla's picture

So what is the problem area?

It may be useful for anyone looking for a solution to a similar issue.

xao's picture

When we have pinned down the problem, I will come back with the solution. So far we have found a tape HEADER error, and the next step is to duplicate the data and erase the tapes with this issue.

We are also investigating the NDMP configuration, the drive paths, and the IBM drives.

 

//George

Marianne's picture

When you find the REAL solution, please feel free to unmark my post as solution.

We can then mark your own post as Solution.
