
Media write error, exit status 84

Created: 07 Feb 2013 | 11 comments

Hi,

I have a new install of NetBackup 7.5.0.4. The master and media servers run Oracle Enterprise Linux 6.3, and the library is a StorageTek SL150 with two LTO-5 FC drives.

The client is Solaris 10 with Oracle.

We tried the RMAN backup several times and it always exits with error 84 (media write error). Other backups usually complete successfully, though they sometimes exit with 84 as well. This RMAN backup always fails with 84 after successfully writing 300-400 GB, and always at the point where it starts backing up a new file (a new session).

In bptm.log:

cannot write image to media id GAS066, drive index 1, Device or resource busy

in dbclient.log

11:20:35.430 [9299] <2> xbsa_SetEnv: INF - entering
11:20:35.430 [9299] <4> VxBSASetEnv: INF - entering SetEnv - NBBSA_CLIENT_READ_TIMEOUT
11:20:35.430 [9299] <4> VxBSAGetEnv: INF - entering GetEnv - NBBSA_CLIENT_READ_TIMEOUT
11:20:35.430 [9299] <4> VxBSAGetEnv: INF - returning - 10800
11:20:35.430 [9299] <4> dbc_SetClientReadTimeout: INF - sending client read timeout
11:20:35.430 [9299] <2> xbsa_SetEnv: INF - leaving (0)

11:20:35.430 [9299] <2> int_StartJob: INF - leaving
11:20:35.430 [9299] <2> sbtbackup: INF - leaving
11:20:35.457 [9299] <2> int_WriteData: INF - writing buffer # 1 of size 262144
11:20:35.457 [9299] <4> setSockSize: INF - sock size is set to: 65536
11:20:35.655 [9299] <16> writeToServer: ERR - send() to server on socket failed: Broken pipe (32)
11:20:35.655 [9299] <16> dbc_put: ERR - failed sending data to server
11:20:35.655 [9299] <4> closeApi: entering closeApi.
11:20:35.655 [9299] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files

11:20:35.655 [9299] <4> closeApi: INF - closing commSock 18
11:20:35.655 [9299] <4> closeApi: INF - close of commSock returned <0> errno is <32>
11:20:35.655 [9299] <4> closeApi: INF - closing dataSock 21
11:20:35.655 [9299] <4> closeApi: INF - close of dataSock returned <0> errno is <32>
11:20:35.655 [9299] <4> closeApi: INF - setting linger on nameSock 20
11:20:35.655 [9299] <4> closeApi: INF - closing nameSock 20
11:20:35.695 [9299] <4> closeApi: INF - close of nameSock returned <0> errno is <32>
11:20:35.695 [9299] <16> VxBSASendData: ERR - Could not do a bsa_put().
11:20:35.695 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.695 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.695 [9299] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.695 [9299] <2> sbterror: INF - entering
11:20:35.695 [9299] <2> sbterror: INF - Error=7501: VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from
the serve.
11:20:35.695 [9299] <2> sbterror: INF - leaving
11:20:35.696 [9299] <16> dbc_put: ERR - invalid handle received from the application
11:20:35.696 [9299] <4> closeApi: entering closeApi.
11:20:35.696 [9299] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files

11:20:35.696 [9299] <16> VxBSASendData: ERR - Could not do a bsa_put().
11:20:35.696 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.696 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.696 [9299] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.696 [9299] <2> sbterror: INF - entering
11:20:35.696 [9299] <2> sbterror: INF - Error=7501: VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from
the serve.
11:20:35.696 [9299] <2> sbterror: INF - leaving
11:20:35.719 [9299] <16> dbc_put: ERR - invalid handle received from the application
11:20:35.719 [9299] <4> closeApi: entering closeApi.
11:20:35.719 [9299] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files

11:20:35.719 [9299] <16> VxBSASendData: ERR - Could not do a bsa_put().
11:20:35.719 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.719 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.719 [9299] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.720 [9299] <2> sbterror: INF - entering
11:20:35.720 [9299] <2> sbterror: INF - Error=7501: VxBSASendData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from
the serve.
11:20:35.720 [9299] <2> sbterror: INF - leaving
11:20:35.720 [9299] <2> sbtclose2: INF - entering
11:20:35.720 [9299] <2> int_CloseImage: INF - entering
11:20:35.720 [9299] <2> int_CloseImage: INF - Backup - closing <BSS_1ro1c93m_8251>
11:20:35.720 [9299] <2> xbsa_EndData: INF - entering
11:20:35.720 [9299] <4> VxBSAEndData: INF - entering EndData.
11:20:35.720 [9299] <4> finishTarImage: INF - FractionalObjectBytes: 0
11:20:35.720 [9299] <4> finishTarImage: INF - writing LF_END_U_LEN_FILE record
11:20:35.720 [9299] <4> write_LF_END_tarHeader: entering write_LF_END_tarHeader.
11:20:35.720 [9299] <16> writeToServer: ERR - send() to server on socket failed: Bad file number (9)
11:20:35.720 [9299] <16> write_LF_END_tarHeader: ERR - failed writing LF_END_U_LEN_FILE record on DATA socket
11:20:35.720 [9299] <16> finishTarImage: ERR - write_LF_END_tarHeader() failed.
11:20:35.720 [9299] <16> VxBSAEndData: ERR - EndData unable to bsa_finishTarImage().
11:20:35.720 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.720 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.720 [9299] <16> xbsa_EndData: ERR - VxBSAEndData: Failed with error:
   Server Status:  Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.720 [9299] <2> xbsa_EndData: INF - leaving (3)

11:20:35.720 [9299] <16> int_CloseImage: ERR - Failed to process backup file <BSS_1ro1c93m_8251>

11:20:35.720 [9299] <2> xbsa_EndTransaction: INF - entering
11:20:35.720 [9299] <4> VxBSAEndTxn: INF - entering VxBSAEndTxn.
11:20:35.720 [9299] <4> VxBSAEndTxn: INF - Transaction being ABORTED.
11:20:35.720 [9299] <4> VxBSAGetEnv: INF - entering GetEnv - NBBSA_LOG_DIRECTORY
11:20:35.720 [9299] <4> VxBSAGetEnv: INF - returning - dbclient
11:20:35.720 [9299] <4> VxBSAEndTxn: INF - Cleaning directory: </usr/openv/netbackup/logs/dbclient>
11:20:35.720 [9299] <4> delete_old_files: entering delete_old_files.
11:20:35.720 [9299] <8> close_image: Session being terminated abnormally, cleaning up

 

I tried a disk backup yesterday, and it completed successfully.

What is the problem?

Thanks

Gabor
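
For readers skimming the dbclient.log excerpt in the post above: in NetBackup legacy logs the number in angle brackets is a severity, and the <16> lines are the errors. A minimal sketch for pulling only those lines out of a log file follows. The function name nbu_errors is made up for illustration, and the log path is whatever file you pass in.

```shell
# Print only error (<16>) and critical (<32>) lines from a NetBackup
# legacy log such as dbclient or bptm. "nbu_errors" is a hypothetical
# helper name, not a NetBackup command.
nbu_errors() {
    grep -E '<(16|32)> ' "$1"
}

# Example (path is an assumption, adjust to your log file):
#   nbu_errors /usr/openv/netbackup/logs/dbclient/<logfile>
```

In the excerpt above this would leave the writeToServer "Broken pipe (32)" line as the first error, which suggests the client was reacting to the server side closing the connection.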

 


Comments (11)

Marianne:

We need the full bptm log from the media server for the failed backup.

Please post the log as a file attachment.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

csuri:

Hi,

bptm.log from today.

thanks

Gabor

Attachment: bptm.log_.zip (28.04 KB)

Marianne:
11:15:03.337 [30317] <16> write_data: cannot write image to media id GAS006, drive index 0, Device or resource busy

I can in all honesty say that I've never seen this before...

Do you have a single master/media server?
Or other media servers as well sharing devices?

I cannot say whether this is an NBU resource broker error or a device error returned by the OS.

Please add a VERBOSE entry to /usr/openv/volmgr/vm.conf and restart NetBackup.
Device errors will then be logged to /var/log/messages.

When the error is seen again, please collect NBU resource broker info:
nbrbutil -dump >/tmp/nbrb.txt

Also check /var/log/messages for device errors.

Post nbrb.txt as well as messages file.
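
The steps above can be sketched as a small shell helper. This is only a sketch: the function name enable_vm_verbose is hypothetical, and it assumes vm.conf is a plain text file with one entry per line. The default path is the one given above; passing a different path lets you try it on a copy first.

```shell
# Append VERBOSE to vm.conf only if the entry is not already present.
# "enable_vm_verbose" is a made-up helper name, not a NetBackup tool.
enable_vm_verbose() {
    conf="${1:-/usr/openv/volmgr/vm.conf}"
    if ! grep -qx 'VERBOSE' "$conf" 2>/dev/null; then
        echo 'VERBOSE' >> "$conf"
    fi
}

# After running it, restart NetBackup on the media server. When the
# error is seen again, collect the data requested above:
#   nbrbutil -dump > /tmp/nbrb.txt
#   cp /var/log/messages /tmp/messages.copy
```

Running the helper twice leaves a single VERBOSE line, so it is safe to repeat.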


csuri:

We have a master server and two media servers with the same configuration, each with its own SL150 tape library.
We tried this RMAN backup on both media servers, and the result is the same (exit 84) on both.
I don't think both SL150s can be faulty with the same error, so maybe it is an OS or NetBackup setting on the media servers.

I am working on collecting the requested output.

Marianne:

I was wondering whether the master and media servers share the same devices, i.e. all devices zoned to all servers and configured as shared using the NBU SSO license.

Do you mean that each media server has a dedicated robot and tape drives?
Nothing shared?


csuri:

Yes, the two media servers each have a dedicated SL150, and the master server is a master server only.

Nothing is shared. Every client is backed up via LAN.

csuri:

nbrbutil -dump output and messages file from media server

Attachment: nbulogs.zip (16.66 KB)

csuri:

We tried connecting another tape library (SL24) to the media server, and the result is the same (media write error).

 

Marianne:

Seems your issue is on the server side, not with devices. 

The messages file starts and ends with device-related errors:

 

Feb  4 15:40:50 m-nbu-dd kernel: bfa 0000:05:00.0: Remote port (WWN = 50:01:04:f0:00:cc:ad:70) connectivity lost for logical port (WWN = 10:00:8c:7c:ff:21:92:fe)
Feb  4 15:40:50 m-nbu-dd kernel: bfa 0000:05:00.0: Target (WWN = 50:01:04:f0:00:cc:ad:70) connectivity lost for initiator (WWN = 10:00:8c:7c:ff:21:92:fe)
Feb  4 15:40:50 m-nbu-dd kernel: bfa 0000:05:00.0: Base port (WWN = 10:00:8c:7c:ff:21:92:fe) lost fabric connectivity
Feb  4 15:41:21 m-nbu-dd kernel: rport-3:0-0: blocked FC remote port time out: removing target and saving binding
Feb  4 15:41:21 m-nbu-dd kernel: sg_rq_end_io: device detached
Feb  4 15:41:36 m-nbu-dd avrd[3286]: Fatal open error on HP.ULTRIUM5-SCSI.000 (device 0, /dev/nst1), errno = 2 (No such file or directory), DOWN'ing it
 

 

Feb  7 16:25:18 m-nbu-dd avrd[5511]: Unable to open HP.ULTRIUM5-SCSI.000 (device 0, /dev/nst1) thru sg driver, Cannot allocate memory, DOWN'ing it
 
 
Ask your server support team for assistance....
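
To find every occurrence of the events quoted above rather than only the first and last, a filter along these lines may help. It is a rough sketch: the function name tape_path_events is made up, and the patterns are copied from the bfa (Brocade HBA), sg and avrd lines in this thread, so other drivers will log different text.

```shell
# Show kernel FC path events and avrd drive-down events from a syslog
# file. Patterns come from the lines quoted in this thread; the helper
# name is hypothetical. "DOWN.ing" uses a regex dot for the apostrophe.
tape_path_events() {
    grep -E 'connectivity lost|lost fabric connectivity|blocked FC remote port|device detached|DOWN.ing it' \
        "${1:-/var/log/messages}"
}

# Example:
#   tape_path_events /var/log/messages
```

Comparing the timestamps of these lines with the failure times in bptm.log should show whether each status 84 lines up with a lost path or a downed drive.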
 
 

csuri:

Hi,

I think I have found the solution to the problem.

I changed NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS to smaller values, and the RMAN backup succeeded.

SIZE_DATA_BUFFERS changed from 1048576 to 266122

NUMBER_DATA_BUFFERS changed from 256 to 16

I will experiment with buffer sizes and counts to find the best performance without any issues.

Thanks for the help

Gabor
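
To put the buffer change above in numbers, here is a small arithmetic sketch. It assumes the commonly quoted rule that the shared memory needed per drive is NUMBER_DATA_BUFFERS times SIZE_DATA_BUFFERS times the multiplexing factor, and, as far as I know, the NetBackup tuning documentation asks for SIZE_DATA_BUFFERS to be a multiple of 1024. Both function names are made up for illustration.

```shell
# Bytes of shared memory for one drive:
#   number of buffers * buffer size * multiplexing factor (default 1).
# Hypothetical helper, based on the assumption stated above.
buffer_bytes() {
    number="$1"; size="$2"; mpx="${3:-1}"
    echo $(( number * size * mpx ))
}

# Succeeds if the given buffer size is a multiple of 1024.
is_kib_multiple() {
    [ $(( $1 % 1024 )) -eq 0 ]
}

# Old settings: buffer_bytes 256 1048576   (256 MiB per drive)
# New settings: buffer_bytes 16 266122     (about 4 MiB per drive)
```

Under those assumptions the change reduces the per-drive buffer memory from 268435456 bytes to 4257952 bytes. Note that 266122 is not a multiple of 1024, while 262144 (256 KB) is, so the size quoted above may be a typo and is worth double-checking before tuning further.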

 

 

Efi G:

Make sure to update the SL150 to the latest firmware, as the early firmware levels had issues that downed the drives a lot.

See the Oracle docs on this.

thanks