Media write error, exit status 84
Hi,
I have a new NetBackup 7.5.0.4 installation. The master and media servers run Oracle Enterprise Linux 6.3, and the library is a StorageTek SL150 with two LTO5 FC drives.
The client is Solaris 10 with Oracle.
We tried the RMAN backup several times and it always exits with status 84 (media write error). Other backups usually complete successfully and only sometimes exit with 84, but this RMAN backup always fails with 84 after successfully writing 300-400 GB, and always at the point where it starts backing up a new file, i.e. starts a new session.
In bptm.log:
cannot write image to media id GAS066, drive index 1, Device or resource busy
In dbclient.log:
11:20:35.430 [9299] <2> xbsa_SetEnv: INF - entering
11:20:35.430 [9299] <4> VxBSASetEnv: INF - entering SetEnv - NBBSA_CLIENT_READ_TIMEOUT
11:20:35.430 [9299] <4> VxBSAGetEnv: INF - entering GetEnv - NBBSA_CLIENT_READ_TIMEOUT
11:20:35.430 [9299] <4> VxBSAGetEnv: INF - returning - 10800
11:20:35.430 [9299] <4> dbc_SetClientReadTimeout: INF - sending client read timeout
11:20:35.430 [9299] <2> xbsa_SetEnv: INF - leaving (0)
11:20:35.430 [9299] <2> int_StartJob: INF - leaving
11:20:35.430 [9299] <2> sbtbackup: INF - leaving
11:20:35.457 [9299] <2> int_WriteData: INF - writing buffer # 1 of size 262144
11:20:35.457 [9299] <4> setSockSize: INF - sock size is set to: 65536
11:20:35.655 [9299] <16> writeToServer: ERR - send() to server on socket failed: Broken pipe (32)
11:20:35.655 [9299] <16> dbc_put: ERR - failed sending data to server
11:20:35.655 [9299] <4> closeApi: entering closeApi.
11:20:35.655 [9299] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files
11:20:35.655 [9299] <4> closeApi: INF - closing commSock 18
11:20:35.655 [9299] <4> closeApi: INF - close of commSock returned <0> errno is <32>
11:20:35.655 [9299] <4> closeApi: INF - closing dataSock 21
11:20:35.655 [9299] <4> closeApi: INF - close of dataSock returned <0> errno is <32>
11:20:35.655 [9299] <4> closeApi: INF - setting linger on nameSock 20
11:20:35.655 [9299] <4> closeApi: INF - closing nameSock 20
11:20:35.695 [9299] <4> closeApi: INF - close of nameSock returned <0> errno is <32>
11:20:35.695 [9299] <16> VxBSASendData: ERR - Could not do a bsa_put().
11:20:35.695 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.695 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.695 [9299] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.695 [9299] <2> sbterror: INF - entering
11:20:35.695 [9299] <2> sbterror: INF - Error=7501: VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from
the serve.
11:20:35.695 [9299] <2> sbterror: INF - leaving
11:20:35.696 [9299] <16> dbc_put: ERR - invalid handle received from the application
11:20:35.696 [9299] <4> closeApi: entering closeApi.
11:20:35.696 [9299] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files
11:20:35.696 [9299] <16> VxBSASendData: ERR - Could not do a bsa_put().
11:20:35.696 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.696 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.696 [9299] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.696 [9299] <2> sbterror: INF - entering
11:20:35.696 [9299] <2> sbterror: INF - Error=7501: VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from
the serve.
11:20:35.696 [9299] <2> sbterror: INF - leaving
11:20:35.719 [9299] <16> dbc_put: ERR - invalid handle received from the application
11:20:35.719 [9299] <4> closeApi: entering closeApi.
11:20:35.719 [9299] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files
11:20:35.719 [9299] <16> VxBSASendData: ERR - Could not do a bsa_put().
11:20:35.719 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.719 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.719 [9299] <16> xbsa_SendData: ERR - VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.720 [9299] <2> sbterror: INF - entering
11:20:35.720 [9299] <2> sbterror: INF - Error=7501: VxBSASendData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from
the serve.
11:20:35.720 [9299] <2> sbterror: INF - leaving
11:20:35.720 [9299] <2> sbtclose2: INF - entering
11:20:35.720 [9299] <2> int_CloseImage: INF - entering
11:20:35.720 [9299] <2> int_CloseImage: INF - Backup - closing <BSS_1ro1c93m_8251>
11:20:35.720 [9299] <2> xbsa_EndData: INF - entering
11:20:35.720 [9299] <4> VxBSAEndData: INF - entering EndData.
11:20:35.720 [9299] <4> finishTarImage: INF - FractionalObjectBytes: 0
11:20:35.720 [9299] <4> finishTarImage: INF - writing LF_END_U_LEN_FILE record
11:20:35.720 [9299] <4> write_LF_END_tarHeader: entering write_LF_END_tarHeader.
11:20:35.720 [9299] <16> writeToServer: ERR - send() to server on socket failed: Bad file number (9)
11:20:35.720 [9299] <16> write_LF_END_tarHeader: ERR - failed writing LF_END_U_LEN_FILE record on DATA socket
11:20:35.720 [9299] <16> finishTarImage: ERR - write_LF_END_tarHeader() failed.
11:20:35.720 [9299] <16> VxBSAEndData: ERR - EndData unable to bsa_finishTarImage().
11:20:35.720 [9299] <2> xbsa_ProcessError: INF - entering
11:20:35.720 [9299] <2> xbsa_ProcessError: INF - leaving
11:20:35.720 [9299] <16> xbsa_EndData: ERR - VxBSAEndData: Failed with error:
Server Status: Communication with the server has not been initiated or the server status has not been retrieved from
the serve
11:20:35.720 [9299] <2> xbsa_EndData: INF - leaving (3)
11:20:35.720 [9299] <16> int_CloseImage: ERR - Failed to process backup file <BSS_1ro1c93m_8251>
11:20:35.720 [9299] <2> xbsa_EndTransaction: INF - entering
11:20:35.720 [9299] <4> VxBSAEndTxn: INF - entering VxBSAEndTxn.
11:20:35.720 [9299] <4> VxBSAEndTxn: INF - Transaction being ABORTED.
11:20:35.720 [9299] <4> VxBSAGetEnv: INF - entering GetEnv - NBBSA_LOG_DIRECTORY
11:20:35.720 [9299] <4> VxBSAGetEnv: INF - returning - dbclient
11:20:35.720 [9299] <4> VxBSAEndTxn: INF - Cleaning directory: </usr/openv/netbackup/logs/dbclient>
11:20:35.720 [9299] <4> delete_old_files: entering delete_old_files.
11:20:35.720 [9299] <8> close_image: Session being terminated abnormally, cleaning up
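The excerpt above is long, and the first failure (the broken pipe at 11:20:35.655) is easy to miss among the INF entries. A minimal sketch for pulling out only the entries tagged <16>, which are the ones marked ERR in this excerpt. The line layout in the regular expression is an assumption inferred from the lines shown here, not a documented format, and the function name is made up for illustration.

```python
import re

# Assumed layout, inferred from the excerpt above:
#   HH:MM:SS.mmm [pid] <severity> function: message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\] "
    r"<(?P<sev>\d+)> (?P<func>\w+): (?P<msg>.*)$"
)

def error_lines(lines, min_severity=16):
    """Yield (time, func, msg) for entries at or above min_severity.

    Continuation lines without a timestamp (e.g. "the serve") are skipped.
    """
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and int(m.group("sev")) >= min_severity:
            yield m.group("time"), m.group("func"), m.group("msg")

# A few lines copied from the dbclient log above.
sample = [
    "11:20:35.457 [9299] <4> setSockSize: INF - sock size is set to: 65536",
    "11:20:35.655 [9299] <16> writeToServer: ERR - send() to server on socket failed: Broken pipe (32)",
    "11:20:35.655 [9299] <16> dbc_put: ERR - failed sending data to server",
    "the serve",
]

first_errors = list(error_lines(sample))
for time, func, msg in first_errors:
    print(time, func, msg)
```

The first entry it returns is the one worth reading; the later errors in the log are consequences of the socket already being closed.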
I tried a backup to disk yesterday, and it completed successfully.
What is the problem?
Thanks
Gabor
We need full bptm media server log from failed backup.
Please post log as file attachment.
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links
Hi,
bptm.log from today.
thanks
Gabor
I can in all honesty say that I've never seen this before...
Do you have a single master/media server?
Or other media servers as well sharing devices?
I cannot say if this is NBU resource broker error or device error returned by the OS.
Please add VERBOSE entry to /usr/openv/volmgr/vm.conf and restart NetBackup.
Device errors will now be logged to /var/log/messages.
When error is seen again, please collect NBU resource broker info:
nbrbutil -dump >/tmp/nbrb.txt
Also check /var/log/messages for device errors.
Post nbrb.txt as well as messages file.
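The VERBOSE step can be made repeatable so it is applied the same way on both media servers. A minimal sketch, assuming that VERBOSE is a bare keyword on its own line in vm.conf; the helper name and the demo host name are hypothetical. The demo writes to a scratch file, not to /usr/openv/volmgr/vm.conf, and NetBackup still has to be restarted afterwards as described above.

```python
import os
import tempfile

def ensure_verbose(vm_conf_path):
    """Append a VERBOSE line to vm.conf unless one is already present.

    Returns True if the file was changed, False if it was left alone.
    """
    try:
        with open(vm_conf_path) as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        lines = []
    if any(line.strip() == "VERBOSE" for line in lines):
        return False
    lines.append("VERBOSE")
    with open(vm_conf_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return True

# Demo against a scratch copy; point the path at the real vm.conf only
# after checking the result here.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "vm.conf")
with open(demo_path, "w") as f:
    f.write("MM_SERVER_NAME = mediasrv1\n")  # hypothetical existing entry

first = ensure_verbose(demo_path)   # adds the entry
second = ensure_verbose(demo_path)  # finds it already there
print(first, second)
```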
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
We have a master server and two media servers with the same configuration, each with its own SL150 tape library.
We tried this RMAN backup on both media servers, and the result is the same (exit status 84) on both.
I doubt both SL150s are faulty with the same error; maybe it is an OS or NetBackup setting on the media servers.
I am working on collecting the requested output.
I was wondering if master and media servers were sharing the same devices - all devices zoned to all servers and using NBU SSO license to configure devices as shared?
Do you mean that each media server has a dedicated robot and tape drives?
Nothing shared?
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Yes, the two media servers each have a dedicated SL150; the master server is only a master server.
Nothing is shared. Every client backs up via LAN.
nbrbutil -dump output and messages file from the media server.
We also tried connecting another tape library (SL24) to the media server; the result is the same (media write error).
Seems your issue is on the server side, not with devices.
We see messages file start and end with device-related errors:
Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Hi,
I found the solution to the problem, I hope.
I reduced NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS, and the RMAN backup succeeded.
SIZE_DATA_BUFFERS changed from 1048576 to 266122
NUMBER_DATA_BUFFERS changed from 256 to 16
I will experiment with buffer sizes and counts to find the best performance without any issues.
Thanks for the help
Gabor
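For anyone repeating the buffer experiment, a rough sketch of how the two settings compare. It assumes the usual tuning rule of thumb that the shared memory used per media server is SIZE_DATA_BUFFERS x NUMBER_DATA_BUFFERS x drives x multiplexing, and that SIZE_DATA_BUFFERS should be a multiple of 1024; both are assumptions to check against the NetBackup tuning documentation for your version. The function names are made up for illustration. Note that 266122 as posted is not a multiple of 1024, while the dbclient log above shows a buffer of size 262144, so the posted value may be a typo.

```python
def buffer_footprint(size_data_buffers, number_data_buffers, drives=1, mpx=1):
    """Approximate shared memory in bytes (assumed rule of thumb):
    size * number * drives * multiplexing factor."""
    return size_data_buffers * number_data_buffers * drives * mpx

def is_multiple_of_1024(size_data_buffers):
    """Check the (assumed) requirement that the buffer size is a multiple of 1024."""
    return size_data_buffers % 1024 == 0

# Values from the post above; two drives per SL150, no multiplexing assumed.
old = buffer_footprint(1048576, 256, drives=2)
new = buffer_footprint(266122, 16, drives=2)

print("old footprint (bytes):", old)
print("new footprint (bytes):", new)
print("266122 multiple of 1024:", is_multiple_of_1024(266122))
print("262144 multiple of 1024:", is_multiple_of_1024(262144))
```

Changing one setting at a time, and rerunning the same RMAN backup after each change, makes it easier to tell which of the two values actually triggers the status 84.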
Make sure to patch the SL150 to the latest firmware, as the early firmware levels had issues that downed the drives a lot.
See the Oracle docs on this.
thanks