Video Screencast Help

EXIT STATUS 6: the backup failed to back up the requested files

Created: 25 Jan 2013 | 8 comments

All,

I'd like some assistance in resolving an ongoing backup failure I've been encountering.  I have a backup job that runs against a database that is approximately 637GB.  The job runs several times a week and fails every few days, but there is no pattern to when it fails.  Additionally, the job does not always fail.

In use on the client server:  Microsoft SQL Server 2005 - 9.00.4060.00 (X64)

In use on the master server:  NetBackup 7.5.0.4

Client Read Timeout on Master/media server: 900
Client Connect Timeout: 900

I've attached the following logs: dbclient, mssql failure log from master server, bpbrm log from master server, bptm log from master server.

     -The failure occurs at 02:36 in the log from the client and 21:36 in the logs from the master server.

dbclient Log snippet:

02:36:00.119 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:00.447 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:00.728 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:00.931 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:01.150 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:01.291 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:01.541 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:01.728 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:01.978 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:01.994 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:02.291 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:02.572 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:02.822 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:03.025 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:03.150 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:03.166 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:03.556 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:03.775 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:03.853 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:03.947 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:04.010 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:04.150 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:04.166 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:04.650 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:23.995 [5112.2156] <16> writeToServer: ERR - send() to server on socket failed:
02:36:23.995 [5112.2156] <16> dbc_put: ERR - failed sending data to server
02:36:23.995 [5112.2156] <4> closeApi: entering closeApi.
02:36:24.026 [5112.2156] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files
 
02:36:24.026 [5112.2156] <4> closeApi: INF - closing commSock 872
02:36:24.026 [5112.2156] <4> closeApi: INF - closing dataSock 900
02:36:24.026 [5112.2156] <4> closeApi: INF - setting linger on nameSock 884
02:36:24.026 [5112.2156] <4> closeApi: INF - closing nameSock 884
02:36:24.135 [5112.2156] <16> VxBSASendData: ERR - Could not do a bsa_put().
02:36:24.167 [5112.2156] <4> getOwnerName: entering getOwnerName.
02:36:24.167 [5112.2156] <4> dbc_GetServerName: entering dbc_GetServerName.
02:36:24.167 [5112.2156] <4> getServerName: entering getServerName.
02:36:24.167 [5112.2156] <4> getServerName: Read server name from nb_master_config: nb-master-02
02:36:24.167 [5112.2156] <4> dbc_GetClientName: entering dbc_GetClientName.
02:36:24.167 [5112.2156] <4> dbc_GetClientName: Returning NB client name: sapr3d01.bentley.com
 
02:36:24.167 [5112.2156] <4> dbc_RemoteWriteFile: NB-MASTER-02 root 37 logs/mssql_backup_failures/012513.rpt a 1 
 
02:36:24.167 [5112.2156] <2> bprd_connect: Ignoring local_name sapr3d01.bentley.com.
02:36:24.167 [5112.2156] <2> bprd_connect: Ignoring owner_name root.
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1556] file cache has expired NAME=nb-master-02 SVC=veritas_pbx
02:36:24.182 [5112.2156] <2> retry_getaddrinfo_for_real: [vnet_addrinfo.c:1061] getaddrinfo() failed RV=10109 NAME=NULL SVC=veritas_pbx
02:36:24.182 [5112.2156] <2> get_sig_pbx_port: [vnet_addrinfo.c:4416] RegOpenKeyEx failed STAT=2 REGISTRY=Software\VERITAS\VxPBX
02:36:24.182 [5112.2156] <2> retry_getaddrinfo: [vnet_addrinfo.c:884] using IANA SVC=veritas_pbx PORT=1556
02:36:24.182 [5112.2156] <2> retry_getaddrinfo: [vnet_addrinfo.c:910] adjusted service name 1556
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1630] found via getaddrinfo NAME=nb-master-02 SVC=veritas_pbx
02:36:24.182 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1556] file cache has expired NAME=10.251.2.24 SVC=NULL
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1630] found via getaddrinfo NAME=10.251.2.24 SVC=NULL
02:36:24.182 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1556] file cache has expired NAME=nb-master-02 SVC=vnetd
02:36:24.182 [5112.2156] <2> retry_getaddrinfo: [vnet_addrinfo.c:906] using SVC=vnetd PORT=13724
02:36:24.182 [5112.2156] <2> retry_getaddrinfo: [vnet_addrinfo.c:910] adjusted service name 13724
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1630] found via getaddrinfo NAME=nb-master-02 SVC=vnetd
02:36:24.182 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.182 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1556] file cache has expired NAME=nb-master-02 SVC=bprd
02:36:24.182 [5112.2156] <2> retry_getaddrinfo: [vnet_addrinfo.c:906] using SVC=bprd PORT=13720
02:36:24.182 [5112.2156] <2> retry_getaddrinfo: [vnet_addrinfo.c:910] adjusted service name 13720
02:36:24.182 [5112.2156] <2> vnet_cached_getaddrinfo_and_update: [vnet_addrinfo.c:1630] found via getaddrinfo NAME=nb-master-02 SVC=bprd
02:36:24.182 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.182 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.182 [5112.2156] <2> async_connect: [vnet_connect.c:1477] connect in progress 1 0x1
02:36:24.198 [5112.2156] <2> vnet_pbxConnect: pbxConnectEx Succeeded
02:36:24.198 [5112.2156] <2> do_pbx_service: [vnet_connect.c:2116] via PBX bprd CONNECT FROM 10.251.1.161.3368 TO 10.251.2.24.1556 fd = 888
02:36:24.198 [5112.2156] <2> async_connect: [vnet_connect.c:1644] connect async CONNECT FROM 10.251.1.161.3368 TO 10.251.2.24.1556 fd = 888
02:36:24.198 [5112.2156] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM 0.0.0.0 TO nb-master-02 10.251.2.24 bprd VIA pbx
02:36:24.198 [5112.2156] <2> logconnections: BPRD CONNECT FROM 10.251.1.161.3368 TO 10.251.2.24.1556 fd = 888
02:36:24.198 [5112.2156] <2> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:843] Assuming no VxSS for DB Agents 0 0x0
02:36:24.651 [5112.2156] <16> writeToServer: ERR - send() to server on socket failed:
02:36:24.651 [5112.2156] <16> dbc_RemoteWriteFile: ERR - could not write progress status message to the NAME socket
02:36:24.651 [5112.2156] <4> dbc_RemoteWriteFile: INF - RemoteWriteFile status = 0
02:36:24.651 [5112.2156] <16> DBthreads::dbclient: ERR - Error in VxBSASendData: 1.
02:36:24.651 [5112.2156] <4> VxBSALogJobInfo: INF - entering VxBSALOgJobInfo.
02:36:24.651 [5112.2156] <4> getOwnerName: entering getOwnerName.
02:36:24.651 [5112.2156] <4> dbc_GetServerClientConfig: entering dbc_GetServerClientConfig.
02:36:24.667 [5112.2156] <4> dbc_GetServerClientConfig: ServerName: <nb-master-02>, ClientName: <sapr3d01.bentley.com>
02:36:24.667 [5112.2156] <4> VxBSAGetEnv: INF - entering GetEnv - BSA_SERVICE_HOST
02:36:24.667 [5112.2156] <4> VxBSAGetEnv: INF - returning - NB-MASTER-02
02:36:24.667 [5112.2156] <4> VxBSAGetEnv: INF - entering GetEnv - NBBSA_CLIENT_HOST
02:36:24.667 [5112.2156] <4> VxBSAGetEnv: INF - returning - 
02:36:24.667 [5112.2156] <2> bprd_connect: Ignoring local_name sapr3d01.bentley.com.
02:36:24.667 [5112.2156] <2> bprd_connect: Ignoring owner_name SYSTEM.
02:36:24.667 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.667 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.667 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.667 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.667 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.667 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.667 [5112.2156] <2> async_connect: [vnet_connect.c:1477] connect in progress 1 0x1
02:36:24.682 [5112.2156] <2> vnet_pbxConnect: pbxConnectEx Succeeded
02:36:24.682 [5112.2156] <2> do_pbx_service: [vnet_connect.c:2116] via PBX bprd CONNECT FROM 10.251.1.161.3369 TO 10.251.2.24.1556 fd = 908
02:36:24.682 [5112.2156] <2> async_connect: [vnet_connect.c:1644] connect async CONNECT FROM 10.251.1.161.3369 TO 10.251.2.24.1556 fd = 908
02:36:24.682 [5112.2156] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM 0.0.0.0 TO NB-MASTER-02 10.251.2.24 bprd VIA pbx
02:36:24.682 [5112.2156] <2> logconnections: BPRD CONNECT FROM 10.251.1.161.3369 TO 10.251.2.24.1556 fd = 908
02:36:24.682 [5112.2156] <2> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:843] Assuming no VxSS for DB Agents 0 0x0
02:36:24.963 [5112.2156] <1> DBthreads::dbclient:     CONTINUATION: - All backup jobs have completed.
02:36:24.963 [5112.2156] <4> VxBSALogJobInfo: INF - entering VxBSALOgJobInfo.
02:36:24.963 [5112.2156] <4> getOwnerName: entering getOwnerName.
02:36:24.963 [5112.2156] <4> dbc_GetServerClientConfig: entering dbc_GetServerClientConfig.
02:36:24.963 [5112.2156] <4> dbc_GetServerClientConfig: ServerName: <nb-master-02>, ClientName: <sapr3d01.bentley.com>
02:36:24.963 [5112.2156] <4> VxBSAGetEnv: INF - entering GetEnv - BSA_SERVICE_HOST
02:36:24.963 [5112.2156] <4> VxBSAGetEnv: INF - returning - NB-MASTER-02
02:36:24.963 [5112.2156] <4> VxBSAGetEnv: INF - entering GetEnv - NBBSA_CLIENT_HOST
02:36:24.963 [5112.2156] <4> VxBSAGetEnv: INF - returning - 
02:36:24.963 [5112.2156] <2> bprd_connect: Ignoring local_name sapr3d01.bentley.com.
02:36:24.963 [5112.2156] <2> bprd_connect: Ignoring owner_name SYSTEM.
02:36:24.963 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.963 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.963 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.963 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.963 [5112.2156] <2> vnet_sortaddrs: [vnet_addrinfo.c:3945] sorted addrs: 1 0x1
02:36:24.963 [5112.2156] <2> vnet_get_pref_netconnection: [vnet_addrinfo.c:4809] using interface  ANY
02:36:24.963 [5112.2156] <2> async_connect: [vnet_connect.c:1477] connect in progress 1 0x1
02:36:24.979 [5112.2156] <2> vnet_pbxConnect: pbxConnectEx Succeeded
02:36:24.979 [5112.2156] <2> do_pbx_service: [vnet_connect.c:2116] via PBX bprd CONNECT FROM 10.251.1.161.3370 TO 10.251.2.24.1556 fd = 908
02:36:24.979 [5112.2156] <2> async_connect: [vnet_connect.c:1644] connect async CONNECT FROM 10.251.1.161.3370 TO 10.251.2.24.1556 fd = 908
02:36:24.979 [5112.2156] <2> connect_to_service: connect succeeded STATUS (0) SUCCESS FROM 0.0.0.0 TO NB-MASTER-02 10.251.2.24 bprd VIA pbx
02:36:24.979 [5112.2156] <2> logconnections: BPRD CONNECT FROM 10.251.1.161.3370 TO 10.251.2.24.1556 fd = 908
02:36:24.979 [5112.2156] <2> vnet_check_vxss_client_magic_with_info: [vnet_vxss_helper.c:843] Assuming no VxSS for DB Agents 0 0x0
02:36:26.057 [5112.2156] <4> VxBSAEndTxn: INF - entering VxBSAEndTxn.
02:36:26.057 [5112.2156] <4> VxBSAEndTxn: INF - Transaction being ABORTED.
02:36:26.057 [5112.2156] <4> VxBSAGetEnv: INF - entering GetEnv - NBBSA_LOG_DIRECTORY
02:36:26.057 [5112.2156] <4> VxBSAGetEnv: INF - returning - DBCLIENT
02:36:26.057 [5112.2156] <4> VxBSAEndTxn: INF - Cleaning directory: <C:\Program Files\Veritas\NetBackup\Logs\DBCLIENT>
02:36:26.057 [5112.2156] <4> delete_old_files: entering delete_old_files.
02:36:26.057 [5112.2156] <8> close_image: Session being terminated abnormally, cleaning up
02:36:26.057 [5112.2156] <4> closeApi: entering closeApi.
02:36:26.057 [5112.2156] <4> closeApi: INF - EXIT STATUS 6: the backup failed to back up the requested files
 
02:36:26.057 [5112.2156] <4> close_image: INF - backup FAILED
02:36:26.057 [5112.2156] <4> close_image: INF ---- end of Backup ---
 
-----------
 
Thanks in advance!

Comments 8 CommentsJump to latest comment

Nagalla's picture

see the below blog, it might be related

<16> CDBbackrec::FreeDeviceSet(): ERR - Error in VDS->Close: 0x80770004

https://www-secure.symantec.com/connect/forums/exi...

Nicolai's picture

The link point to this thread. I think the clipboard tricked you wink

Assumption is the mother of all mess ups.

If this post answered your'e qustion -  Please mark as a soloution.

Nagalla's picture

hoo.. my bad...(I can not edit my Previous post )sad

 

Thanks Nicolai, i am assuming that i have posted below link.

below is the link that i was referring

https://www-secure.symantec.com/connect/blogs/sock...

Jeff Cici's picture

The blog posting seems to indicate that the problem I am experiencing is related to a timeout problem.  From reading previous posts on the forum I also felt that timeouts were the cause of the backup job failures.  However, even after changing timeout settings from 300 seconds (the default) to 900 seconds, I am still experiencing failures.

Do further adjustments need to be made to timeout settings or has my change and continued failures point to a different problem?

wr's picture

what is going on with this server that makes it so slow?

 

task starts:

22:32:21.257 [5112.2156] <4> bsa_printObjectDescriptor: INF - objectName.pathName: /sapr3d01.MSSQL7.SAPR3D01.db.RD1.~.7.001of001.20130124223011..C
22:32:24.007 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
...

23:48:37.308 [5112.5660] <4> CODBCaccess::LogODBCerr: INF - ODBC return code <2>, SQL State <01000>, SQL Message <3211><[Microsoft][SQL Native Client][SQL Server]10 percent processed.>.
...

01:27:49.115 [5112.5660] <4> CODBCaccess::LogODBCerr: INF - ODBC return code <2>, SQL State <01000>, SQL Message <3211><[Microsoft][SQL Native Client][SQL Server]20 percent processed.>.
...

(never even gets to 30 percent)

task dies:

02:36:04.650 [5112.2156] <4> dbc_put: INF - sending keepalive on NAME Socket
02:36:23.995 [5112.2156] <16> writeToServer: ERR - send() to server on socket failed:
 

 

Will Restore -- where there is a Will there is a way

Marianne's picture

Is there a time difference between client and master/media server?
Are dbclient log on SQL server and bpbrm log from the same date?

Job failed on client at 02:36.

First entry in bpbprm log for client sapr3d01 is the start of backup at  04:37 :

 

04:37:34.763 [13096.18920] <2> logparams: -backup -S extprdbk03 -c sapbwd01 -ct 15 -ru SYSTEM -cl sap-mssql-servers-dev -sched IncrDaily-policy -bt 1358980219 -dt 0 -st 2 -b sapbwd01_1358980219 -mediasvr extprdbk03 -jobid 2336819 -jobgrpid 2336815 -masterversion 750000 -maxfrag 1048576 -bpstart_time 1358980519 -reqid -1358366993 -mt 2 -to 0 -stunit extprdbk03-hcart-robot-tld-0 -rl 1 -rp 1209600 -eari 0 -cj 6 -D 6 -rt 8 -rn 0 -pool SAP-Weekly -mpx 8 -use_ofb -use_otm -jm -secure 1 -brmchild 0 0 1 0 544 START BACKUP -c sapr3d01 -ct 15 -ru SYSTEM -cl sap-mssql-servers-dev -sched IncrDaily-policy

We need to see a set of logs for the same date.

We normally make timeouts on media servers 1800 for large database clients.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Jeff Cici's picture

 

Hi Marianne,

Thanks for the response.  I should have specified in my post that our master server runs on Eastern Time and our client runs on UTC.  So the time difference is 5 hours between the two, with UTC being 5 hours ahead.  I merged the dbclient logs from 1/24 and 1/25 because UTC runs into the next day.  The logs I posted are a direct comparison, the time difference just needs to be accounted for.

The job started on the master server at 17:30ET and failed at 21:36ET.  On the client it started at 22:30UTC and failed at 02:36UTC

I hope this helps in examining the logs.

Jeff Cici's picture

Hi Marianne,

Was my response a help in taking a look at the logs?  

Our Client Read Timeout setting for the server in question is currently set at 900 and I believe that the master server is set the same.  This seems to have reduced the amount of failures we receive, but hasn't eliminated them.

I'd like to read your input prior to bumping the settings to 1800.

Thanks in advance.