Video Screencast Help
Symantec to Separate Into Two Focused, Industry-Leading Technology Companies. Learn more.

unable to backup particular folder

Created: 20 Sep 2012 • Updated: 01 Oct 2012 | 15 comments
This issue has been solved. See solution.

we have couple of WAN backups in my environment.I have enabled multi streaming.usually it runs quite long time but completes with in schedule/some times after manual re run.

But from last one week two clients unable to take f drive folders failing with 13read file error.

I have increased time settings that dint help me to fix this issue.i reverted the changes now and tried again ..dint wrk out.

Client A:detail log:recent:

9/19/2012 4:29:50 PM - requesting resource ppme307-hcart2-robot-tld-0
9/19/2012 4:29:50 PM - requesting resource ppme304.NBU_CLIENT.MAXJOBS.fps-pst-01
9/19/2012 4:29:50 PM - requesting resource ppme304.NBU_POLICY.MAXJOBS.win_pst
9/19/2012 4:29:50 PM - Info nbrb(pid=7864518) Limit has been reached for the logical resource ppme304.NBU_CLIENT.MAXJOBS.fps-pst-01    
9/19/2012 4:30:21 PM - awaiting resource ppme307-hcart2-robot-tld-0 - No drives are available
9/19/2012 4:36:02 PM - granted resource ppme304.NBU_CLIENT.MAXJOBS.fps-pst-01
9/19/2012 4:36:02 PM - granted resource ppme304.NBU_POLICY.MAXJOBS.win_pst
9/19/2012 4:36:02 PM - granted resource 0066L5
9/19/2012 4:36:02 PM - granted resource IBM.ULT3580-TD5.001
9/19/2012 4:36:02 PM - granted resource ppme307-hcart2-robot-tld-0
9/19/2012 4:36:02 PM - estimated 99010686 Kbytes needed
9/19/2012 4:36:03 PM - started process bpbrm (23527534)
9/19/2012 4:36:12 PM - mounting 0066L5
9/19/2012 4:36:15 PM - connecting
9/19/2012 4:36:17 PM - connected; connect time: 00:00:02
9/19/2012 4:36:26 PM - Warning bpbrm(pid=4063288) from client fps-pst-01: WRN - can't open file: F:\Spool\FP04824.SHD (WIN32 32: The process cannot access the file because it is being used by another process. )
9/19/2012 4:36:27 PM - Warning bpbrm(pid=4063288) from client fps-pst-01: WRN - can't open file: F:\System Volume Information\SRM\quota.md (WIN32 32: The process cannot access the file because it is being used by another process. )
9/19/2012 4:36:43 PM - mounted; mount time: 00:00:31
9/19/2012 4:36:43 PM - positioning 0066L5 to file 1196
9/19/2012 4:37:34 PM - positioned 0066L5; position time: 00:00:51
9/19/2012 4:37:34 PM - begin writing
9/19/2012 9:19:29 PM - Error bpbrm(pid=4063288) socket read failed: errno = 119 - System call timed out  
9/19/2012 9:19:32 PM - end writing; write time: 04:41:58
file read failed(13)
9/20/2012 10:34:05 AM - job 52868 was restarted as job 53002
 

Client B Activty log:

 

 

9/19/2012 10:54:01 PM - requesting resource ppme307-hcart2-robot-tld-0

9/19/2012 10:54:01 PM - requesting resource ppme304.NBU_CLIENT.MAXJOBS.inf-pst-01

9/19/2012 10:54:01 PM - requesting resource ppme304.NBU_POLICY.MAXJOBS.win_pst

9/19/2012 10:54:01 PM - granted resource ppme304.NBU_CLIENT.MAXJOBS.inf-pst-01

9/19/2012 10:54:01 PM - granted resource ppme304.NBU_POLICY.MAXJOBS.win_pst

9/19/2012 10:54:01 PM - granted resource 0066L5

9/19/2012 10:54:01 PM - granted resource IBM.ULT3580-TD5.002

9/19/2012 10:54:01 PM - granted resource ppme307-hcart2-robot-tld-0

9/19/2012 10:54:01 PM - estimated 217546741 Kbytes needed

9/19/2012 10:54:02 PM - started process bpbrm (10420398)

9/19/2012 10:54:07 PM - mounting 0066L5

9/19/2012 10:54:07 PM - connecting

9/19/2012 10:54:07 PM - connected; connect time: 00:00:00

9/19/2012 10:54:40 PM - mounted; mount time: 00:00:33

9/19/2012 10:54:40 PM - positioning 0066L5 to file 1277

9/19/2012 10:55:40 PM - positioned 0066L5; position time: 00:01:00

9/19/2012 10:55:40 PM - begin writing

9/20/2012 2:14:59 AM - Error bpbrm(pid=23068702) socket read failed: errno = 119 - System call timed out 

9/20/2012 2:21:33 AM - Error bptm(pid=12451988) read of TIR file failed: the file list is incomplete  

9/20/2012 2:21:33 AM - end writing; write time: 03:25:53

file read failed(13)

9/20/2012 10:34:46 AM - job 52882 was restarted as job 53003

Comments 15 CommentsJump to latest comment

Mouse's picture

Here is an in-depth troubleshooting guide for exit code 13: 

http://www.symantec.com/docs/TECH43240

As there are many 10053/10054 errors in logs that indicate something wrong is happening on network layer with Windows TCP/IP connectivity.

According to the log, NBU version on the client is very old (6.0GA), in case of WAN I would suggest to consider the new feature of NBU 7.5 that could become very handy in error-prone WAN environment (WAN Resiliency), check this whitepaper: http://www.symantec.com/business/support/resources/sites/BUSINESS/content/live/TECHNICAL_SOLUTION/183000/TECH183552/en_US/About_WAN_Resiliency.pdf

Marianne's picture

You will need to increase logging level on the client to see what happened between these to timestamps in bpbkar log.

 

2:53:22.896 PM: [90172.86072] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
3:15:17.663 PM: [90172.90484] <16> dtcp_write: TCP - failure: send socket (1848) (TCP 10054: Connection reset by peer)

bpbrm simply timed out because no data was received from the client within the timeout.

You can also use bpbkar test on the client to test local read of F-drive on the client.
See: http://www.symantec.com/docs/TECH17541

 

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

sri vani's picture

Thank you Mouse...yep client version is 6.0..

very soon we will change this to 7.x..as per our company policy we need to keep one successful full backup..once it is fixed we will defiantely change the client version ..Thq again

sri vani's picture

Hi Marianne,

Thq for ur help...

Plz see the below bpbkar log and also am unable to run the bpbkar32 test,kindly help me here..

12:15:03.596 AM: [90764.88816] <2> tar_backup::V_SetupProcessContinue: TAR - CONTINUE BACKUP received
12:15:03.596 AM: [90764.88816] <2> tar_backup::V_SetupFileDirectives: TAR - backup filename = /F/
12:15:03.596 AM: [90764.90288] <4> tar_base::V_KeepaliveThread: INF - The Keepalive thread is active. Keepalive interval 60 Seconds
12:15:04.674 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - BEDS_Init() Enter InitFlags:0x0
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: C:
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: F:
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: G:
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: Microsoft Terminal Services
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: Microsoft Windows Network
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: RemotelyAnywhere Virtual Disk Network
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: Web Client Network
12:15:04.705 AM: [90764.88816] <2> ov_log::V_GlobalLog: INF - DumpDleInfo() DLE Device Name: Shadow?Copy?Components
12:15:06.690 AM: [90764.88816] <4> backup_create: INF - NetBackup Temp Directory: 'C:\Program Files\VERITAS\\NetBackup\Temp'
12:15:06.752 AM: [90764.88816] <4> folder_access::V_OpenForRead: INF - obtained disk quota data: F:
12:15:07.846 AM: [90764.88816] <16> file_access::V_OpenForRead: ERR - CreateFile() failed: \\?\F:\System Volume Information\SRM\quota.md (WIN32 32: The process cannot access the file because it is being used by another process. )
12:15:07.846 AM: [90764.88816] <2> tar_base::V_vTarMsgW: WRN - can't open file: F:\System Volume Information\SRM\quota.md (WIN32 32: The process cannot access the file because it is being used by another process. )
12:30:36.983 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
12:45:06.900 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
1:00:08.771 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
1:15:07.673 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
1:15:07.673 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
1:30:07.107 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
1:45:07.837 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
2:00:41.130 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
2:04:24.570 AM: [90764.88816] <2> tar_base::V_vTarMsgW: ERR - failure reading file: F:\u1\S9019795\Exchange\mailbox.pst (WIN32 33: The process cannot access the file because another process has locked a portion of the file. )
2:14:23.196 AM: [90764.88816] <2> tar_base::V_vTarMsgW: ERR - failure reading file: F:\u1\S9021095\Exchange\mailbox.pst (WIN32 33: The process cannot access the file because another process has locked a portion of the file. )
2:15:07.951 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
2:15:43.958 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
2:30:07.049 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
2:45:11.728 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
3:00:09.459 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
3:15:08.814 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
3:15:17.845 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
3:16:49.612 AM: [90764.88816] <2> tar_base::V_vTarMsgW: ERR - failure reading file: F:\u1\S9021304\Exchange\archive.pst (WIN32 33: The process cannot access the file because another process has locked a portion of the file. )
3:16:49.612 AM: [90764.88816] <2> tar_base::V_vTarMsgW: ERR - failure reading file: F:\u1\S9021304\Exchange\mailbox.pst (WIN32 33: The process cannot access the file because another process has locked a portion of the file. )
3:30:06.794 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
3:57:21.440 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
3:57:21.440 AM: [90764.88816] <2> tar_base::V_vTarMsgW: ERR - failure reading file: F:\u1\s1700772\Exchange\archive.pst (WIN32 33: The process cannot access the file because another process has locked a portion of the file. )
4:15:09.188 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
5:15:09.562 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
6:15:11.030 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
7:15:12.076 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
8:15:12.544 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
8:25:33.709 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
8:25:33.724 AM: [90764.88816] <16> file_access::V_OpenForRead: ERR - CreateFile() failed: \\?\F:\u1\s1700772\Exchange\~archive.pst.tmp (WIN32 2: The system cannot find the file specified. )
8:25:33.724 AM: [90764.88816] <16> file_access::V_OpenForRead: ERR - CreateFile() failed: \\?\F:\u1\s1700772\Exchange\~mailbox.pst.tmp (WIN32 2: The system cannot find the file specified. )
8:45:46.365 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
9:00:23.657 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
9:15:07.215 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
9:15:14.153 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
10:15:15.043 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
10:32:20.775 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
10:45:10.003 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
11:00:15.265 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
11:15:15.188 AM: [90764.88816] <4> ov_log::OVLoop: Timestamp
11:35:51.200 AM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
11:35:51.200 AM: [90764.88816] <2> tar_base::V_vTarMsgW: ERR - failure reading file: F:\u1\s9004732\Exchange\mailbox.pst (WIN32 33: The process cannot access the file because another process has locked a portion of the file. )
12:14:03.100 PM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
12:15:17.976 PM: [90764.88816] <4> ov_log::OVLoop: Timestamp
1:05:52.155 PM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
1:15:13.850 PM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
1:15:18.459 PM: [90764.88816] <4> ov_log::OVLoop: Timestamp
1:30:31.127 PM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
1:45:09.826 PM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
2:00:07.150 PM: [90764.88816] <4> tar_backup::backup_send_chkp_data_state: INF - 'CPR_MSG_PROCESSED' received
2:15:18.771 PM: [90764.88816] <4> ov_log::OVLoop: Timestamp
3:01:35.338 PM: [90764.88816] <16> dtcp_write: TCP - failure: send socket (1848) (TCP 10054: Connection reset by peer)
3:01:35.338 PM: [90764.88816] <16> dtcp_write: TCP - failure: attempted to send 1 bytes
3:01:42.760 PM: [90764.88816] <16> dtcp_write: TCP - failure: send socket (1848) (TCP 10054: Connection reset by peer)
3:01:42.760 PM: [90764.88816] <16> dtcp_write: TCP - failure: attempted to send 1 bytes
3:01:47.213 PM: [90764.88816] <16> dtcp_write: TCP - failure: send socket (1848) (TCP 10054: Connection reset by peer)
3:01:47.213 PM: [90764.88816] <16> dtcp_write: TCP - failure: attempted to send 1 bytes
 

badri's picture

check if the folder large number for small files inside the folder

Marianne's picture

Please help me understand what this means?

.... unable to run the bpbkar32 test ....

The example in the TN even uses F:\ as example...  
bpbkar32 -nocont F:\ 1> nul 2> nul
 

Open cmd. cd to <install_path>\veritas\netbackup\bin\, then copy & paste the command.

Please check Host Properties for this client and ensure that logging is set to 1 at minimum before you start the test. Also rename current log to ensure a new log is created.

The command will simply return to the prompt, while backup to 'nul' is running in the background.
You can use Task manager to see bpbkar32 in the process table. Keep an eye on Task Manager until bpbkar32 is done. It will take quite a while, judging by the bpbkar log above.

.Please post log as File attachment.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

sri vani's picture

Hi Marianne,

Thq for your guidance..

Please find the log file attached below

AttachmentSize
bpbkar.doc 279.11 KB
Marianne's picture

This log shows failure for a normal scheduled backup (policy win_pst, schedule weekly_full ) with logging level set to 0.

Please follow steps in my previous post to perform bpbkar test on the client?

Let me re-arrange the steps:

Please check Host Properties for this client and ensure that logging is set to 1 at minimum before you start the test. Also rename current (bpbkar) log to ensure a new log is created.

Open cmd. cd to <install_path>\veritas\netbackup\bin\, then copy & paste the command.

bpbkar32 -nocont F:\ 1> nul 2> nul

Keep an eye on Task Manager until bpbkar32 is done.

Please post log as File attachment.

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

sri vani's picture

thx for the detailed steps...

Plz find the recent log

 

Rarely incremental backups are completing on these machines but full backups on F drive - definate failure

AttachmentSize
bpbkar.doc 31.84 MB
Marianne's picture

All that the log tells us is what you confirmed - Incremental backup was successful (status 1).

You need to check bpbkar log after failed full backup to find out where backup process is getting stuck...

 

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

sri vani's picture

Hello Marianne ,

Sry for the late..

Kindly check the below log...

AttachmentSize
bpbkar 092612.doc 97.82 MB
Marianne's picture

File is too big to download using my private usb modem... Please save as .txt, zip the file and then upload?

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

Marianne's picture

All we can see in the log is a break in network connection:

 4:51:54.533 PM: [908.5244] <2> tar_base::V_vTarMsgM: DIR - 316 132540 45811565 31 16832 root root 0 1348644992 1315517488 1315517487 /F/u1/s9014079/New Restore PST/
4:55:02.895 PM: [908.5244] <16> dtcp_write: TCP - failure: send socket (1848) (TCP 10054: Connection reset by peer)
4:55:02.895 PM: [908.5244] <16> dtcp_write: TCP - failure: attempted to send 1 bytes
(Lots of these 2 lines......)
4:59:16.648 PM: [908.5244] <2> tar_base::V_vTarMsgW: FTL - tar file write error (40)
4:59:16.664 PM: [908.5244] <2> tar_base::V_vTarMsgW: INF - EXIT STATUS 14: file write failed
 
Have a look at troubleshooting steps listed for Status 14:
 
■ On Windows systems, the client bpbkar log may contain a 10054 Connection
Reset Error (usually indicates a hardware error). Somewhere between the
NetBackup client and server, the connection was reset. When NetBackup
receives this error, it cannot continue the backup. This error has been attributed
to the following:
■ A hiccup in the network.
■ A bad network interface card on a NetBackup client.
■ A bad network interface card on the NetBackup server.
■ Faulty routers.
■ Any other applications that interfere with NetBackup connections.

 

You may want to upgrade to NBU 7.5 and read up on the new Wan Resilient feature.

See this White Paper: http://www.symantec.com/docs/TECH183552

Prior to NetBackup 7.5 release, any disconnection in network over WAN environment will cause backups to fail.
...

 NetBackup 7.5 includes a feature, Resilient Network, which will help data transfers over a WAN environment by becoming resilient to WAN interruptions.

Refer attached document for more details.

 

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows
Handy NBU Links

SOLUTION
sri vani's picture

Thank you so much for all your help ...

I ll check all the provided troubleshooting steps...Soon we will upgrade this to 7.5..

Backup is running good except on F as mentioned earlier...So i tried splitting them mutiple subfolder and ran them in single policy(for F ) and tested..Luckily we ve successful backup for F drive as wel  now..

 

once again thq so much Marianne