
Backup Time Suddenly Increases 50%

Created: 21 Oct 2013 • Updated: 30 Oct 2013 | 2 comments
This issue has been solved. See solution.

We have a NetBackup 7.5 system that backs up about 150 virtual servers and a dozen physical servers.  One of those physical servers (call it Server X) has about 2.9 terabytes of data on a SAN.  We do a weekly full backup and daily incrementals.  Up until two weeks ago, the backup time for Server X was about two days.  Starting two weeks ago, backup time suddenly jumped up to three days.  There have been no significant changes to the server, and no changes to the NetBackup system.  All the other backups are proceeding normally.  What should we look at to determine why this particular backup job is taking so much longer?  Server X is a Windows 2003 R2 machine.  Thanks.   -G.

Marianne

The usual culprits are disk fragmentation and the network. Also check for other processes running at the same time as the backups (e.g. a virus scan).

You will need the bpbkar log on the client as well as the bptm log on the media server.
The bpbkar log needs a minimum logging level of 3 to show whether there is a file or folder where the backup is getting 'stuck'.
The bptm log will tell you where the 'waits' were. Many waits for full buffers normally indicate a problem on the client side, either reading data from disk or sending it over the network.
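To get a quick count of those waits, a small script can total them from the bptm log. This is only a rough sketch: the line pattern ("waited for full buffer N times, delayed M times") is an assumption about how the messages appear in your bptm log, and the names `WAIT_RE`, `parse_waits` and `total_waits` are made up for illustration. Check a few real lines from your own log and adjust the pattern before trusting the totals.

```python
import re
from typing import Dict, Iterable, List, NamedTuple

# Assumed message shape; verify against real lines in your own bptm log.
WAIT_RE = re.compile(
    r"waited for (?P<kind>full|empty) buffer (?P<waits>\d+) times, "
    r"delayed (?P<delays>\d+) times"
)


class WaitStat(NamedTuple):
    kind: str    # "full" or "empty"
    waits: int   # number of times the process waited for a buffer
    delays: int  # number of delay cycles spent waiting


def parse_waits(lines: Iterable[str]) -> List[WaitStat]:
    """Collect every buffer-wait message found in the given log lines."""
    stats = []
    for line in lines:
        match = WAIT_RE.search(line)
        if match:
            stats.append(
                WaitStat(match["kind"], int(match["waits"]), int(match["delays"]))
            )
    return stats


def total_waits(stats: Iterable[WaitStat]) -> Dict[str, int]:
    """Sum the wait counts separately for full and empty buffers."""
    totals = {"full": 0, "empty": 0}
    for stat in stats:
        totals[stat.kind] += stat.waits
    return totals
```

Typical use would be `total_waits(parse_waits(open(path)))`, where `path` points at a copy of the bptm log for the slow job. A "full" total that dwarfs the "empty" total would fit the client-side reading described above.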

Another possible culprit is 'TCP Chimney', which is sometimes enabled during Windows updates. See http://www.symantec.com/docs/TECH60844

You will need a tool to monitor disk read speed (e.g. perfmon), and you should also involve the network team to monitor throughput.
So, my gut feel is that the problem is on the client side and that all troubleshooting should start there.
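As a baseline for the perfmon and network numbers, it can help to work out the average throughput the job times imply. The sketch below uses the figures from the original post and makes three assumptions: the full backup moves all 2.9 TB, terabytes are decimal (10^12 bytes), and "two days" and "three days" mean exactly 48 and 72 hours. The function name is made up for illustration.

```python
def throughput_mb_per_s(data_tb: float, hours: float) -> float:
    """Average throughput in MB/s (decimal units) for data_tb moved in hours."""
    total_bytes = data_tb * 1e12          # assumes decimal terabytes
    return total_bytes / (hours * 3600) / 1e6


before = throughput_mb_per_s(2.9, 48)     # old two-day full backup
after = throughput_mb_per_s(2.9, 72)      # new three-day full backup
print(f"before: {before:.1f} MB/s, after: {after:.1f} MB/s")
```

Under those assumptions the job averaged roughly 17 MB/s before and roughly 11 MB/s after, so sustained disk read rates or network throughput near the lower figure would point to where the bottleneck sits.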


SOLUTION
GlennG-NB

As it turns out, there was a problem with the SAN where Server X stores data.  So it was essentially a client storage malfunction.