
Backup to Tape slowdown 2010 r3

Created: 18 Jun 2013 • Updated: 12 Sep 2013 | 31 comments
This issue has been solved. See solution.

Our backups to tape have slowed down. It happened overnight, after they had been running fine for months. I've done the usual things, like cleaning the heads, but this has made no difference. Watching the jobs in real time, I can see the job rate (including verify) get slower as the job progresses. Copying some files from one server to another seems fine: we copied 150 GB to the Media Server in about 30 minutes.

Here's some job history and more detail, because it seems to slow down once it starts to back up our main file server, which also holds our SharePoint SQL DB.

This is the first item of the job, which happens to be our Media Server, the one the Tape Unit is connected to. It shows the start and completion times on one day, then the next, and this looks OK.

Backup started on 6/6/2013 at 1:03:10 AM.
Backup completed on 6/6/2013 at 1:21:16 AM.

Backup started on 6/7/2013 at 1:03:16 AM.
Backup completed on 6/7/2013 at 1:26:17 AM.
 

The next part of the job, on a remote server, which looks to be OK.

Backup started on 6/6/2013 at 1:22:24 AM.
Backup completed on 6/6/2013 at 1:26:23 AM.

Backup started on 6/7/2013 at 1:27:32 AM.
Backup completed on 6/7/2013 at 1:38:02 AM.

Next is our concern: the main file server. It's the same amount of data, but this has really slowed down.

Backup started on 6/6/2013 at 1:26:28 AM.
Backup completed on 6/6/2013 at 6:40:57 AM.

Backup started on 6/7/2013 at 1:38:08 AM.
Backup completed on 6/7/2013 at 11:10:05 AM.
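
For reference, the elapsed times implied by the timestamps above can be worked out with a short Python sketch. The labels are descriptive names chosen here, not anything taken from the job log, and the format string is an assumption based on how the times are printed in this post.

```python
from datetime import datetime, timedelta

# Assumed format of the timestamps as they appear in the job history above.
FMT = "%m/%d/%Y %I:%M:%S %p"


def elapsed(start: str, end: str) -> timedelta:
    """Return end minus start for two job-log timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


# (label, start, end); the labels are illustrative only.
RUNS = [
    ("media server 6/6", "6/6/2013 1:03:10 AM", "6/6/2013 1:21:16 AM"),
    ("media server 6/7", "6/7/2013 1:03:16 AM", "6/7/2013 1:26:17 AM"),
    ("remote server 6/6", "6/6/2013 1:22:24 AM", "6/6/2013 1:26:23 AM"),
    ("remote server 6/7", "6/7/2013 1:27:32 AM", "6/7/2013 1:38:02 AM"),
    ("file server 6/6", "6/6/2013 1:26:28 AM", "6/6/2013 6:40:57 AM"),
    ("file server 6/7", "6/7/2013 1:38:08 AM", "6/7/2013 11:10:05 AM"),
]


def durations() -> dict:
    """Map each label to the elapsed time of that part of the job."""
    return {label: elapsed(start, end) for label, start, end in RUNS}


if __name__ == "__main__":
    for label, duration in durations().items():
        print(f"{label:20} {duration}")
```

On these figures the file server step went from about 5 h 14 min to about 9 h 32 min, while the other two steps grew by minutes rather than hours.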

The last part of the job is our Exchange Server (2010) which I suspect is delayed because of the one above.

I haven't copied any major data between the file server and media server, or run any test jobs, because the backups are finishing late in the afternoon. I wondered if anyone had some further ideas. I'm thinking of disabling the SOPHOS AV tonight on the Media Server and also performing some direct diagnostics on the Tape unit itself.

Thanks


CraigV's picture

Hi,

Disabling the AV is one troubleshooting step, but a better way around this is to ensure that there are exceptions in place for the BE services. That way you're not leaving yourself fully exposed.

Were there any updates done? Windows and/or BE?

If there is enough disk space, can you run a backup to a B2D and see if you get a slow-down as well?

Thanks!

Alternative ways to access Backup Exec Technical Support:

https://www-secure.symantec.com/connect/blogs/alte...

Marcopolo's picture

Hi Craig,

My only query about the BE exceptions would be why the overnight change. We don't have Windows Update enabled and Live Update is not enabled either, so at least that's ruled out.

Yep, plenty of space on the Media Server, 500GB

Hopefully the backup will finish by 14:30 this afternoon, so I'll run a disk backup then. We do have a small to-disk backup of 1.5 GB, and this doesn't seem any different.

As for the exclusions, are these the same as for 12.5? Is there a list somewhere?

I've also checked the daily scan which runs in the afternoon so it's not that.

Thanks

CraigV's picture

...you were talking about disabling the AV and I just advised you to rather put in exceptions to protect your environment...6 of 1, half-a-dozen of another.

Sometimes a reboot of the media server also fixes things. The exceptions are exactly the same for the versions of BE.

Thanks!


Marcopolo's picture

Ok, no problem. If it's the AV, you never know: we run Endpoint and it might have upgraded, although the Media Server is on 10 and has been for a while.

Reboot already done, first thing I tried and was hoping it was that, sadly not.

I'll update tomorrow

Thanks

Sush...'s picture

Hello Marco,
    It is possible that there is an issue with the network connectivity between your file server and Media server.
    Try copying a large amount of data (at least 15 GB) from the file server to the Media server through Windows Explorer and see whether that is fast or whether it also shows some latency. If it takes longer than expected, then you may need to focus on fixing the network.
    I do not think there is any issue with the tape drive, as other backups to the same drive are working fine.

Regards,
-Sush...

Hope this piece of Information Helps you... and if it does then mark this response as Solution....!!!

CraigV's picture

...the OP clearly stated that a 150GB copy completed within 30 minutes. There isn't an issue with the network connectivity.

Thanks!


Sush...'s picture

My apologies, I missed that. Thanks Craig for pointing it out. Thumbs up to you :).

But can we confirm that the file transfer was actually done from affected File server to Media server?

Regards,
-Sush...


Marcopolo's picture

I should have said, the 150 GB copy was not from the file server, so I've just performed a 4 GB copy and it took 1m 40s.
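
To put the two copy tests on the same scale as the Backup Exec job rate, they can be converted to GB per minute. A minimal sketch: the function name is made up for illustration, and the sizes and times are the round figures quoted in this thread.

```python
def gb_per_min(size_gb: float, seconds: float) -> float:
    """Average throughput in GB/min for a copy of size_gb that took `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_gb * 60 / seconds


# 150 GB in about 30 minutes (the earlier copy, not from the file server).
first_copy = gb_per_min(150, 30 * 60)

# 4 GB in 1 min 40 s (the later copy).
second_copy = gb_per_min(4, 100)

if __name__ == "__main__":
    print(f"150 GB copy: {first_copy:.1f} GB/min")
    print(f"4 GB copy:   {second_copy:.1f} GB/min")
```

That works out to about 5 GB/min for the first copy and about 2.4 GB/min for the 4 GB copy. A single small copy is only a rough measure, so this alone does not prove or rule out a network fault.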

Marcopolo's picture

Just watching the verify, which also slows as it runs. It started off at 8,600 MB/min and 20 mins later it's down to 7,600 MB/min, and it will reduce significantly during the process.

Marcopolo's picture

Well, no differences this morning after adding the Sophos Exclusions.

Marcopolo's picture

Just checking the Media while it's running and getting a lot of soft write and read errors.

CraigV's picture

...it's Hard Write errors you should be worried about. These indicate faults with either the tapes or drive itself.

Thanks!


Marcopolo's picture

Soft errors can apparently slow down tape backup times significantly. The DLT-4 is reporting a last cleaning date of 2011, although I inserted the cleaning tape last week. Surely that's not right.

I think I'll perform a manual clean, then a 100 GB across-the-wire backup to a new tape, and see what happens.

CraigV's picture

...could be an expired/old cleaning tape!


Marcopolo's picture

Ummm, it was only purchased last year and we tick it each time it's inserted.

pkh's picture

You might want to run the manufacturer's diagnostic utility against the tape drive to see whether there is any hardware error.  Make sure that you select the write test and that you have stopped all the BE services beforehand.

Marcopolo's picture

It's looking like this could be a Network issue, but I'll update again soon.

In the meantime, I'm still concerned about these soft errors and noted this yesterday:

Before the tape backup the soft error count was 12985583; after it was 13034126.

Is there anything that expands on what causes these soft errors? Is there a log for them?

pkh's picture

If the soft errors are particular for one tape, then it needs to be replaced.  If soft errors increase for all the tapes, then first clean the tape drive.  If the errors still increase after cleaning, then the tape drive needs to be replaced.

Marcopolo's picture

Well, they point to the DLT then, but where is the log showing what these soft errors are?

Thanks

Sush...'s picture

Hello Marco,
    As I said before, I do not think your backups are slow because of the device, as other backups to the same device are working fine.
As you wanted to know more about soft errors, please refer to the following technotes, which give more information.

http://www.symantec.com/docs/TECH148542 : Why a tape backup will take longer to run or use more tape if Soft Write errors are present

http://www.symantec.com/docs/TECH7363  : An explanation of "soft" and "hard" write errors visible on the Statistics tab within Backup Exec.

http://www.symantec.com/docs/TECH69082 : Large number of soft-write errors are reported when using Quantum LTO tape drives

You may want to refer to the last technote above which seems to give more information about logging.

Thanks,
-Sush...


Marcopolo's picture

I think some consistency wouldn't go amiss on here from you advisers: either continuous soft write errors on different tapes after numerous cleans are an issue or they are not. I appreciate all the feedback, but it's difficult to assess and make decisions when you're possibly contradicting one another.

Sush, I've read all three of those articles and they do mention that continuous soft writes could well be the start of more serious issues.

I performed the same tape backup job, but to disk, yesterday. The job rate started at 1.5 GB/min, but after two hours it had only completed 40 GB and was running at 370 MB/min and still reducing.

As mentioned, this problem started overnight. So I have the issue of the soft writes, which I understand impact job rate, but a local 170 GB backup to tape only took 55 mins @ 5.1 GB/min, which to me suggests a possible network issue. I wonder whether the soft writes are being caused by the network throughput problems, but as yet I'm none the wiser.

What's even stranger is the data I copied from and to the same servers, disk to disk, which did not suggest a network issue.

CraigV's picture

...inconsistencies will occur as we aren't in consultation with each other. Also doesn't help that posts are days apart...

I've had Soft Write Errors plenty of times and never seen them impact backup times, and that's an honest statement. However, that's not to say it couldn't happen elsewhere.

Upgrade the firmware on the drive and card...this is another step I would do in troubleshooting.

Thanks!


Marcopolo's picture

Inconsistencies occur when the foundations for diagnosing a fault aren't consistent, regardless of where people are located or the time lapse between posts. It's about reading what's already been reported and basing your theory on it. That said, the detail needs to be accurate in the first place.

The soft writes were just an observation. There was a count prior to my report, but I cannot say for sure whether they existed prior to this particular issue.

Firmware has been upgraded on the DLT, although I'm not too convinced the SCSI controller firmware is an issue since nothing has changed, but it's an option already on my list.

Thanks

CraigV's picture

...as I said, if someone else posts 5 days after another, you're going to get information you may/may not have.

If you require consistency outside of the forums, then you need to consider logging a call with Symantec support.

Thanks!


Marcopolo's picture

Quick update.

Rebooted our core switches and the backup time hasn't changed at all.

Just rebooted the Media server and started another test backup. Here's something I've observed.

The first 12 GB flew along: throughput was increasing and was up to 4750 MB/min. Then all of a sudden it started to drop; at 17 GB, several minutes later, it's already down to 1850 MB/min.
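
If the job rate shown in the console is a cumulative average (data so far divided by elapsed time), which is an assumption here and not something confirmed in this thread, then the drop from 4750 to 1850 MB/min hides an even lower current rate. A rough sketch, also assuming 1 GB = 1000 MB; the function name is hypothetical:

```python
def recent_rate(done1_mb: float, avg1: float, done2_mb: float, avg2: float) -> float:
    """Rate (MB/min) over the interval between two cumulative-average readings.

    done*_mb is the data backed up so far in MB, avg* is the displayed
    average rate in MB/min at that moment.
    """
    t1 = done1_mb / avg1  # minutes elapsed at the first reading
    t2 = done2_mb / avg2  # minutes elapsed at the second reading
    if t2 <= t1:
        raise ValueError("second reading must be later than the first")
    return (done2_mb - done1_mb) / (t2 - t1)


# Readings from the post above: 12 GB at 4750 MB/min, then 17 GB at 1850 MB/min.
rate = recent_rate(12_000, 4750, 17_000, 1850)

if __name__ == "__main__":
    print(f"rate between the two readings: {rate:.0f} MB/min")
```

Under those assumptions the 5 GB between the two readings went through at roughly 750 MB/min, well below either displayed figure.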

CraigV's picture

Marco: are you on BE 2010 R3? If not, consider upgrading to this...you can use the same licenses you currently do, but just grab copies of the Data/Catalogs folders before doing so!

Thanks!


Marcopolo's picture

Craig,

Yep, on 2010 R3.

Performed a few more tests this morning.

As mentioned earlier (the post above), I set a job to back up two folders on a remote server. One had about 10 SQL backup files (12 GB); the other is our users' home folders (username), so there's plenty of data and folders.

I then set up another backup from another remote server (program files); from the start it was only doing 300 MB/min.

I set up the first test again and watched the file backup status. It seemed to fly through the SQL folder, but as soon as it started on 'users' it dropped 2 GB on the job rate and it continues to decline.

I then set up another job on the same remote server for our data; it's fluctuating around 800 MB/min.

It's like it's having indexing issues.

Marcopolo's picture

Escalated to support yesterday and so far I've not been impressed. Another call today has not filled me with confidence. Hopefully, now it's been escalated further, this will improve.

Marcopolo's picture

Ok, in conjunction with Support I've done a little more testing and network throughput seems fine. We replaced the unit yesterday and it made no difference whatsoever. The only outstanding test is copying some data from the Media Server to the File Server, just to clarify throughput both ways.

I was reading another thread about someone with a similar issue: backups of large data in a small number of files were fine, but the same amount of data spread over a large number of files was slow, because the units keep spinning up and down all the time.

Although similar, our issue happened overnight. Our data was fluctuating between 696 GB and 715 GB, so was there a peak in file numbers that tipped it over the edge?
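
One way to test the many-small-files theory outside Backup Exec is to read the same amount of data as one large file and as thousands of small ones, then compare timings. The sketch below is only an illustration: it uses a local temporary directory, made-up sizes and hypothetical function names, and the OS file cache will flatter the results. A real test would point read_tree at a folder on the file server share.

```python
import os
import tempfile
import time
from pathlib import Path


def make_files(root: Path, count: int, total_bytes: int) -> None:
    """Create `count` equally sized files under root, total_bytes in all."""
    root.mkdir(parents=True, exist_ok=True)
    payload = os.urandom(total_bytes // count)
    for i in range(count):
        (root / f"f{i:05d}.bin").write_bytes(payload)


def read_tree(root: Path):
    """Read every file under root; return (file count, bytes read, seconds)."""
    start = time.perf_counter()
    files = 0
    total = 0
    for path in root.rglob("*"):
        if path.is_file():
            total += len(path.read_bytes())
            files += 1
    return files, total, time.perf_counter() - start


def compare(total_bytes: int = 8 * 1024 * 1024, many: int = 2048) -> dict:
    """Time one big file against `many` small files of the same total size."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        make_files(base / "few", 1, total_bytes)
        make_files(base / "many", many, total_bytes)
        return {name: read_tree(base / name) for name in ("few", "many")}


if __name__ == "__main__":
    for name, (files, size, secs) in compare().items():
        print(f"{name:5} files={files:5} bytes={size} seconds={secs:.3f}")
```

The file count returned by read_tree is also a quick way to check whether the number of files on a share has jumped between two dates, which the total size alone would not show.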

The problem continues....

Marcopolo's picture

Ok, we've located the issue. It's SOPHOS Endpoint Security and Control Ver 10 which must have been upgraded that night from an earlier version on the remote server.

SOLUTION