
BE 2010 R3 2 days out of 5 not completing

Created: 01 Nov 2013 | 6 comments

Hello all,

 

I have BE 2010 R3 on a 2003 server.  Recently, I had to swap out the tape drive due to dead heads.  After updating the firmware and getting everything back to normal, 2 days out of the 5-day backup schedule just will not complete.

The content being backed up is an Exchange database and the contents of our file server, both remote.  What makes this odd is that new tapes will not take a full backup, while 3 of the old ones work just fine.  The failing days also worked until about a week after the drive swap.  The Exchange backup happens first, then the file server.  The file server job asks for another tape, even though there should be room on the current one.  I also noticed that the overall backup, both the Exchange and file server jobs, runs slower on those days than on the other days.

There are no events happening on any server for the nights that it runs slow.  I have tried formatting the tapes, full erases, erasing with the drive utility, retensioning [which doesn't seem to do anything] and multiple different tapes.  I have not tried using a tape from another day that I know works, but that is coming up next week.

The Exchange backup runs as an overwrite, and the file server backup as an append.  There is enough native room on the tape to handle the data, and hardware compression is on.  I tested the compression, and it does work.

Here are the job summaries for a working and a non-working day:

Name | Device Name | Job Type | Job Status | Percent Complete | Start Time | End Time | Elapsed Time | Byte Count | Job Rate | Error Code | Deduplication Ratio
FileServer Full | IBM 0002 | Backup | Canceled | N/A | 10/30/2013 10:10:04 PM | 10/31/2013 10:45:40 AM | 12:35:36 | 154,738,093,036 | 604.00 MB/min | |
Exch Store full | IBM 0002 | Backup | Successful | 100% | 10/30/2013 9:05:04 PM | 10/30/2013 10:57:35 PM | 1:52:31 | 57,550,875,180 | 756.00 MB/min | |
FileServer Full | IBM 0002 | Backup | Completed with exceptions | 100% | 10/29/2013 10:10:03 PM | 10/30/2013 3:31:19 AM | 5:21:16 | 159,761,340,648 | 867.00 MB/min | |
Exch Store full | IBM 0002 | Backup | Successful | 100% | 10/29/2013 9:05:03 PM | 10/29/2013 10:34:02 PM | 1:28:59 | 57,259,358,816 | 1,133.00 MB/min | |
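For anyone comparing the two nights, here is a rough Python sketch that totals the bytes written per night and works out the average throughput over each job's full elapsed time. The byte counts and elapsed times are copied from the summary above. The function names are made up for illustration, and treating BE's "MB" as 2**20 bytes is an assumption. Averages over elapsed time need not match the Job Rate column, since elapsed time may include periods when the job was not writing (for example, waiting for a tape).

```python
from datetime import timedelta

MIB = 1024 * 1024  # assumption: the "MB" in "MB/min" means 2**20 bytes

# (job name, night, byte count, elapsed h:m:s), copied from the summary above
JOBS = [
    ("FileServer Full", "10/30", 154_738_093_036, "12:35:36"),
    ("Exch Store full", "10/30", 57_550_875_180, "1:52:31"),
    ("FileServer Full", "10/29", 159_761_340_648, "5:21:16"),
    ("Exch Store full", "10/29", 57_259_358_816, "1:28:59"),
]


def parse_elapsed(text):
    """Turn an 'h:mm:ss' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def effective_rate(byte_count, elapsed):
    """Average MB/min over the whole elapsed time, idle periods included."""
    return byte_count / MIB / (elapsed.total_seconds() / 60)


def nightly_totals(jobs):
    """Sum the byte counts of all jobs that started on the same night."""
    totals = {}
    for _name, night, byte_count, _elapsed in jobs:
        totals[night] = totals.get(night, 0) + byte_count
    return totals


if __name__ == "__main__":
    for name, night, byte_count, elapsed in JOBS:
        rate = effective_rate(byte_count, parse_elapsed(elapsed))
        print(f"{night} {name}: {rate:.0f} MB/min averaged over elapsed time")
    for night, total in nightly_totals(JOBS).items():
        print(f"{night} total written: {total / 1e9:.1f} GB (decimal)")
```

The nightly totals are the useful part here: they show how much data was on the tape set each night, which can be compared against the tape's native capacity.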

Any insight would be very helpful.

 

Thanks!

-=Alejandro


pkh's picture

In your example both the Exchange jobs completed successfully. You cancelled the first file server job and the second file server job completed with exceptions.

Since your Exchange job specifies overwrite, it needs an overwritable tape. It will not append to the previous tape. If it cannot find an overwritable tape, it will ask for one.

Your file server job may not be appending to the tape because it overlapped the Exchange job. For the reason why, see my article below.

https://www-secure.symantec.com/connect/articles/w...

DWNAR's picture

Hello PKH,

I cancelled the job because it was asking for another tape, with less data on the tape than the night that completed.  If you look at the times, even the one that completed with exceptions [which runs older client software] started before the end of the Exchange job.  The list is also in reverse order by date.

Part of the concern is that the rate for both jobs is slower on the day that the second job fails.  There is no reason that I can find for this.  That is also the only difference I can find.  I was hoping that would be a clue as to why the tape reports being full, even when it is not.

Tonight, I will be running this backup on a new tape, quick formatted.  If my theory is correct, it should go through successfully.  This same new tape has already failed on one of the two days that always fail.

Thanks for the information regarding overlapping.

-=Alejandro

pkh's picture

"If you look at the times, even the one that completed with exceptions [which is an older client software] started before the end of the Exchange job."

This is what I mean by an overlap.  It may cause the 2nd job not to append to the tape.
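As a quick check, the overlap on each night can be worked out from the start and end times in the job summaries. This is only a sketch: the function name and the timestamp format string are assumptions chosen to match how the summary prints its times.

```python
from datetime import datetime, timedelta

# Assumed format of the timestamps as printed in the job summary
FMT = "%m/%d/%Y %I:%M:%S %p"


def overlap(first_job_end, second_job_start):
    """How long the second job was already running before the first ended."""
    end = datetime.strptime(first_job_end, FMT)
    start = datetime.strptime(second_job_start, FMT)
    # No overlap if the second job started after the first one finished
    return max(end - start, timedelta(0))


if __name__ == "__main__":
    # Exchange end time, then file server start time, for each night
    nights = {
        "10/30": ("10/30/2013 10:57:35 PM", "10/30/2013 10:10:04 PM"),
        "10/29": ("10/29/2013 10:34:02 PM", "10/29/2013 10:10:03 PM"),
    }
    for night, (exch_end, fs_start) in nights.items():
        print(f"{night}: file server job overlapped Exchange by {overlap(exch_end, fs_start)}")
```

With the posted times, the file server job started about 47 minutes before the Exchange job finished on 10/30 and about 24 minutes before it finished on 10/29, so both nights overlap.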

"Part of the concern is that the rate for both jobs is slower on the day that the second job fails."

Check the statistics of the tape used.  If there are a lot of errors, it could cause the slowdown.  Clean your tape drive and/or replace the tape.

 

DWNAR's picture

Hello,

I tried your suggestions of moving the start time and checking the statistics.  So far, with the start time moved so that there is no overlap, the job seems to fit even less on the tape before asking for another, and it seems to run even slower, but only on the Wed. and Thur. backups.  The other days run smoothly, even when they were overlapping.

As for the statistics, the tapes in question have fewer errors, all of them soft write errors, than the tapes that run just fine, which show both soft write errors and hard errors.  I did try cleaning the drive, with no change.  Part of the original problem is that the tape fails to fill on every new tape I put in, so new tapes do not help.

I must say, I am a little perplexed as to why this would start behaving badly.  The upside is that it is consistently breaking on the same two days.

Is there any further detail you would like from me to help assess the problem?  I must be missing something that changed, but there seems to be no clue as to what it is.

Thanks again for your assistance so far...

-=Alejandro

pkh's picture

To rule out hardware problems, run the tape manufacturer's diagnostic utility against the tape library/drive.  Make sure you select the write test and that you have stopped all the BE services beforehand.

DWNAR's picture

Hello PKH,

I have run each of the diagnostics available in the software 3 times, with 2 different tapes in the drive.  Each test passed.

-=Alejandro