Status 96 - job didn't restart according to retry delay
Hi,
We are running NetBackup 7.1 on a Windows 2003 server.
The client being backed up is a Windows server in an MS-Windows policy, and it is the only server in that policy. Multistreaming and checkpoints are both enabled.
Two streams out of four for this backup job failed over the weekend - error "unable to allocate new media for backup, storage unit has none available (96)".
Upon checking the volume pool, it seems not enough tapes were loaded over the weekend to accommodate a full backup of this server, which explains the failure.
However, the job didn't retry according to the retry settings in the master server host properties - currently the retry delay is 10 minutes and scheduled backup attempts is set to 3 tries per 12 hours.
So today we added some more tapes, but then strangely enough about 5 minutes later the job restarted.
Can anyone explain why this job didn't restart according to the configured retry delay? Also, why does the backup restart from the beginning after this error instead of resuming from where it failed, even though checkpoints are enabled?
Thanks
JB
Comments
You would need to look in the nbpem log to see what was going on; nbjm might show something as well. Anything else would just be guessing, I think.
Run vxlogview, using -d all -o 116 and then again with -o 117
Martin
OK thanks, I'll check out these logs.
Why is it that, with this error, the backup can't resume and instead restarts?
Not sure without the logs.
NBU will not retry all jobs. Some are excluded (e.g. I think Oracle/RMAN jobs don't retry), and I suspect some will not retry depending on the cause of the failure.
You can trace a job in the nbpem/nbjm logs using the job ID, and then the TID (which is why you should always run vxlogview with -d all).
I expect we will see "job not eligible for retry" or similar. If you do not have retry after runday enabled (and a suitable window), is it possible that the window had closed, if it was a small one? Just an idea.
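As a rough illustration of tracing by job ID and then by TID, the sketch below filters vxlogview output that has been saved to a text file. The line layout is an assumption, not a documented format: it supposes each line carries a "TID:<number>" field and that lines for a job contain "jobid=<number>". The function name trace_job is made up for this example, so adjust the patterns to whatever your own output actually looks like.

```python
import re
from typing import Iterable, List, Set


def trace_job(lines: Iterable[str], job_id: int) -> List[str]:
    """Return lines that mention job_id, plus lines sharing a thread ID with them.

    Assumed (not documented) layout: lines contain "TID:<n>" and,
    where a job is referenced, "jobid=<n>".
    """
    all_lines = list(lines)
    job_pat = re.compile(rf"jobid={job_id}\b", re.IGNORECASE)
    tid_pat = re.compile(r"\bTID:(\d+)")

    # Pass 1: collect the thread IDs seen on lines that mention the job.
    tids: Set[str] = set()
    for line in all_lines:
        if job_pat.search(line):
            match = tid_pat.search(line)
            if match:
                tids.add(match.group(1))

    # Pass 2: keep lines that mention the job or run on one of those threads.
    traced: List[str] = []
    for line in all_lines:
        match = tid_pat.search(line)
        same_thread = match is not None and match.group(1) in tids
        if job_pat.search(line) or same_thread:
            traced.append(line)
    return traced


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the filtering.
    sample = [
        "nbpem PID:10 TID:111 [jobid=42] submitting job",
        "nbpem PID:10 TID:222 [jobid=43] submitting job",
        "nbpem PID:10 TID:111 job not eligible for retry",
    ]
    for line in trace_job(sample, 42):
        print(line)
```

Thread IDs can be reused by other jobs, so the second pass may pull in unrelated lines; narrowing the time range of the log extract first helps.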
Martin
Are there any KB articles that explain, for a particular failure, which jobs can resume and which will restart?
I've never seen one - a quick search didn't find anything.
Your best bet would be to log a call - if you can get those log details sorted out before you do, that would be excellent ...
The call will also need ...
nbsu -c -t
name of the policy that failed
details of failure from activity monitor for the job (details tab)
If the log shows what I expect, we will see that the job was not valid for retry
... then it is reasonable to ask the question why ...
As far as I can think, the only way we can tell you why will be to look at the details of the job and the failure, and then check the NBU code to see why it behaves in this manner.
That will need a BL engineer ....
Martin
Maybe. If you cancelled that backup ... only then will it retry, most of the time.