

Created: 06 Oct 2012 • Updated: 07 Oct 2012 | 7 comments
This issue has been solved.

We have SLPs in our environment:

Copy 1 goes to Data Domain.

Copy 2 goes to tape.

Our tape library was down for 4-5 days.

Now I want to initiate all my duplication jobs. How do I check which ones are pending, and how do I initiate them?


7 Comments

mph999:

You can't kick them off; there is no supported command to do this. They should start to duplicate on their own now that the library is back.
You can check what is pending with nbstlutil list -U
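A minimal sketch of the commands mentioned, assuming they are run on the master server with the NetBackup admin commands (typically under /usr/openv/netbackup/bin/admincmd) in the PATH; check nbstlutil -help on your version, as the exact option names here are from memory:

```shell
# List the state of all SLP-controlled images and their copies:
nbstlutil list -U

# Narrow it down to images whose SLP processing is not yet complete
# (i.e. the duplications still waiting to run):
nbstlutil stlilist -image_incomplete -U
```

Once the tape library is healthy again, the images reported as incomplete should be picked up by the SLP duplication sessions without any manual kick.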


Regards,  Martin
Arun K:

I read a document stating that the oldest image is duplicated first:

SLPs duplicate your oldest images first. While an old backlog exists, your newest backups will not be duplicated. This can cause you to miss your Service Level Agreements (SLAs) for getting a copy of a backup to offsite storage.

mph999:

No. The robot going down (or whatever the fault was) is what caused you to miss the SLA.
This is how SLP currently works, and there is no way around it.
You could cancel the images and duplicate them manually; that would be the only option.
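If you do go the manual route Martin describes, a hedged sketch of what it would look like per image; the backup ID and storage unit name below are purely illustrative, and nbstlutil cancel permanently removes the image from SLP control, so the copy really must be made by hand afterwards:

```shell
# Cancel SLP processing for one image (backup ID is illustrative):
nbstlutil cancel -backupid client1_1349500000

# Then duplicate that image manually to the tape storage unit
# (storage unit name is illustrative):
bpduplicate -backupid client1_1349500000 -dstunit tape_stu
```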

Regards,  Martin
Marianne:

Keep on reading... The document that you are quoting from contains more than just the lines you have copied:

2. To reduce (and ultimately get rid of) a backlog, your duplications must be able to catch up. Your duplications must be able to process more images than the new backups that are coming in. For duplications to get behind in the first place, they may not have sufficient hardware resources to keep up. To turn this around so that the duplications can catch up, you may need to add even more hardware or processing power than you would have needed to stay balanced in the first place.
The key to avoiding backlog is to ensure that images are being duplicated as fast as new backup images are coming in, over a strategic period of time.
As you introduce SLPs into your environment, monitor the backlog and ensure that it declines during periods when there are no backups running. Do not put more jobs under the control of SLPs unless you are satisfied that the backlog is reducing adequately.
Consider the following questions to prepare for and avoid backlog:
- Under normal operations, how soon should backups be fully duplicated? What are your Service Level Agreements? Determine a metric that works for the environment.
- Is the duplication environment (that includes hardware, networks, servers, I/O bandwidth, and so on) capable of meeting your business requirements? If your SLPs are configured to use duplication to make more than one copy, do your throughput estimates and resource planning account for all of those duplications?
- Do you have enough backup storage and duplication bandwidth to allow for downtime in your environment if there are problems?
- Have you planned for the additional time it will take to recover if a backlog situation does occur? After making changes to address the backlog, additional time will be needed for duplications to catch up with backups.
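The catch-up point above is simple arithmetic: the backlog drains only at the rate by which duplication capacity exceeds the incoming backup rate. A back-of-envelope sketch, with all numbers purely illustrative:

```shell
# Illustrative figures only -- not from this thread.
ingest_per_day=2   # TB of new backups arriving per day
dup_per_day=3      # TB per day the duplication path can actually move
outage_days=5      # how long duplications were down

backlog=$((ingest_per_day * outage_days))   # TB waiting to be duplicated
spare=$((dup_per_day - ingest_per_day))     # TB/day of headroom
catchup_days=$((backlog / spare))

echo "Backlog: ${backlog} TB; days to drain it: ${catchup_days}"
# With only 1 TB/day of headroom, a 5-day outage takes 10 days to clear.
```

If dup_per_day is not comfortably larger than ingest_per_day, the backlog never drains at all, which is the scenario the quoted document warns about.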

PS: "Strange" how you and Nikhil/Puneet have exactly the same environment??

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows

mph999:

Thank you for posting that, Marianne, regarding spare capacity if there are issues.
Any backup system using any backup software can have downtime for many reasons. There should always be spare capacity to catch up; not having it is, in my opinion (and that of many experienced people I know), bad design.

Still, it's always easier to just blame NetBackup.


Regards,  Martin
revarooo:

Hear, hear!

mph999:

When I used to run an environment, we split the backups into multiple separate environments, with no one environment being busy for more than 16 hours in a day. That way we had capacity for catch-up if we had any issues, extra backups, and restores. Sure, backups failed, but most were successful on the rerun, so in any 24-hour period we maintained an average of 98.4% across about 2,500 servers. Additionally, all database servers had 3 days' worth of disk space for the redo logs, just in case of an issue with backups.

We never had any major issues, as there was always the ability of moving a client to another server, and if a backup server did go down we only lost 1/25 of the backups as opposed to 100%.

Further, the load for any server was spread out, with some fulls running on a Monday, some on a Tuesday, some on a Wednesday, and so on; that way we didn't get near to overloading the network by running all the fulls on a Friday.

The systems were all designed to do what they needed to do, importantly, they were designed on paper, not by guessing.

All this seems excessive perhaps, but given the importance of the data and the potential financial loss if systems were down, it was cheap.

I have zero sympathy when I am told that backups are down and the database will stop in 30 minutes, or that there are 5 hours to duplicate 8 hours' worth of data, because there was no allowance in the system for unexpected downtime.


Regards,  Martin