
NetBackup Vault Process Taking Too Long to Finish

Created: 11 Mar 2010 • Updated: 12 Sep 2010 | 7 comments

NetBackup 6.5.4 on the media servers, master and clients. Windows Server 2003 master and media servers; W3K, W8K clients...

It is taking over a week for my vault process to finish. Can someone point me in the right direction for learning more about the Vault process? I am running backups while the vault process is going on, which was never an issue until the last few weeks. I have to cancel the running vault process in order to start the new one. Is this what is messing it up? If I leave the vault process running for as long as it needs (1 week+), might that fix my issue, or is there something more that I can check? Is there a way to break the vault process down into smaller groups based on policies? Once I get a good weekly vault again, I am thinking about scheduling the vault to run twice a week to spread the load a little. Does anyone have any advice? Thanks.


rjrumfelt's picture

Are you just ejecting tapes?

Or are you including duplications in your vault?

Denda's picture

including duplications in the vault job, then ejecting...

rjrumfelt's picture

the duplications or the ejects?

There are a number of things that could cause duplications to take a very long time to run.  Have you checked your environment for any type of issues such as downed drives, slow throughput on the duplications, etc?  If this is the case, you will continue to get a backlog of images that need duplicating, and the backlog will continue to grow until either you fix what is causing the backlog or until images start to expire before they get duplicated.
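To make the backlog point concrete, here is a rough Python sketch. The figures (500 GB of new images per day, 400 GB duplicated per day) are made-up numbers for illustration only, and the model ignores image expiration, so treat it as arithmetic rather than a description of how Vault schedules work.

```python
def backlog_by_day(new_gb_per_day, dup_gb_per_day, days, start_backlog_gb=0):
    """Return the duplication backlog (GB) at the end of each day.

    Hypothetical model: a fixed amount of new backup data arrives each day
    and a fixed amount is duplicated each day. Expiration is ignored.
    """
    backlog = start_backlog_gb
    history = []
    for _ in range(days):
        # Backlog grows by whatever the duplications could not keep up with.
        backlog = max(0, backlog + new_gb_per_day - dup_gb_per_day)
        history.append(backlog)
    return history


# Assumed example figures, not measurements from this environment.
week = backlog_by_day(new_gb_per_day=500, dup_gb_per_day=400, days=7)
print(week)
```

With these assumed rates the backlog grows by 100 GB every day, so a vault session that starts a week later has a week's worth of leftover images to duplicate on top of the new ones.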

If it is the eject portion that is taking a very long time, note that the eject job will run for as long as the tapes are still in the CAP of the library. Until someone physically pulls the tapes from the library, the vault job will show in the Activity Monitor as running.

Denda's picture

Duplications are what is running long.  No drives down.

For example, the vault job details show a duplication batch that started at 1:00 pm on 3/9 and is still sitting there. It duplicated 4 batches prior to this duplication job. This duplication job has been sitting at 'awaiting resources' since 3/9, even though I know there were several drives available last night.

Just having you ask me these questions is helping me break it down in different areas to troubleshoot.  I am going to kill this specific dup job and see if there is something going on specifically with it and go from there. 

rjrumfelt's picture

We will notice that drives are not being utilized even though they appear to be free in the Device Monitor.

If you run nbrbutil -dump, it will show you all of the allocations that nbrb is holding. Sometimes these allocations get hung, and if you see an allocation to a drive that is not actually being used, you can release that allocation and thus free up the drive.

From the nbrbutil -dump output, find any drives that have allocations but are not actually being used, and note the drive name.
Then execute:

nbrbutil -releaseDrive <drive_name>

Of course, this is only one possible explanation for what is going on.
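If the dump output is long, a small script can help with the sifting. The following is a minimal Python sketch, assuming nbrbutil is on the PATH (it normally lives in the NetBackup admincmd directory). It makes no assumption about the dump format: it only prints the lines that mention a given drive name. The function names and the drive name in the usage comment are hypothetical. Releasing the drive is deliberately left as a manual step.

```python
import subprocess


def dump_allocations(nbrbutil="nbrbutil"):
    """Run 'nbrbutil -dump' and return its output as text.

    Assumption: 'nbrbutil' resolves on PATH; otherwise pass the full path
    to the binary in the NetBackup admincmd directory.
    """
    result = subprocess.run(
        [nbrbutil, "-dump"], capture_output=True, text=True, check=True
    )
    return result.stdout


def lines_mentioning(dump_text, drive_name):
    """Return the lines of the dump that contain the drive name."""
    return [line for line in dump_text.splitlines() if drive_name in line]


# Usage (run on the master server; "Drive001" is a made-up drive name):
#   for line in lines_mentioning(dump_allocations(), "Drive001"):
#       print(line)
```

Check any drive that shows up here against what is actually running before releasing it with nbrbutil -releaseDrive.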

Ron Cohn's picture

Denda,

Below is a response I provided nearly 2 years ago in this forum.  Parts of it may not apply to you, but I hope you find it useful:

==============================

The Vault process has everything to do with how images are stored on the VTL - especially if multiplexing to the VTL. For example on the VTL, tape #001 contains image_001, image_002, image_003. Tape #002 contains image_001, image_003, image_004.

When you start the vault process, it goes to tape #001 and gets the image_001. BUT, it knows that image_001 is also on tape #002. It will dismount tape #001, mount tape #002 and continue vaulting image_001. Once that is completed, it dismounts tape #002, mounts tape #001 and proceeds to image_002. Since it is contained strictly on tape #001 - no problem. Then it vaults image_003. Again, it performs the same process as it did for image_001. This is what will kill you when vaulting. There is a fine line in trying to optimize writing to the VTL AND using Vault.
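The mount pattern in this example can be counted with a short Python sketch. This is a toy model of the behavior described above, not NetBackup's actual duplication logic: it walks the images in order, visits every tape that holds a piece of the current image, and counts a mount each time the tape changes. The function name and data layout are made up for illustration.

```python
def count_mounts(tapes, image_order):
    """Count tape mounts when images are processed one at a time.

    tapes: dict mapping tape label -> list of images with data on that tape.
    image_order: the order in which images are duplicated.
    """
    mounted = None
    mounts = 0
    for image in image_order:
        for tape, images in tapes.items():
            # A mount is needed only if the image has data on a tape
            # other than the one currently mounted.
            if image in images and tape != mounted:
                mounted = tape
                mounts += 1
    return mounts


# The layout from the example above.
tapes = {
    "001": ["image_001", "image_002", "image_003"],
    "002": ["image_001", "image_003", "image_004"],
}
order = ["image_001", "image_002", "image_003", "image_004"]
print(count_mounts(tapes, order))  # prints 4
```

Two tapes end up being mounted four times (001, 002, 001, 002) in this model, and the count grows quickly as more multiplexed images are spread across more tapes.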

Because of the possible mounts / dismounts (even virtually), fragment size is very important. If you did set fragment size, then going back to my example, when mounting tape #001 to begin vaulting image_002, NetBackup is going to have to scan tape #001 from the beginning to get to image_002 starting point.
 

Ron Cohn
"I maybe lost, but I am making good time..."

Environment: NBU 7.6.0.2 for Windows
Write to EMC DD4200 -> Vault to ADIC i500

Marianne's picture

Have a look at the Best Practices section of the Vault Admin Guide.

Topics that might help you to cut down vaulting time:

  • Do not vault more than you need to
  • Avoid resource contention during duplication
  • Avoid sending duplications over the network
  • Increase duplication throughput
  • Maximize drive utilization during duplication

Supporting Storage Foundation and VCS on Unix and Windows as well as NetBackup on Unix and Windows