
Long duration optimized duplication job when using OST plugin

Created: 14 Mar 2012 • Updated: 14 Mar 2012 | 13 comments

Question for anyone who's listening...

We have implemented two OST-aware deduplication devices on Backup Exec 2010 R3 and are running into unusually long duplication job times between the appliances.  There is no network activity on the media server or on the dedup appliance during the jobs.  There is also minimal processor and disk activity, so it appears as though nothing is happening.  However, at ten-minute increments (almost on the nose) the byte count for the job increases and then sits again for 10 minutes.  There appears to be no relation between media set file size and increment duration; a 10GB media set takes as long as an 800GB media set.  It is as though the job is waiting for a file timeout before moving on to the next media set in the backup.  We usually have multiple jobs running concurrently, but on dup jobs only one job will increment at each 10-minute interval, and any active backup jobs cease until the dup jobs complete.  Add these symptoms together and our "optimized" duplication jobs take hours to complete and almost always push off our incremental\differential backup schedules.

Has anyone seen this behavior, or does anyone have an idea how to fix it?  Since no data is actually being copied between the devices, just pointers and catalogs of the new location created, I would think these jobs should run in a few minutes, not step on each other and take hours to complete.

Thanks for any help!

13 Comments

pkh's picture

If I am not mistaken, when you dup from one OST device to another OST device, you are not doing optimised duplication.  Hence you actually rehydrate the backup and then dedup into the other OST device, and this will result in a long run time.

When you dup between two dedup folders, you are doing optimised duplication, which means that no rehydration takes place and only the changed data blocks are transmitted over the link.

teiva-boy's picture

pkh, read the HCL again.  You are incorrect.

However, when it comes to GRT, I believe that may be the case for any GRT-related jobs.  As a workaround, you back up to a B2D on the device and let the device replicate on its own.  That's what you do for Exagrid and Data Domain.

Which OST device are we talking about?  There are five, each with its own nuances.

I've seen that with Data Domain you have to upgrade to the latest DDOS and OST plugin.

There is an online portal; save yourself the long hold times. Create a ticket online, then call in with the ticket # in hand :-) http://mysupport.symantec.com "We backup data to restore, we don't backup data just to back it up."

pkh's picture

The user said, "... unusually long duplication job times between the appliances" which means that he is not using replication between the devices, but doing a duplication, so I believe my answer holds.

teiva-boy's picture

Duplication with supported OST devices is "Optimized Duplication," a.k.a. replication.

Backup Exec with the OST plugin initiates replication from within the appliances.  In turn, the supported appliances can report back to Backup Exec that the replication task has completed so that it can update the catalogs.

Marketing terms and real-world terms are often at odds.  Duplication, in the context of OST and optimised duplication, is replication at its base.


Jonosparks's picture

That was a mistake in my wording.  We are doing replication between the devices, so no data rehydration is occurring.  The dup job is simply creating a secondary set of catalogs with a different retention policy.

Sorry for the confusion.

dedupe-works's picture

What devices do you have?

I have experience with Exagrid devices, more so than with Quantum or Data Domain.

Regards.

Randey

Jonosparks's picture

To fill in some of the gaps...

We have two HP D2D4112s running the latest OS and the latest OST plugin for the devices.  Replication is enabled between the two appliances, so only index changes are replicated between the two devices.  Auto replication takes a minute or two for several-hundred-GB datasets (gigabit fiber links two of our main locations together).  Duplication jobs from the previous night's fulls run during business hours, when we only have Exchange\SQL log backups running.  Each device is divided into two storage locations so that backups are reasonably grouped by type (Exchange\SQL on one and VM backups on the other); this helps optimize the deduplication ratio.  Each storage location can handle 6 concurrent jobs, so 12 per device.  This number is NEVER exceeded during operations; jobs are spread throughout the week to prevent this.

My original thought was that there would be significant disk activity while new catalogs were created on the media server, and some base network traffic to either device indicating that the media headers were being read.  However, this isn't the case: no significant disk, CPU, or network traffic on either the D2D appliances or the media server.  I ran a test with ~100GB of data to test a theory.  One job had a single 100GB media set and a second job had 3 media sets of ~35GB apiece.  The job with a single media set took just over 12 minutes to complete (no other backup or dup jobs were running at the time); the second job with 3 media sets took ~35 minutes to complete.  This led me to my "10 minutes per media set, with a little overhead for media loading and unloading" theory.  Now, if multiple jobs are running at the same time, only a single media set will be processed at a time.  So three jobs with three media sets each will take 30 minutes for the first, 1 hour for the second, and 1.5 hours for the third, instead of all three jobs processing concurrently and finishing in 30 minutes total (see the quick model below).  Also, if any backup jobs are running, all data transfers grind to a halt except during the gaps between media loads and unloads for dup jobs.
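To put numbers on that theory, here's a quick back-of-the-envelope model in Python.  It's just my illustration of the pattern; the 10-minute figure is the observed gap between byte-count increments, not anything documented for Backup Exec or the OST plugin.

# Toy model of the observed behavior (my illustration, not Backup Exec code).
WAIT_PER_SET_MIN = 10  # observed gap between byte-count increments

def serialized_minutes(sets_per_job):
    # What we actually see: media sets from all jobs queue up one at a time.
    return sum(sets_per_job) * WAIT_PER_SET_MIN

def concurrent_minutes(sets_per_job):
    # What we would expect: each job works through its own sets in parallel.
    return max(sets_per_job) * WAIT_PER_SET_MIN

jobs = [3, 3, 3]  # three dup jobs, three media sets each
print(serialized_minutes(jobs))  # 90 -- the last job finishes 1.5 hours in
print(concurrent_minutes(jobs))  # 30 -- all jobs done in half an hour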

That led me to two questions: a) is the single processed media set a by-product of the OST plugin, and b) why is there a ten-minute waiting period between media sets within a dup job when no activity is occurring?  My thought is that there is a timeout period written into the DB or registry, and the OST plugin isn't receiving a completed command, so it waits to hit the timeout before moving on.

Is this a correct assessment?  Is there a way to change that timeout?  Or am I just daft and completely missing something?

Thanks!

dedupe-works's picture

It's possible each set is being built on an individual piece of media on the target.

I know we do that by default during opt-dupe jobs between folders.

See if this regKey helps:

LOCATION: HKEY_LOCAL_MACHINE\SOFTWARE\Symantec\Backup Exec For Windows\Backup Exec\Engine\Misc
KEY TYPE: DWORD
NAME: Opt-dupe One Set Per Destination Image
VALUE: 0
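
If you'd rather script the change than edit by hand, here's a minimal Python sketch using the standard winreg module.  The path and value name are exactly the ones listed above; the rest is just my assumption about how you'd apply it.  Run it elevated, and back up the registry first.

# Sketch: inspect and set the key listed above via Python's winreg module.
# Run from an elevated prompt; back up the registry before changing anything.
import winreg

KEY_PATH = r"SOFTWARE\Symantec\Backup Exec For Windows\Backup Exec\Engine\Misc"
VALUE_NAME = "Opt-dupe One Set Per Destination Image"

with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                        winreg.KEY_READ | winreg.KEY_WRITE) as key:
    try:
        current, _ = winreg.QueryValueEx(key, VALUE_NAME)
        print("current value:", current)
    except FileNotFoundError:
        print("value not present yet")
    winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, 0)  # 0 per the post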

Regards.

Randey

Jonosparks's picture

Thanks for the info, Randey!  Before I go making registry changes, could you give me a little more background on what this key change will do?

Thanks!

dedupe-works's picture

Instead of putting each backup set on separate media and making a catalog for each, this puts the backup sets onto as few media as possible.

Regards.

Randey

Jonosparks's picture

It appears to already be set to zero...does it need to be set to one (1)?

Thanks for the help!

teiva-boy's picture

Interesting that the last HCL I looked at doesn't have HP listed as an OST partner, though I know HP has advertised it as such for more than a year, since the D2Ds were announced.

You may want to open a case with Symantec.


Jonosparks's picture

Apparently, there is a bit of a miscommunication between HP and Symantec.  When Symantec went to R2, they changed most of the code base and HP's OST plugin stopped working for Backup Exec (NetBackup still functions).  Now BE 2010 is into R3 and I believe HP is still working on a final version of the plugin.  I am following up with our HP VAR and HP sales engineer to see if I can get some clarity on the problem.  In the meantime, I've spaced the dup jobs further throughout the week so as not to load too many at one time.  It isn't ideal because there are now a few days between catalog duplications, but it could be worse.  Hopefully there will be some updates from HP soon.

Thanks for the assistance!