
BackupExec 2012 Client Side Deduplication

Created: 04 Dec 2012 | 13 comments

Hello All, 

Good day to you.

I am having an issue with the first backup job after seeding data to deduplication storage in BackupExec 2012.

The job has been running for more than 48 hours and is still running as of now. The size of the data is 810 GB.

Here is the process I have followed, as per Symantec's documentation, to seed the data:

1. Perform a full backup of the remote server (BE2010 RAWS installed) using the existing BackupExec 2010 Media Server to a D2D folder on a portable hard disk drive. It completed successfully and verified OK.

2. The portable hard disk drive containing the D2D folder is dispatched to the central site where the BackupExec 2012 server with deduplication is located.

3. First, the D2D storage is imported into the BackupExec 2012 Media Server, and then the Inventory and Catalogue job is run successfully.

4. Then I duplicated the existing backup set on the D2D folder to the Deduplication Storage on the BackupExec 2012 Media Server successfully, and it verified OK. In this regard, seeding was successful.

5. Uninstall BackupExec 2010 RAWS from remote server and reboot it. Deploy BackupExec 2012 RAWS to remote server.

6. Configure a FULL backup job and check following settings are OK

                 - Check Remote Server has Direct Access to Deduplication Storage (OK)

                 - Check Client Side Deduplication is enabled on Deduplication Storage (OK)

                 - Check Enable Remote Agent to access Deduplication Storage on Backup Job is checked (OK)

7. I then started the job; it has taken 48 hours so far and it is still running.

Do you guys have similar experience or did I do something wrong in the process?

Thanks in advance.


RLeon's picture

Seems like you've followed similar steps as described in the above TechNote. The only thing you did differently, which is not covered in any documents, is that RAWS version upgrade you did in one of your steps.

But then again, I don't think I have ever seen anyone on the BE forum claim that the dedup seeding procedure worked flawlessly in preparation for the (expected) high-speed client-side dedup backups coming over the slow WAN.

Please refer to these related discussions and all the links they referenced:

It would appear that the general consensus between the experts is that the surest way to get this whole thing to work is to:
  1. Ship a BE server with the dedup storage to the remote client site, connect it to the client's LAN.
  2. Back up the client to the BE server's dedup storage. (This should be fast because the BE server is now connected locally to the client's network.)
  3. Ship that BE server back to the main site.
  4. (Optional) Back at the main site, Optimized-duplicate the backup from the returned BE server to other BE servers that also have dedup storage, as you see fit.

Note: The "travelling" BE server could be a 3600R2 appliance.

This whole thing could be made less painful if you are running CASO with multiple dedicated BE servers at the main site. In this case, you simply use one of those servers for this task (Not the CASO server). This would give you the least disruption at the main site.
Another easy way, as suggested in the thread, would be to use an evaluation BE server installed on a laptop for this task. (you might still need to get an eval dedup key file, can't remember...)

pkh's picture

The poor performance could be due to the WAN link or some other factors, and not because the seeding is done wrongly.

TekkieDude's picture

Hello Guys,

Thanks for the advice, and trust me, I have gone through it a thousand times. Right now I am running out of ideas.

Symantec Support said it's because of the RAWS version on the server, so I made the first full backup of the remote server with BE2012 RAWS, shipped the USB drive over to the central site and did the seeding again. Then I ran the backup, and it's the same behavior.

Last week, we were told to perform the first full backup on the remote server using "Client Side Dedupe". I obliged, tried it and sent the USB drive over to the central site again. Then we found out that BE2012 supports only one Dedupe Storage per Media Server at the central site, which means I could not duplicate the backup set from the Dedupe Store located on the USB disk. How brilliant!!!

By the way, the scope of the project is 25 geographically dispersed sites (from South Asia to Pacific Ocean countries), and it is logistically a nightmare to ship this server to those sites one by one.

Now support is telling us it is kind of expected behavior and that the WAN could be contributing, and so on and so forth. [Honestly, it is the expected behavior of Symantec Support: put you on hold for 2 hours, route your call from India to the Philippines, tell you to run a few tools and then tell you again what you already know.] Fine, I can accept that, but I need good documentation to predict the behavior. Basically, there should be documentation explaining PureDisk's behavior on:

  • How it chunks the data (128 KB??)
  • How it fingerprints the chunks
  • How it checks against the fingerprint database (what's the payload of the fingerprints sent back and forth between RAWS and the Media Server over the WAN)
  • Where the fingerprint database is located

If one knows this information, plus CPU cycles, RAM usage, the payload of network traffic for the fingerprint check on X amount of data, and the WAN connectivity (latency, bandwidth, QoS, etc.), it's fairly easy to work out how long a backup will take.

Right now I am lost. I will call Symantec to see if I can switch to another product.
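As a rough illustration of that kind of estimate, here is a minimal sketch that counts chunks and fingerprint bytes for one full pass over the data. The 128 KB chunk size is only the guess from the list above, and the 32-byte fingerprint size is a hypothetical value (the size of a SHA-256 digest); neither is a confirmed PureDisk figure, and real protocol overhead is not modelled.

```python
def fingerprint_traffic(data_bytes, chunk_bytes, fingerprint_bytes):
    """Return (chunk count, total fingerprint bytes) for one pass over the data."""
    # Ceiling division: a trailing partial chunk still needs a fingerprint.
    chunks = -(-data_bytes // chunk_bytes)
    return chunks, chunks * fingerprint_bytes


DATA_BYTES = 810 * 10**9      # 810 GB, decimal units
CHUNK_BYTES = 128 * 1024      # ASSUMPTION: the 128 KB guess from the post
FINGERPRINT_BYTES = 32        # ASSUMPTION: hypothetical digest size

chunks, payload = fingerprint_traffic(DATA_BYTES, CHUNK_BYTES, FINGERPRINT_BYTES)
print(f"{chunks:,} chunks, {payload / 10**6:.0f} MB of fingerprints")
```

Under those assumptions the fingerprints alone come to roughly 6.2 million entries and about 198 MB, before any per-request overhead or round-trip latency is counted.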

pkh's picture

810GB is a lot of data to transfer over a WAN link.  Using the bandwidth of your link and your backup window, work out how much data can be transferred over the link during the window and compare this with 810GB.  This will tell you the dedup ratio you would need.  If it is not a reasonable ratio, then it is your link which is the bottleneck.
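That comparison can be sketched in a few lines. The link speed and backup window below are hypothetical placeholders, not values from this thread, and the calculation assumes the link is fully available to the backup.

```python
def transferable_bytes(link_mbps, window_hours):
    """Bytes that fit through the link during the backup window."""
    return link_mbps * 1_000_000 / 8 * window_hours * 3600


def required_reduction(data_bytes, link_mbps, window_hours):
    """Fraction of the data that dedup must eliminate for the job to fit."""
    capacity = transferable_bytes(link_mbps, window_hours)
    return max(0.0, 1.0 - capacity / data_bytes)


# ASSUMPTION: a 10 Mbit/s link and an 8-hour window, purely for illustration.
needed = required_reduction(810 * 10**9, link_mbps=10, window_hours=8)
print(f"dedup must remove {needed:.1%} of the data")
```

If the printed figure is far above what dedup can plausibly deliver for the data set, the link is the bottleneck regardless of how the seeding was done.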

TekkieDude's picture

Yeah, I know 810 GB is a lot. If the answer is that the WAN is the bottleneck, what is the point of using dedupe? We knew in the first place that the WAN is the issue. Still, my questions above remain unanswered.

When we talk about technical matters, we need numbers and calculations, not guesswork.

pkh's picture

What I am saying is that it may be the case where even with dedup, the WAN link is still a bottleneck.  Suppose your dedup ratio is 50%, this means that you still need to transfer 405GB over the WAN link and this is going to take a long time.  You may never achieve this dedup ratio.  Even if you do, it would not happen overnight.  You would need quite a few backups to improve the dedup ratio.  So what is the point of knowing the answers to your questions, when what you are trying to achieve is impossible?
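To put a number on "a long time": at a 50% dedup ratio, half of the 810 GB still has to cross the link. A minimal sketch, assuming a hypothetical 10 Mbit/s link that the backup can use in full:

```python
def hours_to_send(data_bytes, link_mbps):
    """Hours needed to push data_bytes through a link of link_mbps megabits/s."""
    seconds = data_bytes * 8 / (link_mbps * 1_000_000)
    return seconds / 3600


remaining = 810 * 10**9 * 0.5          # 50% dedup ratio leaves 405 GB
# ASSUMPTION: 10 Mbit/s is an illustrative link speed, not from the thread.
print(f"{hours_to_send(remaining, link_mbps=10):.0f} hours")
```

With those inputs the transfer alone takes about 90 hours, so the answer depends almost entirely on the real link speed.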

TekkieDude's picture

OK, I reiterate: the WAN is the bottleneck and I am completely aware of it. I do not expect dedupe to solve the issue overnight. And I am aware of it. That's the normal Symantec response. And it is brilliant as always.

What I am trying to find out is very simple. It is about PureDisk and its behavior on:

  • How it chunks the data (128 KB??)
  • How it fingerprints the chunks
  • How it checks against the fingerprint database (what's the payload of the fingerprints sent back and forth between RAWS and the Media Server over the WAN)
  • Where the fingerprint database is located

We can plug in the parameters of the WAN link and figure out how long the first backup will take. If Symantec cannot even answer these questions, why are they touting and selling the product in the first place?

What I am seeing right now is a lot of beating around the bush.

TekkieDude's picture

By the way, if you have never tried it yourself, please don't pretend you know. I am just trying to get information from someone who has already tried it, and I want to know about their experience.

pkh's picture

Did I ever say that I know the answers to your Puredisk questions?  So where is the pretense?  What I am trying to tell you is not to be fixated with getting these answers.  They are not going to solve your problem.  These are overheads which will only add to your bandwidth and hardware demands.  A quick calculation on the amount of data that needs to be shipped across the WAN link will tell you whether your project is feasible or not.  By being fixated with getting the answers to your Puredisk questions and trying unnecessary things when your bandwidth is an unsurmountable obstacle, you are wasting time when you should be moving on with your project.

If you do get the answers that you are looking for and manage to get your project going despite its high bandwidth requirement, do share with us.

teiva-boy's picture

NBU and PureDisk have been certified for WAN-based client backups for a couple of years now. BE2010 was not, and I didn't think the official stance changed with 2012, due to the lack of a client-side hash cache table. Technically PureDisk was the first to have the cache; NBU used client retry and checkpoint restarts.

It's this lack of a client cache that makes it unsuitable for WAN backups. Are you sure this is even supported? Frankly, Avamar is the best for WAN-based backups, but we're not even in the same league at that point. Honda vs Ferrari....

There is an online portal, save yourself the long hold times. Create a ticket online, then call in with the ticket # in hand :-)

"We backup data to restore, we don't backup data just to back it up."

TekkieDude's picture

@ Teiva-boy,

Your answer is spot on. I just got some feedback today, and here is how it goes:

  • The first backup over the WAN will be long (even though seeding is done). This is because the client will chunk, hash and fingerprint the data and register it in the database. This also means there's no client-side hash cache and fingerprints are sent across the WAN. I really appreciate this honest answer I got today from Symantec.
  • Subsequent backups will be faster because the client will load the fingerprints from the dedupe pool of the Media Server; fingerprint calculation and comparison are done on the client, and only unique data segments are sent over the WAN.

But my technical contact remains confident that it is supported and that it will be a working solution. Right now, I am doing my proof of concept to demonstrate this.
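The flow in those two points can be illustrated with a generic client-side dedup sketch. This is not PureDisk's actual implementation: fixed-size chunks, SHA-256 fingerprints and an in-memory set standing in for the Media Server's fingerprint database are all assumptions made for the example.

```python
import hashlib


def split_chunks(data: bytes, size: int):
    """Yield fixed-size chunks (the last one may be shorter)."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def client_side_backup(data: bytes, known_fingerprints: set, size: int = 128 * 1024):
    """Return the chunks that must cross the WAN and record their fingerprints.

    known_fingerprints is a hypothetical stand-in for the server-side
    fingerprint database; it is updated in place.
    """
    to_send = []
    for piece in split_chunks(data, size):
        fingerprint = hashlib.sha256(piece).digest()
        if fingerprint not in known_fingerprints:
            known_fingerprints.add(fingerprint)
            to_send.append(piece)  # only unseen chunks are transferred
    return to_send


store = set()
data = b"aaaabbbbaaaacccc"
first = client_side_backup(data, store, size=4)   # unique chunks are sent
second = client_side_backup(data, store, size=4)  # nothing new to send
print(len(first), len(second))
```

Note that even when nothing is sent, every chunk is still read and hashed on the client, and the fingerprints still have to be compared against the server's database, which is the cost described for the first backup after seeding.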

Yes I bought a HONDA.

Will update you guys when I get conclusive results.

Vlad Velciu's picture

I think I am having an issue similar to yours, but both the B2D storage and the dedupe storage are locally attached. 1.5 TB of data took 37 hours.

The "performance" of 37 hours was due to the fact that it takes a long time before the data transfer begins. The backup I was duplicating was 1.5 TB in 2 VMDKs, one of 45 GB and one of 1.45 TB. The job started with a short transfer of 80 KB, then waited about 20-30 minutes, continued by transferring the 45 GB of data, then waited 12 hours and continued transferring the remaining 1.45 TB. The actual data transfer ran at 2,250 MB/min. After that it started verifying.

In your case, being on a WAN connection, you add extra time for the transfer, but the main culprit seems to be the process that takes place before the data is actually sent.
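A quick check of those figures, as a sketch: it assumes the quoted rate means 2,250 MB per minute, that the total is 1.5 TB in decimal units, and that the rate held for the whole transfer.

```python
def transfer_hours(data_mb, rate_mb_per_min):
    """Hours of pure data transfer at a constant rate."""
    return data_mb / rate_mb_per_min / 60


# ASSUMPTION: 1.5 TB = 1,500,000 MB and a steady 2,250 MB/min.
pure_transfer = transfer_hours(1_500_000, 2250)
print(f"{pure_transfer:.1f} hours of actual transfer")
```

That comes to roughly 11 hours of actual transfer out of the 37, which is consistent with the waits before and between transfers accounting for a large share of the job.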

You can find my post here: