Understanding client-side dedupe and direct access

Created: 21 Dec 2011 | 6 comments

Folks

I am trying to understand how client-side dedupe is implemented and what its prerequisites are.

For background: I have an OST plugin that supports both client-side dedupe and appliance dedupe. We call these modes optimized and pass-through, depending on where the dedupe occurs. In a normal scenario we install the plugin on the media server, configure the storage server and OST device, and switch backups between optimized and pass-through by changing a setting in a conf file.

BE offers three options for dedupe: Deduplication storage folder, OpenStorage, and Remote Agent deduplication. To make my appliance OST compliant, I need to know which of these can be used.

From the admin guide I read this:-

 

Client-side deduplication enables a remote computer that is configured as a RemoteAgent for Deduplication to send data directly to an OpenStorage device or a deduplication storage folder. The remote agent is configured with direct access.
 
Q1) Can I install the OST plugin and the remote agent on the client, point it to a media server on a different host, and run backups of the client? The plugin on the client side would dedupe the data it sources and send only the unique data directly to the disk storage (appliance). The appliance runs the OST server component.
 
Does it work this way?
 
Q2) Or, in the above case, is the deduped data sent to the media server first and then to the appliance?
Q3) Or is the external OST plugin not required at all, and BE implements client-side dedupe in its own way?
 
Q4) Both media server and client-side dedupe are software dedupe. Is there any difference between them, other than that they happen on different servers and that each has its own advantages and disadvantages?
Internally, do both operate with the same design or algorithm?
 
 
Thanks in advance for the comments.
 
regards

waymoon's picture

 

Q1) Can I install the OST plugin and the remote agent on the client, point it to a media server on a different host, and run backups of the client? The plugin on the client side would dedupe the data it sources and send only the unique data directly to the disk storage (appliance). The appliance runs the OST server component.
 
Does it work this way?
 
 
Yes, that works. Client-side dedupe can be implemented as an OST plugin from the appliance vendor. The plugin lets you bypass the media server and write optimized, deduplicated data to the storage appliance.
 
Q2) Or, in the above case, is the deduped data sent to the media server first and then to the appliance?
 
Sent to appliance directly.
 
Q3) Or is the external OST plugin not required at all, and BE implements client-side dedupe in its own way?
Yes, BE has its own implementation. Dedupe in BE falls into three main categories:

The first is client-side, where data is deduplicated at the source and sent to the media server in deduplicated form. The second is media server, where data is deduplicated in-line, as it arrives at the media server. The third is appliance dedupe, where the entire dedupe operation is done by the third-party hardware using the OST plugin interface.

When an appliance vendor supports its own plugin-based client-side dedupe, that supersedes the native implementation.

 

 

Q4) Both media server and client-side dedupe are software dedupe. Is there any difference between them, other than that they happen on different servers and that each has its own advantages and disadvantages?
Internally, do both operate with the same design or algorithm?
 
I am not sure how they are designed internally, but the two are different. With client-side dedupe, the client does the deduplication and sends the optimized data to the media server to be stored in the storage folder. With media server deduplication, the media server carries the overhead of running the dedupe algorithm, which of course consumes a good chunk of CPU and memory.
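
To make the idea of "dedupe at the source, send only unique data" concrete, here is a minimal sketch. It is not Backup Exec's or any OST vendor's actual algorithm: the fixed-size chunking, the SHA-256 fingerprints, and the names `ChunkStore`, `backup` and `restore` are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, so the demo data stays readable


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size chunks (real products often use variable-size chunks)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ChunkStore:
    """Hypothetical stand-in for the dedupe target (storage folder or appliance)."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def has(self, fingerprint: str) -> bool:
        return fingerprint in self.chunks

    def put(self, fingerprint: str, chunk: bytes) -> None:
        self.chunks[fingerprint] = chunk


def backup(data: bytes, store: ChunkStore):
    """Fingerprint each chunk on the client; transfer only chunks the store lacks.

    Returns the recipe (ordered fingerprints) and the number of bytes actually sent.
    """
    recipe = []
    bytes_sent = 0
    for chunk in split_chunks(data):
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if not store.has(fingerprint):
            store.put(fingerprint, chunk)  # only unique data crosses the wire
            bytes_sent += len(chunk)
        recipe.append(fingerprint)
    return recipe, bytes_sent


def restore(recipe, store: ChunkStore) -> bytes:
    """Rebuild the original data from the ordered fingerprints."""
    return b"".join(store.chunks[fp] for fp in recipe)


store = ChunkStore()
data = b"AAAABBBBAAAACCCC"
recipe, sent = backup(data, store)
print(sent, "of", len(data), "bytes sent")
```

In this sketch the hashing work happens wherever `backup` runs. Running it on the client saves bandwidth but costs client CPU; running the same loop on the media server moves that CPU cost there, which is the trade-off described above.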
teiva-boy's picture

Q4)

Client dedupe and media server dedupe use the same technology and algorithms.  You can manage each via the PDCONF file.

Both consume CPU and resources. The question comes down to whether to spread the CPU load (client-side) or let a beefy media server do it on the back end.

The only benefit at this point for client-side dedupe from Symantec is less bandwidth used.  I've yet to encounter shorter backups!  So misleading...

OST has been faster in almost every way with some vendors, DataDomain being the best, though it is the premium product. But hey, when it works as advertised, my customers don't complain. Under budget constraints, though, Symantec is acceptable.

I know of a Direct Access configured BE/DD customer getting well over 4000 MB/min on all of their clients, consistently! OST is the killer app for BE/NBU.
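
For a sense of scale, the 4000 MB/min figure can be converted with a couple of lines of arithmetic. This is only a unit-conversion sketch; it assumes decimal units (1 GB = 1000 MB), and the function names are made up for illustration.

```python
def mb_per_min_to_mb_per_sec(rate_mb_per_min: float) -> float:
    """Convert a backup job rate from MB/min to MB/s."""
    return rate_mb_per_min / 60


def minutes_to_move(size_gb: float, rate_mb_per_min: float) -> float:
    """Minutes needed to move size_gb at the given rate, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / rate_mb_per_min


print(round(mb_per_min_to_mb_per_sec(4000), 1), "MB/s")
print(minutes_to_move(1000, 4000), "minutes for 1000 GB")
```

At that rate a client moves roughly 67 MB/s, so 1000 GB would take about 250 minutes.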

There is an online portal, save yourself the long hold times. Create ticket online, then call in with ticket # in hand :-) http://mysupport.symantec.com "We backup data to restore, we don't backup data just to back it up."

teiva-boy's picture

Not sure what OST appliance you are using, but with DataDomain's BoOST plugin, you can install that on a client.  Enable Direct Access, turn on client side dedupe for that client...

At this point you can back up directly to the appliance, without going through the media server. Granted, there are a few other caveats... But in a nutshell,

The DD OST/BOOST plug-in does some dedupe on that host, offloading it from the DataDomain.  

 

I think Quantum also has this in their new DXi line, but I have yet to try it or see it implemented anywhere.  Plus it's extremely new so it may not be mature yet.

waymoon's picture

Yes, offloading dedupe from the appliance to the client has now been adopted by most of the vendors. Quantum has it in its latest release and calls it hybrid dedupe.

I am actually testing a new plugin from Dell, who will soon join the bandwagon.

teiva-boy's picture

I wouldn't say most... Just Quantum and DD. HP's D2D and Exagrid haven't, nor has GreenBytes. I think that's it?

For Dell, I take it you're talking Ocarina?  How does that work in comparison to traditional appliance products?  Any major differences in the technology?

I know they hyped being able to dedupe images and pre-compressed files. But no one else has been able to do that, so I feel that is marketing hype more than anything... But if true, I'm sure I could think of a lot of AutoCAD customers and graphic houses that I work with that could use something like that.

The one downside to other vendors "joining the bandwagon" is that Symantec's testing labs and certification often take forever... Often the vendor says "yeah, we have OST," but it's months before Symantec updates their HCL to reflect it, sometimes upwards of 6 months. This was the case with HP's D2D.

waymoon's picture

Yes, things look brighter, not elaborating due to business ethics.

Do you know if there is any Symantec certification for OST for Backup Exec? If yes, can you share the details?

 

Thanks