Backup and Recovery Community Blog

Gary's Dedupe Experiences: Update with Stats

Created: 18 Mar 2011 • Updated: 27 Mar 2011 • 6 comments
Posted by macpiano

I have decided to start a blog for my experiences with dedupe.

My setup right now is two dedupe servers in two buildings connected by a 100MB WAN. One building has a file server with 2.2 TB of data and several other smaller servers. Each dedupe server hosts a dedupe folder in its own building, with about 4 TB available.

I have a tape drive attached to both, but the one I mainly want to use is in the second building.

My findings so far, at least as I see them:

1. Live Update. That was the first hurdle. I had to go to the Control Panel, open Live Update, and manually add the proxy exception; using the IE settings did not work. I also changed the cache size to 100 MB just to be sure.

2. The Remote Agent has to be manually removed from each server before the new agent is installed. I was going from 2010 to R2, so your mileage may vary.

3. I am backing up to dedupe, duplicating that dedupe job to the dedupe folder in the other building, then backing up to tape in the second building. You end up with 4 copies including the original files: the originals plus 3 backups, and the backups will all show identical times etc. You have to look at a backup's properties to see where it was backed up to. SUSH... had a great reply to this at https://www-secure.symantec.com/connect/forums/how-do-i-duplicate-tape-after-dedup-duplicate-another-dedupe-folder-job

4. You CANNOT do #3 unless you set up all the jobs on the MMS that actually has the tape drive attached to it. NOTE (update): I am still having trouble getting the last part to work.

5. Do not run the verify along with the backup job. Some say not to verify at all. At some point I will implement that.

6. Initially I was getting slow speeds on client-side dedupe, but after a full backup to the dedupe folder I get 2500 MB per minute on either media-side or client-side dedupe.

7. If you need to remove the dedupe folder to start over, as I did, just removing it from the devices will not do it. I uninstalled BE itself, as I am still in testing mode. Some say you can just remove the deduplication option from BE and then manually delete the dedupe folder itself.

8. Windows firewall can be problematic. It stopped my dedupe folder from working. There is a workaround for it.

9. Sometimes the dedupe folder shows in red on the devices list. If I click on a server in the list that is supposed to have it, open its properties, and close them again, the folder becomes active.

10. WAN consumption. I have a 100MB WAN. A duplicate job from one dedupe folder to the dedupe folder in the other building consumes 97% of the WAN connection the first time, while it gets the data there. Subsequent runs send just the changes. I will check this and correct it if I am wrong. The 97% figure also applies every time when going from a backed-up server in Building A directly to tape in the other building (no dedupe involved). Backing up a server (not the dedupe one) in Building A to the dedupe folder in Building B consumes 70% of the WAN. Note that our traffic is about 10-20MB per building during the day; the dedupe job just scales back and the users don't even notice.

11. Multiple jobs and NICs. I have run multiple jobs to the local server. Two different servers always get 1000 MB per minute by themselves, so I upped the dedupe folder to 6 connections. I ran both jobs and they both still got 1000 MB per minute at the same time on the one NIC. I added a third job, for the file server, and it used a second NIC and got its usual 2500 MB per minute. I checked the option to let the media server manage the NICs, and it did.

12. If I try to run a duplicate job from one dedupe folder to the other dedupe folder, that duplicate job will appear to just hang there because of item 13.

13. A duplicate job will not give you a running total of the byte count, just the total bytes that it is going to do. The job will start out very high in MB per minute, but the rate keeps going down as the job completes.

14. My first full duplicate of 2.2 TB from one server's dedupe folder to another one's dedupe folder in another building (100MB WAN connection) took 40 hours. The second one this week took 3 hours and 14 minutes, only jammed the router for part of the time, and ran at almost 17,000 MB per minute. The dedupe ratio was 50:1. Fantastic, and it finally worked as I had hoped.
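For anyone who wants to sanity-check the numbers in item 14, here is a rough bit of arithmetic in Python. It is only a sketch under assumptions that are not in the post: decimal units (1 TB = 1,000,000 MB), and the 50:1 ratio applying to the whole 2.2 TB. The function names are made up for illustration.

```python
# Rough arithmetic on the figures quoted in item 14.
# Assumption: decimal units (1 TB = 1,000,000 MB); the post does not say which it uses.
MB_PER_TB = 1_000_000


def rate_mb_per_min(size_tb: float, hours: float, minutes: float = 0.0) -> float:
    """Average rate in MB per minute for a job of size_tb that ran hours + minutes."""
    total_minutes = hours * 60 + minutes
    return size_tb * MB_PER_TB / total_minutes


def unique_data_mb(size_tb: float, dedupe_ratio: float) -> float:
    """Data left over if size_tb dedupes at dedupe_ratio:1 (assumed to apply to all of it)."""
    return size_tb * MB_PER_TB / dedupe_ratio


first_run = rate_mb_per_min(2.2, hours=40)   # first full duplicate, 40 hours
second_run_minutes = 3 * 60 + 14             # second duplicate, 3 h 14 min
unique_mb = unique_data_mb(2.2, 50)          # what a 50:1 ratio would leave

print(f"First duplicate: about {first_run:.0f} MB per minute")
print(f"Second duplicate ran for {second_run_minutes} minutes")
print(f"At 50:1, 2.2 TB leaves about {unique_mb / 1000:.0f} GB of unique data")
```

The first run works out to roughly 917 MB per minute, which is in line with the per-server rates in item 11. The much higher figure reported for the second run is a logical rate: the job is credited with the full backup set while only the changed data has to cross the WAN.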

Comments (6)

Sush... wrote:

Thanks for sharing your experience with the Deduplication Option. I just have a few queries here:

In point number 10, have you observed that the WAN is 97% utilised every time you run a duplicate from one dedupe folder to another?

My understanding is that when you use media-side dedupe (where your source dedupe folder is located), it will process the data first on the media server, and then very little data (only the difference) will be transferred over the WAN to the destination dedupe folder.

Also, I need to test point number 12. I will get back to you with more information on it after doing some testing on that point.

Thanks,

-Sush...

Hope this piece of Information Helps you... and if it does then mark this response as Solution....!!!
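To picture the "only the difference" idea described in the comment above, here is a small Python sketch of hash-based deduplication. It is a generic illustration, not how Backup Exec implements it: the fixed 4-byte chunk size, the dictionary standing in for the target dedupe folder, and the function names are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny on purpose so the example is readable (hypothetical value)


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[tuple[str, bytes]]:
    """Split data into fixed-size chunks and pair each chunk with its SHA-256 digest."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return [(hashlib.sha256(c).hexdigest(), c) for c in chunks]


def duplicate_to_target(source: bytes, target_store: dict[str, bytes]) -> int:
    """Copy source into target_store, sending only chunks the target does not hold.

    Returns the number of bytes that had to cross the (simulated) WAN.
    """
    sent = 0
    for digest, chunk in chunk_hashes(source):
        if digest not in target_store:
            target_store[digest] = chunk
            sent += len(chunk)
    return sent


target: dict[str, bytes] = {}  # stands in for the dedupe folder in the other building
first = duplicate_to_target(b"AAAABBBBCCCCDDDD", target)   # target is empty: everything is sent
second = duplicate_to_target(b"AAAABBBBCCCCEEEE", target)  # only the changed chunk is sent
print(first, second)
```

The first call has to send everything because the target starts empty, which matches the heavy WAN use on the first duplicate. The second call sends only the one chunk that changed.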

macpiano wrote:

On 10: I now believe it is the first time only, while it gets the dedupe data there. I am going to change 10 because I believe you are correct.

On 12: I think what threw me off the first few times I looked at this was item 13. The byte count does not change, and I think that was what I was looking at. I will correct that, and if my testing proves otherwise I will change it back.

Sush... wrote:

Yes. Points 12 and 13 now look correct. That looks to me like it is by design.

Thanks,

-Sush...


dedupe-works wrote:

@macpiano -

For 2, The team I work with is actively looking for these upgrade issues. Please create a support case.

For 3/4, whichever media server has the policy, if the policy contains a duplicate, the job has to run from that same server. It may be best to move the tape drive to the local server as opposed to the remote server.

For 5, Symantec will always want a verify to be performed. Later is still good, immediately after the job is not the best time in some cases.

For 7, besides removing the folder from BE there are 2 reg keys that need to be removed to clear the flag that a folder is present. A removal of the product is not needed, nor is a removal of the option.

For 8, I reported an issue with Windows firewall and the dedupe connections regarding inbound and outbound connections. That issue has been resolved in BE 2010 R2.

For 10, you should be seeding your target server before sending opt-dupe jobs across the WAN. For how to seed and other neat stuff about dedupe, check this FAQ:

www.symantec.com/docs/TECH162822

For 12, this is kind of normal, because of how dedupe works on the back end.

For 13, the temp image to be sent to the remote server is going to be built on the local server first.

Comparing the data needed to send may take some time. Included in that time frame is the COPY of the image from the local server to the remote server.

For 14, that is normal. I would expect it to take at least 3 opt-dupe jobs before you see the smallest network imprint needed to transfer the data.

Regards.

Randey

macpiano wrote:

@Randey, thanks for your comments. I wish that 3/4 were not that way because it takes away the purpose of CASO. I have a new wrinkle because I'm buying 2 SANs for those 2 buildings, and I will have replication set up between those SANs. I am also getting expanded bandwidth in the 2 buildings where I was trying to seed the backup dedupe stuff.

dedupe-works wrote:

RE: 3/4 - In BE 2010 R3, the program shouldn't let a policy be created with a different media server target than the backup, when a duplicate template is present.

If the tape device is shared with the Media Server that performed the initial backup, then it would work so long as the Catalog Mode is set to replicated. Distributed mode may work, but I haven't seen too many cases where it has in the past.

I would strongly suggest moving the tape device to the media server performing the initial backup, maintain the initial backup target, maintain the opt-dupe job to the off-site, but add a duplicate to the, now, local tape drive.

Regards.

Randey
