Ratios Explanation

Created: 19 Jan 2010 • Updated: 04 Aug 2010 | 8 comments

Can someone explain the deduplication ratios?  I don't see anything in the Admin Guide about this, and I'm curious how these numbers are generated.

Aidan Finley's picture

Sure.  The ratio is a way to express how much data is stored on disk after deduplication calculations are run.

If you take a ratio like 10:1, you can read it like this: for every 10 units of data sent to the media server, 1 of those units is actually stored to disk.

This can be expressed as a percentage as well.  For that 10:1 ratio above, the formula is (1 - 1/10) * 100 to get a value in percent.  So, 10:1 deduplication equates to a 90% reduction in data stored to disk when compared to the original size of the data.

You may have seen this same expression for Compression (Winzip, etc.) in other products.  Typical file compression for Backup Exec's compressed backups offers about 2:1 compression - that is, a 50% reduction in the amount of data stored, compared to the original size of the data.
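To make the arithmetic concrete, here is a small Python sketch of the conversion described above. The function names (parse_ratio, reduction_percent) are made up for illustration and are not part of Backup Exec; the sketch only assumes that a ratio such as "10:1" means data sent versus data stored.

```python
def parse_ratio(text: str) -> float:
    """Turn a string such as '10:1' or '4971.8:1' into a single number (sent / stored)."""
    sent, stored = text.split(":")
    return float(sent) / float(stored)


def reduction_percent(ratio: float) -> float:
    """Percent reduction in stored data for a ratio of `ratio`:1, i.e. (1 - 1/ratio) * 100."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1 - 1 / ratio) * 100


if __name__ == "__main__":
    for label in ("10:1", "2:1", "1.0:1"):
        pct = reduction_percent(parse_ratio(label))
        print(f"{label} -> {pct:.1f}% reduction")
```

With these definitions, 10:1 works out to about a 90% reduction, 2:1 to about 50%, and 1.0:1 to no reduction at all.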

Thanks,

Aidan Finley
Sr. Product Manager, Backup Exec

  

C2 Computers's picture

Thank you for the response.  When we back up the same data in our testing, the ratios seem to change; an example is below.  I assume the 1:1 goes to 4971.8:1, then doubles at the next job (almost exactly doubles), and then drops back down because some files were changed?  It just seems like it should not have dropped that much.

Name | Device Name | Job Type | Job Status | Percent Complete | Start Time | End Time | Elapsed Time | Byte Count | Job Rate | Error Code | Deduplication Ratio
Backup 00002 | Deduplication Storage Folder 1:2 | Backup | Successful | 100% | 1/15/2010 2:43:41 PM | 1/15/2010 2:46:40 PM | 0:02:59 | 4,019,996,218 | 4,035.00 MB/min | | 3577.1:1
Backup 00002 | Deduplication Storage Folder 1:1 | Backup | Successful | 100% | 1/6/2010 3:26:22 PM | 1/6/2010 3:29:46 PM | 0:03:24 | 4,019,972,075 | 4,035.00 MB/min | | 10019.7:1
Backup 00002 | Deduplication Storage Folder 1:2 | Backup | Successful | 100% | 1/6/2010 3:20:12 PM | 1/6/2010 3:22:49 PM | 0:02:37 | 4,019,972,075 | 4,107.00 MB/min | | 4971.8:1
Backup 00002 | Deduplication Storage Folder 1:1 | Backup | Successful | 100% | 1/6/2010 3:14:57 PM | 1/6/2010 3:19:20 PM | 0:04:23 | 4,019,972,075 | 1,575.00 MB/min | | 1.0:1

Aidan Finley's picture

That's very strange.  The first backup to that dedupe storage location should be along the lines of the 1.0:1 ratio, not the third backup of 4.

(FYI - Dedupe ratios can jump around from 4971:1 to 3577:1 and back again; when we are talking about ratios that high, even a small bit of new data - say, for example, something in the registry changed or the system state of the machine was altered ever so slightly - can have a big effect on the ratio.)
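A rough Python sketch shows why ratios this high are so sensitive. It assumes the reported per-job ratio is simply bytes sent divided by new bytes stored, which matches the definition given earlier in the thread but is not confirmed as the exact calculation Backup Exec performs. The helper name implied_stored_bytes is made up for illustration; the byte counts and ratios are the ones from the job list above.

```python
def implied_stored_bytes(byte_count: int, ratio: float) -> float:
    """Bytes actually written to disk if ratio = bytes sent / bytes stored (an assumption)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return byte_count / ratio


# (byte count, deduplication ratio) taken from the job list in this thread, oldest first
jobs = [
    (4_019_972_075, 1.0),
    (4_019_972_075, 4971.8),
    (4_019_972_075, 10019.7),
    (4_019_996_218, 3577.1),
]

if __name__ == "__main__":
    for byte_count, ratio in jobs:
        stored = implied_stored_bytes(byte_count, ratio)
        print(f"{ratio:>8.1f}:1 -> about {stored / 1024:,.0f} KiB stored")
```

Under that assumption, each of the three high-ratio jobs stored well under 2 MB out of roughly 4 GB sent, so a difference of a few hundred kilobytes of changed data is enough to move the ratio between 3577:1 and 10019:1.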

I'm interested in the device names in the job list there.  You have Dedupe Storage Folder 1:1 and 1:2; are these unique deduplication storage folders on the SAME BE media server, or are these folders on different BE media servers?  We only allow 1 dedupe storage folder per physical media server, so if you have more than 1 per media server, then we have other issues here.

Are you by chance running in a CASO environment with multiple media servers, where each media server has deduplicated storage folders?

Thanks,

Aidan Finley
Sr. Product Manager, Backup Exec

C2 Computers's picture

Thanks for the follow-up.  I see what you mean about the numbers jumping around; I think we see some of that.  As far as storage, this is just an HP DL360 G5 with some local storage (SAS drives), no external storage.  The deduplication folder is located on the local drive.

Not sure where the 1:2 came from; I think we only have one folder that I know of.  This is a single install, no multiple media servers.  I put an image in below in case it helps.  I think we may also erase all the data and start over to be sure we didn't do something wrong.

1.png

C2 Computers's picture

Update:

I ran some new jobs (different servers, new data, etc.) and the ratios looked good this time.  As far as the folders 1:1 and 1:2, I'm not sure; it is still doing this, so if you have any ideas let me know, or else I will start looking into that some more.  Thanks

Aidan Finley's picture

OK, thanks C2.  I'll still pursue the issue internally and see if this might be a bug you are seeing in the Beta/Early Adopter code.

Thanks,

Aidan

KentM's picture

Aidan,

On a related question: after all the deduplication is done and we're down to that single unit of data that is actually stored one time, is that unit additionally compressed within the de-dupe folder, or is it stored fully intact/uncompressed?  I'm just curious whether, beyond the deduplication "compression," additional traditional compression is also (still?) being used.

Thanks,
~Kent