How to interpret the deduplication statistics in the Backup Exec Job Log
Article: TECH146827 | Created: 2010-12-22 | Updated: 2012-03-28 | Article URL: http://www.symantec.com/docs/TECH146827
The job log for a job that backs up to a Deduplication Folder contains a line like the following at the end of the Backup Set Summary:
Deduplication Stats::PDDO Status for (MEDIASERVER): scanned: 1289 KB, stream rate: 78.67 MB/sec, CR sent: 22 KB, dedup: 98.3, cache hits: 17 (73.9%)
What does this mean?
To determine the job rate of a deduplication backup job, check the statistics under the Backup Set Summary.
This information is not present if the job is a Duplicate job of a deduplicated backup.
(A) Deduplication Stats::PDDO Status for (MEDIASERVER):
The deduplication was run on the media server named MEDIASERVER.
(B) scanned: 1289 KB
The amount of data that was sent into the deduplication process (should roughly match the amount of data being backed up)
(C) stream rate: 78.67 MB/sec
The rate at which the data stream was processed
(D) CR sent: 22 KB
The amount of data that was sent to the CR (Content Router, that is, the Backup Exec Deduplication Engine).
(E) dedup: 98.3%
((B) - (D)) / (B). In this case, (1289 - 22) / 1289 = 98.3%. So 98.3% of the data did not need to be sent to the deduplication folder; only 1.7% of the data was new.
Note that in the deduplication folder statistics, deduplication is presented as a ratio such as 2:1. In that case it represents (B)/(D). If the above were the only backup in the deduplication folder, the deduplication folder statistic would be "Deduplication Ratio: 58.59 : 1" (1289 / 22 = 58.59).
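The arithmetic behind the percentage in (E) and the folder-level ratio can be checked with a short script. This is only an illustrative sketch: the function names dedup_percent and dedup_ratio are hypothetical and are not part of Backup Exec, and the inputs are the (B) and (D) values from the sample line.

```python
def dedup_percent(scanned_kb: float, cr_sent_kb: float) -> float:
    """Share of scanned data that did not need to be sent: ((B) - (D)) / (B)."""
    return (scanned_kb - cr_sent_kb) / scanned_kb * 100


def dedup_ratio(scanned_kb: float, cr_sent_kb: float) -> float:
    """Folder-style ratio (B) / (D), to be read as 'N : 1'."""
    return scanned_kb / cr_sent_kb


scanned_kb = 1289  # (B) scanned, from the sample line
cr_sent_kb = 22    # (D) CR sent, from the sample line

print(f"dedup: {dedup_percent(scanned_kb, cr_sent_kb):.1f}%")
print(f"Deduplication Ratio: {dedup_ratio(scanned_kb, cr_sent_kb):.2f} : 1")
```

With the sample values this prints "dedup: 98.3%" and "Deduplication Ratio: 58.59 : 1", matching the figures quoted in the article.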
(F) cache hits: 17 (73.9%)
Backup Exec caches information about the previous backup to speed up the deduplication process. The cache hits value indicates how many times the deduplication process was able to use its cache to determine that a piece of data was a duplicate.
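When reviewing many job logs, it can help to pull fields (A) through (F) out of the statistics line programmatically. The following is a minimal sketch, not a Backup Exec tool: the pattern is derived only from the sample line in this article, the name parse_dedup_stats and the dictionary keys are hypothetical, and it assumes the units are KB and MB/sec as shown in the sample. Other versions or larger jobs may format the line differently.

```python
import re

# Pattern built from the single sample line in this article (an assumption);
# the trailing '%' after the dedup value is treated as optional.
STATS_RE = re.compile(
    r"PDDO Status for \((?P<server>[^)]*)\):\s*"
    r"scanned:\s*(?P<scanned_kb>[\d.]+)\s*KB,\s*"
    r"stream rate:\s*(?P<stream_rate_mb_s>[\d.]+)\s*MB/sec,\s*"
    r"CR sent:\s*(?P<cr_sent_kb>[\d.]+)\s*KB,\s*"
    r"dedup:\s*(?P<dedup_pct>[\d.]+)%?,\s*"
    r"cache hits:\s*(?P<cache_hits>\d+)\s*\((?P<cache_hit_pct>[\d.]+)%\)"
)


def parse_dedup_stats(line):
    """Return the fields of a statistics line as a dict, or None if no match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "server": fields["server"],                                # (A)
        "scanned_kb": float(fields["scanned_kb"]),                 # (B)
        "stream_rate_mb_s": float(fields["stream_rate_mb_s"]),     # (C)
        "cr_sent_kb": float(fields["cr_sent_kb"]),                 # (D)
        "dedup_pct": float(fields["dedup_pct"]),                   # (E)
        "cache_hits": int(fields["cache_hits"]),                   # (F)
        "cache_hit_pct": float(fields["cache_hit_pct"]),           # (F)
    }


sample = (
    "Deduplication Stats::PDDO Status for (MEDIASERVER): scanned: 1289 KB, "
    "stream rate: 78.67 MB/sec, CR sent: 22 KB, dedup: 98.3, "
    "cache hits: 17 (73.9%)"
)
stats = parse_dedup_stats(sample)
print(stats)
```

Returning None for lines that do not match lets a caller skip unrelated job log lines without raising an error.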