
Deduplication pool size estimation

Created: 27 Sep 2012 | 1 comment

Hi all,

I need some advice on the following.

We have a NetBackup setup with some MSDP pools.

We need to plan our deduplication pool sizes according to our disaster backup/restore scheme.

In building A we have the "pud1" server, with a client data size (according to the client backup report) of about 7 TB.

That building will replicate (optimized dedup) to server pud1B in building B.

We have a third building with pud3, with client backups sized at 1.5 TB. This one has to replicate to "pud1" in building A.


All sizes are taken from full backup sessions.

On "pud1" we have some 2 TB of DB backups with a dedup rate of 65%. I understand that those will need some additional space for one week's retention (5 backups).

The main plan is to keep 1 week's retention on disk and rehydrate to tape for longer retentions.


The question is: what sizes must the disk dedup pools have?

The NetBackup deduplication calculator gives some 6 TB for pud1 and pud1B and at most some 2 TB for pud3.

Is there any safe way to calculate the space?


f25

The back-end storage size is very difficult to estimate. If your users are fans of JPG pictures, MP3 recordings or AVI videos, there is almost no deduplication (except when there are multiple copies of the same multimedia files).

A 1-week retention period on PureDisk/deduplication is a poor idea. I would go for a month at least. The whole "cleanup process" (in the PureDisk case) takes four cycles of maintenance policies to run, so depending on your setup it may take from 4 days to 4 weeks to remove the expired backup data.
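To put numbers on that cleanup delay, here is a minimal sketch in Python. It assumes only what is stated above (four maintenance cycles before expired data is removed); the function name and the idea of a fixed interval between cycles are my own simplification, not a NetBackup or PureDisk setting.

```python
# Number of maintenance cycles needed before expired data is removed,
# as described above for PureDisk.
CLEANUP_CYCLES = 4


def days_until_space_reclaimed(cycle_interval_days: float) -> float:
    """Days between a backup expiring and its space actually being freed.

    cycle_interval_days is a hypothetical parameter: how often the
    maintenance policies run in your setup.
    """
    if cycle_interval_days <= 0:
        raise ValueError("cycle_interval_days must be positive")
    return CLEANUP_CYCLES * cycle_interval_days


if __name__ == "__main__":
    print(days_until_space_reclaimed(1))  # maintenance runs daily
    print(days_until_space_reclaimed(7))  # maintenance runs weekly
```

With daily maintenance that is 4 days; with weekly maintenance it is 28 days, so with a 1-week retention the expired data can sit on disk longer than it was retained.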

In my experience, shortening the retention period from 12 months to 3 months gave only a 7% gain in space. So it is not really worth cutting it to 1 week, as the sophisticated deduplication mechanism puts a lot of effort into keeping all the records consistent, and a lot more effort into removing the expired chunks. I would risk saying that data expiration is more demanding than backup.

Also, the more data you have backed up, the higher the probability that "new" data on a client has already been backed up some time before and does not need to be backed up again: it can just be referenced.

If you can expand the storage size in a flexible way, I would start with 2 TB for pud3, 10 TB for pud1, and 8 - 10 TB for pud2. Please remember that 85% storage utilisation means the storage pool is full. So 10 TB is in fact 8.5 TB.
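The 85% rule is easy to check with a few lines of Python. This is only a rough sketch of the arithmetic: the 0.85 threshold is the figure quoted above, while the function names are made up for illustration and do not correspond to any NetBackup tool.

```python
# Fraction of raw capacity at which the pool is treated as full
# (the 85% figure quoted above).
FULL_THRESHOLD = 0.85


def usable_tb(raw_tb: float, threshold: float = FULL_THRESHOLD) -> float:
    """Capacity you can actually fill before the pool counts as full."""
    return raw_tb * threshold


def raw_tb_needed(data_tb: float, threshold: float = FULL_THRESHOLD) -> float:
    """Raw pool size required to hold data_tb of deduplicated data."""
    return data_tb / threshold


if __name__ == "__main__":
    print(f"10 TB raw gives {usable_tb(10):.1f} TB usable")
    print(f"8.5 TB of data needs {raw_tb_needed(8.5):.1f} TB raw")
```

Working backwards with `raw_tb_needed` is the more useful direction when planning: take the amount of deduplicated data you expect to hold and divide by 0.85 to get the pool size to provision.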

Good luck!