
Write threads and storage paths for files on a Vault Store

Created: 21 May 2013 • Updated: 21 Jun 2013 | 8 comments
This issue has been solved. See solution.

Hello, I'd like to know: when files are stored on the Vault Store partition for a single archive task, such as archiving one mailbox, are the writes multi-threaded or single-threaded?

Can this be configured?

Or do the writes become multi-threaded only when multiple mailboxes are archived at the same time?

By the way, another question.

Is there any way to identify which task (for example, which mailbox) the files stored on the Vault Store partition belong to, through the file's storage directory or name?

Thank you for your help


TonySterling's picture

The Storage Service uses multiple threads, and this is configured on the property tabs of the Storage Service.

In answer to your second question, no, you will not be able to tell on disk which files belong to whom.  Files are stored in a directory under a month-day folder, unless of course you are using a Centera or something like that.

Is there a specific reason you ask?

eric_xie's picture

Thank you for your reply, very helpful.

I am concerned about using deduplication equipment as the Vault Store; the way the files are organized will affect the deduplication ratio.

I have another question: will the collection process collect files from different tasks into one CAB file?

If so, does that mean that because the files arrive at the Vault Store at almost the same time, they are collected together regardless of which task they belong to?

In addition, when the SIS provided by EV is disabled and two identical mails are stored on the Vault Store as dvs files, is the data in the files absolutely the same? Otherwise, a deduplication device that works at the file level rather than the block level will not identify the duplicate data.

Rob.Wilcox's picture

Eric, try to remember that if it were all single-threaded it would be terribly slow. It is massively multithreaded, just like Exchange.

Rob.Wilcox's picture

Collections happen outside of the archiving window - they happen at a later time. What is the concern about deduplication?

Remember that EV SIS doesn't work quite like you might imagine.  Mails are broken down into shareable components, and sharing takes place on components of about 20 KB or higher.  So if two mails have the same 3 MB Word document attached, the Word document will become a SISPart, and would be SISed.  If you turn off EV SIS, then that SISPart will exist twice on the file system, and it will be exactly the same.
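To illustrate why byte-identical copies matter for file-level deduplication, here is a minimal sketch that groups files by a hash of their content. It is not how EV or any storage device implements deduplication; the file names and the `group_identical` helper are hypothetical, and in-memory byte strings stand in for files on a partition.

```python
import hashlib
from collections import defaultdict


def group_identical(files):
    """Group names by the SHA-256 of their content.

    `files` maps a (hypothetical) file name to its bytes.
    Returns only the groups that contain more than one name.
    """
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Stand-in for the same 3 MB attachment stored twice with SIS turned off,
# plus one file that differs by a single byte.
attachment = b"W" * (3 * 1024 * 1024)
files = {
    "mail_a_attachment.bin": attachment,
    "mail_b_attachment.bin": attachment,
    "mail_c_attachment.bin": attachment + b"!",
}

duplicates = group_identical(files)
print(duplicates)
```

Only the two exact copies end up in the same group; a file that differs by even one byte hashes differently, which is why whole-file deduplication depends on the stored parts being exactly the same.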

 

Can you further explain the problem you are trying to solve?

eric_xie's picture

Thank you for your help

I am concerned about using deduplication equipment like Data Domain.

Symantec recommends using collections with Data Domain so that write performance will be better.

My understanding is that this kind of collection happens during the writing process, not after archiving.

So, if a collection is made up of files from different mailboxes, the duplicate data will not be easy to identify for some deduplication equipment like HP's D2D (their dedupe is based on similarity detection), because the data is interleaved.

So I'd like to know whether a collection is made up of files from different mailboxes.

TonySterling's picture

You set the age at which items are collected; the default is files older than 10 days.  Are you using replication?  If you are, you probably won't need to worry about collections, but if you aren't, I believe collections are still recommended.
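As a rough picture of what an age threshold like this means, the sketch below selects items older than a configurable number of days. It is only an illustration: the `eligible_for_collection` helper and the file names are hypothetical, and using a last-modified timestamp as the age is an assumption, not a statement of how EV decides.

```python
from datetime import datetime, timedelta


def eligible_for_collection(files, now, min_age_days=10):
    """Return names whose timestamp is strictly older than the cutoff.

    `files` maps a (hypothetical) file name to its last-modified time.
    """
    cutoff = now - timedelta(days=min_age_days)
    return sorted(name for name, modified in files.items() if modified < cutoff)


now = datetime(2013, 6, 21)
files = {
    "item_old.dat": datetime(2013, 6, 1),   # 20 days old
    "item_edge.dat": datetime(2013, 6, 11),  # exactly 10 days old
    "item_new.dat": datetime(2013, 6, 18),  # 3 days old
}

eligible = eligible_for_collection(files, now)
print(eligible)
```

With a 10-day threshold, only the 20-day-old item is selected here; the item that is exactly 10 days old is not yet "older than" the cutoff in this sketch.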

Have a read of this: http://www.cornerstonetechnologies.com/images/stor...

It was written for EV 8, so it's been around a while, but it should still mostly apply.  Your best option is to ask your DD\EMC rep what they recommend.

HTH,

SOLUTION
Rob.Wilcox's picture

Eric, did you need any more help with this issue, or is it resolved now?

eric_xie's picture

Data Domain's slides recommend using Data Domain as secondary storage: the collections' CAB files are first generated on temporary storage, then migrated to Data Domain on a schedule.

That would be a solution.

Sorry, I have another question. When SIS is enabled, data is stored as dvs, dvssp, and dvscc files.

When SIS is disabled, is the data still stored like this?

I mean, does a shareable part file (dvssp) still exist?

I ask because I'd like to know whether a NAS device with file-based deduplication will work as well as Centera.

Centera stores the original file rather than the dvs,... files, so Centera's own SIS will work well.