Backing up PST files

TroyBauer's picture

I have some questions regarding the best practice for backing up files such as PST files.

I am evaluating whether to go with the Recovery Solution (RS) on a limited basis in our environment.  I need to provide a backup strategy for end users in a few of our offices who store important emails in their PST files. Our Server group will not give them more Exchange space and is no longer willing to back the PST files up on its servers.  The Server team wants all PSTs currently residing on their boxes moved back to the local machines, and wants the Client team to figure out a way to back them up.

We are running NS 6.5 with CMS and I think I can get funding for a smaller implementation of RS, but I want to make sure I can accomplish what I need to.

1.  Can you back up a file by using wildcards?  For example, D:\*.pst, since everyone has named their PST files differently and placed them in different folder structures?

2.  Can you back up a PST file while Outlook is open?

3.  What is the typical compression ratio for data?

4.  Can I back up ONLY certain file types and NOT anything else?  For example, all PST, DOC, XLS, and PPT files?

Thanks for any words of wisdom!

KSchroeder's picture

RS is your answer

Troy,
I think RS would be a perfect fit for the requirements you're describing.  The first snapshot may be a bit "painful", but after that it is very fast, even over relatively slow WAN links (384 kbit/s or so), since RS only transfers the blocks in a file that have changed since the prior snapshot.  To answer your questions:

1.  Can you back up a file by using wildcards?  For example, D:\*.pst, since everyone has named their PST files differently and placed them in different folder structures?
Yes, you can include (or exclude) files, directories, etc. with wildcards.

2.  Can you back up a PST file while Outlook is open?
Most of the time, yes. RS has a built-in open file handler that can back up open PST files in most situations.  Our users here are PST crazy (since we have a 150 MB mailbox limit), so we have lots of PSTs, and I rarely hear complaints that a PST isn't backing up.

3.  What is the typical compression ratio for data?
RS uses HLZS (??) compression.  Obviously it varies from file to file (just like any compression), but the number Altiris usually throws around is 15:1.  I think this is for overall compression using Full System Snapshot (including the Redundant File Elimination and Redundant Block Elimination technologies).

4.  Can I back up ONLY certain file types and NOT anything else?  For example, all PST, DOC, XLS, and PPT files?
Yes.  Basically you would configure a full system snapshot. In the exclusions, you would exclude C:\, with an exclusion exception for the file types you want to include.
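
To make the "exclude everything, then make exceptions for a few file types" idea concrete, here is a minimal Python sketch of the selection logic. It is not RS configuration and does not use any RS API; the pattern list and the function names `is_selected` and `find_selected` are made up for illustration. It only shows which files a wildcard rule like *.pst would pick up regardless of folder or file name.

```python
import fnmatch
import os

# Hypothetical include patterns (the "exclusion exceptions"); not RS settings.
INCLUDE_PATTERNS = ["*.pst", "*.doc", "*.xls", "*.ppt"]


def is_selected(path, patterns=INCLUDE_PATTERNS):
    """Return True if the file name matches any include pattern, ignoring case."""
    name = os.path.basename(path).lower()
    return any(fnmatch.fnmatchcase(name, p.lower()) for p in patterns)


def find_selected(root, patterns=INCLUDE_PATTERNS):
    """Walk every folder under root and yield only the files that match."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            if is_selected(full_path, patterns):
                yield full_path
```

Running `list(find_selected("D:\\"))` on a workstation would list every PST, DOC, XLS, and PPT file on that drive, which can be a handy way to estimate how much data the first snapshot would have to move.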

Let me know if you want more information.  There is a lot of info in the KB about some of these topics as well.   FYI, I've been administering RS at my company for about 5 years now (off and on) so I do have some experience with it :).

Thanks,
Kyle
Symantec Trusted Advisor
If your question has been resolved, please be sure to click "Mark as Solution"! Thank you.

MBHarmon's picture

a semi off topic additional question

So, when backing up PST files does the metadata get modified (like the date modified or Owner) ?

- Matt

KSchroeder's picture

Not as far as I know

Matt,
AFAIK, the file is not changed whatsoever; the backup process through RS is a read-only operation.  Whatever changes may occur "under the hood" of the PST are copied up to the RS server.  Could you explain a little more about what you're looking for?

Thanks,
Kyle

MBHarmon's picture

Just curious if that metadata was preserved.  We're casually browsing for an app that we can use to automatically collect data for e-discovery purposes. 

- Matt

KSchroeder's picture

Matt,
The Modified Date is collected, but not any NTFS permissions AFAIK.  Maybe a home-grown custom inventory job (with History enabled) would get you what you want?  You'd probably have to create it in VBScript or similar since AeXCustInv doesn't have any "owner" properties that I'm aware of (though I guess you could use WMI in the CustInv...?)
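
As a rough idea of what a home-grown collection script might record, here is a small Python sketch (Python is used here for illustration instead of VBScript). It is not AeXCustInv and not an RS feature; the function name `file_metadata` and the record layout are assumptions. It only reads what the standard library exposes portably (size and modified time). The file owner is deliberately left out, since getting it on Windows would need WMI or the Win32 security APIs.

```python
import os
from datetime import datetime, timezone


def file_metadata(path):
    """Return a small record of a file's size and last-modified time (UTC)."""
    st = os.stat(path)  # read-only call; does not alter the file or its timestamps
    modified = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    return {
        "path": path,
        "size_bytes": st.st_size,
        "modified_utc": modified.isoformat(),
    }
```

Collecting a record like this before and after a move would give a simple way to check whether the modified date survived.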

Thanks,
Kyle

TroyBauer's picture

GREAT info Kyle.  I appreciate you taking the time to answer my questions.  Now I only have 1 more thing that you kind of answered before in a different forum question, but here goes....

I need to use an XP workstation with a large amount of disk space for the backup repository, since the server team will not allow me to place another server on site for this.  My only other option, if you think it is feasible, would be to back up these PST files over the WAN, throttled during off hours.  Since the PST file would probably be different every time, I assume it would have to back up a 2 GB file every single time, so I'm not sure whether this is the correct route versus backing up to a local box (wishfully running XP).

Your thoughts and wisdom on this is appreciated!!!

Troy

KSchroeder's picture

Well, supposedly you can use an XP machine, but I've not seen it working "in real life".  We back up over the WAN for about 120 sites in the US, and our users are PST crazy.  The nice thing with RS is that it transfers only the changed "blocks" (I think with PSTs it goes by 64 KB chunks, and it has some special handling for PST files in general).  So after you get the initial PST copied, it only transfers the "new" data within the file, and subsequent snapshots don't take that long.  Our average remote site snapshot here takes 10 to 15 minutes, usually with less than a hundred MB actually copied from the endpoint to the RS server.

What I haven't tested is what happens when you "optimize" (err...compact?) the PST through the Outlook GUI; ostensibly that makes major changes to the data structures and might result in a lot of changed blocks.
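
For anyone curious how "only the changed blocks" can work in principle, here is a minimal Python sketch of fixed-size block comparison. This is a generic technique, not RS's actual algorithm (which is proprietary and reportedly has special PST handling); the 64 KB block size is just the guess from above, SHA-256 is an arbitrary choice of hash, and the function names are made up.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # 64 KB; an assumption taken from the guess above


def block_hashes(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and return one hash per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return indexes of blocks in new_data that differ from, or extend past, the old snapshot."""
    changed = []
    for index, digest in enumerate(block_hashes(new_data, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changed.append(index)
    return changed
```

With a scheme like this, appending new mail to the end of a PST touches only a few blocks, while an operation that shifts data around inside the file (such as a compact) would change the contents at most block offsets and so mark most of them as changed.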

Thanks,
Kyle

KSchroeder's picture

Any news Troy?

Hi Troy,
It has been a month or so...any updates?  Have you evaluated RS to see if it will work in your environment?  Please let us know.

Thanks,
Kyle

cnpalmer75's picture

RS & e-discovery & storage...

@Matt...
We have been using RS for this purpose for quite some time now. However, the data as it is stored is not readily or easily "searchable". In order to actually access the data and perform a thorough search, you would need to restore the data onto a machine. The data as stored by RS is encrypted in a proprietary .BLOB format.

@Troy - A few things to keep in mind if you are concerned about space requirements for your XP machine: only back up exactly what you need, set your retention policies appropriately, and run your maintenance jobs regularly. These are probably the best recommendations I can make for running RS in a confined space like a single XP machine.

@Kyle - Your assumptions on the "compact" operation of Outlook are correct. It appears that the function works similarly to a single-file defrag, which does alter a majority of the file structure. I also believe that even though RS uses the RBE process, if the file has changed significantly, it will back up a new version of the file. I may be wrong, but Rene may have stated this to me in a meeting before.

MBHarmon's picture

We've got a specific e-discovery product that will allow us to search items.  Apparently preserving all available metadata while moving it to that NAS device that holds the data is key.

- Matt

Greg Zielinski's picture

I just wanted to add some feedback to this post. We have had great success in backing up open PST files.  Many of our laptop users have PST files in the gigabytes, and they back up without a problem at the block level.  In fact, to my surprise, when we rolled out Office SP2 and the PST/OST file structure was changed (resulting in an average 20% growth in data), only that 20% was transferred over the WAN.  It did not trigger a complete re-transfer of the "new" PST file.

About the only issue we see is the need to repair a PST file after we restore it from the recovery server.  These are usually PST files that were backed up while Outlook was open.  The repair is always successful.

As far as compression, the storage requirements of a PST file are about the same as ZIPPING the file: around 40-60% compression.  If the e-mail is mostly text with few attachments, it is much higher.

Where you see the 15:1 compression is with redundant file elimination.  Example:

A WinXP image with MS Office and other standard applications (5 GB).  All PCs in this example use this image, and the only unique data is their PST file. We'll assume 50% compression on the PST.

RS BLOB storage requirement:
PC1 (1 GB PST) - 4.5 GB (the first copy of all the apps in the image compresses down to 4 GB, plus 500 MB for the compressed PST)
PC2 (2 GB PST) - 1.0 GB (the app files in the image are all redundant, so they are not transferred over the WAN and do not require additional BLOB storage)
PC3 (3 GB PST) - 1.5 GB
PC4 (500 MB PST) - 250 MB

etc etc.
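
The arithmetic in the example above can be sketched in a few lines of Python. This is only the back-of-the-envelope model from this post (a shared image stored once at about 4 GB, PSTs compressed by 50%), not a measurement of RS; the constant and function names are made up.

```python
IMAGE_RAW_GB = 5.0          # uncompressed image size from the example
IMAGE_COMPRESSED_GB = 4.0   # shared image, stored once for the first PC
PST_STORED_FRACTION = 0.5   # assumed 50% compression on each PST


def blob_storage_gb(pst_sizes_gb):
    """Return the BLOB storage used per PC, in GB, under the assumptions above."""
    per_pc = []
    for index, pst_gb in enumerate(pst_sizes_gb):
        stored = pst_gb * PST_STORED_FRACTION
        if index == 0:
            # Only the first PC pays for the shared image; later copies are redundant.
            stored += IMAGE_COMPRESSED_GB
        per_pc.append(stored)
    return per_pc


pst_sizes = [1.0, 2.0, 3.0, 0.5]
per_pc = blob_storage_gb(pst_sizes)          # [4.5, 1.0, 1.5, 0.25]
total_stored = sum(per_pc)                   # 7.25 GB stored
total_raw = IMAGE_RAW_GB * len(pst_sizes) + sum(pst_sizes)  # 26.5 GB on the PCs
```

In this model the overall ratio keeps improving as more PCs share the same image, because each additional PC adds a full image to the raw total but only its compressed PST to the stored total.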