Changes to Backup procedures in Enterprise Vault 8

Created: 31 Jan 2010 • Updated: 01 Nov 2010 | 10 comments
JesusWept3's picture

In Enterprise Vault 2007 and earlier versions, backups were a fairly simple affair. For the most part you had three choices, and the same three choices exist in Enterprise Vault 8, but they now behave in significantly different ways.

The choices govern when safety copies are removed. They are found in the properties of the Vault Store Partition and are as follows:

1. After Backup
2. After Backup (Immediately for Journaling)
3. Immediately After Archiving

Firstly, what is a safety copy?
A safety copy is the original email that remains in the user's mailbox after the item has been stored in Enterprise Vault. It is untouched except that it is marked as a pending item. When the pending item is turned into an Enterprise Vault shortcut depends on the Remove Safety Copies setting on the Vault Store Partition.

The reason for having safety copies is that if you archive an item in Enterprise Vault and the storage then goes down, losing DVS files before a backup can be made, you have not lost any data on the Exchange side: the item is still pending, so the user's email is intact.

However, if you set it to “Immediately After Archiving”, then regardless of whether the DVS file has been backed up, the email in the user's mailbox will be changed (attachments stripped and so on), and if the DVS files are lost before they have been backed up you can face data loss.

We’ll have a look at what each of the “Safety Copies” settings means:

1. After Backup
An item is stored to the Vault Store location as a DVS file (plus any other DVS parts) and is indexed. The item in the mailbox remains a pending item, and the email is still fully intact, with attachments, full message body and so on.

When the partition is set to After Backup, the item is tracked in two places in the Vault Store database: once in the JournalArchive table, with the BackupComplete flag set to 0, and once in a table called WatchFile, which contains the full path to the archived item.

Every item listed in the WatchFile table is scanned when the Storage Service starts and then every 12 hours.
The StorageFileWatch process goes through each item listed in the WatchFile table and determines whether it has been backed up by checking whether the archive bit is still set on the file. If the archive bit has been cleared, Enterprise Vault removes that entry from the WatchFile table and changes the “BackupComplete” column in the JournalArchive table from 0 to 1.
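
As a rough illustration of that scan, here is a minimal Python sketch using an in-memory SQLite database. It is not Enterprise Vault code: the ItemId and FilePath columns, the file paths and the archive_bit_is_set callback are assumptions made up for the example. Only the WatchFile and JournalArchive table names and the BackupComplete column come from the description above.

```python
import sqlite3


def scan_watchfile(conn, archive_bit_is_set):
    """Run one scan pass over the WatchFile table.

    archive_bit_is_set is a caller-supplied callable(path) -> bool.
    Returns the number of items newly marked as backed up.
    """
    secured = 0
    rows = conn.execute("SELECT ItemId, FilePath FROM WatchFile").fetchall()
    for item_id, path in rows:
        if archive_bit_is_set(path):
            continue  # archive bit still set: no backup has taken this file
        # Archive bit cleared: stop watching the file and flag it as secured.
        conn.execute("DELETE FROM WatchFile WHERE ItemId = ?", (item_id,))
        conn.execute(
            "UPDATE JournalArchive SET BackupComplete = 1 WHERE ItemId = ?",
            (item_id,),
        )
        secured += 1
    conn.commit()
    return secured


def demo():
    """Build two watched items, 'back up' one of them, and scan."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, heavily simplified schemas (assumption, not the real ones).
    conn.execute(
        "CREATE TABLE JournalArchive (ItemId INTEGER PRIMARY KEY, BackupComplete INTEGER)"
    )
    conn.execute(
        "CREATE TABLE WatchFile (ItemId INTEGER PRIMARY KEY, FilePath TEXT)"
    )
    for item_id, path in [(1, r"E:\Store\a.dvs"), (2, r"E:\Store\b.dvs")]:
        conn.execute("INSERT INTO JournalArchive VALUES (?, 0)", (item_id,))
        conn.execute("INSERT INTO WatchFile VALUES (?, ?)", (item_id, path))
    # Pretend the backup has cleared the archive bit on a.dvs only.
    still_flagged = {r"E:\Store\b.dvs"}
    secured = scan_watchfile(conn, lambda p: p in still_flagged)
    awaiting = conn.execute(
        "SELECT COUNT(*) FROM JournalArchive WHERE BackupComplete = 0"
    ).fetchone()[0]
    return secured, awaiting
```

Calling demo() secures the one file whose archive bit has been cleared and leaves one item awaiting backup. On a real Windows file system the callback could be built on os.stat(path).st_file_attributes and stat.FILE_ATTRIBUTE_ARCHIVE.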

Every morning (depending on your monitoring settings) you will get an event telling you how many items are still awaiting backup. That number is driven by a count of the rows in the JournalArchive table where BackupComplete = 0.

Once the item has been confirmed as backed up, a message is placed in the A1 or J1 queue to let the Archive Task know that the item is ready for post processing. The task then changes the item from pending to fully archived, stripping the attachments, adding links and applying any other customizations you have made in the policy.

Be aware that on CIFS-based and NetApp devices the archive bit is not supported, so you have to use the IgnoreArchiveBitTrigger.txt file to let EV know that the items have been backed up.

In the case of an EMC Centera set to After Backup, you must have a replica Centera, as EV polls the replica every few hundred ms to see whether the items exist on both the primary and the replica. Once an item has been replicated, EV treats it as backed up.
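
The replica check can be pictured as a simple polling loop. The Python sketch below is only an illustration under stated assumptions: exists_on_replica stands in for whatever call EV makes against the replica, and the 0.3 second interval and the poll limit are invented values, not documented settings.

```python
import time


def wait_for_replica(clip_id, exists_on_replica, interval_s=0.3,
                     max_polls=10, sleep=time.sleep):
    """Poll until the clip is visible on the replica.

    Returns the number of polls it took, or None if the clip never
    appeared within max_polls (the item would then stay pending).
    """
    for attempt in range(1, max_polls + 1):
        if exists_on_replica(clip_id):
            return attempt
        sleep(interval_s)
    return None


def make_fake_replica(appears_after):
    """Hypothetical replica that reports the clip on the Nth check."""
    calls = {"count": 0}

    def exists(clip_id):
        calls["count"] += 1
        return calls["count"] >= appears_after

    return exists
```

With a fake replica that reports the clip on its third check, wait_for_replica returns after three polls. With no replica ever answering, it gives up and the item is never treated as backed up.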

2. After Backup (Immediately for journaling)
This is basically the same as above, except that for a journal mailbox the items are removed as soon as they have been successfully archived. The reason is that some companies journal up to 1 million items per day, and waiting for them to be backed up would leave a journal mailbox large enough to cause major issues.

3. Immediately After Archive
For some companies the Immediately After Archive option is desirable because you are not waiting for backups to occur before items change from pending to fully archived. This is typical when Enterprise Vault is used for mailbox management and users cannot afford, for the sake of their quotas, to wait for a backup before their mailbox gets smaller.

With this setting, however, you do incur the risk of data loss if the EV storage goes down and the DVS files haven’t been backed up. The risk is compounded if the item was sent or received after the last Exchange backup, because you then cannot go back to the Exchange backup to recover the item.

When an item is archived with Immediately After Archive, it goes straight into the JournalArchive table with BackupComplete set to 1 and is not listed in the WatchFile table. As a result there are usually no monitoring events reporting that X items are awaiting backup.

Changes in Enterprise Vault 8 to backups
Administrators who have safety copies set to Immediately After Archive are used to not seeing any monitoring events about items awaiting backup. However, Enterprise Vault 8 changes how items that need to be backed up are monitored.

Regardless of your safety copy setting, Enterprise Vault now logs each item in the WatchFile table to ensure that it has been backed up, even on partitions set to Immediately After Archive. You should therefore look at how your backups clear the archive bit, or incorporate a trigger mechanism such as IgnoreArchiveBitTrigger.txt.

The upside is that you know whether your backups are processing correctly, or running at all, since backup administrators often don’t have a direct line of communication with other engineers such as Enterprise Vault administrators. If you get an event stating that X items are awaiting backup, you know there is an issue that needs to be addressed.

But what was the practical reason for this change?
The change really revolves around the new Vault Store Groups and sharing between vault stores and partitions.

Imagine the following scenario:

Default Vault Store Group
    - Users Vault
        - Users Vault Ptn1 (After Backup)
    - Journal Vault
        - Journal Vault Ptn1 (Immediately After Archive)

Mixing After Backup and Immediately After Archive can create a conflict, especially since attachments are SIS’d (single-instanced).

In this scenario you have a vault store for users which is set to After Backup, and it is imperative that users' items are only post processed after backup. The Journal Vault, however, is set to Immediately After Archive, meaning that you want to get the items in and out of the journal mailbox ASAP.

Now imagine that items on the Immediately After Archive partition did not go into the WatchFile table and were simply marked as backup complete, as in earlier versions. Then a user's item is archived from their mailbox into the Users Vault store, which is set to “After Backup”.

The two items share a SIS part (such as an attachment). When the Storage Service checks whether the user's item has been backed up, it would look to the Journal Vault, see the shared part marked as backed up even though no backup has taken it, and therefore be one step closer to post processing the user's item.

With Enterprise Vault 8 monitoring backups on every partition, EV can still remove the safety copy immediately after archive where that is configured, while other vault stores and partitions that rely on those shared items can be confident they have really been backed up, so no data loss is incurred.
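
To make the sharing problem concrete, here is a small Python model of the scenario. It is a sketch only: the part names, the setting strings and both functions are hypothetical and do not mirror how EV stores this information. It just contrasts "assume Immediately After Archive parts are backed up" with "track every part".

```python
def secured_parts(stored_parts, backed_up, track_all_partitions):
    """Return the shared parts the storage layer would treat as secured.

    stored_parts: dict of part id -> removal setting of the partition
                  holding it ("after_backup" or "immediately").
    backed_up:    set of part ids a backup has actually captured.
    track_all_partitions: True models the EV 8 behaviour described above,
                  False models the older behaviour.
    """
    secured = set()
    for part_id, setting in stored_parts.items():
        if part_id in backed_up:
            secured.add(part_id)
        elif setting == "immediately" and not track_all_partitions:
            # Older behaviour: never watched, so assumed to be complete.
            secured.add(part_id)
    return secured


def can_post_process(item_parts, secured):
    """An After Backup item may be post processed once all parts are secured."""
    return set(item_parts) <= secured


# The attachment was first stored by the journal partition and has not
# been backed up yet; the user's message body has been backed up.
stored = {"msg-1": "after_backup", "att-1": "immediately"}
backed_up = {"msg-1"}
user_item = ["msg-1", "att-1"]

old_result = can_post_process(user_item, secured_parts(stored, backed_up, False))
new_result = can_post_process(user_item, secured_parts(stored, backed_up, True))
```

Under the older model old_result is True, so the user's safety copy would be removed while the shared attachment exists only on unprotected storage. With every partition tracked, new_result is False until a backup really captures att-1.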

Is there any way to make it behave as it did in Enterprise Vault 2007?
If you really want to circumvent the WatchFile/JournalArchive process, you can change the IgnoreBackup column in the PartitionEntry table from 0 to 1. You must be sure that the partitions you set this on are Immediately After Archive partitions, otherwise you will negate the After Backup process.

Comments

Joseph Rodgers's picture

Hello,

Thanks for the great articles, much appreciated.

I'm curious whether Centera and Centera collections are affected by, or affect, this setting.

For example, with EV 7.5 set to immediate but Centera collections enabled, would this introduce a delay until the collected blob was written to Centera?

From your article I would think not since the item is never written to the watch table but does the Centera process make any changes to this functionality as you described it?

Thanks
Joe

JesusWept3's picture

With Centeras and collections, Immediately After Archiving means that an item will remain pending until the collection of 100 or so items has been written to the Centera.

If there is a communication issue with the Centera and the items aren't collected and posted, the items will remain in the collections area and the email will stay in a pending state.

If the Centera has a replica and is set to Immediately After Archiving, the above applies as well.

When you have After Backup and a replica, you have to wait for the item to be collected and posted to the primary, and then for EV to call clip exists against the replica.

If you set After Backup on a Centera with no replica, the items will be collected, posted to the primary and deleted from collections, but the original items will always remain in a pending state because there is no replica to check.

If that's the case you'd have to call support; a registry key exists for this, but it shouldn't really ever be used.

The good thing, though, is that with collections plus Immediately After Archive, if the collections area is lost, deleted etc., you will not suffer data loss, because the items must be confirmed on the Centera before the pending items are post processed.

I should probably post an article regarding Centeras. Also, most of what I described in this article doesn't apply to Centeras in a vault store sharing scheme, as Centera items are not shared and thus other vault stores or partitions do not rely on the backup or availability of items stored on the Centera.

J. Rodgers's picture

Good information and I would enjoy a deep dive on Centera.  I've read the Symantec and the more recent EMC white papers.

Couple more questions:

Collections are 10MB or 100 items. Is this size / number of items adjustable?

If I archive a single message at 10KB in size and no other archiving occurs when would this item be written from the cache to the Centera?  It seems to happen pretty quickly in testing.

Thanks
Joe

JesusWept3's picture

Collections on Centera aren't constrained to 10MB; they are constrained by the number of items per clip, regardless of how big the collection will be.

Typically, when Centera collection runs it groups items into batches of 100. If there aren't 100 items it waits for 15 seconds, and once the 15 seconds elapse it writes whatever is in the collection. So if it's a manual archive outside of an archiving window, you may well end up with a collection/clip containing just that one item.
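
A quick way to picture that batching rule is the Python sketch below. It is an approximation, not the EV implementation: the function name is made up, and it assumes the 15 second timer starts when the first item of a collection arrives, which the description above does not confirm.

```python
def build_collections(arrival_times, max_items=100, max_wait_s=15.0):
    """Group item arrival times (seconds, ascending) into collections.

    A collection is closed when it reaches max_items, or when a later
    item arrives max_wait_s or more after the collection's first item.
    Anything left over at the end is written out as a final collection.
    """
    collections = []
    current = []
    for t in arrival_times:
        if current and t - current[0] >= max_wait_s:
            collections.append(current)  # timer expired before this item
            current = []
        current.append(t)
        if len(current) == max_items:
            collections.append(current)  # collection is full
            current = []
    if current:
        collections.append(current)
    return collections
```

For example, 250 items arriving together produce collections of 100, 100 and 50 items, while a single manually archived item ends up in a collection of its own.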

J. Rodgers's picture

Thanks for the info. 

I got the 10mb / 100 items from the EMC EV whitepaper:

"A collection is up to 100 items or 10 MB of data."

Are you familiar with registry keys that could control any of these settings?

Thanks
Joe

JayDhillon's picture

Great article. We are looking to create the vault stores on the Centera and want to know whether it will behave differently if we use the Centera API versus creating a CIFS share on the Centera (as a file server).

JesusWept3's picture

If you use CIFS on a Centera, it has to go in as a network share under the partition type and you have to use the IgnoreArchiveBitTrigger.txt mechanism. The problem, though, is that the CIFS part of the Centera is god-awful slow; I wouldn't recommend it to anyone.

benq's picture

Hi,

I have EV 8.0 and get this error reported in the EV Admin Console under Status checks: "It is more than 1 days since some partitions were scanned to check for newly-archived items that have not been backed up or replicated."
(as described : http://seer.entsupport.symantec.com/docs/312348.htm )

I note in the original post, under paragraph, "1. After Backup" ... "Every item listed in the watchfile table will be scanned upon Storage Service start or every 12 hours."

The issue with my configuration is that the scan is not occurring every 12 hours, as observed in the EV Admin Console > Vault Store Groups > Express Vault Store Group > Express Vault Store > Express Vault Store Ptn1 > properties: Backup:
"Last item secured: 20/08/2010
"Last scan started: 20/08/2010
"Unsecured items found in last scan: 187
"Items secured in last scan: 279981

I manually initiated the scan a few days ago (20/08/2010) by restarting the Storage Service, and this cleared the counters (and the error). However, prior to that the last reported scan was about 10 days earlier, and today it's reporting that the last scan was run on 20/08/2010, so obviously scanning isn't running automatically.

What is causing the scan not to run automatically every 12 hours as described? Can I set this manually, or can I check this setting somewhere in the EV Admin Console?

we have employed the SQL backups as per http://seer.entsupport.symantec.com/docs/322715.htm
we also have the EV server being backed-up over the network via Symantec Backup Exec 12.0 (all drives where EV and associated d/b are installed, as well as the System State and Shadow Copy Components)

I believe the backups are working and the archive bit is being reset, as we're not getting any errors regarding 'no backups done since...' etc.

So I'm just wondering why the scan of partitions isn't running.

thanks
Ben
