
A warning to anyone who is backing up Exchange using GRT and CPS. Perform a test restore now!

Updated: 22 May 2010 | 2 comments
Alastair

Hi all,

We have come across a nasty problem with Backup Exec 11D when backing up Microsoft Exchange information stores with the CPS option, which I think everyone should be aware of.

We recently upgraded to Backup Exec 11D, primarily to take advantage of the Exchange Continuous Protection feature which was introduced with this version. As our backup server is located in a different building from our Exchange server, CPS backups would provide excellent up-to-the-minute protection for Exchange in the event of a fire or flood destroying our production servers.


After installing Backup Exec and the CPS server, and running LiveUpdate to ensure both products were current, I configured a Microsoft Exchange CPS job to back up our Microsoft Exchange information store (3 storage groups totalling 4 databases). The initial backup and verify completed successfully and the CPS job began replicating the log files as expected. Everything appeared to be working correctly, with no errors in the job log.

About a week later, I decided it would be prudent to run a test restore to ensure that CPS was backing up as intended. I created a recovery group and attempted to restore from one of our previous successful CPS backups. To my surprise, the job failed with the error:

"0xe000fe7a - An error occurred while accessing Microsoft Exchange Server. Check the application event log on the Exchange Server for more details."

On the Exchange server, ESE Mailbox Recovery was failing with the error:

"Information Store (2744) Callback function call ErrESECBRestoreComplete ended with error 0xC80003FE A disk I/O error occurred. "


I ran a second and third full CPS backup of the information store, just in case the first backup was corrupted in some way (even though no errors were reported by Backup Exec during the backup or verify). I was also unable to restore from these backups; again the restore failed with the same error.

At this point I logged a support call with Symantec. After some troubleshooting, they asked me whether the backups were targeted at a RAID-5 volume, which they are. Apparently this was a big mistake. In the words of Symantec support:

The "Disk i/o error message" has been noted to be given while doing GRT backups/restores to a RAID -5 volume. There will be a hotfix released for this in near future however, the workaround suggested till the time being is to do a non-GRT backup. If you want to be able to restore individual items, then you will have to do a separate legacy mailbox backup.

So basically, GRT is presently so broken that if you are backing up to a RAID-5 volume, your data may be silently corrupted without warning. This also renders Exchange Continuous Protection useless for us, as you cannot perform a CPS job with GRT disabled. In the words of Symantec tech support:

CPS Exchange Continuous backups have been designed only for GRT purposes so unfortunately, you will not able to use it till we have a hotfix.

Alternatively, you can backup to a non RAID-5 drive as the issue has been noted only with RAID-5 volumes.

I personally find this unbelievable! With the volume of data we're backing up, there's simply no option to use a non-RAID disk array. I would venture that most users of Backup Exec who are performing D2D2T backups are storing their disk-based backups on a RAID-5 volume of some sort. If GRT does not operate correctly with RAID-5 volumes, why on earth has this functionality, which is still of beta version quality, been included in a release version of Backup Exec?

I am especially concerned with the nature of this failure. As Backup Exec was reporting that it was successfully backing up our Exchange information stores, and the verify also succeeded, there was no evidence of a problem until we attempted a trial restore. In reality, Backup Exec was silently corrupting our backups! Had there been a real disaster, we would have found ourselves in the unfortunate position of being unable to restore our Exchange databases, with no prior warning from Backup Exec. In my view, this sort of failure is exceptionally dangerous in a backup product, which is the last line of defense for every organization. There must be many companies running Backup Exec 11D who have not yet attempted a CPS restore of their Exchange data and may only find out in a disaster recovery situation that their backups are useless! My faith in what I once considered an excellent product has now been somewhat shaken!

So my final point, and I cannot stress this enough: if you are running CPS backups for Exchange and you haven't done so already, TEST YOUR BACKUPS NOW, before there is a disaster!
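
For anyone who wants to automate part of a trial restore, here is a minimal sketch in Python. It is not part of Backup Exec and does not use any Backup Exec or Exchange API. It assumes you have already restored a set of static files (for example an offline copy of the data) to a scratch folder, and it only compares SHA-256 checksums between a known-good reference folder and the restored folder. The names `sha256_of` and `compare_trees` are my own, purely for illustration. For a live Exchange store the checksums will not match, because the databases keep changing, so there the real test remains restoring to a recovery group as described above.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(reference_dir: Path, restored_dir: Path) -> list[str]:
    """List every reference file that is missing or differs after restore.

    An empty list means every file under reference_dir was found under
    restored_dir with identical contents.
    """
    problems = []
    reference_files = sorted(p for p in reference_dir.rglob("*") if p.is_file())
    for ref_file in reference_files:
        relative = ref_file.relative_to(reference_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(ref_file) != sha256_of(restored_file):
            problems.append(f"checksum mismatch: {relative.as_posix()}")
    return problems
```

Calling `compare_trees(Path("D:/reference"), Path("E:/restore-test"))` (both paths are hypothetical) returns a list of problems. Anything other than an empty list means the restore should not be trusted, whatever the backup and verify job logs reported.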

Comments

Rob Walker 2
22 Jun 2007

Welcome to the world of GRT. Basically it's a pile of cr*p. It just does not work AT ALL. We've deployed it at several of our sites and I really wish we hadn't bothered. I haven't heard of this problem before and to be honest I struggle to understand how the RAID level would make any difference. If you're using a separate RAID controller then it should be completely transparent. The RAID card would handle the actual writing of data, with Backup Exec not knowing the difference between a RAID 1, RAID 5 or even a single disk.
 
I guess if Symantec say it's a problem then maybe it is but I just can't see how!
 
Anyway, I'm off to throw a couple of copies of BE 11d off a cliff or maybe use them as coasters.

Dano Oliveira
22 Jun 2007

Alastair,

First, I would like to apologize for the unfortunate experience you had with support. I want to assure you that Symantec is working hard to enhance support to build a better customer experience.

To address some of your concerns:

At this time, Symantec is not planning to release any hotfixes for anything related to RAID5 or any other hardware disk configurations in relation to GRT.  In recent cases where Symantec has encountered problems backing up to a RAID volume, it was found that the hardware used in the disk configuration was not on the Microsoft Windows 2000 server or Microsoft Windows 2003 server hardware compatibility list. 

We are currently investigating why you were provided the original information and I'll be contacting you shortly.  I would like to see if we will be able to help you resolve your issues or provide a better explanation as to why you might be experiencing these problems.

Thank you for choosing Backup Exec,

Dano Oliveira

Sr Technical Product Manager