A warning to anyone who is backing up Exchange using GRT and CPS. Perform a test restore now!
Hi all,
We have come across a nasty problem with Backup Exec 11D when backing up Microsoft Exchange information stores with the CPS option, which I think everyone should be aware of.
We recently upgraded to Backup Exec 11D, primarily to take advantage of the Exchange Continuous Protection feature introduced with this version. As our backup server is located in a different building from our Exchange server, CPS backups would provide excellent up-to-the-minute protection for Exchange in the event of a fire or flood destroying our production servers.
After installing Backup Exec and the CPS server, and running LiveUpdate to ensure both products were current, I configured a Microsoft Exchange CPS job to back up our Microsoft Exchange information store (3 storage groups totalling 4 databases). The initial backup and verify completed successfully and the CPS job began replicating the log files as expected. Everything appeared to be working correctly, with no errors in the job log.
About a week later, I decided it would be prudent to run a test restore to ensure that CPS was backing up as intended. I created a recovery group and attempted to restore from one of our previous successful CPS backups. To my surprise, the job failed with the error:
"0xe000fe7a - An error occurred while accessing Microsoft Exchange Server. Check the application event log on the Exchange Server for more details."
On the Exchange server, ESE Mailbox Recovery was failing with the error:
"Information Store (2744) Callback function call ErrESECBRestoreComplete ended with error 0xC80003FE A disk I/O error occurred. "
I ran a second and third full CPS backup of the information store, just in case the first backup was corrupted in some way (even though no errors were reported by Backup Exec during the backup or verify). I was also unable to restore from these backups; again the restore failed with the same error.
At this point I logged a support call with Symantec. After some troubleshooting, they asked me if the backups were targeted at a RAID-5 volume, which they are. Apparently this was a big mistake. In the words of Symantec support:
The "Disk i/o error message" has been noted to be given while doing GRT backups/restores to a RAID -5 volume. There will be a hotfix released for this in near future however, the workaround suggested till the time being is to do a non-GRT backup. If you want to be able to restore individual items, then you will have to do a separate legacy mailbox backup.
So basically, GRT is presently so broken that if you are backing up to a RAID-5 volume, your data may be silently corrupted without warning. This also renders Exchange Continuous Protection useless for us, as you cannot perform a CPS job with GRT disabled. In the words of Symantec tech support:
CPS Exchange Continuous backups have been designed only for GRT purposes so unfortunately, you will not able to use it till we have a hotfix.
Alternatively, you can backup to a non RAID-5 drive as the issue has been noted only with RAID-5 volumes.
I personally find this unbelievable! With the volume of data we're backing up, there's simply no option to use a non-RAID disk array. I would venture that most users of Backup Exec who are performing D2D2T backups are storing their disk-based backups on a RAID-5 volume of some sort. If GRT does not operate correctly with RAID-5 volumes, why on earth has this functionality, which is still of beta quality, been included in a release version of Backup Exec?
I am especially concerned with the nature of this failure. As Backup Exec was reporting that it was successfully backing up our Exchange information stores, and the verify also succeeded, there was no evidence of a problem until we attempted a trial restore. In reality, Backup Exec was silently corrupting our backups! Had there been a real disaster, we would have found ourselves in the unfortunate position of being unable to restore our Exchange databases, with no prior warning from Backup Exec. In my view, this sort of failure is exceptionally dangerous in a backup product, which is the last line of defense for every organization. There must be many companies running Backup Exec 11D who have not yet attempted a CPS restore of their Exchange data and may only find out in a disaster recovery situation that their backups are useless! My faith in what I once considered an excellent product has now been somewhat shaken!
So my final point, and I cannot stress this enough: if you are running CPS backups for Exchange and you haven't done so already, TEST YOUR BACKUPS NOW! Before there is a disaster!
Comments
Alastair,
First, I would like to apologize for the unfortunate experience you had with support. I want to assure you that Symantec is working hard to enhance support to build a better customer experience.
To address some of your concerns:
At this time, Symantec is not planning to release any hotfixes for anything related to RAID-5 or any other hardware disk configurations in relation to GRT. In recent cases where Symantec has encountered problems backing up to a RAID volume, it was found that the hardware used in the disk configuration was not on the Microsoft Windows 2000 Server or Microsoft Windows Server 2003 hardware compatibility list.
We are currently investigating why you were provided the original information and I'll be contacting you shortly. I would like to see if we will be able to help you resolve your issues or provide a better explanation as to why you might be experiencing these problems.
Thank you for choosing Backup Exec,
Dano Oliveira
Sr Technical Product Manager