Backup Exec 12.5 Server Crash / Lockup

Created: 09 Oct 2009 | 31 comments
Mike Riley's picture
Hello,
I have read several threads now about this same problem. Basically, we are getting an intermittent server lockup whilst using Backup Exec 12.5. We are confident that Backup Exec is causing the problem, since this is happening on two servers and one of these is a new build.
We get several good backups, and then anywhere from 2 to 3 days up to several weeks later the server will just hang completely during its backup. There are several hundred logs that indicate this is happening when Backup Exec is attempting to back up Exchange.
All affected backups seem to have the same version of Backup Exec, and are backing up to external hard drives. All versions of Backup Exec are fully up to date, and all server software, firmware etc. is fully updated.
Since the software is OEM, bought with the server, there are no maintenance agreements, but I believe I should be able to get some assistance here.
As a Symantec partner and reseller of the software, I find it hard to believe that a fix has not been produced for this issue. If no solution can be found we will have no choice but to stop selling / installing Backup Exec (at this moment in time it's causing more trouble than it's worth).
If anyone can provide any assistance with this issue at all it would be most appreciated.

Many Thanks

Mike

Dev T's picture

Hello Mike,

As I understand it, the server hang/lockup happens during Exchange backups, and the backups are going to an external USB drive connected to the Backup Exec server.

1: Do you see any specific Backup Exec service eating up memory or CPU while backing up Exchange?

2: For test purposes only, I would recommend you perform the Exchange backups to a Backup-to-Disk folder located on the server and not to the USB drive.

It's a best practice to have a "Support Contract" with Symantec...

Mike Riley's picture
Dev,
Sorry for the late reply, have been away on annual leave.

I will schedule this in with one of our engineers to monitor the processes whilst the Exchange backup is in progress. I will also arrange a separate backup configuration to test a local backup to disk, to see if this only affects the system whilst backing up to the USB hard disks.

At this moment in time we have this affecting one particular server every 3rd backup so should have some results over the next week.

Thanks for the suggestions, will keep you updated.
Many Thanks

mbjorkqvist's picture

We have also run into the same problem at some of our customer sites. The problem is related to running an Exchange backup task to USB disk. It seems that the backup job completes without problems; it's right afterwards, when the catalog is being updated, that the server freezes.

We have also monitored processes during backup and run PoolMon to check for memory leaks => CPU load and memory consumption is OK.
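For anyone who wants to capture the same kind of per-process data during the backup window, here is a minimal sketch in Python. It assumes the CSV output of the Windows `tasklist /FO CSV /NH` command (image name, PID, session name, session number, memory usage). The list of Backup Exec process names is an assumption, so adjust it to whatever is actually running on your server, and the sample text is made up for illustration.

```python
import csv
import io

# Assumed Backup Exec process names; check Task Manager on your own server.
WATCHED = {"beremote.exe", "bengine.exe", "beserver.exe"}


def parse_tasklist_csv(text, watched=WATCHED):
    """Return {image_name: memory_in_KB} for the watched processes.

    `text` is the output of `tasklist /FO CSV /NH`, where each row is
    "Image Name","PID","Session Name","Session#","Mem Usage".
    """
    usage = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        name = row[0].lower()
        if name not in watched:
            continue
        # "Mem Usage" looks like "45,012 K"; keep only the digits.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            usage[name] = usage.get(name, 0) + int(digits)
    return usage


# Made-up sample output, for illustration only.
SAMPLE = (
    '"System Idle Process","0","Services","0","24 K"\n'
    '"beremote.exe","1234","Services","0","45,012 K"\n'
    '"bengine.exe","2345","Services","0","120,500 K"\n'
    '"store.exe","5680","Services","0","512,000 K"\n'
)

if __name__ == "__main__":
    print(parse_tasklist_csv(SAMPLE))
```

Running something like this on a schedule and appending each result to a file with a timestamp would leave a record of the last sample taken before a freeze.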

This is a problem with at least the following Backup Exec versions: 12 and 12.5 and Exchange 2003/2007. Any help on this would be appreciated.

Kind Regards
Mattias

Adco's picture

Hi,

I am having the exact same issue: Backup Exec 12.5 Small Business Server edition running on the SBS DC, with a remote agent + SQL agent running on a separate SQL server.

Every 3rd or 4th backup causes the server to hang.

Has there been any fix or workaround found for this?

Like the previous poster, this is causing major customer issues

thanks

John

Adco's picture

All indications are that the system crashes while the Exchange server is being backed up.

John

Adco's picture

Has anyone got any ideas about this?

I really need help!

Do I just reinstall Backup Exec and stop it from updating, as it may have been an update that caused it?

John

CraigV's picture

Hi Adco,

Do you have the exact same issue? Reinstalling BEWS might be a bit excessive at the moment.
Are there any specific errors showing in the job log?

Laters!

Alternative ways to access Backup Exec Technical Support:

https://www-secure.symantec.com/connect/blogs/alte...

Adco's picture

I get plenty of errors when it happens

information store not snappable
Server not accessible
Databases not accessible etc
Error accessing different resources

Usually when the customer comes in in the morning, no one can access the server and email is not available, so customer goes to server, server is not responding and he must reboot the server.

I have 15GB of space on the server's C: drive, where Backup Exec is installed. Is this enough for snapping and adding to catalogues? I have Exchange on a separate drive with plenty of space.

Backing up to 2 external USB hard drives which are rotated every day.

Yesterday I reconfigured the server to back up to a backup-to-disk folder on another PC for test purposes, to rule out the USB configuration as the problem.

CraigV's picture

Hi,

I use AOFO as it works with Exchange and clears the logs like it should.
However, Symantec recommend not using AOFO when backing up SQL and Exchange. Speaking from experience, when using AOFO with 1 of our clients, I found out the hard way I couldn't redirect an SQL restore...
So if it's on, take it off under Tools --> Options, as well as on each of your jobs and try again.

Laters!

PeterMac 2's picture

My customer's problem is similar. We were using 12.5d SP2 with all patches as of Jan 09, backing up an SBS 2003 R2 server to an external USB drive. On the weekend we upgraded to SP3 of BE, and now the server hangs around the same time that it is backing up the Exchange server. If I redirect the backup to an internal HD the system has no such problems. I don't know if it is possible to remove SP3 to go back to the previous state.

Peter

CraigV's picture

Did you guys reinstall any drivers when SP3 was rolled out? Have you looked at updating the drivers for your device in BEWS?

PeterMac 2's picture

This is a standard USB External HD on the server, there are no drivers for it that I am aware of.

CraigV's picture

Was it configured as a removable drive in BEWS?

PeterMac 2's picture

Yup, the backup to the removable drive, now connected to the USB port of a workstation, is just about finished. Since the backup to one of the other internal drives finished OK, I would assume it's having a problem with the USB drive. It's strange though: I have two other SBS servers with similar backup drives running SP3 with no problems. However, they are running BESR 7, not BESR 8.5 (Backup Exec System Recovery), although no BESR backup is set to happen while BE is doing its backup.

CraigV's picture

Not a possible hardware problem on that server?

Adco's picture

I went and reinstalled Backup Exec for SBS 12.5 on Saturday.
I uninstalled the clients and remote agents and reinstalled them via the program console,
and the only update was Service Pack 3, which I installed.

In the release notes I may have found the cause:

Post Requisites
- A full backup is recommended after installing this hotfix.
- The IDR recovery media (the ISO created through the IDR Wizard) needs to be recreated after installing this hotfix.
- Remote Agent for Windows Servers will need to be updated after installing this hotfix.
- Remote Agent for Linux/Unix/Macintosh Servers (RALUS/RAMS) will need to be updated after installing this hotfix.
- Desktop/Laptop clients will need to be updated after installing this hotfix.

Maybe the old agents had something to do with it. Anyway 2 successful backups no crashes and the backups are taking half the time, well so far.

jgus 2's picture

I too have the lockup of the server when BE 12.5 is backing up the Exchange server.  It is BE 12.5 Small Business on a Windows 2003 SBS server.  Hope someone has a real solution.  Mine is hit or miss.  It might only happen once in a month, but that is too much for a complete server lockup.  The server isn't completely dead when the lockup happens; it is ping-able, but not a lot else is working on the server.  It appears Active Directory and Exchange are pretty well hosed at that point.  Here are some event log errors I see when BE starts to choke.  Please let me know if anyone else has a solution.  Thanks

Application Event Errors:
 
11:47:38 pm 15-Dec-09
MSExchangeDSAccess 
Topology 
N/A 
Process STORE.EXE (PID=5680). All the DS Servers in domain are not responding. For more information, click http://www.microsoft.com/contentredirect.asp.
 
11:48:02 pm 15-Dec-09
MSExchangeAL 
LDAP Operations 
N/A 
LDAP Bind was unsuccessful on directory server.bstreet.local for distinguished name ''. Directory returned error:[0x51] Server Down. For more information, click http://www.microsoft.com/contentredirect.asp.
 
11:48:46 pm 15-Dec-09
MSExchangeDSAccess 
Topology 
N/A 
Process MAD.EXE (PID=5764). All Domain Controller Servers in use are not responding: server.bstreet.local server2.bstreet.local For more information, click http://www.microsoft.com/contentredirect.asp.

DNS Event Errors:
 
Event Type: Error
Event Source: DNS
Event Category: None
Event ID: 4015
Date:  12/15/2009
Time:  10:07:35 PM
User:  N/A
Computer: SERVER
Description:
The DNS server has encountered a critical error from the Active Directory. Check that the Active Directory is functioning properly. The extended error debug information (which may be empty) is "". The event data contains the error.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 51 00 00 00               Q...   
 
Event Type: Error
Event Source: DNS
Event Category: None
Event ID: 4004
Date:  12/15/2009
Time:  10:09:41 PM
User:  N/A
Computer: SERVER
Description:
The DNS server was unable to complete directory service enumeration of zone ..  This DNS server is configured to use information obtained from Active Directory for this zone and is unable to load the zone without it.  Check that the Active Directory is functioning properly and repeat enumeration of the zone. The extended error debug information (which may be empty) is "". The event data contains the error.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 2a 23 00 00               *#..

Event Type: Error
Event Source: DNS
Event Category: None
Event ID: 4000
Date:  12/15/2009
Time:  10:15:49 PM
User:  N/A
Computer: SERVER
Description:
The DNS server was unable to open Active Directory.  This DNS server is configured to obtain and use information from the directory for this zone and is unable to load the zone without it.  Check that the Active Directory is functioning properly and reload the zone. The event data is the error code.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 2a 23 00 00               *#..   
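
The "Data:" lines in the events above are 4-byte values stored least significant byte first, so `51 00 00 00` is 0x51 (the same "Server Down" code quoted in the MSExchangeAL entry) and `2a 23 00 00` is 0x232a. A minimal Python sketch for decoding them follows. The names in the lookup table are my reading of the standard LDAP and Windows DNS error-code lists, not something stated in this thread, so verify them against Microsoft's documentation.

```python
import struct

# Code names are an assumption taken from the standard LDAP / Windows DNS
# error-code lists; verify against Microsoft's documentation.
KNOWN_CODES = {
    81: "LDAP_SERVER_DOWN",
    9002: "DNS_ERROR_RCODE_SERVER_FAILURE",
}


def decode_event_dword(hex_bytes):
    """Decode a 4-byte little-endian event data dump such as '2a 23 00 00'."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    (value,) = struct.unpack("<I", raw)  # "<I" = little-endian unsigned 32-bit
    return value


def describe(hex_bytes):
    """Return (numeric code, name or 'unknown') for an event data dump."""
    value = decode_event_dword(hex_bytes)
    return value, KNOWN_CODES.get(value, "unknown")


if __name__ == "__main__":
    for dump in ("51 00 00 00", "2a 23 00 00"):
        print(dump, "->", describe(dump))
```

Both values point at the directory service being unreachable from the DNS server's side, which matches the "not responding" errors in the application log.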
Lloyd Wolf's picture

Add me to the list of people experiencing this same problem on two different servers. Both servers are running SBS2003 and Backup Exec v12.5.  The server totally freezes up at the exact time of the Exchange Server backup.  We have tried changing the time of the backups, in case it was conflicting with some other task or process, but the server lockup followed the changed time. We tried separating Exchange into a totally different job, scheduled for a totally different time, but the server lockup followed the changed job/time.

It happens randomly: it might occur 2 days in one week, and then not again for another 3 weeks.

One server is backing up to a removable backup-to-disk folder on an eSATA disk (from Highly Reliable Systems). The other server is backing up to a backup-to-disk folder on an external USB disk, which then duplicates to a REV drive.

Our customers are rather upset with the lockups.  Every night I go to sleep, I lie awake wondering if this will be the night the server(s) lock up again. Not good. We use backup software to help protect from problems/crashes, not to be the cause of server problems/crashes.

I just opened a case with Symantec Tech Support - Case 410-663-529.

We'll see if an engineer can shed any light on the situation.

maegan1116's picture

I have this same problem too. Our server has locked up two days this week. I have temporarily unselected the exchange backup since it's a holiday weekend and I don't want it locking up when no one is able to get in to reboot it. I installed all the updates (SP3 and 3 other hotfixes).

Lloyd - let me know if Symantec Tech Support is able to figure it out b/c we currently don't have any support with them.

Lloyd Wolf's picture

A little more information...

I worked with Symantec Tech Support some more last week on my open ticket for two servers that are randomly experiencing this problem -  Symantec Tech Support Case 410-663-529.

The Support Engineer said that he found some prior cases where the problem was caused by the Advanced Open File option being enabled in a backup job that included the backup for Exchange and SQL in the selection list. In the past, we had separate jobs: one job for the c-drive, d-drive, system state and shadow copy components with Advanced Open File turned on, and a second job for Exchange, SQL and Sharepoint with Advanced Open File turned off.  During troubleshooting of this problem over the past few months, we made a bunch of changes to the job(s). When the tech looked at the server during this last support call, Advanced Open File was turned on in the job that included Exchange and SQL, so he focused on that as being the cause, even though I insisted that it was set up differently in the past, and the problem had occurred with that setup in the past.

So, with him remotely connected to the server with me, we reconfigured the two separate jobs - one job for the c-drive, d-drive, system state and shadow copy components with Advanced Open File turned on, and a second job for Exchange, SQL and Sharepoint with Advanced Open File turned off.  So if it happened again, they could no longer focus on that as being the cause.
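
The rule behind this split can be written down as a simple check. This is only a sketch in Python: the `Job` structure and the resource labels are hypothetical and just mirror the wording of this post. It is not a Backup Exec API and does not read anything from Backup Exec.

```python
from dataclasses import dataclass

# Hypothetical labels for the application resources named in this post.
APP_RESOURCES = frozenset({"Exchange", "SQL", "SharePoint"})


@dataclass
class Job:
    name: str
    resources: frozenset  # what the job's selection list contains
    aofo_enabled: bool    # Advanced Open File option on or off


def jobs_violating_split(jobs):
    """Return the names of jobs that back up application data with AOFO on."""
    return [j.name for j in jobs if j.aofo_enabled and j.resources & APP_RESOURCES]


# The two-job layout described above, typed in by hand for illustration.
EXAMPLE_JOBS = [
    Job("Files", frozenset({"C:", "D:", "System State", "Shadow Copy Components"}), True),
    Job("Apps", frozenset({"Exchange", "SQL", "SharePoint"}), False),
]

if __name__ == "__main__":
    print(jobs_violating_split(EXAMPLE_JOBS))
```

With the two-job layout the check returns an empty list; a single job that mixes file selections with Exchange or SQL and has the option turned on would be flagged.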

I then asked the support engineer what he would have me do next if/when the problem occurs again. He said he would enable more advanced logging for the Remote Agent via a registry setting. So I worked with him to go ahead and do that too, so that if/when it locked up again, I would already be one step ahead for the next troubleshooting step.

Looking at the Windows application event logs, for about 10-15 minutes prior to the problem occurring there are multiple entries for the backup of Exchange Server, including: beginning Information Store backup, beginning priv1.ebd backup, ending priv1.ebd backup, beginning priv1.stm backup, ending priv1.stm backup, beginning pub1.ebd backup, ending pub1.ebd backup, beginning pub1.stm backup, ending pub1.stm backup, Information Store backup procedure has been successfully completed. Then there are some event log entries related to the backup of some SQL databases, which are the next items in the job selection list. Then about 2 minutes later, a flood of errors begins to appear in the event log for things like: All Domain Controller Servers in use are not responding, All Global Catalog Servers in use are not responding, etc.
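
To compare this timeline across servers, the gap between the last "successfully completed" entry and the first "not responding" error can be measured from exported event log entries. The Python sketch below assumes the events have already been exported as (timestamp, message) pairs; the sample timestamps are invented for illustration and are not taken from any real log.

```python
from datetime import datetime


def gap_before_first_error(events,
                           done_marker="successfully completed",
                           error_marker="not responding"):
    """Return the time between the last completion entry and the first
    error entry that follows it, or None if that pattern is not found.

    `events` is a list of (datetime, message) pairs in any order.
    """
    done_at = None
    for when, message in sorted(events, key=lambda e: e[0]):
        if done_marker in message:
            done_at = when
        elif error_marker in message and done_at is not None:
            return when - done_at
    return None


# Invented timestamps; only the message wording comes from the post above.
SAMPLE_EVENTS = [
    (datetime(2010, 1, 2, 23, 30, 0), "beginning Information Store backup"),
    (datetime(2010, 1, 2, 23, 45, 0),
     "Information Store backup procedure has been successfully completed"),
    (datetime(2010, 1, 2, 23, 47, 0),
     "All Domain Controller Servers in use are not responding"),
    (datetime(2010, 1, 2, 23, 47, 30),
     "All Global Catalog Servers in use are not responding"),
]

if __name__ == "__main__":
    print(gap_before_first_error(SAMPLE_EVENTS))
```

If the gap is consistently a couple of minutes on every affected server, that would support the idea that the lockup follows the snapshot rather than happening during it.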

In working with the support engineer, he said that the event log entries for Exchange Server relate to the snapshot of the Exchange files for the GRT backup. Once this completes, Backup Exec then writes the files and related catalog information to the destination backup disk, and that is when the lockup problem is occurring. So it seems to be something related to the cataloging process when writing the backup files to the disk, not the actual backup/snapping of the Exchange files.

We just had another lockup on one of my two servers experiencing this problem over the weekend. Not good to lock up a server over the New Year's holiday weekend!!  I will be working with support this coming Monday. Hopefully they can identify the problem and a solution/fix real soon.

Our two clients have been as patient as we could ask, but I am very frustrated, and I am sure they are very frustrated too. We use the Symantec Backup Exec backup software to help protect from problems/crashes, not to be the cause of server problems/crashes. Symantec needs to help us identify a fix.

Has anyone else opened a support case with Symantec?  If we all created cases, and referenced each other's cases, perhaps Symantec would be more aware of the problem and spend more resources on solving it.  If you have not already done so, please open a case with Symantec and report the problem.  You can reference my case, Case 410-663-529.

Make sure your jobs are set up the way they asked me to set mine up, with two separate jobs. Also compare your event logs to see if you find events similar to what I see in my event logs whenever the problem occurs.

Thanks.

Lloyd

Adco's picture

Hi all again,

Happy New year.....

After a complete reinstall the issue reared its ugly head again, so that didn't solve it.

As there was not going to be anyone in over the Christmas period for my client, I put their backup schedules on hold and, surprise surprise, no server crashes over the period.

Thanks very much Lloyd for your input and please keep us up to date with how your case is going.

maegan1116's picture

Has anyone been able to get a solution for this yet? Currently I've been excluding the exchange from the selection list to avoid locking up the server but I can't do that forever.

CraigV's picture

Hi,

Have you checked to make sure this isn't a possible hardware-related issue?
I wasn't getting any sort of backup speeds out of a brand-new HP StorageWorks MSL2024. It would start out around 10MB/min, and slow down to a crawl before failing. I upgraded the SP, upgraded the drivers, upgraded the hardware's firmware, and reinstalled BEWS. By chance I checked the Array Configuration Utility (HP server), and there was a failed drive.
Replaced the drive and the problem went away.
Might not be the same problem, but it highlights that hardware can be a possible cause too...

Laters!

lisabbirk's picture

Hi There,
We are experiencing the same issues.  Very frustrating.  Lloyd, did you ever get any resolution instructions from Symantec Support?

boeth's picture

Well, I'm not sure it has anything to do with Exchange.  I have experienced several blue screens toward 98-99% completion of the backup of the backup server itself.  It seems to have to do with the verify or the system state.  I can't seem to determine what file it is working on when it happens.  I do know it wasn't backing up my Exchange server.  That backup seems to be doing well.  Mine happens when it backs up itself.  I'm not using the open file option either.

I feel it's just a matter of time when it doesn't come back at all.

Ken Zhu's picture

Guys,

I have a similar issue. I am using BESR 2010. The Exchange server will hang during an incremental backup once every 2/3/4 weeks. When that happens, I can only see a black screen, the server can be pinged but not reached over RDP, and the only thing I can do is a hard reboot. I have run hardware diagnostics and nothing indicates there is any hardware issue. I have raised a case and did 2 rounds of diagnosis with Symantec, but I have got nowhere so far. I am thinking about rebuilding the server due to growing pressure from the business unit. But that will be a painful and risky process. Most of all, it may not fix the problem!

So I would deeply appreciate it if anyone could share a solution to this problem.

Vik's picture

Hello,

I have been looking for a solution and have tried numerous things, but I am still experiencing this issue.

We run BEX 12.5 fully patched on SBS 2003, backing up to an external USB drive.

The server freezes randomly while backing up Exchange (probably while updating catalogs).

Is Symantec going to come up with a solution for this problem?

Does anyone have any ideas what is wrong here?

 

Thank you so much!

Viktor 

CraigV's picture

Hi Vik,

Have you considered upgrading to BE 2010 R2? It has better support for removable B2D, and probably for USB as well. It seems to have been enhanced a bit more to compensate for the issues being experienced on previous versions!

Craig

Vik's picture

Craig,

Upgrading to BEX 2010 R2 was something I was planning to do, and I will actually do it today for one of the clients that is experiencing this issue. Will keep you updated.

Thanks for quick reply

Viktor

Vik's picture

Hello,

I've finally upgraded two problematic servers to BEX 2010 R2.

I will now be monitoring these machines and will report back if there are any issues.

 

Thanks

Viktor.