Unconsidered hardware is sometimes the reason for backups failing in Backup Exec

Created: 12 Jan 2010
Author: CraigV


Thought I'd share this with you all...
I had a server in Africa whose backups were failing all the time. Initially it had an HP StorageWorks 1/8 G1 autoloader connected. I tried just about everything to get it working, and was really happy to get 2 working backups through. Backups then failed again, and repairing the drive was going to be too costly.
An HP StorageWorks MSL2024 was sent to the site, along with new LTO2 tapes (not backing up much!). It was correctly cabled up, and Backup Exec 11D was upgraded to BEWS 12.5 with SP2 (which has seen my backup environment stabilising greatly!). I was able to inventory the slots, could unlock/lock the drive, and could initialise the device. It would complete about 90% of a tape erase, and would then fail.
Backups would start out at 10 MB/min, drop down to 3 MB/min, and then hit 0 MB/min before eventually failing.
I tried upgrading the service pack again, reinstalling the application from scratch and repairing the DB straight after that, and using the latest Symantec drivers as well as the HP drivers. I also upgraded the ProLiant Support Pack to PSP 8.20 and flashed the firmware of the library. Basically I threw everything but the kitchen sink at it, and then threw my hands up in defeat.
I happened to check the Array Configuration Utility (ACU, bundled with an HP server's SmartStart CD) and found that 1 disk in the array had failed, so I asked the site rep to order a replacement.
Once the drive was replaced, and the RAID array had rebuilt (RAID5), backups started flowing again, and have been completing successfully since.
What I suspect happened is that, because the array was running in degraded mode with 1 drive fewer, it could not deliver enough throughput to the library, causing the backups to fail. With redundancy restored, throughput returned to normal.
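If you want a rough way to test that suspicion on your own server, one option is to time a plain sequential read of a large file on the volume being backed up. The Python sketch below is only an illustration and has nothing to do with Backup Exec itself: the function name `read_throughput_mb_per_min` and the 1 MB chunk size are my own assumptions, and the figure it prints is only loosely comparable to the job rate Backup Exec reports.

```python
import sys
import time


def read_throughput_mb_per_min(path, chunk_size=1024 * 1024):
    """Read the file at `path` sequentially and return the rate in MB/min.

    Hypothetical helper for illustration; the chunk size is an arbitrary choice.
    """
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    # Guard against a zero interval on very small or cached files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    megabytes = total_bytes / (1024 * 1024)
    return megabytes / (elapsed / 60)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        rate = read_throughput_mb_per_min(sys.argv[1])
        print(f"Sequential read rate: {rate:.1f} MB/min")
    else:
        print("usage: pass the path of a large file on the volume to test")
```

Point it at a file much larger than the server's RAM, or one that has not been read recently, otherwise the operating system's file cache will inflate the result. A very low or steadily falling figure suggests the disks, rather than the tape hardware, are worth a closer look.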
So I'd like to encourage you guys to look at hardware beyond the usual suspects (tapes, disks etc.): check your server hardware for any failures, and check cables for breakages. The solution to your problem might be simpler than you think!