Tape jobs running forever

Created: 12 Mar 2014 | 10 comments

Having issues with a Tandberg auto-loader (LTO-3) and wondering if it's a hardware failure.

Tape jobs recently stopped finishing.  The drive completes operations (move, stow, write, etc) and then says "seeking" for hours on a five minute job (quick erase, inventory, etc), while Symantec Backup Exec just shows the job continuing to run.  

If I try to cancel the job, I get "cancel pending" for some time, followed by a "storage read/write error" (but only when the job cancels; there are no errors before that).

However, all the self-tests in the loader run fine.  Is there a good way to narrow down what could be causing the problem?  The Tandberg TDTEST tool doesn't seem to count the drive as supported since it's in a loader.  


lmosla's picture
Check to make sure you are using the Symantec drivers.

Do a full power cycle of the tape library:

1. Power off the server completely.
2. Power off the tape drive.
3. Power on the tape drive and wait until it is at a 'Ready' state.
4. Power on the server.
 
Check the Windows Event Logs for System errors with Event IDs 5, 7, 9, 11, and 15. These point to hardware errors:
Event ID 5: Can be caused by a faulty SCSI card or faulty SCSI termination. For this error, contact the hardware manufacturer.
Event ID 7: Indicates that bad blocks exist. It can be caused by an outdated tape device driver, faulty media, or dirty read/write heads on the tape drive. Try updating to the most recent Backup Exec tape device drivers, run a cleaning job, and replace the media with new media, if possible. If the error continues, contact the hardware vendor.
Event ID 9: Points to SCSI bus timeouts. To correct this, slow down the SCSI bus and install the most recent SCSI drivers and firmware. Sometimes this error is caused when another device shares the same cable. Try moving the SCSI card to a different PCI slot that does not share the same bus as a RAID controller.
Event ID 11: Indicates controller errors. To correct this, slow down the SCSI bus and install the most recent SCSI drivers and firmware.
Event ID 15: Can occur if the incorrect drivers are loaded or the drivers are not up to date.
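If the System log is large, a small script can help pick out the Event IDs listed above. The following is only a rough sketch: it assumes the log has already been exported and loaded as a list of dicts with "event_id" and "source" keys (that record format, the function name, and the sample source names are made up for illustration), and the hints are just short summaries of the descriptions above.

```python
# Event IDs from the list above, with a short summary of each description.
HARDWARE_EVENT_IDS = {
    5: "faulty SCSI card or SCSI termination",
    7: "bad blocks: outdated tape driver, faulty media, or dirty heads",
    9: "SCSI bus timeout",
    11: "controller error",
    15: "incorrect or outdated drivers",
}


def find_hardware_events(records):
    """Return (event_id, source, hint) for every record with a listed Event ID.

    `records` is a hypothetical iterable of dicts with "event_id" and
    "source" keys, e.g. rows from an exported System log.
    """
    hits = []
    for rec in records:
        event_id = int(rec["event_id"])  # exports often store the ID as text
        if event_id in HARDWARE_EVENT_IDS:
            hits.append((event_id, rec.get("source", ""), HARDWARE_EVENT_IDS[event_id]))
    return hits


# Made-up sample records, only to show the shape of the input.
sample = [
    {"event_id": "7036", "source": "Service Control Manager"},
    {"event_id": "9", "source": "example_scsi_driver"},
    {"event_id": 11, "source": "disk"},
]

for hit in find_hardware_events(sample):
    print(hit)
```

An empty result only means none of these five IDs were logged; it does not rule out a drive or media problem.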
 
See this TechNote for more troubleshooting: http://www.symantec.com/docs/TECH24414
commdb's picture

Checked the drivers.

Did the server / tape drive sequence.

No 5/7/9/11/15 errors.  

Also cleaned the drive.

lmosla's picture

Have you upgraded to Backup Exec 2010 R3?

If not, I recommend upgrading. Before the install, stop the SQL Server (BKUPEXEC) service and make a copy of the Data and Catalogs folders. After the install, run all of the Live Updates until fully patched, and make sure the remote agent is pushed out to the remote servers afterwards. See http://www.symantec.com/docs/TECH66724 and http://www.symantec.com/docs/TECH159557

commdb's picture

pkh, thank you.

That is not the problem, as the loader is appearing and I can see the tape moving back and forth in Backup Exec.

This is LTO3 and that problem is with LTO5/6.

lmosla, I don't see anything in the changelog about resolving this problem; still, it may be worth a try.  Would it be a good idea to fully uninstall everything (drivers, Backup Exec, etc.) before upgrading?

pkh's picture

I am aware that my earlier reference is for LTO5/6, but if you have recently updated the firmware of your tape drive, you might want to check with Tandberg.

lmosla's picture

You can simply perform a direct upgrade to BE 2010 R3 from your existing version, BE 2010.  There is no need to uninstall anything.

As mentioned above, before installing:
Stop all of the Backup Exec services, including the SQL (BKUPEXEC) service.
Copy the Catalogs and Data folders from the <volume>:\Program Files\Symantec\Backup Exec directory to another location for safekeeping.
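The copy step can also be scripted. This is just a sketch of one way to do it, assuming the services are already stopped; the function name and both paths are placeholders you would supply yourself (the install directory is the Backup Exec directory on whatever volume it lives on).

```python
import shutil
from pathlib import Path


def backup_be_folders(install_dir, dest_dir, folders=("Data", "Catalogs")):
    """Copy the named folders from the Backup Exec directory to dest_dir.

    Hypothetical helper: stop all Backup Exec services, including the
    SQL (BKUPEXEC) service, before calling it so the files are not in use.
    """
    install_dir = Path(install_dir)
    dest_dir = Path(dest_dir)
    copied = []
    for name in folders:
        src = install_dir / name
        if not src.is_dir():
            raise FileNotFoundError(f"expected folder not found: {src}")
        target = dest_dir / name
        # copytree raises an error if the target already exists,
        # so an earlier safekeeping copy is never silently overwritten.
        shutil.copytree(src, target)
        copied.append(target)
    return copied
```

Call it with the Backup Exec directory as the first argument and an empty safekeeping location as the second, then check that both copies exist before starting the upgrade.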
 
The install can be found at  http://www.symantec.com/docs/TECH66724
 
After the install completes, run all of the Live Updates to patch to the most recent update, and then push-install the remote agent on your remote servers.
 
Hope that helps
Larry Fine's picture

The drive completes operations (move, stow, write, etc) and then says "seeking" for hours on a five minute job (quick erase, inventory, etc), while Symantec Backup Exec just shows the job continuing to run.

If the drive is "seeking", then there is nothing that BE can do but wait.  I suspect you have a tape drive or a tape media issue.  I would continue to try and use vendor diagnostics or vendor assistance to narrow it down.  Have you tried new tape cartridges?  Have you cleaned the drive?

If you find this is a solution for the thread, please mark it as such.

commdb's picture

Larry,

Thanks.  I've tried new/different tapes and all show the same behavior.  Same with cleaning the drive.

I have noticed that when I run a backup, the drive moves the tape, reads from the tapes, even writes to the tape, but BE still only shows the job is "queued" even though it shows the tape moving to the drive from the slot.  But there are no hardware errors indicating a problem with communication and all self tests run fine.

Is it possible that BE is causing a failure of communication?

Larry Fine's picture

BE can show the job as "queued" while it does some of the setup housekeeping before it actually starts streaming data to the tape drive.

I doubt that BE could be "causing" a failure of communication.  But BE could be a "victim" of a communication failure.  You already stated that there were no errors in BE or the Windows event log.  Therefore, I suspect that you have a hardware or a media issue.

Tape drives will sometimes go to heroic lengths to try to get the job done.  Some SCSI commands can take 2 hours (or more) for EACH SCSI command to fail/timeout.

You may want to fire up the SCSI tracer once the device shows "seeking" so that you can examine the low-level communications occurring.

http://www.symantec.com/docs/TECH49432

Troubleshooting hardware with Backup Exec for Windows Servers using the SCSI Trace Utility (tracer.exe).

And make sure you haven't overlooked anything here:

http://www.symantec.com/docs/TECH24414

How to troubleshoot issues with a Robotic Library (autoloader/changer) and/or Tape Drive(s).
