Backup Exec is ejecting a tape when it shouldn't...

Created: 31 Jul 2013 | 26 comments
deejerydoo's picture

Hi guys,

 

Backup Exec 2012 (patched to date).

Windows 2003 SBS SP2

Single HP Ultrium 3 internal SAS tape drive.

 

Last Friday I inserted a brand new tape into the backup rotation. For this I simply put the new tape in the drive and ran a label job on it. This puts the tape, as you probably already know, into the scratch media pool. Saturday morning I discovered I had received repeated emails, from BENT, asking me to remove the media. I connected to the server and noticed that the Friday night backup job was still running, with 0KB written to the tape, and a remove media alert was visible in BENT. When I checked the properties of the brand new tape in the drive I noticed it had been allocated to one of my backup media pools that are set to overwrite protect the backup data for a set period of time. However, it appears BENT had done this to the tape before it had run the backup; thereby preventing itself from overwriting the blank tape (this mishandling of the media pool allocation is a separate issue I am still dealing with on a Symantec support case)!? So, to get the backup job running again I thought, all I have to do is cancel the backup job, reallocate the tape back to the scratch pool and rerun the job. So, I ignored the media remove request and cancelled the job. Only to find that BENT had ejected the tape anyway!?

Surely, this is a fault in the logic of BENT. The tape should not be ejected until after the operator acknowledges the media removal alert. The backup job hasn't been set up to eject the tape on completion and I didn't acknowledge the media removal request, so BENT has a bug (faulty logic) whereby the tape is still ejected in this scenario. In this instance, I had to ask my client to go into their office, on a weekend, to push the tape back into the drive. I then moved the tape back into the scratch pool and kicked the job off again, which completed without further errors.

What should happen when a tape is overwrite protected and a backup job is trying to overwrite it is as follows:

1. A media removal alert is generated.

2. The operator is given the opportunity to check the backup job's settings and the state of the media in the drive and take actions to remediate the problem. In my case, this was to cancel the backup job, reallocate the blank tape to the scratch pool and then kick the backup job off again. However, I was precluded from doing this because BENT had already chosen to eject the tape.

3. If my chosen course of action was to swap the media then, and only then, Backup Exec should eject the tape if I acknowledge the media removal alert.

As far as I can see this is a fundamental flaw in the logic of the product that precludes it from being suitable for use at sites that have a single tape backup unit where backups need to be managed unattended.

Any help or suggestions would be appreciated.

Cheers,

 

David

 

 


CraigV's picture

Hi,

 

BENT?

Backup Exec will eject a tape if it is either write protected or not appendable. This is by design, and this has been done over virtually all the versions. With a tape autoloader/library it would simply check the next slot...with a stand-alone drive, it will eject the tape if it is unusable.

So there is no design flaw...I know of other backup vendor products that do this.

Have you tried to add that tape into the specific media set that will run for that day?

Thanks!

Alternative ways to access Backup Exec Technical Support:

https://www-secure.symantec.com/connect/blogs/alte...

Larry Fine's picture

BENT?

 

"Backup Exec for NT", the old product name before Microsoft changed the OS name, which triggered a product name change to BEWS, "Backup Exec for Windows Servers".  I think this happened back around BE 8 or 9.  Old habits die hard :)

If you find this is a solution for the thread, please mark it as such.

CraigV's picture

...hehe! Thought as much but wanted to make sure it wasn't some other vendor's product!


deejerydoo's picture

Hi Craig,

My issue is not the fact that BEWS media protection pauses the job and alerts the operator. That part of the design is great. The problem is that, regardless of what's going on, the tape is immediately ejected and the opportunity for rescuing the job has been lost.

The design flaw is only in the default behaviour of the tape being ejected. Simply tweak the design to make the job more recoverable: make the tape eject an operator-controlled event, triggered by acknowledging the media removal alert.

I know of other backup vendors that do this too...

The media was a brand new blank tape that had been labelled in BEWS that afternoon. The default behaviour, as you undoubtedly already know, is for newly labelled media to be allocated to the scratch pool. The scratch pool is exactly where this tape sat until the BEWS job engine decided to stuff it up.

If all I did was label the new tape in BEWS, the BE2012 interface makes allocating the tape to any pool other than the default scratch pool, impossible. To do this I would have had to select the "Online Tape" media group and then gone to the properties of the media before I could allocate it to another media pool. Something that I would have only done if I was delirious or having a psychotic episode. Neither of which was the case, at the time. :-)

CraigV's picture

Yep, gathered that much, hence suggesting you add it in as an Idea. The first thing you're going to be told is to make sure the tape inserted is available for a write...but, if there was a delay between ejecting the tape that you could set, it would make more sense...add that in and see what happens.

Thanks!


pkh's picture

BE does not move tapes from the scratch media set to any other media set, unless it has written to it as part of a job.

When a tape cannot be overwritten, it is always ejected and a media insert alert is issued to request an overwritable tape.  This is by design and has been this way since forever.

deejerydoo's picture

Hi guys,

Thanks for your responses.

BENT, Sorry... I've been using Backup Exec since the days of Arcada (pre Seagate/Veritas and Symantec) and ironically, the acronym seems to fit better with 2012 (sorry, couldn't resist that one)!

I am aware of the way in which, BEWS and BENT before it, works with tape media that is overwrite or append protected. My issue is as follows:

1. BEWS detects the media as not being suitable for the backup type of the job currently running. This can be because of operator error, such as allocating a tape to the wrong media pool. Or, in my specific case, it appears the job engine somehow allocated the blank (scratch media pool) tape to a media set that is overwrite protected before it started to write the data for the backup job, thereby locking itself out from using the blank tape.

2. As BEWS has detected the media issue, it issues a media removal alert and ejects the tape. A little point about this alert is that it can be acknowledged or cancelled.

Hmm... If there is only one outcome then "Cancel" is pointless and this alert should only be acknowledgeable... Unless the cancel option gives us a chance to remediate the issue with the media and save the backup job (see below)!

3. My issue is that because it has ejected the tape, in an unattended backup scenario, the remote operator is not able to remediate the incorrect tape media set association, until somebody attends the site to push the tape back in. Thereby making proper unattended operations not possible.

4. This situation would be better handled as follows:

4.1. BEWS detects the media as not being suitable for the current backup type of the job currently running.

4.2. BEWS issues a media removal alert, but does not eject the tape yet.

4.3. The remote operator now has a chance to take a look at the media properties and ensure the media is either associated with the correct media set or ejected. If the tape is to be ejected the operator acknowledges the media removal request and the tape is ejected.

4.4. However, if the tape can simply be allocated to the correct media set, the operator can cancel the media removal request and the backup job can continue.

Hooray!

The key thing here is that all that needs to happen is for the tape eject not to be the default behaviour, as there is no benefit to this. Media set protection has done its job and paused the job awaiting input from an operator. If the tape is not ejected the job can be recovered remotely. All that Symantec developers need to do is move the tape eject from a default action, regardless of what's going on, to something that only happens when the operator acknowledges the media removal alert. Simple really...
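To make the proposal in 4.1 to 4.4 concrete, here is a minimal Python sketch of the decision logic being asked for. It is not Backup Exec code: `AlertResponse`, `Action` and `handle_unusable_media` are hypothetical names invented for illustration, and the `HOLD_JOB` outcome (operator cancels the alert but the tape is still not writable) is an assumption that the thread does not spell out.

```python
from enum import Enum, auto


class AlertResponse(Enum):
    ACKNOWLEDGE = auto()  # operator agrees the tape should come out
    CANCEL = auto()       # operator wants to keep the tape loaded


class Action(Enum):
    EJECT = auto()       # eject only after an explicit acknowledgement
    RESUME_JOB = auto()  # tape stays in the drive and the job continues
    HOLD_JOB = auto()    # assumption: tape stays loaded, job keeps waiting


def handle_unusable_media(response, media_now_writable):
    """Proposed flow: raising the alert never ejects the tape by itself."""
    if response is AlertResponse.ACKNOWLEDGE:
        return Action.EJECT
    # Operator cancelled the alert. If the tape has been moved to a usable
    # media set in the meantime, the job can carry on with the same tape.
    if media_now_writable:
        return Action.RESUME_JOB
    return Action.HOLD_JOB
```

The point of the sketch is only that the eject sits behind the operator's response instead of happening when the alert is raised.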

PS: Just because something has always been or behaved the same does not mean it is right or cannot be improved upon. If this was the case we would all still be running around in furs clubbing each other on the head.

CraigV's picture

...that might be worth an idea, although ejecting a tape would be by design...have you checked the settings to see if it's a simple case of deselecting a setting to prevent the tape ejecting?

Thanks!


deejerydoo's picture

Hi Craig,

Yea... tried that in the extensive testing with L1 and L2 engineers.

The current behaviour of BEWS is to have a hissy fit and spit the tape (not the dummy). Whereas changing this from a default behaviour would, as far as I can see, only improve the recoverability of otherwise lost jobs.

Come to think of it, I can't actually think of a scenario where ejecting the tape, by default, is a good idea at all... Can you think of any?

Surely the only time the tape should be ejected as the immediate default behaviour would be if the tape is full, or if the job has finished and the job definition specifies it. If the remote operator wants to eject a tape there are many ways this can be done. If somebody is on site and only has physical access to the tape drive, they can also eject the tape. Why make it the default action and only increase the chances of jobs failing? As long as your media protection is set correctly you should never need to have the tape ejected as a default action, unless the media fills up or the job has finished. The latter of which I do not have set for this job.

Note to all of you who are saying this is the way it has always been done. Stating this does not help us. I am not disputing how long this has been the behaviour. I am challenging what I think is, at best, a great improvement for the product and, at worst, a fix for a design flaw.

CraigV's picture

For sure...and I've actually experienced this before. I looked after sites in some real backwater places in Africa. Using stand-alone tape drives, when a tape ejected, it took a site engineer hours to drive to the site to put a tape in, or they had to rely on someone to put the tape in. Most times this never happened, and therefore cue up 5% backup success rates for example. Hence I got the go-ahead to put in HP StorageWorks MSL2024 G3 libraries...

I'd definitely vote this up if it was an Idea...post back with the link when you have done so.


deejerydoo's picture

Thanks Craig. I have other clients with BEWS and autoloaders and, obviously, the default behaviour makes a lot more sense. However, just about every small/remote/unattended site using BEWS will be experiencing this problem and will have done, in theory, for multiple revisions of BEWS/BENT. One of my points about this case is that this is a design error; with an effect that is just as real as any software coding error and therefore should be handled with a bug fix and not by a feature request. Currently, this behaviour essentially makes BEWS not fit for the purpose for which it was intended (in this case remote unattended backup sites). Also, considering the screaming issues BE2012 is currently suffering, unless Symantec accept this as a failure/fault in their design and issue a bug-fix number, I don't think it's going to get looked at for a good while, if ever.

I've seen plenty of other posts about this issue and you have also said you have encountered it and were lucky enough to get the autoloaders. However, the resistance from Symantec support, I have had regarding this, has been significant and the engineers have been persisting with their "this is by design" mantra. So, this is going to be one of those problems that most people let go because they hit the "by design" firewall from support. But I will not lie down... Oh no... I'm mad now...

CraigV's picture

Hehe...well, I'd say support will ALWAYS say that. They're kind of "scripted" with what they can do, but Connect is the place where you yourself can put an idea forward, and with enough votes, get it considered for some future release.

I doubt that Symantec will retrofit this into older products (as in BE 2010 R3 for instance), but going forward it would make putting an approved idea in easier.

Thanks!


CraigV's picture

Easiest way to do so is to put that link in as your signature which you can do on your profile.

I've voted the idea up too!

Cheers!


pkh's picture

Why make it the default action and only increase the chances of jobs failing?

How?

As long as your media protection is set correctly

Going by the discussions posted in the forum, there are a lot of OPP not set correctly.

 

Ejecting the tape saves the operator a step.  Also, the operator may press the power button instead of the eject button, thus causing more problems.

deejerydoo's picture

Hi PKH,

How?

As I have already stated, by making eject the default immediate action you remove ALL opportunities to even try to recover a backup job that has media overwrite/append issues.

Going by the discussions posted in the forum, there are a lot of OPP not set correctly.

I think you are not crediting the majority of people who use this product correctly and don't screw up their OPP. Also, there is always the opportunity for the operator to work out OPP and avoid issues with that. At the moment, it doesn't matter how well you understand OPP... There is no choice with the tape being spat out... FULL STOP! Why write the software to cater for the minority of people not using it correctly, when the majority who are using it correctly will benefit from the change/fix I am proposing?

Ejecting the tape saves the operator a step.

Forcing the operator to deal with an ejected tape in a remote backup job is a far bigger problem than perhaps saving them the odd eject job. The majority of reasons to eject the tape are covered in other default behaviours and job specific settings in BEWS. i.e. Tapes can be set to eject at the end of a job, as part of the job definition. Also, tapes will automatically eject if they are full. Both very sensible automatic tape eject scenarios.

There are no other scenarios where you would want the tape to be ejected automatically!!

Not when there's a mismatch between the job and the OPP, that's for sure! Just because the tape hasn't been ejected doesn't mean it's being overwritten. Tape not ejected does not mean the same as tape being written to! Remember... My point is that the media remove alert still has to be acknowledged or cancelled before the tape is ejected or written to.

Also, the operator may press the power button instead of the eject button, thus causing more problems.

Please!!!! The operator may also decide to stick their head in a lit oven. How do you propose Backup Exec prevents this?

None of your reasons so far equate to the fundamental failures caused by the tape being ejected as the default, immediate action, with no operator action, for backups at unattended sites.

Larry Fine's picture

IIRC, the old DLT class of tape drives could actually "suck" the tape back into the drive when it was partially ejected.  Therefore, in the old days, I think that the alert options to acknowledge or cancel the eject actually caused different results.

I am simply throwing this out as an historical aside.  In no way am I dismissing your suggestions for improvements with modern hardware.


deejerydoo's picture

Hi Larry,

Have you experienced this problem with BEWS?

If yes, please vote up my idea! :-)

https://www-secure.symantec.com/connect/ideas/stop-default-tape-eject-behaviour-be-2012

Cheers,

David.

Larry Fine's picture

I think there are a couple of scenarios where a remote admin will be frustrated by an ejected tape.  So far, I don't see a downside to reducing BE's sometimes overly aggressive eject behavior, so I upvoted the idea so that it gets investigated.


deejerydoo's picture

I have found out how the media in my OP got misallocated in the scheduled backup job. It wasn't the backup job engine, it was this...

If a scheduled backup job has a problem it is often immediately put on hold by BEWS. This means that the job can easily miss its next scheduled execution time before the operator is able to get to it. This is fair enough...

Once the operator has had a chance to resolve whatever caused the backup job to be put on hold, AFAIK, there is no way to simply tell BEWS to not run the missed job (if it is still within its schedule window).

What you have to do is:

1. Take the job off hold and allow it to start running.

2. Once the job is running, then, and only then, you can cancel the job.

The problem with this is that if you don't cancel the job before it assigns the tape in the drive to your backup media set, you have to remember to go back to the media and assign it to a media set that will allow the next night's scheduled job to run.

Now I understand what has happened I can adjust my working processes to cater for it. However, could the behaviour of BEWS be enhanced to improve this situation? That bit I haven't yet had time to apply myself to... I'm keener on the bigger and currently unresolvable issue of tape ejections!

Moe Howard's picture

There may be a workaround to pull the tape back in to the drive without user intervention.

I believe Windows allows a SCSI command to be sent to a SCSI Port driver:
http://msdn.microsoft.com/en-us/library/ff565387%2...

Sending the LOAD UNLOAD SCSI command (1B) to the tape drive would pull the tape back in to the drive after the media has been unloaded. The HP LTO drives support this feature.
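For anyone who wants to experiment with this, below is a rough Python sketch that only builds the 6-byte LOAD UNLOAD command block (opcode 1Bh); it does not send anything to a drive. The bit positions (IMMED in byte 1; HOLD, EOT, RETEN and LOAD in byte 4) follow my understanding of the SCSI stream command set and should be checked against the drive's own SCSI reference. `build_load_unload_cdb` is a made-up helper name, and actually delivering the bytes needs a pass-through mechanism such as the one behind the MSDN link above, which is not shown here.

```python
LOAD_UNLOAD_OPCODE = 0x1B  # LOAD UNLOAD for sequential-access (tape) devices


def build_load_unload_cdb(load=True, immed=False, reten=False,
                          eot=False, hold=False):
    """Return the 6-byte CDB; load=True asks for a load, False for an unload."""
    byte1 = 0x01 if immed else 0x00  # IMMED: return status before completion
    byte4 = (
        (int(hold) << 3)     # HOLD
        | (int(eot) << 2)    # EOT
        | (int(reten) << 1)  # RETEN (retension)
        | int(load)          # LOAD: 1 = load, 0 = unload
    )
    # Bytes 2 and 3 are reserved; byte 5 is the control byte (left as 0).
    return bytes([LOAD_UNLOAD_OPCODE, byte1, 0x00, 0x00, byte4, 0x00])


# CDB asking the drive to load the cartridge again
cdb = build_load_unload_cdb(load=True)
print(cdb.hex(" "))
```

Whether a given drive will physically pull a cartridge back in after an eject is a hardware question, so treat this as a starting point for testing and not as a confirmed workaround.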

 

 

 

deejerydoo's picture

Hi Moe, Thanks for your input.

If that is the case then you may have just given the Symantec developers an alternate way of fixing the problem. All they would need to do is get the "Media Removal Alert" to behave differently, so that:

BEWS issues that SCSI command if the operator hits the "Cancel" button in the media removal alert.

However, my solution would also work for drives that don't support pulling the media back in. This is because they wouldn't need to pull the tape back in, in the first place, if BEWS didn't spit the tape out when the media removal alert was created. i.e. Only spit the tape out if the remote operator acknowledges the media removal alert.

PS: If you have encountered annoying tape ejection issues in BEWS or just think my idea is a good one, please vote it up!

https://www-secure.symantec.com/connect/ideas/stop-default-tape-eject-behaviour-be-2012

shreekumar's picture

Hi friends,

I am using Backup Exec 2010 R3 on Windows with a stand-alone tape drive. Every morning the local IT person rotates the media and inserts an overwritable tape, but at my end the backup job sits in the Queued state. To get the media to show in the drive I cancel the job and run an inventory; by the time the inventory completes I can see the overwritable media. But as soon as I run the backup job again, the job goes back into the Queued state and I can no longer see the media in the drive because it has been ejected automatically.

What is the issue? It would be helpful if you have a solution for this.

RickkeeC's picture

I manage a hundred remote servers, and luckily only one is still using BE.  I have y'all beat as I have been using the product since the Maynard / Connor days.  They should go back to the old Connor interface and start from scratch with this software as it has been added on and patched for over 20 years now.

Some drives supported an 'inject' command which I believe was in the Connor days and quickly removed from Backup Exec as other features were removed and made more complicated.  By Design.  Yes.  Poor Design.

Even if they put back the 'inject' command, RD1000 drives don't support it, so please give me a simple option to "DISABLE TAPE/DISK EJECTION" across the board.  Easy.  Just a tick-box that disables it.

Now the problem is that running remotely, we all know how much maintenance this product needs, it jumps the track at least monthly, and when it does, it requires someone's intervention to go get the key to the server room, and push the tape back in.

This is 2014, and it's simply asinine that someone has to walk up to the machine to push a stubborn disk back in the drive.

Guess what?  In order for it to not immediately spit the disk back out, you have to:
1. Place all the jobs on hold.  (Not the server on hold, as you can't do anything with the 'server' on hold.)
2. Go into each job and retarget the date to 'start' on a date today, or later than today.
3. Take all the jobs off hold.
4. Watch for a job to start, because even though you told all the jobs to start on a different day, there is usually one that didn't get the command and it will start/stop and eject the disk again.
5. Quickly cancel that stubborn job.
6. Delete the media on the disk in the drive.
7. Manually log in at the job execution time to make sure it worked.

Countless hours and embarrassment sending a client to the server room each month to poke a tape or disk back in the drive.  Maybe they can come up with a USB controlled servo-finger to mount on the top of the server that would push the tape back in.  Now there is an idea they might look at, an add on feature to "automate" this task because it would mean more revenue.  "Symantec Magic Finger USB Tape Injector"  :)

Can't wait to get this one client off of Symantec and on to BackupAssist that will let me have my way with my data without any fuss.  Does Symantec even look at what the up and coming competitors are doing when they write a backup program from scratch?  Or are they content in raising the price every year, making you purchase the program again and again with no discount, as they continually introduce more bugs into every new and improved edition.... Content to sit by idle as the client base slowly erodes and moves to different competitors?  Only time will tell.  I have voted with my purchase of over 100 copies of BackupAssist.  This seems the only way to send a message that will get some attention.  Unfortunately, I will never look back.  Ditto with the Endpoint Security malware produced by this company.

deejerydoo's picture

Hi Guys,

 

If you've posted your woes on this thread, please make sure you have also posted your support for the suggestion for a fix on the following article:

https://www-secure.symantec.com/connect/ideas/stop-default-tape-eject-behaviour-be-2012

I've got a sinking feeling in my gut that tapes are still going to be spat out aplenty in 2014!

Cheers,

David

deejerydoo's picture

And Backup Exec has done it to me again!

Spat the tape out when it shouldn't have. It should always give me a chance to rectify or override any media restrictions on the tape, if I need to, without spitting it out!

How Backup Exec got it wrong this time:

1. Backup job failed after writing a small amount of data to the tape. The media set prevents overwrite for 6 days but allows append for 5.

2. I restarted the server in the process of resolving the issue that caused the backup failure.

3. Kicked off the backup job again, to the same tape in the drive.

4. Backup Exec reads the tape and decides, for some unknown reason that it doesn't like it, spits it out and asks me to insert overwritable media!?

5. What about the perfectly good appendable media that was available on the tape already in there? There was heaps of space for all backup jobs to finish, with plenty to spare, even accounting for the small amount of data from the earlier failed job!
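As a sanity check on the numbers in step 1, here is a small Python sketch of how the two periods might be evaluated. It rests on assumptions that are not confirmed anywhere in this thread: that the append period is counted from when the tape was first allocated to the media set, and that the overwrite protection period is counted from the last write. `media_state` and its parameters are hypothetical names, not anything from Backup Exec.

```python
from datetime import datetime, timedelta


def media_state(now, first_allocated, last_written,
                append_days, overwrite_protect_days):
    """Return (appendable, overwritable) under the assumptions stated above."""
    # Assumption: appends are allowed until the append period runs out,
    # measured from the first allocation to the media set.
    appendable = now < first_allocated + timedelta(days=append_days)
    # Assumption: overwrites are blocked until the protection period,
    # measured from the last write, has elapsed.
    overwritable = now >= last_written + timedelta(days=overwrite_protect_days)
    return appendable, overwritable


# Failed job wrote a little data the night before; job is re-run next morning.
# The dates are made up purely for illustration.
first_allocated = datetime(2014, 1, 6, 22, 0)
last_written = datetime(2014, 1, 6, 22, 10)
now = datetime(2014, 1, 7, 9, 0)
print(media_state(now, first_allocated, last_written, 5, 6))
```

Under those assumptions the tape would still be inside its append period and not yet overwritable, which is why I expected an append, not an eject. Free space on the tape and the job's own append/overwrite option would also come into it, and neither is modelled here.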

Symantec, You have got to redesign the way your software interacts with stand alone tape drives. This is ridiculous and unforgivable that this has been the abhorrent behaviour in Backup Exec for years and for no good reason! Not a single person has been able to come up with a sensible reason as to why your product is so eject happy, other than poor product design.