Jobs stuck indefinitely in Queued Status

Created: 13 Jan 2014 | 10 comments

We have had an ongoing issue for about 2 months now, and it has persisted through clean builds (3 times now) of Backup Exec 2012. We have opened numerous cases with Symantec to resolve this, and the first time they claimed that HotFix 209149 (see http://www.symantec.com/business/support/index?pag... ) corrects the issue. The issue also shows up as Robotic Element errors stating that the OST media is full, which takes the "virtual" slot for the PureDisk or OST storage device offline. Restarting the services in BE only makes the problem worse and causes a snowball effect whereby jobs constantly error in the ADAMM log files. Essentially, the jobs can never get a concurrency/virtual slot, and they stay Queued forever.

I have seen others on this Forum with this problem, and while the Forum administrator seems to mark their threads as "Solved", they are not, because I see the threads drop off with no resolution identified.

Are other people having this problem? If so, how are you overcoming it? Once it starts, the environment is essentially dead in the water because the jobs never start (they sit Queued forever), save for one concurrency, which for an environment our size is only 1/4 of what we need.

We use a CAS and 4 MMS servers on 2008 R2 with all patches applied, a 32TB PureDisk volume on each MMS, a Data Domain OST connection to a DD-670 OST with OST-Plug 2.62 (28TB), replicated catalogs, and duplicate jobs for optimized deduplication between MMSs. We run BE 2012 SP3 clean: we reinstalled with SP3 slipstreamed because Symantec said this problem could be fixed either through a manual database repair on their side or by a reinstall. We chose the reinstall, even though they did offer to "fix" the issue with a database repair, to validate whether SP3 truly fixes the issue. It is clear to us it does not.

We are looking for anyone else who has had this problem to report in on this forum.

Thank you,

Dana

10 Comments

Steven L.

Hi drmont1,

Supposedly, Symantec silently changed the way they write data to the .bkf file in SP3. I learned this while diving deep into my VSS issues, which you confirmed (thanks). Check with EMC DD to see if they have a codeset or firmware upgrade.

-Steven

Snip3r6659

drmont1,

I share your frustration in this. We too have the same issue with jobs being stuck in the queue, although it is a bit more random. Sometimes I will come in and find that a job has been stuck since the night before; I cancel it, and it runs fine the next night! Sometimes recreating the job fixes the issue, and in some cases nobody seems to know why it is stuck!

We too have done a clean build of our MMS, putting it entirely on a standalone server (Symantec's recommendation). We had been using what Symantec called a non-supported disk for our deduplication, so we replaced it with a "supported" device, although I highly doubt that makes any difference. Neither of these suggestions has solved the issue.

We are performing our backups to deduplication disk, and the jobs always seem to pick and choose which one wants to stick on which night! Yet I've found that if I run a backup to disk on the same server that a deduplication job is stuck on, it runs without an issue.

Frustrated and confused...  is there a patch for that?

kkoekkoek

We've been having the same issue with a pretty similar setup. I actually found that disconnecting the MMS from the CAS and just running it standalone after a re-inventory seems to make the issue go away, but as soon as we reconnect it to the CAS it starts randomly queuing jobs forever. So I'm pretty much no help, but at least you know there are more frustrated users.

Snip3r6659

All,

Same problem even running Backup Exec 2014. I too have a CAS/MMS environment, and what's weird is that we have the exact same hardware at each site!

We are running a virtual environment on Server 2008 R2 and using a Drobo B1200i, which is iSCSI connected. The Drobo has all SAS drives with an SSD data tiering stage. When it runs, it runs extremely fast! The keyword, however, is "when it runs".

Deduplication is 100% random...
Sometimes recreating the job works, although it is random. We have scanned the drive for bad media with stsinv, reinstalled, repaired installs, and run network tests, speed tests to disks, and permission checks. You name it, we have tried it! One weird thing I have noticed is that sometimes running a small job of one file seems to "jump start" a backup and make it run for a few days again?! The funny part is, even if a backup job will not run, a restore seems to always work!

Honestly, at this point I am about ready to give up on deduplication altogether, as every disk backup I run to the Drobo runs fast and consistently. I am possibly even at the point of considering another backup solution. I can't have this inconsistency in my environment. If it wasn't for the space-saving features of dedup, I would not even be having this conversation. And it seems like the further you go, the less support you get from Symantec technicians! I get it, they are confused and probably frustrated as well... but how do you think it makes the end user feel who is not getting reliable backups? I used to have success asking my software vendor to call and poke the bear, so to speak, but even now it's hard to get a call back.

I get it, maybe you are confused and don't understand what is going on, Symantec... but don't leave us hanging! Shouldn't the sheer volume of posts like this on the web prompt action? Can this really be that big of an unsolvable issue?

100% frustrated with no direction to go over here, and the fact that I am not alone is no comfort whatsoever!

Jimmm

Not sure if this thread is still active, but I share the frustration of the above posters. I have been using Backup Exec since it ran on a Novell server & was made by Arcada. Ever since the upgrade to 2012, then 2014, the platform has become so unreliable that I have almost lost confidence in Symantec's ability to fix it.

I am having the same Queue issue, as well as multiple external media issues, too numerous to detail in this post. When I talk to Symantec, their answer is always the same: recreate the job, recreate the media, recreate the catalogs... How about making a product that does not require constant fixing?

Please, someone tell me that I am mistaken & that there is a way to make this software reliable again. 

Thanks, 

Jim

Snip3r6659

Jimmm,

My problem still exists, as does the circle of unreliable help and support. I have tried everything I can think of. Symantec support has deleted my dedup disks 3 times now, and it seems no matter how loud I yell, nobody wants to take ownership of the issue.

They implement a "fix", and most times I never hear back from the tech during the "report back" period.

I have invested in Backup Exec consultants who look at our environment, and to be honest I'm starting to look at alternative software. I love Backup Exec, but I feel like I am stuck in a loop, and it's not exactly comforting to see this many other users with the same issue still unresolved!

Unfortunately I have nothing helpful to report, as I find it difficult to even find a reliable support tech who calls me back.

dhansen10

It's sad to hear that this wasn't corrected in 2014. I had this issue in the 2010 version, and the eventual fix was to identify the bad pieces of media that it was trying to find and remove them through SQL. Although I understand the concepts behind the fix, you need to be pretty adept at using SQL and know how to identify the bad pieces of media. It took me six months of working with support to finally get it escalated to the correct person, but once in his hands it was resolved somewhat quickly. If you all continue to have these issues, have you tried talking to the on-call duty manager or tried to get this escalated? Good luck, and if that fails, it's time to get rid of it.

dennisblotenburg

I also have the same issues as described in this thread.
Is there a solution to fix this yet?

steve prindeville

I have a similar issue, but only when writing to disk backup. The backups to the Amazon VTL are working fine.