Jobs stuck indefinitely in Queued Status

Created: 13 Jan 2014 | 4 comments

We have had an ongoing issue for about 2 months now, and it has continued even after clean builds (3 times now) of Backup Exec 2012. We have opened numerous cases with Symantec to resolve this, and the first time they claimed that HotFix 209149 (see http://www.symantec.com/business/support/index?pag... ) corrects the issue. The issue also shows up as Robotic Element errors stating that OST media is full, which takes the "virtual" slot for the PureDisk or OST storage device offline. Restarting the services in BE only makes the problem worse and causes a snowball effect whereby jobs constantly throw errors in the ADAMM log files. Essentially, the jobs can never get a concurrency/virtual slot and they stay Queued forever.

I have seen others on this Forum with this problem, and while the Forum administrator seems to mark their threads as "Solved", they are not - the threads drop off with no resolution identified.

Are other people having this problem? If so, how are you overcoming it? Once it starts, the environment is essentially dead in the water because the jobs never start (they sit Queued forever) - save for one concurrency slot, which for an environment our size is only 1/4 of what we need.

We use a CAS and 4 MMS servers on Windows Server 2008 R2 with all patches applied, a PureDisk 32TB volume on each MMS, a Data Domain OST connection to a DD-670 with OST-Plug 2.62 (28TB), replicated catalogs, and duplicate jobs for optimized deduplication between MMSs. We run a clean BE 2012 SP3 install. Symantec said this problem could be fixed either by having them repair the database manually or by reinstalling, so we reinstalled with SP3 slipstreamed (even though they did offer to "fix" the issue with a database repair). We chose the reinstall to validate whether SP3 truly fixes the issue. It is clear to us it does not.

We are looking for anyone else who has had this problem to report it in this forum.

Thank you,

Dana

Steven L.

Hi drmont1,

Supposedly, Symantec silently changed the way they write data to the .bkf file in SP3. I learned this while diving deep into my VSS issues, which you confirmed (thanks). Check with EMC DD to see if they have a codeset or firmware upgrade.

-Steven

Snip3r6659

drmont1,

I share your frustration in this. We too have the same issue with jobs being stuck in the queue, although it is a bit more random. Sometimes I will come in and find that a job has been stuck since the night before, and I cancel it and it runs fine the next night! Sometimes recreating the job fixes the issue. And in some cases nobody seems to know why it is stuck!

We too have done a clean build of our MMS, putting it entirely on a standalone server (Symantec's recommendation). We had been using what Symantec called a non-supported disk for our deduplication, so we replaced it with a "supported" device, although I highly doubt that makes any difference. Neither of these suggestions has solved the issue.

We are performing our backups to deduplication disk, and the jobs always seem to pick and choose which one gets stuck on which night! Yet I've found that if I run a backup to disk on the same server that a deduplication job is stuck on, it runs without an issue?

Frustrated and confused... Is there a patch for that?

kkoekkoek

We've been having the same issue with a pretty similar setup. I actually found that disconnecting the MMS from the CAS and just running it standalone after a re-inventory seems to make the issue go away, but as soon as we reconnect it to the CAS it starts randomly queuing jobs forever. So I'm pretty much no help, but at least you know there are more frustrated users.

Snip3r6659

All,

Same problem even running Backup Exec 2014. I too have a CAS/MMS environment, and what's weird is that we have the exact same hardware at each site!

We are running a virtual environment on Server 2008 R2 and using a Drobo b1200i which is iSCSI connected. The Drobo has all SAS drives with an SSD data tiering stage. When it runs, it runs extremely fast! The keyword, however, is "when" it runs.

Deduplication is 100% random... Sometimes recreating the job works, although it is random. We have scanned the drive for bad media with stsinv, reinstalled, repaired installs, and run network tests, speed tests to disks, and permission checks. You name it, we have tried it! One weird thing I have noticed is that sometimes running a small job of one file seems to "jump start" a backup and make it run for a few days again?!?! The funny part is, even if a backup job will not run, a restore always seems to work!

Honestly, at this point I am about ready to give up on deduplication altogether, as every disk backup I run to the Drobo runs fast and consistently! Possibly even to the point of considering another backup solution. I can't have this inconsistency in my environment. If it wasn't for the space-saving features of dedup I would not even be having this conversation. And it seems like the further you go, the less support you get from Symantec technicians! I get it, they are confused and probably frustrated as well... but how do you think it makes the end user feel who is not getting reliable backups? I used to have success asking my software vendor to call and poke the bear, so to speak. But even now it's hard to get a call back.

I get it, maybe you are confused and don't understand what is going on, Symantec... but don't leave us hanging! Shouldn't the sheer volume of posts like this on the web prompt action? Can this really be this big of an unsolvable issue?

100% frustrated with no direction to go over here. And the fact that I am not alone is no comfort whatsoever!