Deployment and Imaging Group

  • 1.  DS 6.9 sp5 - automatic retry on error?

    Posted Oct 25, 2011 01:27 PM

    Recently I've been having issues with random jobs stalling out for no apparent reason, mostly with status code 10061: "No connection could be made because the target machine actively refused it."

    The really annoying part is that I can always just right-click and choose Retry Task, and it works perfectly, and the job usually goes on to finish the rest of the build just fine.

    I'd love it if there were a simple way to make these tasks retry automatically, as just another error-handling option. I have yet to see one of these fail again after I hit Retry Task on it, so I'm not worried about it getting stuck in a loop (though it would be nifty if Altiris/Symantec had written something into the logic to break out of such a loop).
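To make the idea concrete, here is a minimal sketch of the bounded retry described above: retry a failed step a few times, then give up so it cannot loop forever. It is a generic Python illustration, not a DS 6.9 feature or API. The names `run_with_retry`, `max_attempts`, and `delay_seconds` are hypothetical, and it assumes the failing step can be wrapped as a callable that raises on error.

```python
import time


def run_with_retry(task, max_attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is reached.

    task is any zero-argument callable that raises an exception on failure.
    Returns (result, attempts_used). Re-raises the last error if every
    attempt fails.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except Exception:
            if attempt == max_attempts:
                # Give up here: the attempt cap is what prevents an endless loop.
                raise
            # Pause before the next attempt so a transient refusal can clear.
            sleep(delay_seconds)
```

The `sleep` parameter is injectable only so the delay can be skipped when trying the function out. Note that a retry inside a script only helps if the failure happens inside that script; if the refusal occurs before the task content ever runs, the retry would have to come from whatever schedules the task.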
     

    So it'd be nice if it could be done with an error handler. I don't want the job to stop, I don't want it to continue (since I need it to do the stuff in the task), and I don't want to write a task for every step in every job - that'd be several hundred new jobs.

    Any suggestions?



  • 2.  RE: DS 6.9 sp5 - automatic retry on error?

    Posted Oct 26, 2011 08:58 AM

    Shouldn't you fix the error if it applies to several hundred jobs?
    http://www.symantec.com/docs/TECH38767
    http://www.symantec.com/docs/TECH25766

    Is an IP address and port listed as part of the error?



  • 3.  RE: DS 6.9 sp5 - automatic retry on error?

    Posted Nov 15, 2011 05:03 PM

    We only rarely schedule to multiple machines at once, and even when I do schedule multiple machines, it fails just as rarely. We do have a pretty huge chained build job for both XP and Win7, and I've done a good bit of work (all in the past, none recently) to modularize it and break it down into lots of smaller parts rather than one massive job. I don't use file copy tasks - if I copy anything, I do it with robocopy. The steps it seems to fail on most often are anything but long-running: the most common one is a little VBScript that adds an AD group to the local admins group, which runs nearly instantly when it doesn't fail this way. The step before that simply sets the machine to use a specific power profile, which is also near instant.

    No ports are being blocked. Windows Firewall is disabled and the machine has no antivirus at that point in the build process. Simply retrying the task (and having it work) proves that it's not a blocked port, and that it's not an oversized transaction log.
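As a rough way to check the port question directly, here is a small Python sketch that makes one TCP connection attempt and reports whether it was accepted, actively refused, or timed out. Code 10061 is the Winsock name for a refused connection (WSAECONNREFUSED), which Python raises as `ConnectionRefusedError`. The function name `probe_port` is hypothetical, and the host and port are whatever address appears in the error, which this thread does not state.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Return 'open', 'refused', 'timeout', or 'error' for one TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Winsock 10061 / ECONNREFUSED: the host answered, but nothing
        # accepted the connection on that port at that moment.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError:
        # Anything else: unreachable host, name resolution failure, and so on.
        return "error"
```

A "refused" result generally means the target machine was reachable but had no listener accepting on that port at that instant, whereas silently dropped traffic tends to show up as a timeout instead. Running the probe in a loop while a build is in progress might show whether the listener drops out briefly.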

    I already use wlogevent quite a bit to put more useful status messages in the console for my techs.

    I have noticed that it mostly happens on tasks that use VBScript, and only during production. Thing is, it only fails on about 1 in every 10 machines, so I can't easily reproduce it.



  • 4.  RE: DS 6.9 sp5 - automatic retry on error?

    Posted Nov 28, 2011 05:16 PM

    Hey Jason,

    We solved a lot of our job hanging issues by changing the jobs to robocopy the files locally first, then install from there.



  • 5.  RE: DS 6.9 sp5 - automatic retry on error?

    Posted Nov 28, 2011 05:38 PM

    I'm actually doing all the robocopying during WinPE right after the image rather than for each individual job. That alone made things a bit more reliable (sometimes robocopy wouldn't copy, or would fail partway through, when called in production), and the installers "feel" like they go faster since they run with less delay between each one. So basically the only things coming from the DS while in production are the actual task contents (axscript.vbs, etc.).
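For anyone staging files this way, here is a hedged sketch of a robocopy call with bounded retries and an exit-code check. It is written in Python purely for illustration; the same switches work from a batch file. robocopy's defaults are 1,000,000 retries with a 30-second wait, so a locked or unreachable file can look like a hang unless `/R` and `/W` are set. The function names and any paths passed in are hypothetical.

```python
import subprocess


def build_robocopy_command(source, destination, retries=3, wait_seconds=5):
    """Build a robocopy argument list that copies a directory tree with bounded retries."""
    return [
        "robocopy", source, destination,
        "/E",                  # copy subdirectories, including empty ones
        f"/R:{retries}",       # retry each failed file this many times
        f"/W:{wait_seconds}",  # seconds to wait between retries
        "/NP",                 # no per-file progress percentage in the output
    ]


def robocopy_succeeded(exit_code):
    """robocopy exit codes are a bitmask; 8 or higher means at least one failure."""
    return 0 <= exit_code < 8


def stage_files(source, destination):
    """Run robocopy (Windows or WinPE only) and report whether the copy succeeded."""
    result = subprocess.run(build_robocopy_command(source, destination))
    return robocopy_succeeded(result.returncode)
```

Checking for a code below 8 rather than for 0 matters because robocopy returns 1 when files were copied successfully, so a plain "nonzero means failure" test would flag good copies as errors.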