Deployment Solution

  • 1.  DS 6.9 SP6 delay with run this job immediately

    Posted Aug 14, 2015 01:07 PM

     

    Usually DS 6.9 will start a job immediately, even when the "allow task to be deferred up to 5 minutes" option is checked.  During our busy season (school startup), the engine starts waiting a full 5 minutes with a blank status before it runs the task.  I do not know what the exact threshold is, but I believe it's either 100 or 200 scheduled jobs.  Here's what I've observed in the database.

    event: holds a list of every job in the console and stores the description and event_id for each.

    computer: holds a list of every computer that has an agent on it.  Each is assigned a computer_id.

    event_schedule: seems to hold a list of all jobs that were scheduled, as well as the history of jobs that were run.  Key columns are as follows.

    • computer_id
    • event_id (each job in DS has an event_id associated with it)
    • status (the text that displays in the status column in the console)
    • start_time (when you told the job to run)
    • broker_time (when the job actually started running)
    • end_time (when the job finished)
    • status_code (the return code of the job after it finishes)
    • defer_mins (the option on the schedule tab for allowing the job to be delayed up to x minutes)

    Using this query, I find we have about 212 jobs scheduled in the future:

    select computer.computer_name,event_id,event_schedule.status
    from computer
    left join event_schedule on computer.computer_id = event_schedule.computer_id
    where event_schedule.status='' or event_schedule.status is NULL

    When the status is NULL, that usually means start_time is in the future.  However, I've found several jobs in the past where both event_id and status are NULL.  In these cases, the jobs do not show up in the run history in the console.  I can't fathom how anything got into this table without an event_id, so I consider it a glitch.  If I remove all the jobs with a NULL event_id, I'm down to 72 from 212.
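    One possible explanation for the rows with a NULL event_id, offered as an assumption rather than a confirmed diagnosis: because the query starts from computer and uses a LEFT JOIN, any computer with no row at all in event_schedule still comes back, with NULL in every event_schedule column.  The sketch below reproduces that effect in Python with an in-memory SQLite database standing in for SQL Server.  The cut-down schema and the machine names are hypothetical.

```python
import sqlite3

# Hypothetical miniature of the two tables; the real schema has many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE computer (computer_id INTEGER PRIMARY KEY, computer_name TEXT);
    CREATE TABLE event_schedule (computer_id INTEGER, event_id INTEGER, status TEXT);
    INSERT INTO computer VALUES (1, 'LAB-PC-01'), (2, 'SERVER-01');
    -- Only LAB-PC-01 has a schedule row; SERVER-01 has never had a job.
    INSERT INTO event_schedule VALUES (1, 500, NULL);
""")

QUERY = """
    SELECT computer.computer_name, event_schedule.event_id, event_schedule.status
    FROM computer
    {join} event_schedule ON computer.computer_id = event_schedule.computer_id
    WHERE event_schedule.status = '' OR event_schedule.status IS NULL
    ORDER BY computer.computer_name
"""

# Same query as in the post: the never-scheduled server appears with NULLs.
left_join_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

# With an inner join, only rows that really exist in event_schedule remain.
inner_join_rows = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()

print(left_join_rows)
print(inner_join_rows)
```

    If this is what is happening, counting directly from event_schedule (or switching to an inner join) would show how many of the 212 rows are real schedule entries.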

    For the jobs scheduled as "run now" that wait 5 minutes, the status shows up as blank.  It then turns to "running script" when the 5 minutes elapse.  This behavior occurs whether defer_mins is 5 or 0, so DS doesn't seem to respect this setting.
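    To see which jobs were held back, the gap between start_time and broker_time can be measured directly.  This is a minimal sketch using Python and an in-memory SQLite database as a stand-in; the sample rows are made up, and on SQL Server the julianday arithmetic would be replaced by DATEDIFF(second, start_time, broker_time).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical subset of event_schedule with invented sample data.
    CREATE TABLE event_schedule (event_id INTEGER, start_time TEXT, broker_time TEXT);
    INSERT INTO event_schedule VALUES
        (500, '2015-08-14 13:00:00', '2015-08-14 13:00:04'),  -- started promptly
        (501, '2015-08-14 13:10:00', '2015-08-14 13:15:00'),  -- waited 5 minutes
        (502, '2015-08-14 13:20:00', NULL);                   -- not started yet
""")

# Delay in whole seconds between the requested start and the actual start.
delays = conn.execute("""
    SELECT event_id,
           CAST(ROUND((julianday(broker_time) - julianday(start_time)) * 86400)
                AS INTEGER) AS delay_seconds
    FROM event_schedule
    WHERE start_time IS NOT NULL AND broker_time IS NOT NULL
    ORDER BY delay_seconds DESC
""").fetchall()

for event_id, delay_seconds in delays:
    print(event_id, delay_seconds)
```

    Jobs that cluster around 300 seconds would be the ones hitting the delay described above.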

    We have three separate installs of DS, and the one with only 150 events scheduled (comparable to the 212 number) doesn't delay anything.  Thus DS seems to be counting these erroneous entries with NULL event_ids when determining whether it is overwhelmed.

    So it'd be nice to be able to raise this threshold, maybe to 1000, and/or clean up these errant table rows to restore normal functionality, but I'm not sure what the impact would be.  I'll consider backing up the database and giving it a try to see how it goes.

     



  • 2.  RE: DS 6.9 SP6 delay with run this job immediately

    Posted Aug 16, 2015 08:37 AM

    I have found that the NULL values in the start and end columns occur because the computer in question is currently running another job that either had the same scheduled start time or was also targeted with a "run now" schedule.  Jobs scheduled with a future date and time will have a start column value.



  • 3.  RE: DS 6.9 SP6 delay with run this job immediately

    Posted Aug 18, 2015 12:36 PM

    I'll keep an eye on that for computers with multiple jobs being run.

    We still have several machines that should never have had a job run on them (servers) that show up in the list with no event_id and NULL values.  These machines show no history in the console.



  • 4.  RE: DS 6.9 SP6 delay with run this job immediately

    Posted Aug 31, 2015 05:21 PM

    I have managed to solve this problem with a little database cleanup.  As best I can figure, the more jobs are scheduled, the slower job startup gets, and the culprit is older jobs that never completed and are just sitting there.

    We deleted a whole branch of old jobs in DS using the console.  Upon deletion, DS warned us that some of the items were scheduled to run and asked us to confirm the deletion.  Just deleting jobs probably won't help, but if they are scheduled and stuck, it will.

    Here are the queries to clean up the other entries that can slow down job startup.

    1. Back up your database before making any edits.
    2. Stop all three Altiris services on the server running the console.
    3. Make sure your express database is selected when running these queries.
    4. List the jobs that were scheduled to run but never started.  They could be set to run on machines that no longer exist, or they could be hitting the 5-minute delay problem because too many jobs are scheduled.

      select computer.name as ComputerName, event.name as EventName, event_schedule.*
      from event_schedule
      left join computer on computer.computer_id = event_schedule.computer_id
      left join dbo.event on event.event_id = event_schedule.event_id
      where start_time is not NULL and broker_time is NULL
      order by start_time

    5. Delete all scheduled jobs that have not run yet.

      delete
      from event_schedule
      where start_time is not NULL and broker_time is NULL

    6. List scheduled jobs that don't have a start time defined.  These can happen if someone chooses the "Do Not Schedule" option when scheduling a job.

      select computer.name as ComputerName, event.name as EventName, event_schedule.*
      from event_schedule
      left join computer on computer.computer_id = event_schedule.computer_id
      left join dbo.event on event.event_id = event_schedule.event_id
      where (start_time = '' or start_time is NULL) and event_schedule.event_id is not NULL

    7. Delete jobs that don't have a start time defined.

      delete
      from event_schedule
      where (start_time = '' or start_time is NULL) and event_schedule.event_id is not NULL

    8. This last step is just a general cleanup of the DS history.  It's quicker than checking each job and deleting old history in the console.  In our case, if a job hasn't been used since May, the software is no longer relevant this year (we reimage all computers over the summer).

      select computer.name as ComputerName, event.name as EventName, event_schedule.*
      from event_schedule
      left join computer on computer.computer_id = event_schedule.computer_id
      left join dbo.event on event.event_id = event_schedule.event_id
      where start_time < '2015-04-30'
      order by start_time

    9. And the delete for this one is:

      delete
      from event_schedule
      where start_time < '2015-04-30'
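
    As an extra safety net on top of the backup, each delete can be run inside a transaction that is only committed when the number of rows removed matches the count previewed beforehand.  The sketch below shows the pattern in Python against an in-memory SQLite table standing in for the real database.  The sample rows and the delete_checked helper are hypothetical; on SQL Server the same idea would use BEGIN TRANSACTION with COMMIT or ROLLBACK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical subset of event_schedule with invented sample data.
    CREATE TABLE event_schedule (
        computer_id INTEGER, event_id INTEGER, start_time TEXT, broker_time TEXT);
    INSERT INTO event_schedule VALUES
        (1, 500, '2015-03-01 08:00:00', '2015-03-01 08:00:05'),
        (2, 501, '2015-08-10 09:00:00', NULL),
        (3, 502, '2015-08-11 09:00:00', '2015-08-11 09:05:00');
""")

# Same condition as the "scheduled but never started" step above.
STUCK = "start_time IS NOT NULL AND broker_time IS NULL"


def delete_checked(conn, where_clause):
    """Delete matching rows, committing only if the count matches the preview."""
    expected = conn.execute(
        f"SELECT COUNT(*) FROM event_schedule WHERE {where_clause}"
    ).fetchone()[0]
    # sqlite3 opens a transaction implicitly before the DELETE.
    cursor = conn.execute(f"DELETE FROM event_schedule WHERE {where_clause}")
    if cursor.rowcount != expected:
        conn.rollback()
        raise RuntimeError("row count changed between preview and delete")
    conn.commit()
    return expected


deleted = delete_checked(conn, STUCK)
remaining = conn.execute("SELECT COUNT(*) FROM event_schedule").fetchone()[0]
print(deleted, remaining)
```

    The where clause is passed in as trusted text here because it comes from the fixed queries above, not from user input.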