DS 6.x Job Shortcut Accelerator
Many of us DS 6.x administrators manage the distribution of our Deployment Server 6.x jobs using shortcuts. These are brilliant constructs: like file system shortcuts, no matter how many shortcuts are distributed across your Deployment Server, the integrity of the target job is preserved. If you've ever suffered from job 'version hell' on your server, you'll quickly understand why these shortcuts are so valuable.
The implementation of shortcut objects, however, comes at a price: each shortcut's execution introduces (by design) a scheduling delay in the execution of any jobs that follow it in the job chain. You see, when a shortcut's target is injected into a chain of scheduled jobs, all the jobs that follow it are rescheduled for the next minute. This poses a problem in scenarios where you execute a large number of fast-acting jobs: the delays that accumulate from each shortcut's execution can be crippling.
Today I'll demonstrate a simple script which will allow you to bypass the native DS scheduler delays when executing your shortcuts. It's simple and nifty. If you are a shortcut admirer this could be the answer you've never known you needed, but have always been looking for.... ;-)
Deployment Server shortcuts are superb little inventions by Altiris administrators. Shortcuts are jobs which consist of a single dummy task which calls another job on completion. At first glance they can seem a little pointless as they do not, after all, enhance the deployment of the job on the target machine. Their power however is not seen in the deployment of your jobs, but in the administration of them. Job shortcuts help preserve job integrity and reduce job sprawl on your deployment servers.
At first glance the overhead of these shortcuts is minimal. When executing the dummy task (which is often just a single-line REM statement) the total elapsed time is rarely more than a couple of seconds. However, when scheduling large sequences of jobs using shortcuts a problem emerges: as each shortcut executes, the engine injects the target job into the scheduled job chain and reschedules the subsequent jobs for the next minute. When scheduling folders of many shortcuts onto your client computers, you'll appreciate that this constant re-timing of your job schedules can accumulate delays which needlessly gnaw away at your deployment statistics. The administration gains from using shortcuts, however, generally outweigh this scheduling annoyance, so most admins just accept this hit as inevitable and move on.
Personally, I love shortcuts and we use them extensively here:
- Image Deployments
Every image we deploy has a number of post-deployment jobs attached. These jobs typically deploy the latest browser plugins and software which can't reside in the image (for technical or legal reasons). With perhaps 10 shortcuts used in each deployment, that's an extra 5 minutes wasted in a typical deployment.
- Image Creation
Our image creation process is entirely automated. Depending on the department, the number of shortcuts used to create the images varies from 20 to as many as 100. Many of these target jobs take just a few seconds to execute, which means that when creating our images the entire process can be slowed down by as much as 40 minutes whilst the client awaits the engine's minute hand to tick over.
When I say we use shortcuts extensively, I really mean it. Our master deployment server has over 4000 shortcuts in use today.
Although the shortcut delay has been an issue for several years, it's actually only gained visibility over the last year. Two things have changed in this period which have made these delays more significant. First, our primary method of image delivery has changed. Prior to about 18 months ago, all our imaging sessions were delivered over the network. In that scenario, it might take an hour to deliver an image which dwarfed the 20 minutes of post-image jobs that followed. When our primary delivery method changed to USB Flash, the situation reversed. Image deployments became so fast (perhaps 5-10 minutes), that the bulk of the time in a computer deployment was actually spent waiting for the post-deployment jobs to complete.
The second change in our process was the introduction of our Deployment Server job progress utility we call DS JobMon. This is a handy HTA application (written by Darren Collins) which displays on the login screen the progress of any scheduled jobs. This visibility of post-deployment jobs comes at a price. If we have a package which in principle should take only 10 seconds to install, people will note with frustration when it appears to be taking over a minute.
To see what I mean, below is a screenshot of a client's logon screen which is displaying the JobMon HTA following an image deployment,
We could of course abandon JobMon... but we can't. It's too pretty.
So we now have a situation where the speed of image deployment is so fast that it makes the subsequent job executions appear slow by comparison. Anyone watching JobMon's progress can't help but test out their latent Jedi powers to make it go faster.
And now for the final complication: we now aim to have a target total deployment time of 30 minutes. Shaving an extra 5-10 minutes off our total deployment schedule by improving shortcut execution now seems very attractive. If we achieved this, we'd be able to extend the time between our image rebuilds (as we can add more post-install jobs as image fixes).
We could of course abandon shortcuts, but as they are administratively such a boon ditching them is not viable. The only answer is to find a way to get around the delays they introduce.
2. Our Definition Of A Job Shortcut
After mulling over some options, I concluded that the best solution to this problem was to pre-empt the engine's shortcut resolving code, and programmatically replace all the scheduled job shortcuts with their targets at execution time myself. The result would be that the shortcuts themselves would never actually be executed by the engine: the first job in the scheduler chain would be my code, which removes the shortcuts by resolving them in situ to their targets.
This approach bypasses the scheduler delay (as the re-scheduling sequence is no longer required) whilst allowing us to continue to reap the administrative boons of using shortcuts. Simple and neat. And like Darren says, he doesn't know why I didn't think of it earlier....
Below is an example of a job folder on our deployment server which contains nothing but shortcuts,
I know they are all shortcuts as they include the arrow string, '->', in their name (which is our standard). If I were to drill into the Adobe Flash Player shortcut, this is what we find;
So, the job contains a single run script. Let's now drill down further into this script,
As we can see this is just a trivial batch script which executes in Windows. Its job is simply to exit with a return code of 0. If we click 'Next' here, we'd see the 'Script Information' screen which would reveal that,
- The 'Run Script' task was scheduled to run within the production environment
- The script executes in the system context
The only requirement is that it's a task which executes and returns quickly with success, and these default settings certainly achieve that. What is important is the next screen where we define the return code behaviour,
And this is the heart of the shortcut magic. Once the dummy 'Run Script' task executes, the job "[I] Adobe Flash Player Install (Latest Packaged)" is called. This is a real job, which is indicated to us by the absence of the shortcut string '->' in the job name.
This quick tour of our shortcuts allows us to summarise a shortcut as follows. It is a job which,
- Contains the substring '->'
- Contains a single 'run script' task
- On successful execution of the single script task, another job is subsequently called.
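To make the three criteria above concrete, here is a minimal sketch of them as a predicate. The Job and Task classes and their field names are hypothetical illustrations of my own (this is not how Deployment Server models jobs), and the shortcut name used in the example is made up to follow our '->' naming standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    kind: str                      # e.g. "run script" (hypothetical label)
    on_success_job: Optional[str]  # job called when the task succeeds, if any


@dataclass
class Job:
    name: str
    tasks: List[Task]


def is_shortcut(job: Job) -> bool:
    """Apply the three criteria: '->' in the name, exactly one
    'run script' task, and that task calls another job on success."""
    if "->" not in job.name:
        return False
    if len(job.tasks) != 1:
        return False
    task = job.tasks[0]
    return task.kind == "run script" and task.on_success_job is not None


target = "[I] Adobe Flash Player Install (Latest Packaged)"
shortcut = Job("-> Adobe Flash Player", [Task("run script", target)])
real_job = Job(target, [Task("copy file", None), Task("run script", None)])

print(is_shortcut(shortcut), is_shortcut(real_job))
```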
3. The Shortcut Scheduling Problem
Before we proceed further, I think it best to illustrate clearly the scheduling delays which are incurred when using shortcuts. For this, I'm going to create a job which executes a server-side 10 second ping (ping -n 10 127.0.0.1) and a shortcut pointing to this job;
The idea here is to create a job which executes fairly quickly, thus simulating the delay of a configuration script or perhaps the installation of a small software package.
Now, let's create two folders. Let one contain 10 copies of the shortcut, and let the other contain 10 copies of the original job;
Let's first schedule the 'Folder of Jobs' folder on a computer and see what happens. Noting that I scheduled the folder at 10:45am, below is the results screen I saw about a minute and a half later,
This is an example of a normal job schedule result. You can see that all the jobs were scheduled at 10:45am, and that each took between 10-11 seconds to execute. All the jobs completed at 10:46.
Let's now see what happens when we schedule the folder of shortcuts. This time, I scheduled the folder at 11:10 am.
There you can see that although I scheduled the folder of shortcuts at 11:10am, the scheduling times have all been driven forward incrementally by a minute. So, our chain of jobs which intrinsically should have a deployment time of 1m40s in reality took 10 minutes. That's 6 times longer!
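The arithmetic behind this can be sketched with a toy model. It rests on assumptions of my own rather than on the engine's real algorithm: each shortcut's dummy task takes about a second, the target then runs straight away, and the next entry in the chain cannot start before the next whole minute. Under those assumptions the model lands in the same ballpark as the run above, though it does not reproduce the observed times exactly.

```python
import math


def chain_duration(job_seconds, via_shortcuts, dummy_seconds=1.0, tick=60.0):
    """Seconds from scheduling until the last job in the chain finishes.

    Simplified model (an assumption, not the engine's actual behaviour):
    once a shortcut's dummy task completes, everything after it in the
    chain is pushed to the next tick, while its own target runs at once.
    """
    now = 0.0
    earliest_start = 0.0
    for seconds in job_seconds:
        now = max(now, earliest_start)
        if via_shortcuts:
            now += dummy_seconds
            # Jobs following this shortcut are rescheduled for the next minute.
            earliest_start = (math.floor(now / tick) + 1) * tick
        now += seconds
    return now


jobs = [10.0] * 10  # ten copies of the 10 second ping job
direct = chain_duration(jobs, via_shortcuts=False)
shortcuts = chain_duration(jobs, via_shortcuts=True)
print(f"direct: {direct:.0f}s, via shortcuts: {shortcuts:.0f}s")
```

The point of the model is simply that the waiting time scales with the number of shortcuts, not with the work the jobs actually do.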
Although this seems to be an extreme case, it's amazing how many software installs and configuration tasks we have which take just a few seconds to run. Yes we have big office apps and .NET packages, but these are not that common. After taking some stats from our Deployment Server, we found that we had a strong skew towards fast running jobs,
And yes, this statistic excludes the shortcuts themselves ;-)
So, about half of all our jobs take less than a minute to run, and the skew is certainly towards the faster executing tasks even within that minute range. So, our imagined extreme case isn't so extreme after all. It's quite likely in fact.
Having assured ourselves that we are indeed tackling a real problem, let's see what plasters we can code to make it all better....
4. Casting the Shortcut Definition into T-SQL
We now need to convert our job definition of a shortcut into something more definitive and T-SQL like. The tables which are going to be of interest to us for this exercise are,
- dbo.event
This table stores the job details like its id (the event_id), its name and the folder it's stored in.
- dbo.event_schedule
This table keeps track of scheduled jobs. It stores the schedule_id, event_id (the job that's been scheduled), the computer it's been scheduled on, the scheduling time, its current status and so on.
- dbo.task
This table stores the breakdown of the tasks within each event. Loosely speaking, tasks are categorised by the condition index (cond_seq), the task index within each condition and the task type. A good example for how tasks are broken down in events is to look at the DAgent Upgrade task which comes in the 'Samples' folder. Below I show the task list for each condition,
From the tasks table point of view, this job then has 12 task entries. The condition mappings are: 0 = IsMachineX86, 1 = isMachineX64, 2 = IsMachineIA64, 3 = Default. Under each condition, we have 4 tasks which are indexed by the task_sequence number (which must be unique for each condition) and task type. A task type of 12 is for "Copy File" tasks, and a task type of 10 is for "Run Script".
- dbo.task_return_handlers
Each task (as uniquely defined by the combination of event_id, cond_seq and task_seq) can perform different actions depending on its return code. The important field in this table for us is the handler, which determines what should be done for the return codes specified,
handler=0 means the job stops
handler=1 means the job continues
handler=xxxxxxx means that another job should be executed
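As a small illustration of how these handler values read, here is a sketch of a decoder. It assumes, as the accelerator's T-SQL later in this article does, that any value other than 0 or 1 is the event_id of the job to run next; the function name and the sample id are my own inventions.

```python
def describe_handler(handler: int) -> str:
    """Translate a handler value into the action it stands for."""
    if handler == 0:
        return "stop"
    if handler == 1:
        return "continue"
    # Assumption: anything else is the event_id of another job to execute.
    return f"run job {handler}"


for value in (0, 1, 2000001):
    print(value, "->", describe_handler(value))
```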
From the above we can form a more database-like definition for a job shortcut,
- Event.name like '%->%'
It must look like a shortcut
- SUM(task_type) across all tasks in event = 10
The minimum value of task_type is 10 and this refers to a 'Run Script' Task. So if we sum all the task_type values in any job and the sum equals 10, then this means there can only be one task, and further it must be a 'run script' task
- SUM(handler) across all rows in dbo.task_return_handlers for the event > 1
The single run script task must on completion call another job. This must be the case if the sum of handlers (of which there will be exactly one if above criteria are met) is greater than one.
Which is more nicely written as,
select * from event
where name like '%->%'
and (select sum(handler) from task_return_handlers where event_id=event.event_id) > 1
and (select sum(task_type) from task where event_id=event.event_id) = 10
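If you'd like to try the query without touching a live eXpress database, the sketch below runs the same logic against an in-memory SQLite database from Python. The table layouts are cut down to just the columns named in this article, and the event ids and the shortcut names are made up for the test, so treat it as a stand-in rather than the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table event (event_id integer primary key, name text);
    create table task (event_id integer, cond_seq integer,
                       task_seq integer, task_type integer);
    create table task_return_handlers (event_id integer, cond_seq integer,
                                       task_seq integer, handler integer);

    -- 2000001: a real job with a Copy File (12) and a Run Script (10) task.
    insert into event values (2000001, '[I] Adobe Flash Player Install (Latest Packaged)');
    insert into task values (2000001, 0, 0, 12), (2000001, 0, 1, 10);
    insert into task_return_handlers values (2000001, 0, 0, 1), (2000001, 0, 1, 1);

    -- 2000002: a shortcut; its single Run Script task hands over to 2000001.
    insert into event values (2000002, '-> Adobe Flash Player');
    insert into task values (2000002, 0, 0, 10);
    insert into task_return_handlers values (2000002, 0, 0, 2000001);

    -- 2000003: named like a shortcut, but its single task just continues.
    insert into event values (2000003, '-> Not Really A Shortcut');
    insert into task values (2000003, 0, 0, 10);
    insert into task_return_handlers values (2000003, 0, 0, 1);
""")

SHORTCUT_QUERY = """
    select event_id, name from event
    where name like '%->%'
      and (select sum(handler) from task_return_handlers
           where event_id = event.event_id) > 1
      and (select sum(task_type) from task
           where event_id = event.event_id) = 10
"""
shortcuts = conn.execute(SHORTCUT_QUERY).fetchall()
print(shortcuts)
```

Only the genuine shortcut comes back: the real job fails the name test, and the look-alike fails the handler test because its handler sum is exactly 1.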
4.1 The T-SQL for the DS Shortcut Accelerator
update event_schedule
set event_schedule.event_id = (select handler from task_return_handlers where event_id=event_schedule.event_id),
next_task_seq=0,
cond_seq=NULL
where (select name from event where event_id=event_schedule.event_id and event_schedule.status_code is NULL) like '%->%'
and (select sum(handler) from task_return_handlers where event_id=event_schedule.event_id) > 1
and (select sum(task_type) from task where event_id=event_schedule.event_id) = 10
and event_schedule.computer_id=%ID%
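To see what this update does to a schedule, here is a sketch that replays it on an in-memory SQLite database from Python. A few things are assumptions for the sake of a runnable test: the tables are reduced to the columns mentioned in this article, the ids and the shortcut name are invented, the %ID% token is replaced by a bound parameter, and the SET clause uses an unqualified column name because SQLite does not accept the table-qualified form that T-SQL does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table event (event_id integer primary key, name text);
    create table task (event_id integer, cond_seq integer,
                       task_seq integer, task_type integer);
    create table task_return_handlers (event_id integer, handler integer);
    create table event_schedule (schedule_id integer primary key,
        event_id integer, computer_id integer, status_code integer,
        next_task_seq integer, cond_seq integer);

    insert into event values (2000001, '[I] Adobe Flash Player Install (Latest Packaged)');
    insert into event values (2000002, '-> Adobe Flash Player');
    insert into task values (2000001, 0, 0, 12), (2000001, 0, 1, 10),
                            (2000002, 0, 0, 10);
    insert into task_return_handlers values (2000001, 1), (2000001, 1),
                                            (2000002, 2000001);

    -- The shortcut is pending (status_code NULL) on two different computers.
    insert into event_schedule values (1, 2000002, 5000001, NULL, NULL, NULL);
    insert into event_schedule values (2, 2000002, 5000002, NULL, NULL, NULL);
""")

RESOLVE = """
    update event_schedule
    set event_id = (select handler from task_return_handlers
                    where event_id = event_schedule.event_id),
        next_task_seq = 0,
        cond_seq = NULL
    where (select name from event
           where event_id = event_schedule.event_id
             and event_schedule.status_code is NULL) like '%->%'
      and (select sum(handler) from task_return_handlers
           where event_id = event_schedule.event_id) > 1
      and (select sum(task_type) from task
           where event_id = event_schedule.event_id) = 10
      and computer_id = ?
"""
conn.execute(RESOLVE, (5000001,))  # resolve shortcuts for one computer only

rows = conn.execute(
    "select schedule_id, event_id, next_task_seq "
    "from event_schedule order by schedule_id"
).fetchall()
print(rows)
```

The schedule entry for computer 5000001 now points at the target job, while the entry for the other computer is left alone, which is the per-computer behaviour the computer_id filter is there to give.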
5. Creating and Using the DS Shortcut Accelerator
- Create a new job called "0 - DS Shortcut Accelerator"
- Add a 'Run Script' task and in the script window paste the following code,
REM Resolve Shortcuts using SQLCMD.EXE
SET SQLCMD="%PROGRAMFILES%\Microsoft SQL Server\90\Tools\Binn\SQLCMD.EXE"
%SQLCMD% -d "eXpress" -Q "update event_schedule set event_schedule.event_id = (select handler from task_return_handlers where event_id=event_schedule.event_id), next_task_seq=0,cond_seq=0 where (select name from event where event_id=event_schedule.event_id and event_schedule.status_code is NULL) like '%%->%%' and (select sum(handler) from task_return_handlers where event_id=event_schedule.event_id) > 1 and (select sum(task_type) from task where event_id=event_schedule.event_id) =10 and event_schedule.computer_id=%ID%"
This code runs SQLCMD.EXE from the SQL tools folder, so you'll need to make sure you have this installed and set the path above appropriately.
- Configure the script to run on the server, and allow it to execute when the agent is disconnected.
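One detail in the script worth spelling out is the doubled %% around the '->' pattern: inside a batch file that is how you get a literal % through to SQLCMD. As a sketch only, here is the same call assembled in Python, where no batch interpreter is involved and the wildcards stay as single % characters. The function name is my own, the computer id is passed in as a parameter because the %ID% token only exists inside a Deployment Server script, and the fallback path is a guess for machines where PROGRAMFILES isn't set.

```python
import os
from typing import List

# Same statement as the batch script, with single % wildcards.
UPDATE_TEMPLATE = (
    "update event_schedule set event_schedule.event_id = "
    "(select handler from task_return_handlers "
    "where event_id=event_schedule.event_id), "
    "next_task_seq=0,cond_seq=0 "
    "where (select name from event where event_id=event_schedule.event_id "
    "and event_schedule.status_code is NULL) like '%->%' "
    "and (select sum(handler) from task_return_handlers "
    "where event_id=event_schedule.event_id) > 1 "
    "and (select sum(task_type) from task "
    "where event_id=event_schedule.event_id) =10 "
    "and event_schedule.computer_id={computer_id}"
)


def build_sqlcmd_args(computer_id: int) -> List[str]:
    """Build the SQLCMD argument list for one computer (hypothetical helper)."""
    program_files = os.environ.get("PROGRAMFILES", r"C:\Program Files")
    sqlcmd = os.path.join(program_files, "Microsoft SQL Server", "90",
                          "Tools", "Binn", "SQLCMD.EXE")
    query = UPDATE_TEMPLATE.format(computer_id=int(computer_id))
    return [sqlcmd, "-d", "eXpress", "-Q", query]


args = build_sqlcmd_args(5000001)
print(args[1:4])
# On the Deployment Server itself this list could be handed to
# subprocess.run(args, check=True) to execute the update.
```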
Now, copy this job into any deployment folders you have which heavily rely on shortcuts. Below I've shown a job folder for plug-ins where I've added the accelerator as the first job,
Now, just drop this onto your target computer and see what happens...
So Presto! You should see that all the shortcuts have been resolved and will now execute without delay. Now, although in the example above I've illustrated this for resolving shortcuts within a folder, remember it works by resolving any scheduled shortcuts. So, if you schedule a nested folder of jobs, all will be resolved as long as the accelerator job is prior to the shortcuts.
Dead easy really.
Today I've tried to resolve a long standing niggle with our Deployment Server job executions. Shortcuts are great, but the delays they induce can be frustrating when you are trying to optimise your deployment times.
The shortcut accelerator works by resolving job shortcuts before they are executed. The result is that at job execution time, the shortcuts only appear fleetingly. Once the accelerator job is run, all the currently scheduled shortcuts on the target machine will magically and instantly mutate into their respective target jobs.
In the majority of our deployments we expect this to shave between 5-10 minutes off our total delivery times. For some of our job sets which build our production images, we expect savings of up to 40 minutes. Pretty impressive I think you'll admit for a tiny piece of T-SQL.