Maintaining History with Deployment Server Job Replication
Deployment Solution provides the utilities axExport and axImport to allow replication of Deployment Server jobs to down-level servers. However, there is a major niggle: the job import mechanism deletes the client job history as seen in the GUI. This document investigates why this happens, and shows what can be done to prevent it.
Deployment Server Infrastructure Choices
Before delving into the issue of job synchronisation, many of you might be wondering what the heck job synchronisation is, and why you would want it. Well, it all comes down to your DS infrastructure design. For simplicity, let's assume we can divide any DS infrastructure design into two main scenarios,
- Single DS Infrastructure
Generally suitable for the small to medium sized organisation. One organisational Deployment Server is installed, and remote sites are catered for by installing remote PXE servers as required which report back to the organisational DS. If Altiris CMS is used, the PXE services are typically installed on site Package Servers. All sites are controlled by the centrally managed Deployment Server.
- Multiple DS Infrastructure
Generally suitable for the larger (or more federated) organisations. One Master Deployment Server replicates to slave Deployment Servers on each site. As above, the Deployment Servers are typically also used as the site's Package Server. Each site is controlled by its own managed Deployment Server, but job replication ensures that software and image deployments are managed centrally.
Naturally, each scenario has its complexities, pros and cons, and you could easily find that as a small organisation you still have a requirement to move to the larger organisational model with multiple Deployment Servers. Small organisations (<1000 machines) might be forced to use the larger organisational model for one of several reasons,
- Technical - to reduce server load (this can be a problem if security is enabled, and views are restricted)
- Risk management - dividing computers between several Deployment Servers reduces the risk of imaging or rebooting an entire site
- Process - you might want to have production and development environments. It's useful to have a dev server to develop and finalise jobs on, so that they can be synced to your production server(s) (through a suitable change-management process).
- Organisation politics and boundaries - Some organisations do not suit a 100% centrally managed infrastructure. Centrally managed Site Deployment Servers provide a high degree of local control.
In short, you might want to move to a multiple DS infrastructure sooner than you might think, even if you don't have a super corporate node count.
Replicating jobs between Deployment Servers on a multi-DS site is a bit involved. You need to,
- Identify jobs for replication
- Make replicated jobs suitable for replication
- Configure export and import scripts
- Put in place replication processes for jobs, software and image stores
Making jobs suitable for replication means ensuring that software and image deployments are initiated using local file stores (over fast LAN connections). This means using either Microsoft's Distributed File System (DFS) or the Altiris Package Server technology to synchronise your site filestores. If you omit this step when your sites are connected over slow WAN links, prepare to look puzzled as you start imaging at 1Mbps...
Let's now look at job replication, and the issue which crops up regarding job execution history in the GUI.
Job Replication Basics
In order to replicate jobs from your master Deployment Server to your slave boxes, you need to,
- Write a script to export your jobs
- Write a script to import your jobs
The utilities Altiris provides for this are respectively axExport and axImport which are tucked away on your deployment share.
Exporting a Global Job Folder
Let's take a look at a simple setup where we have a Master Deployment Server with a job folder called "Global" which we'd like to replicate to our slave Deployment Servers. Below you can see such a folder, and it houses a few simple deployment jobs.
To create an export binary, I would use the axExport utility. Create and execute the batch file Export_Jobs.bat with the contents shown below,
    cd "C:\Program Files\Altiris\eXpress\Deployment Server"
    axexport /f "global" /s global.bin
    pause
This script first sets the working folder to the eXpress share and then calls the axExport utility. The arguments export the job folder called "global" into the binary file global.bin. The /f switch lets you enter the name of the specific job folder to be exported, and the /s switch enables the processing of any sub-folders should they be present.
Running this script gives the following output,
Notice that the axExport utility has correctly identified that only 6 jobs in 1 folder have been found and processed. So far, pretty simple.
For a full list of AxExport switches, see Altiris KB:32576 "What are all of the axEvent.exe, axExport.exe, axImport.exe and axSched.exe command line options?"
Importing a global Job Folder
Now let's use this binary to copy these jobs over to one of the slave Deployment Servers. On your slave Deployment Server, create and execute the following batch file, calling it Import_Jobs.bat,
    @echo off
    cd "C:\Program Files\Altiris\eXpress\Deployment Server"
    aximport \\master_ds_server\express\global.bin /o
    pause
Take care to change master_ds_server in the above to suit your environment. On execution, you should see a command output very similar to the export output, only a little more complex. When it's finished, you'll see a brand new Global folder in your jobs pane, populated with the jobs exported from the master server.
The Job History Problem
As I've already said, natively synchronising jobs will wipe out the job history from the GUI. Let's see this in action.
Below is a screenshot from a sample Deployment Server where we have a local jobs folder in addition to the Global folder. Notice that I've run all the jobs in the local and global folders on the target computer STAF-VMWARE.
Let's now run the import batch script again and see what happens.
All the job history on the computer for anything executed from within the synchronised Global folder has disappeared!
How job synchronisation affects the Job tables
In order to understand what's going on, let's take a look at how jobs are stored in the database,
- Jobs are stored in the event table
- Job folder hierarchy information is stored in the event_folder table
- Job condition information is stored in the event_condition table
- Tasks are stored in tables like install_task, config_task, copyfile_task, image_task, script_task, reboot_task & wait_task
- Job execution history is stored in the event_schedule and event_schedule_info tables
The primary key for the event table is the event_id, and it's this key which ties all the job and task tables together.
So, what happens when we do a job import? Let's take a look at the Firefox install event entry before and after synchronising the Global folder with the following SQL script,
select event_id,folder_id,[name] from event where [name] = '[I] Firefox 3.05 Install'
The output before doing the sync is,
    event_id   folder_id   name
    2020549    1002274     [I] Firefox 3.05 Install
And puzzlingly the output after the sync is exactly the same,
    event_id   folder_id   name
    2020549    1002274     [I] Firefox 3.05 Install
The event_id hasn't changed. This is a bit confusing, because at first glance the obvious explanation for the job history being removed is that the import deletes the old jobs and creates new ones with new event_ids. This evidently isn't the case: the event_ids are preserved in the job import process.
So let's look at the event_schedule table using the following SQL,
select schedule_id, computer_id,event_id from event_schedule
Before the sync this looked like,
    schedule_id   computer_id   event_id   (job name)
    100000023     5000001       2020550    [I] Install Microsoft Office 2007
    100000024     5000001       2020547    [I] Adobe AIR Runtime 1.5.1.8210 Install
    100000025     5000001       2020546    [I] Adobe FlashPlayer v9.0.124 Install
    100000026     5000001       2020544    [I] Adobe Shockwave Player 11.0r429
    100000027     5000001       2020545    [I] Audacity 1.2.6 Install
    100000028     5000001       2020548    [I] Citrix ICA Client 10.1 Install
    100000029     5000001       2020549    [I] Firefox 3.05 Install
Where I've added some human readability to the Event_id field. And after the job sync we have,
    schedule_id   computer_id   event_id   (job name)
    100000023     5000001       2020550    [I] Install Microsoft Office 2007
So, here we have found the root of the problem. The event_schedule table is pruned when jobs are imported, removing any schedule information for jobs present in the target job folder hierarchy. As this table is used to present the job execution history in the UI, after synchronisation all we see is a lot of white space.
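The job names shown next to the event_id values in the listings above come from the event table, so a join between event_schedule and event produces them directly. The snippet below is only a minimal sketch of that join: it uses Python's built-in sqlite3 module with cut-down, in-memory copies of the two tables, which is an assumption made purely so the example can be run anywhere. The real tables live in the express MS-SQL database and have many more columns, and the folder_id given for the Office job is a made-up value.

```python
import sqlite3


def readable_history(conn):
    """Return (schedule_id, computer_id, event_id, name) rows in schedule order."""
    return conn.execute(
        "SELECT s.schedule_id, s.computer_id, s.event_id, e.name "
        "FROM event_schedule AS s "
        "JOIN event AS e ON e.event_id = s.event_id "
        "ORDER BY s.schedule_id"
    ).fetchall()


# Cut-down stand-ins for the real tables (assumption: only the columns
# quoted in the article are modelled here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (event_id INTEGER PRIMARY KEY, folder_id INTEGER, name TEXT);
CREATE TABLE event_schedule (
    schedule_id INTEGER PRIMARY KEY, computer_id INTEGER, event_id INTEGER);
INSERT INTO event VALUES (2020549, 1002274, '[I] Firefox 3.05 Install');
-- folder_id 1002275 is hypothetical; the article does not give it.
INSERT INTO event VALUES (2020550, 1002275, '[I] Install Microsoft Office 2007');
INSERT INTO event_schedule VALUES (100000023, 5000001, 2020550);
INSERT INTO event_schedule VALUES (100000029, 5000001, 2020549);
""")

for row in readable_history(conn):
    print(row)
```

The same SELECT should carry over to the express database largely unchanged, since it only relies on the event_id, name, schedule_id and computer_id columns quoted earlier.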
A SQL trace reveals the import process and the reason for losing the job history. The process is roughly as follows,
- Jobs in the target synchronisation folder are backed up to a temporary table
- Jobs in the target synchronisation folder are deleted
- Triggers on the event table then remove the corresponding rows from the other tables, including the event_schedule table
- Jobs are imported back into the event table preserving event_id data using the backed up data in the temporary table
Critically for us, the event_schedule table is not restored in this process. This is likely a design choice as the job may have changed and therefore the audit trail left in the event_schedule table is no longer necessarily trustworthy.
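To make the sequence above concrete, here is a small simulation of it. It is a sketch under heavy assumptions, not the actual Deployment Server schema: it uses Python's built-in sqlite3 module, two cut-down tables, a hypothetical trigger named event_delete standing in for the real triggers on the event table, and a Python list standing in for the temporary table. The folder_id for the local folder is also made up.

```python
import sqlite3


def simulate_import(conn, folder_id):
    """Mimic the traced import: save the folder's jobs, delete them, re-insert them."""
    # Step 1: back up the jobs in the target folder (a list stands in
    # for the temporary table seen in the SQL trace).
    saved = conn.execute(
        "SELECT event_id, folder_id, name FROM event WHERE folder_id = ?",
        (folder_id,),
    ).fetchall()
    # Steps 2 and 3: delete the jobs; the trigger prunes their history.
    conn.execute("DELETE FROM event WHERE folder_id = ?", (folder_id,))
    # Step 4: re-insert the jobs with their original event_id values.
    conn.executemany("INSERT INTO event VALUES (?, ?, ?)", saved)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (event_id INTEGER PRIMARY KEY, folder_id INTEGER, name TEXT);
CREATE TABLE event_schedule (
    schedule_id INTEGER PRIMARY KEY, computer_id INTEGER, event_id INTEGER);
-- Hypothetical stand-in for the real triggers on the event table.
CREATE TRIGGER event_delete AFTER DELETE ON event
BEGIN
    DELETE FROM event_schedule WHERE event_id = OLD.event_id;
END;
INSERT INTO event VALUES (2020549, 1002274, '[I] Firefox 3.05 Install');
-- folder_id 1002275 (the local folder) is a made-up value.
INSERT INTO event VALUES (2020550, 1002275, '[I] Install Microsoft Office 2007');
INSERT INTO event_schedule VALUES (100000023, 5000001, 2020550);
INSERT INTO event_schedule VALUES (100000029, 5000001, 2020549);
""")

simulate_import(conn, 1002274)  # re-import the Global folder

events = conn.execute("SELECT event_id FROM event ORDER BY event_id").fetchall()
history = conn.execute(
    "SELECT schedule_id, computer_id, event_id FROM event_schedule "
    "ORDER BY schedule_id"
).fetchall()
print(events)   # both jobs are still present with their original ids
print(history)  # only the local (Office) job keeps its history row
```

In this simulation the event_id values survive the round trip while the history row for the re-imported job does not, which matches the before and after listings shown earlier.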
Restoring the job history
Now that we know why we are losing all this data, we can take steps to restore it. To preserve the job execution history, the logic flow for a job sync should be something like,
- Make a backup of the event_schedule table
- Sync the jobs
- Restore the missing rows from the event_schedule table
1. Making a backup of the event_schedule table
This can be done with the following simple SQL code,
select * into [event_schedule_backup] from [event_schedule]
This SQL code simply selects every piece of data in the event_schedule table and slaps it into the event_schedule_backup table, which it creates for us on the fly with all fields. So we don't even need to use a CREATE TABLE command in advance for the backup table.
2. Synchronising the jobs
This is done with our existing job import script, Import_Jobs.bat
3. Restore the missing rows from the event_schedule table
This is a little more involved. We want to copy job schedule information from the backup table to the live table but, on reflection, we don't want to copy every row in the backup,
- event_schedule rows for jobs which haven't been deleted shouldn't be copied back. For example, the schedule info for jobs in the local folder (the Office 2007 install job) will be present in both the backup and the live table after the sync. Attempting to insert duplicate rows into a table will fail horribly, so our SQL needs to avoid this.
- It is possible that the imported job binary will have some legacy jobs pruned. It is therefore likely that some event IDs will be removed by the import. We do not want to re-insert schedule IDs for such jobs.
The SQL for this is as follows,
    INSERT into [event_schedule]
    select * from [event_schedule_backup]
    where event_id in (select event_id from event_schedule_backup
                       except
                       select event_id from event_schedule)
    and event_id in (select event_id from event)
Translated into English, this pretty much says,
"Insert into the event_schedule table all data from the event_schedule_backup table whose event_id's are not already present in the event_schedule table, and whose IDs are also present in the event table"
Probably as clear as mud. But I never said it was going to be easy...
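If the English version doesn't help, a worked example might. The sketch below runs the same restore statement against a toy dataset using Python's built-in sqlite3 module. The cut-down in-memory tables are an assumption made so that it runs anywhere, and the scenario is invented for illustration: the Citrix job (2020548) is treated as retired on the master, so it no longer exists after the import. CREATE TABLE ... AS SELECT stands in here for the SELECT ... INTO used in the backup step.

```python
import sqlite3

# The article's restore statement, unchanged apart from layout.
RESTORE_SQL = """
INSERT INTO event_schedule
SELECT * FROM event_schedule_backup
WHERE event_id IN (SELECT event_id FROM event_schedule_backup
                   EXCEPT
                   SELECT event_id FROM event_schedule)
  AND event_id IN (SELECT event_id FROM event)
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (event_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE event_schedule (
    schedule_id INTEGER PRIMARY KEY, computer_id INTEGER, event_id INTEGER);
INSERT INTO event VALUES (2020548, '[I] Citrix ICA Client 10.1 Install');
INSERT INTO event VALUES (2020549, '[I] Firefox 3.05 Install');
INSERT INTO event VALUES (2020550, '[I] Install Microsoft Office 2007');
INSERT INTO event_schedule VALUES (100000023, 5000001, 2020550);
INSERT INTO event_schedule VALUES (100000028, 5000001, 2020548);
INSERT INTO event_schedule VALUES (100000029, 5000001, 2020549);

-- Step 1: back up the history table.
CREATE TABLE event_schedule_backup AS SELECT * FROM event_schedule;

-- Step 2 (simulated import): history for the Global jobs is pruned, and
-- the Citrix job is assumed to have been retired on the master.
DELETE FROM event_schedule WHERE event_id IN (2020548, 2020549);
DELETE FROM event WHERE event_id = 2020548;
""")

# Step 3: restore only the rows that are missing AND still have a job.
conn.execute(RESTORE_SQL)

restored = conn.execute(
    "SELECT schedule_id, computer_id, event_id FROM event_schedule "
    "ORDER BY schedule_id"
).fetchall()
print(restored)
```

The Office row is left alone because it never went missing, the Firefox row is restored, and the Citrix row stays out because its job is no longer in the event table.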
Putting it all together
So, what we really want is a nice tidy script which will do all this for us. The script must,
- Perform a SQL task to backup the event_schedule table
- Import the jobs binary
- Perform a SQL task to restore the event_schedule table
So, create a folder on your slave Deployment Server called JobImport. In here create the following import.bat file,
    REM DS Job Synchronisation Script
    REM Written by Ian Atkin, ICTST, July 2009
    SET SQLPATH=C:\Program Files\Microsoft SQL Server\90\Tools\Binn
    SET DSPATH=C:\Program Files\Altiris\eXpress\Deployment Server
    REM Back up the job history
    "%SQLPATH%\sqlcmd" -i backup_history.sql
    REM Import the Global Jobs Folder
    "%DSPATH%\aximport" \\master_ds_server\express\global.bin /o
    REM Restore deleted history
    "%SQLPATH%\sqlcmd" -i restore_history.sql
    pause
Be sure to change the above reference to the master_ds_server again to suit your environment!!
Now create the following backup_history.sql file,
    use express
    if exists (select * from sys.tables where name='event_schedule_backup')
    BEGIN
        DROP TABLE event_schedule_backup
    END
    select * into [event_schedule_backup] from [event_schedule]
and the following restore_history.sql file,
    use express
    INSERT into [event_schedule]
    select * from [event_schedule_backup]
    where event_id in (select event_id from event_schedule_backup
                       except
                       select event_id from event_schedule)
    and event_id in (select event_id from event)
    drop table event_schedule_backup
Notice I've made the following small house-keeping additions,
- if the event_schedule_backup table exists at the start of the process, it's removed.
- at the end of the process the event_schedule_backup table is removed
And that should do it. By executing your new batch file you should now be able to synchronise your jobs without losing your job history.
But always remember that the information we restore using this procedure is removed by the import process, most likely by design. You see, it's possible that your jobs have seen major revisions since the last synchronisation, and it's not trivial to check for this, especially if images and batch files are involved. As a result, some jobs which the UI shows as applied may in fact have been applied in a previous incarnation.
For example, you might have a job to install web plugins. This job might undergo several revisions in its lifespan. First it might have only installed Java, later Shockwave is added, and later again Flash. It is not immediately obvious which incarnation was executed.
So, my advice is if you intend to use a technique like this to restore job history, put in some good change management to ensure that production level jobs are not revised in-situ, but are created afresh and the old one retired.
Kind Regards,
Ian./
wow
nice job on this thanks for posting
Jonathan Jesse
Practice Principle
ITS Partners
Humbled as usual
Ian, thanks for taking the time to share all the stuff you come up with!
Jim Harings
Technical Solutions Consultant
Xcend Group
http://xcendgroup.com
: - O
Fantastic solution and documentation. Wow. The Gold Standard!
-Geo
Don't forget to mark the solution to your forum post if it has been answered!
It is neat isn't it?
It is neat isn't it? ;-)
Thanks for the positive comments - appreciated! One of the most absolutely fabulous design choices Altiris ever made was to forgo the use of a proprietary DB and use MS-SQL as a backend server. Makes for a much more transparent and customisable product.
Kind Regards,
Ian./
Ian Atkin
Senior Developer for the ICT Support Team,
Oxford University, UK
Perfect
I'm just in the process of setting up synchronisation and found this post.
Awesome Ian Thanks C
Hojiblanca