Randomization Technique for Risk Mitigation in Software Deployments 

Dec 19, 2016 03:34 PM

Reasons for Randomization

I am an Altiris administrator in a very large organization with over 50k managed endpoints. Whenever we are asked to deploy software or a patch, we build a package, replicate it out to our task servers, test-deploy it to a few machines, and then build a policy to deploy it. Any such deployment can conflict in some unknown way with another application used in the company, so deploying to all of the computers at once is a very dangerous strategy. What if we are upgrading Microsoft Office and the change leads to a large number of help desk calls about the application looking or working differently? What if we are upgrading our call center's soft-phone client and something goes wrong with the installation? These risks exist in every automated software deployment because it is virtually impossible to test every scenario that could exist out in the real world.

Deployment Strategy

We schedule most software deployments over the course of a week. Typically we target about 1/16th of our computers on the first day, then add 2/16ths on the second day, 3/16ths on the third day, 4/16ths on the fourth day, and the remaining 6/16ths on the fifth day. This lets us control the upgrade and confirm that machines are not being adversely affected. At any point we can halt the deployment, which limits the number of computers that have been affected. If we deem a deployment higher risk, we may spread it out over multiple weeks, starting with groups as small as 1/256th on the first day and adding more as we deem appropriate.
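
As a rough illustration of the arithmetic behind this schedule, the Python sketch below adds up the daily increments and estimates how many endpoints are targeted after each day. The 50,000 fleet size is only a round number based on the "over 50k" figure above, and the function names are hypothetical.

```python
from fractions import Fraction

# Daily increments from the schedule above, in sixteenths of all computers.
DAILY_SIXTEENTHS = [1, 2, 3, 4, 6]


def cumulative_rollout(increments, denominator=16):
    """Return the cumulative share of computers targeted after each day."""
    total = Fraction(0)
    shares = []
    for step in increments:
        total += Fraction(step, denominator)
        shares.append(total)
    return shares


def endpoints_targeted(fleet_size, increments, denominator=16):
    """Estimate endpoints targeted per day, assuming Guids spread evenly."""
    return [int(fleet_size * share)
            for share in cumulative_rollout(increments, denominator)]


if __name__ == "__main__":
    shares = cumulative_rollout(DAILY_SIXTEENTHS)
    counts = endpoints_targeted(50_000, DAILY_SIXTEENTHS)  # assumed fleet size
    for day, (share, count) in enumerate(zip(shares, counts), start=1):
        print(f"Day {day}: {share} of computers, roughly {count} endpoints")
```

The same functions can model a slower rollout by passing a different list of increments and a denominator of 64 or 256.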

Workstation Groups

The key to this deployment strategy is having well-defined workstation groups of the appropriate size. We rely on the fact that every computer has a unique computer ID (Guid) whose hex digits are randomly assigned, so grouping by the leading digits splits the computers into groups of roughly equal size. We initially created 16 filters: WG: 0, WG: 1, WG: 2, ..., WG: F. The filter WG: 0 has the following definition.

SELECT Guid
FROM vRM_Computer
WHERE Guid LIKE '0%'

Because of the size of our organization, the 1/16th groups contained over 3,000 endpoints each. This led us to create two additional sets of groups: 64ths and 256ths. Examples of these queries are included below.

-- 1/64th of all computers
-- computers with a Guid starting with 00, 01, 02, or 03

SELECT Guid
FROM vRM_Computer
WHERE Guid LIKE '0[0-3]%'

-- 1/256th of all computers
-- computers with a Guid starting with 00

SELECT Guid
FROM vRM_Computer
WHERE Guid LIKE '00%'
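
To get a feel for how evenly a prefix-based split behaves, here is a small Python sketch that counts Guids per leading-digit prefix. It is a simulation only: it assumes randomly generated version 4 UUIDs stand in for real computer Guids, and it uses an assumed fleet size of 50,000. The function names are hypothetical.

```python
import uuid
from collections import Counter


def expected_group_size(fleet_size, groups):
    """Average group size if Guids are spread evenly across the groups."""
    return fleet_size / groups


def bucket_counts(guids, prefix_len=1):
    """Count Guids by their first prefix_len hex digits (upper-cased)."""
    return Counter(str(g).strip("{}")[:prefix_len].upper() for g in guids)


if __name__ == "__main__":
    # Simulated fleet: random UUIDs, not real Altiris resource Guids.
    fleet = [uuid.uuid4() for _ in range(50_000)]
    for groups, prefix_len in ((16, 1), (256, 2)):
        counts = bucket_counts(fleet, prefix_len)
        print(f"{groups} groups: expected about "
              f"{expected_group_size(len(fleet), groups):.0f}, "
              f"smallest {min(counts.values())}, "
              f"largest {max(counts.values())}")
```

Running a similar count against the real Guids is a quick way to confirm that no single group is unexpectedly large before relying on it for a staged deployment.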

Creating these filters manually would have been very tedious, so I wrote a VBScript that creates them using the ASDK (ZIP attached). Run the provided scripts from the console server, or from the parent if you are in a hierarchy. Update the constant FolderGuid to the guid of the folder where you want the filters created. We created the structure shown below to keep things organized. The script creates the 64th filters with the naming convention WG 64: 00-03, and the 256th filters as WG 256: 00. We named our filters this way so that they are easy to find when creating targets that use them.

WG folder structure.PNG
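
The attached script is not reproduced here. As a hedged sketch of the naming convention only, the Python below generates the filter names and the matching query text for all three sets of groups. It does not call the ASDK or create anything on the server, and the function names are hypothetical. For the 64th groups it lists the second-digit characters explicitly (for example `[0123]`, which matches the same Guids as `[0-3]` in the query above), an assumption made to avoid relying on how a range such as 8-B is ordered.

```python
HEX_DIGITS = "0123456789ABCDEF"


def filter_sql(like_pattern):
    """Build a filter query in the same shape as the examples above."""
    return (
        "SELECT Guid\n"
        "FROM vRM_Computer\n"
        f"WHERE Guid LIKE '{like_pattern}'"
    )


def build_filters():
    """Return {filter name: query text} for the 16th, 64th and 256th groups."""
    filters = {}
    for first in HEX_DIGITS:
        # 1/16th groups: WG: 0 .. WG: F
        filters[f"WG: {first}"] = filter_sql(f"{first}%")
        # 1/64th groups: four second-digit values each, e.g. WG 64: 00-03
        for start in range(0, 16, 4):
            chunk = HEX_DIGITS[start:start + 4]
            name = f"WG 64: {first}{chunk[0]}-{first}{chunk[-1]}"
            filters[name] = filter_sql(f"{first}[{chunk}]%")
        # 1/256th groups: one two-digit prefix each, e.g. WG 256: 00
        for second in HEX_DIGITS:
            filters[f"WG 256: {first}{second}"] = filter_sql(f"{first}{second}%")
    return filters


if __name__ == "__main__":
    filters = build_filters()
    for name in ("WG: 0", "WG 64: 00-03", "WG 256: 00"):
        print(name)
        print(filters[name])
        print()
```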

Staging Filter

Create a staging filter that you will use to control membership in the new policy. We name our staging filters CS: Software Name. Initially, this filter is left with no definition. As you deploy, you add workstation groups to the "Filters included in this filter" definition, as shown below.

120px_CS_Some Software 1.5.png

You can also add individual computers to the staging filter to ensure that they are upgraded first, as an advanced pilot.

Policy Target

If you are upgrading from one version to another, exclude the staging filter from the original target. Create a policy target for the new deployment and, at the end, add an Include Only with the staging filter. If a computer falls in the staging filter, it will be excluded from the original policy and added to the new policy. Once you have staged all of the workstation groups, you can disable the original policy, remove the staging filter's Include Only entry from the new target, and delete the staging filter.

To demonstrate, I created a filter for computers that require 'Some Software', CR: Some Software. Based on my original target, the Some Software 1.0 policy was being applied to 1,749 computers. With WG: 0 in the staging filter, which represents 1/16th of all computers, I updated the original 1.0 target and created a 1.5 target for the new software. With the updated targets, 110 computers (just over 1/16th) are now targeted by the 1.5 policy and 1,639 remain in the 1.0 policy.

PT_Some Software.png
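
The target logic can be modeled as plain set operations. The Python sketch below is only a model of that logic under stated assumptions: Guids are handled as strings, the function names are hypothetical, and the demo uses randomly generated UUIDs rather than real computers, so its counts will vary from run to run.

```python
import uuid


def workstation_group(guid):
    """Return the 1/16th workstation group name for a computer Guid."""
    return "WG: " + str(guid).strip("{}")[0].upper()


def staging_members(all_computers, staged_groups, pilots=()):
    """Computers in the staging filter: staged groups plus any pilot machines."""
    staged = {g for g in all_computers if workstation_group(g) in staged_groups}
    return staged | set(pilots)


def split_targets(base_target, staging):
    """Return (original policy target, new policy target).

    The original target excludes the staging filter; the new target is the
    base target with an Include Only on the staging filter.
    """
    return base_target - staging, base_target & staging


if __name__ == "__main__":
    # Simulated data: 1,749 computers that require the software.
    base = {str(uuid.uuid4()) for _ in range(1749)}
    staging = staging_members(base, {"WG: 0"})
    original, new = split_targets(base, staging)
    print(f"1.0 policy: {len(original)} computers, 1.5 policy: {len(new)}")
```

Because every computer in the base target lands in exactly one of the two sets, adding another workstation group to the staging filter only moves computers from the original policy to the new one.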

Benefits

This technique lets you control the deployment of any software policy while also randomizing it across your endpoints. This can help spot conflicts early without adversely affecting an entire department. Because we use the staging filter as an exclusion, the technique works whether your base target is all computers or any subset of them. Based on the size of your base target, or the risk in deploying to it, you can easily adjust the deployment schedule to suit your needs. Another benefit shows up in our hierarchy: during deployments, you only need to update the staging filter definition by adding a workstation group, then replicate the filter down to the children.

Other Uses - Patch Management

We follow a similar schedule for Windows patches. In the days following Patch Tuesday, we deploy to Staging Group A, a static group of non-production IT computers, just to verify that the patches install properly and that we don't see anything unexpected. We then continue on to Staging Group B, which is composed of IT production computers (desktop support, developers, application owners, etc.), to verify that the patches don't conflict with any of our production applications. We then have three more staging groups: Staging Group C (1/8), Staging Group D (1/4), and Staging Group E (1/2). Below is an example of our Staging Group C target.

PM_Staging_C.png

On a side note, adjusting the patch management policies through the various staging groups can be extremely time-consuming. I plan to build a Workflow process that will make this easier to manage, and I will post an article should it come to fruition.

Other Uses - Software Management Policy Schedules

We use Managed Delivery Policies for all software deployments, which means that we have hundreds of them. In most cases we have a midnight, no-repeat schedule, which ensures that the policy runs the first time a computer is notified that it is in the policy target, plus a schedule that runs daily to verify compliance. This is the default schedule we use, as set in Managed Delivery Settings.

default policy schedule_0.PNG

If you leave the schedule set to the default for all of your policies, every computer will run every policy at the same time. This generally doesn't affect the endpoints, but they will all report their policy compliance at the same time, which can flood the server with NSEs to process. We came up with a randomization technique, similar to the workstation groups, that sets the policy times based on the guid of the policy. The schedule below keeps policy compliance checks from running on the quarter-hour marks and spreads all of the policies throughout the day.

policy schedules.PNG

Most of our policies follow this schedule, although there are times when we add an On Boot schedule or make some other modification, depending on the situation. I built a Workflow that sets the schedule based on the guid, which we call with a custom right-click action.
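
The exact mapping from policy guid to schedule time is not spelled out above, so the Python sketch below is only one possible mapping, not the one the Workflow uses. It assumes the first eight hex digits of the policy guid pick one of the minutes of the day that do not fall on a quarter hour. The function name and the mapping itself are hypothetical.

```python
# Minutes within an hour that avoid the quarter-hour marks (:00, :15, :30, :45).
ALLOWED_MINUTES = [m for m in range(60) if m % 15 != 0]


def compliance_time(policy_guid):
    """Map a policy guid to a daily HH:MM time off the quarter-hour marks.

    Assumed mapping: the first eight hex digits, taken as a number, select
    one of the 24 * 56 allowed minute slots in a day.
    """
    hex_digits = str(policy_guid).strip("{}").replace("-", "")
    slot = int(hex_digits[:8], 16) % (24 * len(ALLOWED_MINUTES))
    hour, minute_index = divmod(slot, len(ALLOWED_MINUTES))
    return f"{hour:02d}:{ALLOWED_MINUTES[minute_index]:02d}"


if __name__ == "__main__":
    import uuid

    # Random guids stand in for real policy guids.
    for _ in range(5):
        guid = uuid.uuid4()
        print(guid, "->", compliance_time(guid))
```

Because the time is derived from the guid alone, the same policy always maps to the same time, and no lookup table of assigned slots has to be maintained.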

Attachment(s)
zip file
Create Filters VB.zip   1 KB   1 version
Uploaded - Feb 25, 2020