Someone emailed me to ask about our experience migrating from DS 6.9 to 7.1 after reading my "DS 7.1 Performance" post, which covered how much longer jobs take to complete in 7.1 than in 6.9. I figured this information would be of use to the Connect community, so here it is in article form.
Our current environment is roughly 500 desktops and laptops, most currently on WinXP SP3, with a fast-track migration to Win7 under way. We have DS 6.9 SP5 running on a VM (Xen) server with automation via PXE. We have a dedicated server for SMP 7.1, and DS is currently the only piece of it we use heavily. We deploy all our desktops and laptops with DS, as well as most software. In 6.9 we used run script return codes to chain individual jobs together to "build" our (hardware-independent) images. We use the same jobs to deploy individual pieces of software and to install that software in our image, so when a piece of software needs updating there is only one job to update. When we have enough changes to software or settings, we just kick off a new "build" chain and all changes are incorporated into the new image. On the deployment end we deploy the image with DS and run jobs to perform machine-specific installations and configuration.
We were really looking forward to 7.1 for the job builder view and conditions, which accomplish the same thing as job chaining but with an actual UI instead of having to manually follow the chain of individual jobs as in 6.9. We hired a consultant (Expressability, highly recommended) to help configure SMP 7.1 SP1 and get us up to speed on basic features. In two weeks with the consultant we got it configured (with a lot of gotchas and workarounds) well enough for me to be confident that I could continue on my own and duplicate the build and deployment processes we had in 6.9. We went with automation folders in 7.1 so that we could leave 6.9 up and running with PXE and use both environments. We're using USB drives to do initial deployment in 7.1. It took me about two months to mirror the build process and get it working reliably, mainly because of various issues I ran into and had to figure out how to work around. The bulk of the real work beyond workarounds involved making software releases for all of our main pieces of software, along with installation jobs for each and various scripts to configure settings automatically. Note that there is no importing of jobs from 6.9 to 7.1. All jobs must be recreated from scratch.
At this point I have an automated build process working in 7.1 as it did in 6.9, and it seems pretty reliable. Performance is the big hit. As I outlined in that post, every component of a job (a task, another job or a condition) takes about ten seconds longer than in 6.9. When you consider that our Win7 standard build runs over 80 separate tasks and conditions, that's a big chunk of wasted time. What's worse for us is deployment, which has over 50 tasks and happens each and every time we deploy or re-image a system. We had the deploy time in 6.9 down to 15-20 minutes, whereas in 7.1 it takes about twice that.
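To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes a flat 10 seconds of added latency per job component, which is the figure observed above; the function name and defaults are only illustrative, and real jobs will vary.

```python
def added_overhead_seconds(steps: int, per_step_seconds: int = 10) -> int:
    """Estimate the extra run time of a 7.1 job relative to 6.9.

    Assumes each job component (task, nested job or condition) adds one
    fixed delay of per_step_seconds, as described in the text above.
    """
    return steps * per_step_seconds


if __name__ == "__main__":
    # Step counts taken from the article: 80+ for the build, 50+ for deployment.
    for name, steps in (("build", 80), ("deploy", 50)):
        extra = added_overhead_seconds(steps)
        print(f"{name}: +{extra} s (about {extra / 60:.1f} min)")
```

With these inputs the estimate comes to 800 seconds (about 13 minutes) for the build and 500 seconds (about 8 minutes) for a deployment. Treat it as a lower bound: the step counts are "over" 80 and 50, and conditions appear to cost more than a plain task.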
The upside to 7.1 is that it's a lot easier to track multiple versions of software and eventually (since we did it right and made software releases for everything including detection rules) we can leverage our software into managed delivery policies for automatic updating etc. The Job UI is fairly nice as well, beyond the usual frustration of working with inherent latency in any dynamic web-based UI. The Silverlight UI is certainly leaps and bounds better than NS 6.5.
Issues Discovered During Migration:
- No way to specify success codes for run script tasks: all jobs with run scripts returning non-zero exit codes must ignore task failures or the jobs will stop when tasks return a non-zero code, even if a condition is specified for the code.
- Conditions are buggy: only the first few tasks in a job can be targeted by conditions. The UI will only let you pick the top few and if you try to work around it by moving things around the conditions do not work reliably.
- Automation does not perform hardware inventory
- Initial Deployment no longer provides a way to rename a system in automation; configuration tasks only work in production.
- If there is a pending agent update (new sub-agent to be installed, etc.), the update can run while tasks are in progress and cause them to fail with vague, generic errors.
- "Success" conditions don't seem to work correctly. When a software delivery task returns a pre-defined success code, a condition set to trigger on that success often returns false anyway.
- Changes to custom unattended files are not picked up by deploy image tasks, even if you go into the deploy task and edit the custom configuration ("save changes" button does not get enabled).
- Capture image tasks, when re-run, create new image resources with the same name as the previous image. All deploy image tasks must be manually updated to use the new image resources.
- Deleting old image resources is a convoluted process.
- Java folder browser in console does not work correctly in Win7 with default security settings.
- When using automation folders in an image, windows automatically resets the boot menu timeout to 30 seconds when mini-setup runs, causing long delays in the several reboots needed during Win7 setup after deployment.
- There is no GUI method in the console of updating automation settings such as adding WinPE packages.
- Each and every component of a job takes a 10 second round trip between server and client agent.
- Manage->Computers does not show real-time connection status of client computers in the tree view.
- Reboot to... tasks, both Automation and Production, will reboot a client regardless of current environment. If you are in automation and you run a "Reboot to Automation" task, the client will reboot and go back into automation before proceeding.
Workarounds for Issues (listed in the same order as the issues above):
- In production, make software packages for scripts and specify success codes. Use Quick Delivery tasks with these instead of run script tasks
- Whenever you need a condition beyond the first few tasks of a job, make a new "sub-job" with the condition logic and result set of tasks and call it from the parent job
- Use WMIC in automation; in production run hardware inventory manually
- Develop a custom way to query for machine specific data in automation (we made an HTA), save it and pass it on to production, then use condition logic to run configuration tasks based on that data
- Be sure to run an "Update Configuration" task before doing any long jobs which have the potential to be interrupted. Avoid scheduling tasks during new sub-agent rollouts
- Just always use return-value conditions. They work even with success codes
- In the deploy task toggle the Sysprep Configuration radio button between "Generate..." and "Custom...", then "save changes" becomes active to re-save it
- Rename the image resource (Manage->All Resources->Software Component->Image Resource) before capturing a new image to make it easier to distinguish between old and new
- Delete the image resource (see previous workaround for location). Make note of the GUID! Delete the GUID folder in \\<dsserver>\Deployment\Task Handler\Image. Do the same for all site servers (yes, manually, on each and every site server)
- Run gpedit.msc, change "Computer Configuration->Windows Settings->Security Settings->Local Policies->Security Options->Network security: LAN Manager authentication level" to "Send LM & NTLM - use NTLMv2 session security if negotiated"
- In automation before image capture, using bcdedit: reset the boot menu timeout, export the boot menu settings, delete the automation folder bcd entry. In production once deployment is complete: restore the exported boot menu settings with bcdedit
- Create identically named configurations in bootwiz, use it to set what you want, create automation folder installer and uninstaller and overwrite those created by the console (\\<dsserver>\NSCap\bin\Win64\x64\Deployment\Automation\PEInstall_x64\PEInstall_x64.exe, etc). Never run 'Recreate Preboot Configurations' in the console after this!
- No workaround for this. Jobs WILL take longer to complete in 7.1 than in 6.9 by at least 10 seconds per task (double that if using a condition).
- You can get a tree showing connection status by going to "Settings->Notification Server->Site Server Settings" and expand the tree down to a site server, "Services", "Task Service". Convoluted, but it works if you absolutely need to know (and know which site server the machine is on).
- See this post for how to create a job that detects the environment and only reboots if necessary.
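The bcdedit workaround for the boot menu timeout can be sketched as an ordered command list. This is only an illustration of the sequence described above, not our actual scripts: the backup path, the entry identifier and the timeout value are hypothetical placeholders, and depending on how the BCD store is exposed in automation you may need to add bcdedit's /store option. The sketch defaults to a dry run that just prints the commands.

```python
import subprocess

# Hypothetical placeholders: adjust for your own environment.
BACKUP_PATH = r"C:\bcd_backup"          # where the exported BCD store is saved
AUTOMATION_ID = "{automation-entry-id}"  # look up the real id with `bcdedit /enum`


def capture_prep_commands(backup=BACKUP_PATH, entry_id=AUTOMATION_ID, timeout=0):
    """Commands to run in automation, before capturing the image."""
    return [
        ["bcdedit", "/timeout", str(timeout)],  # reset the boot menu timeout
        ["bcdedit", "/export", backup],         # save the boot menu settings
        ["bcdedit", "/delete", entry_id],       # drop the automation folder entry
    ]


def post_deploy_commands(backup=BACKUP_PATH):
    """Commands to run in production, once deployment is complete."""
    return [["bcdedit", "/import", backup]]     # restore the saved settings


def run_all(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)     # stop on the first failure


if __name__ == "__main__":
    run_all(capture_prep_commands())
    run_all(post_deploy_commands())
```

The same command order works just as well in a plain batch file called from a run script task; the point is that the export happens after the timeout is reset and before the automation entry is deleted, so the restored settings carry the short timeout and still include the automation entry.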
Pros and Cons specific to DS 7.1:
Pros:
- Integration with the rest of SMP
- Ability to run jobs from within other jobs
- Ability to handle exit codes via conditions within the job builder UI
- Automation folders work much better than old automation partitions (direct access to production drive in WinPE)
- Pre-defining custom tokens is much nicer than building custom tokens in-line
Cons:
- Forced to have jobs with run-script-based conditions ignore all task failures (see the first issue above)
- Forced to run hardware inventory manually during deployment if hardware information is needed
- Initial Deployment is lacking configuration features in automation
- Agent updates and task handling do not play well together
- Conditions are buggy
- Image management is convoluted
- Editing preboot configurations beyond additional driver installation is a manual process
- Web UI is much less responsive than native 6.9 UI (but better than NS 6.5).
- Web UI only works in Internet Explorer, negating any cross-platform benefits of using a web UI in the first place...