
My struggles with DS7.5

Created: 28 Oct 2013 | 20 comments
YannickVR's picture
In the past few weeks I've been setting up CMS/DS7.5 for a lucky customer. Here are some of the problems I've encountered:
 
  • Importing Windows 7 for SOI: put all the files in a folder called "Sources" and import it via the console. The installer will map an E:\ drive and look for E:\sources\setup.exe to start the Windows 7 install.
  • Importing Windows XP for SOI: import the i386 folder. The installer will map an E:\ drive and look for E:\i386\winnt32.exe to start the installation.
  • PXE while debugging: if you cancel a Client Deployment Job that has a PXE task and the agent on the client cannot report to the NS server, the client will no longer get a PXE filename. You have to boot to Windows and have the SMA report and update, or delete the machine from the database and add it again (as predefined), before you can give it a PXE job again. Giving it a separate PXE job sometimes helps as well. Please add an option to see some logging of the decision that PXE makes and why (e.g. "No bootfile given to xx-xx-xx-xx-xx because no task is available").
  • A Client Job for deploying an image that includes a Software Install job will give an error along the lines of "There are no licensed agents available" if the SMA is active on the computer but doesn't have the Software Management plugin installed.
  • After booting to WinPE it takes 5 minutes for the actual deployment to start. Any way to speed this up?
  • After deploying a sysprepped Windows image, it takes 5 minutes for the agent to start and continue the deployment job. Any way to speed this up?

Hope this helps people and improves the product. If you've got any questions about my findings, let me know. 


Thomas Baird's picture

Ahhh... we've seen some of this.

 

Don't know about the first two.

As for logging, yes, it'd be nice to have reasons given in the logs.  That's a lift though - getting that info from the server to the client.  The client hits a web service, that calls a stored procedure, so the only part of the process that currently knows is the stored procedure, and you'd have to update all 3 steps to get this through.  Good idea, but not a quick fix.

The software install thing we need to look at again.  We had a similar problem before.  <sigh>

The 5 minute thing sounds like you're having issues binding to Task Server.  Should be working a bit faster.  6 minutes is the default check-in interval for Task, IF the client is not getting tickled to check in sooner.  As for speeding things up - there are many things that need speeding, not just this.

Anyway, we'll look into the SWD issue.  <sigh again>

Thomas Baird
Private Consultant & open to full-time opportunities.
That means I CAN help beyond the forum (directly).

 

YannickVR's picture

Hehehe, let me know if you need something tested :)

As for the 5 minute thing: I noticed the Agent on the client is starting quite slowly as well, checked the logs and got this:

<![CDATA[StartService('AeXNSClient'): Failed to lock service database, error 0x0000041F (Error 1055 (no description available)

I'll investigate the logs some more during deployment; I'm eager to know what's holding it up.

Thomas Baird's picture

Yeah, there is also now a built-in delay in agent startup I'm not 100% positive about - it might also answer some of that.  <sigh>  Here's to knowing everything eh?


 

YannickVR's picture

I was looking at some logs and it seems that the agent has a 2 minute delayed start, but starts earlier if a signal is given that all the services have started successfully. If I find it again I'll dump it here.

YannickVR's picture

OK, I've managed to speed it up some. I'm deploying Windows XP (yes, I know...) so I've added this to the sysprep.inf file that is used for the deploy image task:

[GuiRunOnce]

"sc config AeXNSClient start= demand"
"net start AeXNSClient"

This sets the AeXNSClient service to manual and starts it on the first boot to windows.

I've also configured sysprep.inf to log in automatically once and then lock the workstation. After all my software is deployed the system reboots and is usable right away. Shaved about 3 minutes off the deployment time!

Next: speeding up the start of deployment in WinPE. Thomas: Is the PECTAgent the only thing used in this environment? 

 

YannickVR's picture

Excerpt from Pectagent.log:

[2013/11/04 16:57:10.924 1668:1672 2] CClientConfig::GetResourceGuidFromWebService:517 CClientConfig::GetResourceGuidFromWebService, Request XML = <request policyKey="AAAAAQABliosVcgb6fbqlZjmy2ndFFYA3CLL0CEu4DdjxKLYcGVo0uDOArnmSld/FrrfCjLitScsaiYvQTC+VG8KwoIuT86lnIxUlrZHehq+jnCcqAdJ9paKC2MDeb/+XsaXeE0LVUEXkaDm8ONhnYsJqS8GIfB9JhFtbb5duH0GSrhcT5y8ghAbhTbzeIgEi8m5Ih5l7riAimpfrMKKs8XUjsAT53yVyw+EDbOtLLf4l5JQYxkYmE3+fGfWkGi8qNqh1PG/M0Pj8QIiKgeHGjA64kbsc0pum3NoOG32AJMkQ1j6hpbLsogL1utzmckdKpSMpJ0ro1HC97DCs73b4TFbzG6LzQ==" typeGuid="{2C3CB3BB-FEE9-48DF-804F-90856198B600}">
<resourcekeys>
<key name="uniqueid" value="/PQTPLu1IOxQ0Y6vY18NwQ=="/>
<key name="uniqueid" value="rIEyvwLAPchR7Q7rA/L7Yg=="/>
<key name="uniqueid" value="IR5aqXWjtiMljkzsCsWA6Q=="/>
<key name="uniqueid" value="EQNB26xk4N8DDj3IjjZ1aA=="/>
<key name="uniqueid" value="L07nmYDzazJ5y+V/eTmpXQ=="/>
<key name="uniqueid" value="/GWlPXvAXx8grVJ6bc8r7g=="/>
<key name="snbios" value="VMware-42 3b 72 6c d4 c5 00 2b-73 40 23 b1 54 0a 20 93"/>
<key name="snboard" value="None"/>
<key name="uuid" value="6C723B42-C5D4-2B00-7340-23B1540A2093"/>
<key name="uuidtrans" value="423B726C-D4C5-002B-7340-23B1540A2093"/>
<key name="mac" value="00-50-56-BB-60-51"/>
</resourcekeys>
</request>
 
[2013/11/04 16:59:11.373 1668:1672 2] apps\DeploymentClient\PECTAgent\ClientConfig.cpp:588 CClientConfig::GetResourceGuidFromWebService, Response xml =<response><Resources><Resource guid="{C39AF151-F3B8-4C30-89B0-776FB960BF45}" typeGuid="{2c3cb3bb-fee9-48df-804f-90856198b600}" name="helion-test01" ref="c39af151-f3b8-4c30-89b0-776fb960bf45" existing="True" IsPredefined="false" IsManaged="true" IsHardwareIdentified="true" /></Resources></response>
 
If you look at the timestamps, it takes almost exactly 2 minutes for the agent to get a response for GetResourceGuidFromWebService. Any ideas why? I cannot find anything useful in any of the logs about what happens in between.
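
For anyone who wants to hunt for similar silences in a longer log, here is a minimal sketch in Python. It assumes every log entry starts with a `[YYYY/MM/DD HH:MM:SS.mmm` stamp like the two entries above; the function names, the 30 second threshold, and the abbreviated sample lines are made up for illustration and are not part of any Symantec tool.

```python
import re
from datetime import datetime

# Matches the leading "[YYYY/MM/DD HH:MM:SS.mmm" stamp seen in the excerpt above.
STAMP = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def parse_stamp(line):
    """Return the timestamp at the start of a log line, or None if there is none."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S.%f")


def find_gaps(lines, threshold_seconds=30.0):
    """Yield (seconds, previous_entry, next_entry) for every gap above the threshold."""
    prev_time, prev_line = None, None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # continuation lines (e.g. the XML payload) carry no stamp
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if gap >= threshold_seconds:
                yield gap, prev_line, line
        prev_time, prev_line = stamp, line


# Abbreviated stand-ins for the two entries quoted above (payloads cut short).
sample = [
    "[2013/11/04 16:57:10.924 1668:1672 2] GetResourceGuidFromWebService, Request XML = ...",
    "<resourcekeys>",
    "[2013/11/04 16:59:11.373 1668:1672 2] GetResourceGuidFromWebService, Response xml = ...",
]

gaps = list(find_gaps(sample))
for seconds, before, after in gaps:
    print(f"{seconds:.3f} s of silence before: {after[:70]}")
```

Fed a whole Pectagent.log (for example via `open(path, errors="replace")`), this would list every entry that was preceded by a long pause, which makes it easier to see whether the wait always sits in front of the same call.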
 
Edit: ran it again with the Altiris Log Viewer next to it. PECTAgent.log says that it has made the request, but the Altiris log doesn't show it. After 2 minutes of silence the request pops up in the Altiris log and PECTAgent continues the process. The problem seems to be PECTAgent waiting before it actually processes the request. Weird.

mikeholmes's picture

We're also seeing the delay in PE before Ghost starts.  Any news on the subject?

YannickVR's picture

Unfortunately not. I've tried tinkering with some proxy settings but that didn't fix it. The logging isn't really useful, as seen in my previous post: it just stops for 2 minutes, no idea why.

Edit - I will be installing the SP this afternoon and rebuilding the images, will report back again :)

mikeholmes's picture

I went to 7.5 HF1, and the situation has not improved.  Seeing general slowness with task services, compared to 7.1 SP2.  Agents have a delay in binding to the task server inside production (occasionally losing the registration as well), which is likely related to the issues inside automation.  Going to contact support on it.

YannickVR's picture

Same here, no change (although I have to rebuild the boot images still, did you do that?)

Any news from contacting support on your side? My customer is coming from 6.5 which is slower than the new 7.5 task sequence for deploying their computers so it's ok for now, but it's still a shame that it's waiting for 2 minutes before starting deployment..

mikeholmes's picture

I did rebuild the boot environments, to no avail...still sits in automation for ~5 minutes before Ghost kicks off.  I should time it.

On the support front, I've got an open case, but have yet to chat with my analyst due to workload on Symantec's side, and my being unavailable at work yesterday.  I believe the analyst will be calling me today. 

YannickVR's picture

Just a thought, based on what I saw in my environment: do you have the Deployment Agent for clients enabled? The way I set up the environment was:

-Enable NBS
-Load Windows Installation files and License
-Perform a scripted OS installation
-Capture with Ghost

For some reason the deployment agent was installed during the SOI, but I never enabled it. Because it was installed I never got any errors until (yesterday) I started trying to use the "copy file" task on computers imported from the old environment. I've enabled the agent now but I still have to rebuild and test. My guess is that the deployment agent first has to be downloaded from somewhere in the PE environment before it can start.

mikeholmes's picture

We're deploying a ghost image that includes the SMA and all agents in CMS.  We've not done much customization for the PE environment aside from some driver additions. 

Had good luck over the past week or so, but today we saw some general problems with task services.  Agents losing their registration/bind with the task server was fixed by restarting Altiris Service on the NS, and tasks remaining as queued in the console was fixed by restarting the CTDL service twice on the task server.  Not the first time we've seen these types of issues since moving to 7.5, but they were in 7.1 also.  Much more frequent in 7.5.

In our environment, the delay in PE before the first task (Ghost) starts is currently about 10 minutes.  I directed support to this thread yesterday, hopefully that helps.

jpellet2's picture

Same experience here. WinPE starts, the agent loads, and then we wait for about 10 minutes before a task finally starts to image the machine. Didn't have this issue with 7.1 and it was not this bad with 7.5. It seems a little worse since HF1. Not sure whether rebuilding the PE images will help or not.

etk1131's picture

Any updates on this?  I'm seeing the same behavior in 7.5 HF2 as well. 

mikeholmes's picture

The delay appears to be a known issue at this point.  I went to HF2 also, with no change to the behavior. 

Check this KB article: http://www.symantec.com/business/support/index?page=content&id=TECH213571

Seems like this would've been seen and addressed while the product was held back for a year.

etk1131's picture

Thanks Mike.

I wish the delay was just in Automation, instead of in general with the task services.  We're having agents lose the task server in the middle of jobs.  Have you been able to work around that?

mikeholmes's picture

Could you elaborate on what you're seeing with task services?  I've got an open support case for both the delay issue (which is now secondary) and tasks ceasing to process.  

In our environment, we've got a single site server acting as the task server for all clients, NBS server, and package server.  Our NS does not run task for clients.  Since 7.5, about every 1.5 to 2 days, the task engine fails completely, halting all jobs in-progress and not allowing new tasks to start.  All client agents at this point lose their task server binding, and cannot re-register.  Usually restarting the Altiris service on the NS fixes it, but sometimes it also takes a restart of the CT DataLoader service on the site server to get tasks out of the queued state. 

I made it to backline task support where they installed and ran the SMPDiag tool (pretty neat, didn't know about it) and multiple SQL queries.  The diagnosis was that the task engine was being overwhelmed, so I'm taking steps to improve database performance and free up the load on task in general.  So far, the issue has reoccurred once (the following day), which prompted me to go ahead and truncate some tables in the database per an article referenced in SMPDiag. 

Sorry for the long-winded reply...my issue is a first according to support, and it would be interesting if anyone else is seeing this.

etk1131's picture

We have 4 site servers running task, NBS, and package services.  The NS services tasks as well, but only 500 clients max. 

What happens is during a job, the client will sometimes lose its site server connection, and if you reset the task agent, it will come back eventually.  In other occasions, the agent has to be restarted for it to come back, or it'll just magically re-bind to the site server on its own.  During deployments it's really noticeable, as after a machine joins the domain, it will sit there for 30 minutes before picking up and going to the next task. 

It's random, and I haven't had a way to fix the situation permanently.  Today, a client got the first task (of two) in a job, then decided it didn't feel like doing the second task and sat there with no task server.  This sort of thing didn't happen in my environment when I was on 7.1 SP2 (MP1.1).  My techs are starting to complain more loudly than usual.

I only have data for my location, but have reached out to my remote techs to see if they've had any issues.  The other thing is that my site is the only one with a Server 2003 R2 (32-bit) site server.  Everywhere else has Server 2008 R2, so I'm wondering if it's just the machine.

 

YannickVR's picture

Good to see that it's a known issue now, should be fixed quite fast.

I've only got one production environment running at a customer now and, because of migration pressure, we've decided to let the delay be a delay. It's not blocking anything in our environment, and total deployment including all middleware takes about 30 minutes now, which is acceptable (but ages compared to my MDT W8 deployment of 5 minutes).

I will wait with upgrading anything until the delay issue is fixed ;)