Video Screencast Help
Symantec to Separate Into Two Focused, Industry-Leading Technology Companies. Learn more.
Endpoint Management Community Blog

"Error 5 Building Script" with DS6.9 SP6

Never seen this before!
Created: 02 Jul 2014 • 15 comments
ianatkin's picture
0 0 Votes
Login to vote

Just imported some job binary for normally well behaved jobs on a new DS6.9 SP6 server (on Windows 2008 R2 with SQL Server 2012) and started getting this error on tasks running embedded scripts.

Never seen this before which is odd as I've been using these job sets for years (since SP4). It's an access denied problem...  and sometimes it happens... sometimes it doesn't.

Feels like a support ticket is looming... :-(

Comments 15 CommentsJump to latest comment

Network23's picture

Hi Ian,

I´ve also seen this issue at different customers - never found a solution. Please update if you are able to resolve this error.

Thanks
Network23

0
Login to vote
michael cole's picture

Hi Ian, is there any other errors or details you can paste into the thread to give this more searchability from the internet?

Michael Cole

Principal Business Critical Engineer

Business Critical Services

0
Login to vote
michael cole's picture

Just had a look for you, due to the intermittent nature of the job the potential solutions were varied and numerous and there is also a KB which i'm not keen to link.

One potential solution was that recreating the entire job fixed it.

Duplicating the issue is proving tricky.

Michael Cole

Principal Business Critical Engineer

Business Critical Services

0
Login to vote
schuttwm's picture

We have seen this error to in the past year or so.  Usually on the Mac side when running bash scripts client side.  A retry usually allows it to continue so we have not investigated too far.  This has been on SP5 and SP6 if memory serves.   As you say, I cannot find a common ground as to when it happens though.   For it certaintly happens about 1% of the time, but that one percent I cannot find the commonalities.    2008r2 seperate DB server running SQL 2008 on 2008r2 as well. 

0
Login to vote
ianatkin's picture

Sorry the information is scant on this... there really isn't much to give. The DS engine is just randomly failing to build server-side embedded scripts.

The exact error (for internet searchability) is "Error 5 Building Script 'C:\Program Files (x86)\Altiris\eXpress\Deployment Server\Temp\5000001.bat" 

The logs don't reveal anything. And repeated testing reveals that the same job works *most* of the time. Have reproduced it on a VM running on a new Samsung SSD just in case it was a virtual infrastructure disk contention issue. The VM has one client only as is freshly built... so high load on the hardware is unlikely to be the root cause.

Interestingly we don't see this on all our SP6 boxes. Bizzarely the ones with the highest load have never had this.

Have opened a case and attached all the logs just in case support have anything to add.

Kind Regards,
Ian./

 

P.S. But dead happy that I've got a 500GB samsung 840EVO in my machine for my VMs. 70,000 IOPS!!!!!

Ian Atkin, IT Services, Oxford University, UK

Connect Etiquette: "Mark as Solution" those posts which assist you most in resolving your problem, and give a thumbs up to useful articles and downloads

0
Login to vote
plit's picture

Ian,

 

I've tried to dupe it, using your job. I've slightly changed the XML path (since i dont have same xml files you have)

SET XML=\\SERVERNAME\express\temp\deadbeef.xml

And job went through till 5th task (where it needs to go into automation).

So it's defienetly an environmental. I had 6.9 SP6 on 2008 R2 server. It might be some kind of "timing issue" or not enough delay between those 2 scripts. As for each Run Script engine would require to create %ID%.bat in Temp folder. and possibly it's not able to launch 2nd script since first file not yet fully deleted? :)

I've tried to call you to discuss it further, but reaching voicemail. Will try later, and we'll figure it out.

0
Login to vote
plit's picture

So i've tried to dupe it again.

Managed to get this "error 5" at one point. but wasnt able to dupe it 2nd time.

Tried to assign to 10+ machines at same time - no errors

Changed DS Console option to scan resource every 1 second - no errors.

Made Temp folder to be Encrypted and Compressed and Indexed - no errors.

Running out of ideas how to force this error to happen.

0
Login to vote
ianatkin's picture

I've got this happening on couple of servers and I've only ever seen this with SP6. Can't this be escallated to find out exactly at what point in the DS code flow this error is produced?

Pauses don't help. It's not a "cannot locate" error (for which we have scripts which loop until the file can be accessed). When this build script error occurs the bat file doesn't appear to even hit the server's disk.

Ian Atkin, IT Services, Oxford University, UK

Connect Etiquette: "Mark as Solution" those posts which assist you most in resolving your problem, and give a thumbs up to useful articles and downloads

0
Login to vote
ianatkin's picture

Hi Pavel, Turned off SQL Profier as wasn't catching it... but then ran the job again and got error 5 again. Will be more patient from now on....

Ian Atkin, IT Services, Oxford University, UK

Connect Etiquette: "Mark as Solution" those posts which assist you most in resolving your problem, and give a thumbs up to useful articles and downloads

0
Login to vote
ianatkin's picture

Attached are now two log files, build_ok.txt and build_error.txt

The DS Console was closed for the entire duration of the traces to give cleaner traffic.

The event_id to look at here is 2000069; this is my job which contains the server side script tasks. The logs show queries for both task sequences 1 and 2 in these and only build_ok.txt progresses to task_seq=3

I can't however see the SQL queries for the token replacement SQL in build_error.txt file which are present in the second server side script. So that means that the query ran to find the second server-side script, but the engine failed before getting as far as doing any token replacement on it.

(quick search strings are event_id = 2000069 and cond_seq = 0 and task_seq = 0 and event_id = 2000069 and cond_seq = 0 and task_seq = 1)

Not really sure what this tells us without a flow chart of how the DS engine works though... ;-)

AttachmentSize
build_error.zip 112.93 KB

Ian Atkin, IT Services, Oxford University, UK

Connect Etiquette: "Mark as Solution" those posts which assist you most in resolving your problem, and give a thumbs up to useful articles and downloads

0
Login to vote
ianatkin's picture

"Error Building Script" still plagues me...

Here's the relevant extract of the engine logs from this session...

 

[07/23/14 11:56:51.678] e34-Processing scheduled event: EVENT_ID=2000069; NEXT_TASK_SEQ=1; STATUS_CODE=0; COMPUTER_ID=5000002
[07/23/14 11:56:51.678] e34-CRxDatabase::GetTaskType - Entered.
[07/23/14 11:56:51.678] e34-CRxDatabase::GetTaskType - Exited.
[07/23/14 11:56:51.678] e34-CRxDatabase::IsEventCpuArchSafe - Entered.
[07/23/14 11:56:51.678] e34-CRxDatabase::IsEventCpuArchSafe - Exited.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::ProcessExecuteScriptOnLocalDS - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::ProcessScriptFile - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::ProcessTaskErrorResponse - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Exited.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::ProcessTaskErrorResponse - Exited.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::CompleteScheduleEx - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Exited.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Exited.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Exited.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Entered.
[07/23/14 11:56:51.678] e34-CRxCtrlServer::GetDatabase - Exited.
[07/23/14 11:56:51.678] e34-CRxDatabase::CompleteSchedule - Entered.
[07/23/14 11:56:51.678] e34-CRxDatabase::ExecuteSQL - Entered.
[07/23/14 11:56:51.694] e34-CRxDatabase::ExecuteSQL - Exited.
[07/23/14 11:56:51.694] e34-CRxDatabase::CompleteSchedule - Exited.
[07/23/14 11:56:51.694] e34-CRxCtrlServer::OnEventCompletion - Entered.
[07/23/14 11:56:51.694] e34-CRxCtrlServer::OnEventCompletion - Exited.
[07/23/14 11:56:51.694] e34-CRxCtrlServer::CompleteScheduleEx - Exited.
[07/23/14 11:56:51.694] e34-CRxCtrlServer::AddScheduleCheck - Entered.
[07/23/14 11:56:51.694] e34-AddScheduleCheck waiting for lock on schedule list...
[07/23/14 11:56:51.694] e34-AddScheduleCheck processing...
[07/23/14 11:56:51.694] e34-CRxCtrlServer::AddScheduleCheck - Exited.
[07/23/14 11:56:51.694] e34-CRxCtrlServer::ProcessScriptFile - Exited.
[07/23/14 11:56:51.694] e34-ProcessExecuteScriptOnLocalDS - ProcessScriptFile failed.
[07/23/14 11:56:51.694] e34-CRxCtrlServer::ProcessExecuteScriptOnLocalDS - Exited.
[07/23/14 11:56:51.694] e34-CRxCtrlServer::ProcessExpiredSchedule - Exited.

As you can see the engine logs aren't very revealing here. The UI actually provides more data than the logs. The logs say the script file failed; nothing else. The UI though says it failed with error 5 on building the script.

Still puzzled...

 

Ian Atkin, IT Services, Oxford University, UK

Connect Etiquette: "Mark as Solution" those posts which assist you most in resolving your problem, and give a thumbs up to useful articles and downloads

0
Login to vote
ianatkin's picture

Now convinced this is race issue between the DS engine and Windows in it's filesystem cleanups.

When the first "run script" file is created and executed, say 500003.bat, the engine is commanded to delete the file once the task completes.  This idea being that when it passes to the next task it is now clear to create the a new batch file. However, I think the windows api deletes the file asyncronously; this means the delete call only marks the file for deletion and DOES NOT WAIT until the delete has occurred before it returns.

This means that sometimes when the next 'run script' task runs (that utilises the same interpreter and thus has exactly the same filename), we can hit a scenario where the previous incarnation of the script file has not yet been deleted and is in a "DELETE PENDING" state. The subsequent CreateFile call then fails which results in the UI error "Error 5 Building Script".

And yes, error code 5 is the classic "ACCESS DENIED" which traditionally means permissions, but this transient file inaccessibility due to the delete pending gives exactly this error code (see link below) when using the CreateFile Api.

Putting this another way, the successful creation of the second local script file is dependent on the sequencing of two events that cannot be 100% guaranteed to occur in the order we need,
 
  1. The script file transitions from the "DELETE PENDING" state to being actually deleted
     
  2. The engine moves to the next script task and creates a new script file.
     

The fact this order can't be guaranteed (without adding some extra checks) gives us a race condition. This is why this issue is only reproducible on a statistical level and likely also has specific hardware dependencies which favour the DS engine moving faster then the NTFS delete.

 

Helpful Stuff

Useful (though largely not very understandable right now) doc I found whilst looking into this issue,
 
 
And someone else coding file create and deletes, suffering from this same issue,
 

Ian Atkin, IT Services, Oxford University, UK

Connect Etiquette: "Mark as Solution" those posts which assist you most in resolving your problem, and give a thumbs up to useful articles and downloads

0
Login to vote
ninetoes28's picture

Thanks for posting this, been dealing with the error 5 issue since march (still are)

however this explains exactly what is happening in our enviroment.

0
Login to vote
david_evans's picture

I'll have the Dev team take a look at this to see if there is anything we can do. Thanks for the information.

Regards,

David

________________________________

David Evans
Senior Product Manager
Deployment Solution, Ghost Solution Suite, Server Management Suite, Task Server
www.symantec.com
________________________________

0
Login to vote
ianatkin's picture

Our workaround here is to insert 'dummy' scripts of an alternate flavour between consecutive server side 'run script' tasks.

For example if we have these two tasks in a job,

  1. 'Run Script' (cmd script)
  2. 'Run Script' (cmd script)

We change this to,

  1. 'Run Script' (cmd script)
  2. 'Run Script' (vbscript) (dummy, empty script)
  3. 'Run Script' (cmd script)

It's a bit cumbersome, but does the job of introducing a natural delay which allows time for the asynchronous NTFS file deletes to occur.

 

Ian Atkin, IT Services, Oxford University, UK

Connect Etiquette: "Mark as Solution" those posts which assist you most in resolving your problem, and give a thumbs up to useful articles and downloads

0
Login to vote