GSS2 Client Errors - Out of Proc

bhawver's picture

Getting errors on client machines "Out of Proc". This happens both when pushing items to the machine and when you are not even running a task on that particular machine. Any ideas?

bhawver's picture

*Bump*

Bruce Wang's picture

Hello Brian,
Just trying to get a clearer picture of your situation: are you able to give more information on what happens?

When pushing 'items', do you mean files, images or both? And at what stage of the process do you get the 'Out of Proc' error? Also, do you see this error in a log file (if so, please post it) or does it appear in a separate window? Does it produce an error log in the Event Viewer?

Cheers,
Bruce

bhawver's picture

Not much more I can say about it other than the error that pops up on clients is an "Out of Process" error. This happens when nothing is even being done on the clients (client is idle).

When I say pushing items, I mean executing tasks (images, AI Packages, configuration, etc.).

Nigel Bree's picture

Hmm. That's not an error message that I'm familiar with; is that the exact text? I just did a search of the GSS source code and the error message that seems to be closest to that is "Out-of-order producer completion", which comes from some new code in GSS2 and which I have been trying to eliminate.

Normally, with pop-up message boxes you can hit Control-C to copy the text of the message to the clipboard, and that's an easier way to get the exact message into the forum given it doesn't support posting screenshots directly.

bhawver's picture

Hi Nigel,

Sorry for the delayed response. But it can take a while for end-users to report the actual errors.

First pop-up error:

Title bar: c:\program files\symantec\ghost\ngctw32.exe

aiobuf::sync()position error


Second pop-up error:

Title bar: c:\program files\symantec\ghost\ngctw32.exe

Out_of_order producer completion


Any ideas?

Nigel Bree's picture

No problem; for this particular error, the root cause seems to be a threading bug in the client code, and it causes a range of different outcomes (and thus, visible errors) depending on the exact sequence in which two parts of the code are interleaved. I do have a development fix for this if you want to get in touch with me at nigel dot bree at gmail dot com.

The first error message will indeed have been "out of order producer completion", that's another outcome of the same underlying bug.
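
As a rough illustration of how one bug can surface only under some interleavings, here is a toy sketch in Python. It is not GSS code: `BufferQueue`, `issue`, and `complete` are invented names, and the idea that completions are checked against issue order is an assumption based only on the wording of the error message.

```python
class OutOfOrderCompletion(Exception):
    """Raised when a completion arrives for a buffer other than the oldest one."""


class BufferQueue:
    """Toy queue that expects buffers to complete in the order they were issued."""

    def __init__(self):
        self.next_issue = 0      # sequence number handed to the next producer
        self.next_complete = 0   # sequence number that must complete next

    def issue(self):
        seq = self.next_issue
        self.next_issue += 1
        return seq

    def complete(self, seq):
        # The consistency check: only the oldest outstanding buffer may finish.
        if seq != self.next_complete:
            raise OutOfOrderCompletion(
                f"expected {self.next_complete}, got {seq}")
        self.next_complete += 1


def run(order):
    """Issue two buffers, then complete them in the order chosen by `order`."""
    q = BufferQueue()
    a, b = q.issue(), q.issue()
    try:
        for seq in order(a, b):
            q.complete(seq)
    except OutOfOrderCompletion as exc:
        return f"error: {exc}"
    return "ok"


print(run(lambda a, b: (a, b)))  # completions arrive in issue order
print(run(lambda a, b: (b, a)))  # same work, different interleaving
```

The two calls do identical work; only the order of completion differs, which is why a bug of this kind can be invisible on most machines and frequent on a few.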

Derek Nelson's picture

Hi,

I just upgraded from GSS 1 to GSS 2 last week and experienced this error on 112 computers in a lab that prevented them from doing their automated shutdown for maintenance over the weekend. Is there a repair for this error or do I have to retrofit some 300 computers that I have upgraded to prevent this from happening again?

Thanks

Nigel Bree's picture

You can get in touch with me for an update at nigel dot bree at gmail dot com; it may not be completely perfect just yet (even in the release version of GSS2 this was incredibly rare in our test labs, making it a difficult one to diagnose and trace through) but it certainly helps a lot and you're welcome to try it.

BobRamsey's picture

We're experiencing this very intermittently as well. Will there be any official patch or update soon? Any idea what causes it or if there are any workarounds?

Thanks!

Nigel Bree's picture

As explained above, it's a threading bug in the code I wrote; there's unfortunately no way to mitigate this yourselves, you need a new executable.

I can't give you a date as to when a full product update would be made generally available for all customers.

gooner666's picture

I have exactly the same issues as outlined above. I shall e-mail you for the temp fix.

Joseph Tucker's picture

Same problem here. Could you email me the fix? tuck777 at gmail

Thomas Cheng's picture

Getting exactly the same pop ups.

Any plans for a service pack soon?

Steven Middleton's picture

We are getting the same GSS2 Client Error - "Out-of-order producer completion" message randomly on clients on a control systems network.  The bad news is we can not do anything with the system once the error appears and it comes right back as soon as we acknowledge it.

After repeatedly hitting the only button option provided - "OK" - finally gave up and rebooted the system.  Alas, the first event was on THE configuration server for the entire server cluster...

I've already emailed Nigel for a copy of the "fix" so this post is just to further document the issue.

Nigel Bree's picture

Pity this forum software unfortunately does not allow us to create "stickies". Anyway, the full LiveUpdate patch that resolves this issue (amongst others) has been released as announced here. I don't have a complete list of all the changes, but this is definitely intended to be among them.

Thomas Cheng's picture

Unfortunately I have to report that I am still getting the same error message "Out-of-order producer completion" even after installing the LiveUpdate patch. I am now on version 11.0.1.1533 and I have just seen this on my server, and also on a few of my colleagues' servers.
 
Is there anything that I can do to help nail this down? The error is really frustrating us and really should be resolved ASAP.

Ben Evans's picture

I have experienced this error on one (and, so far at least, only one) client machine here, though that PC has had the Ghost client installed for a couple of months and the server has certainly been updated to 11.0.1.1533. Does LiveUpdate need to be run on the client somehow as well? I am trying a reinstallation of the client from the updated Console to see if this helps.

Nigel Bree's picture

Thanks for that information, Ben. Basically, there must be one other mechanism for triggering this error in the code, but thus far I've not found what it might be. We run all kinds of tests here pretty much continuously and in the months since 2.0.1 I have not heard of one single occurrence of this here in our labs - so any mechanism by which this still happens is fiendishly rare under normal circumstances.

We do have solid information that a small number of customers do have this happen a lot - for them, it's very frequent. Since it's happening about a million times more often for them than anyone else, there has to be a reason - a particular trigger of some kind like an interaction with another program and if we could just find out what it is, we'd be able to make it happen in our labs too and fix it in short order.

As it happens, we seem to have found a particular piece of software that is common to places where this is frequent and so something that program is doing might be triggering something in ours - we're pursuing this lead but as yet we don't have hard evidence one way or the other and until we do it's open season on this bug.

Skip's picture

I just ran across this thread while researching the same problem. It occurred on a computer that was just ghosted from an image created from a fresh installation of WXP SP2. It is being used by the Principal of my school, and the only applications that have been installed on it are Acrobat 7.0, Office 2003 Pro, Winsurf, Sophos, and Bigfix. Could it be that there is an interaction with the Sophos, Bigfix, and Ghost agents?
FYI: If you stop the ngctw32.exe process you can close the message box without rebooting. After that I restarted the Ghost agent. So far the problem hasn't come up again.
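
The workaround described above could be scripted along these lines. This is only a sketch: `taskkill` and `net start` are standard Windows commands, and the image name is taken from the pop-up title bar quoted earlier in the thread, but the service name is a placeholder you would have to look up on your own clients (it is not given in this thread). The script defaults to a dry run that just prints the commands.

```python
import subprocess

CLIENT_IMAGE = "ngctw32.exe"  # image name from the pop-up title bar in this thread


def build_commands(service_name):
    """Return the commands that end the client process and start its service again.

    service_name is a placeholder supplied by the caller; the real name of the
    Ghost client service is not stated in this thread.
    """
    return [
        ["taskkill", "/F", "/IM", CLIENT_IMAGE],  # force-end the stuck client
        ["net", "start", service_name],           # start the service again
    ]


def restart_client(service_name, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands(service_name):
        print(" ".join(cmd))
        if not dry_run:
            # check=False: taskkill exits non-zero if the process is not running
            subprocess.run(cmd, check=False)


# Dry run with a made-up service name; nothing is executed.
restart_client("YOUR-GHOST-CLIENT-SERVICE")
```

Note that this only clears the symptom, as in the post above; it does not address the underlying bug.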

Nigel Bree's picture

"Could it be that there is an interaction with the Sophos, Bigfix, and Ghost agents?"

It's certainly possible, but I don't have a reason to suspect those in particular - I do appreciate you taking the time to give us the data point though, every bit helps. I don't think an interaction is the "cause" as such, it's just one lead we're following since there seem to be some environments where it's hugely more common than others and one unusual third-party program has turned up in a couple of them.

The main thing we're after right now is still a way to trigger this in a controlled environment so we can diagnose the real root cause, not the final symptom - the error that's being reported in the pop-up is inconsistency in a really important piece of data, but the cause of that inconsistency isn't clear yet.

Skip's picture

Nigel, school opened on Aug. 20th, and now this problem is popping up all over. It is now much more annoying because my users cannot close the message box. As a result they call me, and I don't have the time. Are you any closer to a solution?
As I mentioned in an email I sent you, this is a huge problem for our district. I have 200 computers at my school, but district-wide we have close to 100,000. Last year the district purchased an enterprise license. This summer we started ghosting everything.
You mentioned a 3rd-party program that was showing up as a common thread. What software is that?
Please stay in touch with me on this. I need to find a solution soon.
Thanks

Nigel Bree's picture

This is a ... complicated issue. Amongst the various work I've been doing, I think I have things so that the current builds don't do this - it's impossible to say with certainty whether I've really addressed the root cause since we have still been unable to reproduce it here, but I've got to a point where I can make it go away.

Now, we intend to release a 2.0.2 update at some point (which is as specific as I can be as to timing; I'm breaking the rules even by saying that I'm working on it). Unfortunately, however, the existing upgrade process itself seems to provoke an error too :-(. Upgrading from build 1533 to the current one would be worse for the majority of customers until I can get the upgrade issues sorted out, and that's the major focus of the next week or two. Fortunately that's a more tractable problem than this issue has been, but it's still going to take a little time to work through.

I do wish I had some more definitive and somewhat better news, but that's my understanding of where things stand with this issue.

Mark Berning's picture

I have received this error on one of our Ghost clients (actually it happened on a Windows 2003 server box).
I am running a patched GSS 2.0.1, version 11.0.1.1546.
I stopped the service and restarted it, and have not seen the error again yet.

45z's picture

Nigel, I have been getting this message in a very specific lab, and if you're interested I can burn and ship you a copy of our image files so you can see what's installed, and get you our hardware specs to see if that helps. I first started to see this error after I modified the nightly inventory report to gather all installed software, rather than just the default inventory report. I am reachable by e-mail at adam dot zahn at fcps dot edu should you be interested in the image files.

Thanks for all the help. This one must be annoying, judging from the thread.

Nigel Bree's picture

Thanks for the offer of help, Adam. If you can drop me a line at nigel.bree@gmail.com I'd like to hear more about the situation in your lab.

Since there are a few different threads going on, the current situation as of now is what I posted here. I do think I've got the problem mostly licked but any situation where a problem is occurring more frequently is still of real interest, because it may involve a different pathway for the problem to occur.