Helpdesk went Kaboom - Runtime Errors, Slow Loads, etc

Created: 03 Jan 2012 • Updated: 04 Jan 2012 | 3 comments
This issue has been solved. See solution.

Hopefully someone here is still using HD6 and can offer me some advice. Over the recent EOY break, with nobody actually on site, our HD6 install died. Everything was working fine around the 23rd of December... then this morning Helpdesk took more than 50 seconds per click when navigating around. I logged on to the NS Server (v6.0.6074 - SP3 + R13, with SQL 2005 x64 off-box, running Server 2003 Standard, btw) and found nothing out of the ordinary, i.e. pretty standard CPU and memory usage, and next to no network traffic (as this is still holiday time for most people). I ran HD6 (v6.308 SP5, btw) on the server itself and it STILL took ages. One of the techs then said that they were getting "Runtime Error" messages when clicking on the links in the notification emails for new jobs. It's broken.

I checked Windows updates, and none had been applied. I checked AntiVirus, and that had done a few updates, but they didn't seem to be much of an issue. I then ran the "Altiris Console" (installed on the same server, using the same DB instance as "Incidents") and this worked fine... so I wouldn't expect it to be a database or network issue.

Here are the issues I have identified so far:

  1. Helpdesk is horribly slow. On some very rare occasions, the server responds in about 2 seconds. Almost all the time now it is between 35 and 55 seconds per click. It was working fine last year.
  2. You can't use the "redirect". Accessing the links in the emails (e.g. http://servername/AeXHD/default.aspx?cmd=viewItem&id=12345) or accessing http://servername/AeXHD/ results in a "Server Error in '/AeXHD' Application. Runtime Error" message. You need to manually set the "worker".
  3. After making a request (i.e. clicking on something), the page status sits at "Waiting for servername", and meanwhile there is the smallest blip of network activity on the server. Some 30+ seconds later, the w3svc hits a quick 30% or so, and then the data is returned to the web browser a second or so later. So it's like Helpdesk is having a nap between each request and response.
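
To put numbers on the delay described in item 3, here is a minimal Python sketch that times a handful of requests against a Helpdesk URL and counts how many were slow. It is only an illustration: the function names, the 10-second threshold, and the number of attempts are assumptions made for this example, not anything from Altiris, and the URL you pass in would be whatever your own server uses.

```python
import time
import urllib.error
import urllib.request

SLOW_THRESHOLD_SECONDS = 10.0  # assumption: anything above this counts as "slow"


def fetch_status(url, timeout=120):
    """Fetch a URL and return its HTTP status code (error pages included)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # read the body so the full response time is measured
            return response.status
    except urllib.error.HTTPError as err:
        # An HTTP error response (for example a 500) still counts as a reply.
        return err.code


def time_call(func, *args):
    """Call func(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def summarize(timings, threshold=SLOW_THRESHOLD_SECONDS):
    """Summarize a non-empty list of elapsed times in seconds."""
    return {
        "count": len(timings),
        "slow": len([t for t in timings if t > threshold]),
        "fastest": min(timings),
        "slowest": max(timings),
    }


def run_probe(url, attempts=5):
    """Time several requests to url, print each one, and return the summary."""
    timings = []
    for attempt in range(1, attempts + 1):
        status, elapsed = time_call(fetch_status, url)
        print(f"attempt {attempt}: HTTP {status} in {elapsed:.1f}s")
        timings.append(elapsed)
    return summarize(timings)
```

Calling `run_probe("http://localhost/aexhd/")` on the server itself, and again from a workstation, would show whether the 30+ second pause follows the client or stays with the server.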

I found an article recommending a repair on the "TaskServer" and "Helpdesk" components, which didn't help - in fact it broke some permissions/settings in IIS that required some fixes to get it back to the "mostly broken" state (as opposed to the "totally broken" state it was in).

I have checked the DB server, and it looks fine performance wise.

I hate day 1 gremlins.

Any thoughts?

EdT

Could this be a DNS or WINS issue?

If your issue has been solved, please use the "Mark as Solution" link on the most relevant thread.

MurrayW

I thought about that, but:

a) It doesn't seem to affect the NS Console navigation, reports, queries, etc., which are on the same box and SQL instance.

b) It now appears that, once you have loaded a queue (or waited for the "New Incident" or "Edit Incident" pages to load), the rest of the refreshes and actions are somewhat normal. By this I mean, I click "My Incidents" and it takes 50 seconds to load. I then open a job and it's near instant. I click "Next" at the top and it's fine. I click ANY of the queues again (or "New Incident", or "Edit Incident") and it takes FOREVER.

c) As stated previously, I have performed the test ON the NS server itself using "http://localhost/aexhd" and still experience the same issues - even after I remove the "custom.config" and reset the "Default.aspx" so that I am running vanilla SP5.

Any ASP pages I have written and added to the AEXHD folder (extra search/query pages) work fine. NS Console works fine. It's only Helpdesk that has seemingly died. And that makes me think that it's a HD6 issue (or maybe IIS Application Pool issue).

Quote: "I know not what other much is else. But what other else is there?" - ***** (OP hidden as that was too awful! Lol)

MurrayW

Easy fix, but an unusual one. You see, we have a second instance of SQL on the same DB Server, one where Helpdesk USED to be installed before we migrated it to fix a collation issue (after a Server 2000 -> Server 2005 migration)... and that's the cause of the issue. This "other instance", for some unknown reason, had caused a hang in the Microsoft Full-Text Search engine (MSFTESQL.EXE process). While all looked good, one of those processes was showing a usage of 111MB in Task Manager. When we took a guess and stopped the "OldAltiris" instance FTE Service, the big process disappeared from Task Manager... though that should have been isolated from the new DB instance... so that's pretty weird.

What we found after this, was that the new DB instance FTE service could NOT be restarted. It hung on "Stopping" and required a restart to fix it. So it appears that the FTE processes, which SHOULD be isolated, are actually working together... and a crash on one instance had effectively killed the HD instance FTE Service. After fixing this, everything went back to normal. 
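
For anyone wanting to check for the same symptom without eyeballing Task Manager, here is a rough Python sketch that asks the Windows `tasklist` command for all MSFTESQL.EXE processes and flags any using more memory than a chosen limit. Treat it as a sketch under assumptions: the function names and the 100MB limit are made up for this example (we only know that 111MB looked suspicious on our box), and the "Mem Usage" column name assumes an English-language Windows install.

```python
import csv
import io
import subprocess


def query_tasklist(image_name):
    """Run tasklist (Windows only) for one image name and return its CSV output."""
    completed = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


def parse_tasklist_csv(text):
    """Return a list of (pid, memory_kb) tuples parsed from tasklist CSV output."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        mem = row.get("Mem Usage")  # assumption: English column header
        if mem is None:
            continue  # e.g. the "INFO: No tasks are running..." message
        # Values look like "113,664 K"; keep only the digits.
        digits = "".join(ch for ch in mem if ch.isdigit())
        rows.append((int(row["PID"]), int(digits)))
    return rows


def oversized(rows, limit_kb):
    """Return only the (pid, memory_kb) entries above limit_kb."""
    return [entry for entry in rows if entry[1] > limit_kb]
```

On the DB server, `oversized(parse_tasklist_csv(query_tasklist("msftesql.exe")), 100 * 1024)` would list any full-text process above roughly 100MB. A high number is only a hint, since the hang itself cannot be detected this way.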

We now have all the customisations back in place, and it's all working as well as it did last year (which is still not as quick as it should be). Grrrr. That took longer to fix than it should have.
