Endpoint Protection


SNAC leads to BSODS


  • 1.  SNAC leads to BSODS

    Posted Apr 23, 2009 03:09 PM
    Hi everyone,

    A couple of weeks ago I started up SNAC on a test SEPM and tested the functionality without any problems. This week I enabled SNAC on my production SEPMs. A couple of hours later, our helpdesk got flooded with calls about PCs getting BSODs. Once a PC got a BSOD, it would continue to BSOD after a reboot.

    I did not roll out a host integrity policy. However, clients started their Symantec Network Access Control service, and many became unusable. I called support, and as a workaround I removed the snac.xml file from the license directory and restarted the SEPM service. In some cases, machines were fine after they retrieved a new policy. On the whole, however, a large number of my computers continue to get BSODs even with an updated policy and the SNAC service no longer running on the client.

    Clients are running MR2. We tried uninstalling and reinstalling MR2. We tried CleanWipe. I also tried upgrading a client to MR4 MP1a. None of these steps has resolved the issue. Has anyone else experienced such a problem?

    To anyone reading this: if you are considering SNAC, do very thorough testing before rolling it out. Also, if you enable SNAC on a SEPM, clients communicating with that SEPM will start their SNAC service regardless of whether you have assigned a host integrity policy.

    Bob


  • 2.  RE: SNAC leads to BSODS

    Posted Apr 23, 2009 03:31 PM
    Are you running Windows Vista Service Pack 1? If so, there is a known problem: the Wgx(64).sys driver that shipped with SEP 11.0 and SNAC 11.0 is not compatible with Vista SP1. However, this issue was supposed to be fixed in MR2, and since you said the upgrade to MR4 didn't help, I am not sure this is the problem. Post whether you are running Vista SP1, and if you are, we might be able to troubleshoot the issue.



  • 3.  RE: SNAC leads to BSODS

    Posted Apr 23, 2009 05:21 PM
    Hi, the clients are all running XP. I also have a case open, 281-571-536. Right now I have run sylinkdrop on one client so it is communicating with a test SEPM that never had SNAC installed. Waiting to see if that helps. The frustrating thing is I cannot find a pattern yet.


  • 4.  RE: SNAC leads to BSODS

    Posted Apr 23, 2009 05:24 PM
    It's good you have a case open. I will keep trying to brainstorm on this end as well, but if you guys come up with a solution, it would be helpful if you could post it here too. Thanks.


  • 5.  RE: SNAC leads to BSODS

    Posted Apr 23, 2009 06:01 PM
    BSODs usually appear when there is a driver or hardware problem. Can you post the System event log entry for the BSOD?


  • 6.  RE: SNAC leads to BSODS

    Posted Apr 23, 2009 07:02 PM
    This is the event in the log:

    Error 4/23/2009 5:43:27 PM System Error (102) 1003 N/A WXPTAPCGMP771


  • 7.  RE: SNAC leads to BSODS

    Posted Apr 23, 2009 07:03 PM
    Hi, please post the full log. What you posted is just the entry header; please post the description as well.


  • 8.  RE: SNAC leads to BSODS

    Posted Apr 26, 2009 09:36 AM
    Hi everyone,

    Today is the fifth day we have had the problem. It appears to be a problem with a file called wgx.sys. The machines are BSODing on fw.sys, which is a Checkpoint SecureClient file. At this moment, my case is escalated to the highest level and developers are working on the problem. They have given me one fix that replaces the wgx.sys file, and that did fix the BSOD on 2 test machines. However, it broke wifi permanently on one of the two machines. They also keep telling me to turn off SNAC on the back end and force a policy update, but I did that 5 days ago. I even tried using sylinkdrop.exe to migrate impacted hosts to a SEPM that never had SNAC. Symantec is telling me that we need to make sure that the Symantec Network Access Control service is not running. I am pretty sure we have made this happen, but we still get BSODs. I will provide more info when time permits and as I progress towards a solution.

    Bob


  • 9.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 02:50 AM
    If you don't mind, can you also post the dump here for research purposes?

    Thanking you.



  • 10.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 08:10 AM
    We aren't running the SNAC bit, but we noticed the same thing on a handful of computers and the one thing they all had in common was XP, SP2 or SP3, but most of all, the offending machines were running Cisco VPN software.
    There was something about the VPN software and SEP installs - we ended up imaging the computers and installing the software locally and not over a network and all was fine again.
    Is there something about your machines that might have interfered with a clean installation of SEP/SNAC?


  • 11.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 10:37 AM
    The initial install of SEP did not create BSODs.  We didn't see BSODs for the 8 months we were running SEP until snac was enabled on the SEPMs.

    So we think we have a solution.  We are running a script that does the following:

    smc -stop
    net stop wgx
    ren c:\windows\system32\drivers\wgx.sys wgx.back
    smc -start

    Then reboot. This works if we can get into a machine. Often the BSODs occur as soon as the wifi card starts and associates. If the machine can get a connection online, we can push the script to it. If a user cannot log in at all, we found that disabling the wifi in the BIOS first and then having the user log in lets us run the script on the machine.
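The four-command workaround above can also be wrapped so that it is previewed before it touches a machine. The following is only a sketch, not an official or supported tool: the function names `workaround_commands` and `run_workaround` are hypothetical, and it assumes, as the original script does, that `smc` is reachable on the PATH and that the driver lives at the path given in the post.

```python
import subprocess
from pathlib import PureWindowsPath

# Driver path as given in the script above; adjust if Windows lives elsewhere.
DRIVER = PureWindowsPath(r"c:\windows\system32\drivers\wgx.sys")


def workaround_commands(driver=DRIVER, backup_name="wgx.back"):
    """Return the four workaround steps, in order, as argument lists."""
    return [
        ["smc", "-stop"],                                # stop the SEP client
        ["net", "stop", "wgx"],                          # stop the wgx driver
        ["cmd", "/c", "ren", str(driver), backup_name],  # ren is a cmd built-in
        ["smc", "-start"],                               # restart the SEP client
    ]


def run_workaround(dry_run=True):
    """Print each step; execute it only when dry_run is False.

    Returns a list of (command string, exit code or None) pairs.
    """
    results = []
    for args in workaround_commands():
        line = " ".join(args)
        print(line)
        code = None
        if not dry_run:
            code = subprocess.run(args).returncode
        results.append((line, code))
    return results


if __name__ == "__main__":
    # Preview only; nothing is executed until dry_run=False is passed.
    run_workaround(dry_run=True)
```

With `dry_run=False` the returned exit codes show which step failed, for example when `net stop wgx` reports that the driver was not running. A reboot is still needed afterwards, as in the original script.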

    SEP is supposed to remove or disable the wgx.sys when the snac.xml file is removed (and sepm service restarted on sepms), however we did not see this behavior.  So disabling snac seems to be insufficient to stop the bsods. 

    Note that our standard was still mr2, and symantec says mr4 mp1a fixes this.  I have heard we still got bsods on mr4 mp1a, but I have not witnessed this myself. 


  • 12.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 12:09 PM
    Hi Bob, please post the dump, it will definitely help in troubleshooting if SNAC component is having compatibility problems.


  • 13.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 12:14 PM
    Paul, Symantec support has the dump already and has done the analysis. It was confirmed that fw.sys crashed due to a compatibility problem with wgx.sys. It has also been confirmed that rolling back SNAC does not effectively deactivate wgx on clients, so we had to engineer our own solution with the help of Symantec support. The ultimate solution is to install MR4 MP1a, but this was not immediately possible: pushing a 120MB install to hundreds of machines was going to take too much time. Disabling the wgx.sys file is very quick and seems to stop the BSODs. It is not an officially supported solution, but it should get us out of the woods so we can distribute MR4 MP1a on our own terms.


  • 14.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 12:22 PM
    Hi Bob, fw.sys is the SecuRemote Miniport VPN Service, am I correct? Can you disable it?


  • 15.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 04:34 PM
    Paul,

    We do not want to disable this. 

    So we have addressed a few dozen PCs today that had the BSOD issue, by stopping smc and wgx, renaming wgx.sys and then restarting smc. So far, no repeats of the BSOD issue and no reports of negative impact. It is also possible that some machines resolved themselves after I did a database restore to a backup version taken before SNAC was installed. I can't prove that; this is just a theory.

    In conclusion, upgrading MR2 MP1 to MR4 MP1a fixed the issue but is time consuming, especially for remote users. Renaming wgx.sys seems to have fixed the issue and it was very quick. We are waiting to confirm this when people take their systems home and use wireless, since wireless connections seem to be a common factor.

    Also, Symantec and my company agree that the issue is a conflict between wgx.sys and fw.sys. Based on what I have seen, the problem occurred on MR2 with both NGX R56 and NGX R60. Upgrading NGX did not fix the problem. I also believe that MR4 MP1a on either version of NGX fixed the problem.

    If you are running SEP and have not rolled out SNAC yet, make sure you do very solid testing. Build a test SEPM, install SNAC on it and not on production, migrate test hosts to the test SEPM (we use sylinkdrop) and test every version of the SEP client in your environment. I advise you not to bother with SNAC on MR2. I acknowledge that my issue was related to the presence of SecureClient. Also, make sure you back up your databases before installing SNAC on SEPMs. Finally, the mere presence of SNAC on SEPMs will make your clients start the SNAC service, even without a host integrity policy.

    I will try to report more tomorrow morning.  The team thinks we are basically out of the woods.

    Bob


  • 16.  RE: SNAC leads to BSODS

    Posted Apr 27, 2009 07:58 PM
    Hi Bob, just curious: is this happening on all workstations? I haven't tried SNAC before but am planning on enabling it.


  • 17.  RE: SNAC leads to BSODS

    Posted Apr 28, 2009 02:44 AM
    Doesn't a BSOD typically mean a problem with hardware or drivers? I would tend to think so. Very weird to see this happening because you enabled SNAC. While we have not tested SNAC, I think we will do so now to ensure that we don't face similar issues.


  • 18.  RE: SNAC leads to BSODS

    Posted Apr 28, 2009 12:26 PM
    Through testing, Engineering and Support found that the BSOD between Checkpoint and SNAC was still occurring because the wgx.sys driver was still running, even after SNAC had been stopped. We have provided the steps below so the customer can confirm that the wgx.sys driver no longer loads after boot, even though SNAC.exe is no longer starting. Our findings will be implemented in an upcoming release. More details TBD.

    On the SEPM:
    1. rename SNAC.xml in tomcat\etc\license to SNAC.xml.bak
    2. restart SEPM service
    3. make change to Policy (ex. change message snooze time from 3min to 4 min)

    On the Client:
    1. Have the client update its policy
    2. SNAC service should change from automatic 'started' to manual 'not started'
    3. Reboot the client PC into Safe Mode
    4. Go to C:\Windows\System32\Drivers and rename ‘wgx.sys’ to ‘wgx.bak’
    5. Reboot PC

    Steps to check and confirm:
    1. on command prompt type ‘sc query wgx’
    2. Confirm the state is ‘STOPPED’
    3. On command prompt type ‘sc query snac’
    4. Confirm state is ‘STOPPED’
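The confirmation steps above can be automated by reading the STATE line from `sc query`. This is a rough sketch under stated assumptions: the helper names `parse_sc_state` and `service_is_stopped` are hypothetical, and the parsing assumes the English-language output layout of `sc query`, where the state appears as a numeric code followed by a name such as STOPPED or RUNNING.

```python
import re
import subprocess


def parse_sc_state(output):
    """Extract the state name (e.g. 'STOPPED') from `sc query` output.

    Returns None when no STATE line is present, for example when the
    service does not exist and sc prints an error instead.
    """
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", output, re.MULTILINE)
    return match.group(1) if match else None


def service_is_stopped(name):
    """Run `sc query <name>` (Windows only) and report whether it is STOPPED."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_sc_state(result.stdout) == "STOPPED"


if __name__ == "__main__":
    # Sample text in the layout sc query prints; used so the demo runs anywhere.
    sample = (
        "SERVICE_NAME: wgx\n"
        "        TYPE               : 1  KERNEL_DRIVER\n"
        "        STATE              : 1  STOPPED\n"
        "        WIN32_EXIT_CODE    : 0  (0x0)\n"
    )
    print(parse_sc_state(sample))
```

On an affected client, calling `service_is_stopped("wgx")` and `service_is_stopped("snac")` after the final reboot would cover both checks from the list.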


  • 19.  RE: SNAC leads to BSODS

    Posted Apr 28, 2009 12:26 PM
    see comments below


  • 20.  RE: SNAC leads to BSODS

    Posted Apr 28, 2009 12:38 PM
    Hi ESS, is this happening on all types of PCs, or just specific models?


  • 21.  RE: SNAC leads to BSODS

    Posted Apr 28, 2009 12:47 PM
    Hi Paul,

    We were unable to reproduce the problem in-house. Through dump analysis and testing with removal of SNAC through the SEPM (along with strenuous weekend man hours), we were able to come up with a successful workaround. We are still investigating why WGX.sys keeps running even though SNAC.exe is no longer running. The customer noticed the issue on 2 separate makes, as well as 2 separate models, respectively. If you notice the same behavior, this would be the first direction you should take prior to exercising other options. The full dump was key in understanding the customer issue in this regard.


  • 22.  RE: SNAC leads to BSODS
    Best Answer

    Posted Apr 30, 2009 10:06 AM
    I agree with this. The problem with the situation was that we suddenly had 500 users calling in with problems, so we were scrambling to get users working. Also, our machines only do mini dumps by default and support needed a full dump, so when I initially provided the dump, I only gave the mini dump. A full dump generates a very large file, which takes time to generate. Then you have to upload the file (in our case a 2GB file up to Symantec). I have also tried to engage CheckPoint about this, but I have not been given much help. They also want the dump, so I am trying to get that to them. I am also confirming that it is OK for me to give Symantec the Checkpoint software.
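Since the mini dump versus full dump setting cost time here, it can help to check what a machine is configured to write before a crash happens. Windows stores the setting in the `CrashDumpEnabled` value under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl`. The sketch below only reads and describes that value; the function names are hypothetical, and the registry read works on Windows only.

```python
import sys

# Meaning of the CrashDumpEnabled registry value on Windows.
CRASH_DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
}


def describe_dump_setting(value):
    """Translate a CrashDumpEnabled value into a readable description."""
    return CRASH_DUMP_TYPES.get(value, f"unknown ({value})")


def read_dump_setting():
    """Read CrashDumpEnabled from the registry; Windows only."""
    import winreg  # imported here so the module still loads on other systems

    key_path = r"SYSTEM\CurrentControlSet\Control\CrashControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    print(describe_dump_setting(read_dump_setting()))
```

A value of 3 matches the mini dump default described above, while 1 is the complete dump that support asked for.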

    About a day or two ago, we distributed the workaround that renames the wgx.sys file, and so far we have had 100% success with this solution.  We have not had anyone call back with a bsod if the wgx.sys was renamed or if we upgraded the client to MR4 MP1a.  I am now working on a plan to distribute MR4 MP1a out to the whole user population.  

    I will add that I also have servers managed by the same sepm.  Not one reported a bsod.  In fact, we have not found a single case where the bsod occurred on a machine without secureclient installed. 


  • 23.  RE: SNAC leads to BSODS

    Posted May 05, 2009 11:42 AM
    We rolled out the wgx.sys workaround to the whole company.  We have not had a report that this did not work.  Note that the tricky part of this solution is, one may have to disable the wireless card (hardware switch or bios) to get Windows to even run.  The best solution is to deploy MR4 MP1a, but the wgx.sys workaround allowed the help desk to get people working again.