Clients disconnecting/reconnecting constantly
Updated: 21 May 2010 | 34 comments
We made some IP address changes at our co-location in preparation for a company-wide change. The IP address of our SEP server changed, but all DNS entries have been updated and the majority of our computers connect just fine. The problem is that a lot of them will connect/disconnect/connect/disconnect nonstop, all day long. I have found no problems with the settings. IP addresses are OK, and sylink.xml lists the correct server name and IP address. Stopping and restarting the Symantec IIS web server seemed to let the clients connect, but that only lasts about 15 minutes before they start acting up again.
Any suggestions on where to look next?
Comments
What server?
What is the server information (OS, 32- or 64-bit)?
How do the computers that disconnect connect? Are they on a different subnet/IP range?
Have any other changes been made?
Version of SEPM?
Server 2003, 32bit
They are on different subnets and IP range. (colo is 10.129.x.x now, rest of company is still 192.168.x.x)
Only other change has been upgrading management server to 11.0.5. Clients still running SEP version 11.0.4202 have the same issue.
IIS Logs might have something to say..
VMWARE-- SEP 12.1 vs McAfee vs Trend Micro
Where are the logs located?
Start -run - logfiles
or WINDOWS\system32\Logfiles
Before that, I would suggest enabling logging on secars:
In IIS Manager, expand the SEPM website, right-click secars, choose Properties, and check Log Visits.
Then in the log files directory
W3SVC1 ( for default website ) or W3SVCxxxxxx
there will be a dated file inside this folder.
http://service1.symantec.com/SUPPORT/ent-security.nsf/docid/2007090612034148
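Once logging is enabled, the dated log file can be scanned for the status codes returned to secars.dll requests. The following is a minimal Python sketch, not a Symantec tool. It assumes the log is in W3C extended format with a "#Fields:" header and that the cs-uri-stem and sc-status fields are being logged; the function name and the sample lines are made up for illustration.

```python
from collections import Counter


def secars_status_counts(log_lines):
    """Count HTTP status codes for requests whose URI ends in secars.dll.

    Assumes W3C extended log format: a '#Fields:' header names the columns,
    and each data line is space-separated in that order.
    """
    fields = []
    counts = Counter()
    for raw in log_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # Column names can change mid-file, so re-read them each time.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not fields:
            continue
        parts = line.split()
        if len(parts) != len(fields):
            continue  # skip malformed lines rather than guess
        row = dict(zip(fields, parts))
        if row.get("cs-uri-stem", "").lower().endswith("secars.dll"):
            counts[row.get("sc-status", "?")] += 1
    return counts


# Invented sample lines, only to show the expected shape of the input.
sample = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2009-10-22 15:54:12 192.168.1.20 GET /secars/secars.dll 200",
    "2009-10-22 15:55:40 192.168.1.21 GET /secars/secars.dll 500",
    "2009-10-22 15:56:02 192.168.1.20 GET /iisstart.htm 200",
]
print(secars_status_counts(sample))
```

Anything other than 200 in the result (401, 500 and so on) is worth a closer look at the matching lines.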
Done.
Should I be looking for any particular error messages?
htp://........./secars.dll SMC 401 xx or 500 etc
Only thing I see after the IP address of a machine I'm assuming is connecting is "Smc 200 0 1236"
Hmm... 200 means communication is fine, but I think this is the log entry from when the client gets connected.
Any errors in exsecars.log ??
Program files\Sym...\Symantec Endpoint Protection Manager\data\inbox\log\exsecars.log
Multiple Servers or Locations
Do you have multiple SEPM servers? Or multiple locations in a policy?
I have seen specific setups cause "endless server/location switching". Usually it's caused by two SEPM servers that have different policies because they haven't replicated -- but it can happen for other reasons.
For additional information I would start with the exSecars.log (mentioned by Vikram) and the Sylink.log file on the client.
To enable the Sylink.log on the client, see:
http://service1.symantec.com/SUPPORT/ent-security.nsf/docid/2008041812561948
To enable the exSecars detailed logging, set this registry key:
[HKEY_LOCAL_MACHINE\SOFTWARE\Symantec\Symantec Endpoint Protection\SEPM]
"DebugLevel"="4"
And then restart IIS.
There is only one SEPM server but there used to be two. I removed the secondary server some time ago but for some reason, the old server still shows a certificate or something in the sylink.xml file on all machines. Could that be messing with the policies? I have several locations set up by gateway addresses.
Also, I can ping the SEPM server (by fqdn and ip address) and even telnet to it using the default port 8014. When I ran the SEP support tool, everything passed except it couldn't contact the SEPM server but for no obvious reason.
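The telnet check described here can be repeated from a script, which is handy when testing from several subnets. This is a minimal Python sketch using only the standard socket module; the function name is made up and the default port 8014 is the one mentioned above. Like telnet, it only shows that the TCP port accepts a connection. It does not show that the SEPM answers client requests correctly.

```python
import socket


def can_connect(host, port=8014, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Example use, with a placeholder server name: can_connect("sepm.example.local").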
Location Specific connection settings
Check if you have a location-specific connection setting in the policy. This can be a problem if you have multiple locations in a policy.
To check, Open Clients button --> Policy Tab.
At the bottom of each of your locations, there should be a blue bar with a + symbol that says "Location-specific connection settings:"
Open the plus symbol.
If the setting is Group-Push, or Group-Pull, it's fine. Move to the next location.
If you find a location that is in Local-Pull, Local-Push, or Standalone, then this could be a problem and you'll want to continue to investigate.
What we are looking for is a condition that makes the client switch to, say, Location B -- but once the client switches to that location some condition changes (such as the server connection) which makes the client unqualified to be in Location B. So it switches immediately back to Location A -- which qualifies it for Location B, etc. etc.
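To make the loop concrete, here is a toy Python model of that flip-flop. It is not how SEP evaluates locations internally. It assumes two hypothetical locations, where Location B's switching rule is "the client can reach the server" and Location B's own connection setting stops the client from reaching the server.

```python
def simulate_location_switching(steps=4):
    """Toy model: return the sequence of locations the client passes through."""
    # Hypothetical settings: in A the client talks to the server, in B it does not.
    connected_in = {"A": True, "B": False}
    location = "A"
    history = [location]
    for _ in range(steps):
        connected = connected_in[location]
        # Hypothetical rule: a connected client qualifies for B, otherwise A.
        location = "B" if connected else "A"
        history.append(location)
    return history


print(simulate_location_switching())
```

The client never settles: each location's setting disqualifies it from staying there.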
You can switch the group from "Local" to "Group" by clicking the blue "Task >>" link to the left and then clicking the communication option.
See if that fixes the issue. If you do want a Local communication setting, perhaps it would be best to set them all to Push and then change the ones you want on Local one at a time to isolate which location causes the issue. Once you figure that out you can adjust the location switching rules for that location to fix the issue.
I'd put a screen shot in but I'm not near a server at the moment!
If all of the above does nothing for you, then it's time to start looking at the Sylink and exSecars log.
And about the 2nd certificate in the Sylink.xml file -- don't worry about it. It's left there "in case" you turn the other server back on. But it does not in and of itself contain a policy or in any way affect communication.
All of the location specific settings are correct. None are set to Local.
Down to the bottom
Then I'd say it's time to get those Sylink and exSecars logs and see what's happening.
As mentioned before,
For additional information I would start with the exSecars.log (mentioned by Vikram) and the Sylink.log file on the client.
To enable the Sylink.log on the client, see:
http://service1.symantec.com/SUPPORT/ent-security.nsf/docid/2008041812561948
To enable the exSecars detailed logging, set this registry key:
[HKEY_LOCAL_MACHINE\SOFTWARE\Symantec\Symantec Endpoint Protection\SEPM]
"DebugLevel"="4"
Open %SEPM Install Dir%\tomcat\etc\conf.properties
Add this line:
scm.log.loglevel=FINE
Save the file.
And then restart IIS.
You find the exSecars log file at %SEPM Install Dir%\data\inbox\log\exsecars.log
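If the conf.properties change has to be made more than once, the "add this line" step can be scripted. This is a minimal Python sketch that works on the file's text and is not a Symantec utility. It assumes the file uses plain key=value lines; the function name is made up. Reading and writing the actual file, and backing it up first, is left to the caller.

```python
def set_property(text, key, value):
    """Return properties text with key set to value (replaced or appended)."""
    lines = text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # leave comments and non key=value lines alone
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line  # key already present: replace its value
            break
    else:
        lines.append(new_line)  # key not found: append it
    return "\n".join(lines) + "\n"


print(set_property("example.key=1\n", "scm.log.loglevel", "FINE"))
```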
Here's what I got right after enabling the sylink.log in the registry.
10/22 15:54:12 [3644] ~~~Sylink log started. (SEP Product Version in registry: 11.0.5002.333, Sylink File Version: 11.0.5002.301)
http://service1.symantec.com/support/ent-security.nsf/854fa02b4f5013678825731a007d06af/012986c18a2387b88825750c007870b9?OpenDocument
The way it was installed
If Vikram is right, then this may happen because you automate the installation in such a way that the product is installed as the System user.
I don't recognize this error right away, but it appears to be saying, "I'm not even able to put data on the network -- something goes wrong before I send my first packet out."
Can you be a bit more specific about "you automate installation in such a way that the product is installed as the System user"?
I have had a steady green dot since turning on my laptop this morning at 8am. In the system log it has not shown a single disconnection which is a bit unusual since it typically flickers quite often or just stays disconnected.
I will keep an eye on it and post a section from the sylink.log file if it disconnects again today. I can email the entire log since starting it yesterday to any of you to dissect if you would like.
Vikram, I checked the reg key from that article and there is not a GlobalUserOffline value.
Here's some more of the log. The green dot disappeared a little after 12pm and has not come back online yet.
Sylink: first look
Hi,
Taking a look at your Sylink file I see:
1) You had a successful connection to the SEPM server in Pull mode at 12:06. The heartbeat is set to half-an-hour so your next beat would be around 12:36.
2) There was a failure attempting to communicate with the GUP server to get content.
3) Even though there is a failure to communicate with the GUP server, communication with the SEPM server is working just fine.
So, this brings up the question to me, "Does a GUP communication failure cause the green light to go out?" It should not. But here's how you can test it.
If the green light goes out, and you click "Update Policy", does it come right back on?
If yes, then the GUP failure may be causing the light to go out. (Need to match Secars log and light on/off event to be sure)
If no, then perhaps this log does not capture the moment your light went out. Did the light go out after 12:06 but before 12:26? Do you know within 10 minutes or so when the light went out?
What I mean by an automated installation:
Some companies use channels such as SMS to install the product onto client computers. They can do this even if no 'user' is logged into the system. Under these conditions it's possible that the automation launches as the "SYSTEM" user instead of an administrative user -- which could cause the installer to behave abnormally. This is just a 'possibility'; I don't see any evidence for it in this forum.
P.S. It looks better if you can post the log as an attachment, though that's not a requirement.
A little out of the box, but if you look at the management server list applied to your group, is your retired management server still listed? If so, were you configured for failover or load balancing? I have seen situations where retired servers remaining in a server list have caused "green dot" issues similar to yours.
That doesn't sound "out-of-the-box." Out-of-the-box thinking would be more akin to asking if the server rack was installed up-side-down or something like that. :-)
Eric C. Lukens IT Security Policy and Risk Assessment Analyst University of Northern Iowa
I will post the log from my laptop tomorrow.
What doesn't make sense is that the client on my laptop stays connected all morning but will disconnect after noon. Some times it happens around 1pm, a couple days ago it disconnected around 4:30pm and wouldn't reconnect until the next morning.
This part of the log has me wondering what's going on
CInternetException: <IndexHeartbeatProc>: The system cannot find the file specified.
What file is it trying to open, and what does that error code mean? How did it fail to open internet?
Could you try communication without HTTPS/certificate, if you have used one at some point?
I've seen clients connecting/disconnecting/connecting/disconnecting when there's been a certificate error and the "Verify Server Certificate" box was checked. You might not have that anymore on your management server list, but if it was there once, clients might have an old sylink.xml file.
Not using HTTPS
The sylink log shows that HTTPS, or SSL, is not being used.
OK, I didn't read the log; I just noticed the earlier message saying:
"There is only one SEPM server but there used to be two. I removed the secondary server some time ago but for some reason, the old server still shows a certificate or something in the sylink.xml file on all machines"
I checked the properties of the clients designated as GUPs in the console and on the General tab, the last entry labeled "Group Update Provider" was showing False for all of the GUPs I assigned. Could that have been because I upgraded the management server to MR5 but didn't update the clients on the GUPs to the latest version as well? No clue but after installing the latest SEP client on them, GUP in the client properties shows "True" now. That may be a separate issue but hopefully that's what has been causing this problem.
The MR4 clients did not have the "Tell SEPM my GUP status" feature. So that's why MR4 clients show False but MR5 clients will show True.
Hopefully that's not related to your current issue. If you are not aware of the new GUP features in MR5, I think it would be worth your time to look into them.
The main thing we haven't seen yet is a Sylink log where you can say, "At x time in the Sylink log I'm showing now, the client's green light went out". That would be the most useful. Other than that, we are all still guessing here.
@JT_T
It's true, there is an application-level certificate present in the Sylink.xml file. And you can turn off the validation the clients use this certificate for. So your idea is correct. But in this case, a) I doubt that's the issue (but I don't know for sure) and b) if that was the issue, normally it's only the symptom of a larger issue. I'd prefer to see the larger issue first -- if there is one.
I looked through the log after the green dot shut off and it looked just like the parts of the log I already posted. There wasn't a section that plainly stated that it disconnected, just that on its next attempt it couldn't connect.
If you are in a state where the "Update Policy" does not make the green dot appear, then every time you say "Update Policy", there should be some activity, such as an error, in the Sylink.log.
You mentioned "on its next attempt it couldn't connect".
So a few things could be happening.
1) The server is returning an error code, like HTTP 500 -- this should be shown in the Sylink.log.
2) The server is not responding. This will be shown as a timeout. I don't think the Sylink.log file uses the word "timeout" -- but it does give an error, and you can see from the time delay that it is timing out. I think it would report this as HTTP 0. Not sure.
3) The client is having some internal error that prevents it from making the request -- we should also see something in the Sylink.log
If you have your client in a state where it won't connect, the best place to start looking is near the
<SendRegistrationRequest:> and <GetIndexFileRequest:> markers. This is where I expect to see the errors.
Your first Sylink.log posting shows an error near the SendRegistrationRequest marker. I would not expect the error to be near this point if the client is already connected and then disconnects abruptly. I really expect the error to be near a GetIndexFileRequest marker.
You should be able to see the error repeatedly by clicking "Update Policy".
I would recommend using a program such as BareTail ( http://www.baremetalsoft.com/baretail/ ) to watch your Sylink.log. It's a program that 'tails' your log file.
When the issue occurs, launch BareTail. After you have the Sylink.log opened in BareTail, click "Update Policy" on SEP.
You should immediately see the new text scroll by. The error should be located within this new content. If you could post the log from the last successful heartbeat to the failed attempt (probably about half-an-hour apart), that would be the most useful.
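For a log that has already been saved, the same markers can be pulled out with a few lines of Python instead of scrolling. This is a minimal sketch, not a Symantec tool. It only assumes that the marker strings appear literally in the log; the function name, the amount of context and the sample lines are made up for illustration.

```python
MARKERS = ("<SendRegistrationRequest:>", "<GetIndexFileRequest:>")


def marker_sections(log_lines, context=5):
    """Return (marker, lines) pairs: each marker line plus the lines after it."""
    lines = [line.rstrip("\r\n") for line in log_lines]
    sections = []
    for i, line in enumerate(lines):
        for marker in MARKERS:
            if marker in line:
                # Keep the marker line and up to `context` following lines.
                sections.append((marker, lines[i:i + 1 + context]))
                break
    return sections


# Invented sample lines, only to show the shape of the result.
sample_log = [
    "10/23 12:36:10 [3644] <GetIndexFileRequest:>",
    "10/23 12:36:10 [3644] example line following the marker",
    "10/23 12:36:11 [3644] another example line",
]
for marker, block in marker_sections(sample_log, context=1):
    print(marker)
    print("\n".join(block))
```

With a real file, pass the open file object, for example marker_sections(open("Sylink.log", errors="replace")), and look for errors in the lines after each marker.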
Well, the issue appears to have been resolved by installing the latest SEP client on the GUPs. Clients have been staying connected since without any issues.