
Add alerting for CPU and disk utilization

Updated: 06 Nov 2009 | 5 comments
jmock
16 Agree, 1 Disagree (+15, 17 Votes)
Status: In Review

We have had several instances where our scanners were inoperable without any notification. In all cases either the CPU was at 100% or the root file system was 100% utilized. Threshold monitoring of these two items would be greatly appreciated.

Comments

Cricket17, 18 Aug 2009

Have you looked at using SNMP monitoring?

We use SNMP and feed the data to our alerting systems. We are monitoring outbound/deferred queues, memory, load average (we can't find CPU; the SNMP counter doesn't seem to be updated), disk usage (root, data, opt), and network traffic.


Ian McShane, 06 Nov 2009

Reviewed.

Thanks, this request has been noted.

Please do continue to add further details as necessary. 

Cricket17, 02 Feb 2010

See Case 411-245-831

We have been monitoring ssCPURaw[Nice|System|User|Idle]. It appears you have integer overflow on these gauges.

To correctly plot the various ssCPURaw values, you need to compute the difference between the current and previous reading of each one, sum these differences, and then divide each difference by the sum to get the % utilization for each gauge.
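For anyone wanting to try this, here is a rough sketch of that calculation in Python. It is only an illustration: the function name, the state names used as dictionary keys, and the sample numbers are all made up, and it assumes the values behave like ordinary 32-bit SNMP counters that wrap from 2^32-1 back to zero. How you actually fetch the values from the appliance is left out.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit counter wraps from 2^32 - 1 back to 0


def cpu_percentages(previous, current):
    """Return percent utilization per CPU state between two polls.

    `previous` and `current` map a state name (for example "user") to the
    raw tick value read at each poll.
    """
    # Modular subtraction yields the correct delta across a single wrap,
    # provided the counter really does wrap to zero.
    deltas = {
        state: (current[state] - previous[state]) % COUNTER32_MODULUS
        for state in current
    }
    total = sum(deltas.values())
    if total == 0:
        # No ticks elapsed between the polls; avoid dividing by zero.
        return {state: 0.0 for state in deltas}
    return {state: 100.0 * delta / total for state, delta in deltas.items()}


# Hypothetical readings; the idle value wraps between the two polls.
previous = {"nice": 1_000, "system": 50_000, "user": 200_000, "idle": 4_294_967_000}
current = {"nice": 1_000, "system": 50_200, "user": 200_600, "idle": 904}
usage = cpu_percentages(previous, current)
print(usage)
```

If a value sticks at its maximum instead of wrapping, every delta for that state becomes zero and the percentages will be wrong, so this sketch cannot work around a counter that fails to wrap.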

It appears that when the counter gets to 2^32-1 it doesn't correctly wrap to zero. Restarting the SNMP service doesn't fix this, but rebooting the box does. Also, Tech Support doesn't want to support this monitoring since the on-box MIBs don't include these counters. However, they are documented in document # 2008010311490354, "SNMP OIDs and description that can be queried on Symantec Brightmail Gateway appliances".

AdnanH, 02 Feb 2010

How long had your system been up before the counter reached its max value?

Can you not use ssCpuUser, ssCpuSystem, ssCpuIdle instead of their raw counterparts?  I guess these are not prone to overflow.

Regards,

Adnan

Cricket17, 24 Mar 2010

Case # 411-245-831 is now scheduled for the next 9.x release. It turned out the problem only shows up after 62 days of uptime (on an eight-core processor).
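As a sanity check on the 62-day figure, here is a back-of-the-envelope calculation in Python. It rests on assumptions that are not confirmed anywhere in this thread: that each core adds 100 ticks per second (a common Linux default), that the value sums ticks across all eight cores, and that a single state such as idle receives nearly all of them.

```python
COUNTER32_MODULUS = 2 ** 32        # number of ticks a 32-bit counter holds before wrapping
TICKS_PER_SECOND_PER_CORE = 100    # assumption: common Linux tick rate
CORES = 8                          # from the post: eight-core processor
SECONDS_PER_DAY = 86_400

# Assumption: one state (for example idle) accumulates almost every tick.
ticks_per_second = TICKS_PER_SECOND_PER_CORE * CORES
days_to_wrap = COUNTER32_MODULUS / ticks_per_second / SECONDS_PER_DAY
print(f"{days_to_wrap:.1f} days")
```

Under those assumptions the result comes out at a little over 62 days, which lines up with the uptime reported above.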