Troubleshooting Correlation service issues and why the correlation engine sometimes stops creating new incidents

Article:TECH95134  |  Created: 2009-01-22  |  Updated: 2013-01-10  |  Article URL http://www.symantec.com/docs/TECH95134
Article Type
Technical Solution



Issue



Monitor how many open incidents and alerts you have in your database. Too many incidents or alerts slow down the system and can make correlation incorrect in some situations.

 


Error



Typically no errors are displayed in the console. However, the Incidents tab does not display new incidents, or the Web UI shows that some queues are full (a red bar in the event service).

Sometimes you might see one of the following error messages in icesvc.log:


2010-02-05 00:00:35,216 [DATA] ERROR com.symantec.sim.icesvc.MessageProcessor - Unable to process event: 153005 (log at depth 4)
com.ibm.db2.jcc.a.SqlException: Error for batch element #2: DB2 SQL error: SQLCODE: -530, SQLSTATE: 23503, SQLERRMC: SYMCMGMT.SYMC_SIM_EVENT.REFSYMC_SIM_CON178

or

2010-02-05 00:00:35,216 [DATA] ERROR com.symantec.sim.icesvc.MessageProcessor - Skipping logging beyond depth of 5 for error: Unable to process event: 153005

or

2010-02-05 00:00:35,232 [DATA] ERROR com.symantec.sim.icesvc.MessageProcessor - Unable to process a message
com.ibm.db2.jcc.a.sg: Non-atomic batch failure. The batch was submitted, but at least one exception occurred on an individual member of the batch. Use getNextException() to retrieve the exceptions for specific batched elements.
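
A quick way to check a log for the messages quoted above is to count the matching lines. The following is a minimal sketch, not a Symantec-supplied tool: the function name count_ice_errors is hypothetical, and the log path must be supplied by the caller because this article does not state where icesvc.log is stored.

```shell
#!/bin/sh
# Hypothetical helper: count lines in a log file that match the
# MessageProcessor errors quoted in this article.
# $1 = path to icesvc.log (the location is not given in this article,
#      so the caller must provide it).
count_ice_errors() {
    logfile="$1"
    # -c prints the number of matching lines; -E enables alternation.
    grep -c -E 'Unable to process (event|a message)|Non-atomic batch failure' "$logfile"
}
```

Call it as count_ice_errors followed by the path to icesvc.log. A result of 0 means none of the quoted messages were found, in which case the simcm.out file is the next place to look.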
 

If icesvc.log contains no errors about being unable to process events, look in the simcm.out file.  This file should be empty, but in some cases it contains the following error:

Exception in thread "main" java.lang.NoSuchMethodError: com.symantec.sim.cm.plugins.Assignee.<init>(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)V
     at com.symantec.sim.monitors.state.StateMonitor.init(StateMonitor.java:139)
     at com.symantec.sim.cm.svc.PluginManager$PluginEntry.load(PluginManager.java:421)
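
The check described above (simcm.out should be empty; a NoSuchMethodError for the Assignee plugin points to the damaged System State Monitor) can be sketched as a small shell function. The function name check_simcm_out and its output labels are hypothetical, and the path to simcm.out is an argument because this article does not give its location.

```shell
#!/bin/sh
# Hypothetical helper: classify the state of the simcm.out file.
# $1 = path to simcm.out (location not stated in this article).
# Prints one of: missing, empty, statemonitor-error, other-content.
check_simcm_out() {
    outfile="$1"
    if [ ! -e "$outfile" ]; then
        echo "missing"
    elif [ ! -s "$outfile" ]; then
        # -s is true only for files larger than zero bytes.
        echo "empty"
    elif grep -qF 'java.lang.NoSuchMethodError: com.symantec.sim.cm.plugins.Assignee' "$outfile"; then
        echo "statemonitor-error"
    else
        echo "other-content"
    fi
}
```

A result of statemonitor-error corresponds to the stack trace shown above; any other content in the file is outside the scope of this article.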

Cause



Too many open Incidents is one possible cause.

A sudden power loss that damages the System State Monitor is the known cause of the problem that writes the error to the simcm.out file.


Solution



Too Many Assets in the Database

Even though SSIM has an Assets component, it is not an Asset Management product.  The Assets component of SSIM is primarily used for Incident Prioritization, and only the most critical assets in the environment should be added to Assets in SSIM.

There is no hard number that counts as too many, because every environment is different.  The number of Incidents, vulnerabilities, and services listed per Asset varies, as do the number of events correlated per second and how frequently rules trigger on the SSIM.

Note: The following procedure allows you to keep your critical assets, but Incident, vulnerability, and service information is not maintained.

To quickly reduce the number of Assets in the SSIM:

  • Find the Assets you need to keep using the Group by and Filter functions and export those assets to CSV.
  • After the critical assets are exported, run the asset removal tool to remove all Assets from the SSIM.  For more information on the Asset Removal Tool, read the article "How to remove Assets without the SSIM Client".
  • After all Assets are removed, import the CSV file that contains the critical Assets you need to keep.

Too Many Incidents in the Database

Import and run the attached queries to see how many Incidents are in the SSIM:

  • All SSIM Incidents shows the count of all Incidents grouped by Rule.
  • Open SSIM Incidents shows the count of all currently open Incidents grouped by Rule.
  • Ref ID of last Inc in DB shows the Reference ID of the last Incident that has been written to the Database. It should match the Reference ID of the most recent Incident that displays in the SSIM Client.

Import and run these queries to see which rule creates the most conclusions and how many events are attached to Incidents:

  • Count of Conclusions by Rule shows the count of all conclusions grouped by Rule.
  • Count of Events by Rule Name shows the count of Events attached to Incidents grouped by Rule.

Damaged StateMonitor

Disable the System State Monitor in the Rules tile under System Monitors so that the SSIM processes the event flow again.

For further assistance, contact Symantec Support.

Technical Information

You may need to clear the ICE queues to remove any problem incidents.

Warning: Performing these steps causes Incident data loss.  The queue files you remove cannot be used later.

  1. Connect with PuTTY, or log in to the console.
  2. As root, run the commands:
     
    service icesvc stop
    service simserver stop
  3. Navigate to /opt/Symantec/simserver/queues/ice/input and delete all queue files with the command:

    rm -f *.queue
     
  4. Navigate to /opt/Symantec/simserver/queues/ice/output and delete all queue files with the command:

    rm -f *.queue
     
  5. Run the commands:

    service simserver start
    service icesvc start
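
Because the deletion in steps 3 and 4 cannot be undone, it can help to see how many queue files would be affected before running rm. The following is a minimal sketch under stated assumptions: the function name count_queue_files is hypothetical, and it relies on the -maxdepth option of find, which GNU and BSD find support but POSIX does not require. It only counts files and deletes nothing.

```shell
#!/bin/sh
# Hypothetical helper: count *.queue files directly inside a directory,
# without descending into subdirectories and without deleting anything.
# $1 = queue directory, for example one of the two ICE queue paths
#      named in the steps above.
count_queue_files() {
    dir="$1"
    # wc -l may pad its output with spaces on some systems; strip them.
    find "$dir" -maxdepth 1 -type f -name '*.queue' | wc -l | tr -d ' '
}
```

For example, after stopping the services in step 2, running count_queue_files /opt/Symantec/simserver/queues/ice/input and count_queue_files /opt/Symantec/simserver/queues/ice/output shows how many files each rm -f *.queue command would remove.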

Attachments

  • ICE.zip (3 kBytes): List of SQL queries that can be imported directly in the SSIM UI.
  • All SSIM Incidents.qml (1 kBytes): Shows the count of all Incidents grouped by Rule.
  • Open SSIM Incidents.qml (1 kBytes): Shows the count of all currently open Incidents grouped by Rule.
  • Ref ID of last Inc in DB.qml (1 kBytes): Shows the Reference ID of the last Incident that has been written to the Database. It should match the Reference ID of the most recent Incident that displays in the SSIM Client.
  • Count of Conclusions by Rule.qml (1 kBytes): Shows the count of all conclusions grouped by Rule.
  • Count of Events by Rule Name.qml (1 kBytes): Shows the count of Events attached to Incidents grouped by Rule.


Legacy ID



2009072208080554



