Taskinstances table growing rapidly
Article: TECH144662 | Created: 2010-11-19 | Updated: 2012-10-09 | Article URL: http://www.symantec.com/docs/TECH144662
Problem
The database is growing fast and all of the task tables are growing. In the log viewer, the cleanup task repeatedly logs the following error:
Error deleting: Altiris.TaskManagement.Data.TaskExecutionInstanceNotFoundException: Unable to find task instance 00000000-0000-0000-0000-000000000000 in the database.
The above error appears several times a second, filling the logs. It almost always appears together with two other errors, one of which refers to a deadlock. However, the SQL Server shows no locking at all.
Error
Process: AtrsHost (2068)
Module: AtrsHost.exe
Source: Altiris.TaskManagement.Data.AltirisSqlHelper.RepeatForDeadlocks
Description: AltirisSqlHelper.RepeatForDeadlocks(): Non-deadlock exception: Altiris.TaskManagement.Data.TaskExecutionInstanceNotFoundException: Unable to find task instance 00000000-0000-0000-0000-000000000000 in the database.
at Altiris.TaskManagement.Data.TaskExecutionInstance.GetTaskInstance(TaskInstanceGuid taskInstanceGuid)
at Altiris.TaskManagement.Data.TaskExecutionInstance.<>c__DisplayClass2.<DeleteTaskInstance>b__1(DatabaseContext ctx, Object state)
at Altiris.TaskManagement.Data.AltirisSqlHelper.RepeatForDeadlocks(Int32 retries, Int32 sleep, Object state, RepeatForDeadlocksDelegate func)
Process: AtrsHost (2068)
Module: AtrsHost.exe
Source: Altiris.TaskManagement.Data.AltirisSqlHelper.RepeatForDeadlocks
Description: AltirisSqlHelper.RepeatForDeadlocks(): Failed all retries
Process: AtrsHost (2068)
Module: AtrsHost.exe
Source: Altiris.TaskManagement.Maintenance.CleanupTaskDataTask.DeleteExcessWorkingData
Description: CleanupTaskDataTask.DeleteExcessWorkingData(): Error deleting: Altiris.TaskManagement.Data.TaskExecutionInstanceNotFoundException: Unable to find task instance 00000000-0000-0000-0000-000000000000 in the database.
at Altiris.TaskManagement.Data.TaskExecutionInstance.GetTaskInstance(TaskInstanceGuid taskInstanceGuid)
at Altiris.TaskManagement.Data.TaskExecutionInstance.<>c__DisplayClass2.<DeleteTaskInstance>b__1(DatabaseContext ctx, Object state)
at Altiris.TaskManagement.Data.AltirisSqlHelper.RepeatForDeadlocks(Int32 retries, Int32 sleep, Object state, RepeatForDeadlocksDelegate func)
at Altiris.TaskManagement.Data.TaskExecutionInstance.DeleteTaskInstance(TaskInstanceGuid taskInstanceGuid)
at Altiris.TaskManagement.Maintenance.CleanupTaskDataTask.DeleteExcessWorkingData(WaitHandle eventStop).
Environment
Symantec Management Platform 7.0 SP3, SP4, SP5
Cause
This issue has two causes:
1. The clients have an old task for which they keep trying to upload status. They spam the server with results from an old or bogus task, which fills the server with "junk" data.
2. The cleanup task gets the data in the tables out of sync and eventually reaches a point where it fails.
Take any high load on the Notification Server, or high disk I/O on the SQL Server (if off-box), into account when running the workarounds.
Solution
Possible workarounds are listed below.
Stop and change the Cleanup Task
Changing the schedule to weekly limits the errors to at most once a week. Each time the task runs, check whether it ran successfully; if it did not, stop it and delete the run instance again. Continue doing this until a fix is available.
If the cleanup task is disabled altogether, no data is purged and the tables keep growing. Perform the cleanup weekly. If the issue continues, running it manually or daily is recommended.
To help prevent the on-demand cleanup task from running, change the cleanup options and set the maximum number of working database rows to 1 million.
Manually Delete the Data
Note: Truncating these tables will cause all task history to be removed. If there is task history data that is needed, it must be retrieved before truncating the tables.
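One way to retrieve history before truncating is to copy a table into a backup table first. The following is a minimal sketch, not a documented Symantec procedure: the backup table name TaskInstanceResults_Backup is hypothetical, and the same pattern would have to be repeated for each task table whose history is needed.

```sql
-- Copy the current contents into a new table.
-- TaskInstanceResults_Backup is a hypothetical name; it must not already exist.
-- SELECT ... INTO copies rows and column definitions only, not indexes or constraints.
SELECT *
INTO TaskInstanceResults_Backup
FROM TaskInstanceResults;

-- Compare the row counts before truncating the original table.
SELECT
    (SELECT COUNT(*) FROM TaskInstanceResults)        AS OriginalRows,
    (SELECT COUNT(*) FROM TaskInstanceResults_Backup) AS BackupRows;
```

Because the copy temporarily increases the size of an already large database, check the free disk space on the SQL Server before running it.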
Delete data in the database by truncating the following task tables:
TaskInstanceResults
TaskInstances
TaskinstanceParents
TaskInstancesStarted
TaskInstanceStatus
TaskOutputPropertyValue
For example, use the following SQL command for each table:
Truncate table Taskinstances
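Applied to all six tables listed above, the command could look like the following sketch. It assumes the query window is connected to the Symantec CMDB database and that any needed task history has already been saved; the table names are taken from the list above.

```sql
-- Removes ALL rows from each task table; task history cannot be recovered afterwards.
TRUNCATE TABLE TaskInstanceResults;
TRUNCATE TABLE TaskInstances;
TRUNCATE TABLE TaskinstanceParents;
TRUNCATE TABLE TaskInstancesStarted;
TRUNCATE TABLE TaskInstanceStatus;
TRUNCATE TABLE TaskOutputPropertyValue;
```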
Monitor the following tables as well. If any of them exceeds 200,000 rows, truncate it too.
TaskInstanceSummaries
TaskInstanceEvents
TaskInstanceChildEvents
TaskInstanceExecutionInfo
TaskInstanceJobNodes
TaskInstanceresultSummaries
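The row counts of these tables can be checked without scanning them. The query below is a minimal sketch that reads approximate counts from the SQL Server catalog views sys.tables and sys.partitions. It assumes the query window is connected to the Symantec CMDB database; the Suggestion column is only an illustration of the 200,000-row guideline above.

```sql
-- Approximate row counts from catalog metadata (no table scan required).
SELECT t.name      AS TableName,
       SUM(p.rows) AS ApproxRows,
       CASE WHEN SUM(p.rows) > 200000 THEN 'truncate' ELSE 'ok' END AS Suggestion
FROM sys.tables AS t
JOIN sys.partitions AS p
    ON p.object_id = t.object_id
WHERE p.index_id IN (0, 1)   -- heap or clustered index only, so rows are counted once
  AND t.name IN (N'TaskInstanceSummaries',
                 N'TaskInstanceEvents',
                 N'TaskInstanceChildEvents',
                 N'TaskInstanceExecutionInfo',
                 N'TaskInstanceJobNodes',
                 N'TaskInstanceresultSummaries')
GROUP BY t.name
ORDER BY ApproxRows DESC;
```

The counts come from metadata and can lag slightly behind the real values, which is usually acceptable for a threshold check like this one.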
If you have problems truncating the tables, you may have to stop the Altiris Object Host and Data Loader services.
- Find the Cleanup Task Data task, and find all run instances (not the one that says "Pending" under the Status), and stop them. About every minute, refresh this page until they all show a big red X.
- At that point, delete the run instances. By this time, you'll notice that the logs are no longer filling with errors.
- Now, modify the schedule (the one that says pending) to run weekly instead of daily.
Run a Cleanup Script on Clients
Create a new run script task with the following syntax:
---------------------------------------------------------
REM Get the Altiris Agent install path
FOR /F "tokens=2*" %%A IN ('REG.EXE QUERY "HKLM\Software\Altiris\Altiris Agent" /V "installdir"') DO SET AgentDir=%%B
set tempbat="%temp%\AgentClean.bat"
REM Create temporary batch file to execute while the agent restarts
echo "%AgentDir%\aexagentutil" /stop > %tempbat%
echo rmdir "%AgentDir%\TaskManagement\cache" /s /q >> %tempbat%
echo rmdir "%AgentDir%\TaskManagement\status" /s /q >> %tempbat%
echo rmdir "%AgentDir%\TaskManagement\statusXml" /s /q >> %tempbat%
REM On Task Server 7.1 and above, remove the leading REM from the next line
REM echo rmdir "%AgentDir%\TaskManagement\lti" /s /q >> %tempbat%
echo ping localhost -n 30 >> %tempbat%
echo "%AgentDir%\aexagentutil" /start >> %tempbat%
echo exit >> %tempbat%
REM Executes temporary batch file
start "" /MIN %tempbat%
---------------------------------------------------------
Now find the machines that are spamming the server with task results. To do this, first find out which results those client machines send to the Notification Server most frequently. Run the SQL query below to see which tasks have had data sent up to the server in the last 24 hours.
* Stop the services "Altiris Client Task Data Loader" and "Altiris Support Service".
* Stop all running "Cleanup Task Data Daily" instances from Altiris Console > Settings > Notification Server > Task Settings > Cleanup Task Data.
* Run the Cleanup Task Data manually with the attached SQL query.
* Start the stopped services "Altiris Client Task Data Loader" and "Altiris Support Service".
* Change the scheduled time of "Cleanup Task Data Daily" to a few minutes later to re-associate the task with the Windows schedule.
When you start the services again, the error may recur, because a regular check that the "taskinstances" table does not exceed 250000 rows triggers an "On-Demand Cleanup Task Data". Stop it from the console, then start the Cleanup Task Data from the console (right click > Start Now). If the error appears again, stop the running Cleanup Task Data and re-run it once it has returned a status of success or fail. Repeat this until all of the failed Cleanup Task Data tasks disappear, and make sure the task can run without problems at its scheduled time.
----------------------------------------------------------------------------------
If NULL values appear frequently in the Name column of the following query's results, that confirms there is a problem:
select max(i.name) as Name,TaskVersionGuid, count(*) As count from taskinstances ti
left join itemversions iv
on ti.TaskVersionguid = iv.Versionguid
left join item i
on iv.Itemguid = i.guid
join Taskinstanceresults tir
on tir.TaskInstanceGuid = ti.TaskInstanceGuid
where tir.endtime > getdate() -1
group by TaskVersionGuid
order by count DESC
-------------------------------------------------------
A row with a NULL name is likely to be a problematic task. Copy its TaskVersionGuid into the following query to list the machines that are reporting results for it:
-------------------------------------------------------
declare @TaskVersionGuid uniqueidentifier
set @TaskVersionGuid='XXXXXX-XXXXXXXX-XXXXXXXX'
select max(vc.name) as Name, ResourceGuid, Count(*) as Count from TaskInstances ti
join vcomputer vc
on ti.ResourceGuid = vc.Guid
where ti.TaskVersionGuid = @TaskVersionGuid
group by ResourceGuid
order by count DESC
--------------------------------------------------------------
Run the cleanup script above on all affected machines.
Attachments
Description
The data that is being deleted is historical logging data only. Logging records data every time a task runs, so the data accumulates rapidly. If you do not purge it, the tables can grow very large. The data being deleted does not contain any critical information. Only in some cases would you need to truncate tables to wipe them. The tasks themselves are not deleted, only their results. The results include the start time, the finish time, and the completion status. If you do truncate the table, only the result status remains. If the table exceeds 250,000 rows, the cleanup task may not operate correctly.