Why are my SQL server transaction logs growing each day to the point they overrun available disk space? This is really getting annoying. I am having to truncate them manually. Why isn't the automated procedure working?
After checking EvtQFast and EvtQueue thread settings, and adjusting the Throttle delays as found in KB 2344 and KB 17079, the problem just won't go away. The transaction logs were getting still getting a lot of pressure.
We are going to need some more information. So we set the following Performance Monitor counters on the SQL Server:
- SQLServer:Buffer Manager: Buffer Cache Hit Ratio
- SQLServer:Buffer Manager: Checkpoint pages/sec
- SQLServer:Buffer Manager: Page writes/sec
- SQLServer:Databases: Log Bytes Flushed/sec (Select correct Altiris database first)
- SQLServer:Databases: Percent Log Used
Understanding the Problem
Since Altiris recommends the database be maintained in Simple Recovery Model, one of the tenants that this model essentially tells the DB to truncate the log on checkpoint. The checkpoint procedure constantly removes the transactions that have been committed from the log file, and truncates the logs automatically.
This process of checkpointing is considered a low priority operation, and is automatically pushed back in priority on SQL servers that have a lot of load or pressure.
We were able to notice that the Percent Log Used was staying between 90-99%. It would cycle anywhere in between 90 and 95%. This told us that checkpointing was not occurring frequently enough.
Addressing the Issue
There are two ways to address this issue. You can reduce the pressure to the logs by slowing the flow of data to the Notification Server. Or, you can handle the flow more efficiently. In this case, upgrading the database from SQL 2000 Sp4 to SQL 2005 Sp2 was the direction to go. SQL 2005 checkpointing procedures are more effective and efficient.
Alternately, you could build a 2nd Notification Server, and reduce load that way as well.