Analyzing SQL Performance using Performance Monitor Counters

Article:TECH124987  |  Created: 2010-01-03  |  Updated: 2014-06-12  |  Article URL http://www.symantec.com/docs/TECH124987
Article Type
Technical Solution

Product(s)

Environment

Issue



Analyzing SQL Performance using Performance Monitor Counters


Solution



Below is a list and an explanation of Performance Monitor (PerfMon) counters, plus recommended solutions, in five areas that should be monitored for SQL performance health.

Evaluating SQL Memory Usage (memory bottlenecks)

Counters:
 
  • Memory:  Pages/sec  (avg 20)
  • Memory:  Page Faults /sec
  • Memory:  Cache Faults /sec
  • Memory:  Page Reads /sec
  • Memory:  Page Input /sec
  • Memory:  Available Bytes  (above 5MB)
  • Paging File: % Usage (should remain low)
  • Physical Disk: Disk Reads/sec
  • Physical Disk: Disk Writes/sec
  • Process: Page File Bytes (to track a particular process)
  • SQL Server:  Buffer Manager: Buffer Cache Hit Ratio (must be above 90 percent)
 
Counters Explained:
 
  • Memory: Pages/sec – measures the number of pages per second that are paged out of RAM to Virtual Memory (HDD)or ‘hard faults’ OR the reading of memory-mapping for cached memory or ‘soft faults’ (systems with a lot of memory).  Average of 20 or under is normal.

    -  This counter needs to be used in conjunction with Paging File: %Usage.  If the Pages/sec are high, but the % Usage is low, these are soft faults (cached memory).  If both are high, these are hard faults (working set replacement).
    -  To look for excessive paging, use Task Manager and add the column: Page Faults and PF Delta for current activity.
     
  • Memory:  Page Faults /sec - measures the working set.  (high count indicates working set is too large for memory to handle - could be a mix of hard and soft faults)
     
  • Memory:  Cache Faults /sec - measures the File System Cache
     
  • Memory:  Page Reads /sec - tracks the hard and soft (memory mapping) page faults (Sustained values higher than 5 indicate a closer look at Physical Disk: Avg Read and Write Queue Length for hard page faults)
     
  • Memory:  Page Input /sec - tracks the hard page faults  (should not exceed 15)
    If you have a high rate of page faults combined with a high rate of page reads (which also show up in the Disk counters) then you may have an issue where you have insufficient RAM given the high rate of hard faults.  If Page Reads / sec stay high during heavy operations and the Physical Disk Avg Disk Queue Length remain high, RAM is the problem area.
     
  • Memory: Available Bytes – measures physical memory availability, SQL 2005 (unless manually set with limits) will use all but 10MB of available space.
    (add the /3gb switch to the boot.ini  to make the SQL kernel use only 1gb of memory instead of the default 2gb).  This figure should be above 25% of installed memory.  Note this value is dynamic and only shows last checked value, not the average.
     
  • Paging File: % Usage - This counter shows the amount of thrashing to the hard drive to service virtual memory requests.
     
  • Physical Disk: Disk Reads/sec and Writes/sec – measures disk access.  These vary widely and should be used in conjunction with the other counters.  Example: wide SCSCI drives can handle between 50 -70 I/O's per second.  A baseline should be created to do comparisons for these counters.
     
  • Process: Page File Bytes:  Page File Bytes is the current number of bytes that the chosen process has used in the paging file(s). The lack of space in paging files can prevent other processes from allocating memory.
     
  • SQL Server:  Buffer Manager: Buffer Cache Hit Ratio – measures now much the buffer is used instead of the hard disk to get data.
    (typically at 99 %, but must be near or above 90%, if not: add more memory.  The one caveat is if the SQL server is running OLTP programs, which will cause more disk reads)
 
Recommended Solutions when the memory counters are not in optimum range:
  1. Add more memory
     
  2. Have SQL server the only application running on the computer
     
  3. Move active databases to a dedicated SQL server
     
  4. Increase the maximum memory per instance (if Available Bytes is above 10MB)
     
  5. Decrease the maximum memory per instance (if Available Bytes is below 5MB)
     


Evaluating Disk Usage (HDD bottlenecks)


Counters:
 
  • Physical Disk: Avg. Disk Queue Length  (Should not be higher then the number of spindles plus 2)
  • Physical Disk: Avg. Disk Read Queue Length  (Should be less then 2)*
  • Physical Disk: Avg. Disk Write Queue Length  (Should be less then 2)*
  • Physical Disk: Avg. Disk Read /sec (Should be under 20ms, if over 50ms indicates a serious bottleneck)
  • Physical Disk: Avg. Disk Write /sec (Manufacturer dependent)

Counters Explained:
 
  • Physical Disk: Avg. Disk Queue Length – also measures the pressure on the physical disk array.  Excess of 2 per drive (3 disk array = 6) for 10+ minutes over a 24 hour period, indicates a disk bottleneck.
     
  • Physical Disk: Avg. Disk Read Queue Length  -  Excess of 2 per drive (3 disk array = 6) for 10+ minutes over a 24 hour period, indicates a disk bottleneck.
     
  • Physical Disk: Avg. Disk Write Queue Length -  Excess of 2 per drive (3 disk array = 6) for 10+ minutes over a 24 hour period, indicates a disk bottleneck.
     
  • Physical Disk: Avg. Disk Read /sec - measures the rate of read operations from the disk.
     
  • Physical Disk: Avg. Disk Write /sec - measures the rate of write operations on the disk.
 
Recommended Solutions:
  1. Verify the issue is not a memory bottleneck first
     
  2. Increase the amount of disks in the array
     
  3. Have SQL server the only application running on the computer
     
  4. Move active databases to a dedicated SQL server



Evaluating Processor Usage (CPU bottlenecks)


Counters:
 
  • Processor:  % Processor Time_Total
  • System:  Processor Queue Length

Counters Explained:

  • Processor:  % Processor Time_Total – measures the CPU utilization of each processor.  Add _Total to the count to get overall utilization of all CPU's combined.  Not to exceed 80% for 10+ minutes over a 24 period.
     
  • System:  Processor Queue Length – measures the backlog queue for processors.  Should not exceed 2 per CPU for 10+ minutes for a 24 hour period.  Example, if the server contains 4 CPU's, the count should not exceed 8 for a 10 minute period.

Recommended Solutions when both counts are too high:
  1. Add CPU's , faster CPU's or CPU's with larger L2 cache.
     
  2. 2 CPU environment, speed is more important.
     
  3. 4+ CPU environment, L2 cache is more important.
     
Recommended Solutions when Queue length is too high only:
(too many active worker threads)
  1. Move active databases to a dedicated SQL server



Evaluating Network Usage


Counters:
 
  • Network Interface:  Bytes Received/sec
  • Network Interface:  Bytes Sent/sec
  • Network Interface:  Bytes/sec
  • Network Interface:  Output Queue Length

Counters Explained:

  • Network Interface:  Bytes Received/sec – measures the rate at which bytes are received over each network adapter.  Need existing baseline to use effectively.
     
  • Network Interface:  Bytes Sent/sec - measures the rate at which bytes are sent over each network adapter.  Need existing baseline to use effectively.
     
  • Network Interface:  Bytes/sec – measures the rate at which bytes are sent and received over each network adapter.  Need existing baseline to use effectively.
     
  • Network Interface:  Output Queue Length – should always be 0, but can reach as high as 2 momentarily.  


Recommended Solutions:

  1. Verify there is not an external attach on the system
     
  2. Move active databases to a dedicated SQL server

 



Evaluating User Connections


Counter and explanation:

  • SQL Server: General Statistics:  User Connections.  Shows number of user connections (not currently connector users).    Count should not exceed 255 for 10+ minutes over a 24 hour period.
      

Recommended Solutions:

  1. Increase SQL 2005 Server configuration Maximum Worker Threads
     
  2. Move active databases to a dedicated SQL server
     
  3. Open all ports on the SQL server

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
    MaxUserPort (DWORD)  64,512 (Value:Decimal)

 




Legacy ID



346720


Article URL http://www.symantec.com/docs/TECH124987


Terms of use for this information are found in Legal Notices