Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting Applications

Relational Data Warehouse or Reporting workloads are characterized by low volumes of very large transactions. These applications are often identified as having predominantly read workloads (e.g. Decision Support, Analysis, and Reporting) with periodic feeds or batch loads. It is important to keep these characteristics in mind as we examine the significance of database design, resource utilization, and system performance. The top performance bottlenecks or gotchas to avoid for Data Warehouse or Reporting applications are outlined below.

1 Database Design issues if…

  • Excessive sorting operations are performed. If the same sort is performed over and over, appropriate indexing can avoid it, since an index returns rows already ordered by its keys.

  • Excessive RID lookups are performed on heap tables. Each RID lookup requires extra IO to retrieve columns that are not in the index used. These can be avoided with covering nonclustered indexes.

  • Key lookups against the clustering keys look like joins; however, they are marked as “lookups” only in the XML showplan. These can be avoided with covering nonclustered indexes.

  • A potentially beneficial index is missing on join columns, resulting in HASH joins. Indexes on the join columns may avoid the hash (a covering index sketch follows this list).
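
  As an illustration, a minimal sketch of a covering nonclustered index (the table and column names are hypothetical):

    -- A reporting query that sorts on OrderDate and returns two more columns.
    SELECT OrderDate, CustomerID, TotalDue
    FROM dbo.FactOrders
    WHERE OrderDate >= '20050101'
    ORDER BY OrderDate;

    -- Keying the index on the sort column and INCLUDE-ing the remaining
    -- selected columns lets the optimizer read the index in order,
    -- avoiding both the sort and RID/key lookups.
    CREATE NONCLUSTERED INDEX IX_FactOrders_OrderDate
    ON dbo.FactOrders (OrderDate)
    INCLUDE (CustomerID, TotalDue);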

2 CPU gotchas…

  • If signal waits are greater than 25% of total waits, there is a CPU bottleneck. Signal waits measure the time spent in the runnable queue waiting for CPU, so high signal waits indicate CPU pressure; see sys.dm_os_wait_stats for signal waits and total waits (a query sketch follows this list).

  • Avoid inappropriate plan re-use. If the query is identical, plan re-use is a good thing. However, query parameterization that allows plan re-use is only appropriate when the result set (and intermediate work tables) are of similar size to those of the original plan. If result set sizes vary significantly with parameter values, which is common in data warehouse scenarios, plan re-use can be detrimental: a bad plan can lead to longer-running queries and IO or memory pressure. In such cases the cost of plan generation is preferable to plan re-use. Unlike OLTP, data warehouse queries are not always identical in terms of result sets or optimal query plans.
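
  A minimal sketch of both checks (the DMV is as named above; the table in the second query is hypothetical):

    -- Share of total wait time spent as signal waits, i.e. waiting in the
    -- runnable queue for CPU. Values above ~25% suggest a CPU bottleneck.
    SELECT 100.0 * SUM(signal_wait_time_ms) / SUM(wait_time_ms) AS signal_wait_pct
    FROM sys.dm_os_wait_stats;

    -- Where result set sizes vary widely with parameter values,
    -- OPTION (RECOMPILE) trades plan re-use for a plan costed to the
    -- current parameter values.
    DECLARE @StartDate datetime, @EndDate datetime;
    SELECT @StartDate = '20050101', @EndDate = '20051231';

    SELECT CustomerID, SUM(TotalDue) AS TotalDue
    FROM dbo.FactOrders
    WHERE OrderDate BETWEEN @StartDate AND @EndDate
    GROUP BY CustomerID
    OPTION (RECOMPILE);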

3 Memory bottleneck if…

  • A sudden big drop in page life expectancy. DW applications (with their big transactions) can experience big drops in page life expectancy because a big read flushes the buffer cache. See the Perfmon object SQL Server Buffer Manager.

  • Pending memory grants. See the counter Memory Grants Pending in the Perfmon object SQL Server Memory Manager. Large memory grants can be common in Data Warehouse applications; a query cannot begin executing until its memory grant is satisfied, so more memory may help (a counter query is sketched after this list).

  • Sudden drops in, or consistently low values of, the buffer cache hit ratio. Either may indicate memory pressure or missing indexes.
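
  These counters can also be read from inside SQL Server. A minimal sketch against sys.dm_os_performance_counters (the counter names are as given above):

    -- Page life expectancy (seconds an unreferenced page stays in the
    -- buffer pool) and the number of queries waiting on memory grants.
    SELECT [object_name], counter_name, cntr_value
    FROM sys.dm_os_performance_counters
    WHERE RTRIM(counter_name) IN (N'Page life expectancy', N'Memory Grants Pending');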

4 IO bottleneck if…

  • High average disk seconds per read. The best metrics for IO performance are average disk seconds per read and average disk seconds per write. When the IO system is not under significant load, there is no disk queuing, and disk seconds per read or write is as good as it gets: normally a read completes in 4-8 milliseconds when there is no IO pressure. IO throughput is determined by the number of spindles and the drive throughput, i.e. sequential and random IOs per second (according to the vendor). As IO requests increase, you may notice disk queuing, whose effects are reflected in high disk seconds per read or write. Periodic higher values for disk seconds/read may be acceptable for many applications. For high performance applications, sophisticated SAN subsystems provide greater IO scalability and resiliency in handling spikes of IO activity. Sustained high values for disk seconds/read (>15ms) do indicate a disk bottleneck (a DMV-based check is sketched after this list).

  • High average disk seconds per write. See Perfmon Logical or Physical disk. Data Warehouse loads can be either logged, with inserts, updates, or deletes, or minimally logged using bulk copy. Logged operations require transaction log writes; a transaction log write can be as fast as 1ms (or less) in high performance SAN environments. For many applications, a periodic spike in average disk seconds per write is acceptable considering the high cost of sophisticated SAN subsystems. However, sustained high values for average disk seconds/write are a reliable indicator of a disk bottleneck.

  • Big IOs such as table and range scans may be due to missing indexes.
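
  Besides Perfmon, per-file IO stalls are exposed in a DMV. A minimal sketch computing average disk seconds per read and per write for each database file:

    -- Average IO stall per read and per write, in milliseconds.
    -- Sustained read averages above ~15ms suggest a disk bottleneck.
    SELECT DB_NAME(database_id) AS database_name,
           file_id,
           io_stall_read_ms * 1.0 / NULLIF(num_of_reads, 0) AS avg_ms_per_read,
           io_stall_write_ms * 1.0 / NULLIF(num_of_writes, 0) AS avg_ms_per_write
    FROM sys.dm_io_virtual_file_stats(NULL, NULL);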

5 Blocking bottleneck if…

  • Index contention. Look for high lock and latch wait counts in sys.dm_db_index_operational_stats and compare them with the corresponding lock and latch request counts.

  • High average row lock or latch waits. The average row lock or latch wait is computed by dividing lock and latch wait time in milliseconds (ms) by the number of lock and latch waits. An average lock wait computed from sys.dm_db_index_operational_stats represents the average wait time for each lock (a query sketch follows this list).

  • The blocked process report shows long blocks. See sp_configure “blocked process threshold” and the Profiler “Blocked process report” event under Errors and Warnings.

  • A high number of deadlocks. See the Profiler “Deadlock graph” event under Locks to identify the statements involved in the deadlock.
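
  A minimal sketch of the average-wait computation described above, run in the database of interest:

    -- Average milliseconds waited per row lock and per page latch on each
    -- index. High averages point to lock or latch contention.
    SELECT OBJECT_NAME(s.[object_id]) AS table_name,
           s.index_id,
           s.row_lock_wait_in_ms * 1.0 / NULLIF(s.row_lock_wait_count, 0) AS avg_row_lock_wait_ms,
           s.page_latch_wait_in_ms * 1.0 / NULLIF(s.page_latch_wait_count, 0) AS avg_page_latch_wait_ms
    FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS s;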

6 Network bottleneck if…

  • High network latency coupled with an application that incurs many round trips to the database.

  • Network bandwidth is used up. See the packets/sec and current bandwidth counters in the Network Interface object of Performance Monitor. For TCP/IP over standard 1500-byte Ethernet frames, the bandwidth consumed is approximately packets/sec * 1500 * 8 / 1,000,000 Mbps; for example, 8,000 packets/sec works out to roughly 96 Mbps, enough to saturate a 100 Mbps link.

7 Wait statistics gotchas…

  • Since Data Warehouse and Reporting workloads are largely reads, which are compatible with other reads, incompatible exclusive lock waits should ordinarily come into play only during batch loads or periodic feeds. If the top wait statistics are LCK_x or PAGELATCH_EX, see “SQL Server 2005 Performance Tuning using Waits & Queues” for an explanation of sys.dm_os_wait_stats.

  • There is an IO bottleneck if the top wait statistics in sys.dm_os_wait_stats are IO related, such as ASYNC_IO_COMPLETION, IO_COMPLETION, LOGMGR, WRITELOG, or PAGEIOLATCH_x (a query sketch follows).
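
  A minimal sketch that lists the top waits so the IO-related types stand out:

    -- Top wait types by accumulated wait time since the last restart.
    -- The LIKE filter drops idle/sleep waits that would otherwise dominate.
    SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT LIKE N'%SLEEP%'
    ORDER BY wait_time_ms DESC;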

8 Indexing gotchas…

  • Large data warehouses can benefit from more indexes. Indexes can be used to cover queries and avoid sorting. For a data warehouse application, the cost of index overhead is paid only when data is loaded.

  • Check for missing indexes in sys.dm_db_missing_index_group_stats, sys.dm_db_missing_index_groups, and sys.dm_db_missing_index_details (a query joining the three is sketched below).
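
  A minimal sketch joining the three DMVs named above:

    -- Missing-index suggestions recorded by the optimizer, roughly ranked
    -- by how often and how much they would have helped.
    SELECT d.[statement] AS table_name,
           d.equality_columns, d.inequality_columns, d.included_columns,
           s.user_seeks, s.avg_total_user_cost, s.avg_user_impact
    FROM sys.dm_db_missing_index_group_stats AS s
    JOIN sys.dm_db_missing_index_groups AS g
         ON s.group_handle = g.index_group_handle
    JOIN sys.dm_db_missing_index_details AS d
         ON g.index_handle = d.index_handle
    ORDER BY s.user_seeks * s.avg_total_user_cost * s.avg_user_impact DESC;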

9 Watch out for fragmentation.

  • Excessive fragmentation is problematic for big IO operations. The dynamic management table-valued function sys.dm_db_index_physical_stats returns the fragmentation percentage in the column avg_fragmentation_in_percent. Fragmentation should not exceed 25%. Reducing index fragmentation can benefit big range scans, which are common in data warehouse and Reporting scenarios (a query and remediation sketch follows).
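
  A minimal sketch that reports fragmented indexes and the usual remediation (the index and table names in the rebuild are hypothetical):

    -- Fragmentation for all indexes in the current database (LIMITED mode
    -- is the cheapest scan level).
    SELECT OBJECT_NAME([object_id]) AS table_name,
           index_id, avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED')
    WHERE avg_fragmentation_in_percent > 25;

    -- Rebuild indexes whose fragmentation exceeds the 25% guideline.
    ALTER INDEX IX_FactOrders_OrderDate ON dbo.FactOrders REBUILD;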

10 Consider Table Partitioning for fast loads

  • For the large tables common in Data Warehouses, table partitioning offers important performance and manageability advantages. For example, the fastest type of load is a non-logged bulk copy, which requires that indexes be dropped first. That is not feasible on a huge billion-row table UNLESS you use table partitioning: create a staging table identical to the large table (minus indexes), load it with a fast non-logged bulk copy, then add the indexes followed by constraints. A metadata-only SWITCH IN operation then switches pointer locations between the populated staging table and the empty target partition of the partitioned table, resulting in a fully populated partition and an empty staging table. Besides enabling a fast bulk copy, the staging table eliminates blocking in the large partitioned table during the load. For more information refer to “Loading Bulk Data into Partitioned Tables”.

  • In addition to fast loads, partitioned tables allow fast deletes (for archiving or sliding-window scenarios): large logged deletes are replaced with a metadata-only partition SWITCH OUT operation that switches pointer locations between the full partition (to be ‘deleted’) and an empty monolithic table. The SWITCH OUT results in an empty partition and a fully populated monolithic staging table, which can then be dropped or added to a partitioned archive table using SWITCH IN (a sketch of the pattern follows).

  • Partitions also provide manageability improvements when combined with specific filegroup placement, allowing customized backup and restore strategies.
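
  A minimal sketch of the SWITCH pattern (all object names, the file path, and the partition numbers are hypothetical; the staging table must match the target's columns, indexes, and filegroup before the switch):

    -- Load an empty staging table with a minimally logged bulk copy, then
    -- add a CHECK constraint bounding the rows to the target partition's range.
    BULK INSERT dbo.FactOrders_Staging
    FROM 'C:\loads\orders_200501.dat'
    WITH (TABLOCK);

    ALTER TABLE dbo.FactOrders_Staging
    ADD CONSTRAINT CK_Staging_OrderDate
    CHECK (OrderDate >= '20050101' AND OrderDate < '20050201');

    -- Metadata-only SWITCH IN: the populated staging table becomes the
    -- empty target partition's data.
    ALTER TABLE dbo.FactOrders_Staging
    SWITCH TO dbo.FactOrders PARTITION 5;

    -- The reverse replaces a large logged delete: SWITCH OUT a full
    -- partition to an empty monolithic table, then drop or archive it.
    ALTER TABLE dbo.FactOrders
    SWITCH PARTITION 1 TO dbo.FactOrders_Archive;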

In contrast to OLTP, a data warehouse or Reporting application is characterized by a small number of large SELECT transactions, each very different from the last. This distinction has significant implications for database design, resource usage, and system performance, resulting in very different objectives and resource utilization profiles.