SQL Q&A: Play the Index Card

SQL indexes can be problematic, but if you keep an eye on them and remain vigilant toward any issues, you should find them easier to manage.

Paul S. Randal

The Accidental DBA

Q. I’m the “unofficial DBA” at my company. We’re using SQL Server more and more, and have instances of SQL Server springing up across the company and know we need an actual DBA to help out. While we’re hiring someone, I need to be able to figure out what’s making some instances run slower. Do you have any general recommendations for where I should start looking?

A. This is a great question. It can be frustrating to have an instance of SQL Server that isn’t performing well and not know where to start looking for the cause. There are all kinds of things that could contribute to performance degradation, so I always find it easiest to simply ask SQL Server itself.

SQL Server tracks two sets of information: I/O statistics and wait statistics. These should give you a good idea of where the problem lies.

Most SQL Server installations these days are I/O-bound (meaning their performance is constrained by something to do with reading and writing data). Your slowdown could be a slow I/O subsystem, a slow network connecting a SAN to the server, insufficient memory on the server that’s forcing SQL Server to swap pages in and out of memory, poor indexing strategy or a bunch of other things.

You can use the Dynamic Management View (DMV) sys.dm_io_virtual_file_stats to see what SQL Server knows about I/O volume, stalls and delays for all I/O to the data and log files. You might find the I/O subsystem itself isn’t the hot spot, but I/O could still be the problem. The I/O subsystem could be inadequately coping with the I/O load.

This is where the other part of the puzzle comes in: wait statistics. SQL Server keeps track of every time an execution thread has to wait for an available resource, and how long the thread had to wait. You can also work out how long the thread had to wait after being notified of the resource’s availability, but before being able to execute on a CPU. By aggregating this data, it’s easy to see the top areas causing waits for SQL Server. This gives you a pointer of where to start looking for the cause.

This is a high-level view of this methodology. For a more in-depth discussion, including a script you can use, read my blog post, “Wait statistics.” It also has the results of a reader survey of more than 1,800 SQL Server instances and the prevalent wait types with explanations. Bottom line: Don’t waste time poking around in SQL Server until you’ve asked SQL Server what it thinks about the problem.

Missing Index Analysis

Q. I’ve just discovered missing index DMVs. Now they’re telling me I have hundreds of missing indexes on one SQL Server instance. Should I just create all of them or is that going to cause problems?

A. Do not immediately create all the indexes without first performing some analysis. The query processor from SQL Server 2005 and later versions can determine when an index would benefit the plan for a query (or batch or stored procedure). It does this while compiling the query plan.

Every time it determines there’s a missing index, it notes that fact. It also keeps count of how many times each missing index could’ve been used, along with the projected improvement in the query plan had that index existed at the time the query plan was compiled.

You can access all this information using three DMVs (sys.dm_db_missing_index_groups, sys.dm_db_missing_index_group_stats and sys.dm_db_missing_index_details). There’s also a DMV that tells you which columns in a table are missing indexes (sys.dm_db_missing_index_columns). The first three DMVs are more common. Query them together for the easiest way to get that data. Bart Duncan’s widely used script, “Are you using SQL’s Missing Index DMVs?” can also help.

This is valuable information, but you should take it with a grain of salt. First, there’s a potential bug in the missing index DMVs. It can tell you an index that actually exists is missing. This bug will be fixed in the next version of SQL Server. You can read more about it in my blog post, “Missing index DMVs bug that could cost your sanity.”

Second, the mechanism that determines a missing index in the query processor only considers whether an index is useful for the query being compiled. It doesn’t consider the possible performance impact for insert/update/delete operations that have to maintain that index. This could be huge if the table has proportionally many more changes than reads. It also doesn’t consider the size of the index created. That’s a trade-off only you can make.

Last, it looks for the absolute best index to help the query plan being compiled. For instance, there may be a table with 30 columns and a clustered index, and a query that requests 25 of the table columns. The missing index determination mechanism would recommend creating a non-clustered index to cover the 25-column query. In most cases, that wouldn’t make sense.

Use Duncan’s script to look at the aggregated missing-indexes output. Then look at the top 10 or 20 indexes and do some analysis to determine whether they’re really worth creating. Most of the time, you’ll find some that aren’t worth creating, so it’s always worth performing this analysis.

Can’t We All Just Get Along?

Q. I’m one of a team of DBAs in our company that deals with various application development teams. There’s constant animosity between the teams. It’s detrimental to the work environment. Do you have any ideas for how to smooth relations between the teams?

A. This is a common issue that can make a work environment unpleasant with animosity, distrust and grudges. None of it helps productivity and the company suffers. Fortunately, there’s a solution. It’s easy to describe, but harder to put into practice:

  • You need to educate each other. Each team needs to understand the other team’s motivations and what they think are their boundaries of responsibility. You’ll be surprised at what each team thinks the other team should be doing.
  • Each team needs to understand the pain points for the other teams. You can do this anonymously, without making things personal.
  • Each team then needs to educate the other teams on how the work the other teams do impacts them. For instance, say the dev team writes some code, only tests with a small data set and then throws it over the wall into production -- and it fails spectacularly. If the dev team expects the DBA team to troubleshoot and fix the code, that's clearly a broken process.

Acknowledging and understanding the problem is the only way to motivate both sides to work toward a solution that will make the workplace environment peaceful and productive again.

Vexing Indexing

Q. I’m a SharePoint administrator and I know a fair amount about SQL Server as well. The SQL Server 2008 that hosts our SharePoint databases has a lot of index fragmentation. This affects SharePoint performance. I know I can’t change the indexes, but is there anything I can do besides constantly having to rebuild them?

A. Constantly rebuilding indexes puts a heavy load on SQL Server in terms of I/O and CPU resources, transaction log generation and potentially blocking other processes. Even running the sys.dm_db_index_physical_stats DMV to determine the fragmented indexes can be a heavy resource drain.

Many indexes become fragmented in a SharePoint environment because SharePoint database schema use GUID clustered index keys. My wife Kimberly discusses this in her blog post, “GUIDs as PRIMARY KEYs and/or the clustering key.”

When an index has what is essentially a random key, index inserts happen randomly and lead to a process called page splits. A page split causes fragmentation, which is an expensive process (see my blog post, “How expensive are page splits in terms of transaction log?”). A page split happens when a page is completely full, but space is required on that page (for instance, when an insert occurs in an index with a random key value that must be stored on that page). A new page is allocated and roughly half of the records from the full page are moved to the new page, thus creating free space. That’s the basic process.

You can’t alter the indexes in a SharePoint database, as that would break the support agreement. You can, however, change their default fill factor. When you create or rebuild an index, you can instruct SQL Server to leave a certain amount of free space in the index pages to allow for random inserts. This means it’s more likely that index pages already have space on them for the new records without requiring a costly page split. Setting a fill factor of 80 means pages will be filled to 80 percent capacity when the index is rebuilt, leaving 20 percent free space.

Then the question becomes, “What’s the best fill factor?” Unfortunately, there’s no good answer. For a data warehouse where data doesn’t change and there’s no online transaction processing (OLTP) insert activity, the best fill factor is usually the SQL Server default of 100 (meaning no free space).

For an OLTP environment, the answer depends on how quickly fragmentation occurs and how frequently you rebuild it to remove fragmentation. It’s a good idea to start with 70 (30 percent free space) and monitor fragmentation to see whether you need to tweak up or down, or do index maintenance more or less frequently.

Paul S. Randal

Paul S. Randal* is the managing director of SQLskills.com, a Microsoft regional director and a SQL Server MVP. He worked on the SQL Server Storage Engine team at Microsoft from 1999 to 2007. He wrote DBCC CHECKDB/repair for SQL Server 2005 and was responsible for the Core Storage Engine during SQL Server 2008 development. Randal is an expert on disaster recovery, high availability and database maintenance, and is a regular presenter at conferences worldwide. He blogs at SQLskills.com/blogs/paul, and you can find him on Twitter at twitter.com/PaulRandal.*