SQL Q&A: Push SQL Performance
There are always several ways to push the SQL Server performance envelope, and it often doesn’t cost any money to do so.
Paul S. Randal
The New Normal
Q. We’re about to start a project that will redesign our database schema and we’re concerned about how far to normalize things. Do you have any advice?
A. Be cautious about pushing normalization too far. The usual goal of a normalization exercise is Third Normal Form, or 3NF, where every non-key attribute depends on the key. There’s a helpful mnemonic for this rule: each non-key attribute should depend on “the key, the whole key and nothing but the key, so help me Codd.” (E.F. Codd originally defined 3NF back in 1971; see this link for more information. The phrase is based on the oath one swears in a courtroom.)
The problem with pushing for 3NF is that many people go too far—what I call over-normalizing. This can cause huge performance problems under load when queries have to join numerous tables to materialize meaningful relationships. A succession of joins is then the only way to reach the various entities.
For example, we have one client whose schema was written 10 years ago by a C# developer who was not particularly SQL Server savvy. Every possible entity is stored in its own table and has an ID. All entity attributes are stored in their own tables, with their own IDs. The relationships by which entities belong to other entities are stored in their own tables. The history of how each attribute has changed for each entity is stored in its own table, with its own ID.
So a particular entity table may have foreign-key references to many other tables. This is a heavily over-normalized schema where doing almost anything requires joining 10 or more tables.
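To make the problem concrete, here’s a hedged sketch of that pattern (all table and column names here are hypothetical, not taken from the client’s actual schema). Because every attribute lives in its own keyed table, even fetching one entity with two attributes takes a chain of joins:

```sql
-- Hypothetical over-normalized schema: entities, attributes and their
-- values all live in separate tables, each with its own surrogate ID.
-- Retrieving one entity's name and status requires four joins.
SELECT  e.EntityID,
        n.AttrValue AS EntityName,
        s.AttrValue AS EntityStatus
FROM dbo.Entity AS e
JOIN dbo.EntityAttribute AS ean
    ON ean.EntityID = e.EntityID
JOIN dbo.AttributeValue AS n
    ON n.AttrValueID = ean.AttrValueID
   AND n.AttrTypeID = 1          -- hypothetical type ID for 'Name'
JOIN dbo.EntityAttribute AS eas
    ON eas.EntityID = e.EntityID
JOIN dbo.AttributeValue AS s
    ON s.AttrValueID = eas.AttrValueID
   AND s.AttrTypeID = 2          -- hypothetical type ID for 'Status'
WHERE e.EntityID = 42;
```

Add relationship tables and per-attribute history tables to the query and you quickly reach the 10-plus joins described above.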
What’s worse is there are no archiving strategies. There are many years of data in the database resulting in joins that have to process huge amounts of data. Multiple things must be done to fix this:
- Remove old data to reduce the data size and implement an automated archiving strategy, keeping the active dataset small.
- Consider using filtered indexes and query predicates to reduce the amount of data undergoing operations.
- Consider de-normalizing parts of the schema to allow some tables to collapse together to reduce join complexity.
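As an illustration of the second bullet, a filtered index paired with a matching query predicate keeps both the index and the data being processed small. This is a minimal sketch with hypothetical table and column names:

```sql
-- Hypothetical example: if most queries only touch unarchived rows,
-- a filtered index covers just the active portion of the table.
CREATE NONCLUSTERED INDEX IX_Orders_Active
ON dbo.Orders (CustomerID, OrderDate)
WHERE IsArchived = 0;

-- Queries must repeat the filter predicate so the optimizer
-- can match them to the filtered index.
SELECT CustomerID, OrderDate
FROM dbo.Orders
WHERE IsArchived = 0
  AND CustomerID = 1234;
```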
Here’s the golden rule to keep in mind when designing new schema: Think about the T-SQL operations required to effectively query the data and try to minimize excessive joins or needlessly operating on very large datasets. You should also prototype your design and run some representative queries at a high workload to see whether there are any obvious design bottlenecks.
Once you’ve put a schema into production it becomes incredibly hard and expensive to make any changes. This means you may spend a lot of time fine-tuning queries and indexing strategies to make up for the performance hit of the over-normalized schema.
Q. Our main table has a clustered index on a unique integer identity column. Table rows are never updated after they’ve been inserted and I’ve kept the row size reasonably small. I’m finding that, with a higher number of concurrent connections, query performance is slowing down. Do you have any idea what the problem could be?
A. You’ve designed your clustered index according to best-practice guidelines, which say the clustered index key should be unique, static, narrow and ever-increasing. This works perfectly until you get into a high-end workload.
For instance, with several thousand concurrent connections (or perhaps even hundreds) all inserting into the table, there’s going to be blocking. If you look at the locks held with the sys.dm_tran_locks Dynamic Management View (DMV), it doesn’t show anything amiss. All the threads are holding page IX locks (on one of a few pages) and key X locks (on records on those pages). Things look normal, but there’s definitely some blocking going on. The blocking just doesn’t involve locks. You can find more information about locks and locking in SQL Server in Books Online.
If you look in the sys.dm_os_waiting_tasks DMV, you’ll see that most threads are waiting with a wait type of PAGELATCH_EX on one of a few pages. This is because a lock by itself isn’t sufficient to change the in-memory copy of a data file page. sys.dm_os_waiting_tasks is one of the most useful DMVs, by the way, because it shows what every thread in SQL Server is waiting for.
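A quick way to see this pattern for yourself is to query the DMV while the insert workload is running. This is a minimal sketch; the columns shown are standard sys.dm_os_waiting_tasks output:

```sql
-- Look for threads waiting on page latches during the insert workload.
-- resource_description identifies the page as dbid:fileid:pageid.
SELECT  wt.session_id,
        wt.wait_type,
        wt.wait_duration_ms,
        wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.wait_type LIKE N'PAGELATCH_%'
ORDER BY wt.wait_duration_ms DESC;
```

If many sessions show PAGELATCH_EX waits on the same page, you’re seeing the insert hotspot described here.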
That page copy in memory is really a data structure, as far as SQL Server is concerned. You can’t change a data structure with just a relational lock. You must protect the data structure integrity from multiple threads accessing and changing it at once. This is a basic computer science fundamental. You can manage exclusive access to change the data structure within SQL Server using a lightweight synchronization mechanism called a latch, which I’ve discussed here previously.
Because the clustered index is on an identity column, inserts follow an append-only pattern, with many rows fitting on each data page. This is a good thing. However, it means many threads will try to insert rows simultaneously into the same data file page. The locks involved don’t cause blocking because each thread holds a compatible page lock plus exclusive locks on its own individual rows. However, every thread also needs to acquire the exclusive latch on the page, and that isn’t possible simultaneously. As your workload increases, this can lead to major blocking and query performance issues.
There are a variety of solutions to this, including:
- Using partitioning to spread the insert workload over several partitions of a table (or even several tables)
- Changing to a composite cluster key that will naturally spread the inserts out over several points in the table (at the expense of causing fragmentation)
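The second option can be sketched as follows (the table and the bucketing scheme are hypothetical, for illustration only). Leading the clustered key with a small, low-cardinality column spreads concurrent inserts across several insertion points instead of one hot page:

```sql
-- Hypothetical sketch: a "bucket" column computed at insert time
-- (e.g. session ID modulo 8) becomes the leading clustered key column,
-- giving up to 8 concurrent insertion points instead of one.
CREATE TABLE dbo.EventLog
(
    BucketID  tinyint          NOT NULL,  -- e.g. @@SPID % 8, set by the inserting code
    EventID   bigint IDENTITY  NOT NULL,
    Payload   varchar (200)    NOT NULL,
    CONSTRAINT PK_EventLog
        PRIMARY KEY CLUSTERED (BucketID, EventID)
);
```

The trade-off noted above applies: inserts land at several points in the index rather than only at the end, so some fragmentation is the price of reduced latch contention.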
The bottom line, though, is that the blocking symptom does not always mean that locking is the problem.
Q. We’re having performance issues with our SQL Server and our developers say we need a more powerful server to run our workload. We’re classic “involuntary” DBAs. Can you give us any advice on what else may help, other than buying more hardware?
A. People often don’t realize that poor performance is usually a symptom of a more fundamental problem with the way you’re using SQL Server. The common assumption is that you’re reaching the capacity limits of your current hardware and it’s time to invest in a server with more and faster CPUs. This can be a costly assumption: moving to a server with more CPUs opens up more timing windows for contention and can actually lead to lower workload throughput.
We’ve actually seen that happen a couple of times because of poor T-SQL programming practices.
We’ve repeatedly demonstrated how it would be far cheaper and more beneficial to solve underlying performance problems instead of adding hardware. Before you take the plunge with new hardware, there are many things you can look at to see whether it’s a solvable problem. Here are some examples:
- Look in the sys.dm_os_waiting_tasks DMV to see where time is being spent waiting for resources. If the time is spent waiting for client applications to process data, beefing up the server hardware won’t make a difference. You can see this pattern if many rows in the DMV output list ASYNC_NETWORK_IO as the wait type.
- Examine I/O subsystem read/write latencies using the sys.dm_io_virtual_file_stats DMV to see if the I/O subsystem is being driven too hard. This could indicate the need for a better indexing strategy. Tempdb could also be a bottleneck, which could indicate overuse of temp tables.
- Look at the missing index DMVs to see if high-impact indexes are missing.
- Look at sys.dm_db_index_usage_stats to see if you’re maintaining—but not using—a ton of indexes.
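The second check above can be sketched with a short query against the virtual file stats DMV. This is a minimal example; the stall and count columns are standard sys.dm_io_virtual_file_stats output:

```sql
-- Average read/write latency per database file since the server started.
-- Sustained read latencies well above your I/O subsystem's expected
-- service time suggest it's being driven too hard.
SELECT  DB_NAME(vfs.database_id) AS DatabaseName,
        vfs.file_id,
        vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS AvgReadLatencyMs,
        vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS AvgWriteLatencyMs
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY AvgReadLatencyMs DESC;
```

Note that these counters are cumulative since the last restart, so compare snapshots taken before and after a representative workload rather than relying on the lifetime averages alone.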
Simple changes can often have a significant effect on the performance of SQL Server. Sometimes the data volume has simply grown and you need more server memory to take the pressure off the buffer pool. Sometimes the I/O subsystem really does need a boost. It’s not very often, though, that you need to buy a bigger, faster server.
Q. We have some queries that cause problems when they run in parallel and we’d like to limit them to be single-threaded. How can we do that without making everything single-threaded by setting the server MAXDOP option to one?
A. As background, MAXDOP means “maximum degree of parallelism.” That’s basically how many concurrent parallel threads a query can use when executing.
It’s good that you’re not “knee-jerking” and setting the server-wide MAXDOP to one, as parallelism is one of the best performance features of SQL Server. Many people do exactly that, often based on poor advice that a prevalence of CXPACKET waits (found during wait statistics analysis) should be fixed by disabling parallelism.
One of the problems with using the sp_configure option to disable parallelism (apart from the fact that it affects everything on the server) is that anyone with any privilege level can override that setting by using a MAXDOP query hint in their queries. You could always specify a MAXDOP one query hint on all the queries you want to limit. That can be impractical, though, with hundreds or thousands of queries involved.
Two methods are much more effective at limiting parallelism:
- Increase the server-wide “cost threshold for parallelism” setting. During compilation, the query optimizer assigns each plan an estimated cost—an arbitrary, unitless number—and only plans whose cost exceeds this threshold are eligible to run in parallel. By increasing the threshold, you can effectively prevent cheaper queries that shouldn’t run in parallel from doing so. A query hint can’t override this setting, either. You can read more about this setting and how to change it in a blog post by Jonathan Kehayias here.
- You can also use the Resource Governor feature (Enterprise Edition only). This lets you put queries into “buckets” (called workload groups) and then assign a MAXDOP to each bucket. A MAXDOP query hint can override the sp_configure option, but it can’t exceed the limit set by Resource Governor. You can use whatever filtering you want to decide which queries go into which buckets, and you can have multiple buckets with different MAXDOP settings. More people are using this method to gain fine-grained control over parallelism.
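Both approaches can be sketched in a few statements. The threshold value and the workload group name here are hypothetical choices for illustration, not recommendations:

```sql
-- Raise the cost threshold so only genuinely expensive plans go parallel.
-- (It's an advanced option, so expose those first.)
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;  -- hypothetical value
RECONFIGURE;

-- Resource Governor (Enterprise Edition): cap MAXDOP for one bucket of
-- sessions. A classifier function (not shown) routes sessions into the
-- group; everything else keeps the server-wide parallelism settings.
CREATE WORKLOAD GROUP ReportingGroup          -- hypothetical group name
    WITH (MAX_DOP = 1);
ALTER RESOURCE GOVERNOR RECONFIGURE;
```

Test a threshold change on a representative workload first: it affects every query on the instance, not just the problem ones.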
Paul S. Randal is the managing director of SQLskills.com, a Microsoft regional director and a SQL Server MVP. He worked on the SQL Server Storage Engine team at Microsoft from 1999 to 2007. He wrote DBCC CHECKDB/repair for SQL Server 2005 and was responsible for the Core Storage Engine during SQL Server 2008 development. Randal is an expert on disaster recovery, high availability and database maintenance, and is a regular presenter at conferences worldwide. He blogs at SQLskills.com/blogs/paul, and you can find him on Twitter at twitter.com/PaulRandal.