SQL Q&A: Backup Compression, Client Redirection with Mirroring, and More
Paul S. Randal
Q We're going to be upgrading most of our servers to SQL Server 2008, and one of the features that I'm looking forward to putting into production is backup compression. I know that I can turn it on by default for all databases on each server, but I've also heard that I might not want to do that. I'm not sure why I wouldn't want to have the feature enabled by default, as it seems like I've got nothing to lose. Can you help explain the reasoning behind what I've heard?
A The answer is my perennial favorite: it depends! Let me give some background to explain.
The key point to consider is the compression ratio that each database backup will have when backup compression is enabled. The compression ratio of anything being compressed by any algorithm is determined by the actual data being compressed.
Random data (small integer values, for instance) will not compress very well, so the contents of the tables and indexes in the database will determine, for the most part, the compression ratio that can be achieved.
Here are some examples of when backup compression may not produce a high compression ratio:
- If the database has transparent data encryption enabled, then the compression ratio will be very low because the encrypted data being compressed is essentially random.
- If most of the data in the database is encrypted at the column level, then the compression ratio will be low, again because the column encryption essentially randomizes the data.
- If most tables in the database have data compression enabled, then the compression ratio will be low; compressing data that is already mostly compressed usually has little effect.
In the case when the compression ratio is low, the problem is not the low ratio but the fact that CPU resources are used to run the compression algorithm for no gain. No matter how well a chunk of data can be compressed, CPU resources are always used to run the compression and decompression algorithms.
This means that you need to check how well each database compresses in a backup before deciding to use backup compression for that database all the time. Otherwise you may be wasting CPU resources. This is the basis for what you've heard.
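One way to check is to take a compressed test backup and then compare the backup's native size with its compressed size. The `backup_size` and `compressed_backup_size` columns of `msdb.dbo.backupset` are available in SQL Server 2008, so a query along these lines (a sketch, not part of the original column) shows the ratio achieved for recent backups:

```sql
-- Compare native backup size with compressed size for recent backups.
-- A compression_ratio near 1.0 means compression gained little and
-- only cost CPU; higher values mean real space savings.
SELECT
    database_name,
    backup_finish_date,
    backup_size / compressed_backup_size AS compression_ratio
FROM msdb.dbo.backupset
WHERE compressed_backup_size > 0
ORDER BY backup_finish_date DESC;
```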
To summarize, if the majority of the databases will benefit from backup compression, it makes sense to enable backup compression at the server level and manually change a few backup jobs to specifically use the WITH NO_COMPRESSION option. Alternatively, if the majority of the databases will not benefit from backup compression, it makes sense to leave backup compression turned off at the server level and manually change a few backup jobs to specifically use the WITH COMPRESSION option.
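As a sketch of both approaches (database names and backup paths here are hypothetical), the server-level default is set with `sp_configure`, and individual backup jobs override it with the `WITH COMPRESSION` or `WITH NO_COMPRESSION` options:

```sql
-- Turn on backup compression by default for the whole instance
-- (SQL Server 2008 server-level setting).
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;
GO

-- A job backing up a poorly compressing database (for example one
-- using transparent data encryption) can opt out of the default:
BACKUP DATABASE EncryptedDB
    TO DISK = N'X:\Backups\EncryptedDB.bak'
    WITH NO_COMPRESSION;

-- Conversely, with the instance default left off, a single job
-- can opt in for a database that compresses well:
BACKUP DATABASE SalesDB
    TO DISK = N'X:\Backups\SalesDB.bak'
    WITH COMPRESSION;
```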
Q Last year we upgraded our databases to use database mirroring so that if a failure occurs, we can fail over to the mirror and the application continues. While we were designing the system, we practiced failing over the database and everything worked fine. Last week we had a real failure and the database failover occurred, but all of the application transactions stopped and the application didn't connect to the failover server. In the future, how can I set up SQL Server so that it doesn't drop the application connections during a failover, allowing the transactions to continue?
A Let me break this down into two parts—how applications can cope with failovers and how to manage client redirection with database mirroring.
When a failover occurs using any of the high-availability technologies available with SQL Server, the client connection to the failed server is dropped and any in-flight transactions are lost. It is not possible to migrate an in-flight transaction between servers (in a failover situation or otherwise). Depending on the high availability technology, the in-flight transaction either will not exist at all on the failover server or it will exist as an in-flight transaction but will be rolled back as part of the process to bring the database online on the failover server.
With regard to database mirroring, which continually ships transaction log records from the principal server to the mirror server, it is usually the latter case—any in-flight transactions are rolled back as part of bringing the mirror database online as the new principal.
Therefore, there are two things that an application must be able to do gracefully when running against a server that may have to fail over to another server:
- It must be able to gracefully handle the server connection being dropped and then try reconnecting after a small time interval.
- It must be able to gracefully handle a transaction being aborted and then retry the transaction after a connection is established with the failover server (possibly using a mid-tier transaction manager).
The only high-availability technology here that does not require client changes specifically to allow redirection of the client connection after a failover is failover clustering. Clients connect to a virtual server name and are transparently redirected to whichever physical cluster node is active.
With high-availability technologies, such as log shipping and replication, the server name of the failover server is different, which means that manual redirection of client connections is required after a failover. This manual redirection can be done in several ways:
- You can hardcode the failover server name into the client so the reconnection attempts are made to the failover server.
- You can use Network Load Balancing with a 100/0-0/100 configuration, which will then allow the connection to be switched to the failover server.
- You can use something like a server name alias or switching entries in a DNS table.
With database mirroring, any of these options will work. But database mirroring also has built-in client redirection capabilities. The client connection string can explicitly specify the name of the mirror server; if the principal server cannot be contacted, the mirror is then automatically tried. This process is known as explicit redirection.
If the client connection string cannot be changed, then implicit redirection may be possible if the failed server is now running as the mirror server. Any connections to it will be automatically redirected to the new principal—but this only works if the mirror server is running.
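For explicit redirection, the mirror server's name goes into the client connection string using the `Failover Partner` keyword supported by the SQL Server client libraries. A minimal ADO.NET-style connection string (the server and database names here are hypothetical) might look like this:

```
Server=PrincipalSrv;Failover Partner=MirrorSrv;Database=SalesDB;Integrated Security=true;
```

If `PrincipalSrv` cannot be reached on a reconnection attempt, the client library tries `MirrorSrv` instead.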
The SQL Server 2005 white paper "Implementing Application Failover with Database Mirroring" explains these options in more detail.
Q When we upgraded to SQL Server 2005, we redesigned our large tables to be partitioned so that we could take advantage of partition-level maintenance and the sliding-window mechanism. You described this in the August 2008 installment ("Partitioning, Consistency Checks, and More"). But we've encountered a problem. Occasionally, concurrent application queries are experiencing blocking across the whole table when the queries aren't even accessing the same partitions. I've heard that SQL Server 2008 fixes this problem—can you please explain how I can stop this blocking?
Figure 1 Examining locks on a partitioned table
A The problem you're seeing is caused by a mechanism called lock escalation. SQL Server acquires locks on data to protect it while a query is reading or writing the data. It can acquire locks on entire tables, data file pages, or individual table/index rows, and every lock takes up a little memory.
If a query causes too many locks to be acquired, SQL Server can decide to replace all the locks on rows or pages in a table with a single lock on the entire table (the threshold at which this takes place is approximately 5,000 locks, but the exact algorithm is complicated and configurable). This process is called lock escalation.
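You can watch this happening yourself (this is an illustrative query, not from the original column) using the documented `sys.dm_tran_locks` DMV, as in the kind of output shown in Figure 1. A session holding thousands of row or page locks on one table is a candidate for escalation:

```sql
-- Count the locks held per session and resource type in the
-- current database. A session approaching roughly 5,000 row/page
-- locks on a single table may trigger lock escalation.
SELECT
    request_session_id,
    resource_type,
    COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
GROUP BY request_session_id, resource_type
ORDER BY lock_count DESC;
```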
In SQL Server 2005, if query A is operating on a single partition of a table and causes enough locks to be taken to trigger lock escalation, then the entire table becomes locked. This can prevent query B from being able to operate on a different partition of the same table. Hence, query B is blocked until query A completes and its locks are dropped.
In SQL Server 2008, the lock escalation mechanism has been improved to allow a table to have partition-level lock escalation. Using the example above, this means that the lock escalation caused by query A would only lock the single partition query A is using, rather than the entire table.
Query B will then be able to operate on another partition without being blocked. Query B could even trigger lock escalation itself, which would then lock just the partition that query B is operating on, rather than the entire table.
This model of lock escalation can be set using the following syntax:
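A sketch of the SQL Server 2008 syntax (the table name here is illustrative): the `LOCK_ESCALATION` option on `ALTER TABLE` accepts `AUTO`, `TABLE`, or `DISABLE`.

```sql
-- AUTO allows escalation to the partition level on a partitioned
-- table (and to the table level on a non-partitioned table).
-- TABLE (the default) always escalates to the whole table;
-- DISABLE prevents escalation in most cases.
ALTER TABLE dbo.MyPartitionedTable
    SET (LOCK_ESCALATION = AUTO);
```

Note that `AUTO` is not the default; partition-level escalation must be enabled explicitly per table, because it can introduce deadlock opportunities that whole-table escalation does not.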