Click to Rate and Give Feedback
Tips
With a core server installation, you have a minimal UI that includes a limited desktop environment for local console management of the server. Here is an overview of key commands and utilities you’ll use for managing server core installations while logged on locally. ...

Read more!

Think part of your database is missing or corrupted? Find out how you can perform a partial restore to recover the missing or corrupted data. ...

Read more!

If you want external users or mail systems to be able to send mail to an address within your organization and then have Exchange forward this mail to an external mailbox, we list the six easy steps to get it done. ...

Read more!

SQL Server PowerShell includes new cmdlets designed specially for working with SQL Server 2008. Here's a look at what's new, and what you can do. ...

Read more!

If you want to manage Hyper-V servers from a computer running an earlier version of Microsoft Windows, such as Windows XP, here's what you need to know. ...

Read more!

Related Articles

SQL Server 2008 offers seven new built-in data types that let you work with more complex data and simplify the management of structured, semi-structured, and even unstructured data. Get an in-depth look at these new data types and find out how you can use them to improve data management in your infrastructure.

Kelly Wilson

TechNet Magazine April 2008

...

Read more!

The Microsoft datacenter located in Quincy, WA is powered entirely by hydroelectricity. Dave Ohara takes you on a tour of this facility and shares some best practices from the Microsoft datacenter team for building and operating a sustainable datacenter.

Dave Ohara

TechNet Magazine June 2008

...

Read more!

Logging and recovery are critical parts of SQL Server, but they are often poorly understood. Paul Randal gives an overview of how the logging and recovery features work in SQL Server and explains the transaction log and recovery models to help you maintain a better database.

Paul S. Randal

TechNet Magazine February 2009

...

Read more!

SQL Server 2008 offers more powerful relational data warehousing capabilities. Take a close look at key performancerelated data warehousing features and find out how you can use all this new technology to optimize query performance.

S. Agarwal, T. Grabs, and J. Hammer

TechNet Magazine April 2008

...

Read more!

Should you enable Backup Compression? How can you avoid dropped application connections during a failover? Paul Randal answers these questions and more in the latest installment of SQL Q&A.

Paul S. Randal

TechNet Magazine February 2009

...

Read more!

Also by this Author

Paul Randal answers your questions about SQL Server. In this installment, Paul discusses the row-overflow feature, database mirroring and automatic failovers, backup strategies, and managing workloads and memory for failover scenarios.

Paul S. Randal

TechNet Magazine December 2008

...

Read more!

Paul Randal answers your SQL Server questions, showing you how to create corruption in a test database, helping you avoid a shrink-grow-shrink-grow cycle, and explaining how you can determine how many databases you can mirror before performance starts to degrade.

Paul S. Randal

TechNet Magazine June 2008

...

Read more!

SQL Server 2008 introduces two new features that allow you to track changes more easily and efficiently. Paul Randal examines the new Change Tracking and Change Data Capture features and discusses the implications these features will have for database administrators.

Paul S. Randal

TechNet Magazine November 2008

...

Read more!

Paul Randal answers your SQL Server questions, showing you how to perform consistency checks and other maintenance on very large databases, how to handle an accidental upgrade, and more.

Paul S. Randal

TechNet Magazine August 2008

...

Read more!

Far too often, people who are not trained database administrators somehow end up responsible for a database. They lack the proper training and knowledge to maintain their database, and problems begin to develop. Here’s a primer for all those involuntary DBAs who need a crash course in database maintenance best practices.

Paul S. Randal

TechNet Magazine August 2008

...

Read more!

Popular Articles

See how the free Microsoft USB Flash Drive Manager can help you easily copy, delete, and back up files from a flash drive, as well as perform other useful tasks.

Lance Whitney

TechNet Magazine January 2009

...

Read more!

Project Server 2007 delivers significant enhancements, not only to the features and functionality for users but also for administrators. Alan Maddison explores some of the most significant new features and walks you through the installation and configuration of Microsoft Office Project Server 2007.

Alan Maddison

TechNet Magazine January 2009

...

Read more!

Raymond Chen explains why it is often the developers who have their acts together who end up checking in the final bug fixes to products.

Raymond Chen

TechNet Magazine January 2009

...

Read more!

Take a close look at SharePoint Security Accounts to see how a weak configuration can give an attacker full control over all site collections and sites.

Pav Cherny

TechNet Magazine January 2009

...

Read more!

Why do attachment sizes increase when sending and receiving e-mail messages? How can you ensure databases on a passive node in a CCR cluster are defragmented during online maintenance? Can you use an external trust between forests? We answer these questions and more.

Henrik Walther

TechNet Magazine January 2009

...

Read more!

Our Blog

NAP monitors the health of specified computers when they attempt to connect to a network and includes a number of mechanisms to enforce health requirements. In this article, Geek of All Trades Greg Shields gives readers an overview of these enforcement mechanisms and, as an example, takes a closer look at setting ...

Read more!

Use Windows PowerShell to Manage Virtual Machines Here are a few examples of how you can use Windows PowerShell scripts to manage virtual machines running on a Server Core installation. Note that these scripts are presented as samples and may need to be customized to work in your environment.

Create a New ...

Read more!

Disabling an Unused Part of Group Policy Objects One way to disable a policy is to disable an unused part of the GPO. By disabling part of a policy that isn’t used, the application of GPOs and security will be faster.

Administer Windows Server 2008 Server Core from the Command Prompt ...

Read more!

In the August 2008 issue of TechNet Magazine, Paul Randal wrote an article Top Tips for Effective Database Maintenance.  It was geared toward "involuntary  DBAs" (IT pros who inadvertently wind up responsible for a SQL Server instance).  The article had a great response from our readers so Paul has written another ...

Read more!

Microsoft Forefront is designed to deliver an integrated security solution that makes it much easier to deploy and manage security across an organization’s IT infrastructure. In this, our annual security issue, we feature two articles that describe how Forefront Security protects instant messaging and e-mail.

Protect ...

Read more!

SQL Q&A Large Transaction Logs, When to Use Repair, and More
Paul S. Randal


Q I've noticed some strange behavior with backups that I'm hoping you can explain. Every so often we back up our 62GB production database to refresh the data used by our application developers. We always delete the old copy before restoring the new one. The restored copy is the same size as the production database and the data looks the same, but the restore process takes many times longer than the backup process. What's going on? Why does it take so much longer to restore than to back up?
A There's really nothing strange going on. Depending on your circumstances, this is generally expected behavior. The difference in time required by a backup versus a restore comes from the steps that each process must perform.
Backing up a database comprises two steps. Basically, it's just doing read IOs on the database and write IOs on the backup device:
Backup Step 1 Read all the allocated data in the data files and write it to the backup device.
Backup Step 2 Read some transaction log and write it to the backup devices.
The exact amount of transaction log required can vary wildly, but it's really the amount required to be able to recover the restored database to a consistent point in time. (I've given a more in-depth explanation of this in a blog post available at sqlskills.com/blogs/paul/2008/01/31/MoreOnHowMuchTransactionLogAFullBackupIncludes.aspx.)
Restoring a database, on the other hand, can comprise up to four steps. And the work involved is more complicated than just read and write IOs:
Restore Step 1 If the database files don't exist, create them.
Restore Step 2 Read all the data and transaction log from the backup and write it to the relevant database files.
Restore Step 3 Run the REDO phase of recovery on the transaction log.
Restore Step 4 Run the UNDO phase of recovery on the transaction log.
The total elapsed time for both steps of the backup should be roughly the same as the time required for Restore Step 2 (assuming it has similar hardware and a server with no user activity). Step 1 of the restore can take a long time if the data files are large and need to be zero-initialized (which is the behavior in SQL Server 2000 and the default behavior in SQL Server 2005).
To prevent this, do not delete the existing files before starting the restore. Or alternatively, enable instant initialization so the files are created very quickly (more information on this can be found at msdn.microsoft.com/­library/ms175935.aspx).
Steps 3 and 4 of the restore are running recovery on the restored database to make it transactionally consistent; this is the same process a database would go through during crash recovery. The length of time that recovery will take depends on the amount of transaction log that needs to be processed. For instance, if there was a very long-running transaction active at the time the backup was taken, all the transaction log for that transaction would be in the backup and would have to be rolled back.

Q I'm trying to choose between log shipping and database mirroring to provide a redundant copy of our production database. I'm concerned by the amount of transaction log that will need to be sent between the servers, especially for the index rebuild operations we do every night. I've heard that mirroring sends the actual rebuild commands instead of the transaction log and the rebuilds are done on the mirror. Is this true? This should make mirroring a better solution than even log shipping with the BULK_LOGGED recovery model, right?
A What you've heard is not true. Database mirroring works by sending the actual transaction log records from the principal database to the mirror server, where they are "replayed" in the mirror database. There is no translation or filtering of any kind taking place, nor any kind of interception of T-SQL commands for a mirrored database.
Database mirroring only supports the FULL recovery model, which means an index rebuild operation will always be fully logged. Depending on the size of the indexes concerned, this could mean a significant amount of transaction log being generated and, in turn, a large log file on the principal database and substantial network bandwidth for sending the log records to the mirror.
You can think of database mirroring as real-time log shipping (in fact, this was a name used for the feature early on during SQL Server 2005 development). In log shipping, backups of the primary database's transaction log are shipped regularly to the secondary server and are restored on the secondary database.
Log shipping supports the FULL and BULK_LOGGED recovery models. For an index rebuild operation in a log-shipped database with the FULL recovery model, the same amount of transaction log will be generated as the amount generated for a mirrored database. However, in the log-shipped database scenario, the data is sent to the redundant database in a log backup (or a series of log backups) rather than as a continuous flow.
If the BULK_LOGGED recovery model is used in the log-shipped database while the index rebuild is done, only a minimal amount of transaction log will be generated. However, the next transaction log backup will also contain all the data file extents that were changed by the minimally logged index rebuild operation. This means the log backups covering the index rebuild in the BULK_LOGGED recovery model will be almost exactly the same size as those covering the index rebuild in the FULL recovery model.
So the amount of information that needs to be sent to the redundant database is almost exactly the same for an index rebuild on a mirrored database and on a log-shipped database. The real difference is how the information is sent—continuously or in batches.
There are many other factors to consider when choosing between these two approaches (way too many factors to discuss in just one installment of SQL Q&A). You should see how all of these factors match up to your requirements (such as acceptable data-loss limit and allowable downtime) before you make a decision.

Q I'm running SQL Server 2005 and one of my databases has a transaction log that just keeps growing. The database is in the full recovery mode and I'm doing transaction log backups. I was under the impression that this should prevent the transaction log from growing. What exactly is going wrong here?
A You're right that taking transaction log backups in the full recovery mode is critical. However, there are other factors that can contribute to transaction log growth. It all comes down to what is requiring the transaction log to be needed (or active). Other common factors (aside from a lack of transaction log backups) that may cause your problem include replication, database mirroring, and an active transaction.
Replication works by asynchronously reading your transaction log records and then loading up the transactions to replicate into a separate distribution database. Any transaction log records that haven't yet been read by the replication log reader task cannot be freed. If your workload generates a lot of transaction log records and you've set a large interval for how often the replication log reader will run, then lots of records can accumulate and cause the transaction log to grow.
If you are running asynchronous database mirroring, then there could be a backlog of transaction log records that have not been sent from the principal to the mirror (called the database mirroring SEND queue). The transaction log records cannot be freed until they have been successfully sent. With a high rate of transaction log record generation and limited bandwidth on the network (or other hardware issues), the backlog can grow quite large and cause the transaction log to grow.
Finally, if a user starts an explicit transaction (such as using a BEGIN TRAN statement) and then does a modification of some sort (such as a DDL statement or an insert/update/delete action), the transaction log records that are generated need to be kept around until the user either commits or rolls back the transaction. This means that any subsequent transaction log records generated by other transactions also cannot be freed, as the transaction log cannot be selectively freed. If that user, for example, goes home for the day without completing the transaction, the transaction log will continue to grow as more and more transaction log records are generated but can't be freed.
You can find out why the transaction log cannot be freed by querying the sys.databases system catalog view and examining the log_reuse_wait_desc column, like so:
SELECT name AS [Database], 
  log_reuse_wait_desc AS [Reason]
FROM master.sys.databases;
If the cause turns out to be an active transaction, you can use the DBCC OPENTRAN statement to get more information about the transaction:
DBCC OPENTRAN ('dbname')

Q I've heard that REPAIR_ALLOW_DATA_LOSS should only be used as a last resort for recovering from corruption; restoring from backups should be used instead. Can you explain why "repair for SQL Server 2005" shouldn't be used and why it is in the product considering how "dangerous" it is?
A First off, I actually wrote repair for SQL Server 2005. The problem with REPAIR_ALLOW_DATA_LOSS (which I'll just call repair from here on out) is that it's not clear how it works. The name of the repair option was chosen to call out that running it could lead to loss of data from the database. The way the feature usually repairs a corrupt database structure is to delete the corrupt structure and then fix up everything else in the database that referenced or was referenced by the deleted structure. Repair really is intended to be a last resort way of getting the database structurally consistent—it is not focused on saving user data. Repair doesn't go out of its way to delete user data, but it doesn't go out of its way to save user data either.
This may seem like an irresponsible way to conduct repairs, but when the repair option needs to be used, it provides the fastest and most reliable method to fix corruption. Speed is of paramount importance in a disaster recovery situation and correctness is absolutely required. It is almost impossible to engineer more complex repair algorithms that can be proven to work quickly and correctly in all cases. There are some complex algorithms in the repair code to resolve the case when two indexes have the same page or extent allocated, for instance, but mostly the algorithm is repair plus fix-up.
Still, there are some issues with repair that you should be aware of:
  • Repair does not consider foreign-key constraints when deleting corrupt structures, so it may delete records from a table that has a foreign-key relationship with another table. There is no way to determine whether this has happened without running DBCC CHECK­CONSTRAINTS after running repair.
  • Repair does not (and cannot) consider any inherent business logic or data relationships defined at the application level that could be broken by some data being deleted. Again, there is no way to determine whether anything is broken without running whatever custom consistency check is built into the application.
  • Some repair operations can't be replicated. Running repair on a publisher or a node in a peer-to-peer topology may introduce inconsistencies within the topology, which must be corrected manually.
For these reasons, it is always a good idea to recover from corruption using a backup rather than running the repair option. But repair is offered in the product because there is always the potential of getting in a situation where your database is corrupt, there is no backup, and you need some way to get the database back online quickly.

Q I've just moved to a new company as a DBA, and I've been given responsibility for several applications and their back-end databases. One of the applications has very poor performance doing updates. I investigated and found that the tables used by the application have an enormous number of indexes on each. After asking around, it seems that the previous DBA had a habit of adding an index of each table column, plus some combinations. I can't believe that all of these indexes are necessary—how can I tell which ones I can safely drop? We run SQL Server 2005.
A The large number of indexes is most likely a major contributor to the poor performance, as you surmise. Every time a row is inserted, updated, or deleted in the table, corresponding operations need to be done in every non-clustered index. This adds a lot of overhead in terms of I/Os, CPU utilization, and transaction log generation.
In SQL Server 2000, the only way to determine which indexes were being used was to use profiling and examine query plans. With SQL Server 2005 there is a new Dynamic Management View (DMV) that tracks index usage—sys.dm_db_index_usage_stats.
This DMV tracks every time an index has been used (and how it was used) since the database it is part of started up. The statistics for all databases are lost when SQL Server is shutdown, and the statistics for a single database are lost when it is closed or detached. The idea is that if an index does not appear in the output, then it has not been used since the database was started up.
A simple approach to tracking index usage over time is to take periodic snapshots of the DMV output and then compare the snapshots. One thing that many people miss is that you must track the index usage over an entire business cycle. If you just took a snapshot for a single day, you may find several indexes that aren't used. But if those indexes are used, say, to help the end-of-month reports run magnitudes faster, then the indexes probably shouldn't be removed. If an index really hasn't been used over a whole business cycle, then you can probably drop it and reclaim space and performance.
For some simple code you can use to take periodic snapshots of the DMV, see my blog post at sqlskills.com/blogs/paul/2007/10/05/IndexesFromEveryAngleHowCanYouTellIfAnIndexIsBeingUsed.aspx.

Paul S. Randal is the Managing Director of SQLskills.com and a SQL Server MVP. Paul wrote DBCC CHECKDB/repair for SQL Server 2005 and was responsible for the Core Storage Engine during SQL Server 2008 development. An expert on disaster recovery, high availability, and database maintenance, Paul is a regular presenter at conferences. He blogs at SQLskills.com/blogs/paul.

Page view tracker