Chapter 11 - Backup and Recovery

You should back up your databases and transaction logs on a regular basis. In case of system problems, you can restore the backed up files and return to work more quickly than if you had neglected to back up your database. Additionally, every time Microsoft SQL Server starts, recovery is performed on all system and user databases.

The following sections address frequently asked questions about SQL Server 7.0 backup and restore functionality, as well as problems you may encounter when backing up and restoring databases and transaction logs. Additionally, information about resolving performance and insufficient disk space problems related to recovery is included.

Frequently Asked Questions

Here are some frequently asked questions regarding backup and restore functionality, as well as problems you may encounter when backing up and restoring databases and transaction logs.

Q: What are the conditions under which deferred updates occur in SQL Server 7.0?

A: In SQL Server 7.0, all updates are in-place (direct), provided that the column or columns participating in the clustered index key are not changed. If a change is made to a UNIQUE clustered key, the update plan runs in a hybrid mode where the query processor combines direct and deferred updates. If the clustered key is not UNIQUE, all of the updates are run as deferred.

You can use SHOWPLAN to see examples of this behavior. Look for the SPLIT and COLLAPSE operators in the plan. If you find a SPLIT below the clustered index update, one of the clustering keys has been changed. If a COLLAPSE operator is found, the update is running in a hybrid mode. SQL Server collapses delete and insert to the same key values into an in-place update.

Note that this behavior holds true for any index.

Q: Does enabling the torn page detection database option add any measurable performance overhead to a server?

A: The torn page detection option does not add much CPU cost, but it can increase contention on hot pages. With torn page detection off, a page can be accessed while it is being written to disk. This is not true if torn page detection is on.

Q: When does SQL Server check for torn pages?

A: torn page detection is performed whenever a page is read from disk. In practice, this is likely to happen during recovery, because any page on which the write did not complete during normal operations is very likely to be read by recovery (except for nonlogged operations, such as index creation, bulk copy, and so on).

Q: What happens when SQL Server detects a torn page?

A: When a torn page is detected, a severe I/O error is raised. This error closes the connection. The database is marked suspect only if the torn page is detected during recovery.

Q: How can I recover from torn pages?

A: Restoring the database from a backup and rolling the transaction log forward should correct the problem with no data loss.

Q: What situations are most likely to cause torn pages?

A: Lab tests have shown that torn pages are quite likely to happen when disk writes are occurring and power is lost. If you do not have a battery backup or Uninterruptible Power Supply (UPS), you should consider enabling this option.

Q: Will my query result sets be returned in order if I am running in SQL Server 6.5 compatibility mode?

A: If you do not provide a GROUP BY clause explicitly while in SQL Server 6.5 compatibility mode, the query processor adds one. In SQL Server 7.0 compatibility mode, not having an ORDER BY clause means that any ordering is acceptable to the user or application.

GROUP BY is always sorted in SQL Server 6.5 because the only way that SQL Server 6.5 can form groups is by first sorting the data. However, SQL Server 7.0 has other algorithms for grouping data (most involving hashing of some sort) that can work many times faster than sorting the data to form the groups. This prevents SQL Server from paying the penalty for a slower sort if one is not needed.

Adding an index hint does not force order; it only forces a scan of that particular index. If there is no ORDER BY clause in the query, the query processor is free to decide the cheapest execution strategy. This can be either a logical order scan of the index or a physical order scan of the index. The latter scan may return rows out of order for the index. If ordering is required, use the ORDER BY clause.

Q: Why is my reported log space never 0 (zero), even after truncating the log?

A: In SQL Server 7.0, the log truncation granularity is the virtual log file; in SQL Server 6.x it is a page. Consider an example in which a log configuration consists of four virtual log files. Even if there are no outstanding or unreplicated transactions that prevent truncation of the log after backup, at least 25 percent of the log is always in use, thereby causing at least one virtual log file to be marked as busy.

Efficiency is one of the primary reasons for implementing this schema in SQL Server 7.0. In SQL Server 6.x, truncating the log (even to throw it away) requires scanning through the page chain and deallocating pages. In SQL Server 7.0, truncating the log is as simple as changing the status on a virtual log file from a "used but doesn't contain active log" state to "usable."

It is still possible to have a process back up the log when it reaches some level of being full. However, SQL Server 7.0 differs from SQL Server 6.x in that the smallest fullness level that can be achieved through a transaction log backup and truncation is (1/n * 100) percent, where n is the number of virtual log files in the database configuration.

Q: Why does using SET QUOTED_IDENTIFIER ON within my stored procedures not affect the stored procedure's behavior?

A: Stored procedures maintain the QUOTED_IDENTIFIER setting that was active at the time the stored procedures were compiled. Stored procedures go through distinct compile and execute phases, and the entire stored procedure is compiled as a unit. This means that by the time the SET QUOTED_IDENTIFIER statement is executed, the entire stored procedure has already been compiled. Therefore, changes to the setting cannot affect the stored procedure.

Furthermore, because the QUOTED_IDENTIFIER setting actually affects parsing, a change to the setting cannot be caught midway through compilation and enforce a mode switch.

Pushing the unit of compilation down to individual statements can potentially resolve this situation. In this case, doing so would require that all subsequent statements be recompiled when a SET QUOTED_IDENTIFIER statement is encountered, thereby negating the performance benefits of a stored procedure.

Q: How does SQL Server decide whether to use indexes?

A: SQL Server is a cost-based query optimizer, not a rule-based system. Being cost-based, SQL Server is syntax independent and literally analyzes the cost of each execution strategy based on the projected number and size of the result sets. If you want to force table scans or index strategies, you can guarantee their use only by using index hints. This is generally not recommended, although at times it may become necessary.

It is difficult to generalize and specify a basic set of rules under which the query processor will always pick a table scan or index seek. In general, the use of an index access strategy is favored over table scans unless the choice is very clear, for example, if all rows are wanted. Table scans acquire shared locks and thereby can reduce concurrency (that is, multiuser access) greatly. As a result, table scans are avoided whenever possible.

Scan decisions are based on anticipated execution costs, so there is no size limit below which indexes are ignored. However, if the entire table fits on a single page, there are very few cases, if any, in which indexes will be of value.

Q: DBCC SHRINKDB is not shrinking my log. Why?

A: DBCC SHRINKDB shrinks data files on a per-file basis, but shrinks log files as if all the log files existed in one contiguous log pool. The shrinking of log files is not immediate and does not occur until the active portion of the log moves. As updates are performed on the database, the shrink operation occurs at checkpoints or during transaction log backups. Each log file is marked with the target_percent for the shrink operation. Each subsequent log backup or log truncation attempts to shrink the file as close as possible to the target_percent size. Because a log file can be shrunk only to a virtual log file boundary, it may not be possible to shrink a log file to a size smaller than the size of a virtual log file, even if it is not being used.

Q: Why does the creation of a UNIQUE index stop on multiple null values in a column?

A: SQL Server does not support the occurrence of multiple null values in a UNIQUE index. For the purposes of the index key, a NULL is considered a value and can occur only once in a given UNIQUE index.

Q: What is the bulk copy TABLOCK hint or property?

A: The TABLOCK hint or property increases the performance of bulk copy operations by reducing the locking contention of the operation. SQL Server 7.0 introduces a new bulk update (BU) lock type. BU locks have reduced contention with other types of locks acquired by non-bulk copy operations. Bulk copy operations use bulk update locks by specifying the TABLOCK hint or property on either the bcp command prompt utility, the BULK INSERT Transact-SQL statement, the bulk copy API supported by the SQL Server ODBC driver, or the IRowsetFastLoad interface supported by the OLE DB Provider for SQL Server. If TABLOCK is not specified in a bulk copy operation, it acquires row locks on the bulk copied rows. These row locks have more overhead than a BU lock taken at the table level.

Tips for Using Backup and Restore

Several differences between the backup and restore functionality of Microsoft SQL Server 7.0 and earlier versions warrant a brief explanation and tips about how to exploit their benefits fully.

In SQL Server 7.0, database creation and backup operations use parallelism

The ALTER DATABASE statement uses parallelism to extend each file on a distinct logical drive. Likewise, the BACKUP DATABASE statement uses a separate thread per logical drive. Although this is a useful performance feature in certain circumstances, it does not mean that creating many logical drives is generally preferred over combining those drives in a RAID array. SQL Server uses asynchronous I/O and can usually leverage all available I/O capacity in a multidisk RAID array. Combining most available drive spindles to a large RAID array (of the appropriate type) often provides the best performance for the broadest circumstances.

SQL Server 7.0 documents in the error log backups with INIT or FORMAT

It may appear that SQL Server backup no longer documents whether a backup device has been initialized in the error log. Actually, this is documented, but not in the same format as in SQL Server 6.5.

The FILE= attribute in the error log provides the necessary information. For example, if the error log says FILE = 1, either an INIT or a FORMAT command was issued.

  • FILE = 1 means the backup device was initialized. 

  • FILE = >1 means the backup device was not initialized. 

The following examples show SQL Server 7.0 messages in the error log:

1998-11-17 18:17:47.98 backup Database backed up with following 
information: Database: bck_db, creation date and time: 
11/17/98(18:09:42), pages dumped: 107, first LSN: 4:32:1, last LSN: 
4:34:1, sort order: 52, striped: 0, number of dump devices: 1, device 
information: (FILE=1, TYPE=DISK: {'c:\temp\fb1.bck'}).

1998-11-17 18:33:54.28 backup Database backed up with following 
information: Database: bck_db, creation date and time: 
11/17/98(18:09:42), pages dumped: 106, first LSN: 4:42:1, last LSN: 
4:44:1, sort order: 52, striped: 0, number of dump devices: 1, device 
information: (FILE=2, TYPE=DISK: {'c:\temp\doit.bck'}).

The following examples show SQL Server 6.5 messages in the error log:

98/12/07 10:54:00.30 backup DATABASE dumped with following info: 
Database Name:pubs, Creation Date and Time:Dec 7, 98(10:53), Pages 
dumped:161, Current Sequence:36075 11572328, Sort Order:52, Striped:NO, 
Number of Dump Devices:1, device info:(VOLID=SQL0001 
NAME=D:\temp\65pubs.dat TYPE=DISK FILE=1)

98/12/07 10:54:13.34 backup DATABASE dumped with following info: 
Database Name:pubs, Creation Date and Time:Dec 7, 98(10:54), Pages 
dumped:161, Current Sequence:36075 11572328, Sort Order:52, Striped:NO, 
Number of Dump Devices:1, device info:(VOLID=SQL0001 
NAME=D:\temp\65pubs.dat TYPE=DISK FILE=2)
Differential backup time considerations

A differential backup records only changes made to the database since the last full backup. Although typically much faster than a full database backup, elapsed time for SQL Server 7.0 differential backup is roughly proportional to the allocated space in the database. Consequently, a differential backup on a very large database may take longer than expected, even if few changes have occurred since the last full database backup.

If the amount of time required for a differential backup is too large, you should consider doing log backups instead.

Common Backup and Restore Problems

Here are the solutions to some problems you may encounter when backing up and restoring databases and transaction logs:

  • A syntax error occurred when using the BACKUP or RESTORE statements, which indicates that the database is in Microsoft SQL Server version 6.5 compatibility mode. The BACKUP and RESTORE keywords are valid only with SQL Server 7.0 databases.

    Set the SQL Server compatibility level to 70 before using BACKUP or RESTORE statements. For more information, see SQL Server Books Online.

    The BACKUP statement cannot be performed at the same time as these operations:

    • DBCC CHECKALLOC 

    • DBCC SHRINKDATABASE 

    • bcp Utility 

    • SELECT INTO 

    • File manipulation

    Reissue the backup operation after the conflicting operation has finished. For more information, see "Error 3023" in SQL Server Books Online.

  • A standby database cannot be backed up if it has not yet been recovered.

    Use backups from your primary server until operations have switched to the standby. For more information, see SQL Server Books Online. 

  • A database cannot be restored unless its sort order, collation sequence, Unicode locale ID, and Unicode comparison style match those of the server.

    For more information, see "Error 3120" or "Error 3149" in SQL Server Books Online. 

  • The backup being restored is a valid Microsoft Tape Format, but it is not a SQL Server backup.

    To determine the backup contents, use RESTORE HEADERONLY. For more information, see "Error 3143" in SQL Server Books Online.

  • The backup set is a backup of a database with the same name as the database to which you are restoring. However, the database being restored to was created by a different CREATE DATABASE statement than the database in the backup set.

    Either overwrite the existing database or restore the backup set to a different database name. For more information, see "Error 3154" in SQL Server Books Online.

  • A restore operation failed because ALTER DATABASE was used to add or remove one or more database or transaction log files.

    Restore a full database backup created after the files were added or removed. For more information, see "Error 3155" in SQL Server Books Online.

  • An attempt was made to use a logical device that is not a defined backup device.

    Either create the device or use the TAPE = or DISK = syntax of the BACKUP statement. For more information, see "Error 3206" or "Error 3209" in SQL Server Books Online.

  • The media family spans multiple volumes. The restore operation has already processed the data on the specified volume.

    Replace the current volume with a volume not yet processed. For more information, see "Error 3227" in SQL Server Books Online.

  • The backup device does not contain data in Microsoft Tape Format. For more information, see "Error 3242" in SQL Server Books Online.

  • The media family spans multiple volumes. The restore operation expected to process the volume number specified in the error message but found a different volume number instead.

    To continue the restore operation, replace the current volume with the volume number specified in the error message. For more information, see "Error 3247" in SQL Server Books Online.

  • The media family spans multiple volumes. The backup set to be processed by the restore operation starts on an earlier volume than the one inserted into the named device.

    Replace the current volume with a volume containing the start of the target backup set. For more information, see "Error 3249" in SQL Server Books Online.

  • The restore operation has completely processed the media family on the named device and is now ready to reuse the device to restore one of the remaining media families.

    Replace the current volume with the first volume of a media family that has not yet been processed. For more information, see "Error 3251" in SQL Server Books Online.

  • The BACKUP operation that created the backup set did not finish successfully.

    Either restore a different database backup, if restoring a database backup, and apply transaction logs; or apply the next transaction log backup, if restoring a transaction log backup. For more information, see "Error 3256" in SQL Server Books Online.

  • The volume on the named device does not belong to the same RAID media set as the other volumes being processed. 

    Either remove the offending volume and insert the next volume of the media family, for tape media sets; or, for disks, reissue the command, naming only those backup devices part of the same RAID media set. For more information, see "Error 3258" in SQL Server Books Online.

  • The server will not initialize the volume inserted into the named device as a continuation volume for the designated media family because the RAID media set spans multiple volumes. 

    Replace the current volume with a fresh tape that can be overwritten. For more information, see "Error 3263" in SQL Server Books Online.

  • The server is too busy to perform the backup or restore operation.

    Retry the operation after reducing the server load. For more information, see "Error 3267" or "Error 3627" in SQL Server Books Online.

  • If the trunc. log on chkpt. database option is enabled, the log is truncated when periodic checkpoints occur. Only full database and differential database backups are allowed because the log has been truncated and any log backups made would be unusable.

    Either perform full and differential backups, leaving trunc. log on chkpt. enabled, or maintain a full set of transaction log backups, disabling trunc. log on chkpt. For more information, see "Error 4208" in SQL Server Books Online.

  • To restore the database after failure you must begin with either a full database backup or a complete set of file backups.

    Perform a full database backup before backing up the transaction log. For more information, see "Error 4214" in SQL Server Books Online.

  • The restore operation found a gap between the last restore and the transaction log that you attempted to apply.

    Apply the transaction log backups in the order they were created originally. For more information, see "Error 4305" in SQL Server Books Online.

  • No further restore operations can be performed after a database has been recovered. 

    Restart the restore sequence and use the NORECOVERY option on all but the final RESTORE statement. For more information, see "Error 4306" in SQL Server Books Online.

  • You cannot recover the database to the state that it was in at the time the current log backup occurred. At least one file was modified; therefore, recovery is not possible because the database will be left in an inconsistent state.

    Recover the database to either its most recent state or a specific point in time. For more information, see "Error 4318" in SQL Server Books Online.

Troubleshooting Orphaned Users

When restoring a database backup to another server, you may experience a problem with orphaned users. This scenario displays and resolves the problem:

  1. Alias the login janetl to dbo by executing sp_addlogin

    sp_addlogin 'janetl', 'dbo'
    
  1. Back up a database. In this example, back up Northwind

    BACKUP DATABASE Northwind
    

TO DISK = 'c:\mssql7\backup\northwnd'

  1. Drop the database that was just backed up. 

    DROP DATABASE Northwind
    
  1. Drop the login. 

    sp_droplogin 'janetl'
    
  1. Restore the backed up database. 

    RESTORE DATABASE Northwind
    

FROM DISK = 'c:\mssql7\backup\northwnd'

The **janetl** login is not allowed into the **Northwind** database unless the **guest** login is allowed. Even though the **janetl** login has been deleted, it still shows up (as an orphaned row) in the **sysusers** table: 

<pre IsFakePre="true" xmlns="https://www.w3.org/1999/xhtml">USE Northwind

SELECT * FROM sysusers WHERE name = 'janetl'

To resolve orphaned users
  1. Add a temporary login using sp_addlogin. Specify the security identifier (SID) (from sysusers) for the orphaned user. 

    sp_addlogin @loginame = 'nancyd',
    

@sid = 0x32C864A70427D211B4DD00104B9E8A00

  1. Drop the temporary alias that belongs to the aliased SID using sp_dropalias

    sp_dropalias 'nancyd'
    
  1. Drop the original user (now orphaned) using sp_dropuser

    sp_dropuser 'janetl'
    
  1. Drop the original login using sp_droplogin

    sp_droplogin 'nancyd' 

Recovery Performance

Recovery time is determined by how much work has been done since the last checkpoint, and by how much work has been done by all active transactions at the time of the server crash. Microsoft SQL Server uses the recovery interval configuration option to set the maximum number of minutes per database that SQL Server needs to recover databases. This recovery interval option controls checkpoint frequency. For an online transaction processing (OLTP) system (using short transactions), recovery interval is the primary factor determining recovery time.

After installation, SQL Server sets recovery interval to a default value of 0. As long as recovery interval is at the default setting and long-running transactions are not present, recovery for each database should take approximately one minute or less. If long-running transactions were active at the time of the server crash, recovery time is controlled by the time it takes to roll back the effects of these transactions.

If recovery routinely takes significantly longer than one minute for a database, recovery interval has a value of 0, and there are no long-running transactions to roll back, consider contacting your primary support provider to resolve the recovery performance problem.

Recovery reports progress (based on the virtual log files for a database). At the beginning of recovery, recovery analyzes and scans the log since the last checkpoint. Based on the analysis phase, recovery estimates how much log will be read during recovery. The amount of log read is used to report recovery progress.

If recovery interval is changed from the default value, database recovery takes that many times longer to complete. For example, if recovery interval is changed to 10, recovery would take approximately 10 times longer to complete than if recovery interval remained at the default setting of 0.

When growing the log, use larger chunks to ensure a shorter startup time for SQL Server. The greater the number of small chunks, the longer SQL Server takes to initialize them.

If a long-running transaction is terminated, let the server finish the rollback process. If you are concerned about the length of the rollback process, ask your system administrator to confirm that activity is taking place on the server. Terminating the server process during the rollback of a long-running transaction results in long recovery time.

If you have a long-running transaction and a crash occurs during this transaction, SQL Server begins the recovery process. This may take some time. If you are concerned that this recovery process is taking too long and you believe it is halted, contact your system administrator.

Insufficient Disk Space

During recovery, it is a rare but possible occurrence for the server to require additional log or data space. If additional space is unavailable and either the log or data files cannot grow, the server:

  • Reports error message 9002 or 1105 in the Microsoft SQL Server error log. 

  • Marks the database as suspect. 

  • Takes the database offline.

To resolve the 9002 error message and bring the database online
  1. Free disk space on any disk drive containing the log file for the related database. Freeing disk space allows the recovery system to grow the log file automatically. 

  2. Reset the suspect status by executing sp_resetstatus

  3. Run recovery by executing DBCC DBRECOVER (database). 

Or

  1. Free disk space on a different disk drive. 

  2. Move the transaction log files with an insufficient amount of free disk space to the disk drive in Step 1. 

  3. Detach the database by executing sp_detach_db

  4. Attach the database by executing sp_attach_db, pointing to the moved files. 

Or

  • Add a log file to the suspect database and run recovery on the database by executing sp_add_log_file_recover_suspect_db
To resolve the 1105 error message and bring the database online
  1. Free disk space on any disk containing a file in the filegroup mentioned in the 1105 error message. Freeing disk space allows the files in the filegroup to grow. 

  2. Reset the suspect status by executing sp_resetstatus

  3. Run recovery by executing DBCC DBRECOVER ( database )

Or

  1. Free disk space on a different disk drive. 

  2. Move the data files in the filegroup with an insufficient amount of free disk space to the disk drive in Step 1. 

  3. Detach the database by executing sp_detach_db

  4. Attach the database by executing sp_attach_db, pointing to the moved files. 

Or

  • Add a data file to the suspect database and run recovery on the database by executing sp_add_file_recover_suspect_db.