Understanding recovery paths is important if you use differential or log backups, and you recover a database to a previous point in time by either of the following methods:
Performing a point-in-time restore
Performing a recovery without first restoring all of the log backups or most recent differential backup.
If you recover a database to an earlier recovery point and start using the database from that point, a new recovery path originates. The recovery path is the sequence of data and log backups that have brought a database to a particular point in time, either by regular database use or by a specific data and log restore. A recovery path consists of a unique set of specific transformations that have evolved the database over time, yet maintained the consistency of the database. The following illustration shows the relationship between a recovery point and the resulting recovery paths.
The following situations create a new recovery path because the database is not restored to the "end of time". Thereafter, backups exist that can take the database down two or more recovery paths, all of which use the same range of LSNs.
Restoring a full database backup and recovering the database without using any other type of backup.
Recovering the database at the end of a differential backup other than the most recent differential backup.
Restoring a full database backup and a differential database backup and recovering the database without applying existing transaction log backups.
Recovering the database at the end of a transaction log backup other than the most recent transaction log backup.
Recovering the database at a specific time or a marked transaction within a transaction log backup.
In general, a recovery point starts a new recovery path if the recovery point causes transactions to be rolled back. Preexisting backups may now have log sequence numbers (LSNs) greater than the LSN of this recovery point. The LSNs in these backups exist on a different recovery branch from the new branch that is created by the current recovery operation.
Best practice To avoid creating a recovery path that has multiple recovery forks, perform a complete set of data backups as soon as possible after you recover the database. This approach guarantees that all backups are taken on a single recovery branch. To verify this, you can look at the last_recovery_fork_guid column in the backupset table or RESTORE HEADERONLY results set after backing up the data.
Initially, all the backups of a database form a single recovery path, as shown in the following illustration. In this illustration, the recovery path includes a database backup, taken at time t1, and three log backups, taken at times t2, t3, and t4.
The following illustration shows a recovery fork that results from recovering the database to an old point in time. A problem with backup t4 causes the database administrator to recover the database at the end of the t3 log backup. This restore causes a recovery fork. At time t5, a new log backup starts a new recovery branch, recovery branch 2.
The t5 log backup contains recovery-fork metadata that connects this backup to the t3 log backup on recovery branch 1. For information about recovery-fork metadata, see "Managing Recovery Forks," later in this topic.
The example shown in the preceding illustration creates a new recovery path, which is shown in the following illustration. The new recovery path includes some of the backups on recovery branch 1 (t1 through t3) and every log backup on recovery branch 2 (t5 through t9). From the perspective of this recovery path, log backup t4 is obsolete.
After a point-in-time restore, the next backup always causes a recovery fork. In the following illustration, a point-in-time restore is completed midway through the t4 log backup. Recovering the database to that point in time causes a recovery fork. Then, a log backup is created on the recovered database at time t5, establishing recovery branch 2 and creating a new recovery path. The first log backup on the new branch, t5, contains the same first LSN as log backup t3 and replaces it. Therefore, both the t3 and t4 backups are obsolete on the new recovery path.
To restore backups on this new recovery path, the restore sequence is: t1, t2, and t5. As future backups are taken on recovery branch 2, they will be incorporated in the new recovery path.
To Restore and Roll Forward Along an Old Path
Typically, when multiple recovery paths exist, the newest of them is the preferred path for restoring the database. We recommend that you avoid using an old recovery path. However, if you have to, you can roll forward along an old recovery path by following the sequence of backups taken before the creation of the current recovery path. For example, you can use backups taken before a point-in-time recovery to reach later points in time along the old path.
For example, based on the backups created in the previous illustration, after log backup t5 is created, it is still possible to restore from the full database backup taken at t1 to the end of log backup t4. This is on the old recovery path.
To Restore and Roll Forward from an Old Path to a New Path
The SQL Server Database Engine prevents a restore sequence from by using backups that do not go together, that is, that try to roll forward along different recovery paths. This restriction maintains the consistency of a database after a recovery.
To restore and roll forward along a new recovery path, construct distinct restore sequences for the backups before the recovery point and for the backups after the recovery point:
Restore the backups taken were before the recovery that introduced the new recovery path. Exclude the backup that contains the recovery point.
Roll forward along the new recovery path by restoring the backups that have been taken since the recovery path was created.
A recovery branch is a range of LSNs that share the same GUID. A recovery path describes a range of LSNs from a start point (LSN,GUID) to an end point (LSN,GUID). The range of LSNs in a recovery path may traverse one or more recovery branches from start to end. A new recovery branch originates when a database is created and when RESTORE WITH RECOVERY generates a recovery fork.
A recovery fork. is the point (LSN,GUID) at which a new recovery branch is started, every time a RESTORE WITH RECOVERY is performed. Each recovery fork determines a parent-child relationship between the recovery branches.
Recovering the database sets the whole database state, including the next LSN, to the recovery point. LSNs are then reused, starting with the fork_point_lsn. When constructing a restore sequence, therefore, backups must be linked by recovery fork and also by LSN, because the same LSN might exist on more than one fork. The following illustration shows LSN reuse. It shows how LSNs are reused in different recovery forks. The green boxes in the illustration indicate two backups that use the same LSN.
If a restore sequence must incorporate backups that traverse a recovery fork, the restore sequence must be constructed so that the backups that are used follow the correct recovery path to the recovery point. For this purpose, backups include a first recovery fork GUID and a last recovery fork GUID.
These GUIDs, together with other metadata that is relevant to tracking recovery paths, are stored in the backupset history table and are also returned by the RESTORE HEADERONLY statement. The following table summarizes the metadata values that are relevant to constructing restore sequences that traverse a recovery fork. Notice that the column names for these values are different for the history table and the result set of the RESTORE HEADERONLY statement:
backupset column name
RESTORE HEADERONLY column name
First recovery fork GUID
ID for the starting recovery fork.
Last recovery fork GUID
ID for the ending recovery fork.
Log sequence number of the first or oldest log record in the backup set.
Log sequence number of the next log record after the backup set.
Fork point LSN
Log sequence number of a fork point if the first recovery point GUID is not equal (≠) to the last recovery point GUID. Otherwise, no recovery fork occurs in the backup, and the fork point LSN is NULL.
Differential base GUID
For a single-based differential backup, the value is the unique identifier of the differential base.
For multibased differentials, the value is NULL, and the differential base must be determined at the file level. For more information, see backupfile (Transact-SQL).
For nondifferential backup types, the value is NULL.
The remainder of this discussion uses only the names of values in the backupset history table.
The last recovery fork GUID and first recovery fork GUID are used to link the backups to make sure that the sequence follows the correct fork. For each log backup in the sequence to be restored, first_recovery_fork_guid must equal last_recovery_fork_guid of the prior backup in the sequence.
Data and differential backups must also be linked.
If the log backup contains both the last LSN of a full database backup or a differential database backup and a fork point, the linking test depends on the location of the last LSN relative to the fork point.
The linking tests are as follows, using values from backupset:
If last_lsn is less than or equal to fork_point_lsn, the last_recovery_fork_guid of the data or differential backup must equal the first_recovery_fork_guid of the log backup. The following illustration shows a case in which last_lsn is less than fork_point_lsn.
If last_lsn is greater than fork_point_lsn, the last_recovery_fork_guid of the data or differential backup must equal the last_recovery_fork_guid of the log backup. The following illustration shows a case in which last_lsn is greater than fork_point_lsn.
For a differential backup, locate the differential base by using backupset.differential_base_guid.
If the differential is multibased, backupset.differential_base_guid is NULL, and you must determine the differential bases file-by-file by using backupfile.differential_base_guid.