Maintaining Remote BLOB Store

The remote BLOB store (RBS) Maintainer is supporting process that performs periodic garbage collection and other maintenance tasks for an RBS deployment. These tasks must be scheduled using Windows Task Scheduler or SQL Server Agent to run for each database that uses RBS. The maintainer must be provisioned through command line parameters or an XML file. In case of mirrored or replicated databases, the maintainer needs to be run against any one instance. The RBS Maintainer can perform the following functions:

  • Garbage Collection

  • RBS Maintenance Tasks

  • RBS and BLOB Store Consistency Checks

Running the RBS Maintainer

RBS requires that you define a connection string to the database before it can run. This string is stored in a configuration file that is normally created during installation. If RBS is installed manually, however, this configuration file will not be automatically created. The maintainer can be run manually by executing the Microsoft.Data.SqlRemoteBlobs.Maintainer.exe program with the following parameters:

  • ConnectionStringName <string name> - Specifies the name of the connection string for the configuration file. The default name created during RBS setup is RBSMaintainerConnection. This option is required.

  • Operation <space separated list of operations to perform> - Lists the garbage collection operations to perform. The possible values of this parameter are: ConsistencyCheck, GarbageCollection, Maintenance, ConsistencyCheckForStores, and ForceFinalize. This option is required.

  • GarbageCollectionPhases <phases> - Lists the garbage collection phases to perform. Phases are specified in this parameter by a set of representative letters. A value of r specifies a reference scan, d specifies delete propagation, and o for orphan cleanup. This option is required if garbage collection is to be performed

  • ConsistencyCheckMode <c, r or b> - Specifies the type of consistency check to be performed. A value of c specifies that only a check will be performed, r specifies that a check will be performed and any issues found will be repaired, and b for a check, repair and rebuild of the internal data structures. This option is required if a consistency check is to be performed

  • ConsistencyCheckExtent <m or c> - Specifies the extent of the consistency check. A value of m specifies that only metadata will be checked; individual BLOBs will remain unchecked if this option is specified. A value of c indicates that a complete check will be performed. The default value is m.

  • ConsistencyCheckForStores <Space separated list of blob store names> - Lists the names of the BLOB stores that will be checked for consistency. By default, all BLOB stores are checked.

  • TimeLimit <time in minutes> - Specifies the amount of time that is available for the maintainer to perform its tasks.

A separate RBS Maintainer task needs to be scheduled for every database using RBS. The following steps describe how to schedule a Maintainer task:

  1. Add a connection string to the <RBS_INSTALL_DIR>\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config file for the Maintainer task to be performed. The RBS installer creates one connection string, named "RBSMaintainerConnection", using the connection information provided during setup, but new connection strings must be added for every additional database.

    The RBS Maintainer connection strings are stored in an encrypted format so, in order to add additional connection strings, either the new strings must be encrypted or the all of the connection strings must be decrypted. Encrypted strings must be added one at a time, but all of the connection strings can be decrypted at once using the %windir%\microsoft.net\framework\<version>\aspnet_regiis.exe tool, which is distributed as a part of the .net framework.

    Running the following commands will decrypt the connection strings and store the results in a web.config file.

    rename Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config web.config
    aspnet_regiis -pdf connectionStrings .
    

    After that, strings can be added in decrypted form and the file can then be encrypted back and renamed to Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config using the following commands:

    aspnet_regiis -pef connectionStrings . -prov DataProtectionConfigurationProvider
    rename web.config Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config
    
  2. Create a windows scheduler task to execute the Maintainer task for each applicable database. If run in GUI mode, the RBS installer will automatically create a maintainer task but, if command line mode was used, then the following steps must be performed each time a new Maintainer task is scheduled:

    1. Run the Task Scheduler in Administrative Tools.

    2. Create a new task.

    3. On the actions tab select Add New Action, then set the type to Start a program and point it to the maintainer binary <RBS_INSTALL_DIR>\Maintainer\Microsoft.Data.SqlRemoteBlobs.Maintainer.exe, adding any desired arguments. The default values created by installer are "-ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120".

    4. On the Triggers tab, add a new trigger and schedule the task. Low system activity times are recommended.

    5. Set an account with sufficient database permissions to run the maintainer task.

    6. Mark the option Run whether user is logged on or not.

Garbage Collection

Garbage collection in RBS is performed passively. References to BLOBs are counted by looking at the list of BLOB IDs stored by the application in its RBS table columns at the time of garbage collection.

Any BLOB references that are present in the RBS auxiliary tables but absent in any RBS column in application tables, as well as BLOBs that are not present in any RBS column and were created before the Orphan Cleanup Time Window, described below, are assumed to be deleted by the application and will be garbage-collected.

Because passive garbage collection tabulates BLOB references from the RBS columns of an application's tables, every RBS column should have a valid index before it can be registered with RBS.

This lazy garbage collection is performed by the RBS Maintainer tool. It should be scheduled to run at non-busy times to reduce the impact on normal database operations.

RBS Garbage Collection is performed in three steps:

  • Reference scan. The first step is to compare the contents of the application's RBS tables with RBS' own internal tables and determine which BLOBs are no longer referenced. Any unreferenced BLOBs are marked for deletion.

  • Delete propagation. The next step is to determine which BLOBs have been marked for deletion for a period of time longer than the garbage_collection_time_window value and delete them from the BLOB store.

  • Orphan cleanup. The final step is to determine if any BLOBs are present in the BLOB store, but absent in the RBS tables. These orphaned BLOBs will then be deleted.

Configuring RBS Garbage Collection

Garbage collection can be configured by adjusting several maintainer and database settings. The Maintainer settings are listed below:

  • Maintainer schedule. This setting determines how often the maintainer will be executed. There is no default setting, as the user must specify a setting during RBS installation.

  • Task Duration. This setting determines the maximum length that a single maintainer execution is allowed to run. The default setting is 2 hours.

These Maintainer settings should be configured so that the Maintainer activity has minimal impact on regular activity. The database garbage collection settings are described in Configuring Remote BLOB Store.

Checking RBS Consistency

The RBS Maintainer verifies the integrity of RBS BLOB references and corrects any errors that are found. It performs several consistency checks for the database, such as verifying that indexes exist for the RBS columns, and verifying that all BLOBs referenced by application tables exist in RBS.

Auxiliary Table Consistency Check

The Auxiliary Table Consistency Check verifies that the RBS auxiliary tables are in a consistent state. The Check performs the following:

  1. Verifies that each RBS table column has a valid index.

  2. Verifies that registered applications RBS table columns exist, have enabled, valid indexes, and have the correct column type.

The following consistency checks are optional and can be omitted:

  1. Verifies that all of the blobs referenced in the application tables are present in the RBS tables.

  2. Verifies that no blobs are marked as both in use and deleted.

If the consistency check discovers any problems they will be logged and the RBS Maintainer will attempt to fix them by creating missing index entries, un-registering missing columns, or marking in-use BLOBs as not deleted.