Best Practices for Deleting Data from the Data Warehouse

  • Double the size of the Data Warehouse database to allow for deleting data.

    The space needed for the ID tables that are used by the Data Deletion DTS task must be included in the size estimate for the Data Warehouse database. We recommend that you double the amount of space that you allocate. For example, for each gigabyte (GB) of data in the Data Warehouse, you should allocate 2 GB to allow for data deletion.

  • Delete ID tables if the delete fails.

    The delete code contains a switch that creates ID tables in the Data Warehouse database to delete data. This enables the Data Deletion DTS task to recover and continue from a failover or short connection failure. If the data deletion process fails, or the system experiences an extended connection failure, you must delete these tables manually.

    If the delete operation is completed successfully, the Data Warehouse deletes the ID tables. The table names contain a GUID and the prefix MSCSTemp. For example, the name of the ID table for TaskHistory might be MSCSTempTaskHistory_{490CD250-C540-47B7-80BD-B308F3B356C7}.

    The UseTempTablesForDelete environment variable controls where the ID tables are created. The default value of UseTempTablesForDelete is false. This means that the ID tables are created in the Data Warehouse database. Setting this variable to true will use the tempdb database for delete functions, and you will not be able to recover from a temporary connection failure, for example if the SQL Server service is stopped and restarted.

  • Delete Web log file data frequently to improve import times.

    Run the Data Deletion DTS task frequently to delete detailed Web log file data. Over time, Web log file data can accumulate to the point where it significantly affects import time and memory capacity of the Data Warehouse. For information about how to run the Data Deletion DTS task, see How to Run the Data Deletion DTS Task.

  • Delete orphan data from the Data Warehouse database to improve import times.

    Use the DeleteOrphans_UriQuery, DeleteOrphans_Uri, and DeleteOrphans_LogUser stored procedures to delete orphan data from the Data Warehouse database. For information about how to delete orphan data from the Data Warehouse database, see Deleting Orphan Data from the Data Warehouse.

  • Run the Report Preparation DTS task in full mode after you run the Data Deletion task.

    Follow this step to reprocess the cube partitions. By doing this, deleting old log imports will not adversely affect report aggregations.

See Also

Other Resources

Deleting Data from the Data Warehouse