Data Warehouse Configuration Best Practices

The following best practices apply to the Commerce Server Data Warehouse.

  • If you have multiple applications and log files, use a script to configure the Web server log import DTS task instead of using the user interface.

    For instructions, see Scripting the Web Server Log Import DTS Task.

  • When you change your site configuration settings, run the Configuration synchronization DTS task.

    You use the Configuration synchronization DTS task to import the site configuration data from the Administration database into the Data Warehouse. This differs from the Synchronize Values button on the Application Properties dialog box. When you click this button, Commerce Server reads the values from the Internet Information Services (IIS) metabase to the Administration database, and then from the Administration database to the Data Warehouse. For more information, see Configuring an Application.

    For performance purposes, the Data Warehouse stores the site configuration data, which includes site topology settings, instead of reading it from the Administration database each time it is needed.

    You can choose to synchronize the Data Warehouse with a single site or with all of the sites associated with the Data Warehouse. You must run the synchronize task before you run any DTS tasks that import or delete data. For instructions on running the Configuration synchronization DTS task, see Running the Configuration Synchronization DTS Task.

  • For the Profile data import DTS task, review the default value of the Query Interval property and the Import Start Date property.

    The Query Interval property controls how many import queries are sent to the online data source to retrieve data. The query interval is used to split the time between the time you last ran the Profile data import task and now into a number of intervals. The default value is 15 minutes. This means that the DTS task imports 15 minutes worth of data at a time, locking fewer records in the data source.

    Setting a smaller value will lock a smaller portion of the table but will increase the number of queries against the table and therefore the time the task takes to run. Setting the value to zero will run a single query, locking a significant portion of the table for the entire duration of the import and may degrade site performance.

    The Import Start Date property determines the starting date of import. By default, it is set to the current date. The first time you run the task you may need to change this value.

    If the Profile data import task is configured to run in incremental mode, the time the last run completed is used as the starting time for the next import. If the Profile data import task is configured to run in bulk mode, the value of the Import Start Date property is used as the starting time.

  • If user profile data is stored in an Oracle database, you must set the Date format environment variable before running the Profile data import task.

    Oracle uses a different date format than SQL Server. Set the Date format variable as follows:

    Set NLS_DATE_FORMAT = MM/DD/YYYY HH24:MI:SS
    
  • Review the default value of the FirstDayOfWeek property in your language version of SQL Server to determine the start day of the week for reports.

    For weekly aggregations of analysis reports, Commerce Server uses the default first day of the week for the given language version of the SQL Server product used. For the U.S. English version of SQL Server, the default first day of the week is Sunday.

    Analysis reports do not use the Start day of the week property specified in the Commerce Server Manager.

  • If you change the sa password for the SQL Server that hosts the Data Warehouse database, you must recreate the Transaction data import DTS task and the Product catalog import DTS task.

    These DTS tasks persist the sa password in them.

  • Always accept the default IIS option to log data in Greenwich time, which is the W3C standard.

    The Web server log import DTS task assumes that the data to be imported is in Greenwich time and processes the translation accordingly. Never select the IIS option to log data in local time, as that can cause problems with using the Date/Time Range property to import Web server log files.

    If you configure the Web server log import DTS task to import log files based on the Date/Time Range property, specify the date/time in Greenwich time.

  • Set the IncrementalOverlap value to the difference in time zones plus two hours

    When performing incremental imports, the Transaction data import and Profile data import DTS tasks use the last modified date field in the run time tables to detect when a record has been changed and needs to be imported into the Data Warehouse. As this field is a date/time value written by applications, the time zones and clocks of the systems writing the value may not be synchronized. This can lead to variations in the last modified date field that can potentially cause records to be skipped during the import process.

    For systems operating in different time zones, this interval should be increased so that the processing window covers the difference in time zones plus two hours.

    To override the default of two hours, add the following values to your Windows NT registry:

    DWORD : HEKY_LOCAL_MACHINE\Software\Microsoft\Commerce Server\TransactionImportDTS\IncrementalOverlap
    DWORD : HKEY_LOCAL_MACHINE\ Software\Microsoft\Commerce Server\ProfileDataImportDTS\IncrementalOverlap
    

    If the registry value is present, the Data Warehouse will use it as the number of hours subtracted from the start time of the last successful import instead of using the IncrementalOverlap value.

  • If you have a situation in which the Web server log import DTS task is not importing a file you want, or is importing one you do not want, check the configuration of the "Import all log files since last import" option.

    The Sitesummary table in the Data Warehouse stores the start date/time and end date/time data that you specify in the Import Web Server Logs dialog box.

    The Import all log files since last import option compares the end time value in the Sitesummary table with the date on the Web log file. This value is recorded in the local time, therefore, you may need to adjust it since the Web log files are logged in Greenwich time.

  • If client IP addresses are translated to a single IP address, use either cookies or user names in the Web server log import DTS task for inferring users.

    Two DTS tasks are affected by IP translations:

    • Web server log import task. When inferring users, if the IP address of some users is translated to a single IP, and if the users all use the same browser, then they will be treated as the same user.

      For example, if there are ten users whose IP address is translated to single IP address, and five of those users use browser "A" and the remaining five use browser "B", then all ten users will be treated as two users. This is the result if the IP address and browser type (instead of cookies or user names) are used for inferring users.

    • IP resolution task. When some IP addresses are translated to a single IP address, you may not get accurate domain information on all of the users visiting the site because the IP resolution task performs a look up on only one IP address.

    To use either cookies or user names to infer users, see Accuracy of Inferences.

  • If you use drillthrough on the Product Sales report or Shopping Basket Events report, use the CSDW_MakePrimaryProductCategoryInfo stored procedure to map each product to its primary parent category.

    When you use drillthrough to view the details of the Product Sales report or Shopping Basket Events report, more items may appear in the report details than would be expected from the count indicated in the PivotTable report. This behavior can occur when a product is mapped to multiple categories. This is by design—the product exists in multiple categories under the category currently being examined.

    If you want to ensure that only one row is returned for each product in the drillthrough results, you must ensure that each product is mapped to only one product category. Use the CSDW_MakePrimaryProductCategoryInfo stored procedure to map each product to its primary parent category. This ensures that each product will be mapped to one and only one category in the category hierarchy.

    For more information, see Using Drillthrough to View Product Sales or Shopping Basket Events.

Copyright © 2005 Microsoft Corporation.
All rights reserved.