Data Warehouse Deployment Best Practices

The following best practices apply to the Commerce Server Data Warehouse.

  • For optimum performance, deploy the Data Warehouse onto three computers, and Business Desk onto a fourth computer.

    Computer Software
    Requirements
    Hardware
    (Large sites)
    Hardware
    (Medium sites)
    1. Log import Commerce Server 2002
    Data Warehouse resource (unpack)
    Analysis Services client tools
    8 processors
    4GB RAM
    200 GB disk space (minimum)
    4 processors
    4GB RAM
    100GB disk space
    2. Data Warehouse SQL Server 2000
    SQL Server Client Tools
    Commerce Server 2002
    Data Warehouse resource (unpack)
    8 processors
    4GB RAM
    200 GB disk space (minimum)
    4 processors
    4GB RAM
    100GB disk space
    3. Analysis Server Analysis Server
    Commerce Server
    Data Warehouse resource (unpack)
    4 processors
    4GB RAM
    100GB or larger hard drive for the OLAP database
    4 processors
    4GB RAM
    100GB or larger hard drive for the OLAP database

    Analysis Services must be installed on any computer onto which you unpackage the Data Warehouse resource. After you unpackage the resource, you can remove Analysis Services. For example, you can install Analysis Services on the computer on which you are running Commerce Server Site Packager, but you can unpackage the Data Warehouse resource to another computer. When the unpackaging process is complete, you can uninstall Analysis Services from the computer on which you were running Site Packager.

    Analysis Services client tools must be installed on the computer running the DTS import tasks.

  • On the Analysis Server, set Page file size to 4GB per volume for large-scale data.

    This configuration enables Analysis Server to accommodate a large committed virtual address space when processing large cubes. Analysis Server caches all dimension data in memory.

    It is possible to set the Page file size as high as 10GB, but Commerce Server has not been tested beyond the 4GB size. For instructions, see the Knowledge Base article 237740 "How to Overcome 4,095-MB Paging File Size Limit in Windows" available at https://go.microsoft.com/fwlink/?LinkId=15058.

To set the Windows 2000 paging file size

  1. On your desktop, right-click My Computer, and then click Properties.
  2. In the System Properties dialog box, on the Advanced tab, in the Performance section, click Performance Options.
  3. In the Performance Options dialog box, in the Application response section, verify that Background services is selected.
  4. In the Virtual memory section, click Change.
  5. In the Virtual Memory dialog box, in the Paging file size for selected drive section, in the Maximum size (MB) box, type 4095, and then click Set.
  6. Click OK three times to close all dialog boxes.
  • Deploy the Data Warehouse database, the transaction log file, and the page file onto different drives.

    This configuration improves data throughout by enabling simultaneous access to the physical device.

  • Use the Filegroups feature of SQL Server to simplify maintenance and improve performance.

    The Filegroups feature helps you balance data load across multiple disks, and use parallel threads to improve data access. If your hardware is I/O bound, you may want to use the Filegroups feature in SQL Server. To determine whether your hardware is I/O bound, see "Physical Disk performance counters" in Windows 2000 Help.

    For example, you can separate the Request, Visit, and VisitInfo tables into three different file groups. (Separate LogUser/URI into a separate file group if these dimensions are large). Put aggregates in separate file groups. For information about file groups, see Creating New Data Warehouse File Groups.

  • To prevent "out of memory" errors in the Data Warehouse event log, specify a 3 GB switch in the Windows 2000 Boot.ini file on the computer hosting the Data Warehouse (SQL Server), and the computer hosting Analysis Server.

    If you are running a large site, you will receive "out of memory" errors if the Web server log import process runs out of virtual memory. To avoid this error, enable the 3 GB switch in the Boot.ini file in Windows 2000.

    Enabling the 3 GB switch requires two steps:

  1. Modify the Boot.ini file. For more information about the 3 GB switch, see Windows 2000 Advanced Server Help.

  2. Modify the executable running the log import DLL. Depending on how you import data into the Data Warehouse, you must modify the appropriate executable file.

    For instructions, see Running the Web Server Log Import DTS Task for a Large Site.

    In addition, for Analysis Services, increase the Process Buffer Size from the default value. For example, set the buffer size to 2 GB and allocate 4 GB of physical memory.

  • Set the file size of the Data Warehouse database to improve import performance.

Automatically growing the database file significantly reduces Data Warehouse import performance. Increasing the size of the database will help to reduce the overhead of expanding the database at runtime. You can calculate the size of the Data Warehouse database by dividing the total log file size by 2. Using a fixed size for the database instead of growing the file will prevent the significant reduction of import performance caused by growing the database file.

To disable automatic file growth

  1. In SQL Server Enterprise Manager, expand a server group, and then expand a server.
  2. Expand Databases, right-click the Data Warehouse database, and then click Properties.
  3. In the <database> Properties dialog box, on the Data Files tab, in the Maximum file size section, select Restrict file growth (MB), in the Restrict file growth (MB) box,****type the maximum size of the database, and then click OK.

If you want to use the SQL Server automatic file growth feature, set the size of the Data Warehouse database to expand to a particular size, instead of by a percentage. Setting the maximum size of a database to a particular size instead of to a percent will allow you to manage the size of the database and disk capacity more closely.

To set the file growth parameters of the Data Warehouse database

  1. In SQL Server Enterprise Manager, expand a server group, and then expand a server.
  2. Expand Databases, right-click the Data Warehouse database, and click Properties.
  3. In the <database> Properties dialog box, on the Data Files tab, in the Maximum file size section, select Restrict file growth (MB), in the Restrict file growth (MB) box,****type the maximum size of the database, and then click OK.
  • Use only the default instance of SQL Server on the production server.

To improve performance, do not run multiple instances of SQL Server on the production server.

  • Set the environment variable "StoreClientIP" to True only if you do not have any reporting needs for IP addresses.

To set StoreClientIP

  1. On your desktop, right-click My Computer, and then click Properties.

  2. In the System Properties dialog box, on the Advanced tab, click Environment Variables.

  3. In the Environment Variables dialog box, in the User variables section, click New.

  4. In the New User Variable dialog box, do the following:

    Use this To do this
    Variable Name Type StoreClientIP.
    Variable Value Type True.
  5. Click OK.

    Existing processes that are running while you are resetting a new environment variable do not update the extant executing processes.

  • Set the UseExternalReferrerForInference environmental variable to false to use timeouts only to close open user visits while importing.

Commerce Server infers individual visits to a site with a combination of the referrer domain from which a user visits a site and a configurable timeout value for the visit (the default value is 30 minutes), after which Commerce Server considers the visit ended. Other Log Analysis products use only a 30-minute timeout to calculate visits. Set the UseExternalReferrerForInference environmental variable to False to use only 30-minute timeouts to calculate visits.

In Commerce Server, the internal referrer is a combination of the Nonsecure host name property and the IIS application path property. During log processing, for each line of the log, the referrer field of the W3C log file (cs(referrer)) will contain a URI. Commerce Server will match the prefix of the W3C referrer field against the internal referrer. If they do not match, Commerce Server considers the hit referred by an external site, and the current user visit ended.

To set the UseExternalReferrerForInference environmental variable to false

  1. Right-click My Computer and then click Properties.
  2. In the System Properties dialog box, on the Advanced tab, click Environment Variables.
  3. In the Environment Variables dialog box, in the System variables section, click New.
  4. In the New System Variable dialog box, do the following:
    Use this To do this
    Variable Name Type UseExternalReferrerForInference.
    Variable Value Type F.
  5. Click OK three times to close all of the dialog boxes.
  • Failure to configure Commerce Server to recognize all internal domains for a site can result in an artificially high visit count.

Commerce Server infers individual visits to a site with a combination of the referrer domain from which a user visits a site and a configurable timeout value (the default value is 30 minutes). If you do not configure Commerce Server to recognize all internal domains for a site, Commerce Server considers the domains you do not configure as internal as external referrers and uses them to calculate new visits.

Setting the value of the UseExternalReferrerForInference environment variable to F (false) will turn off external referrer checking when calculating visits.

If you invoke the Web Log Import process from a command line, you can use the following statement to set the UseExternalReferrerForInference environment variable to false for that command shell and turn off external referrer checking when calculating visits:

   set UseExternalReferrerForInference=F

You can set the UseExternalReferrerForInference environment variable to true (T) and Commerce Server Web Log Import will use external referrer checking again.

Copyright © 2005 Microsoft Corporation.
All rights reserved.