Data Warehouse Deployment Best Practices

This section describes best practices related to the deployment of Commerce Server.

Distributing Across Multiple Computers

For optimal performance, deploy the Data Warehouse onto three computers. The following table lists the hardware and software requirements for distributing the Data Warehouse across multiple computers.

Computer

Software

Requirements

Hardware

(Large sites)

Hardware

(Medium sites)

Log import

  • Commerce Server

  • Data Warehouse resource (unpack)

  • Analysis Services client tools

  • 8 processors

  • 4 GB RAM

  • 200 GB disk space (minimum)

  • 4 processors

  • 4 GB RAM

  • 100 GB disk space

Data Warehouse

  • SQL Server core

  • SQL Server Client Tools

  • Commerce Server

  • Data Warehouse resource (unpack)

  • 8 processors

  • 4 GB RAM

  • 200 GB disk space (minimum)

  • 4 processors

  • 4 GB RAM

  • 100 GB disk space

Analysis Server

  • Analysis Server

  • Commerce Server

  • Data Warehouse resource (unpack)

  • 4 processors

  • 4 GB RAM

  • 100 GB or larger hard disk for the Online Analytical Processing database

  • 4 processors

  • 4 GB RAM

  • 100 GB or larger hard disk for the OLAP database

Before you unpack the Commerce Server Data Warehouse cubes, note the following items.

  • You must have Analysis Services client tools available on the computer from which you are unpacking.

  • You must have Analysis Services client tools available on the computer that is running the Data Transformation Services (DTS) import tasks. The computer that is running the DTS import tasks does not have to be the computer that will host the Data Warehouse. You may remove Analysis Services after you unpack the Data Warehouse.

  • If you are using Microsoft SQL Server 2005, you must install the SQL Server Client tools.

  • Do not unpack the Data Warehouse and other site resources to the same database because both databases use the same table names and this will cause failures.

File System Configuration

On the computer that is running Analysis Server, set the page file size to 4 GB per volume for large-scale data. This configuration lets Analysis Server handle a large committed virtual address space when it processes large cubes. Analysis Server caches all dimension data in memory. You can set the Page file size as high as 10 GB, but Commerce Server has not been tested beyond the 4 GB size. For more information about how to use a page file that is larger than 4 GB, see https://go.microsoft.com/fwlink/?LinkId=78817.

To set the Windows Server 2003 page file size

  1. On your desktop, right-click My Computer, and then click Properties.

  2. In the System Properties dialog box, on the Advanced tab, in the Performance section, click Settings.

  3. In the Performance Options dialog box, on the Advanced tab, in the Processor Scheduling section, verify that Background services is selected.

  4. In the Virtual memory section, click Change.

  5. In the Virtual Memory dialog box, in the Paging file size for selected drive section, in the Maximum size (MB) box, type 4095, and then click Set.

  6. Click OK three times.

To set the Windows Server 2008 page file size

  1. Click Start, right-click Computer, and then click Properties.

  2. In the Computer name, domain, and workgroup settings section, click Changesettings.

  3. In the Systems Options dialog box, on the Advanced tab, in the Performance section, click Settings.

  4. In the PerformanceOptions dialog box, on the Advanced tab, in the Processor scheduling section, verify that Background services is selected.

  5. In the Virtual memory section, click Change.

  6. In the Virtual Memory dialog box, in the Paging file size for selected drive section, in the Maximum size (MB) box, type 4095, and then click Set.

  7. Click OK three times.

Deploy the Data Warehouse database, the transaction log file, and the page file to different drives. This configuration improves data throughput by enabling simultaneous access to the physical device.

Use the Filegroups feature of SQL Server to simplify maintenance and improve performance.

SQL Server filegroups feature help you balance the data load across multiple disks, and use parallel threads to improve data access. If your hardware is I/O bound, you may want to use the Filegroups feature in SQL Server. To determine whether your hardware is I/O bound, see "Physical Disk performance counters" in Windows Help.

For example, you can separate the Request, Visit, and VisitInfo tables into three different file groups. (Separate the LogUser/URI table into a separate file group if these dimensions are large). Put aggregates in separate file groups. For information about file groups, see Creating New Data Warehouse Filegroups.

Database Memory Configuration

If you are running a large site, you will receive "out of memory" errors if the Web server log import process runs out of virtual memory. To avoid this error, enable the /3GB switch in the Boot.ini file on the computer that is hosting the Data Warehouse (SQL Server) and on the computer that is hosting Analysis Server. Additionally, increase the Analysis Services Process Buffer Size from the default value. For example, set the buffer size to 2 GB and allocate 4 GB of physical memory.

Database Sizing

Set the file size of the Data Warehouse database to improve import performance.

Automatically growing increasing, or growing, the size of the database file significantly reduces Data Warehouse import performance. Increasing the size of the database will help reduce the overhead of expanding the database at runtime. You can calculate the size of the Data Warehouse database by dividing the total log file size by 2. .

If you want to use the SQL Server Restrict file growth feature, set the size of the Data Warehouse database to expand to a particular size, instead of by a percentage. If you set the maximum size of a database to a particular size instead of to a percentage, you can manage the size of the database and disk capacity more closely.

To set the file growth parameters of the Data Warehouse database

  1. If you are using SQL Server 2005, start SQL Server Management Console.

  2. Expand a server group, and then expand a server.

  3. Expand Databases, right-click the Data Warehouse database, and then click Properties.

  4. In the <database> Properties dialog box, on the Data Files tab, in the Maximum file size section, select Restrict file growth (MB), in the Restrict file growth (MB) box, type the maximum size of the database, and then click OK.

Use only the default instance of SQL Server on the production server.

IP Addresses Reporting

Set the environment variable StoreClientIP to F (false) if you do not have to report IP addresses.

To set the StoreClientIP environment variable on Windows Server 2003

  1. On the desktop, right-click My Computer, and then click Properties.

  2. In the System Properties dialog box, on the Advanced tab, click Environment Variables.

  3. In the Environment Variables dialog box, in the User variables section, click New.

  4. In the New User Variable dialog box, in the Variable Name box, type StoreClientIP, in the Variable Value box, type F, and then click OK three times.

To set the StoreClientIP environment variable on Windows Server 2008

  1. Click Start, right-click Computer, and then click Properties.

  2. In the Computer name, domain, and workgroup settings section, click Changesettings.

  3. In the SystemOptions dialog box, on the Advanced tab, click Environment Variables.

  4. In the Environment Variables dialog box, in the User variables section, click New.

  5. In the New User Variable dialog box, in the Variable name box, type StoreClientIP, in the Variable value box, type F, and then click OK three times.

Processes that are running when you change the value of an environment variable do not use the new value. For example, if you change trace flags, you must stop and restart any trace processes that are running to make the new trace flags take effect.

Calculating Site Visits

Commerce Server infers individual visits to a site by using a combination of the referrer domain from which a user visits a site and a configurable time-out value for the visit (the default value is 30 minutes). After 30 minutes of inactivity, Commerce Server considers the visit ended. Other Log Analysis products use only a 30-minute time-out to calculate visits. Set the UseExternalReferrerForInference environmental variable to F (false) to use only 30-minute timeouts to calculate visits.

In Commerce Server, the internal referrer is a combination of the Nonsecure host nameproperty and the IIS application path property. During log processing, for each line of the log, the referrer field of the World Wide Web Consortium (W3C) log file (cs(referrer)) will contain a uniform resource identifier (URI). Commerce Server will match the prefix of the W3C referrer field against the internal referrer. If they do not match, Commerce Server considers the hit to be referred by an external site, and the current user visit is ended.

To set the UseExternalReferrerForInference environmental variable to false on Windows Server 2003

  1. Right-click My Computer and then click Properties.

  2. In the System Properties dialog box, on the Advanced tab, click Environment Variables.

  3. In the Environment Variables dialog box, in the System variables section, click New.

  4. In the New System Variable dialog box, in the Variable Name box, type UseExternalReferrerForInference, in the Variable Value box, type F, and then click OK three times.

To set the UseExternalReferrerForInference environmental variable to false on Windows Server 2008

  1. Click Start, right-click Computer, and then click Properties.

  2. In the Computer name, domain, and workgroup settings section, click Changesettings.

  3. In the SystemOptions dialog box, on the Advanced tab, click Environment Variables.

  4. In the Environment Variables dialog box, in the System variables section, click New.

  5. In the New System Variable dialog box, in the Variable name box, type UseExternalReferrerForInference, in the Variable value box, type F, and then click OK three times.

If you do not configure Commerce Server to recognize all internal domains for a site, it can cause an artificially high visit count.

Set the value of the UseExternalReferrerForInference environment variable to F (false) to turn off external referrer checking when you calculate visits.

If you start the Web Log Import process from a command line, you can use the following statement to set the UseExternalReferrerForInference environment variable to F (false) for that command shell. This will turn off external referrer checking when calculating visits.

set UseExternalReferrerForInference=F

You can set the UseExternalReferrerForInference environment variable to T (true) and Commerce Server Web Log Import will use external referrer checking again.

See Also

Other Resources

Best Practices for the Data Warehouse