Site Server - Creating Usage Profile Reports

August 1999 

Overview

This document describes how to create Usage Profile Reports from Internet Information Server (IIS) logs, using the Usage Import and Report Writer programs provided with Site Server 3.0 Commerce Edition. The Usage Profile Reports can then be used to perform a transaction cost analysis for your site. For more information on using the analysis tools, consult the documentation provided with Site Server 3.0 Commerce Edition. You can also obtain other documents from Microsoft TechNet or from our Web site at https://www.microsoft.com/siteserver. For information on how best to use the analysis tools, see the paper Microsoft Site Server Usage Analysis Capacity and Performance Analysis in the Microsoft Site Server 3.0 Commerce Edition Resource Kit. For more information on transaction cost analysis and site capacity planning, see the paper Using Transaction Cost Analysis for Site Capacity Planning, also in the Resource Kit.

Note When you install Site Server, be sure to select Microsoft® SQL Server™ as your database, and not Microsoft® Access.

To create Usage Profile Reports:

  • Create the analysis database. 

  • Import log files. 

  • Add look-up information. 

  • Create analysis reports. 

  • Perform weekly maintenance. 

Creating the Analysis Database

This section describes how to:

  • Calculate database size 

  • Create the database 

  • Initialize the database 

Calculating Database Size

The first step in creating an analysis database is to calculate its size. Database size is based on the total number of log files you want to store. In general, you should store at least one week's worth of log file data to provide a good data sample. If you have multiple Web servers generating logs, you should store the logs from all servers so you can reconcile sessions that span servers.

To calculate your database size 

  1. Determine what types of reports you plan to run, whether by month, by quarter, or by some other time period. 

  2. Estimate the size of the logs you plan to import during the specified period. 

  3. Double that value to obtain an estimate of the number of bytes you need for database tables, logs, and indexes. 
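
For example, assuming (hypothetically) that your servers generate about 50 MB of log data per day and that you plan to run monthly reports:

    50 MB per day x 30 days = 1,500 MB of raw log data 

    1,500 MB x 2 = approximately 3 GB for database tables, logs, and indexes 

These figures are illustrative only; substitute the log volumes you actually observe on your own servers.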

For a more thorough explanation of the calculations used to determine the size of your database, see the Microsoft Knowledge Base article 216559: "How to Estimate the Database Size for Site Server Analysis" in Microsoft TechNet.

Creating the Database

Use Microsoft SQL Server Enterprise Manager and your calculated database size to create the database and log device in which to store your analysis data. The Database Setup Wizard can do this for you. However, you will get better results by doing it manually.

Note If you used the values in the Site Server installation manual to create your database, delete that database and recreate it using your specific size calculations.
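
If you prefer to create the database with a script instead of through Enterprise Manager, the following sketch shows the general idea for SQL Server 7.0. The database name, file paths, and sizes are hypothetical; substitute the values from your own size calculation. (On SQL Server 6.5, you would instead create data and log devices with DISK INIT and then run CREATE DATABASE on those devices.)

-- Hypothetical example for SQL Server 7.0; adjust the name, paths, and sizes
-- to match your own calculation (here, roughly 800 MB of data and 200 MB of log).
CREATE DATABASE Analysis
ON ( NAME = Analysis_data,
     FILENAME = 'c:\mssql7\data\Analysis_data.mdf',
     SIZE = 800MB )
LOG ON ( NAME = Analysis_log,
         FILENAME = 'c:\mssql7\data\Analysis_log.ldf',
         SIZE = 200MB )
go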

Initializing the Database

Once you have created your database, use the Site Server Database Setup Wizard to initialize your database for use with the Usage Import and Report Writer programs. You can't import logs until this is done.

To run the Site Server Database Setup Wizard 

  1. On the Start menu, select Programs / Microsoft Site Server / Analysis / Database Setup. The Database Setup Wizard Screen appears. 

  2. Click Next. The Choose a Setup Mode screen appears. 

  3. Click Update an existing database, and click Next. The SQL Server Login screen appears. Figure 1 illustrates the SQL Server Login screen with sample entries. 

     

    Figure 1 SQL Server Login Screen 

  4. Enter your login information, and click Connect. The Confirm Your Database Setup screen appears.

  5. If the information shown is correct, click Finish to update your database with the schema necessary to run the analysis utilities. If the information is not correct, repeat Steps 1-4 until it is correct. 

Importing Log Files

When you have created your analysis database, you can start importing log files and other data. When you do this, it is a good idea to copy the log files to a dedicated server so that you don't affect the production server's performance or network traffic. You can copy the files manually, of course, but it is more efficient to use File Transfer Protocol (FTP) to copy the files. You can use the Scheduler utility from the Usage Import program to schedule the file transfers using FTP, then import the files and run the desired reports.

To import log files 

  1. On the Analysis menu, click Usage Import. When you run Usage Import for the first time, it prompts you for a database connection. 

  2. Enter the connection information for the analysis database. (Once you start running the Usage Import program, you can point to a different database or view the current database by selecting Open Database from the File menu.) 

Creating a New Log Data Source

To view a set of logs from a new source (Web site), you must first create a new Log Data Source, and set up a server and a site.

To create a new Log Data Source 

  1. On the View menu, click Service Manager. 

  2. On the Edit menu, click New Log Data Source. The Log Data Source Properties dialog box shown in Figure 2 appears. 

     

    Figure 2 The Log Data Source Properties Dialog Box 

  3. Click Auto Detect, even if you know the log type, and click OK. The Properties dialog box for the default server appears. 

  4. Set the properties as follows, and click OK:

    Server type: World Wide Web. 

    Directory index files: Names of the default pages for the site. 

    IP address: No entry. Only enter an IP address if you are using one log file to store data for several virtual servers. 

    IP port: 80, unless the default port is another value. 

    Local time zone: Casablanca (GMT). This setting provides correct offsets for the report because Casablanca does not observe daylight saving time. Use the Import Options dialog box, described in the Setting the Import Options section, to enter the appropriate offset for the log files' time zone. 

    Local domain: No entry. 

    The Site Properties dialog box for the default site, shown in Figure 3, appears. 

    Figure 3 Site Properties Dialog Box for the Default Site 

  5. In the Properties dialog box, select the Basic tab and enter the following:

    Home page URLs: Web site addresses for this site. 

    URL path extension for this site: No entry, unless you want to analyze only a portion of the site. In that case, enter the path to the section of the site you want to analyze. 

  6. In the Properties dialog box, select the Inferences tab shown in Figure 4. 

    Figure 4 The Inferences Tab 

  7. On the Inferences tab, enter the following:

    Request algorithm: Select this check box. If there is no referring URL, this algorithm assumes the last page the user was on (that is, it assumes the user selected the Back button). 

    Visit algorithm: The number of minutes to wait before ending an idle user session. 

    User algorithm: No entry. Selecting this box causes Usage Import to ignore the Username field. 

Setting the Import Options

Before importing a log file or performing look-ups, you must set the import options.

To set the import options 

  1. On the Tools menu, click Options. The Import Options dialog box shown in Figure 5 appears. 

    Figure 5 The Import Options Dialog Box 

  2. On the Import tab, enter the following:

    Drop database indexes before import: No entry. If you select this box, the indexes are dropped each time an import occurs, causing poor performance. 

    Adjust requests timestamps to: Select this check box, and select the time zone for the location of the log files. 

    Use cookies for inferences: Select this check box if you want to track cookie usage on your Web site. Otherwise, no entry. 

    Save query strings with URI: No entry. You can select this check box to try to improve the ability of HTML title look-ups to work on dynamic pages; however, this feature can decrease system performance. 

    Start day of week: Day of the week on which to start importing logs. 

    After Import fields: No entry. In general, you should run these look-ups only once, after you have finished all your imports, because they take a long time to run. Use the Tools menu to run these look-ups manually when you have finished importing data. 

  3. Select the IP Servers tab shown in Figure 6. 

    Figure 6 The IP Servers Tab 

  4. Enter the proxy information needed for the look-ups. You might also need to have a Winsock proxy client running on your machine to access the Internet.

Note If you are running inside a firewall, you need to have ports 43 and 53 open.

Importing the Files

When everything has been set up, you can begin importing log files.

To import log files 

  1. On the View menu, click Import Manager. The Microsoft Site Server Analysis—Usage Import—[Import Manager] screen shown in Figure 7 appears. 

    Figure 7 Import Manager Screen 

  2. To select files for import, enter the following information:

    Log data source: Your log data source. 

    Log access: File:// 

    Log path: The selected file path. 

    Browse button: Select the log files for import. You can select as many files as you want, but you must have enough RAM to load all of the files you select. If you want to import several large files but can't do it during one session, you can use the Scheduler on the Tools menu to run the imports at another time. 

    Add to list button: Add the selected files to the list of files to be imported. 

  3. Click Start Import to begin the import. 

    Note The import may take several hours to complete. 

  4. To view the results of the import, on the View menu, click Import History Manager.

Improving Import Performance

You can improve the performance of your imports with the help of SQL script files located in the Site Server 3.0 Commerce Edition Resource Kit.

To improve import performance 

  1. Run Dropindices.sql against the analysis database to drop the indices on tables that hold non-sequential data (URLs, and so forth). 

  2. Run Addindices.sql to recreate those indices as non-clustered primary keys. 

  3. Run Updateindex.sql to update the index statistics. 

The contents of these three scripts are listed in the Appendix.
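
If you prefer to run the scripts from the command line rather than from a query tool such as ISQL/w or SQL Server Query Analyzer, an invocation along the following lines should work. The server name, password, database name, and script path shown here are placeholders.

isql -S servername -U sa -P password -d dbname -i c:\scripts\Dropindices.sql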

Tuning Tips for Microsoft SQL Server 6.5

If you are using Microsoft SQL Server 6.5, you can further improve performance by doing the following:

    Default index fill factor: 20% 

    Max async I/O: 255 

    Max sort pages: 255 

    Memory used: 3/4 of available RAM 

    LE threshold count: 5000 

    LE threshold %: 20 

    Hash buckets: 10000 
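
The following sp_configure sketch shows one way to apply these settings from a query window. It assumes SQL Server 6.5 and, for the memory value, a hypothetical server with 256 MB of RAM; option names and units can vary, so verify each one with sp_configure before applying it.

-- Hypothetical sp_configure settings for SQL Server 6.5. Some options
-- (fill factor, sort pages, hash buckets, memory) take effect only after
-- SQL Server is restarted, and some may require advanced options to be enabled.
EXEC sp_configure 'fill factor', 20            -- default index fill factor (%)
EXEC sp_configure 'max async IO', 255
EXEC sp_configure 'sort pages', 255
EXEC sp_configure 'memory', 98304              -- in 2 KB pages; about 192 MB, or 3/4 of 256 MB
EXEC sp_configure 'LE threshold maximum', 5000 -- the "LE threshold count" above
EXEC sp_configure 'LE threshold percent', 20
EXEC sp_configure 'hash buckets', 10000
go
RECONFIGURE WITH OVERRIDE
go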

Tuning Tips for Microsoft SQL Server 7

If you are using Microsoft SQL Server 7, you can further improve performance by doing the following:

    Index fill factor: 20% (changed from the default, auto) 
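
A similar sketch for SQL Server 7 changes only the server-wide default fill factor; it is an advanced option, and the new value takes effect after SQL Server is restarted.

-- Hypothetical sp_configure settings for SQL Server 7.0.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
go
EXEC sp_configure 'fill factor', 20   -- default index fill factor (%), changed from auto (0)
RECONFIGURE
go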

Adding Look-up Information

After importing your log files, you can run three look-ups to add more descriptive information to your reports.

To add look-ups 

  • On the Tools menu, run the following commands:

    Lookup HTML Titles: Adds the titles of your Web pages to the report. 

    Resolve IP Addresses: Converts IP addresses to domain names. 

    If this command doesn't appear to work, check the section on Setting the Import Options (Figure 6). If the command still doesn't run: 

    1. Go to the Import Options dialog box (Figure 5). 

    2. Go to the IP Resolution tab. 

    3. Set Cache IP resolutions to zero. 

    4. Rerun the command. 

    5. Reset Cache IP resolutions to its original value. 

    Whois Query: Replaces domain names with organizational information. Be sure to run Resolve IP Addresses before running Whois Query. 

Creating the Reports

When all of the data has been imported, close Usage Import and run Report Writer.

To run Report Writer 

  1. On the File menu, click Open Report Writer Catalog. The Report Writer screen appears. You can create many reports from this catalog. For example, the Comprehensive Site Analysis Report in the Detail folder can tell you just about everything you need to know about your site. 

  2. To create the Usage Profile Report, select the Request detail report in the Detail folder. 

  3. If you want to run the report against all of the data in the database, click Run Now. 

    If you want to specify report parameters, click Next. 

  4. When the report has been generated, view the chart titled "Top 10% of all requests." This chart lists the Web pages on your site that are visited most often. It should match the chart shown in the "Set Up a User Profile" section of the document Using Transaction Cost Analysis for Site Capacity Planning, which is located in the Site Server 3.0 Commerce Edition Resource Kit. 

Performing Weekly Maintenance

Weekly maintenance is vital to ensure the stability of your site and to keep it running smoothly. Performing weekly maintenance also ensures that you get the best possible performance during log file imports and report generation. Maintenance should be performed immediately after you set up your database, then once a week thereafter. Maintenance should consist of the following steps:

  1. Back up the database. (A sample backup command appears after this list.) 

  2. Run the following utility to improve import and report generation performance: 

c:\mssql\binn\sqlmaint -S servername -U sa -P password -D dbname -Rpt c:\temp.txt -RebldIdx 20

Note Every parameter is required, and the switches are case-sensitive.
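
For step 1, a backup command along the following lines is a reasonable starting point. The database name and backup path are hypothetical, and on SQL Server 6.5 you would use DUMP DATABASE instead of BACKUP DATABASE.

-- Hypothetical weekly backup (SQL Server 7.0 syntax); substitute your own
-- analysis database name and backup location.
BACKUP DATABASE Analysis
TO DISK = 'c:\mssql7\backup\Analysis.bak'
WITH INIT
go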

Appendix

This section contains the SQL Server script files that you can use to add, drop, and update the indices:

  • Addindices.sql 

  • Dropindices.sql 

  • Updateindex.sql 

These indices are for the Analysis tables and are used for usage import and report creation.

Addindices.sql

The Addindices.sql script recreates the indices dropped by the Dropindices.sql script, changing the index type from clustered to non-clustered. Cut and paste the following text into an *.sql file to add indices:

ALTER TABLE tblImportCookie
ADD CONSTRAINT xCookieName Primary Key NONCLUSTERED (CookieName)
go
ALTER TABLE tblImportHostname
ADD CONSTRAINT xHostname Primary Key NONCLUSTERED (Hostname)
go
ALTER TABLE tblImportLevel1Dir
ADD CONSTRAINT xDirName1 Primary Key NONCLUSTERED (DirName, Level1DirID)
go
ALTER TABLE tblImportLevel2Dir
ADD CONSTRAINT xDirName2 Primary Key NONCLUSTERED (DirName, Level1DirID)
go
ALTER TABLE tblImportLevel3Dir
ADD CONSTRAINT xDirName3 Primary Key NONCLUSTERED (DirName, Level2DirID)
go
ALTER TABLE tblImportLevel4Dir
ADD CONSTRAINT xDirName4 Primary Key NONCLUSTERED (DirName, Level3DirID)
go
ALTER TABLE tblImportLevel5Dir
ADD CONSTRAINT xDirName5 Primary Key NONCLUSTERED (DirName, Level4DirID)
go
ALTER TABLE tblImportLevel6Dir
ADD CONSTRAINT xDirName6 Primary Key NONCLUSTERED (DirName, Level5DirID)
go
ALTER TABLE tblImportOrganization
ADD CONSTRAINT xDomain Primary Key NONCLUSTERED (DomainName)
go
ALTER TABLE tblImportParameter
ADD CONSTRAINT xParam Primary Key NONCLUSTERED (ParameterName)
go
ALTER TABLE tblImportReferrer
ADD CONSTRAINT xURL Primary Key NONCLUSTERED (URL)
go
ALTER TABLE tblImportUserName
ADD CONSTRAINT xUsername Primary Key NONCLUSTERED (Username)
go
ALTER TABLE tblImportVersion
ADD CONSTRAINT xVersion Primary Key NONCLUSTERED (VersionName, ProductID, SecurityID)
go
ALTER TABLE tblInterseFilter
ADD CONSTRAINT xFilterName Primary Key NONCLUSTERED (FilterName)
go
ALTER TABLE tblInterseIP
ADD CONSTRAINT xIP Primary Key NONCLUSTERED (IP)
go
ALTER TABLE tblIntersePath
ADD CONSTRAINT xTableName Primary Key NONCLUSTERED (TableName)
go
ALTER TABLE tblInterseProperty
ADD CONSTRAINT xPropertyName Primary Key NONCLUSTERED (PropertyName)
go
ALTER TABLE tblInterseSiteExt
ADD CONSTRAINT xSiteExtID Primary Key NONCLUSTERED (SiteID)
go
ALTER TABLE tblReferenceAdvertiser
ADD CONSTRAINT xAdvertiser Primary Key NONCLUSTERED (AdvertiserName)
go
ALTER TABLE tblReferenceDomain
ADD CONSTRAINT xDomainName Primary Key NONCLUSTERED (DomainName)
go
ALTER TABLE tblReferenceMediaExt
ADD CONSTRAINT xExt Primary Key NONCLUSTERED (ExtName)
go
ALTER TABLE tblReferenceTop
ADD CONSTRAINT xTop Primary Key NONCLUSTERED (TopName)
go

Dropindices.sql

The Dropindices.sql script drops the indices that map to non-sequential data (URLs, and so forth) so that Addindices.sql can recreate them with a more suitable index type. Cut and paste the following text into an *.sql file to drop indices:

ALTER TABLE tblImportCookie DROP CONSTRAINT xCookieName
go
ALTER TABLE tblImportHostname DROP CONSTRAINT xHostname
go
ALTER TABLE tblImportLevel1Dir DROP CONSTRAINT xDirName1
go
ALTER TABLE tblImportLevel2Dir DROP CONSTRAINT xDirName2
go
ALTER TABLE tblImportLevel3Dir DROP CONSTRAINT xDirName3
go
ALTER TABLE tblImportLevel4Dir DROP CONSTRAINT xDirName4
go
ALTER TABLE tblImportLevel5Dir DROP CONSTRAINT xDirName5
go
ALTER TABLE tblImportLevel6Dir DROP CONSTRAINT xDirName6
go
ALTER TABLE tblImportOrganization DROP CONSTRAINT xDomain
go
ALTER TABLE tblImportParameter DROP CONSTRAINT xParam
go
ALTER TABLE tblImportReferrer DROP CONSTRAINT xURL
go
ALTER TABLE tblImportUserName DROP CONSTRAINT xUsername
go
ALTER TABLE tblImportVersion DROP CONSTRAINT xVersion
go
ALTER TABLE tblInterseFilter DROP CONSTRAINT xFilterName
go
ALTER TABLE tblInterseIP DROP CONSTRAINT xIP
go
ALTER TABLE tblIntersePath DROP CONSTRAINT xTableName
go
ALTER TABLE tblInterseProperty DROP CONSTRAINT xPropertyName
go
ALTER TABLE tblInterseSiteExt DROP CONSTRAINT xSiteExtID
go
ALTER TABLE tblReferenceAdvertiser DROP CONSTRAINT xAdvertiser
go
ALTER TABLE tblReferenceDomain DROP CONSTRAINT xDomainName
go
ALTER TABLE tblReferenceMediaExt DROP CONSTRAINT xExt
go
ALTER TABLE tblReferenceTop DROP CONSTRAINT xTop
go

Updateindex.sql

The Updateindex.sql script updates the index statistics after you've added or dropped indices. If you don't run Updateindex.sql, the indices will exist, but the query optimizer won't have current statistics and may not use them. (Alternatively, you can run sqlmaint.exe to update all of the indices.) Cut and paste the following text into an *.sql file to update the statistics:

Update Statistics tblImportCookie
go
Update Statistics tblImportHostName
go
Update Statistics tblImportLevel1Dir
go
Update Statistics tblImportLevel2Dir
go
Update Statistics tblImportLevel3Dir
go
Update Statistics tblImportLevel4Dir
go
Update Statistics tblImportLevel5Dir
go
Update Statistics tblImportLevel6Dir
go
Update Statistics tblImportOrganization
go
Update Statistics tblImportParameter
go
Update Statistics tblImportReferrer
go
Update Statistics tblImportUsername
go
Update Statistics tblImportVersion
go
Update Statistics tblInterseFilter
go
Update Statistics tblInterseIP
go
Update Statistics tblIntersePath
go
Update Statistics tblInterseProperty
go
Update Statistics tblInterseSiteExt
go
Update Statistics tblReferenceAdvertiser
go
Update Statistics tblReferenceDomain
go
Update Statistics tblReferenceMediaExt
go
Update Statistics tblReferenceTop
go