Site Server - Creating Usage Profile Reports
August 1999
Overview
This document describes how to create Usage Profile Reports from Internet Information Server (IIS) logs, using the Usage Import and Report Writer programs provided with Site Server 3.0 Commerce Edition. The Usage Profile Reports can then be used to perform a transaction cost analysis for your site. For more information on using analysis tools, consult the documentation provided with Site Server 3.0 Commerce Edition; you can also obtain other documents from Microsoft TechNet or from our Web site at https://www.microsoft.com/siteserver. For guidance on making the best use of the analysis tools, see the paper Microsoft Site Server Usage Analysis Capacity and Performance Analysis in the Microsoft Site Server 3.0 Commerce Edition Resource Kit. For more information on transaction cost analysis and site capacity planning, see the paper Using Transaction Cost Analysis for Site Capacity Planning, also in the Resource Kit.
Note When you install Site Server, be sure to select Microsoft® SQL Server™ as your database, and not Microsoft® Access.
To create Usage Profile Reports:
Create the analysis database.
Import log files.
Add look-up information.
Create analysis reports.
Perform weekly maintenance.
Creating the Analysis Database
This section describes how to:
Calculate database size
Create the database
Initialize the database
Calculating Database Size
The first step in creating an analysis database is to calculate its size. Database size is based on the total number of log files you want to store. In general, you should store at least one week's worth of log file data to provide a good data sample. If you have multiple Web servers generating logs, you should store the logs from all servers so you can reconcile sessions that span servers.
To calculate your database size
Determine what types of reports you plan to run (by month, by quarter, or by some other time period).
Estimate the size of the logs you plan to import during the specified period.
Double that value to obtain an estimate of the number of bytes you need for database tables, logs, and indexes.
For a more thorough explanation of the calculations used to determine the size of your database, see the Microsoft Knowledge Base article 216559: "How to Estimate the Database Size for Site Server Analysis" in Microsoft TechNet.
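The rule of thumb above can be sketched as a quick calculation. The function and sample figures below are illustrative only; substitute your own measured log sizes and reporting period.

```python
def estimate_db_size_bytes(avg_daily_log_bytes, days):
    """Apply the rule of thumb above: total log bytes for the
    reporting period, doubled to cover database tables, logs,
    and indexes. A rough planning figure, not an exact calculation."""
    return 2 * avg_daily_log_bytes * days

# Example: 150 MB of logs per day, retained for a 30-day reporting period
size = estimate_db_size_bytes(150 * 1024 * 1024, 30)
print(f"Estimated database size: {size / 1024 ** 3:.1f} GB")  # about 8.8 GB
```

For a multi-server site, sum the daily log sizes across all Web servers before applying the estimate, since all logs should be stored together.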
Creating the Database
Use Microsoft SQL Server Enterprise Manager and your calculated database size to create the database and log device in which to store your analysis data. The Database Setup Wizard can do this for you. However, you will get better results by doing it manually.
Note If you used the values in the Site Server installation manual to create your database, delete that database and recreate it using your specific size calculations.
Initializing the Database
Once you have created your database, use the Site Server Database Setup Wizard to initialize your database for use with the Usage Import and Report Writer programs. You can't import logs until this is done.
To run the Site Server Database Setup Wizard
On the Start menu, select Programs / Microsoft Site Server / Analysis / Database Setup. The Database Setup Wizard screen appears.
Click Next. The Choose a Setup Mode screen appears.
Click Update an existing database, and click Next. The SQL Server Login screen appears. Figure 1 illustrates the SQL Server Login screen with sample entries.
Figure 1 SQL Server Login Screen
Enter your login information, and click Connect. The Confirm Your Database Setup screen appears.
If the information shown is correct, click Finish to update your database with the schema necessary to run the analysis utilities. If the information is not correct, repeat Steps 1-4 until it is correct.
Importing Log Files
When you have created your analysis database, you can start importing log files and other data. When you do this, it is a good idea to copy the log files to a dedicated server so that you don't affect the production server's performance or network traffic. You can copy the files manually, of course, but it is more efficient to use File Transfer Protocol (FTP) to copy the files. You can use the Scheduler utility from the Usage Import program to schedule the file transfers using FTP, then import the files and run the desired reports.
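As a sketch of the FTP transfer step, the following script pulls the last week of daily IIS logs from a production server to the analysis machine. The host name, credentials, and directory paths are placeholders; daily IIS extended-format logs are named exYYMMDD.log.

```python
from datetime import date, timedelta
from ftplib import FTP

def iis_log_name(day):
    """Daily IIS extended-format log files are named exYYMMDD.log."""
    return day.strftime("ex%y%m%d.log")

def fetch_logs(host, user, password, remote_dir, local_dir, days=7):
    """Copy the last `days` daily log files from the production Web
    server to the analysis server, so imports do not load the
    production box or compete with its network traffic."""
    ftp = FTP(host)
    ftp.login(user, password)
    ftp.cwd(remote_dir)
    today = date.today()
    for offset in range(1, days + 1):
        name = iis_log_name(today - timedelta(days=offset))
        with open(f"{local_dir}/{name}", "wb") as f:
            ftp.retrbinary(f"RETR {name}", f.write)
    ftp.quit()

# Example (placeholder values):
# fetch_logs("webserver01", "analysis", "secret", "/LogFiles/W3SVC1", r"d:\logs")
```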
To import log files
On the Analysis menu, click Usage Import. When you run Usage Import for the first time, it prompts you for a database connection.
Enter the connection information for the analysis database. (Once you start running the Usage Import program, you can point to a different database or view the current database by selecting Open Database from the File menu.)
Creating a New Log Data Source
To view a set of logs from a new source (Web site), you must first create a new Log Data Source, and set up a server and a site.
To create a new Log Data Source
On the View menu, click Service Manager.
On the Edit menu, click New Log Data Source. The Log Data Source Properties dialog box shown in Figure 2 appears.
Figure 2 The Log Data Source Properties Dialog Box
Click Auto Detect, even if you know the log type, and click OK. The Properties dialog box for the default server appears.
Set the properties as follows, and click OK:

| Property | Setting |
| --- | --- |
| Server type | World Wide Web. |
| Directory index files | Names of the default pages for the site. |
| IP address | No entry. Enter an IP address only if you are using one log file to store data for several virtual servers. |
| IP port | 80, unless the default port is another value. |
| Local time zone | Casablanca. Choose Casablanca rather than GMT because the Casablanca zone does not observe daylight saving time, so report offsets remain correct year-round. Use the Import Options dialog box (described in the Setting the Import Options section) to enter the appropriate offset for the log files' time zone. |
| Local domain | No entry. |
The Site Properties dialog box for the default site shown in Figure 3 appears.
Figure 3 Site Properties Dialog Box for the Default Site
In the Properties dialog box, select the Basic tab and enter the following:

| Property | Setting |
| --- | --- |
| Home page URLs | Web site addresses for this site. |
| URL path extension for this site | No entry, unless you want to analyze only a portion of the site; in that case, enter the path to the section you want to analyze. |
In the Properties dialog box, select the Inferences tab shown in Figure 4.
Figure 4 The Inferences Tab
On the Inferences tab, enter the following:

| Field | Setting |
| --- | --- |
| Request algorithm | Select this check box. When a request has no referring URL, this algorithm infers the last page the user was on (that is, it assumes the user clicked the Back button). |
| Visit algorithm | The number of minutes to wait before ending an idle user session. |
| User algorithm | No entry. Selecting this check box causes Usage Import to ignore the Username field. |
Setting the Import Options
Before importing a log file or performing look-ups, you must set the import options.
To set the import options
On the Tools menu, click Options. The Import Options dialog box shown in Figure 5 appears.
Figure 5 The Import Options Dialog Box
On the Import tab, enter the following:

| Field | Setting |
| --- | --- |
| Drop database indexes before import | No entry. If you select this check box, the indexes are dropped on every import, which hurts performance. |
| Adjust request timestamps to | Select this check box, and select the time zone for the location of the log files. |
| Use cookies for inferences | Select this check box if you want to track cookie usage on your Web site; otherwise, no entry. |
| Save query strings with URI | No entry. Selecting this check box can improve HTML title look-ups on dynamic pages, but it can also decrease system performance. |
| Start day of week | The day of the week on which to start importing logs. |
| After import fields | No entry. These look-ups take a long time to run, so in general you should run them only once, after you have finished all your imports. Use the Tools menu to run them manually at that point. |
Select the IP Servers tab shown in Figure 6.
Figure 6 The IP Servers Tab
Enter the proxy information needed for the look-ups. You might also need to have a Winsock proxy client running on your machine to access the Internet.
Note If you are running inside a firewall, you need to have ports 43 (whois) and 53 (DNS) open.
Importing the Files
When everything has been set up, you can begin importing log files.
To import log files
On the View menu, click Import Manager. The Microsoft Site Server Analysis—Usage Import—[Import Manager] screen shown in Figure 7 appears.
Figure 7 Import Manager Screen
To select files for import, enter the following information:

| Field | Setting |
| --- | --- |
| Log data source | Your log data source. |
| Log access | File:// |
| Log path | The selected file path. |
| Browse button | Select the log files for import. You can select as many files as you want, but you must have enough RAM to load all of the files you select. If you want to import several large files but can't do it during one session, you can use the Scheduler on the Tools menu to run the imports at another time. |
| Add to list button | Adds the selected files to the list of files to be imported. |
Click Start Import to begin the import.
Note The import may take several hours to complete.
To view the results of the import, on the View menu, click Import History Manager.
Improving Import Performance
You can improve the performance of your imports with the help of SQL script files located in the Site Server 3.0 Commerce Edition Resource Kit.
To improve import performance
Run Dropindices.sql before importing your log files to drop the indexes on non-sequential data.
After the import completes, run Addindices.sql to recreate those indexes as non-clustered.
Run Updateindex.sql to update the index statistics.
These scripts are listed in the Appendix.
Tuning Tips for Microsoft SQL Server 6.5
If you are using Microsoft SQL Server 6.5, you can further improve performance by doing the following:
| Change… | To… |
| --- | --- |
| Default Index Fill Factor | 20% |
| Max Asynch I/O | 255 |
| Max Sort Pages | 255 |
| MEM used | 3/4 of available RAM |
| LE Threshold count | 5000 |
| LE Threshold % | 20 |
| Hash Buckets | 10000 |
Tuning Tips for Microsoft SQL Server 7
If you are using Microsoft SQL Server 7, you can further improve performance by doing the following:
| Change… | To… |
| --- | --- |
| Index Fill factor | 20% (from auto) |
Adding Look-up Information
After importing your log files, you can run three look-ups to add more descriptive information to your reports.
To add look-ups
On the Tools menu, run the following commands:

Lookup HTML Titles: Adds the titles of your Web pages to the report.

Resolve IP Addresses: Converts IP addresses to domain names. If this command doesn't appear to work, check the settings described in the Setting the Import Options section (Figure 6). If the command still doesn't run:
1. Go to the Import Options dialog box (Figure 5).
2. Select the IP Resolution tab.
3. Set Cache IP resolutions to zero.
4. Rerun the command.
5. Reset Cache IP resolutions to its original value.

Whois Query: Replaces domain names with organizational information. Be sure to run Resolve IP Addresses before running Whois Query.
Creating the Reports
When all of the data has been imported, close Usage Import and run Report Writer.
To run Report Writer
On the File menu, click Open Report Writer Catalog. The Report Writer screen appears. You can create many reports from this catalog. For example, the Comprehensive Site Analysis Report in the Detail folder can tell you just about everything you need to know about your site.
To create the Usage Profile Report, select the Request detail report in the Detail folder.
If you want to run the report against all of the data in the database, click Run Now.
If you want to specify report parameters, click Next.
When the report has been generated, view the chart titled "Top 10% of all requests." This chart lists the Web pages on your site that are visited most often, and it should match the chart shown in the "Set Up a User Profile" section of the document Using Transaction Cost Analysis for Site Capacity Planning, which is located in the Site Server 3.0 Commerce Edition Resource Kit.
Performing Weekly Maintenance
Weekly maintenance is vital to ensure the stability of your site and to keep it running smoothly. Performing weekly maintenance also ensures that you get the best possible performance during log file imports and report generation. Maintenance should be performed immediately after you set up your database, then once a week thereafter. Maintenance should consist of the following steps:
Back up the database.
Run the following utility to improve import and report generation performance:
c:\mssql\binn\sqlmaint -S servername -U sa -P password -D dbname -Rpt c:\temp.txt -RebldIdx 20
Note Every parameter is required, and the switches are case-sensitive.
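The weekly sqlmaint step can be scripted so it runs unattended after the backup. A minimal sketch, assuming Python is available on the analysis server; the server name, password, and database name are placeholders, and the command mirrors the sqlmaint invocation shown above.

```python
import subprocess

def sqlmaint_cmd(server, password, database, report=r"c:\temp.txt"):
    """Build the sqlmaint command line shown above. Every switch is
    required, and the switches are case-sensitive."""
    return [
        r"c:\mssql\binn\sqlmaint",
        "-S", server,
        "-U", "sa",
        "-P", password,
        "-D", database,
        "-Rpt", report,
        "-RebldIdx", "20",
    ]

def weekly_maintenance(server, password, database):
    """Rebuild the indexes with sqlmaint. Run this only after the
    database has been backed up."""
    subprocess.run(sqlmaint_cmd(server, password, database), check=True)

# Example (placeholder values):
# weekly_maintenance("ANALYSIS1", "password", "usagedb")
```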
Appendix
This section contains the SQL Server script files that you can use to add, drop, and update indices:
Addindices.sql
Dropindices.sql
Updateindex.sql
These indices are for the Analysis tables and are used for usage import and report creation.
Addindices.sql
The Addindices.sql script recreates the indices dropped by the Dropindices.sql script, changing the index type from clustered to non-clustered. Cut and paste the following text into an *.sql file to add indices:
ALTER TABLE tblImportCookie ADD CONSTRAINT xCookieName Primary Key NONCLUSTERED (CookieName)
go
ALTER TABLE tblImportHostname ADD CONSTRAINT xHostname Primary Key NONCLUSTERED (Hostname)
go
ALTER TABLE tblImportLevel1Dir ADD CONSTRAINT xDirName1 Primary Key NONCLUSTERED (DirName, Level1DirID)
go
ALTER TABLE tblImportLevel2Dir ADD CONSTRAINT xDirName2 Primary Key NONCLUSTERED (DirName, Level1DirID)
go
ALTER TABLE tblImportLevel3Dir ADD CONSTRAINT xDirName3 Primary Key NONCLUSTERED (DirName, Level2DirID)
go
ALTER TABLE tblImportLevel4Dir ADD CONSTRAINT xDirName4 Primary Key NONCLUSTERED (DirName, Level3DirID)
go
ALTER TABLE tblImportLevel5Dir ADD CONSTRAINT xDirName5 Primary Key NONCLUSTERED (DirName, Level4DirID)
go
ALTER TABLE tblImportLevel6Dir ADD CONSTRAINT xDirName6 Primary Key NONCLUSTERED (DirName, Level5DirID)
go
ALTER TABLE tblImportOrganization ADD CONSTRAINT xDomain Primary Key NONCLUSTERED (DomainName)
go
ALTER TABLE tblImportParameter ADD CONSTRAINT xParam Primary Key NONCLUSTERED (ParameterName)
go
print "tblImportParameter"
ALTER TABLE tblImportReferrer ADD CONSTRAINT xURL Primary Key NONCLUSTERED (URL)
go
ALTER TABLE tblImportUserName ADD CONSTRAINT xUsername Primary Key NONCLUSTERED (Username)
go
ALTER TABLE tblImportVersion ADD CONSTRAINT xVersion Primary Key NONCLUSTERED (VersionName, ProductID, SecurityID)
go
ALTER TABLE tblInterseFilter ADD CONSTRAINT xFilterName Primary Key NONCLUSTERED (FilterName)
go
ALTER TABLE tblInterseIP ADD CONSTRAINT xIP Primary Key NONCLUSTERED (IP)
go
ALTER TABLE tblIntersePath ADD CONSTRAINT xTableName Primary Key NONCLUSTERED (TableName)
go
ALTER TABLE tblInterseProperty ADD CONSTRAINT xPropertyName Primary Key NONCLUSTERED (PropertyName)
go
ALTER TABLE tblInterseSiteExt ADD CONSTRAINT xSiteExtID Primary Key NONCLUSTERED (SiteID)
go
ALTER TABLE tblReferenceAdvertiser ADD CONSTRAINT xAdvertiser Primary Key NONCLUSTERED (AdvertiserName)
go
ALTER TABLE tblReferenceDomain ADD CONSTRAINT xDomainName Primary Key NONCLUSTERED (DomainName)
go
ALTER TABLE tblReferenceMediaExt ADD CONSTRAINT xExt Primary Key NONCLUSTERED (ExtName)
go
ALTER TABLE tblReferenceTop ADD CONSTRAINT xTop Primary Key NONCLUSTERED (TopName)
go
Dropindices.sql
The Dropindices.sql script drops indices that map to non-sequential data (URLs, and so forth) so that they can be recreated with a more suitable index type. Cut and paste the following text into an *.sql file to drop indices:
ALTER TABLE tblImportCookie DROP CONSTRAINT xCookieName
go
ALTER TABLE tblImportHostname DROP CONSTRAINT xHostname
go
ALTER TABLE tblImportLevel1Dir DROP CONSTRAINT xDirName1
go
ALTER TABLE tblImportLevel2Dir DROP CONSTRAINT xDirName2
go
ALTER TABLE tblImportLevel3Dir DROP CONSTRAINT xDirName3
go
ALTER TABLE tblImportLevel4Dir DROP CONSTRAINT xDirName4
go
ALTER TABLE tblImportLevel5Dir DROP CONSTRAINT xDirName5
go
ALTER TABLE tblImportLevel6Dir DROP CONSTRAINT xDirName6
go
ALTER TABLE tblImportOrganization DROP CONSTRAINT xDomain
go
ALTER TABLE tblImportParameter DROP CONSTRAINT xParam
go
ALTER TABLE tblImportReferrer DROP CONSTRAINT xURL
go
ALTER TABLE tblImportUserName DROP CONSTRAINT xUsername
go
ALTER TABLE tblImportVersion DROP CONSTRAINT xVersion
go
ALTER TABLE tblInterseFilter DROP CONSTRAINT xFilterName
go
ALTER TABLE tblInterseIP DROP CONSTRAINT xIP
go
ALTER TABLE tblIntersePath DROP CONSTRAINT xTableName
go
ALTER TABLE tblInterseProperty DROP CONSTRAINT xPropertyName
go
ALTER TABLE tblInterseSiteExt DROP CONSTRAINT xSiteExtID
go
ALTER TABLE tblReferenceAdvertiser DROP CONSTRAINT xAdvertiser
go
ALTER TABLE tblReferenceDomain DROP CONSTRAINT xDomainName
go
ALTER TABLE tblReferenceMediaExt DROP CONSTRAINT xExt
go
ALTER TABLE tblReferenceTop DROP CONSTRAINT xTop
go
Updateindex.sql
The Updateindex.sql script updates index statistics after you've added or deleted indices. If you don't run Updateindex.sql, the indices will exist, but the query optimizer may not use them. (Alternatively, you can run sqlmaint.exe to update all indices.) Cut and paste the following text into an *.sql file to update the statistics:
Update Statistics tblImportCookie
go
Update Statistics tblImportHostName
go
Update Statistics tblImportLevel1Dir
go
Update Statistics tblImportLevel2Dir
go
Update Statistics tblImportLevel3Dir
go
Update Statistics tblImportLevel4Dir
go
Update Statistics tblImportLevel5Dir
go
Update Statistics tblImportLevel6Dir
go
Update Statistics tblImportOrganization
go
Update Statistics tblImportParameter
go
Update Statistics tblImportReferrer
go
Update Statistics tblImportUsername
go
Update Statistics tblImportVersion
go
Update Statistics tblInterseFilter
go
Update Statistics tblInterseIP
go
Update Statistics tblIntersePath
go
Update Statistics tblInterseProperty
go
Update Statistics tblInterseSiteExt
go
Update Statistics tblReferenceAdvertiser
go
Update Statistics tblReferenceDomain
go
Update Statistics tblReferenceMediaExt
go
Update Statistics tblReferenceTop
go