Performance Optimization and Capacity Planning

For the latest information, see https://www.microsoft.com/cmserver

On This Page

Introduction
Section I: Performance Optimization
Section II: Capacity Planning
Capacity Planning Summary
Key Performance Counters
CMS and Usage Profiles
References

Introduction

Microsoft® Content Management Server 2001 (MSCMS) provides a number of methods and tools to plan, analyze, and test an MSCMS Web site in order to optimize its performance and ensure that it can meet the required capacity load.

This paper is presented in two sections that address each of these separate but related topics. Methods, tools, and the steps that need to be taken to achieve both these goals are discussed. Examples and test numbers are also provided.

The first section discusses how to achieve optimal performance from an MSCMS site by making use of caching, and by analyzing, scaling, and tuning the site to eliminate any site bottlenecks. Bottlenecks, which limit site performance, can be caused by a variety of conditions. Finding and fixing bottlenecks will tune your site for best performance.

The second section addresses capacity planning—how to plan for, estimate, and manage the maximum anticipated load capacity for your site. It outlines which information to gather during the site planning phase, how to determine your business goals and user targets, and describes transaction cost analysis methodology. Some test examples are also provided. Tests include calculating the load that a site can handle, the number of pages that can be served in a given time, cost per user, and maximum capacity. Possible bottleneck areas are also addressed in this section.

Tests and analyses in this paper were done using the Microsoft® Web Application Stress Tool (WAST) and the Transaction Cost Analysis (TCA) tool from the Microsoft® Commerce Server 2000 performance toolkit.

WAST is a load generation tool. The TCA tool acts as a WAST driver and is used for site performance analysis.

References to documents, tools, and links are provided in the "References" topic located at the end of this paper.

Reader Guidance

This paper is of interest to systems analysts, developers, and IT professionals. Readers of this paper should have a working knowledge of the following:

  • MSCMS

  • System Architecture

  • Transaction cost analysis methodology

  • Web application stress testing

For detailed information about the concepts in MSCMS, see the MSCMS online documentation located at https://www.microsoft.com/CMServer/techinfo/.

Section I: Performance Optimization

This section outlines the aspects of MSCMS that affect performance. A brief overview of the architecture of an MSCMS site is presented to provide the overall understanding necessary to produce high performance sites. Also discussed are MSCMS settings that can be tuned to increase site performance, site design considerations related to performance, and known performance limitations. The key performance-tuning concept of fragment caching is covered. Possible bottleneck areas are identified and addressed as they relate to each topic.

MSCMS Server Architecture

Before discussing performance optimization methods, it is helpful to clarify the physical and logical architecture of an MSCMS site.

The following figure shows a possible architectural configuration for an MSCMS site. It also shows the workflow and deployment of the site.

In this configuration, the site structure, templates, and navigation are developed on the MSCMS development server. The site is deployed to the authoring server, using the MSCMS Site Deployment (SD) feature. In addition, Microsoft Application Center 2000 (AC) is used to move include files and registry settings. This deployment is required relatively infrequently compared to the deployment from the authoring server.

Authors create the content on the MSCMS authoring server. The same server also handles low volume read/write traffic.

This completed, or updated, site is frequently deployed out to the production server farm outside a firewall. A load balancer manages the traffic through the second firewall to and from the Internet.

Physical Architecture

An MSCMS site runs Microsoft® SQL Server™ on its back-end server. This server supports a cluster of MSCMS Web servers or a single MSCMS Web server.

The SQL Server back end scales up by adding more processors. The Web server cluster can be scaled out by adding hardware to the configuration, as well as up.

Logical Architecture

The MSCMS server logical architecture is layered. You need to understand these layers and how they affect performance. This is well illustrated by looking at how an MSCMS page is assembled. To serve up a page, MSCMS performs the following steps:

  1. An HTTP request for a page is sent to the MSCMS server. The MSCMS Publishing API is initialized as soon as a user session is started.

  2. The Microsoft Internet Server Application Programming Interface (ISAPI) filter receives the request and analyzes the URL to see which MSCMS objects are being requested. The program locates the template ASP page that is required, and the underlying objects that are needed.

  3. MSCMS transfers control to that template.

  4. Based on the information in the template, MSCMS retrieves the objects that are required to assemble the page structure—either from a cache or from the database—and renders that page.

  5. The template also contains API code. This API code, which contains information about placeholders and navigation, is run. From this, content for the page is retrieved.

When the page is served, the assembling cycle is complete.

The first step in achieving the optimal speed at which a page is served is to measure and test the current or projected throughput (requests per second).

Measuring and Testing Performance

Throughput is a key measurement of performance. This can be measured as:

  • Pages per Second. A page is what the user sees after making a request; one page request can be made up of many ASP requests, for example, server executes or redirects.

  • ASPs per Second. Each ASP request can contain many Get requests.

  • Gets per Second. Individual requests to the Web server for objects such as images.

Therefore, when measuring throughput, be aware of which of the above is being measured, since the numbers are successively larger: there are more Gets than ASP requests, and more ASP requests than pages.

Another measurement is the number of concurrent users on the site. This is discussed in "Section II: Capacity Planning" in this paper, along with more detailed information about measuring performance.

Two ways to maximize the performance of your site are:

  • Tuning performance with configuration settings and fragment caching

  • Careful planning of site content at design time, including containers, searches, use of placeholders, and navigation considerations.

Tuning Performance

After performance has been measured and tested, the next step is to tune the site.

Areas in which to fine-tune performance on your site include:

  • Tuning the software with the MSCMS Server Configuration Application

  • Caching, including database caching, fragment caching, and automatic caching components

  • Balancing the number of items in each container

  • Using API searches efficiently

  • Limiting the use of placeholders

  • Site navigation considerations

Each of these areas is discussed in the following topics.

Tuning MSCMS with the SCA

The Server Configuration Application (SCA) in MSCMS provides configuration settings that allow you to tune site performance. When SCA is run, it presents an interface that contains six tab selections for tuning the server. The following are used to enhance performance:

  • SCA Cache tab:

    • On this tab, you can set the maximum size of the file system cache to a size large enough to hold the bulk of your binary data.

    • You can set the maximum nodes in the memory cache so that it is large enough to hold the most commonly accessed pages on your site. Keep in mind that a single page may require many nodes to be displayed, depending on such elements as navigation, page depth, and siblings.

  • SCA Web tab:

    • On this tab, you mark the Web entry points as read-only.

      Entry points marked as read-only use a different version of the MSCMS include files (*_RT.inc), which do not have the overhead needed to support edit mode. In other words, MSCMS uses a smaller amount of script code to run a read-only request than it does to run a read-write request.

      Performance can improve by up to 10% on the average Web page. An average Web page is one that is neither heavy with graphics nor light with text only.

      MSCMS supports mixed entry points. That is, read-only and read-write entry points for the same site have different serialized template files in their exec resource caches.

Caching

Database Caching

The MSCMS database contains many objects; therefore, constant retrieval directly from the database would be extremely slow. This is a potential bottleneck for your site.

In order to improve the speed at which objects are retrieved, MSCMS provides two automatic caches. These are the memory object cache and the file system cache. The file system cache is also known as the disk cache.

The memory object cache is used for storing underlying data relating to API objects, such as channels and placeholders. This cache is queried when an API object is requested.

The file system cache stores binary data, such as graphics and templates.

If a requested object is not already in the appropriate cache, the information is retrieved from the database and added to that cache for future use.

The more information retrieved from the cache, the faster the performance. For more information about the SCA configuration settings for tuning the cache, see the previous topic "Tuning MSCMS with the SCA."

Fragment Caching

Fragment caching is a key element in obtaining significantly better performance on your site.

It turns dynamic operations into static operations in the following manner:

  1. A hash table look-up is done in the cache to see if a fragment exists.

  2. If the fragment does not exist, a string is concatenated to represent the HTML for the currently requested object. The string is stored in the cache with a unique hash code. The fragment is output for immediate use, and is available in the cache for future retrieval.

If a fragment exists in the cache, it is output from there. This operation has virtually no overhead.
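
The following is a minimal sketch of this pattern in VBScript, using the autoflush dictionary described later in "The MSCMS Autoflush Dictionary." The cache key derivation and the BuildFragmentHtml rendering function are hypothetical placeholders, and the sketch assumes Lookup returns an empty value on a cache miss:

Dim ContentCache, strKey, strHtml
Set ContentCache = AutoSession.Session.GetContentCache

' Derive a unique key for the requested object; here, the request URL.
strKey = Request.ServerVariables("URL")

' Step 1: hash table look-up in the cache.
strHtml = ContentCache.Lookup(strKey)

If IsEmpty(strHtml) Or strHtml = "" Then
    ' Step 2: no cached fragment; build the HTML string once,
    ' then store it in the cache for future retrieval.
    strHtml = BuildFragmentHtml()
    ContentCache.Add strKey, strHtml
End If

Response.Write strHtml   ' serving from the cache has virtually no overhead

Function BuildFragmentHtml()
    ' Stand-in for the expensive, object-heavy rendering work.
    BuildFragmentHtml = "<div>rendered fragment</div>"
End Function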

When to Use Fragment Caching

Each time a request is received, MSCMS calls up a large number of COM objects in order to render the page, and each time, the objects are created and destroyed.

Fragment caching takes the output obtained the first time these COM objects are run and puts it into the cache. The output is then available for future use. The result is a decrease in overhead and an improvement in speed.

Best practice Use fragment caching for guest access only. Guest access is typically the most frequent type of access on a high-load site.

When Not to Use Fragment Caching

Do not use fragment caching during design time because new objects are constantly being created, and every edit causes the cache to flush. In addition, the number of concurrent users at design time is typically significantly lower than at run time.

Be careful if using fragment caching during authenticated access, as you do not want to circumvent the access controls. Because different users have different access rights to channels, you need to maintain control over who has permission to view which channels, as the sketch below illustrates.
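
One way to preserve those controls, if you do cache fragments for authenticated users, is to scope the cache key to the user identity, so that a fragment cached for one user is never served to another user with different channel rights. A hedged sketch (AUTH_USER is the standard IIS server variable; the key format is hypothetical):

Dim strUser, strKey
strUser = Request.ServerVariables("AUTH_USER")

' Include the user identity in the fragment key; a coarser-grained
' alternative is a rights-group identifier, if you maintain that mapping.
strKey = "nav:" & Request.ServerVariables("URL") & ":" & strUser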

Automatic Caching Components

Several components that run automatic caching are available:

Least Recently Used (LRU) Cache

If you are using Commerce Server, it provides a Least Recently Used (LRU) cache component. It allows you to configure settings, for example, the size the cache should reach before it is flushed. This extremely fast component scales well to many processes.

The MSCMS Autoflush Dictionary

This is the default cache used by MSCMS. It uses two methods:

  • Add: takes the fragment key and the HTML code that makes up the value of the fragment, and adds them to the cache.

  • Lookup: passes the key to the cache, which returns the fragment.

The following code syntax shows an implementation of the Add and Lookup methods:

' Get the autoflush dictionary from the MSCMS session.
Set ContentCache = AutoSession.Session.GetContentCache

' Add a fragment to the cache under a unique key.
ContentCache.Add strKey, strValue

' Retrieve the fragment later by the same key.
strValue = ContentCache.Lookup(strKey)

The cache flushes automatically whenever its underlying state changes. This is a coarse flush as opposed to a per-object flush. A coarse flush saves overhead.

Alternatives

You can also use Microsoft ActiveX® Data Objects (ADO), or any other thread-safe storage component. ADO does not flush automatically; use the Server Change Count method, as discussed in the following topic, "Keeping the External Cache Synchronized."

Important: Do not use the Microsoft Visual Basic® Scripting Edition (VBScript) scripting dictionary that comes with the scripting engine. It is apartment threaded and Internet Information Services (IIS) is multithreaded. See Microsoft Knowledge Base article number 240415, located at https://support.microsoft.com/default.aspx?scid=kb;en-us;240415&sd=tech, for more information.

Keeping the External Cache Synchronized

In some situations, using an external caching mechanism rather than using the automatic flushing components can be useful. For example, you may want to have an external cache that only contains output about pages or users.

MSCMS provides a Server Change Count method that tracks a count value that is incremented when a change to the underlying site is detected, for example, when an edit operation is performed.

This count value is also incremented when an implicit change is detected, that is without any edit operation having been performed. For example, when a posting is published or a page expires, no editing has been done on the site; it is simply that time has passed and that page has activated or expired.

You can keep a record of this count value in an external cache and check it against the current count value in the memory object cache each time a request is made. If the current value has increased, indicating that a change has occurred, you should flush the external cache.
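
A minimal sketch of this check in VBScript, assuming the external cache lives in ASP Application state (the Application variable names are hypothetical):

Dim lngCurrentCount
lngCurrentCount = Autosession.ServerChangeCount

If Application("LastKnownChangeCount") <> lngCurrentCount Then
    Application.Lock
    ' The underlying site has changed, explicitly or implicitly:
    ' flush the external cache and record the new count value.
    Application("ExternalCache") = Empty
    Application("LastKnownChangeCount") = lngCurrentCount
    Application.Unlock
End If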

The following exercise shows the effect on the count of an implicit change:

  1. Create an ASP page containing the following code:

     <!--#include virtual="\nr\system\access\resolution.inc"-->
     <html>
     <body>
     <% Response.Write( Autosession.ServerChangeCount ) %>
     </body>
     </html>

  2. Set the start time of a posting to two minutes from the present time.

  3. Approve the posting.

  4. Return to the page you created. Note the count value.

  5. After the start time has passed, refresh the page. Note that the count value incremented by one when the posting was published.

Balancing Items in Containers

Consider the number of items that are in each container.

Keep in mind that API collections instantiate all items in a container when the collection is accessed. For example, calling pMyChannel.Postings.Item(0).Name results in every posting object in that collection being inflated.

If the number of items is too high, performance issues may arise. This maximum number is hardware dependent, but could be, for example, 400 items.

Best practice Distribute items over multiple containers, ensuring that the number of items in each one does not exceed your maximum. If for some reason the number of items must exceed that amount, make use of fragment caching.

Exceptions: API calls that access items directly, such as GetByPath(), GetByURL(), and GetByGuid(), are not subject to this performance issue. Note that GetByGuid() is the fastest way to access an object.
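
For illustration, a hedged sketch contrasting the two access styles; the GUID literal is a hypothetical placeholder, and the exact entry point for GetByGuid() should be confirmed against the Publishing API documentation:

' Direct access: fetches only the requested object.
Dim oPosting
Set oPosting = Autosession.GetByGuid("{00000000-0000-0000-0000-000000000000}")

' Collection access: touching one item inflates every posting in the
' container, which is slow when the container holds many items.
Dim strName
strName = pMyChannel.Postings.Item(0).Name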

Using API Searches Efficiently

Consider how you are searching, and how frequently. An API search may need to access the database to determine what to retrieve, so each time you call a search method in the API, it may generate load on the database. For example, GetChannelsByCustomProperty() does a table search of all the channels.

Use SQL Server Query Analyzer to watch the load that searches are putting on the database. When MSCMS is running efficiently, the database load should be low.

Best practice Use searches sparingly and/or make use of fragment caching. Use SQL Query Analyzer to monitor the database load and query execution.

Exceptions: API calls that access items directly, such as GetByPath(), GetByURL(), and GetByGuid(), are not subject to this performance issue.
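
As an illustration of pairing searches with fragment caching, the following hedged sketch caches the rendered output of a channel search; the Searches object path, the method signature, the custom property name, and the Url and Name properties are all assumptions to verify against the Publishing API documentation:

Dim ContentCache, strKey, strHtml, oChannels, oChannel
Set ContentCache = AutoSession.Session.GetContentCache

strKey = "search:PressReleases"   ' hypothetical cache key
strHtml = ContentCache.Lookup(strKey)

If IsEmpty(strHtml) Or strHtml = "" Then
    ' Cache miss: run the expensive search once...
    Set oChannels = Autosession.Searches.GetChannelsByCustomProperty("Type")
    For Each oChannel In oChannels
        strHtml = strHtml & "<a href=""" & oChannel.Url & """>" _
            & oChannel.Name & "</a><br>"
    Next
    ' ...and store the rendered fragment for future requests.
    ContentCache.Add strKey, strHtml
End If

Response.Write strHtml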

Limiting the Use of Placeholders

Consider how many placeholders there are on individual pages. The more placeholders a page needs to render, the longer the processing time.

Placeholder collections are treated like other MSCMS collections. Accessing the placeholder collection instantiates all placeholders, for example, running pPosting.Placeholders.Item("MyPlaceholder").

Best practice When storing metadata about a page, do not store it in a placeholder; make use of MSCMS custom properties instead. This data is not displayed at run time, so it does not need to be in a placeholder.
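
A brief hedged sketch of reading such metadata from a custom property; the property name is hypothetical, and the exact collection syntax should be checked against the Publishing API documentation:

' Read page metadata from a custom property rather than a placeholder.
Dim strReviewDate
strReviewDate = Autosession.ThisPosting.CustomProperties("ReviewDate").Value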

Site Navigation Considerations

On frameless sites, dynamic navigation is generated on every page request.

  • Dynamic navigation can be costly because it touches many objects.

  • If navigation is deep, for example a two-level DHTML pull-down menu, performance will be adversely affected.

Best practice To help offset this overhead, use fragment caching.

Performance Optimization Summary

SCA provides some configuration options that are relevant to performance tuning: marking sites read-only, and sizing the node and file caches.

MSCMS provides strong caching support through both the in-memory object cache and the file system cache. Fragment caching is the single most important performance driver; always use it where appropriate. It is not appropriate to use fragment caching during site design or during authenticated operations.

Site design and API usage affect overall site performance, with regard to such tasks as searches, collections, and navigation.

Taking the measures discussed in this section will contribute towards keeping your site performing at optimum speed. The site should be monitored and tuned on an ongoing basis.

Another way to improve performance is to plan carefully to provide sufficient hardware capacity for your site.

Section II: Capacity Planning

Capacity planning is the process of measuring the speed at which a Web site serves content to users.

The capacity of your site is determined by measuring the number of users who visit your site, and analyzing the types of activity those users perform and how much demand those activities place on the server. The results of these measurements are used to determine usage requirements.

The capacity of a system is measured by the number of requests it can handle, such as how many Web pages can be served up in a given time, and how much user traffic the site can handle.

In order to get the best results from MSCMS and your site, you need to plan for the capacity of your site and your system. Your goal is to ensure that your system can meet or exceed your site's projected maximum capacity demands.

The steps used to determine the capacity for your site, including general analysis and guidelines, transaction goals, and Transaction Cost Analysis methodology, are covered in this section, as well as the hardware needs of an MSCMS site. Also included are example MSCMS performance testing results.

Areas for Analysis

Consider the various areas of your resources in which analysis must be done to plan optimal capacity:

  • Web servers

  • Database server

  • Network

  • Disk speed and RAM

While these are the major areas to be examined, you should also look at some logical areas:

  • Run-time production server

  • Authentication

  • Content contribution (mix of design-time and run-time operations)

  • Deployment

Plan your site capacity requirements top-down. Consider what the business is trying to achieve, what the users will do on the site, and the quantity of users who will visit the site.

Create a list of transactions that will be performed on the site so you can calculate some measurements. The results from this planning will help you to determine the type of architecture required for your site, and how to fine-tune the code on your site. For more information about listing the transactions needed for your site, see the topic "Transaction Goals" later in this document.

After your site is operational, it should be monitored on an ongoing basis throughout its life cycle. You should monitor changes in business requirements and operations in order to make the appropriate adjustments.

General Guidelines

Use the following list of general guidelines when determining capacity planning for your site:

  • Read-only sites are constrained by the Web server CPU. Monitor for API searches and third-party searches, for example, Microsoft® SharePoint™ Portal searches.

  • Authoring makes up a very small percentage of a site's traffic, even in large sites.

  • Deployments are database CPU and Web server CPU intensive.

Note: While these guidelines have been proven true during testing, they should not be considered a substitute for your own site testing.

Transaction Goals

A clear understanding of planned targets is invaluable in creating a set of transaction goals. Consider the following when gathering information about your business requirements:

  • Talk to other employees about their functions and business plan goals, their marketing targets, and growth projections. Your CEO, CIO, or marketing department may have this information.

  • Validate and compare your estimates against other industry figures. Also, look at analysis for other sites your own company may have.

  • What is the anticipated user base?

  • Review known industry results and traffic levels.

  • Determine what the browse-to-buy ratio is—what is the ratio of hits before a purchase is made?

  • Determine peak traffic volumes—what is the range between average and peak traffic?

Based on the results of this list, you can estimate capacity requirements by calculating the throughput for your site, as shown in the following example. Numbers for a possible business plan are applied to the given formula in order to determine an estimate of throughput for a site.

Example: New E-commerce Initiative

Estimate the throughput using the following formula:

((revenue / average sale / browse-to-buy ratio) x average pages visited x requests per page) / seconds per year

Business Plan:

  • Drive $50 million in revenue through the site

  • Average customer sale: $40

  • Expected browse-to-buy ratio: 1.3%

  • Average user visits 10 pages, at 3 requests per page

Estimate based on the above figures:

((50,000,000 / 40 / 0.013) x 10 x 3) / 31,536,000 seconds per year = ~91 requests/sec

User Targets

When estimating the anticipated number of users, consider the following points:

  • How long are users browsing the site?

  • What is the maximum concurrent load: average vs peak loads. Are there any special events or offers that may affect the load, for example, holiday shopping?

  • How big is your registered user base?

  • What is the projected growth rate?

Calculate Concurrent Load

Based on the figures collected, you can:

  • Calculate the number of requests per second at peak using the formula:

    Visitors x requests per visit x peak multiplier / seconds in a day

  • Calculate the maximum concurrent users using the formula:

    Visitors x minutes per visit x peak multiplier / minutes in a day

  • Compare how much greater peak is than average.

Example

Category                Quantity
Visitors                500,000
Minutes per visit       7
Requests per visit      8
Peak multiplier         3.50
Requests at peak        162.03
Max concurrent users    8,506
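
Substituting the example figures into the formulas above reproduces the last two rows of the table:

Requests at peak = 500,000 x 8 x 3.50 / 86,400 seconds per day = 162.03 requests/sec

Max concurrent users = 500,000 x 7 x 3.50 / 1,440 minutes per day = 8,506 users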

Page Size Considerations

If a page is too big and serves up too slowly, users will leave the site. This is called user walkaway. An MSCMS page can be too big if it contains an above-average number of graphics or placeholders.

Pages should be kept small enough so that they will serve up in less than five seconds.

The following table shows the time, in seconds, it takes to serve a page of a given size, based on the type of access being used. Times greater than five seconds are considered too slow.

Access type    Throughput (bps)    2 KB    10 KB    20 KB    50 KB    100 KB    1 MB
28.8K          28,800              0.6     2.8      5.7      14.2     28.4      291.3
56K            57,600              0.3     1.4      2.8      7.1      14.2      145.6
DSL            640,000             0.0     0.1      0.3      0.6      1.3       13.1
Cable modem    800,000             0.0     0.1      0.2      0.5      1.0       10.5

Bandwidth Decisions

Example: Capacity based on peak

A site experiences 163 page requests per second at peak. The average page size is 30 KB. Therefore, the bandwidth requirement at peak is roughly 163 x 30 KB x 8 bits, or about 39 Mbps.

In the following table, the optimal choice for the example above is the T-3/DS-3 connection.

Connection    Mbps
T-1/DS-1      1.544
T-3/DS-3      45
OC-3          155
OC-12         622
OC-48         2,488

Install the highest capacity bandwidth line that your projections indicate will be needed. Keep in mind that it can take some months to get the required bandwidth installed.

The Transaction Cost Analysis tool, which calculates the amount of hardware required to support a given number of users, is an ideal tool to use for your analysis.

About the Transaction Cost Analysis Tool

The purpose of the Transaction Cost Analysis (TCA) tool is to calculate the amount of hardware required to support a given number of users or, conversely, how many users will be supported by a given hardware configuration. TCA calculates this by analyzing the cost of user operations. It greatly simplifies the transaction cost analysis process.

TCA uses a well-laid-out approach to estimating a site's capacity. It breaks down the transactions in the site and analyzes them. The results can be used to optimize the site, predict capacity, and help with scaling decisions. The TCA tool is used in conjunction with the Web Application Stress Tool (WAST).

Web Application Stress Tool

WAST allows you to stress your Web site and determine site throughput. To prepare for testing using WAST, you can enter a script manually, record a script, or build a list of several URLs to simulate a user transaction for test purposes.

When run with the TCA tool, WAST applies successively more load until maximum performance is reached, or a failure occurs.

Detailed information about the WAST can be found at https://www.microsoft.com/technet/archive/itsolutions/intranet/downloads/webstres.mspx.

Using the TCA Tool

The TCA tool provides a mechanism for incrementally increasing the system load for testing purposes. It monitors performance counters and terminates tests when set failure criteria are met. Furthermore, it produces Microsoft Excel graphs of measured performance counters and calculates the cost of user transactions.

Specifically, the TCA tool:

  • Acts as a WAST driver

  • Eases the work of performing transaction cost analysis

  • Calculates cost per operation automatically, by allowing you to measure the performance cost of site transactions

The transaction cost analysis methodology:

  • Allows you to determine the current capacity of your site

  • Points out how to improve the capacity of your site

  • Indicates where to scale the capacity

  • Projects the number of concurrent users your site can handle (capacity) with a given amount of hardware

Why Use TCA Methodology?

TCA provides the tools to allow you to determine capacity and when to scale:

  • What hardware is needed for a site

  • How many users a site can serve

  • When more servers should be added

  • If a site can survive special events, for example, holiday traffic

  • Where the site bottlenecks are

Steps Required in the TCA Process

The following steps outline the TCA process.

  1. Document site details.

    The documented information that you gather can be moved from one site to another, for use in future analysis.

    Keep in mind that the more closely the test environment mirrors the production environment, the more applicable the test results will be. Do not underestimate the effect this can have. Document the following:

    • Hardware configuration

      • Topology diagrams: firewall locations, network speed of transfer between hardware

      • Resource metrics: CPU speed, memory

    • Software used: versions, service packs, hot fixes

    • The site's complexity:

      • Number of pages and channels in the site

      • Pages with heavy usage of placeholders, containers with large number of items

      • Number of API searches

      • Dynamic navigation that does not make use of fragment caching

  2. Discover user profile.

    It is critical to measure the throughput of the site in a manner that reflects the way it will be used. You must know how users will use the site, for example, which pages they will visit and how often.

    In the same way that different sites have different capacities, a single site that is used in several ways will result in varying sets of performance characteristics for each usage. For example, a site that has a high percentage of users who search the site can support a much smaller number of users than a site where most users simply view pages. If the usage profile changes significantly, a new TCA must be run.

    Obtaining a user profile is easy with an existing site. It can be obtained from:

    • Traffic analysis of IIS logs

    • Commerce Server 2000 analytics

    • Third party log analysis tools

    If the site is new, existing knowledge and data can be applied to the new site. Remember to validate the projections for the new site by gathering actual traffic data after the site is launched, and doing a comparison.

    Using the data gathered, determine:

    • Average session length

    • Average number of operations performed in a session

    • The most frequently visited pages; generally, a small number of operations makes up more than 95% of usage, for example, displaying the home page

    • Statistical page hit distributions for the most frequently visited pages

    • Difference between average and peak loads

    Significant pages and their distribution make up the user profile operations. These need not add up to 100% of site usage.

    For example, the user profile shown in the following table was developed from IIS log files. This user profile is based on a user performing 11 operations in an 11-minute session. That is, one operation per minute with 1 minute think time between operations.

    User transaction       Page hits    Hit Ratio %    Operations
    Home page              58,285       22.82          2.5
    Search (good)          36,641       14.35          1.6
    Search (bad)           2,599        1.02           0.1
    Add item               4,482        1.76           0.2
    Add item + delete      2,726        1.07           0.1
    Add item + checkout    2,800        1.10           0.1
    Browse                 93,507       36.61          4.0
    Login                  4,418        1.73           0.2
    Register               2,700        1.06           0.1
    Zip code               40,771       15.96          1.8
    View cart              6,452        2.53           0.3
    Totals                 255,381      100.00         11.0

    Although the transactions in this table account for only the top 10% of pages, they account for 90% of the traffic to this site.

  3. Build the test site.

    You are now ready to build the site based on the information that has been gathered.

    Use the documentation to build the lab environment, keeping in mind that it must closely match the intended production environment.

Stress Script Creation and Cost Measurement

In order to use WAST to measure the cost of each operation in the usage profile, perform the following steps:

  1. Convert the usage profile into load generation scripts to run in WAST. Create one script for each operation. Note that scripts are run without think time, which is the time a user spends on one page.

  2. Use WAST to record relevant performance counter data. This should include the expected failure condition counter, in this case, CPU utilization, and throughput counters—Get requests/sec, ASP requests/sec, and so on.

  3. Use the TCA tool to run the WAST scripts for each operation. The TCA tool drives WAST to slowly increase the load until the failure criteria are met. Watch for other system bottlenecks that prevent the failure criteria from being reached.

  4. Record the maximum throughput obtained just before failure. Also note the percentage utilization of the failure counter at this point.

  5. Use the following formula to determine the cost of an operation:

    (% Utilization x Total Power) / Throughput = Cost per operation

    For example, throughput of 100 pages/sec at 84.5% utilization on a 1 GHz computer would produce a cost per operation of:

    (84.5% x 1,000 MHz) / 100 = 8.45 MHz per operation

    Note: If CPU is the performance-limiting resource, the Microsoft Excel graphs created from TCA will calculate the cost per operation for you.

Failure Criteria

Failure criteria can be met in a variety of ways. Most often failure occurs when the CPU load of the MSCMS server exceeds a set percentage, such as 85%. At this point further throughput is not likely because more load will lead to an increase in thrashing and non-useful activities. Depending on your configuration, other resources such as memory consumed, disk I/O, or network bandwidth might limit performance. If a resource other than CPU is limiting performance, that limiting resource should be modified to remove the bottleneck.

Track the limiting resource using the TCA tool. Set a failure value above which little extra throughput is expected. The TCA tool will then stop the test when this set failure value is reached.

Be wary of overstressed situations. These occur when latency becomes exceedingly high. Latency is the time it takes a packet to travel a network connection from sender to receiver. Anything over five seconds is too long and you will experience "walkaway" (the user leaving your site). Latency will creep up slowly until a sudden sharp increase tells you that you have reached an overstressed point. It is important not to measure throughput after this point.

In general, throughput should be measured at the point just prior to failure criteria being met or when latency is less than five seconds, whichever occurs first.

Cost per User

After the cost of the user profile operations is determined and a user profile is established, the Cost per User Operation per Second can be determined. The following table shows some examples of Cost per User, based on the operation being performed.

Operations            Hit Ratio    UPOs*    UPOs per sec**    Cost per Op (Mcycles)    Cost per User Op per Sec***
Home page             22.82        2.5      0.003804          53.77                    0.2045
Search (good)         14.35        1.6      0.002391          150.12                   0.3590
Search (bad)          1.02         0.1      0.000170          42.08                    0.0071
Add item              1.76         0.2      0.000293          261.54                   0.0765
Add item + delete     1.07         0.1      0.000178          359.56                   0.0640
Add item + checkout   1.10         0.1      0.000183          415.22                   0.0759
Browse                36.61        4.0      0.006102          58.50                    0.3570
Login                 1.73         0.2      0.000288          251.69                   0.0726
Register              1.06         0.1      0.000176          62.39                    0.0110
Zip code              15.96        1.8      0.002661          70.10                    0.1865
View cart             2.53         0.3      0.000421          84.50                    0.0356
Totals                             11.0                                                1.4496

* User Profile Operations (UPOs) over an 11-minute (660-second) session
** UPOs / 660 sec
*** Cost per Op x UPOs per sec

This table indicates that Search (good) has the highest overall Cost per User Profile Operation (UPO) per Second, and is therefore the operation that should be optimized first. Note that its Cost per Operation is not the highest; it is factoring in the search frequency that results in the high cost per user.

Summing the Cost per User Operation per Second for each operation in the user profile gives the total per-second cost of a user on the system. In this case, each user takes up 1.4496 MHz of CPU time.

Projecting Capacity

To project user capacity, use the formula:

Users x Cost per user

Recall that the cost per user was calculated in the previous table as 1.4496 Mcycles. So in the example for 50 users in the following table, 50 x 1.4496 = 72.5. The table shows results as the number of users increases in increments of 50.

Users    Cost (Mcycles)
0        0.0
50       72.5
100      145.0
150      217.5
200      290.0
250      362.5
300      435.0
400      580.0
450      652.5
500      725.0
550      797.5
600      870.0

Further, to calculate the user capacity based on the percentage of CPU used and the Cost per User, the formula is:

Capacity = CPU budget / User Profile Cost.

Examples: Assuming an 800 MHz System

User Capacity @ 50% CPU = 400 Mcycles / 1.4496 = 276 users

User Capacity @ 75% CPU = 600 Mcycles / 1.4496 = 414 users

Verifying Capacity

In order to verify the analyses, use WAST to create scripts that execute all operations at once, with the hit ratios set to those calculated in the user profile. Make a record of the capacity just before the failure criteria are met. The system's overall capacity should come close to the value predicted by the above formulas.

Unlike the stress scripts created from the usage profile, this script is run with think time. Use the session length, distributed across operations, as the think time.

Capacity Planning Summary

Before starting capacity planning, become familiar with the business goals of your company, and base the planning for capacity requirements on the information you have gathered.

Empirical data is a requirement for the related tasks of performance tuning and capacity planning. Be wary of unjustified statements about capacity requirements—look to your own analyses for reliable projections.

Use TCA methodology to determine site capacity and drive optimization.

Keep in mind that different sites have different performance characteristics.

Site usage profile and MSCMS deployment architecture have an effect on the capacity of a site.

Remember to measure performance early, often, and throughout the life cycle of the site.

Key Performance Counters

The following is a list of the key performance counters you should use to achieve the best performance and capacity from your site:

  • ASP/sec

  • SQL monitoring

  • % CPU (Web server and database server)

  • Memory usage (especially with deployment)

  • For latency measurement note:

    • Time to last byte (TTLB): how much time to process and download

    • Time to first byte (TTFB): how much time spent processing a page

  • ASP request execution time

  • ASP Errors/sec (there should not be any)

  • Requests queued, requests waiting, and ASP request wait time—these are useful for determining if there is an overstress situation
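
For reference, a partial mapping of these to their Windows Performance Monitor counter paths; verify the exact names against the counter list on your own system:

ASP/sec                      Active Server Pages\Requests/Sec
% CPU                        Processor(_Total)\% Processor Time
ASP request execution time   Active Server Pages\Request Execution Time
ASP Errors/sec               Active Server Pages\Errors/Sec
Requests queued              Active Server Pages\Requests Queued
ASP request wait time        Active Server Pages\Request Wait Time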

CMS and Usage Profiles

Create usage profiles for both the design and production servers.

Use the design-time profile to determine the amount of content that needs to be deployed to the production server, and how frequently it must be deployed.

Remember to factor the load that will be generated during deployment into both run-time and design-time servers.

CMS Sizing—Using the Results

After site throughput has been determined, the system capacity can be increased to meet specified business goals.

  • Scale, tune, or remove bottlenecks that have been exposed by the TCA methodology.

  • Use the information gathered from MSCMS performance optimization analysis to tune the site.

  • Scale out the site by adding more nodes to the Web server cluster as required, to increase capacity. The CMS Web server scales out linearly. (These figures are from a slow site and were run on slower hardware.)

  • Scale up the database server. MSCMS does not support federated SQL Servers, but scale-up works well for the database server. The MSCMS Web server scales up well to four processors, and possibly more. Testing also showed an increase in speed with the use of fragment caching.

Database load on disk and CPU was nominal for all test runs.

Note: Federated SQL Servers switch from one database to another to retrieve data that resides in a specific database. The MSCMS server switches from one database to another only during failover.

Deployment Performance

There are some points to consider when deploying a site. The number of deployment transactions has a marked effect on the speed of the operation.

The MSCMS deployment export operation starts fast, at around 20 to 30 postings per second. As more postings are exported, the speed decreases.

 


When importing new postings, a similar performance curve is seen. However, updating existing postings during an import is considerably slower because of the need to preserve an existing object while it is being replaced.

 


References

The following list contains technical resources you can use to help you with performance optimization and capacity planning of your Web site: