Chapter 4 - Capacity Planning

Capacity planning for a Web server installation involves determining the current and future needs of the installation, then choosing hardware and software that can meet current estimated needs, but that can also be expanded or upgraded to meet new needs as they arise. Because there are so many variables, and because needs change so rapidly, capacity planning is more of an art than a science, and typically requires an iterative approach.

This chapter is intended primarily for people who decide what equipment and software to acquire for Web sites, and for site administrators. It explains some aspects of Web server capacity, provides ways to determine where bottlenecks are likely to occur, and helps you determine how much network bandwidth you will need. It also contains a case study that provides insight into large-site issues, many of which can and do arise at smaller sites as well.

Capacity Planning Issues

Creating and maintaining a Web site involves managing traffic with hardware, software, and network bandwidth. This first section explores traffic.

Traffic

Servers send out pages in response to requests. In order to issue a request, a browser first establishes a Transmission Control Protocol (TCP) connection with the server, and then sends the request via that connection. Traffic, as the term applies to Web servers, is the resulting mixture of incoming requests and outgoing responses.

In general, traffic occurs in bursts and clumps that are only partly predictable. For example, many intranet sites have activity peaks at the beginning and end of the day, and around lunchtime; however, the exact size of these peaks will vary from day to day and, of course, the actual traffic load changes from moment to moment. Not surprisingly, there is a direct relationship between the amount of traffic and the network bandwidth needed. The more visitors your site has and the larger the pages it provides, the more network bandwidth your server requires.

To start with a simple example, imagine a server that displays static Hypertext Markup Language (HTML), text-only pages that average 5 kilobytes (KB) in size, which is about equivalent to a full page of printed text. The server is connected to the Internet through a DS1/T1 line, which can transmit data at 1.536 megabits per second (Mbps). Inherent overhead makes it impossible to use the full-rated T1 bandwidth of 1.544 Mbps. How many pages per second can the server send out under optimum conditions? To answer this question, it is necessary to look at the way the information travels between computers.

Data travelling on a network is split into packets. Each packet includes, in addition to the data it carries, roughly 20 bytes of header information and other network protocol information (all this extra information constitutes "overhead"). The amount of data in a packet is not fixed, and thus the ratio of overhead to data can vary. Most incoming HTTP requests are small. A typical request (for example, GET https://www.microsoft.com/default.asp), including the Transmission Control Protocol/Internet Protocol (TCP/IP) overhead, consists of no more than a few hundred bytes as it travels across the network. For a 5-KB file like the one in this example, protocol overhead is significant, amounting to about 30 percent of the file's size. For larger files, the overhead accounts for a smaller percentage of network traffic. Overhead can become an important consideration when you are estimating your site's bandwidth requirements and deciding how fast a connection you'll need. If you are close to the limits of your connection's capacity, an extra 20 percent for overhead may mean that you will require the next fastest type of connection.

Table 4.1 shows the traffic generated by a typical request for a 5-KB page. Note that all the figures for overhead are estimates. The precise number of bytes sent varies with each request.

Table 4.1 Traffic Generated by a Request for a 5-KB Page

Traffic Type         Bytes Sent
TCP Connection       180 (approx.)
GET Request          256 (approx.)
5-KB file            5,120
Protocol overhead    1,364 (approx.)
Total                6,920

To find the number of bits, multiply the number of bytes sent by 8 bits per byte: 6,920 x 8 = 55,360. As stated previously, a T1 line can transmit 1.536 Mbps. Dividing bits per second by bits per page, 1,536,000/55,360, provides a maximum rate of just under 28 pages per second. (Because modems add a start bit and a stop bit to each byte, they are slower than the raw numbers appear to indicate.)
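
The same arithmetic can be expressed as a short script. The following is a rough sketch for Windows Script Host (run it with cscript); the 6,920-byte total comes from Table 4.1, and the 12,040-byte figure for a page with one small image is an assumption that ignores the extra protocol overhead a second file would add.

' PagesPerSecond.vbs - rough upper bound on pages per second for a given line.
' Ignores latency, retransmission, and connection reuse.
Option Explicit

Function PagesPerSecond(lineBitsPerSec, bytesPerPage)
    PagesPerSecond = lineBitsPerSec / (bytesPerPage * 8)   ' 8 bits per byte
End Function

' 5-KB page plus overhead (Table 4.1) on a DS1/T1 line (1.536 Mbps usable).
WScript.Echo "T1, 5-KB page:           " & FormatNumber(PagesPerSecond(1536000, 6920), 1)

' Adding one 5-KB image roughly doubles the bytes sent for each page view.
WScript.Echo "T1, page plus one image: " & FormatNumber(PagesPerSecond(1536000, 12040), 1)

The first figure matches the "just under 28 pages per second" calculated above; the second is close to the roughly 15 pages per second cited below for a page with a single small image.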

Table 4.2 illustrates the relative speeds of several network interface types, using the small, text-only page just mentioned. Numbers of pages transmitted at speeds faster than the standard Ethernet rate of 10 Mbps are rounded.

Table 4.2 Relative Network Interface Speed

Connection Type                 Connection Speed        5-KB Pages Sent per Second
Dedicated PPP/SLIP via modem    28.8 Kbps               Roughly half of 1 page
Frame Relay or fast modem       56 Kbps                 Almost 1 page
ISDN                            128 Kbps                Just over 2 pages
Typical DSL                     640 Kbps                Almost 11 pages
DS1/T1                          1.536 Mbps              26 pages
10-Mb Ethernet                  8 Mbps (best case)      (Up to) 136 pages
DS3/T3                          44.736 Mbps             760 pages
OC1                             51.844 Mbps             880 pages
100-Mb Ethernet                 80 Mbps (best case)     (Up to) 1,360 pages
OC3                             155.532 Mbps            2,650 pages
OC12                            622.128 Mbps            10,580 pages
1-Gbps Ethernet                 800 Mbps (best case)    (Up to) 13,600 pages

If you add a small graphic to the 5-KB page, the results will be considerably different. An image, in the form of a .jpg file that appears on screen as, perhaps, a 1-by-2-inch rectangle (the actual physical size depends on monitor settings), takes up about as much disk space as the original text file. Adding one such image file to each page nearly doubles the average page size. This increased size reduces the number of pages that the server can send out to the Internet on a DS1/T1 line to a maximum of about 15 per second, regardless of how fast the computer itself runs. If there are several images on each page, if the images are relatively large, or if the pages contain other multimedia content, they will take considerably longer to download.

Given a page of moderate size and complexity, there are several ways to serve more pages per second:

  • Remove some images from the page.

  • Use smaller pictures if you currently send large ones, or compress the existing pictures (if they are already compressed, compress them further).

  • Offer reduced-size images with links to larger ones, and let the user choose; use images of a file type that is inherently compact, such as a .gif or a .jpg file, to replace inherently large file types such as a .tif.

  • Connect to the network by using a faster interface.

The last option resolves the issue at the server but not necessarily at the client, as will be shown later in this section.

A site that serves primarily static HTML pages, especially those with simple structure, is likely to run out of network bandwidth before it runs out of processing power. On the other hand, a site that performs a lot of dynamic page generation, or that acts as a transaction or database server, uses more processor cycles and can create bottlenecks in its processor, memory, disk, or network. There are no hard and fast rules that apply to all sites (one of the reasons why a comprehensive discussion of capacity planning is difficult), but the general relationship between bandwidth and CPU use for static versus dynamic pages is shown in Figure 4.1.

Figure 4.1 Relative Demands of Static vs. Dynamic Content for a Page of a Given Size

Browser Download Time

The number of pages a server can send is one half of the bandwidth equation. The other half is the time it takes a browser to download a page.

Consider how much time a browser needs to download a page that, including overhead, amounts to 90 KB or so, which equals about 720 kilobits. (Pages of this size are not at all unusual.) Ignoring latencies, which typically add a few seconds before any of the data arrives, it takes roughly 25 seconds to download 720 kilobits through a 28.8 kilobits per second (Kbps) connection if everything is working perfectly. On the other hand, if there's any blocking or bottlenecking going on at the server, if the network is overloaded and slow, or if the user's connection is slower than 28.8 Kbps (because of poor line quality, for example), the download will take longer.

If the client computer has a higher-bandwidth connection on an intranet, for example, the download time should be much shorter. If your Web site is on the Internet, however, you cannot count on a majority of users having faster connections until the next wave of connection technology becomes well-established. At the time of writing, a 56 Kbps modem standard has been adopted, but many (if not most) telephone lines are too noisy to allow full-speed connections with 56 Kbps modems. In addition, cable modem and Digital Subscriber Line (DSL) technologies are just beginning to appear in enough areas to compete in earnest. For this reason, it is not possible to tell which connection mode will take the upper hand or, for that matter, whether some other technology will appear and supersede both.

From the Server Side

It takes about 52 connections at 28.8 Kbps to saturate a DS1/T1 line. If no more than 52 clients simultaneously request the page used in the preceding example, and if the server can keep up with the requests, the clients will all receive the page in the 25 seconds calculated in the example (again, ignoring the typical delays).

If 100 clients simultaneously request that same page, however, the total number of bits to be transferred will be 100 times 737,280 bits (720 kilobits). It takes between 47 and 48 seconds for that many bits to travel down a DS1/T1 line. At that point the server's network connection is the limiting factor, not the client's.

Figure 4.2 shows the relationship between concurrent connections and saturation for DS1/T1 and DS3/T3 lines, assuming all clients are using a modem transmission speed of 28.8 Kbps and are always connected. A DS3/T3 line carries nearly 45 Mbps, about 30 times as much capacity as a DS1/T1 line, and it takes more than 1,500 clients at 28.8 Kbps to saturate its bandwidth. Moreover, the increase in download time for each new client is much smaller on a DS3/T3 line. When there are 2,000 simultaneous 28.8 Kbps connections, for example, it still takes less than 33 seconds for a client to download the page.

Figure 4.2 Download Time vs. Server Network Bandwidth
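
The curves in Figure 4.2 can be approximated with a simple model: each client downloads at its own connection speed until there are enough simultaneous clients to saturate the server's line, after which the clients share the line equally. The sketch below encodes that assumption; like the figures above, it ignores latency, protocol overhead, and TCP behavior, and it uses the page size and connection speeds from this example.

' DownloadTime.vbs - rough download time for one client among N concurrent clients.
Option Explicit

Const PAGE_BITS = 720000   ' 90-KB page, about 720 kilobits
Const MODEM = 28800        ' 28.8 Kbps client connection

Function DownloadSeconds(pageBits, clientBitsPerSec, lineBitsPerSec, clients)
    Dim perClient
    perClient = lineBitsPerSec / clients              ' fair share of the server line
    If perClient > clientBitsPerSec Then perClient = clientBitsPerSec
    DownloadSeconds = pageBits / perClient
End Function

WScript.Echo "T1, 52 clients:    " & FormatNumber(DownloadSeconds(PAGE_BITS, MODEM, 1536000, 52), 0) & " seconds"
WScript.Echo "T1, 100 clients:   " & FormatNumber(DownloadSeconds(PAGE_BITS, MODEM, 1536000, 100), 0) & " seconds"
WScript.Echo "T3, 2,000 clients: " & FormatNumber(DownloadSeconds(PAGE_BITS, MODEM, 44736000, 2000), 0) & " seconds"

With these assumptions the model reproduces the figures in the text: about 25 seconds while the DS1/T1 line is not yet saturated, 47 to 48 seconds with 100 clients on a DS1/T1 line, and just under 33 seconds with 2,000 clients on a DS3/T3 line.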

This example assumes that the server is capable of performing the relevant processing and handling of 2,000 simultaneous connections. That's not the same as handling 2,000 simultaneous users: Users occasionally stop to read or think, and typically spend only a modest percentage of their time downloading, except while receiving streaming multimedia content. Because of this difference between users and connections, the number of users that IIS 5.0 can support is larger than the figures would seem to indicate. A Web server on a DS1/T1 line can typically handle several hundred users connecting at 28.8 Kbps, and with a DS3/T3 line the number typically climbs to 5,000 or more. While these numbers are derived from actual servers, you can expect performance to vary with differing content types and user needs, and with the number and type of services being performed by a particular computer.

Essentially, the network performance differences shown here scale linearly, and the scaling continues at larger data-transfer rates. If you have two DS3/T3 lines, for example, you can serve approximately twice as many clients as you can with one, provided that you have enough processor power to keep up with user demand and that no bottlenecks prevent your servers from maximizing their processing power. For an example of such a bottleneck, see "A Large-Site Case Study: microsoft.com" later in this chapter.

Perceived Time

It is important to remember that the amount of time a page seems to take to appear, as perceived by the user, is not identical to the (measurable) amount of time it takes for the page to fully display. If the first thing the user sees upon reaching a given page is a set of buttons allowing further navigation, the user may never know or care that the rest of the page takes over a minute to download. If, on the other hand, the page takes over a minute to download and the navigation buttons don't appear until after the rest of the page, users probably won't bother to wait unless they are forced to by circumstances. An acceptable delay depends to some extent on the kind of information provided by the page, but it is ordinarily no more than 30 seconds. If the information is important, your users will be more likely to wait longer for it, albeit reluctantly.

Other Services and Activities

If your server is also acting as a Dynamic Host Configuration Protocol (DHCP), Windows Internet Name Service (WINS), or Domain Name System (DNS) server on a local area network (LAN), or as an e-mail and news group server, you must take into account the network bandwidth used by these services when planning your bandwidth budget (as well as your processing budget). If you anticipate heavy traffic, you should move these other services to a different server computer.

When testing new applications, particularly on a site that is actually in service, it's a good idea to run them out of process, despite the fact that this introduces additional computing overhead. For more information about running out-of-process applications and services, see "Data Access and Transactions" in this book.

Considerations

It is important to consider both server-side and client-side bandwidth when choosing your server's network connection(s). If your server connects only to an intranet, this question may be moot, except when planning a major upgrade; but if your site connects to the Internet, there are many possible types and speeds of network connection. The speed you require depends on the amount and type of traffic your site generates. The 5-KB static Web page discussed in the section on traffic earlier in this chapter is representative of many short, text-only pages, but relatively few Web pages contain only text. In the worst case, a page containing a substantial number of graphic elements can require one GET request per graphic. These requests add up quickly and have an impact on performance.

Web pages are increasingly being built as applications, and as a result are more processor-intensive. This by itself does not necessarily have much effect on bandwidth requirements. Content, however, does have an effect. Streaming multimedia, for example, is inherently bandwidth-intensive, and unless you can guarantee that your server will have only a small number of users on a high-speed intranet, you may find that spikes in user volume will easily overwhelm your network bandwidth capacity. Pages that perform extensive lookups in a database, on the other hand, may put a heavy load on the link from the Web server to the database server (and may place a very heavy load on the database server's CPUs). However, these pages will not place much demand on the link from the Web server to the intranet or Internet, unless the returned datasets are quite large.

HTTP 1.1

IIS 5.0 automatically determines the size of any objects (such as text, files, or graphics) on a static page and the size of the page itself. When the client issues a GET request, the server uses a "Content-Length" header entry to report the size of the requested object or page. This HTTP 1.1 header entry has minimal cost in terms of overhead, and allows the browser to determine approximately how long the active connection will be used, thus affecting the browser's connection strategy. A browser such as Microsoft® Internet Explorer creates a new connection only when an existing one is "blocked." If the order of requests is such that larger .gif files are downloaded first, or if the connection speed is slow, more connections may be necessary.
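
As an illustration, a request and the beginning of the corresponding response might look like the following (the host and file names are placeholders; the Content-Length value is the 5-KB file from the earlier example):

GET /default.htm HTTP/1.1
Host: www.example.com

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 5120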

Caching and the Refresh Process

In a refresh request, Internet Explorer (the client) tells the server the datestamp of the version of a file it has in its cache, using the "If-Modified-Since" header. IIS 5.0 then determines whether the file has been modified since that time. If it has not, IIS 5.0 replies with a brief "Not Modified" response.
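
A hypothetical exchange looks like the following (the path and date are placeholders). If the file had changed, the server would instead return a 200 response containing the new copy.

GET /products/default.htm HTTP/1.1
Host: www.example.com
If-Modified-Since: Fri, 04 Jun 1999 17:00:00 GMT

HTTP/1.1 304 Not Modified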

A static HTML page is not retrieved during a screen refresh if it has not been updated. Some publication processes copy files that haven't been modified, which gives them new timestamps and thus "updates" them as far as the system is concerned. These files are downloaded even though they haven't really changed. When you set up your publication process, you should make every effort to avoid this waste of resources.

By default, IIS 5.0 sets HTTP cache-control to prevent browsers from caching processed page scripts in Active Server Pages (ASP), because there is no way to guarantee that an ASP page will be the same the next time it is requested. (IIS 5.0 caches compiled, ready-to-process scripts in ASP pages in its Script Engine Cache.)

Thus, under ordinary circumstances, just changing a file's extension from .htm to .asp, without putting any script on the page, causes a screen refresh to take longer. Because IIS 5.0 checks for this condition, the extra time is minimized but not entirely eliminated.
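
If a particular ASP page is known to produce the same output for every request, you can override the default from the page itself. The following fragment is a minimal sketch; the ten-minute lifetime is an arbitrary example, and you should apply this only to pages whose output does not vary by user or request.

<%
' Sketch only: allow downstream (proxy and browser) caching of this ASP output.
Response.CacheControl = "Public"   ' ASP output defaults to "private"
Response.Expires = 10              ' minutes before the cached copy expires
%>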

Secure Sockets Layer

When a browser makes a Secure Sockets Layer (SSL) request for a page, a delay occurs while the server encrypts the page. When the server sends the page, SSL adds additional network overhead. To enhance security, SSL also disables proxy and browser caching, so the considerable performance gains these allow are lost. The transaction time with SSL can be as much as a full order of magnitude longer. Finally, once the browser has received the requested files, the user must wait while they are decrypted, causing the download time to be even longer.

For these reasons, you should use SSL only when it is really necessary, such as when you need to ensure the security of financial transactions. Also, you should use it only on the specific pages to which it applies, rather than on an entire site.

Web Application Performance

If Web applications are an important part of your site, the performance of those applications is critical in determining the site's capacity. Testing is the only way to find out the capacity and performance of a Web application. The Web Capacity Analysis Tool (WCAT) and Web Application Stress Tool utilities included on the Microsoft® Windows® Server Resource Kit companion CD are useful testing tools. Before writing an application, however, it's useful to have a sense of the performance capabilities of different Web application types. In IIS 5.0, Internet Server Application Programming Interface (ISAPI) applications running as in-process dynamic-link libraries (DLLs) generally offer the best performance. Next best are ASP pages, followed by Common Gateway Interface (CGI) applications.

For most applications, the recommendation is to use scripting in ASP pages to call server-side components. This strategy offers performance comparable to ISAPI performance, with the advantage of more rapid development time and easier maintenance. For more information about Web application development strategy, see "Developing Web Applications" in this book.

It would be helpful here to look at some comparative performance data. Tests compared uniprocessor and multiprocessor performance for several tasks that were designed to be as similar as possible. These tests also measured and compared the costs of ISAPI, CGI, ASP, and static content. In the test, each program takes a URL passed in the query string, maps it to a physical file on the server, and sends back that file. (For static content the URL is passed directly to the server, which responds normally.)

The ISAPI version achieves most of its speed by having much less overhead (and functionality) than the others; it uses the ServerSupportFunction(HSE_REQ_TRANSMIT_FILE) function. Even though the CGI version is nearly a line-for-line copy of the ISAPI version, it is slower because a new process must be started each time the CGI program is executed. Static content is fastest of all because IIS 5.0 is highly optimized for transmitting static content. In-process ISAPI and ASP applications execute faster than (pooled) out-of-process applications. Note that the test programs, while equivalent, are idiomatic. That is, each uses the best and most natural choice of tools from its framework. The ISAPI version uses TransmitFile, which is extremely fast but is not available to ASP pages or CGI programs.

The following code sample is the ASP version, stripped of error handling:

<% @ EnableSessionState = false %> 
<% ' Usage: https://server/test/sendfile.asp?test/sample/somefile.txt 
strPhysicalFilename = Server.MapPath(Request.QueryString) 
Set oFS = Server.CreateObject("Scripting.FileSystemObject") 
Set oFile = oFS.OpenTextFile(strPhysicalFilename) 
Response.ContentType = "text/plain" 
Response.Buffer = true 'Default in IIS 5.0, but not in IIS 4.0. 
Response.Write oFile.ReadAll 
oFile.Close 
Response.End 
%> 

The entire Sendfile Test Suite is included on the Resource Kit companion CD, with instructions on setting up and running the tests.

The scripts and executables were tested with Beta 3 Release of IIS 5.0 and Microsoft® Windows® 2000 Server, using WCAT and the Web Application Stress Tool. The tests were run for 60 seconds, with 10 seconds of warm-up time and 10 seconds of cool-down time.

Table 4.3 shows the performance of the different application types, running on uniprocessor and multiprocessor kernels. Table 4.4 shows the hardware and software used for the tests.

The figures given in Table 4.3 are the actual numbers of pages per second that were served during testing, and in that narrow sense they are absolute. In several other senses, however, they are relative. First, the software being tested was not the final release version. Second, different computer types will, of course, provide different performance for the same test. Finally, the performance of different application types depends greatly on the application's task. Because this particular task is a relatively light load, it maximizes the differences among the various methods. Heavier tasks result in performance differences that are smaller than those reflected in the table.

Table 4.3 Performance of IIS 5.0

Test                        Non-SSL, 1 CPU    Non-SSL, 2 CPUs    Non-SSL, 4 CPUs    SSL, 1 CPU    SSL, 2 CPUs    SSL, 4 CPUs
ISAPI In-Process            517               723                953                50            79             113
ISAPI Out-of-Process        224               244                283                48            76             95
CGI                         46                59                 75                 29            33             42
Static file (file8k.txt)    1109              1748               2242               48            80             108
ASP In-Process              60                107                153                38            59             83
ASP Out-of-Process          50                82                 109                28            43             56

All numbers denote requests per second. Each test repeatedly fetched the same 8-KB file.

Table 4.4 Setup for the Performance Test

Server      Compaq Proliant 6500 (4 x Pentium Pro 200 MHz) with 512 MB of 60 ns RAM
Clients     16 Gateway Pentium II machines, 350 MHz, each with 64 MB of RAM
Network     Each client: one Intel Pro100+ 10/100 Mbps network adapter
            Server: four Intel Pro100+ 10/100 Mbps network adapters
            Four separate networks were created to distribute the workload evenly for the server, with four clients per network. Two Cisco Catalyst 2900 switches were used, each having two Virtual LANs (VLANs) programmed.
Software    Server: Windows 2000 Advanced Server Beta 3 (Build 2031), IIS 5.0
            Clients: Windows 2000 Professional (Build 2000.3)
            Testing: Web Application Stress Tool (Build 236)

Note: These tests were conducted with "out-of-the-box" computers and programs. No additional registry changes or performance enhancements were applied.

Multiprocessor Scaling

IIS scaling to multiple processors is improving, but when all processors in one computer must share the system bus and other resources, symmetric multiprocessor (SMP) scaling simply cannot map 1:1 with the number of processors. Multiple system buses can help with this issue; but for computers in which all processors share a single bus, a pair of two-processor machines can provide more throughput than a single four-processor machine, and may have a similar price tag.

Reliability

Some sites can afford to fail or go offline; others cannot. Many financial institutions, for example, require 99.999 percent or better reliability. Site availability takes two forms: The site might need to remain available if one of its servers crashes, and it might need to remain online while information is being updated or backed up.

Even if your requirements are less rigorous than those of a major financial institution, you will probably want to use Redundant Arrays of Independent Disks (RAID). You can also consider creating a "Web farm" with Network Load Balancing; in addition, you can create subsystem redundancy by clustering component servers.

Server Clustering

The term "failure" commonly brings to mind the idea of a system crash, but in fact many system failures are deliberate: the administrator brings a server down for routine maintenance or for hardware installation or an upgrade. Clustering makes it possible to take a server down for maintenance or service without causing the site itself to fail, and also provides reliability in the event of an unscheduled failure.

Microsoft supports two clustering solutions. The first of these is Windows 2000 Server clustering; the second is Network Load Balancing. The next two sections describe them.

Windows 2000 Server Clustering

With Windows 2000 Server clustering you can set up applications on two servers (nodes) in a cluster and provide a single, virtual image of the cluster to clients, as shown in Figure 4.3. If one node fails, the applications on the failed node become available on the other node. That is, the actual content and applications are shared so that both machines have full access to them. Failover times range from 20 seconds to 2 minutes.

Bb742409.iis0403(en-us,TechNet.10).gif

Figure 4.3 A Windows 2000 Server Cluster

Clustering is available with Windows 2000 Advanced Server. To use the Windows 2000 clustering feature, you must have two servers that are connected by a high-speed private network. Each of the servers must have at least one shared Small Computer Systems Interface (SCSI) bus, with a storage device connected to both servers, and at least one storage device that is not shared. For the most reliability, each computer should have its own uninterruptible power supply.

Windows 2000 Server clustering provides mirroring with rapid failover, and is an excellent way to ensure reliability on, for example, a Microsoft® SQL Server computer connected to either a single Web server or a group.

Network Load Balancing

The Network Load Balancing feature of Windows 2000 Advanced Server allows you to create server clusters containing up to 32 machines. Network Load Balancing is fully distributed, and is entirely software-based; it does not use a centralized dispatcher, and does not require any specialized hardware.

This feature transparently distributes client requests among the hosts in the cluster, using virtual Internet Protocol (IP) addresses. You must run IIS 5.0 or another TCP/IP service on each host, and the hosts must serve the same content so that any of them can handle any request. You can copy updated pages to local disks on the hosts, or use commercial file-replication software to perform updates. Network Load Balancing allows you the option of specifying that all connections from the same client IP address be handled by a particular server (unless, of course, that server fails). It also permits you to allocate all requests from a Class C address range to a single server. In addition, Network Load Balancing supports SSL.

DNS Round-Robin Distribution

Domain Name System (DNS) round-robin distribution is an earlier and less sophisticated technique for allocating requests among servers in a group. Consider a scenario in which there are four IP address entries for the same host on a DNS server:

domain.com A 172.17.21.31 
domain.com A 172.17.21.35 
domain.com A 172.17.28.41 
domain.com A 172.17.28.52 

If a client sends a query, the DNS server returns all four IP addresses, but typically the client uses only the first one it receives. With round-robin distribution, the next time the DNS server receives a query for this host the order of the list is changed in a cyclic permutation or "round-robin" (the address that was first in the previous list is last in the new list). Thus, when the client chooses the first IP address in the list, it connects to a different server.

This technique distributes incoming requests evenly among the available IP addresses, but it does not fully balance the load, because it is not interactive: the DNS server checks neither the current load on a given IP address nor whether the server at that address is running at all. Nonetheless, round-robin distribution can be a useful starting point or a low-cost alternative for small groups of servers. If you are using round-robin distribution, you should keep close tabs on your servers so that you can quickly remove any failed machines from the distribution list.
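
The cyclic permutation itself is simple. The following sketch, for illustration only, prints the answer order a round-robin DNS server would return for four consecutive queries, using the addresses from the example above; run it with cscript.

' RoundRobin.vbs - illustrates the rotation applied by round-robin DNS.
Option Explicit

Dim addresses, query, i, first
addresses = Array("172.17.21.31", "172.17.21.35", "172.17.28.41", "172.17.28.52")

For query = 1 To 4
    WScript.Echo "Query " & query & ": " & Join(addresses, ", ")
    ' Rotate: the address that was first moves to the end of the list.
    first = addresses(0)
    For i = 0 To UBound(addresses) - 1
        addresses(i) = addresses(i + 1)
    Next
    addresses(UBound(addresses)) = first
Next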

Determining Your Installation's Requirements

If you are building a new server site, you must develop a sense of its requirements in order to decide what hardware and software to acquire. Consider what the site will be required to do, and what services it will provide. Will it:

  • Serve a known set of clients on an intranet or intranet/extranet connection to the Web? (That is, can you predict the number of clients with any accuracy, and is it possible to get a good sense of their usage behavior?)

  • Allow people to download and possibly upload files with File Transfer Protocol (FTP) service?

  • Provide e-mail with Simple Mail Transfer Protocol (SMTP) service?

  • Provide news with Network News Transfer Protocol (NNTP) service, which is typically coupled with large storage requirements?

  • Allow clients to access a database or large index using SQL or other query language, or with Indexing Service? Allowing this kind of access will entail large storage requirements for the database itself.

  • Perform transactions using Microsoft® Component Services, which is processor-intensive?

  • Provide name service?

  • Function as a domain controller?

  • Provide file and print service?

As discussed earlier, there are also page construction and Web application issues. Will your server or servers primarily send out static HTML pages? (Doing so involves memory for caching files.) Will the server run scripts in ASP pages (ASP service involves queue-length optimization issues as well as cache space), ISAPI, and CGI applications? (ASP, ISAPI, and CGI are all processor-intensive, and CGI applications are inefficient under Windows 2000 Server.) Do you need to run some of your pages out of process, for testing or other purposes? (This is also processor-intensive.) In addition, you must figure in the constraints of budget, facilities, and staffing.

If you can find a site with requirements that are similar to yours and examine its history, you may be able to discover (and possibly avoid) some of these potential pitfalls.

If you are working with an existing site, you can monitor the site server and log its performance. This will give you a sense of current conditions and will let you see how well the existing hardware and software are meeting current user needs. (See "Monitoring and Tuning Your Server" in this book.)

A Capacity Planning Checklist

This checklist provides an in-depth look at your site and helps you anticipate where bottlenecks are likely to occur.

Purpose or Type of Site

First, decide whether your site will be transactional; that is, determine whether customers will be retrieving and storing information, typically in a database. A transactional site involves both reliability and security requirements that do not apply to most nontransactional sites.

Complexity Level

Next, consider whether the content on your site will be static. A site that makes use of SQL, ASP, ISAPI, CGI, or multimedia requires much more processing capability than a static site.

Customer Base

Consider the size of your customer base and its potential for expansion, on at least the following time scales:

  • Current to one month

  • Six months

  • One year

Finding Potential Bottlenecks

Find out what is likely to break first. Unless your site is extremely small, you'll need a test lab to determine that. (There are suggestions for building and using such a lab in the following list.)

To determine potential bottlenecks

  1. Draw a block diagram showing all paths into the site. Include, for example, links to FTP download sites as well as other URLs.

  2. Determine what machine hosts each functional component (database, mail, FTP, and so on).

  3. Draw a network model of the site and the connections to its environment. Define the topography throughout. Identify slow links.

    For each page, create a user profile that answers the following questions:

    • How long does the user stay on the page?

    • What data gets passed to (or by) the page?

    • How much database activity (or other activity) does the page generate?

    • What objects live on each page? How system-friendly are they? (That is, how heavily do they load the system's resources? If they fail, do they do so without crashing other objects or applications?)

    • What is the threading model of each object? (The "agile" model, in which objects are specified as both-threaded, is typically preferable, and is the only appropriate choice for application scope.)

  4. Define which objects are server-side and which are client-side.

  5. Build a lab. You'll need at least two computers, because if you run all the pieces of WCAT on one computer, your results will be skewed by WCAT's own use of system resources. Monitor the performance counters at 1-second intervals. When ASP service fails it does so abruptly, and an interval of 10 or 15 seconds is likely to be too long; you'll miss the crucial action. Relevant counters include CPU utilization, Pool nonpaged bytes, connections/sec, and so on. (For more information about counters, see "Monitoring and Tuning Your Server" in this book.)

    Throw traffic at each object, or at a simple page that calls the object, until the object or the server fails. Look for:

    • Memory leaks (steady decrease in pool nonpaged bytes and pool paged bytes).

    • Stop errors and Dr. Watsons.

    • Inetinfo failures and failures recorded in the Windows® Event Log.

  6. Increase the loading until you observe a failure; document both the failure itself and the maximum number of connections per second you achieve before the system tips over and fails.

  7. Go back to the logical block diagram, and under each block fill in the amount of time and resources each object uses. This tells you which object is most likely to limit your site, presuming you know how heavily each one will actually be used by clients. Change the limiting object to make it more efficient if you can, unless it is seldom used and well off the main path.

  8. Next, traceroute among all points on the network. Clearly, you can't traceroute the entire Internet; but you can certainly examine a reasonable number of paths between your clients and your server(s). If you are operating only on an intranet, traceroute from your desk to the server. This gives you a ballpark estimate of the routing latencies, which add to the resolution time of each page. Based on this information, you can set your ASP queue and Open Database Connectivity (ODBC) timeouts. (For more information about ASP queuing, see "Monitoring and Tuning Your Server" in this book.)

Note: If the first seven steps appear to bear some resemblance to an inverted version of the Open Systems Interconnection (OSI) "layer cake" model, there's a reason. The OSI model is a highly useful lens through which to examine server behavior.

Network Bandwidth

Once you determine how many customers/clients you want to serve during a given time period, you have the lower limit for your network connection bandwidth. You need to accommodate both normal load and usage spikes; you can average these to get a reasonable estimation of network capacity.
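
As a rough planning aid, you can turn these estimates into a minimum connection speed. The sketch below assumes you can estimate the peak number of concurrent clients, the average page size including overhead, and the request rate per client; all of the sample figures are placeholders to be replaced with your own measurements.

' BandwidthEstimate.vbs - rough lower bound for the connection bandwidth you need.
Option Explicit

Dim peakClients, bytesPerPage, pagesPerClientPerMinute, bitsPerSecond

peakClients = 400               ' placeholder: expected peak concurrent clients
bytesPerPage = 12000            ' placeholder: average page size plus protocol overhead
pagesPerClientPerMinute = 2     ' placeholder: request rate per client

bitsPerSecond = peakClients * pagesPerClientPerMinute * bytesPerPage * 8 / 60
WScript.Echo "Estimated peak load: " & FormatNumber(bitsPerSecond / 1000000, 2) & " Mbps"

Compare the result, plus a safety margin for spikes, against the connection speeds in Table 4.2 to choose a line type; with the placeholder figures above, the estimate works out to about 1.3 Mbps, which is close to the capacity of a DS1/T1 line.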

Of course, the type of site you operate has a large effect on this issue. For example, if you are largely or entirely subscriber-based, or if your site is only on an intranet or an intranet/extranet combination, you probably already have a good idea of the maximum spike size. If, on the other hand, you issue software revisions to an audience of unknown size on the Web, there may not be a good way to predict the size of resulting spikes. You might, in fact, have to measure one or more actual occurrences to decide whether your bandwidth is sufficient.

A Note for Internet Service Providers

It's probably a good idea to set some standards. Tell customers that certain (tested and approved) objects are permitted on your servers and that others (tested and failed, or untested) are not. If they want to use untested objects or others known to be bad, they can do so, but you will likely want them to have their own server computer(s). Provide them with a page on one of your servers that they can log into through a secure connection, in order to reset their server remotely if it goes down.

Capacity Planning Scenarios

While all Internet sites differ in how they are implemented, they do follow standard capacity planning profiles that can be categorized into the following scenarios:

  • The Intranet Web Site

  • The Internet Marketing Web Site

  • The Internet Transactional Web Site

  • The Internet Commerce Web Site

It is important to understand that while the technology within IIS 5.0 is the same for each scenario, there are external technical and business decisions that influence the way a Web site is deployed and then expanded.

The Intranet Web Site

Many of today's Web sites exist only within corporations, and are not connected directly to the Internet. This scenario addresses the issues involved in establishing corporate intranets.

Purposes

Frequently, intranets are built to fulfill one or both of the following functions:

  • Provide information to internal employees.

  • Provide different groups with access to their own content areas.

Capacity Planning Issues

Each type of Web site has its own set of parameters and issues. Here are some issues that are specific to intranet sites:

  • This site is usually put up on existing hardware and, as it becomes more valuable to the company, quickly expands to the maximum limitation of the platform.

  • The site usually offers any kind of content that management requires.

  • The site may be called upon to offer downloads or custom-built components as necessary.

Initial Model

As shown in Figure 4.4, the initial model for an intranet site consists of a single server running IIS 5.0, usually with FTP service enabled so that departments can add their own information as necessary.

Figure 4.4 Initial Model for an Intranet Site

Growth Models

The main options for expanding an intranet site fall into two categories:

  1. Content Striping

    With this option, multiple servers have the same content installed and a load-balancing algorithm (such as Network Load Balancing) is implemented to serve multiple customers. This requires that the content be updated equally on all servers at once.

    This option is good when there is no clear organizational division among the components.

  2. Content Specialization

    With this option, content or components are separated on dedicated servers. The user typically engages a primary intranet server, and is redirected to a dedicated server if necessary.

    This option is usually good when there is a clear organizational division for a particular set of content. It is also a good choice when resource-intensive content will impact the performance of the entire site if it is not segregated. When combined with content striping, this option is also good for extremely large sites.

Suggested Capacity Planning Methods

The following list presents some key issues for planning an intranet Web site:

  1. Determine the site's audience.

  2. Determine the type of information the site will offer (static content, active content, file downloads, and so on).

  3. Determine additional services (newsgroups, chat, knowledge management) the site will host.

  4. Based on this information, install hardware that will, in its minimum configuration, meet the current needs of the site. (This allows for maximum growth potential before expansion of the site is necessary.)

  5. As the site grows, use content striping to handle increased customer requests or the decrease in Web server performance that comes with more complex Web content.

  6. As individual departments request that the site host more information, you can implement content specialization (that is, install more servers and divide the content among them). Content specialization will allow those departments to meet their requirements without impacting the overall performance of the site.

The model for a larger site would look something like the one shown in Figure 4.5:

Figure 4.5 Mature Model for an Intranet Site

The Internet Marketing Web Site

An increasing number of sites conduct marketing (as opposed to commerce) on the Internet. A typical marketing site is relatively straightforward, involving HTML or simple ASP pages and graphics. It does not usually require an extensive database or much input from customers. This section discusses the issues involved in setting up such a site.

Purposes

A marketing site is likely to fulfill one or more of the following needs:

  • The company wants to have an Internet presence available to its current and potential customers.

  • The company wants to provide an inexpensive alternative to catalog shopping by showing customers its latest product offerings.

  • The company wants to allow customers who have Internet access to provide feedback on products and company business practices.

Capacity Planning Issues

The following issues are commonly encountered in the process of developing a marketing site:

  • The company has no initial idea of the number of customers it can expect to access its site.

  • The company has no gauge of the speed at which customers will be connecting, or of where they will be connecting from.

Initial Model

As shown in Figure 4.6, the initial model for an Internet marketing site consists of a single server running IIS 5.0, usually with FTP service enabled (to allow internal divisions to add their own information as necessary), as well as some e-mail functionality for customer feedback.

Figure 4.6 Initial Model for an Internet Marketing Site

Growth Models

The growth models for an Internet marketing site are the same as those for a corporate intranet site. Both include content striping and specialization. However, instead of separating the site's content by organizational unit, it is useful to segregate content either by functionality (Web, e-mail, newsgroups, and so on) or by product line. A sports products company, for example, would probably list gear for summer sports on one set of pages, and gear for winter sports on another.

Suggested Capacity Planning Methods

Here are some guidelines for capacity planning for a marketing site that does not already have a large customer base:

  1. Put the initial Web server on a medium-size platform.

  2. Examine the IIS 5.0 performance logs to determine how heavily the server's resources are loaded by incoming traffic.

  3. Examine the IIS 5.0 server logs to determine the number of customer visits the site experienced.

  4. Using the results of steps 2 and 3, plot out the growth metrics of the site, with special emphasis on periods when the site was being marketed heavily or was more visible to Internet customers than usual (for example, just prior to a new product launch).

  5. As the Web server's resource utilization level reaches approximately 70 percent, use either content striping or specialization to add additional capacity to the site.

The Internet Transactional Web Site

The term transactional, as it is used here, indicates a bidirectional flow of data, typically to and from a database. (It does not indicate e-commerce transactions.) Customers need to be able not only to look up information already stored at the site, but also to add new information of their own. Transactional sites tend to use Web applications that are more complex and extensive than those of marketing sites, and thus require careful development and testing.

Purposes

A transactional site is likely to fulfill one or more of the following needs:

  • The company wants to communicate with its customers online.

  • The company wants to personalize its site to better meet the needs of its customers.

  • The company wants to provide a forum to help online customers with complaints and issues.

Capacity Planning Issues

There are two main issues involved in setting up a transactional site:

  • The site will almost always rely on a database, which gathers customer information. But usually the database cannot be replicated in the same way that content can be striped. As a result, adding Web servers as the site grows will increase the traffic load on the database server.

  • As the amount of data and traffic grows with an increasing number of customers, additional components need to be implemented to deal with customer connection issues. Some of these issues include ODBC time-outs, WINS errors, and server failures. In fact, connection issues can encompass anything that keeps a customer from putting information into the database or retrieving information from it. The additional overhead of these components has effects on the overall performance of the site, which must be taken into account as the site expands.

Initial Model

As shown in Figure 4.7, the initial model for a transactional Internet site consists of a single server running IIS 5.0, usually with FTP service enabled (to allow divisions to add their own information as necessary), as well as some e-mail functionality for customer feedback. In addition, the Web server has some connectivity (such as ODBC or OLE DB) that allows it to offer dynamic content and to store customer information. Typically, the Web server is connected to a separate database server.

Figure 4.7 Initial Model for an Internet Transactional Site

Growth Model

This site can use the same options as the Internet marketing site, with the additional option of dividing the database server according to whether the data is effectively static (comes from the company) or dynamic (is entered by the customer). Separating the static and dynamic databases provides optimum performance on the Web servers, without the complex code necessary to handle data redundancy and multimastering issues.

Suggested Capacity Planning Methods

A transactional site has the potential for moderately heavy traffic, both outgoing and incoming, and requires more capacity at the outset than is required for either of the previous scenarios. Here are some guidelines:

  1. Put the initial Web server on a high-end platform.

  2. Examine the IIS 5.0 performance logs to determine how heavily used the server's resources are. Focus on CPU utilization, ASP requests per second (if you are using ASP), and memory paging.

  3. Examine the IIS 5.0 server logs to determine the number of customer visits the site experienced.

  4. Examine the logs of the database server to determine the average and maximum numbers of database connections.

  5. Using the results of steps 2 and 3, plot out the growth metrics of the site. As with the marketing site, put special emphasis on periods when the site was more visible to customers than usual.

  6. As use of the Web server's resources reaches approximately 70 percent, employ either content striping or specialization to add additional capacity to the site.

  7. Using the results of step 4, determine when the database server is close to bottlenecking either from additional customer database requests, or from database requests that have become more complex. Use more server resources, and segregate the database as necessary.

The model for a transactional site that has grown to accommodate increased traffic looks something like the one shown in Figure 4.8:

Figure 4.8 Mature Model for an Internet Transactional Site

The Internet Commerce Web Site

In 1998, e-commerce was the most rapidly expanding sector of activity on the Web. There is no reason to think that this will change in the immediate future.

Purposes

Internet commerce is complex. It involves several kinds of data that must be handled in appropriate ways at the right time. Some of the controlling considerations for an Internet commerce site include:

  • The company wants to sell products or services to customers via the Internet.

  • The company requires that the customer transactions be both secure and reliable.

  • The site will usually have data feeds and batch jobs between the site and back-end systems (accounts receivable and payable, inventory, fulfillment vendors, and so on).

Capacity Planning Issues

The Internet commerce site has special requirements that affect capacity planning in specific ways. These involve performance, security, and availability:

  • The site, because of its "real-time" nature, must be designed to handle the maximum number of concurrent customer connections that can be expected, with a good safety margin so that it can smoothly handle spikes in usage.

  • The site designer must implement a high level of security and potentially some form of data encryption. This can easily add 15 to 20 percent in system resource overhead to each transaction.

  • It is possible for the site's performance to be constrained by a downstream resource. For example, the Web server that handles catalog content may be able to handle 200 transactions per second; but if the sales tax or credit card components can only handle 50 transactions per second, there is a potential for a serious bottleneck to develop. It is crucial for site designers and administrators to be aware of these external constraints where they occur.

Initial Model

The initial model for this site consists of multiple servers running IIS 5.0 with some database connectivity enabled. Other components, such as certificate servers and tax and credit card validation components, are common.

Growth Model

This type of site can expect growth from both increased customer numbers and additional online commerce offerings on its Web server(s). These factors are, to some extent, multiplicative, and can over time cause exponential increases in server loading. Another key consideration for this type of site is that while the site's own hardware may support a certain performance rating, the components that interact with legacy and third-party systems are often much slower.

Suggested Capacity Planning Methods

Designing a site for commerce is more complex and difficult than designing for the other scenarios presented here. Here are some guidelines:

  1. Prior to purchasing any equipment, create a logical diagram of the site and identify all components and major systems with which the Web site must interact.

  2. Determine how many transactions per second the Web site will be expected to handle during its peak traffic time.

  3. Using the information from steps 1 and 2, determine which components will be the most frequently used and rate them in terms of their use of server resources. (It is probably sufficient, at this stage, to focus on three parameters: connections per second, CPU utilization, and memory utilization.)

  4. As the components are installed, test each of them by running simulated traffic into the component and recording how many transactions per second it can handle. At the same time, observe (using PerfMon, the Windows performance monitoring tool, which measures resource use system-wide) how much of the system's resources each component uses.

  5. Using the information from steps 3 and 4, create a table similar to Table 4.5 (the numbers in this example are arbitrary):

    Table 4.5 Transaction Component Performance Comparison

    Component Name      Component "Rating"    Peak Transactions Expected    Maximum Transactions per Server    Server Resources Used per Transaction
    Sales Tax           1                     200                           70                                 1.4%
    Inventory           2                     150                           120                                0.23%
    Customer Service    3                     80                            100                                0.45%

  6. Using the table you generated in step 5, determine which components are the most resource-intensive, and whether any will fail to deliver the peak transaction numbers expected, on a per-server basis.

  7. Multiply the maximum transactions per server by the server resources used per transaction, and total the results for all components that will run on the same server; a worked example follows this list. Any total that exceeds about 70 percent indicates that you should either content stripe by adding servers, separate components onto their own servers, or both.

  8. As the Web site is deployed, analyze periodic PerfMon performance reports on metrics like connections per second, memory, and CPU utilization to determine how accurate your chart is.

  9. Update the chart as necessary with production data and add additional servers (using either component striping or specialization) to maintain the Web site, so that the peak transactions per second will only use approximately 70 percent of the resources of any one server computer.
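
As a worked example of step 7, the following sketch applies the arbitrary sample figures from Table 4.5; substitute your own measured values.

' ComponentLoad.vbs - step 7 applied to the sample numbers in Table 4.5.
Option Explicit

Dim names, maxPerServer, resourcePct, i, pctLoad, total
names        = Array("Sales Tax", "Inventory", "Customer Service")
maxPerServer = Array(70, 120, 100)      ' maximum transactions per server
resourcePct  = Array(1.4, 0.23, 0.45)   ' server resources used per transaction (percent)

total = 0
For i = 0 To UBound(names)
    pctLoad = maxPerServer(i) * resourcePct(i)
    total = total + pctLoad
    WScript.Echo names(i) & ": " & FormatNumber(pctLoad, 1) & "% of one server at its maximum rate"
Next
WScript.Echo "Combined on a single server: " & FormatNumber(total, 1) & "%"

With these sample numbers, the Sales Tax component alone consumes about 98 percent of a server at its maximum rate, and its maximum of 70 transactions per second falls short of the 200 expected at peak, so it is the first candidate for separation onto its own striped set of servers.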

A Large-Site Case Study: microsoft.com

At the time of writing, microsoft.com, one of the largest and most active sites on the Web, was receiving well over 200 million hits per day and hosting more than 24 gigabytes (GB) of content. Building a high-traffic Web site of this kind involves careful planning and constant monitoring in order to achieve a balance between user demand and the site's key components: hardware, software, and network infrastructure. These components need to be in balance to efficiently handle the content on the site, the number of hits to the site, and spikes in usage.

Despite the size of microsoft.com and the special needs that follow from its size, the processes that the Microsoft team follows in order to plan, deploy, and maintain the site are relevant to many other sites. These processes are worth examining, even if your Web server is only connected to a small intranet.

A Snapshot of the Site

The microsoft.com site started in 1994 as a single computer under a developer's desk. It sustained as many as a million hits a day. Today, the site is one of the largest on the Internet. It has far more server hardware and bandwidth than necessary for day-to-day use, but all available capacity is needed for the inevitable spikes that occur when thousands of users concurrently download, register, or participate in some type of online activity, such as with a major product release.

Figure 4.9 shows the site, with its hardware and internal connectivity:

Figure 4.9 The microsoft.com Web Site

Current statistics and resources include:

Traffic

  • 43 million page views per day.

  • 240 million hits per day.

  • 2.5 million users per day.

  • 15 million users per month.

  • More than 6 GB of successful downloads per day.

Growth

  • 115 percent increase in page views between July 1997 and July 1998.

  • 14 percent increase in users during that twelve-month period.

Content

  • 24 GB of content.

  • 324,000 HTML or ASP files.

  • 307,000 .gif and .jpg files.

  • About 18 GB of files available for download.

  • Content updated every three hours worldwide.

Hardware

  • Over 100 servers for all parts of the site: Compaq Proliant 5000s, 5500s, and 6500s, each with four Pentium Pro processors and 512 megabytes (MB) of RAM. (There are also 30 more servers at other U.S. locations, as well as overseas mirror sites.)

  • Six internal Ethernets provide 100 megabits per second (Mbps) of capacity each.

  • Two OC12 SONET fiber optic lines provide 1.2 gigabits per second of capacity to the Internet.

Software

  • Microsoft® Windows® NT Server 4.0.

  • IIS 4.0.

  • Microsoft® Site Server 3.0.

  • Microsoft® SQL Server 7.0.

  • Other Microsoft tools and applications.

Planning for Spikes

The site's involvement in the Microsoft® Windows® 95 launch led to a surge of activity inside Microsoft, as product and marketing groups added content to it. During 1996, the number of hits per month on the site grew from 118 million to over 2 billion (it is now considerably higher). The product groups' increased focus on Web-based marketing has fed the ramp-up, as additional users access more content on the site.

Beyond continuing growth and regular periods of peak usage, irregular spikes and special events place much larger burdens on many Web sites and servers. For Microsoft, many of these coincide with software product releases. The release of Microsoft® Internet Explorer 4.0 in October of 1997 is a good example: In one week, more than 2 million users downloaded an average of 18 MB each. Peak usage exceeded 6 terabytes per day.
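
As a rough sanity check (an illustration added here, not a figure from the Microsoft team), those numbers can be converted into sustained bandwidth. The calculation below assumes decimal units, ignores protocol overhead, and spreads the load evenly across each day.

    # Rough sanity check of the Internet Explorer 4.0 launch figures.
    # Assumptions: decimal units, protocol overhead ignored, load spread
    # evenly across each day.
    users = 2_000_000                 # downloads in one week
    avg_download_mb = 18              # average size per user

    total_tb = users * avg_download_mb / 1_000_000        # about 36 TB for the week
    avg_tb_per_day = total_tb / 7                          # about 5.1 TB per day

    peak_tb_per_day = 6
    sustained_mbps = peak_tb_per_day * 1e12 * 8 / 86_400 / 1e6   # about 556 Mbps

    print(f"Weekly total: {total_tb:.0f} TB, about {avg_tb_per_day:.1f} TB per day on average")
    print(f"A {peak_tb_per_day} TB day requires roughly {sustained_mbps:.0f} Mbps sustained")

A peak day on that order requires more than 500 Mbps of sustained outbound bandwidth, which helps explain why the site keeps far more connectivity than day-to-day traffic needs.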

While it is true that Microsoft's site has some special requirements, companies of every size need to plan for spikes, which can occur when they start a new ad campaign, appear in a news article, or are linked to a popular Web site. Even a single Web server connected only to a small LAN can experience spikes in activity. However, if the server supports only a small number of clients and their needs are not urgent, the impact won't be catastrophic. On the other hand, if it is important to the enterprise that even a few of those clients be able to access the Web server without delay under any circumstance, or if it is crucial that the site be available all the time, the site manager must make sure that it has the necessary capacity and reliability.

When Planning Is Not Enough

Spikes are not the only pitfalls. The continued expansion of day-to-day operations, along with the ongoing shift from static HTML to dynamically generated pages and increased interactivity, led to unforeseen difficulties at microsoft.com. Toward the end of 1996, for example, problems began to emerge at https://home.microsoft.com as users found error messages appearing in their browsers. By May of 1997, the servers had begun to show signs of blocking, as thousands of delayed access requests backed up. Performance of the Internet Start page declined gradually for two or three weeks, then suddenly took a nosedive. The site's existing hardware and software technology was simply unable to keep pace with demand.

Working Toward a Resolution

Because the Internet Start page at https://home.microsoft.com is the default home page for the Internet Explorer browser, the hit rate at the site is continually increasing. At the time of this writing, each of the servers was handling between 2,000 and 4,000 viewers per second. In May of 1997, as access to https://home.microsoft.com became increasingly difficult, a task force was formed and given the charter to find a solution. The group met daily for six weeks, working to return the site to nominal performance.

The first step was to examine the hardware, beginning with the servers; Internet connection bandwidth was not saturated, and therefore could not be the central issue. With the number of viewers increasing rapidly, the servers were adding viewer requests to the queue faster than they were delivering responses. More servers were added to handle the load, but this alone was not sufficient. The task force began probing the connections among servers, databases, and viewers to find and eliminate possible signal breakdowns or bottlenecks. They also initiated a process of streamlining the site's HTML and ASP code to make it as lean and efficient as possible. By mid-June some progress had been made, but it was clear that something else was wrong.
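
The backlog the task force observed is what any queue does once requests arrive faster than responses go out: the queue grows steadily, and adding servers helps only until the next bottleneck is reached. The rates in the sketch below are made up purely for illustration.

    # Illustration with made-up rates: when the arrival rate exceeds the
    # service rate, the backlog grows in proportion to the difference.
    arrival_rate = 3_500    # requests per second reaching the servers
    service_rate = 3_000    # responses per second the servers can deliver

    for seconds in (60, 300, 900):
        backlog = (arrival_rate - service_rate) * seconds
        print(f"After {seconds:>3} seconds: about {backlog:,} queued requests")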

A Case of External Dependency

The task force began to look at the objects (text files, graphics, and other files that make up individual Web pages) that were being called by the servers. Some of those objects are housed in the Properties Database (PD), which receives and stores individual user preferences: news headlines, personalized home-page options, and stock quotes, for example. Through a process of elimination, the task force came to suspect the Properties Database of being the main culprit behind the ongoing slowdowns and failures. All of the servers were drawing on the same central repositories or databases for page elements, like many wells drawing on a common aquifer. In other words, the problem was external to the servers themselves; adding more of them just made it worse, because doing so only increased the loading on the bottlenecked component.

Part of the solution was to cut down the number of requests to that particular set of databases. New, less demanding ways of personalizing pages were developed, and the server software was adjusted to work better for the rapidly increasing numbers of viewers using the site. In addition, the database servers were moved physically closer to the Web servers in order to allow the team to eliminate intervening routers and other hardware. Meanwhile, the team continued to examine the entire process of creating Web pages and of serving them to visitors.

A Case of Inefficient Access

A new Internet Explorer download page was deployed just before the release of Internet Explorer 4.0. This page was complex, and the path to it led through two other processor-intensive pages, one of which was the microsoft.com home page. Almost immediately the site began to suffer processor bottlenecks. Analysis showed that a large number of clients were following a single path into the site, straight to the Internet Explorer download page, and that there were serious inefficiencies along that path. Streamlining the ASP code was not sufficient to resolve the problem. Nor was adding hardware the answer: Site engineers estimated that even an order-of-magnitude increase in the number of servers wouldn't be enough to handle the demand. Instead, temporary measures were put into place:

  • Pages involving frames, other than the site's home page, were reduced to single panels.

  • The download page and both pages leading to it were temporarily redesigned using only static HTML, rather than as dynamic pages.

This solution demonstrates another way to deal with spikes: Reduce the load by decreasing functionality, thereby lessening overhead. Once the load returns to normal levels, the original functionality can be restored.

Finding a Balance

Building an Internet or intranet site requires balancing server hardware and software, content, and network resources, and then maintaining that balance as site traffic grows. It is important for site managers to be on the lookout for hidden assumptions and external dependencies that interfere with the balance. Constant monitoring keeps the balance in view.

The Hardware

Hardware is a necessary part of site capacity and performance, and it is important to select hardware that can be scaled or upgraded easily when demands increase. The microsoft.com team continually reviews its hardware to be sure that capacity is staying ahead of demand. The current CPU utilization for microsoft.com is as high as 40 percent per server; it is deliberately kept well below capacity so that the site can handle spikes.
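
That headroom translates directly into spike tolerance. As an illustration (and assuming that CPU use scales linearly with traffic and that nothing else saturates first), a server running at 40 percent under normal load can absorb roughly a 2.5-fold surge:

    # Illustration: headroom as spike tolerance. Assumes CPU use scales
    # linearly with traffic and that no other component saturates first.
    steady_state_cpu = 0.40
    surge_factor = 1.0 / steady_state_cpu    # about 2.5x normal traffic
    print(f"A server at {steady_state_cpu:.0%} CPU can absorb roughly a "
          f"{surge_factor:.1f}x traffic spike before saturating.")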

The Software

In addition to the Microsoft software used to run the site, the microsoft.com team has developed a few internal content management utilities by using Microsoft® Visual Basic® and Microsoft® Access. For example, a content-tracking tool scans the site each day and fills a database with information about each of the 250,000 HTML and ASP pages, including where the links on each page point. When a page is to be deleted, a check of the database shows other pages that point to that page. Links on the other pages can be changed before the page is deleted, so someone viewing them won't find broken links.
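
The actual tool was written with Visual Basic and Access; the sketch below shows the same idea in Python using only the standard library. The content path, the file types scanned, and the page marked for deletion are invented for illustration.

    # Sketch of a content-tracking scan: walk the content tree, record where
    # each page's links point, then list the pages that still link to a page
    # that is about to be deleted. Paths and file types are assumptions.
    import os
    import re
    from collections import defaultdict

    CONTENT_ROOT = r"D:\content"      # hypothetical local copy of the site
    LINK_PATTERN = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)

    links_to = defaultdict(set)       # link target -> pages that point to it

    for folder, _dirs, files in os.walk(CONTENT_ROOT):
        for name in files:
            if not name.lower().endswith((".htm", ".html", ".asp")):
                continue
            path = os.path.join(folder, name)
            with open(path, encoding="latin-1", errors="ignore") as page:
                for target in LINK_PATTERN.findall(page.read()):
                    links_to[target.lower()].add(path)

    # Before deleting a page, list every page that still points to it.
    doomed = "/products/oldpage.asp"  # hypothetical page slated for deletion
    for referrer in sorted(links_to.get(doomed.lower(), ())):
        print(f"{referrer} still links to {doomed}")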

Another document management tool keeps track of who is responsible for each page and when the page was sent to the servers. Even with thorough testing, problems (such as errors in ASP pages) sometimes get through and are difficult to pinpoint among 250,000 pages of content. The tool provides a list of pages sent to the servers shortly before the problem appeared, narrowing the number that have to be investigated.

The Network

In another move to increase network capacity at Microsoft, system engineers installed two network adapters in each server. One is for the Fiber Distributed Data Interface (FDDI) ring that carries Internet traffic, and the other is for the corporate network, which handles administration and content replication. With this arrangement, administrative traffic doesn't take bandwidth away from Internet traffic.

The Content

Just as the microsoft.com team constantly reviews its hardware capacity to ensure quick response for the user, it also studies the way it configures its servers to manage content. Currently, the content is updated eight times a day, with 5 to 7 percent of the total content changing each day. Because each of the microsoft.com Web servers contains a complete copy of the site's content, each set of changes has to be replicated to every server. The team is now using Network Load Balancing to accomplish this.

Ongoing Changes

Groups producing content for microsoft.com want to use the latest Web publishing features to make their content more eye-catching and interesting to users. The result is a tug-of-war between content providers, who want users to have the richest experience possible, and the team running the Web site, which wants to increase download speeds and the site's capacity. The site currently has a size limit of 100 KB per page, including graphics, but preferably each page should be smaller than 60 KB. The bottom line is that content needs to strike a balance between being visually interesting and being as small as possible in order to improve download performance.
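
A budget like this is straightforward to enforce with a scan that adds each page's HTML to the size of the graphics it references. The sketch below is illustrative only: the 100 KB and 60 KB thresholds come from the text, while the content path and the assumption that images are stored relative to the page are invented.

    # Sketch: flag pages whose HTML plus referenced images exceed the budget
    # described above (hard limit 100 KB, preferred limit 60 KB).
    import os
    import re

    CONTENT_ROOT = r"D:\content"      # hypothetical content tree
    IMG_PATTERN = re.compile(r'src\s*=\s*["\']([^"\']+\.(?:gif|jpe?g))["\']', re.IGNORECASE)
    HARD_LIMIT, PREFERRED_LIMIT = 100 * 1024, 60 * 1024

    def page_weight(path):
        """Return the page size plus any local .gif/.jpg files it references."""
        with open(path, encoding="latin-1", errors="ignore") as page:
            html = page.read()
        total = len(html.encode("latin-1", errors="ignore"))
        for image in IMG_PATTERN.findall(html):
            image_path = os.path.join(os.path.dirname(path), image.replace("/", os.sep))
            if os.path.isfile(image_path):
                total += os.path.getsize(image_path)
        return total

    for folder, _dirs, files in os.walk(CONTENT_ROOT):
        for name in files:
            if name.lower().endswith((".htm", ".html", ".asp")):
                weight = page_weight(os.path.join(folder, name))
                if weight > HARD_LIMIT:
                    print(f"OVER LIMIT  {weight / 1024:5.0f} KB  {name}")
                elif weight > PREFERRED_LIMIT:
                    print(f"over target {weight / 1024:5.0f} KB  {name}")

The 60 KB target matters most for dial-up users: 60 KB is about 491,000 bits, and at 28.8 Kbps that is roughly 17 seconds of raw transfer time before connection setup and rendering, which is why the team also measures real download times, as described next.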

The User Experience Counts

The amount of content a site delivers does not necessarily indicate user satisfaction. To monitor the user download experience, the microsoft.com team periodically tests the access, search, and download times for specific pages, using 28.8 Kbps modems stationed in various U.S. cities. The test performed in January of 1997 showed that users averaged 50 seconds per task. A more recent test of download times indicated a 20-second improvement, to 30 seconds per task, thanks in part to the new network design.

Summary

Microsoft.com team members are hard-pressed to define a formula for how much server power, software, and network bandwidth is needed to build a high-traffic Web site. Instead, the team relies on constant monitoring, worst-case planning, and powerful servers. Their experience and recommendations can be summarized in four rules of thumb:

  • Learn as much as you can about your audience, potential content, marketing plans, and future initiatives, and then take your best guess.

  • When in doubt, choose bigger and faster. You may overbuild today, but you'll likely need the extra capacity sooner than you expect.

  • Once your site is in operation, watch it closely. Find out which of the three components-hardware, software (including Web applications), and network bandwidth-is lagging behind, and bring it into balance with the others. Watch for changes in content as well.

  • Know that nothing will stay static for long. Keep an eye on growth and usage trends, and be prepared to react (that is, to add or shift hardware, software, or network capacity) quickly.

General Guidelines

To conclude, here are some useful things to keep in mind:

  • Inform yourself about the way IIS 5.0 works, so that you can track down trouble before it starts.

  • Remember the old rule of thumb: Computing needs rise to fill available resources in about six months, on average. You should be planning for at least six months out.

  • Whatever you get, even if it seems sufficient now, is going to require upgrading sooner or later; in today's climate, it's more likely to be sooner than later.

  • Don't skimp on RAM. A server computer running IIS 5.0 should have a minimum of 128 MB of RAM, though you can get by with less if you can tolerate some impact on performance. It's probably impossible to set up a server with "too much" RAM.

  • If you are concerned with the speed and reliability of disk access, consider caching controllers and "RAID 01" arrays. However, a caching controller is pointless unless lots of your clients tend to request the same information.

  • Be extremely aware of security issues. For more information, see "Security" in this book.

Additional Resources

The following Web sites and books provide additional information about IIS 5.0 and about other features of Windows 2000 Server.

https://www.microsoft.com/ntserver/ntserverenterprise/exec/feature/wlbs/faq.asp

This page contains frequently asked questions about Network Load Balancing.

Microsoft® Backstage has various articles and information about the microsoft.com Web site.

https://foldoc.hld.c64.org/foldoc.cgi?Open+Systems+Interconnection

This site describes the Open Systems Interconnection (OSI) seven-layer model.

https://www.softwareqatest.com/qatweb1.html

This page provides a wide variety of load and performance tools for download.

Books

Capacity Planning for Web Performance: Metrics, Models, and Methods by Daniel A. Menasce and Virgilio A. F. Almeida, 1998, Upper Saddle River: Prentice Hall.

Web Performance Tuning by Patrick Killelea, Linda Mui, Editor, 1998, Sebastopol: O'Reilly & Associates.
