Scaling Out

Applies To: Windows Server 2003, Windows Server 2003 with SP1

Scaling out is the process of adding servers to an existing server environment to improve performance and to increase the number of Web sites that the system can host or publish. Scaling out reduces bottlenecks and lock contention because requests handled by different servers do not share resources; the request load is balanced among the servers.

The Contoso administrators understood that although scaling up to an eight-processor server would meet their performance and scalability needs, they would potentially gain more throughput and increased reliability by hosting their Web application on multiple servers. They decided to use two four-processor servers (one with four 550-MHz processors and one with four 900-MHz processors) with Network Load Balancing for load balancing and failover. After installing and configuring Network Load Balancing on both servers, the administrators ran their WCAT stress test using 500 virtual clients (five client computers running 100 threads each).

Operating as one Network Load Balancing cluster, the two servers processed (on average) 1,900 requests per second when the Processor(_Total)\% Processor Time counter approached 100 percent. The administrators were initially concerned that the requests per second had not increased beyond the results of the eight-processor server by itself. After looking at the actual megacycles per request, as shown in Table 7.13, the administrators realized they actually had an 18 percent increase in throughput relative to the available processing power: the two servers together supplied fewer megacycles per second than the eight-processor server, yet the cost of each request had fallen from 3.8 to 3.1 megacycles in the Network Load Balancing test, a reduction of about 18 percent.

Table 7.13 Megacycles/Request Calculations

| Server | Megacycles/Second (# of Processors x MHz) | Requests/Second | Megacycles/Request |
|---|---|---|---|
| 4 x 550 MHz | 2,200 | 900 | 2.45 |
| 8 x 900 MHz | 7,200 | 1,900 | 3.8 |
| 4 x 550 MHz + 4 x 900 MHz | 5,800 | 1,900 | 3.1 |
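The arithmetic behind Table 7.13 can be reproduced with a short calculation. The following Python sketch is illustrative only: the function and variable names are assumptions and are not part of any tool the administrators used. It takes megacycles per second to be the number of processors multiplied by the clock speed in MHz, and divides that by the measured requests per second. The unrounded quotients (about 2.44, 3.79, and 3.05) differ slightly in the last digit from the rounded values in the table.

```python
# Figures are taken from Table 7.13; all names here are illustrative assumptions.

def megacycles_per_second(processors):
    """Sum processor_count * clock_mhz over every (count, MHz) group."""
    return sum(count * mhz for count, mhz in processors)


def megacycles_per_request(processors, requests_per_second):
    """Processing capacity consumed by one request at full CPU utilization."""
    return megacycles_per_second(processors) / requests_per_second


# Each entry: ([(processor count, MHz), ...], measured requests per second)
configurations = {
    "4 x 550 MHz": ([(4, 550)], 900),
    "8 x 900 MHz": ([(8, 900)], 1900),
    "4 x 550 MHz + 4 x 900 MHz": ([(4, 550), (4, 900)], 1900),
}

results = {
    name: megacycles_per_request(processors, rps)
    for name, (processors, rps) in configurations.items()
}

for name, cost in results.items():
    print(f"{name}: {cost:.2f} megacycles/request")

# Relative drop in cost per request, using the rounded values from the table.
reduction = (3.8 - 3.1) / 3.8
print(f"Reduction in megacycles/request: {reduction:.0%}")
```

Using the rounded table values, the drop from 3.8 to 3.1 megacycles per request works out to roughly 18 percent, which matches the figure the administrators cited.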

With the 18 percent increase in throughput and the added reliability of hosting the Web application on two servers, the administrators asked themselves, "Are we done? Is this good enough? Have we scaled this installation as much as we can scale it?" They realized that the only truly accurate benchmark they could use for their application was the application itself. By that measure, they had scaled the installation out to the point where they could readily process the expected traffic loads while staying within the 30 percent CPU utilization guideline. They had increased reliability by hosting the site on two servers, and they had executed their stress testing with no errors. Unless they began to experience errors from clients, or noticed during regular performance monitoring that traffic loads were increasing and CPU usage was surpassing the 30 percent guideline, they had achieved their objective.
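The ongoing check described above, comparing observed CPU usage against the 30 percent guideline, can be sketched as follows. This is a minimal illustration, not the administrators' actual monitoring procedure: the function name, the sample values, and the choice to compare the average of the samples are all assumptions. In practice the samples would come from logged values of the Processor(_Total)\% Processor Time counter.

```python
from statistics import mean

# The 30 percent figure is the CPU utilization guideline cited in the text.
CPU_GUIDELINE_PERCENT = 30.0


def exceeds_guideline(samples, limit=CPU_GUIDELINE_PERCENT):
    """Return True when the average of the sampled counter values is above the limit."""
    if not samples:
        raise ValueError("at least one sample is required")
    return mean(samples) > limit


# Hypothetical counter samples (percent), invented for illustration only.
normal_load = [22.5, 27.0, 31.0, 24.5]
growing_load = [29.0, 34.5, 38.0, 41.5]

print(exceeds_guideline(normal_load))
print(exceeds_guideline(growing_load))
```

Averaging over a sampling window keeps a single brief spike from triggering a false alarm; a stricter policy could instead flag any individual sample above the limit.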