Moving Microsoft Update Downloads to x64
Deighton Maragh and Mark W. Roellich
One of the responsibilities of the Microsoft.com operations team is to manage the infrastructure that supports the Windows Update and Microsoft Update services, which have client bases in the hundreds of millions and growing. The Windows Update site provides critical updates, security fixes, software downloads, and device drivers for Windows operating systems. Microsoft Update is the service that brings you all the features and benefits of Windows Update plus downloads for other Microsoft applications including Office. We must also support automatic updates, a major feature of the update services, which lets your PC automatically check for important updates and download (and possibly install) them for you.
We have made various adjustments to optimize these services. For example, we host the binary downloads on different servers than the application itself, and we fine-tuned these download servers for one purpose: serving downloads. To increase our network throughput and limit disk I/O, we have configured IIS 6.0 on the download servers to use user-mode static-file caching. This also saves resources, because most of the requests are served from memory. In addition, we configured the cache with a number of parameters, including MaxCachedFileSize (the maximum size of a file that the cache can hold) and ObjectCacheTTL (the length of time a cached object remains in the cache). For more information on caching in IIS 6.0, see "Web and Application Server Infrastructure - Performance and Scalability."
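To make the two cache parameters concrete, here is a minimal Python sketch of a file cache governed by a per-file size cap and a time-to-live. It is a toy model, not how IIS 6.0 implements its cache: the class StaticFileCache, its attribute and method names, and the read_from_disk callback are all hypothetical, and the expiry rule (age since the file was stored) is an assumption made for illustration.

```python
import time


class StaticFileCache:
    """Toy in-memory file cache with a per-file size cap and a time-to-live.

    Hypothetical illustration only; IIS 6.0 internals are not modeled.
    """

    def __init__(self, max_cached_file_size, object_cache_ttl, clock=time.monotonic):
        self.max_cached_file_size = max_cached_file_size  # bytes
        self.object_cache_ttl = object_cache_ttl          # seconds
        self._clock = clock
        self._entries = {}  # path -> (content, time the entry was stored)
        self.hits = 0
        self.misses = 0

    def get(self, path, read_from_disk):
        """Return the file content, from memory when a fresh copy is cached."""
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[1] < self.object_cache_ttl:
            self.hits += 1
            return entry[0]

        # Miss or expired entry: go to disk (assumed to be the slow path).
        self.misses += 1
        content = read_from_disk(path)
        if len(content) <= self.max_cached_file_size:
            self._entries[path] = (content, now)
        else:
            # Files over the size cap are served but never kept in memory.
            self._entries.pop(path, None)
        return content
```

In this model, raising the size cap or the time-to-live increases the share of requests served from memory, at the cost of a larger memory footprint for the process that holds the cache.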
The Challenge of a 32-bit Operating System
Even with all these optimizations, we were still looking for better and faster ways to support our huge user base. Running on a 32-bit OS, we faced a number of limitations that we hoped to overcome by moving to 64-bit Windows®, so we ran some tests, which we’ll describe shortly.
The limitations of a 32-bit operating system included the amount of virtual memory that an application pool could use: 2GB of user-mode memory per application pool. This limited the size of the cache we could configure on each server. When the number of concurrent connections on our 32-bit servers grew very high (roughly 12,000–14,000 per server), several problems occurred, including a large number of connection resets from the server. In addition, our processor utilization would climb to about 70 to 80 percent, and the throughput on each network interface card (NIC) would fall to only about 500 to 600Mbps.
Time to Upgrade
As you can see, we had a number of bottlenecks on these servers. So when the opportunity to upgrade arose because a portion of our infrastructure needed to be replaced, we decided to purchase x64 hardware, which can run either a 32-bit or a 64-bit operating system.
To get an idea of what the improvements would look like, we ran a side-by-side comparison of the 32-bit and 64-bit operating systems. The servers we tested had identical hardware and IIS configurations, but one box ran Microsoft® Windows Server™ 2003 Enterprise Edition SP1 and the other ran Windows Server 2003 Enterprise x64 Edition. The hardware consisted of two HP ProLiant DL585 machines, each with four 2.2 GHz AMD processors, 4GB of RAM, and a GigE NIC.
To understand the details of the test we’ll describe in a moment, you should be familiar with the workings of the automatic update service. The automatic update client application calls the download servers and uses Background Intelligent Transfer Service (BITS) to download the updates. BITS is a file transfer service that transfers files between a client and a server in the background (by default). Transferring files this way uses only idle network bandwidth to preserve the user’s interactive experience with other network applications, such as Microsoft Internet Explorer®. BITS accomplishes this stealth transfer by examining the network traffic and adjusting its use of bandwidth as more bandwidth becomes free. For more information on BITS, see "About Background Intelligent Transfer Service."
The Test Scenario
In the test scenario, we sent an equal number of requests to each server via DNS global load balancing. A typical request consists of an HTTP byte range request, which may specify a single range of bytes, or multiple ranges within a single request to download. The average download is approximately 30KB per request and the client download speed averages about 80KB per second. Having HTTP keep-alives enabled in IIS dramatically increases the server efficiency. HTTP keep-alives allow the server to keep the connections open across multiple requests. Instead of opening a new connection for every byte range request, the client uses the same connection. Once the request has been fulfilled, the servers log an HTTP 206 response message. (A 206 response is the result of a fulfilled partial GET request for the resource.)
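As an illustration of the byte range requests described above, the following Python sketch resolves a Range header (single or multiple ranges) against a file and returns the status a server would log. It follows the general HTTP range syntax, but it is a simplified sketch under stated assumptions: the function names parse_byte_ranges and serve are hypothetical, only the bytes unit is handled, and the multipart response framing a real server uses for multiple ranges is left out.

```python
def parse_byte_ranges(header, size):
    """Resolve an HTTP Range header into inclusive (start, end) byte pairs.

    Unsatisfiable ranges are dropped; an empty result means none applied.
    """
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes":
        raise ValueError("only the 'bytes' range unit is handled here")

    ranges = []
    for part in spec.split(","):
        first, sep, last = part.strip().partition("-")
        if not sep:
            raise ValueError(f"malformed range: {part!r}")
        if first == "":
            # Suffix form "-N": the last N bytes of the file.
            length = int(last)
            start, end = max(size - length, 0), size - 1
        else:
            # "A-B" or open-ended "A-"; clamp the end to the file size.
            start = int(first)
            end = min(int(last), size - 1) if last else size - 1
        if start <= end:
            ranges.append((start, end))
    return ranges


def serve(header, body):
    """Return (status, parts): 206 with the requested slices, or 416."""
    ranges = parse_byte_ranges(header, len(body))
    if not ranges:
        return 416, []
    return 206, [body[start:end + 1] for start, end in ranges]


if __name__ == "__main__":
    update = bytes(range(100))
    # One request carrying two ranges, reusing a single connection.
    status, parts = serve("bytes=0-9, 90-", update)
    print(status, [len(p) for p in parts])
```

Because each request asks for only a slice of the file, a client fetches a large update as many small requests, which is why reusing one connection through keep-alives matters so much for server efficiency.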
The only difference between the two servers was the operating system, and on the 64-bit OS we ran IIS in native 64-bit mode. On 64-bit Windows, if you want a native 64-bit application pool, you need to verify that the Enable32BitAppOnWin64 value in metabase.xml is set correctly (Enable32BitAppOnWin64="FALSE" for native 64-bit application pools).
In switching to 64-bit hardware, we were hoping to increase our concurrent connections, thereby increasing our bandwidth. We were also hoping to increase the size of our user-mode cache in order to serve more files out of it. The results of the test are shown in Figure 1.
Figure 1 64-Bit Hardware Running 32-Bit and 64-Bit Windows
||Windows Server 2003 Enterprise Edition SP1|Windows Server 2003 Enterprise x64 Edition
|Concurrent connections average
|Get req/sec average
|Get req/sec max
|Application process (VM usage)
|HTTP 500 errors
Examining the data, it is clear that a 64-bit OS has several advantages over a 32-bit OS for this Web site. First, on a 32-bit OS each application pool and the processes running in it are bottlenecked by the 4GB virtual memory limit: 2GB for user-mode processes and 2GB for the kernel. Since the servers are configured to cache the files in user mode, this is a limiting factor. On the 64-bit OS, the limit is 8TB for user-mode processes and 8TB for the kernel.
Second, we noticed much better network throughput on the 64-bit OS compared to what we were able to achieve with a 32-bit OS. On average with the 64-bit OS, we observed 20 percent more utilization of the NIC because the server could successfully process more get requests per second. This resulted in the 64-bit OS having to maintain fewer concurrent connections (see Figure 2).
Figure 2 Get Requests and Concurrent Connections
Finally, our prediction of an increased number of concurrent connections was incorrect. In fact, the 64-bit OS is so much better at serving requests than the 32-bit OS (an average of 3,400 get requests per second for 64-bit versus an average of 2,000 for 32-bit) that the connections do not remain open long enough to increase our concurrent connections.
The results of our tests affect our strategies for financial cost and capacity management. When we talk about cost and capacity we use the term "scalable unit." For us, a scalable unit consists of a cluster of five server nodes configured using Network Load Balancing (NLB) for local load balancing across hosts. Traditionally, one scalable unit has had a maximum capacity of 3,000Mbps on 32-bit Windows 2003 Advance Server on a 32-bit hardware platform. The new scalable unit benchmark established for the 64-bit platform is approximately 5,000Mbps. On the 32-bit platform, a typical setup of 4 scalable units handles 12,000Mbps of traffic. Today we can easily achieve that same capacity with only 2.5 scalable units. This equates to roughly 60 percent of the datacenter footprint, less administration overhead from managing fewer servers, and a reduction in datacenter hosting costs of nearly 40 percent.
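The capacity arithmetic above can be checked with a few lines of Python. This is only a worked calculation using the figures quoted in the text; the function and constant names are hypothetical, and the per-server numbers assume traffic is spread evenly across the five servers in a unit.

```python
def units_needed(target_mbps, unit_capacity_mbps):
    """Scalable units required to carry target_mbps of traffic."""
    return target_mbps / unit_capacity_mbps


def per_server_mbps(unit_capacity_mbps, servers_per_unit):
    """Average throughput per server when a unit runs at full capacity."""
    return unit_capacity_mbps / servers_per_unit


# Figures quoted in the article.
SERVERS_PER_UNIT = 5
UNIT_MBPS_32BIT = 3000
UNIT_MBPS_64BIT = 5000
TARGET_MBPS = 12000

units_32 = units_needed(TARGET_MBPS, UNIT_MBPS_32BIT)
units_64 = units_needed(TARGET_MBPS, UNIT_MBPS_64BIT)
footprint_ratio = units_64 / units_32

if __name__ == "__main__":
    print(f"32-bit units needed: {units_32:.1f}")
    print(f"64-bit units needed: {units_64:.1f}")
    print(f"Footprint ratio:     {footprint_ratio:.0%}")
    print(f"Per server, 32-bit:  {per_server_mbps(UNIT_MBPS_32BIT, SERVERS_PER_UNIT):.0f} Mbps")
    print(f"Per server, 64-bit:  {per_server_mbps(UNIT_MBPS_64BIT, SERVERS_PER_UNIT):.0f} Mbps")
```

With these figures the calculation gives 4 units on the 32-bit platform and 2.4 on the 64-bit platform (which the text rounds up to 2.5), a footprint ratio of 60 percent. Per server, the unit capacities work out to 600Mbps and 1,000Mbps, the latter being the nominal line rate of a single GigE NIC.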
Another way to look at these benefits is that we have been able to maintain the same server footprint while almost doubling capacity. This in turn has allowed us to service more clients and drive down the cost per client update. Moving forward, we will consider adding more network adapters to the servers to make fuller use of the available system resources, as the GigE NIC is now the limiting factor for delivery.
Deighton Maragh has spent four years on the Microsoft.com operations team as a systems engineer working in the download space for Microsoft.com and Windows Update. He currently manages the Web operations team for the download systems.
Mark W. Roellich has spent three years on the Microsoft.com operations team, a year as manager of the labs, and two years as a systems engineer working in the download space for Microsoft.com and Windows Update.
© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited