Microsoft Clustering Solutions
Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.
Article from Windows 2000 Magazine
MSCS, NLB, CLB, and Application Center Add Scalability, Availability, and Reliability to Windows
by Greg Todd
Over the years, Microsoft has endeavored to expand the scalability, availability, and reliability of its server solutions. Clustering is a proven avenue to this objective, and Microsoft has embraced the notion of clustering, working to make it an integral part of Microsoft operating systems and product offerings. With the delivery of Microsoft Windows® 2000, Microsoft's clustering solutions have matured significantly.
On This Page
Scalability, Availability, and Reliability
Four Clustering Solutions
Microsoft Cluster Services
Network Load Balancing
Component Load Balancing
Application Center
Clustering and Microsoft .NET
Related Microsoft Websites
Scalability, Availability, and Reliability
A cluster is a group of independent computers that work together to run a common set of applications and to provide the image of a single system to the client and the application. The goal of clustering is to boost scalability, availability, and reliability across multiple tiers of a network.
Scalability is a computer's ability to handle increasing loads while maintaining acceptable performance. Hardware scalability (scaling up, in Microsoft parlance) relies on one large extensible machine to perform work. Software scalability (scaling out) depends on a cluster of multiple moderately performing machines working in tandem, not unlike a set of RAID drives. In fact, Microsoft has coined the informal term Redundant Array of Computers (RAC) to refer to its scale-out clusters. Just as you add disks to a RAID array to improve performance, you can add nodes to a scale-out cluster to improve performance.
Availability and reliability are closely related to each other but differ slightly. Availability is the quality of being present, ready for use, at hand, and accessible. Reliability refers to dependability. Even the most reliable machine fails eventually. Hardware manufacturers prepare for failure by providing redundancies in key areas such as disk drives, power supplies, network controllers, and cooling fans. However, redundancy on one machine doesn't insulate users from application failure. If the database software on one server fails, that server might be reliable, but that software and server combination isn't available. Thus, a single machine can't meet all the necessary scalability, availability, and reliability challenges that a cluster can.
Here again, a cluster can mimic a RAID array in providing availability and reliability. In a fault-tolerant disk configuration such as RAID 1 or RAID 5, all the disks work together in a redundant array. If one disk fails, you unplug it and insert a new one; the rest of the array keeps running with no configuration, no setup, and, most importantly, no downtime. The RAID system automatically rebuilds the new drive so that it will work with the others. Similarly, when a computer in a cluster fails, you can simply replace it with a new system and keep running. Some clustering software can automatically configure the server and integrate it into the cluster, all while the cluster stays available.
Four Clustering Solutions
Microsoft offers four basic clustering technologies: Microsoft Cluster Services (MSCS), Network Load Balancing (NLB), Component Load Balancing (CLB), and Microsoft Application Center 2000. These services are delivered in three solutions: MSCS, NLB, and Application Center. CLB is part of, and is only available with, Application Center. You can use NLB with Application Center or as a standalone solution. Windows 2000 Advanced Server and Windows 2000 Datacenter Server include MSCS and NLB, but you must purchase Application Center separately.
Table 1 summarizes which of the four clustering technologies are available in the various members of the Windows 2000 Server and Microsoft Windows NT® Server 4.0 family. As you might imagine, none of these technologies are applicable to Windows 2000 Professional or Windows NT Workstation 4.0. Table 2 lists some of the cluster technologies' characteristics. You can refer to the table as I compare and contrast the technologies below.
Table 1 Clustering Technologies Supported by Different Operating System Versions
| | MSCS | NLB | CLB | Application Center |
|---|---|---|---|---|
| Windows 2000 Server | Not available | Not available | Supported (requires Application Center) | Supported (requires third-party IP load balancer) |
| Windows 2000 Advanced Server | Included | Included | Supported (requires Application Center) | Supported |
| Datacenter | Included | Included | Supported (requires Application Center) | Supported |
| Windows NT Server 4.0 | Not available | Supported (called WLBS) | Not available | Not available |
| Windows NT Server 4.0, Enterprise Edition | Supported (2-node max) | Supported (called WLBS) | Not available | Not available |
Table 2 Clustering Technology Feature Comparison
| | MSCS | NLB | CLB | Application Center |
|---|---|---|---|---|
| Purpose | Application failover and failback | Load balancing IP traffic | Load balancing COM+ objects | Creating and managing Web farms |
| Benefit | Availability and manageability | Availability and scalability | Availability and scalability | Availability, scalability, and manageability |
| Maximum number of nodes per cluster | 2 for Windows 2000 Advanced Server | 32 | 16 | 16 |
| Type of clustering | Shared storage | Shared nothing | Shared nothing | Shared nothing |
| State information | Stateful | Stateless (supports stateful connections if needed) | Stateless | Stateless |
| Modifications required to server application | Yes | No | No | No |
| Specialized hardware required | Yes | No | No | No |
| Standalone | Yes | Yes | No (requires Application Center) | Yes |
Microsoft Cluster Services
MSCS, once known as Microsoft Cluster Server and now known as Microsoft Cluster Services, was Microsoft's first foray into the world of clustering on Windows NT and is arguably Microsoft's best-known clustering solution. In an MSCS cluster, the MSCS software connects up to four physical computers (two with Windows 2000 Advanced Server, four with Datacenter) running on a high-speed network. Typically, the clustered computers share a common storage subsystem and function in an "active-active" fashion, meaning that all cluster computers (nodes) are actively doing work to share the load but can also take up the slack if one of the nodes fails. Figure 1 shows a 4-node MSCS cluster.
Figure 1: A 4-Node Cluster with Windows 2000 MSCS
MSCS exists primarily to increase application availability through its failover capabilities. Failover is a cluster's ability to move processing from a failed application (because of causes ranging from failed hardware to software bugs) at one node to another healthy node in the cluster. When the failed application is restored, a cluster should be able to "fail back" to the original cluster node. MSCS manages the failover and failback of applications running on a cluster without losing any data associated with a failed application, maintaining the user and application state across a failover. This type of clustering is known as stateful clustering. In contrast, NLB, CLB, and Application Center provide stateless clustering and dynamic load balancing (which I discuss in more detail later), in addition to promoting availability.
MSCS is a good choice for running crucial applications such as email servers or database applications. Let's say you decide to run Microsoft Exchange 2000 Server on a 4-node MSCS cluster. After you install the MSCS software and cluster-aware version of Exchange 2000, you can configure the cluster so that Exchange 2000 will fail over to a backup node if a problem occurs on the primary node. Users will undoubtedly have sessions open on the main server when it fails, but MSCS performs the failover quickly and automatically, without losing any data. The backup node picks up the workload and the data from the failed node, and service to users continues.
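To make the failover and failback sequence concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not the MSCS API: the `ToyCluster` class, its method names, and the node and application names are all hypothetical, and picking the failover target with `min()` is an arbitrary choice made to keep the example deterministic.

```python
class ToyCluster:
    """Toy model of failover and failback. Not the MSCS API."""

    def __init__(self, nodes):
        self.healthy = set(nodes)
        self.preferred = {}  # application -> node it normally runs on
        self.current = {}    # application -> node hosting it right now

    def host(self, app, preferred):
        """Place an application on its preferred node."""
        self.preferred[app] = preferred
        self.current[app] = preferred

    def fail(self, node):
        """Mark a node as failed and fail its applications over."""
        self.healthy.discard(node)
        if not self.healthy:
            raise RuntimeError("no healthy node left to fail over to")
        target = min(self.healthy)  # arbitrary but deterministic choice
        for app, where in list(self.current.items()):
            if where == node:
                self.current[app] = target

    def restore(self, node):
        """Bring a node back and fail back the applications that prefer it."""
        self.healthy.add(node)
        for app, home in self.preferred.items():
            if home == node:
                self.current[app] = node

    def where(self, app):
        """Return the node currently hosting the application."""
        return self.current[app]


cluster = ToyCluster(["node1", "node2"])
cluster.host("exchange", preferred="node1")
cluster.fail("node1")
print(cluster.where("exchange"))   # node2 (failover)
cluster.restore("node1")
print(cluster.where("exchange"))   # node1 (failback)
```

The model leaves out what makes real failover hard: the shared storage that lets the backup node pick up the failed node's data, and the preservation of user and application state.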
MSCS also lets users keep working while you upgrade an application. You can perform a rolling upgrade (i.e., upgrading an application on one cluster node at a time while the application continues to be available on the other nodes) instead of having to take an application down when you upgrade. For example, say you have a 2-node cluster. Node 1 runs Exchange 2000 and node 2 runs Microsoft SQL Server®, and you've configured your cluster so that Exchange 2000 and SQL Server will fail over to the other node when necessary. When the time comes to upgrade SQL Server, you can use MSCS Cluster Administrator to initiate a failover of SQL Server on node 2. When node 1 takes over the task of running SQL Server (along with Exchange 2000), you can upgrade the SQL Server software on node 2. When you've finished, you can fail back SQL Server from node 1 to node 2 and repeat the process with node 1's SQL Server software. When you're finished, you've updated the SQL Server software without causing any downtime for users.
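The rolling-upgrade procedure is essentially a loop over the nodes. The sketch below only generates the ordered list of steps as text; it doesn't call Cluster Administrator or any real cluster interface, and the function name `rolling_upgrade_plan` and the node names are hypothetical.

```python
def rolling_upgrade_plan(nodes, app):
    """Return the ordered steps for upgrading `app` one node at a time."""
    if len(nodes) < 2:
        raise ValueError("a rolling upgrade needs at least two nodes")
    steps = []
    for i, node in enumerate(nodes):
        # The next node in the list temporarily carries this node's work.
        standby = nodes[(i + 1) % len(nodes)]
        steps.append(f"fail over workloads from {node} to {standby}")
        steps.append(f"upgrade {app} on {node}")
        steps.append(f"fail back workloads from {standby} to {node}")
    return steps


# Node 2 is upgraded first, then node 1, as in the example above.
for step in rolling_upgrade_plan(["node2", "node1"], "SQL Server"):
    print(step)
```

At every step, at least one node is still running the application, which is why users see no downtime.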
You don't typically use MSCS to scale an application for more users, as you do with the other three Microsoft clustering solutions. An MSCS cluster neither provides dynamic load balancing nor distributes applications across its nodes in a stateless, shared-nothing fashion as do NLB, CLB, and Application Center. In fact, the only real way to achieve application scalability with MSCS is to manually divide an application among the cluster resources during installation. For example, if you need to serve 5000 users on Exchange 2000, you could use a 2-node active-active cluster with 2500 users on each node. That way, you get the benefit of two servers handling the users plus availability in the event of failure. However, when a failover occurs, the remaining node must be able to handle all 5000 users until you can restore the failed node.
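The sizing arithmetic behind that example is worth spelling out. The following sketch assumes that users are divided evenly among the nodes and that, after a single node fails, its users are spread evenly over the survivors. That even spread is my simplifying assumption; in practice, the failover targets you configure determine where the load lands. The function names are hypothetical.

```python
import math


def normal_users_per_node(total_users, nodes):
    """Users each node serves when every node is healthy."""
    return math.ceil(total_users / nodes)


def worst_case_users_per_node(total_users, nodes):
    """Users each surviving node serves after one node fails."""
    if nodes < 2:
        raise ValueError("need at least two nodes to survive a failure")
    return math.ceil(total_users / (nodes - 1))


print(normal_users_per_node(5000, 2))      # 2500
print(worst_case_users_per_node(5000, 2))  # 5000
```

In a 2-node cluster, each node must therefore be sized for the full 5000 users even though it normally serves only 2500.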
Network Load Balancing
NLB, formerly known as Windows NT Load Balancing Service (WLBS), distributes the incoming load of IP requests across multiple nodes running the NLB software. NLB provides scalability and availability for an IP-based application, such as a Web server. As user demand grows for server resources, NLB lets you add servers to handle the load. For example, Exchange 2000 benefits from using NLB with its Microsoft IIS–based communication front end for Outlook Web Access (OWA) to offload work from the main Exchange 2000 servers. The NLB cluster routes client requests to the back-end server or servers. If one NLB node goes down, the others pick up the extra load and the user notices no interruption in service.
The underlying NLB software is a network device interface specification (NDIS) driver that sits between the NIC and TCP/IP. You install the driver on each server in an NLB cluster. All NLB nodes share a virtual IP address that represents the desired network resource (e.g., the Web server). All NLB servers listen to all user requests, but only one responds. A load-balancing scheme based on a fast hashing algorithm that incorporates the client IP address, its port number, or both determines which server responds. You can specify an affinity to control whether requests from the same client keep going to the same server, and you can assign each server a load weight so that some servers get more traffic than others. A heartbeat feature lets all the NLB nodes know about any changes in the cluster, such as a failure or the addition of a node. When changes occur, NLB starts a convergence process that automatically reconciles the changes in the cluster and transparently redistributes the incoming load.
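The "everyone listens, one responds" scheme is easier to see in code. The sketch below is a simplified illustration, not NLB's actual algorithm: I use `zlib.crc32` as a stand-in hash, the function names `owner` and `accepts` are hypothetical, and every host is assumed to hold the same ordered list of cluster members. Each host runs the same calculation on every incoming request and handles the request only if the result names that host.

```python
import zlib


def owner(client_ip, client_port, hosts, use_port=True):
    """Pick the host that should answer a request (stand-in hash)."""
    # Hash the client IP alone, or the IP and port together.
    key = f"{client_ip}:{client_port}" if use_port else client_ip
    return hosts[zlib.crc32(key.encode()) % len(hosts)]


def accepts(host, client_ip, client_port, hosts, use_port=True):
    """Run on every host for every request; exactly one host returns True."""
    return owner(client_ip, client_port, hosts, use_port) == host


hosts = ["web1", "web2", "web3"]
takers = [h for h in hosts if accepts(h, "203.0.113.7", 40001, hosts)]
print(len(takers))  # 1

# After a failure, the survivors agree on a new member list and
# recompute ownership with it.
survivors = ["web1", "web3"]
takers = [h for h in survivors if accepts(h, "203.0.113.7", 40001, survivors)]
print(len(takers))  # 1
```

Because no host has to forward traffic to another, the cluster has no central dispatcher that could become a bottleneck or a single point of failure. Hashing on the client IP address alone (`use_port=False`) sends all of a client's requests to the same host regardless of source port.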
NLB has its genesis in Microsoft's 1998 acquisition of Oregon-based Valence Research. Valence's Convoy Cluster Software became WLBS, which was an add-on product for Windows NT Server 4.0 and Windows NT Server 4.0, Enterprise Edition. In Windows 2000, Microsoft renamed and enhanced WLBS, but the core technology is still the same. NLB is an integral part of the network services in Windows 2000 Advanced Server and Datacenter.
MSCS and NLB in Windows 2000 work well together as long as you run them on separate computers—for example, in the configuration that Figure 2 shows. Microsoft doesn't recommend running MSCS and NLB on the same computer and doesn't support doing so because of potential hardware sharing conflicts between MSCS and NLB. For information about uninstalling MSCS or NLB, see the Microsoft article "Windows 2000 Interoperability Between MSCS and NLB" (https://support.microsoft.com/default.aspx?scid=kb;en-us;235305).
Figure 2: An n-node NLB Cluster Working as a Single Virtual IP Server
Component Load Balancing
CLB is something completely new for Windows 2000. COM+, the next step in the evolution of COM, is also new for Windows 2000. COM+ integrates COM, Microsoft Transaction Server (MTS), and system services with the goal of making Windows 2000 a better platform on which to design, develop, deploy, and maintain component-based applications. Put simply, COM+ is COM with a bunch of system services, including services that let you distribute components across multiple systems. One COM+ service is the ability to load-balance access to COM+ objects. CLB is simply the load-balancing cluster—multiple servers that share the load of activating and executing COM+ objects.
The need for CLB, like the need for NLB, stems from availability and scalability requirements. When you run a critical application that consists of COM+ objects, a failure in the application or server causes serious problems. CLB ensures that the application will continue to run if a failure occurs and that the user won't experience a lapse in service. Furthermore, some COM+ objects can be large and fairly complex, and running them on a server along with other key applications such as IIS could bog down system performance. To provide scalability in this case, you could move the COM+ objects off the IIS servers and distribute them among multiple servers in their own CLB cluster.
Suppose you're a computer manufacturer with a commercial website where people come for product and technical information, product support, purchasing, and more. Users around the world work with your products 24 hours a day, so your website must be available and performing well all the time. You could take the approach that Figure 2 shows and run NLB on your Web servers with access to the back-end MSCS database cluster. However, let's say that much of the logic behind the services you provide is coded in COM+ objects. You could run those objects on the Web servers, but Web server response time might slow because the machine running the Web server also must process the COM+ objects. You probably need CLB.
Figure 3 depicts how you might deploy a CLB cluster in a highly available and scalable website. CLB balances the load of accessing the business logic, which COM+ objects in the application's middle tier provide. (A CLB cluster implicitly requires Application Center, which I explain in more detail in the "Application Center" section, but now you know why you would use CLB.)
Figure 3: Using CLB to Load-Balance Access to COM+ Objects
CLB uses a combination of server response time and a round-robin algorithm to determine which server will handle the next request. CLB polls the COM+ servers in the cluster at regular preset subsecond intervals to determine how quickly the servers respond to the poll (their response time is directly linked to how busy they are). CLB then lists the servers in order by response time, with the fastest server at the top so that it will get the next COM+ activation request. Then, CLB distributes the work to the servers in the order they appear on the list until the next polling interval, when CLB reorders the activation list by server response time.
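Here is a minimal sketch of that ordering scheme. It illustrates the logic only and isn't CLB's implementation: the `ActivationRouter` class, its method names, and the sample response times are hypothetical, and the polling itself is replaced by a dictionary of measured times passed in by the caller.

```python
class ActivationRouter:
    """Toy response-time-ordered round robin. Not the CLB implementation."""

    def __init__(self):
        self._order = []  # servers, fastest first
        self._next = 0    # position of the next server to use

    def update(self, response_times_ms):
        """Reorder the list after a poll; the fastest server goes first."""
        self._order = sorted(response_times_ms,
                             key=lambda s: response_times_ms[s])
        self._next = 0

    def pick(self):
        """Return the server for the next activation request."""
        if not self._order:
            raise RuntimeError("no servers have been polled yet")
        server = self._order[self._next % len(self._order)]
        self._next += 1
        return server


router = ActivationRouter()
router.update({"a": 30.0, "b": 10.0, "c": 20.0})
print([router.pick() for _ in range(4)])  # ['b', 'c', 'a', 'b']

router.update({"a": 5.0, "b": 10.0, "c": 20.0})  # next polling interval
print(router.pick())  # a
```

Between polls the router simply walks the list in order, so the response-time measurement influences only where the walk starts and how the servers are ranked.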
Because all this processing takes place over the network in real time, you can see that network contention could be a problem if you add CLB to a slow or congested network. You should deploy CLB clusters on a fast network backbone of at least 100Mbps. You don't typically put a CLB cluster on the regular corporate network where all the other network traffic lives.
Distributing COM+ objects in a CLB cluster doesn't make sense in all situations; you must base the decision to use CLB on an analysis of your application requirements. Clustering adds the overhead associated with the client requests that traverse the network and the overhead of selecting a server and activating the COM+ object to satisfy the client request. In some cases in which applications use a small number of lightweight COM+ objects, simply instantiating the objects locally on the Web server might provide better performance. Three scenarios in which you should consider CLB are as follows:
The COM+ objects that your business logic comprises are relatively "heavy" and must always run on the fastest server.
Security is a major concern, and you want to isolate COM+ objects by placing them behind an additional firewall.
Your COM+ applications are partitioned into multiple tiers for development or design reasons, and you need to employ CLB to separate the tiers.
CLB isn't available in any of the Windows 2000 Server family of products, nor can you purchase it as a standalone product. Originally, Microsoft intended to include CLB in the Windows 2000 Server family, but in September 1999, the company pulled it from Windows 2000 Release Candidate 2 (RC2) to put it into the newly announced Application Center. Today, the only way to get CLB is with Application Center.
Application Center
Application Center is part of the Microsoft Windows Server System™, whose precursor was Windows Distributed interNet Applications (Windows DNA) servers. Application Center's purpose is to be a single management point for your Web farm (i.e., multiple physical Web servers working together to serve common Web content), providing a unified user interface (UI) and leveraging both NLB and CLB for load balancing. Using Application Center, you can create clusters, join existing clusters, add and remove cluster members, deploy new content, configure load balancing, and monitor cluster performance. The result is a Web farm that looks like a single Web server to the outside user and is scalable, easy to manage, and highly reliable. These capabilities are important as an increasing number of critical applications become Web-based.
To see the full topology of Application Center, with all of Microsoft's clustering technologies working together, look again at Figure 3. The NLB cluster could be a cluster of IIS servers, for example, and the CLB cluster could provide the business logic. Together, the NLB and CLB clusters embody the Application Center Web cluster, and the database cluster uses MSCS.
Suppose you have an e-commerce site and are planning a big product rollout, during which many customers will want to buy the product. This situation will significantly increase website traffic, but you're not sure by how much. You've always just added servers as you need them, but setting them up is a pain. For the product rollout, you'd like to be able to scale the performance of the website by adding servers to your Web farm as easily as you can plug in disk drives to a RAID set. This type of scenario is precisely the kind in which the notion of RACs applies.
Application Center provides wizards for creating a cluster, adding new servers to a cluster, and deploying new content and configurations to cluster members. When you create a new cluster, you define a cluster controller that not only participates in the cluster but also owns all the configuration information. Then, you can specify additional members for the cluster. When you do so, Application Center deploys the COM+ settings, CryptoAPI settings, Registry keys, Windows Management Instrumentation (WMI) settings, file-system information, IIS metabase settings, and Web server content to each new cluster member. You end up with a cluster of clones, and you can use the Application Center Administrator to easily add to and remove from their numbers. Plus, Application Center transparently handles the usually tedious NLB configuration and deployment.
Application Center will support third-party IP load balancers in addition to NLB. As of this writing, Microsoft is working on support for Cisco Systems' LocalDirector, F5 Networks' BIG-IP, and Alteon WebSystems' ACEdirector. However, Application Center doesn't integrate these load balancers' administrative operations the way it does NLB's administration, so maintaining third-party load balancers will require some extra work.
You can install the full version of Application Center, which includes all the necessary components to create an Application Center cluster, on the following Windows 2000 versions:
Windows 2000 Server Service Pack 1 (SP1)
Windows 2000 Advanced Server SP1
Datacenter SP1
You must have Windows 2000 SP1 to support Application Center. If you don't, you can install only Application Center Administrator, which lets you remotely administer Application Center and IIS. The following Windows 2000 and Windows NT versions support Application Center Administrator:
Windows 2000 Professional
Windows 2000 Server
Windows 2000 Advanced Server
Datacenter
Windows NT Workstation 4.0 SP6 or later (x86-based computers only)
Windows NT Server 4.0 SP6 or later (x86-based computers only)
You can purchase Application Center, which should be shipping by the time you read this, separately as one of Microsoft's new Windows Server System products.
Clustering and Microsoft .NET
You might be wondering how Microsoft's clustering solutions fit into the company's next-generation Windows services, dubbed Microsoft® .NET. Figure 4 compares Windows 2000 and Windows DNA to the Windows Server System and Microsoft .NET. Microsoft's cluster solutions fit into two areas of Microsoft .NET: Microsoft Windows Server™ 2003 (MSCS) and the Windows Server System (NLB, CLB, Application Center). Windows Server 2003 represents the evolution of the Windows 2000 operating system. The Windows Server System includes the following products (with integrated XML capabilities), many of which you will recognize from Microsoft BackOffice® today:
Exchange 2000
SQL Server 2000
BizTalk® Server 2000
Application Center 2000
Host Integration Server 2000
Commerce Server 2000
Internet Security and Acceleration (ISA) Server 2000
Figure 4: Evolution of the Microsoft .NET Platform from Windows DNA
Windows Server 2003 and the Windows Server System are two foundational components that Microsoft will enhance as it moves forward with Microsoft .NET. Microsoft is serious about Windows' scalability, availability, and reliability. Delivering on these characteristics becomes even more important as Microsoft advances .NET further into the Internet world as the platform for business.
Clearly, the presence of robust clustering technologies such as MSCS, NLB, CLB, and Application Center is vital to Microsoft's success in supporting business-critical applications. I expect to see enhancements and extensions to these clustering technologies, especially in Application Center and in new .NET technologies such as Active Server Pages+ (ASP+) load balancing, which Microsoft introduced at the Professional Developers Conference earlier this year. Take the time now to understand how these powerful clustering solutions work and how they might address your business problems.
Greg Todd, a program manager in the PATROL business unit of BMC Software, is responsible for setting technical direction for the company's Windows 2000 solutions. He has worked with Windows NT and its related technologies since 1993. You can reach him at gregt@bmc.com.
Related Microsoft Websites
"Exploring Windows Clustering Technologies"
https://www.microsoft.com/windows2000/guide/server/features/clustering.asp
"Network Load Balancing Technical Overview"
https://www.microsoft.com/technet/prodtechnol/windows2000serv/deploy/confeat/nlbovw.mspx
https://www.microsoft.com/windows2000/library/howitworks/cluster/nlb.asp
"Windows 2000 and Component Load Balancing"
https://www.microsoft.com/windows2000/news/fromms/clb.asp
"Microsoft Application Center 2000"
https://go.microsoft.com/fwlink/?linkid=591
"Introducing Windows 2000 Advanced Server"
https://www.microsoft.com/windows2000/guide/server/solutions/overview/advanced.asp
"Microsoft Windows 2000 Datacenter Server"
https://www.microsoft.com/windows2000/guide/datacenter/overview/default.asp
"Windows DNA Application Services"
https://www.microsoft.com/net/
"Microsoft Windows DNA"
https://www.microsoft.com/dna/default.asp
"Microsoft Services: Overview"
https://www.microsoft.com/business/default.mspx
"Microsoft .NET"
https://www.microsoft.com/net/
"Microsoft PDC 2000"
https://msdn.microsoft.com/events/pdc/