Deployment Notes for Windows NT Load Balancing Service (WLBS)

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

Abstract

This document is intended as a general outline of deployment scenarios for the IP load balancing feature of Microsoft Windows NT Server 4.0, Enterprise Edition, the Windows NT Load Balancing Service (WLBS).

On This Page

Introduction
Key Features of WLBS
Common Deployment Scenarios
Network Monitoring Tools for WLBS
Port Rules for Common Applications
Setup Notes
For More Information

Introduction

The two principal goals of Microsoft Windows NT Load Balancing Service (WLBS) are to:

  • Provide high availability for Internet/intranet server programs.

  • Scale server performance.

It accomplishes these goals by using a cluster of two or more computers (called hosts) working together. Internet clients access the cluster using a single IP address (or a set of addresses for a multi-homed host). The clients cannot distinguish the cluster from a single server. Also, server programs cannot tell that they are running in a cluster. However, a WLBS cluster differs significantly from a single host running a single server program because it provides uninterrupted service even if a host fails and it can respond to the clients more quickly than a single host can (for load-balanced ports).

Because of this unique, fully distributed architecture, WLBS is ideal for creating highly scalable, highly available Internet server farms. WLBS enables sites to be created that can effectively handle heavy loads of Internet traffic, while guaranteeing that they will always be available.

Services that have been successfully tested for compatibility with WLBS include:

  • Web services (such as Microsoft Internet Information Server).

  • Virtual private networking solutions.

  • Streaming media (such as Microsoft Windows NT Server NetShow Services).

  • Proxy services (such as Microsoft Proxy Server).

Note: Because WLBS is designed to work with a wide variety of IP-based services, it can potentially benefit services other than those listed above. Microsoft plans to test WLBS compatibility with various applications and services, and will update this compatibility list accordingly.

Key Features of WLBS

Scalable performance

  • Load balances requests for individual TCP/IP services across the cluster.

  • Supports up to 32 computers in a single cluster.

  • Optionally load balances multiple server requests from a single client.

  • Fully pipelined implementation ensures high performance and low overhead.

Fault tolerance

  • Automatically detects and recovers from a failed or offline computer.

  • Automatically rebalances the network load when the cluster set changes.

  • Recovers and redistributes the workload within 10 seconds.

  • Handles inadvertent subnetting and rejoining of the cluster network.

Controllability

  • You can specify the load balancing for a single IP port or group of ports using straightforward "port management rules" that tailor the workload for each computer. Optional support for client sessions can be enabled.

  • Optional single-host rules let you direct all client requests to a single host to further refine load balancing among different programs.

  • You can block undesired network access to certain IP ports.

  • Logs all actions and cluster changes to the Windows NT event log.

  • You can remotely start, stop, and control WLBS's actions from any networked computer that is running Windows NT and using console commands or scripts.

Ease of Use

  • Installs as a standard Windows NT networking driver component.

  • Requires no hardware changes to install and run.

  • Lets clients access the cluster with a single logical Internet name and IP address (virtual IP address) while retaining individual names for each computer. Allows multiple virtual IP addresses for multi-homed servers.

  • Server programs need not be modified to run in a WLBS cluster.

  • All operations, including recovery, require no human intervention.

  • Computers can be taken offline for preventive maintenance without disturbing cluster operations.

  • Comes with full online Help facilities.

Common Deployment Scenarios

Note: Additional deployment scenarios will be posted as available to the WLBS Frequently Asked Questions.

Using WLBS with Microsoft Cluster Service (MSCS)

Situation: To take full advantage of the complete clustering solutions provided in Windows NT Server 4.0 Enterprise Edition, it is recommended to create two layers of clustering using both WLBS and Microsoft Cluster Service, formerly known by the code name "Wolfpack". The two technologies working together can provide a superior level of availability and scalability for both front and back-end services. A common situation would involve a WLBS cluster of Web servers processing e-commerce transactions against an MSCS cluster of Microsoft SQL Server databases (see Figure 1 below).

Figure 1: Using WLBS and MSCS together.

When Do You Use WLBS vs. MSCS?

  • Microsoft Cluster Service, which allows clustering of 2 nodes, is ideal for availability of services such as SQL Server, file and print, Microsoft Exchange Server, etc. In most situations where large volumes of highly dynamic data need to be made highly available, MSCS is the preferred solution.

  • WLBS, which allows clustering of up to 32 nodes, is ideal for high availability and scalability of TCP/IP-based services such as Web servers (e.g. IIS), streaming media, Virtual Private Networking (VPN), and proxy services, which are generally considered to be "stateless".

  • Additional information on deployment considerations for multi-tier WLBS and MSCS networks will be posted to the Windows NT Server 4.0 Enterprise Edition Web site (see "For More Information" below) as it becomes available.

Standalone Web Servers

Situation: You have one Windows NT Web server hosting your entire site (Internet/intranet/extranet). You are currently faced with one of two issues (or both):

  1. Demand has outstripped the capacity of the server and you are looking to expand.

  2. While demand may be manageable, the site is mission-critical and you need to guarantee its availability.

In either case you will want a solution that has a minimal impact on your network design, is compatible with existing systems, and is cost-effective. WLBS is a perfect solution.

Solution:

  1. Purchase or set up additional servers; the precise number depends on expected demand, type of application or service being hosted, and a variety of other factors. We recommend starting small, with one additional server. Note: The hardware configuration does not have to match that of the original server.

  2. License Windows NT Server 4.0, Enterprise Edition for each server to be clustered. Note: While a Windows NT Server 4.0, Enterprise Edition license is required for each server in a WLBS cluster, WLBS will install and function on standard Windows NT Server.

  3. Configure the site content so it is consistent across all servers to be added into the WLBS cluster. Tools such as Microsoft Content Replication Server are often useful for this.

  4. Download WLBS from the Microsoft Web site (see "For More Information" at the end of this guide) and install it according to the instructions. Rolling installations (configuring WLBS on an offline server, taking down the original server to install WLBS, bringing the first WLBS server online, and then adding a second WLBS server) are recommended to keep the site online during the process.

Web Servers - Using Round Robin DNS

Situation: You have implemented a limited form of IP load balancing known as Round Robin DNS (RRDNS). While RRDNS does provide a basic, static level of load balancing, it does not provide a high-availability solution. In the event of a server failure, a portion of the site's clients will not have access to the site, and will have to wait for the failed server's IP address to be manually removed from the DNS tables and for their connections to time out. The end result is that they could be unable to access the site for several hours or more. You are likely looking for the following improvements:

  1. More control over how incoming traffic is load balanced across the Web server farm.

  2. A solution that automatically and transparently removes a failed server from the cluster and redistributes its share of the IP traffic.

Solution:

This is an ideal situation for WLBS. Because you have already set up a distributed Web site, and likely have settled on a way to ensure content consistency across all the servers on the site, WLBS will install transparently. An added advantage of WLBS is that your network will expose fewer IP addresses to the world.

  1. Purchase enough new or upgrade licenses of Windows NT Server, Enterprise Edition. Download WLBS from the Microsoft Web site.

  2. Install WLBS on each server in the existing Web farm. Remove all but one IP address from the DNS table; the remaining IP address will serve as your cluster virtual IP address.

  3. Conduct a rolling installation (see previous example).

Once the cluster is up and running you will see a dramatic improvement in the ability to conduct planned downtime (such as hardware or software upgrades), smoothly handle unplanned downtime (such as server failure), and control the statistical breakdown of traffic to each server in the WLBS cluster.

You may elect to combine RRDNS with WLBS in order to provide high availability and load balancing for geographically distributed Web servers. On each site, use a WLBS cluster to ensure high availability and scalability within that site. Use RRDNS to load balance client requests across multiple sites.
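As a rough illustration of this combined arrangement, the Python sketch below hands out one virtual IP address per site in strict rotation, which is the effect RRDNS has on successive name lookups; WLBS would then balance the load within whichever site the client reaches. The addresses and the function name are hypothetical and are not part of WLBS or of any DNS server.

```python
from itertools import cycle

# Hypothetical virtual IP addresses, one per geographically separate WLBS cluster.
SITE_VIRTUAL_IPS = ["192.0.2.10", "198.51.100.10", "203.0.113.10"]


def round_robin_resolver(addresses):
    """Return a function that hands out addresses in strict rotation,
    mimicking how RRDNS answers successive lookups."""
    rotation = cycle(addresses)
    return lambda: next(rotation)


resolve = round_robin_resolver(SITE_VIRTUAL_IPS)
answers = [resolve() for _ in range(4)]  # the fourth lookup wraps to the first site
```

Note that the rotation itself knows nothing about site health; availability within each site comes from the WLBS cluster behind the address.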

Virtual Private Network Servers

Situation: WLBS enables you to create highly available and scalable VPN solutions. It works in the same manner and with the same basic setup and configuration principles as for Web servers, with some minor differences as noted under "Solution" below. The two key advantages WLBS provides for your Windows NT Server-based VPN solutions are:

  1. Increased reliability: WLBS spreads incoming IP requests across a group of VPN servers in the same manner as for other services set with WLBS Affinity to "Single" or "Class C". Each session initiated by a client is handled in its entirety by a single WLBS host. Upon host failure that client is prompted to log on again. The subsequent connection is sent to one of the remaining WLBS hosts without any additional action required by the client.

  2. Performance gains through scalability: By spreading the load of each client's initial IP requests, WLBS enables you to significantly scale the performance of your VPN services by adding additional servers as needed.

Solution: Create a 2+ node cluster of VPN servers, using Microsoft Routing and Remote Access Service (RRAS). Some things to note:

  1. Two Network Interface Cards (NICs) must be used on each node for this deployment. One NIC must be set to respond to the WLBS virtual IP address, while the other is set to respond to the dedicated IP address unique to each system. Due to a current limitation of VPN, these two addresses must be in different Class C address spaces.

  2. Set the bindings so that the WLBS Driver is bound to the cluster NIC (the NIC which does not have the dedicated IP address), and add the WLBS Virtual IP Address to the WLBS Virtual NIC. This limitation is similar to the NetBIOS limitation outlined later in this guide.

  3. For simple failover: Do not create any port rules. The host with the highest host priority (i.e. "1") set in the WLBS Setup screen will handle all incoming connections by default. Upon node failure all connections will then be handled by the node with the second highest host priority (i.e. "2").

  4. For load balancing: Set one port rule on all cluster hosts. The port rule should be for port range 1 to 65,535 (i.e. all ports), with Single Affinity set. This is the default port rule for WLBS.

Note: Client sessions handled by a particular cluster host will break in the event that the host becomes unavailable. Clients will then be prompted to log on again, with their new session then being handled by the appropriate remaining host.
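The simple failover behavior described above, where traffic not covered by a port rule goes to the live host with the highest priority, can be pictured with a minimal Python sketch. The function name and the set of priorities are hypothetical; this only models the selection rule stated in the text, not WLBS's internal implementation.

```python
def default_host(alive_priorities):
    """Return the priority of the host that handles traffic when no port
    rule applies. Priority 1 is the highest, so the smallest number among
    the live hosts wins."""
    if not alive_priorities:
        raise ValueError("no cluster hosts are available")
    return min(alive_priorities)


alive = {1, 2, 3}
first = default_host(alive)          # host with priority 1 handles everything
alive.discard(1)                     # simulate failure of that host
after_failure = default_host(alive)  # host with priority 2 takes over
```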

Streaming Media

Situation/Solution: To cluster streaming media servers with WLBS, set up two or more servers configured identically with streaming media services, and then install WLBS on each with the default port rule (i.e. Ports 1-65,535 enabled, Single Affinity, equal load distribution, etc.). Optionally, a simple failover cluster can be configured; for setup information, see the notes in the preceding section, "Virtual Private Network Servers". With respect to maintaining a continuous stream in the event of a server failure, a streaming media cluster behaves in the same manner as a VPN cluster, as described in the notes in that section.

Secure Servers

Situation/Solution: The site needs to maintain client sessions, such as connections to secure servers. WLBS handles this situation by enabling system administrators to specify types of "Client Affinity". Note that this section describes continuous session state through affinity; to configure a site using Active Server Pages please refer to the "Active Server Pages (ASP)" scenario below.

Solution: There are two types of session support available in WLBS (default is set to "Single").

  1. Single: Use this when deploying solutions such as secure socket layer (SSL). Set a port rule to "Single Affinity" for port 443. FTP sites are also commonly deployed using "Single" affinity.

  2. Class C: Class C affinity is useful when a site is expecting a large number of users to be coming from one particular Class C address space, such as from a large corporate proxy array.
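One way to picture the difference between the affinity settings is by how much of the client's address decides which host serves a request. The Python sketch below is illustrative only: it does not reproduce WLBS's internal distribution algorithm, the function name and setting strings are hypothetical, and the use of the client port when no affinity is set is an assumption made for the example.

```python
def affinity_key(client_ip, client_port, affinity):
    """Return the part of the client's identity that would be used to pick
    a host. Requests with equal keys would land on the same host."""
    octets = client_ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    if affinity == "none":
        # Assumption: without affinity, separate connections from one
        # client may be spread across hosts.
        return (client_ip, client_port)
    if affinity == "single":
        # All connections from one client IP address stay together.
        return (client_ip,)
    if affinity == "class_c":
        # All clients in the same Class C address space stay together,
        # for example users behind a corporate proxy array.
        return (".".join(octets[:3]),)
    raise ValueError("unknown affinity setting: " + affinity)


proxy_a = affinity_key("192.0.2.17", 1050, "class_c")
proxy_b = affinity_key("192.0.2.201", 2311, "class_c")
```

Here `proxy_a` and `proxy_b` are equal, which is the property that makes Class C affinity useful when one user's requests arrive through several proxy addresses in the same range.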

Active Server Pages (ASP)

Situation: Sites often use Active Server Pages to maintain a level of state information. As WLBS does not handle the synchronization of state across cluster nodes, it is necessary to implement a solution whereby a client's ASP session is maintained when connecting to other cluster nodes (such as immediately following a node failure).

Solution: There are three possible solutions:

  1. Implement a Microsoft or third-party data synchronization solution to perform synchronization of ASP client state across all cluster nodes.

  2. Encapsulate all of the state in a client-side cookie. Cookies are character strings that are provided by the server to the client to be stored and used by the client during subsequent requests to the server. With this technique, every request that the client sends carries all of the "context" that the receiving server needs to process that request. Note that this works only if there is a relatively small amount of state associated with each client transaction. As state grows larger, it becomes very inefficient and often infeasible to have the client forward a large cookie to the server every time. WLBS does not provide a solution for this setup at this time.

  3. Cache state on the Web servers, but use backend databases or middle tier application servers as the authoritative repository of the state. In this way, while a client is connected to the server, it has all of the necessary state in its memory and can process requests very fast. However, there needs to be a mechanism for "flushing" this state to the backend in case the client connects to another server, whereby it can simply fetch the necessary context from the backend.
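The third approach can be sketched in Python as a per-server cache that flushes every change to a shared backend. This is a minimal sketch under stated assumptions: the class name is hypothetical, a plain dictionary stands in for the backend database or middle-tier server, and state is flushed on every update rather than on a schedule.

```python
class SessionStateCache:
    """Per-server cache of session state backed by an authoritative store."""

    def __init__(self, backend):
        self.backend = backend  # dict-like stand-in for the backend database
        self.local = {}         # fast in-memory copy held by this Web server

    def get(self, session_id):
        # On a cache miss (for example after the client moves to this server),
        # fetch the session's context from the backend.
        if session_id not in self.local:
            self.local[session_id] = dict(self.backend.get(session_id, {}))
        return self.local[session_id]

    def update(self, session_id, **changes):
        state = self.get(session_id)
        state.update(changes)
        # Flush immediately so another server can recover the state.
        self.backend[session_id] = dict(state)


backend = {}
node_a = SessionStateCache(backend)
node_b = SessionStateCache(backend)
node_a.update("session-1", cart=["book"])
recovered = node_b.get("session-1")  # what node B sees if node A fails
```

Flushing on every update keeps the example simple at the cost of a backend write per request; a production design would need to weigh that cost against how much state it can afford to lose.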

See the section below: "Scaling Remote File/Printer Access and Adjusting NBT Support Parameters".

Multi-homed Web Servers

Situation: You want to host multiple Web sites on a WLBS cluster. On first inspection, you notice the WLBS Setup screen only allows for one "primary", or virtual, IP address.

Solution:

  1. Open "Advanced IP Addressing" in the Network control panel. (Network Control Panel, click the Protocols tab, click the TCP/IP Protocol, select "Properties", select the IP Address tab, and then select "Advanced").

  2. In the IP Addresses field for the WLBS Virtual NIC, enter as many additional virtual IP addresses as required. Ensure that this field is consistent across the cluster. WLBS will treat all IP addresses, except for each server's dedicated IP address, as virtual. The dedicated IP address should always be entered first in TCP/IP's list of IP addresses.

  3. For cluster control purposes, the "Primary IP address" entered in the WLBS setup screen will serve as the cluster ID. This ID is used to identify the cluster for remote control and to tag WLBS messages in case multiple clusters reside on the same subnet.
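The two consistency requirements in step 2 (the dedicated IP address is listed first, and the virtual IP addresses match on every host) lend themselves to a quick check. The Python sketch below assumes you have already collected each host's ordered address list by some other means; the function name, host names, and addresses are hypothetical.

```python
def check_ip_lists(hosts):
    """hosts maps a host name to (dedicated_ip, ordered list of IP addresses).
    Returns a list of human-readable problems; an empty list means consistent."""
    problems = []
    virtual_sets = set()
    for name, (dedicated, ip_list) in sorted(hosts.items()):
        if not ip_list or ip_list[0] != dedicated:
            problems.append(name + ": dedicated IP address must be listed first")
        # Everything except the dedicated address is treated as virtual.
        virtual_sets.add(frozenset(ip for ip in ip_list if ip != dedicated))
    if len(virtual_sets) > 1:
        problems.append("virtual IP addresses differ between hosts")
    return problems


good = {
    "web1": ("10.0.0.1", ["10.0.0.1", "10.0.0.100", "10.0.0.101"]),
    "web2": ("10.0.0.2", ["10.0.0.2", "10.0.0.100", "10.0.0.101"]),
}
bad = {
    "web1": ("10.0.0.1", ["10.0.0.100", "10.0.0.1"]),
    "web2": ("10.0.0.2", ["10.0.0.2", "10.0.0.100", "10.0.0.101"]),
}
good_result = check_ip_lists(good)
bad_result = check_ip_lists(bad)
```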

Note: In any of these scenarios, back-end data resources (e.g. SQL Server databases) can be made highly available using Microsoft Cluster Service (MSCS), a component of Windows NT Server 4.0, Enterprise Edition.

Network Monitoring Tools for WLBS

To monitor and evaluate WLBS's load balancing behavior and performance, you can use Microsoft's Performance Monitor (found under Programs\Administrative Tools in the Windows NT Start menu). Using this tool, you can graph network and Web server throughput, CPU load, and other important measures of server performance. The actual load balance across servers and failover behavior can be clearly visualized. Consider using a synthetic Web load generator (available from several third parties) to drive these performance measurements.

In version 2.2 for Windows NT 4.0, WLBS does not include application monitoring tools or features. Many leading monitoring tools on the market can be easily configured by an end-user to provide this support, by creating an alert option specifically for WLBS. In the alert functions of many monitoring tools, an option is provided for running a script, in addition to the standard "Page this number" or "Email this address". Using these scripts, WLBS can be tied into existing monitoring infrastructure.

Example:

IF the IIS application on Server A does not respond for X seconds, THEN run the following script: 'wlbs stop <cluster IP address>:<Server A's host ID or dedicated IP address>'
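A monitoring tool's alert script along these lines might look like the following Python sketch. It uses only the command form shown in the example above; the function names are hypothetical, and the health probe is deliberately left as a caller-supplied function because how "does not respond for X seconds" is measured depends on the monitoring tool.

```python
import subprocess


def build_stop_command(cluster_ip, host):
    """Build the 'wlbs stop <cluster>:<host>' command from the example above.
    'host' is the host ID or the dedicated IP address of the failing server."""
    return ["wlbs", "stop", f"{cluster_ip}:{host}"]


def check_and_stop(probe, cluster_ip, host, run=subprocess.run):
    """Call probe(); if it reports the application as unhealthy, run the
    stop command. Returns the command that was run, or None if healthy."""
    if probe():
        return None
    command = build_stop_command(cluster_ip, host)
    run(command, check=False)  # 'run' is injectable so the logic can be tested
    return command
```

For example, `check_and_stop(my_probe, "10.0.0.100", "2")` would take host 2 out of the cluster at the hypothetical address 10.0.0.100 when `my_probe` returns False.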

Some examples of third-party tools:

SiteScope by Freshwater Software ( https://www.freshtech.com )

AppManager by NetIQ ( https://www.netiq.com )

WhatsUp Gold by Ipswitch ( https://www.ipswitch.com )

Port Rules for Common Applications

HTTP: Web servers typically listen on port 80. Affinity should be set to 'None', unless the Web server maintains client state in its memory, in which case set affinity to 'Single' or 'Class C'.

HTTPS: HTTP over SSL (encrypted Web traffic) is usually handled on port 443. Affinity should be set to 'Single' or 'Class C' to ensure that client connections are always handled by the server that has SSL session established.

FTP: FTP uses port 21 for control connection from the client and port 20 for return data connection from the server. Create two port rules that cover ports 20-21 and 1024-65,535 with affinity 'Single' or 'Class C' to ensure that both data and control connections are handled by the same server.

TFTP: TFTP servers (BOOTP, etc.) use port 69 and can easily be load balanced with WLBS. Affinity should be set to 'None', when creating port rule covering port 69.

SMTP: WLBS can be used effectively for scaling high-volume SMTP mailers. In this case you should use port 25 and set affinity to 'None'.

NBT: NetBIOS over TCP/IP uses port 139 on the server. Affinity can be set to either 'None' or 'Single', but we recommend 'Single' for maximum compatibility with server applications.
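For reference, the recommendations above can be collected into a small lookup table. The Python sketch below is only a restatement of this section as data: the structure and function name are hypothetical, it is not a WLBS configuration format, and where the text allows 'Single' or 'Class C' the table arbitrarily records 'Single'.

```python
# Port ranges are inclusive (first_port, last_port) pairs.
PORT_RULES = {
    "HTTP":  {"ports": [(80, 80)], "affinity": "None"},
    "HTTPS": {"ports": [(443, 443)], "affinity": "Single"},
    "FTP":   {"ports": [(20, 21), (1024, 65535)], "affinity": "Single"},
    "TFTP":  {"ports": [(69, 69)], "affinity": "None"},
    "SMTP":  {"ports": [(25, 25)], "affinity": "None"},
    "NBT":   {"ports": [(139, 139)], "affinity": "Single"},
}


def rules_for_port(port):
    """Return the names of the applications whose recommended rules cover a port."""
    return sorted(
        name
        for name, rule in PORT_RULES.items()
        if any(first <= port <= last for first, last in rule["ports"])
    )
```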

Setup Notes

Note: This information can also be found in the WLBS.hlp file included in the WLBS download.

Adjusting Convergence Parameters

To adjust WLBS convergence parameters, start the Windows NT registry editor and select the key HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\WLBS\Parameters. The AliveMsgPeriod value holds the message exchange period in milliseconds, and the AliveMsgTolerance value specifies how many exchanged messages from a host can be missed before the cluster initiates convergence. You usually should not need to change these parameters. If it is necessary to make changes, pick these numbers based on your failover requirements. A longer message exchange period will reduce the networking overhead needed to maintain fault tolerance, but it will increase the failover delay. Likewise, increasing the number of message exchanges prior to convergence will reduce the number of unnecessary convergence initiations due to transient network congestion, but it will also increase the failover delay. Using the default values, 5 seconds are needed to discover a missing host, and another 5 seconds for the cluster to redistribute the load. A total of 10 seconds to complete failover should be acceptable for most TCP/IP programs, and this configuration incurs very low networking overhead. Setting the values too low may make the convergence process unstable or cause very high network traffic.
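The trade-off can be made concrete with a little arithmetic. The Python sketch below estimates the failover delay as the detection time (message period multiplied by the number of missed messages tolerated) plus the redistribution time. The function name is hypothetical, and the values 1000 milliseconds and 5 messages are assumptions chosen because they reproduce the 5-second detection figure quoted above; check the actual registry values on your hosts. Treating redistribution as a fixed 5 seconds is also a simplification.

```python
def failover_estimate_seconds(alive_msg_period_ms, alive_msg_tolerance,
                              redistribute_seconds=5.0):
    """Return (detection_seconds, total_failover_seconds)."""
    # A host is considered missing after this many message periods pass
    # without hearing from it.
    detection = alive_msg_period_ms * alive_msg_tolerance / 1000.0
    return detection, detection + redistribute_seconds


# Assumed defaults: 1000 ms period, 5 missed messages tolerated.
detection, total = failover_estimate_seconds(1000, 5)

# Doubling the period halves the heartbeat traffic but doubles detection time.
slow_detection, slow_total = failover_estimate_seconds(2000, 5)
```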

Scaling Remote File/Printer Access and Adjusting NBT Support Parameters

When using the Microsoft network commands (such as net use) or when mapping in remote shares, you can access a WLBS cluster as a whole using a single NetBIOS machine name. This lets you use WLBS to scale read-only file services or print services across the cluster. It also adds high availability because NetBIOS connections are automatically retried when a cluster host fails. This feature supports browsing for cluster-hosted shares that have been mounted as local drives using the net use command. However, the cluster's NetBIOS machine name currently does not appear in the Network Neighborhood folder.

To scale Microsoft (read-only) file or print services using WLBS, create a port rule for port 139 with Multiple Hosts filtering mode and Single client affinity. Be sure that the file shares or printers to be accessed by clients are identically available on all cluster hosts. (To identify the host to which a client connects, you may want to place a special file called Hostname on each host's share that contains the name of the particular cluster host.) On the client systems that will access the cluster, mount the desired cluster-hosted shares or printers using the net use command. For example, to mount a file share, use the following command, where x: is the client's local drive for the share and cluster is the cluster's NetBIOS machine name:

net use x: \\cluster\share_name

The cluster's NetBIOS machine name is the host name within the full Internet name, which is entered via the WLBS Setup dialog box. In order for the clients to locate the cluster using this cluster NetBIOS name (by resolving the cluster NetBIOS name to the cluster's IP address), you will need to create the appropriate static WINS, DNS, or LMHOSTS file entry; please consult your Microsoft documentation for instructions on how to do this. The cluster hosts will not advertise the cluster's NetBIOS name, but they will answer requests made to it. The individual cluster hosts will continue to advertise and respond to their respective NetBIOS machine names in addition to the cluster's machine name.

Note:

Microsoft's NBT support uses only the first IP address on each NIC to which the WINS Client is bound. To make use of the cluster IP address and retain normal access to the cluster host with its dedicated IP address, you must use two NICs in each cluster host. This constraint is a limitation of NBT support, not of WLBS. Also, be sure that the cluster's primary IP address is the first IP address added by TCP/IP for the WLBS Virtual NIC and do not use a dedicated IP address for this NIC. The dedicated IP address should be used with the second NIC.

NBT support is enabled by default. You can disable it by changing the NBTSupportEnable registry parameter value from 1 to 0.

Adjusting Multicast Parameters

WLBS uses one of two methods to enable the cluster NICs on all cluster hosts to simultaneously receive network traffic for the cluster's IP addresses.

  • When multicast support is enabled in the WLBS Setup dialog box, WLBS adds a multicast MAC address to the cluster NICs on all cluster hosts. The cluster NICs retain their original MAC addresses.

  • Otherwise, WLBS instructs the cluster NIC's driver (via the Windows NT registry) to change its MAC address to the cluster's MAC address used on all cluster hosts. (Note that some NICs do not support changing their MAC addresses.)

Multicast support has several advantages, such as eliminating the single network interface card limitations and enabling WLBS to work properly with switches. However, it requires that WLBS handle the resolution of the cluster IP address to its associated multicast cluster MAC address within the Address Resolution Protocol (ARP). WLBS automatically provides this capability. In rare cases, you may want to disable WLBS's ARP support for multicast addresses. To do this, set the MulticastARPEnabled registry parameter to 0. By default, this parameter is set to 1 and is used only when multicast support is enabled.

The use of a multicast MAC address may not be supported by the ARP implementation on some routers. If this problem arises, the cluster will not be reachable from outside the local subnet, and it is necessary to create a static ARP entry within the router. Please refer to the documentation for your router to determine how to create a static ARP entry.

Microsoft has observed that some NIC drivers experience increased CPU load percentages (by 5 percent or more depending on network traffic) when handling multicast MAC addresses instead of unicast addresses. These NIC drivers do not appear to use the advanced pipelining features available in the NDIS 4.0 specification. The Compaq NetFlex driver has not exhibited this problem. If your cluster hosts have unusually high CPU load percentages when using WLBS's multicast support, you may want to disable this feature (using the multicast enable button in the WLBS Setup dialog box) to see if this is the source of the problem.

Optimizing Network Performance

If the cluster hosts are directly connected to a switch in order to receive client requests, incoming client traffic is automatically sent to all switch ports. If you need to avoid flooding the switch ports (for example, when computers outside the cluster also share the switch), do the following:

  1. Connect the cluster NIC from each host to a hub and uplink the hub to one switch port.

  2. Disable multicast support by clearing the enabled checkbox in WLBS Setup.

  3. Set the MaskSourceMAC registry parameter to 0. (The default value for this parameter is 1.)

These steps will cause all incoming client traffic to flow through one switch port and will thereby conserve switch bandwidth. In this configuration, you can consider adding a second, dedicated NIC in each host connected to an individual switch port. The use of two NICs per host helps to pipeline network traffic through the cluster hosts. Incoming client traffic flows through the hub for simultaneous delivery to all hosts, while outgoing traffic flows directly to the switch ports. Be sure to enter a gateway IP address in the TCP/IP Properties dialog for each dedicated NIC to cause outgoing network traffic to flow through that NIC.
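If the registry change in step 3 has to be applied to many hosts, one option is to generate a registry import file rather than editing each host by hand. The Python sketch below builds such a file in the REGEDIT4 text format. It rests on two assumptions that you should verify before importing anything: that MaskSourceMAC lives under the same WLBS Parameters key named in "Adjusting Convergence Parameters", and that it is stored as a DWORD value. The function name is hypothetical.

```python
# Assumption: the parameter sits under the WLBS Parameters key named earlier.
WLBS_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\WLBS\Parameters"
)


def reg_file_text(values):
    """Build REGEDIT4-format text that sets DWORD values under the WLBS key.
    'values' maps a parameter name to a non-negative integer."""
    lines = ["REGEDIT4", "", "[" + WLBS_PARAMETERS_KEY + "]"]
    for name, number in values.items():
        # DWORD values are written as eight hexadecimal digits.
        lines.append('"{}"=dword:{:08x}'.format(name, number))
    return "\r\n".join(lines) + "\r\n"


text = reg_file_text({"MaskSourceMAC": 0})
```

Review the generated text and back up the registry before importing it on a cluster host.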

Handling LAN Failures

The inadvertent subnetting of the cluster network due to a local area network failure presents a special problem, and the WLBS cluster attempts to handle it as gracefully as possible. If the cluster splits into multiple subnets that cannot communicate with each other, each part will reconverge as an individual cluster ready to handle all of the traffic for the cluster IP address. (This problem is distinct from a networking failure in which a single host goes offline, in which case the host just leaves the cluster.) If both subnets remain active (that is, they continue to handle incoming Internet requests), a conflict arises when the networking failure is corrected and the cluster rejoins. In this case, some outstanding connections may have to be closed in order to establish a consistent state for the rejoined cluster. This situation occurs infrequently. It is far more probable that only one subnet will have access to the Internet, in which case no conflict arises because the other subnets do not provide service during the network outage.

For More Information

Microsoft TechNet

https://www.microsoft.com/technet/default.mspx

WLBS Home Page (and download site):

www.microsoft.com/ntserver/nts/downloads/winfeatures/WLBS/default.asp

Windows NT Server, Enterprise Edition Site:

www.microsoft.com/ntserver/ProductInfo/Enterprise/default.asp

Windows 2000 Server Site:

https://www.microsoft.com/technet/prodtechnol/windows2000serv/default.mspx