NLB Fundamentals - FAQ

10/08/2009

Applies To: Windows Server 2003 with SP1

Q. Do the Heartbeat Packets Consume a Lot of Bandwidth?

A. No, the heartbeat packets, which are emitted every second by each host, consume less than 1,500 bytes.
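As a rough back-of-the-envelope check, the figures above can be turned into an upper bound on heartbeat traffic. This sketch assumes the 1,500-byte figure is a per-packet ceiling and that every host sends exactly one heartbeat per second; the function and constant names are illustrative only.

```python
# Upper bound on heartbeat traffic, using the figures quoted in the answer above.
HEARTBEAT_MAX_BYTES = 1500   # assumed per-packet ceiling
HEARTBEAT_PERIOD_S = 1.0     # one heartbeat per host per second


def heartbeat_bandwidth_bps(num_hosts: int) -> float:
    """Return an upper bound on aggregate heartbeat traffic, in bits per second."""
    return num_hosts * HEARTBEAT_MAX_BYTES * 8 / HEARTBEAT_PERIOD_S


# Eight hosts stay under 96 kbit/s in total.
print(heartbeat_bandwidth_bps(8))  # 96000.0
```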

Q. Do the Heartbeats Need to Go on a Back-End Network?

A. No, the heartbeat packets are always sent out on the same network interface on which data packets are received and sent. There is no need for an additional back-end network for the control (heartbeat) packets. In fact, in order to be able to detect connectivity failures on the load-balanced network adapter, NLB actually requires that heartbeats traverse the load-balanced network adapter.

Q. Is NLB a Kernel Component?

A. Yes, NLB has a kernel component called wlbs.sys. This is an intermediate NDIS driver. NLB also has user-mode components for management purposes.

Q. What are the Benefits of NLB Over Simple Round Robin Domain Name Service (RRDNS)?

A. NLB provides automatic recovery within 5 seconds and more even load balancing than RRDNS.

Q. How Does the NLB Load Balancing Algorithm Work?

A. NLB employs a fully distributed filtering algorithm to map incoming clients to the cluster hosts. This algorithm enables cluster hosts to independently and quickly make a load balancing decision for each incoming packet. It is optimized to deliver statistically even load balance for a large client population making numerous, relatively small requests, such as those typically made to Web servers. When the client population is small or the client connections produce widely varying loads on the server, the load-balancing algorithm is less effective. However, the simplicity and speed of NLB's algorithm allow it to deliver very high performance, including both high throughput and low response time, in a wide range of useful client/server applications.

If No Affinity is set, NLB load balances incoming client requests so as to direct a selected percentage of new requests to each cluster host; the load percentage for each host is set in the NLB Properties dialog for each port range to be load balanced. The algorithm does not dynamically respond to changes in the load on each cluster host (such as the CPU load or memory usage). However, the load distribution is modified when the cluster membership changes, and load percentages are renormalized accordingly.
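The renormalization of load percentages after a membership change can be illustrated with a short sketch. It assumes simple proportional rescaling of the configured weights; the article does not spell out the exact procedure, and the host names and function name are hypothetical.

```python
def renormalize(weights: dict[str, float]) -> dict[str, float]:
    """Rescale the configured load weights of the current members to sum to 100."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one host must have a positive weight")
    return {host: 100.0 * weight / total for host, weight in weights.items()}


# Configured percentages for three hypothetical hosts.
configured = {"hostA": 50.0, "hostB": 30.0, "hostC": 20.0}

# hostC leaves the cluster; the remaining weights are rescaled.
remaining = {h: w for h, w in configured.items() if h != "hostC"}
print(renormalize(remaining))  # {'hostA': 62.5, 'hostB': 37.5}
```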

When inspecting an arriving packet, all hosts simultaneously perform a mapping to quickly determine which host should handle the packet. The mapping uses a randomization function that calculates a host priority based on the client's IP address, port, and other information. The corresponding host forwards the packet up the network stack to TCP/IP, and the other cluster hosts discard it. The mapping remains unchanged unless the membership of cluster hosts changes, ensuring that a given client's IP address and port will always map to the same cluster host. However, the particular cluster host to which the client's IP address and port map cannot be predetermined, since the randomization function takes into account the current and past cluster membership to minimize remappings.
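The idea of distributed filtering, where every host evaluates the same deterministic function and exactly one host accepts the packet, can be sketched as follows. NLB's actual randomization function is not described here, so a SHA-256 hash with a modulo is used purely as a stand-in; unlike NLB, this stand-in does not consider past membership and does not minimize remappings. All names are hypothetical.

```python
import hashlib


def owner(client_ip: str, client_port: int, members: list[str]) -> str:
    """Return the member that owns this client under the stand-in mapping."""
    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(members)
    # Sorting gives every host the same ordering without any coordination.
    return sorted(members)[index]


def accepts(host: str, client_ip: str, client_port: int, members: list[str]) -> bool:
    """Each host runs this locally: pass the packet up the stack or discard it."""
    return owner(client_ip, client_port, members) == host


members = ["host1", "host2", "host3"]
decisions = {h: accepts(h, "10.0.0.7", 51234, members) for h in members}
print(decisions)  # exactly one value is True
```

Because the decision depends only on the packet and the shared membership list, no per-packet communication between hosts is needed.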

In general, the quality of load balance is statistically determined by the number of clients making requests. This behavior is analogous to dice throws, where the number of cluster hosts determines the number of sides of a die and the number of client requests corresponds to the number of throws. The load distribution improves with the number of client requests, just as the fraction of throws of an N-sided die resulting in a given face approaches 1/N with an increasing number of throws. As a rule of thumb, with client affinity set, there must be at least five times as many clients as cluster hosts to begin to observe even load balance.
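The dice analogy can be tried out with a small simulation. This is only an illustration of the statistical argument: it uses Python's pseudo-random generator as the "die" rather than NLB's mapping, and the function name and parameters are made up for the example.

```python
import random
from collections import Counter


def max_deviation(num_hosts: int, num_requests: int, seed: int = 0) -> float:
    """Throw an N-sided die num_requests times; return the largest gap
    between any face's observed share and the ideal share 1/N."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(num_hosts) for _ in range(num_requests))
    ideal = 1.0 / num_hosts
    return max(abs(counts[h] / num_requests - ideal) for h in range(num_hosts))


# The gap typically shrinks as the number of requests grows.
for requests in (20, 200, 20_000):
    print(requests, round(max_deviation(4, requests), 4))
```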

The Network Load Balancing client affinity settings are implemented by modifying the statistical mapping algorithm's input data. When client affinity is selected in the NLB Properties dialog, the client's port information is not used as part of the mapping. Hence, all requests from the same client always map to the same host within the cluster. Note that this constraint has no timeout value and persists until there is a change in cluster membership. When single affinity is selected, the mapping algorithm uses the client's full IP address. However, when Class C affinity is selected, the algorithm uses only the Class C portion (upper 24 bits) of the client's IP address. This ensures that all clients within the same Class C address space map to the same cluster host.
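A minimal sketch of how the affinity setting changes the input to the mapping is shown below. The function name, the affinity labels, and the tuple-shaped key are assumptions made for illustration; only the choice of which fields are used follows the description above.

```python
import ipaddress


def mapping_key(client_ip: str, client_port: int, affinity: str) -> tuple:
    """Select the fields that feed the mapping for a given affinity mode."""
    addr = int(ipaddress.IPv4Address(client_ip))
    if affinity == "none":
        return (addr, client_port)        # IP address and port
    if affinity == "single":
        return (addr, None)               # full IP address, port ignored
    if affinity == "class_c":
        return (addr & 0xFFFFFF00, None)  # upper 24 bits only, port ignored
    raise ValueError(f"unknown affinity mode: {affinity}")


# Two clients in the same Class C range produce the same key.
print(mapping_key("192.168.1.10", 5000, "class_c")
      == mapping_key("192.168.1.200", 6000, "class_c"))  # True
```

Clients that share a key are handled by the same host for as long as the cluster membership stays the same.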

Q. How Does NLB Cluster Convergence Work?

A. Convergence involves computing a new cluster membership list and recalculating the statistical mapping of client requests to the cluster hosts. There are two instances in which cluster traffic has to be remapped due to a change in cluster membership: when a host leaves the cluster and when a host joins the cluster. A convergence can also be initiated when several other events take place on a cluster, such as changing the load balancing weight on a host or implementing port rule changes.

Removing a Member. Two situations cause a host to leave the cluster or go offline. First, the host can fail, an event that is detected by the NLB heartbeat. Second, a system administrator can explicitly remove a host from the load-balancing cluster or stop NLB on that host.

The NLB heartbeat. NLB uses a heartbeat mechanism to determine the state of the hosts that are load balanced. Each heartbeat message is an Ethernet-level broadcast that goes to every load-balanced cluster host.

NLB assumes that a host is functioning normally within a cluster as long as it participates in the normal exchange of heartbeat messages between it and the other hosts. If the other hosts do not receive a message from a host for several periods of heartbeat exchange, they initiate convergence. The number of missed messages required to initiate convergence is set to five by default (but can be changed).

During convergence, NLB reduces the heartbeat period by one-half to expedite completion of the convergence process.
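The failure-detection rule described above can be sketched as a per-host counter of missed heartbeats. This is a simplified model, not the driver's logic: it assumes a one-second normal period and the default limit of five missed messages, and the class and attribute names are hypothetical.

```python
NORMAL_PERIOD_S = 1.0  # one heartbeat per second in normal operation
MISSED_LIMIT = 5       # default number of missed messages before convergence


class HeartbeatMonitor:
    """Tracks consecutive missed heartbeats from each peer host."""

    def __init__(self, peers):
        self.missed = {peer: 0 for peer in peers}
        self.converging = False

    @property
    def period_s(self) -> float:
        # The heartbeat period is halved while convergence is in progress.
        return NORMAL_PERIOD_S / 2 if self.converging else NORMAL_PERIOD_S

    def end_of_period(self, heard_from) -> bool:
        """Record one heartbeat period; return True once convergence is initiated."""
        for peer in self.missed:
            self.missed[peer] = 0 if peer in heard_from else self.missed[peer] + 1
        if any(count >= MISSED_LIMIT for count in self.missed.values()):
            self.converging = True
        return self.converging


monitor = HeartbeatMonitor(["hostB", "hostC"])
for _ in range(5):
    started = monitor.end_of_period({"hostB"})  # hostC has gone silent
print(started, monitor.period_s)  # True 0.5
```

Under these assumptions a silent host is noticed after roughly five seconds.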

Server Failure. When a cluster host fails, the client sessions associated with the host are dropped.

After convergence occurs, client connections to the failed host are remapped among the remaining cluster hosts, which are unaffected by the failure and continue to satisfy existing client requests during convergence. Convergence ends when all the hosts report a consistent view of the cluster membership and distribution map for several heartbeat periods.
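The completion condition can be sketched as a check for a run of consecutive heartbeat periods in which every host reports the same view. The article says only "several" periods, so `stable_periods` is a hypothetical parameter, and each view is modeled here as a set of member names rather than the real membership and distribution map.

```python
def convergence_complete_at(reports, stable_periods):
    """Return the index of the period in which convergence completes, or None.

    reports: one dict per heartbeat period, mapping host -> frozenset of the
             members that host currently believes are in the cluster.
    """
    streak = 0
    for index, views in enumerate(reports):
        consistent = len(set(views.values())) == 1
        streak = streak + 1 if consistent else 0
        if streak >= stable_periods:
            return index
    return None


old = frozenset({"hostA", "hostB", "hostC"})
new = frozenset({"hostA", "hostB"})
reports = [
    {"hostA": old, "hostB": new},  # hosts still disagree
    {"hostA": new, "hostB": new},
    {"hostA": new, "hostB": new},
    {"hostA": new, "hostB": new},
]
print(convergence_complete_at(reports, stable_periods=3))  # 3
```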

Q. Can NLB Balance Load Based on CPU/Memory Usage?

A. No, NLB does not respond to changes in the server load (such as CPU usage or memory utilization) or the health of an application.
