NLB Fundamentals

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

To create a successful Network Load Balancing design and to ensure that Network Load Balancing is correct for your solution, you need to know the fundamentals of how Network Load Balancing provides improved scalability and availability, and how Network Load Balancing compares with other strategies for providing scalability and availability.

How NLB Provides Improved Scalability and Availability

Network Load Balancing improves scalability and availability by distributing client traffic across the servers that you include in the Network Load Balancing cluster. Each cluster host (a server that is a member of the cluster) runs an instance of the applications supported by your cluster. Network Load Balancing transparently distributes client requests among the cluster hosts. Clients access your cluster by using one or more virtual IP addresses. From the perspective of the client, the cluster appears to be a single server that answers the client request.

As the scalability and availability requirements of your solution change, you can add or remove servers from the cluster as necessary. Network Load Balancing automatically distributes client traffic to take advantage of any servers that you add to the cluster. In addition, when you remove a server from the cluster, Network Load Balancing redistributes the client traffic among the remaining servers in the cluster.

As an example, assume that your organization has a Web application farm running Microsoft® Internet Information Services (IIS) version 6.0 that hosts your organization’s Internet presence. As seen in Figure 8.2, Network Load Balancing allows your individual Web application servers to service client requests from the Internet by distributing them across the cluster. On each of the servers, you install IIS 6.0 and Network Load Balancing. By combining the individual Web application servers into a Network Load Balancing cluster, you can load balance the requests to improve client response times and to provide improved fault tolerance in the event that one of the Web application servers fails.

Figure 8.2   Network Load Balancing Cluster in a Web Farm


Network Load Balancing automatically detects and recovers when an entire server fails or is manually disconnected from the network. However, Network Load Balancing is unaware of the applications and services running on the cluster, and it does not detect failed applications or services. To provide awareness of application or service failures, you need to add management software, such as Microsoft® Operations Manager (MOM), Microsoft® Application Center 2000, a third-party application, or software developed by your organization.

When your design requires fault tolerance for servers that support your Network Load Balancing cluster, such as servers running Microsoft® SQL Server™ 2000, include Microsoft server clusters. For example, you can improve the availability of the network database (SQLCLSTR-01 in Figure 8.2) by creating a two-node server cluster. For more information about server clusters, see "Designing and Deploying Server Clusters" in this book.

Network Load Balancing runs as an intermediate network driver in the Windows Server 2003 network architecture. Network Load Balancing is logically situated beneath higher-level application protocols, such as Hypertext Transfer Protocol (HTTP) and File Transfer Protocol (FTP), and above the network adapter drivers. Figure 8.3 illustrates the relationship of Network Load Balancing in the Windows Server 2003 network architecture.

Figure 8.3   Network Load Balancing in the Windows Server 2003 Network Architecture


To maximize throughput and to provide high availability in your solution, Network Load Balancing uses a distributed software architecture. A copy of the Network Load Balancing driver runs on each host in the cluster. The Network Load Balancing drivers allow all hosts in the cluster to concurrently receive incoming network traffic for the cluster.

On each host in the cluster, the driver acts as an intermediary between the network adapter driver and the TCP/IP stack. This allows a subset of the incoming network traffic to be received by the host. Network Load Balancing uses this filtering mechanism to distribute incoming client requests among the servers in the cluster.
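The filtering idea can be illustrated with a minimal sketch. It assumes that every host sees every incoming packet and independently applies the same deterministic rule to decide whether to accept it. The hash function, the host numbering, and the names owner_index and accepts are hypothetical and chosen for illustration only; they are not the algorithm that the Network Load Balancing driver actually uses.

```python
import hashlib


def owner_index(client_ip: str, host_count: int) -> int:
    """Map a client address to exactly one host index.

    Assumption: a SHA-256 hash of the client IP stands in for whatever
    rule the real driver applies; it is used here only because it is
    deterministic and spreads addresses fairly evenly.
    """
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % host_count


def accepts(host_index: int, client_ip: str, host_count: int) -> bool:
    """Filtering decision that each host makes on its own.

    Every host evaluates the same rule on the same packet, so no
    dispatcher has to forward traffic between hosts.
    """
    return owner_index(client_ip, host_count) == host_index


if __name__ == "__main__":
    # Documentation-range addresses used as hypothetical clients.
    clients = ["192.0.2.10", "192.0.2.11", "198.51.100.7", "203.0.113.42"]
    for ip in clients:
        takers = [h for h in range(3) if accepts(h, ip, 3)]
        print(ip, "is accepted by host", takers[0])
```

Because each host computes the same result from the same input, exactly one host accepts a given client and the others discard the packet without coordinating with one another.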

Network Load Balancing architecture maximizes throughput by using a common media access control (MAC) address to deliver incoming network traffic to all hosts in the cluster. As a result, there is no need to route incoming packets to the individual hosts in the cluster. Because filtering unwanted network traffic is faster than routing packets (which involves receiving, examining, rewriting, and resending), Network Load Balancing delivers higher network throughput than dispatcher-based software load balancing solutions. Also, as you add hosts to your Network Load Balancing cluster, the scalability grows proportionally, and any dependence on a particular host diminishes.

Because Network Load Balancing load balances client traffic across multiple servers, it provides higher availability in your solution. One or more cluster hosts can fail, but the cluster continues to service client requests as long as any cluster hosts are running.
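The availability property described above can be sketched as follows. This is a simplified model, not the actual Network Load Balancing implementation: the host names, the CRC-based mapping, and the functions pick_host and assign are assumptions made for illustration. The sketch does not attempt to model how existing connections are treated when cluster membership changes.

```python
import zlib
from typing import Dict, List


def pick_host(client_ip: str, live_hosts: List[str]) -> str:
    """Choose one running host for a client (hypothetical mapping rule)."""
    if not live_hosts:
        # With no running hosts, the cluster cannot service requests.
        raise RuntimeError("no cluster hosts are running")
    ordered = sorted(live_hosts)
    return ordered[zlib.crc32(client_ip.encode("ascii")) % len(ordered)]


def assign(clients: List[str], live_hosts: List[str]) -> Dict[str, str]:
    """Map every client to a host that is currently running."""
    return {ip: pick_host(ip, live_hosts) for ip in clients}


if __name__ == "__main__":
    clients = ["192.0.2.10", "192.0.2.11", "198.51.100.7", "203.0.113.42"]
    hosts = ["HOST-A", "HOST-B", "HOST-C"]  # hypothetical host names

    print("all hosts running:", assign(clients, hosts))

    hosts.remove("HOST-B")  # simulate a failed host
    print("after HOST-B fails:", assign(clients, hosts))
```

As long as the list of running hosts is not empty, every client is still mapped to a host, which mirrors the statement that the cluster keeps servicing requests while any cluster host is running.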

NLB and Round Robin DNS

Round robin Domain Name System (DNS) is a software method for distributing workload among multiple servers, but it does not shield clients from server outages. If one of the servers fails, round robin DNS continues sending client requests to that server until a network administrator detects the failure and removes the server from the DNS address list. Until then, clients that are directed to the failed server experience a service disruption.

In contrast, Network Load Balancing automatically detects servers that have been disconnected from the cluster and redistributes client requests to the remaining servers. Unlike round robin DNS, this prevents clients from sending requests to the failed servers.
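The difference between the two approaches can be illustrated with a small simulation. It is a deliberately simplified sketch: the addresses are hypothetical, DNS caching and time-to-live behavior are ignored, and the second function models only the fact that failed hosts are excluded from distribution, not the method that Network Load Balancing uses to spread traffic.

```python
from itertools import cycle
from typing import List, Set


def round_robin_dns(addresses: List[str], failed: Set[str], requests: int) -> int:
    """Count requests sent to a failed server when the address list
    is rotated without any knowledge of server health."""
    rotation = cycle(addresses)
    return sum(1 for _ in range(requests) if next(rotation) in failed)


def failure_aware(addresses: List[str], failed: Set[str], requests: int) -> int:
    """Count requests sent to a failed server when failed servers
    are dropped from the set that receives traffic."""
    live = [a for a in addresses if a not in failed]
    if not live:
        raise RuntimeError("no servers are running")
    rotation = cycle(live)
    return sum(1 for _ in range(requests) if next(rotation) in failed)


if __name__ == "__main__":
    servers = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]  # hypothetical addresses
    down = {"192.0.2.2"}
    print("round robin DNS, requests to failed server:",
          round_robin_dns(servers, down, 9))
    print("failure-aware distribution, requests to failed server:",
          failure_aware(servers, down, 9))
```

With three servers, one of which has failed, the rotating list sends every third request to the failed server, while the failure-aware distribution sends none.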

For more information about methods for improving availability and scalability, see "Planning for High Availability and Scalability" in this book.