Troubleshooting NLB

This topic provides guidance for diagnosing and resolving issues that you may encounter when you use Network Load Balancing (NLB) to load balance traffic among Forefront TMG array members.

This topic contains the following sections:

  • Understanding Cluster Operation Modes

  • Flowchart for troubleshooting NLB

  • Procedures for troubleshooting NLB

Understanding Cluster Operation Modes

NLB assumes that NLB interfaces are connected to a Layer 2 device by default. This configuration uses the MaskSourceMAC feature to ensure that the switch is unable to learn the original source MAC addresses of the NLB hosts.

In Unicast cluster operation mode, if the switch is unable to associate a MAC address with a particular port (because the address is masked), it sends the data to all switch ports, thereby ensuring that all NLB hosts process the traffic.

To identify NLB-enabled hosts when using switch or network tracing software, look for MAC addresses that start with 02. The masked MAC address is similar to the original MAC address, but with the first two fields replaced as follows: 02-[Host ID including zero]-[Original MAC address values]. For example, an NLB host with a host ID of 3 and a MAC address of 00-19-BB-3C-29-08 has a substituted source MAC address of 02-03-BB-3C-29-08.
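
The substitution rule above can be sketched in Python as follows. This is a minimal illustration, not part of Forefront TMG or NLB; the function name masked_source_mac is hypothetical, and the sketch assumes the host ID is written as a two-digit hexadecimal field (the example above shows only a single-digit host ID, so this detail is an assumption).

```python
def masked_source_mac(host_id: int, original_mac: str) -> str:
    """Return the masked source MAC described above for one NLB host.

    Assumption: the host ID is rendered as a two-digit hexadecimal field.
    """
    if not 0 <= host_id <= 0xFF:
        raise ValueError("host_id must fit in one MAC address field")
    fields = original_mac.replace(":", "-").split("-")
    if len(fields) != 6:
        raise ValueError("expected a MAC address with six fields")
    # Field 1 becomes 02, field 2 becomes the zero-padded host ID,
    # and the remaining four fields are kept from the original address.
    return "-".join(["02", f"{host_id:02X}"] + [f.upper() for f in fields[2:]])


print(masked_source_mac(3, "00-19-BB-3C-29-08"))  # 02-03-BB-3C-29-08
```

Reversing the lookup (finding which host sent a frame) only needs the second field of a source MAC address that starts with 02.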

In Multicast cluster operation mode, when the source MAC address is masked, the ARP response from an NLB host includes a substitute source MAC address in the Ethernet frame, but contains the correct NLB cluster MAC address in the ARP header. Some Layer 3 switches and routers are confused by this response and cannot perform the ARP mapping automatically. In this case, create a static ARP entry on the affected switch/router which maps the NLB virtual IP address to the NLB cluster MAC address.
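
When creating a static ARP entry, you need the NLB cluster MAC address that corresponds to the virtual IP address. The sketch below derives it from the VIP using the default Windows NLB convention (02-BF plus the four IP octets for unicast, 03-BF plus the four IP octets for multicast, and 01-00-5E-7F plus the last two IP octets for IGMP multicast). This convention is not stated in this topic, and the function name cluster_mac and the sample VIP are hypothetical, so treat the result as a starting point and confirm it against the cluster MAC address shown in Network Load Balancing Manager.

```python
import ipaddress


def cluster_mac(vip: str, mode: str) -> str:
    """Derive the default NLB cluster MAC address for a VIP (assumed convention)."""
    octets = ipaddress.IPv4Address(vip).packed  # the four bytes of the VIP
    if mode == "unicast":
        prefix, tail = [0x02, 0xBF], octets
    elif mode == "multicast":
        prefix, tail = [0x03, 0xBF], octets
    elif mode == "igmp":
        # IGMP multicast keeps only the last two octets of the VIP.
        prefix, tail = [0x01, 0x00, 0x5E, 0x7F], octets[2:]
    else:
        raise ValueError("mode must be 'unicast', 'multicast', or 'igmp'")
    return "-".join(f"{b:02X}" for b in [*prefix, *tail])


# Hypothetical VIP used only for illustration.
for mode in ("unicast", "multicast", "igmp"):
    print(mode, cluster_mac("192.168.1.100", mode))
```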

Layer 3 switches provide routing capabilities and must be configured to interoperate with Microsoft NLB. Create VLANs that operate in Layer 2 mode, and then connect NLB-enabled interfaces to ports that are associated with this Layer 2 mode VLAN. These VLANs then function as Layer 2 devices.

In unicast mode (the default Forefront TMG cluster operation mode) NLB induces switch flooding, by design, relaying packets sent to the VIP addresses to all cluster hosts. Switch flooding is part of the NLB strategy for obtaining the best throughput for any specific load of client requests. However, if the NLB interfaces share the switch with other (non-cluster) computers, switch flooding can add to the other computers' network overhead by including them in the flooding and consequently have a detrimental effect on network and/or server performance.

To solve this problem, isolate the NLB hosts so that the inherent switch flooding mechanism only affects cluster nodes, as opposed to other non-cluster computers on the same network (broadcast domain). This can be achieved by placing the NLB interfaces in their own LAN or virtual LAN, thereby creating an isolated network for NLB-related communications. Another option to avoid flooding non-cluster computers is to place a network hub between the switch and the NLB interfaces, and then disable the MaskSourceMAC feature.

Although multicast mode is often used to remove unicast mode limitations such as switch flooding, this operational mode can also cause switch flooding. As with unicast mode, this can be solved by placing the NLB interfaces into their own LAN or virtual LAN, thereby creating an isolated network across which to pass multicast traffic. If this is not possible, map the switch ports to which NLB-enabled interfaces are attached to the NLB cluster MAC address via static entries in the Content-Addressable Memory (CAM) table of the switch. This ensures that the switch is aware of which switch ports are NLB-enabled and eliminates the need to flood all ports.

If you have the network hardware to support it, use the Multicast with IGMP cluster operation mode and configure appropriate network devices to support IGMP snooping. This restrains multicast traffic in a switched network without the use of dedicated VLANs. By default, a LAN switch floods multicast traffic within the broadcast domain, consuming bandwidth if several multicast servers send streams to the same segment. With IGMP snooping, the switch intercepts IGMP messages from the host and updates its MAC table accordingly, eliminating the need to manually update the CAM entries.

Flowchart for troubleshooting NLB

This flowchart guides you through the steps that are required for troubleshooting NLB.

Flowchart for troubleshooting NLB

Procedures for troubleshooting NLB

The following procedures describe the steps you might need to take when you use the flowchart to troubleshoot NLB.

  • How to check if the array members are in synch with the CSS

  • How to check if NLB integration is enabled

  • How to check if the NLB status is green

  • How to check if cluster nodes converged successfully

  • How to check if NLB cluster is configured in Unicast mode

  • How to check if the traffic is blocked by Forefront TMG

  • How to check if traffic is distributed equally in the cluster

  • How to check if the traffic is blocked at the NLB driver level

  • How to check the NLB Hook Rules

  • How to check Layer 3 switch

  • How to check if the route from the array member to the CSS goes through a NIC with MAC address starting with 02-bf

  • How to run NLBClear

  • How to check if IP is configured correctly

How to check if the array members are in synch with the CSS

For more information, see Creating a standalone array.

To check if the array members are in synch with the CSS:

  1. In the Forefront TMG Management console, in the tree, click the Monitoring node.

  2. Click the Configuration tab.

  3. Verify the configuration status of the servers listed under Configuration Status.

How to check if NLB integration is enabled

To check if NLB integration is enabled:

  1. In the Forefront TMG Management console, in the tree, click the Networking node.

  2. In the Tasks pane, check if Configure Load Balanced Network appears.

  3. If not, click Enable Network Load Balancing Integration to configure NLB.

  4. Select the networks which will be load balanced. NLB cannot be configured for enterprise-level networks and for the following default array-level networks: Local Host, Quarantined VPN Clients, and VPN Clients.

  5. Select a network and click Configure NLB Settings.

  6. Define the Primary VIP, Subnet mask and Cluster operation mode (Unicast, Multicast or IGMP Multicast).

    Important

    The virtual IP address must belong to the network.

  7. Add additional VIPs if required.

  8. Repeat the steps above for each network.

  9. Click Next and then click Finish.
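
The requirement in step 6 that the virtual IP address belong to the network can be checked ahead of time. The following is a minimal sketch using the Python standard library; the function name and the sample addresses are hypothetical, and it assumes the load-balanced network is a single subnet described by one address and a subnet mask (a Forefront TMG network defined as several address ranges would need one check per range).

```python
import ipaddress


def vip_belongs_to_network(vip: str, network_address: str, subnet_mask: str) -> bool:
    """Return True if the VIP falls inside the subnet given by address and mask."""
    # strict=False lets the caller pass any host address from the subnet.
    network = ipaddress.IPv4Network(f"{network_address}/{subnet_mask}", strict=False)
    return ipaddress.IPv4Address(vip) in network


# Hypothetical addresses used only for illustration.
print(vip_belongs_to_network("10.0.0.50", "10.0.0.1", "255.255.255.0"))
print(vip_belongs_to_network("10.0.1.50", "10.0.0.1", "255.255.255.0"))
```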

How to check if the NLB status is green

For more information, see Monitoring alerts.

To check if the NLB status is green:

  1. In the Forefront TMG Management console, in the tree, click the Monitoring node.

  2. In the Alerts tab, check that an NLB Started alert appears for each array member.

  3. In the Network Load Balancing Manager (nlbmgr), check the status of the current node for all defined clusters. Repeat this for each member.

  4. Run nlb display to view the current state of the NLB cluster and hosts. Repeat this for each member.

How to check if cluster nodes converged successfully

To check if cluster nodes converged successfully:

  1. Run nlb query. If the node has converged with the cluster, a message appears indicating that the node entered a converging state.

  2. View the event log for entries generated by NLB.

How to check if NLB cluster is configured in Unicast mode

To check if NLB cluster is configured in Unicast mode:

  1. In the Forefront TMG Management console, in the tree, click the Networking node.

  2. In the Tasks pane, click Configure Load Balanced Network.

  3. Click Next.

  4. Select the required network, and then click Configure NLB Settings.

  5. In the Cluster operation mode area, check if Unicast is selected.

How to check if the traffic is blocked by Forefront TMG

To check if the traffic is blocked by Forefront TMG:

  1. In the Forefront TMG Management console, in the tree, click the Troubleshooting node.

  2. Click the Traffic Simulator tab.

  3. Run the Web access and Non-Web access simulation scenarios. If required, update the policy rules.

  4. Check the logs.

How to check if traffic is distributed equally in the cluster

To check if traffic is distributed equally in the cluster:

  1. In the Forefront TMG Management console, in the tree, click the Monitoring node.

  2. Click the Sessions tab.

  3. Verify that the sessions are distributed across all members.

  4. Click the Logs & Reports node.

  5. In the Logging tab, check the historical data by filtering the logs by Server Name for a period.

  6. Click the Troubleshooting node.

  7. In the Connectivity Test tab, ping each VIP.

  8. If all the traffic is sent to only one MAC address, it is possible that the switch has learnt the spoofed MAC address and is sending traffic to only one member. If so, reset the switch.
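
If you export the Server Name values from the sessions view or the filtered logs, the distribution check in the steps above can be summarized with a short script. This is a rough sketch only: the function names, the member names, and the input format (a plain list of server names, one per session or log entry) are assumptions for illustration, not a Forefront TMG export format.

```python
from collections import Counter


def session_share(server_names, members):
    """Return each array member's fraction of the observed sessions."""
    counts = Counter(server_names)
    total = sum(counts[m] for m in members)
    return {m: (counts[m] / total if total else 0.0) for m in members}


def idle_members(server_names, members):
    """Return the members that handled no sessions at all."""
    share = session_share(server_names, members)
    return [m for m in members if share[m] == 0.0]


# Hypothetical sample: TMG3 receives nothing, which would warrant the
# switch check described in step 8.
observed = ["TMG1", "TMG1", "TMG2", "TMG1"]
array_members = ["TMG1", "TMG2", "TMG3"]
print(session_share(observed, array_members))
print(idle_members(observed, array_members))
```

Short samples rarely split evenly, so an uneven share over a brief period is not by itself a sign of a fault; a member that receives no traffic over a longer period is the clearer signal.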

How to check if the traffic is blocked at the NLB driver level

Use NLB tracing to check for any dropped packets.

To check if the traffic is blocked at the NLB driver level:

  1. Use event tracing to understand why NLB has decided to drop or accept a given network packet. For more information, see:

    1. Network Load Balancing in R2: Using ETW Tracing

    2. Network Load Balancing Testing – NLB Tracing

    3. Network Load Balancing Testing – Tracing

How to check the NLB Hook Rules

Forefront TMG uses NLB Hook Rules to decide whether to use the source IP address or the destination IP address for affinity.

To check the NLB Hook Rules:

  1. Run netsh tmg show nlb on one of the Forefront TMG servers in the array.

  2. Check which NLB Hook Rules are defined.

How to check Layer 3 switch

To check Layer 3 switch:

  1. Check your switch documentation to identify whether the switch is a Layer 3 switch.

  2. Configure the switch or configure Forefront TMG for multicast:

    1. In the Forefront TMG Management console, in the tree, click the Networking node.

    2. In the Tasks pane, click Configure Load Balanced Network.

    3. Click Next.

    4. Select the required network, and then click Configure NLB Settings.

    5. Select Multicast as the Cluster operation mode.

How to check if the route from the array member to the CSS goes through a NIC with MAC address starting with 02-bf

To check if the route from the array member to the CSS goes through a NIC with MAC address starting with 02-bf:

  1. If the route from the array member to the CSS goes through a NIC with a MAC address starting with 02-bf, a separate NIC is not being used for intra-array communications.
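
The prefix test itself is simple to script once you have the MAC address of the NIC that carries the route to the CSS (for example, by comparing the output of route print with the physical addresses listed by ipconfig /all; this way of gathering the address is a suggestion, not a step from this topic). The function name below is hypothetical.

```python
def has_nlb_unicast_cluster_mac(mac: str) -> bool:
    """Return True if the MAC address starts with 02-bf."""
    # Accept either "-" or ":" as the field separator, in any letter case.
    normalized = mac.strip().lower().replace(":", "-")
    return normalized.startswith("02-bf-")


# Hypothetical addresses used only for illustration.
print(has_nlb_unicast_cluster_mac("02-BF-C0-A8-01-64"))
print(has_nlb_unicast_cluster_mac("00-19-BB-3C-29-08"))
```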

How to run NLBClear

Run this procedure on each array member.

To run NLBClear:

  1. Run net stop fwsrv.

  2. In the Forefront TMG Management console, in the tree, click the Troubleshooting node.

  3. In the Tasks pane, click Remove Network Load Balancing Configuration.

  4. Each time an NLB Clear is successfully performed, the Network Load Balancing configuration settings removed successfully alert appears in the Monitoring node.

How to check if IP is configured correctly

Make sure that the virtual IPs/subnet masks reflect what is configured in Forefront TMG.

To check if IP is configured correctly:

  1. In the Forefront TMG Management console, in the tree, click the Networking node.

  2. In the Tasks pane, click Configure Load Balanced Network.

  3. Click Next.

  4. Select the required network, and then click Configure NLB Settings.

  5. Check the VIPs and subnet mask.

Concepts

Forefront TMG Troubleshooting