Network state management and failure detection

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition use sophisticated algorithms to monitor the health of network interfaces in a server cluster. The following tables summarize the possible states of a network interface and of a network.

Network interface states

State Description

Failed

The node that owns the network interface is active; however, it cannot communicate through its network interface. The Cluster service has determined that the error is isolated to the network interface.

Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition use several methods to determine whether a failure is isolated to a network interface. First, the Cluster service uses a heartbeat to help detect network problems. Second, the operating system can receive hardware failure notifications from a network adapter that supports the Network Driver Interface Specification (NDIS). Network adapters that support newer NDIS features, such as media sense, improve cluster network state management. Finally, the Cluster service attempts to ping external hosts to estimate the scope of a network interface failure. These pings can cause a temporary increase in network traffic while the operating system investigates the extent of the failure.

Note

  • When a failed network adapter is replaced, the Cluster service treats the new adapter as the interface for the existing network if the new adapter is on the same subnet.

Unavailable

The network is disabled for cluster use, or the node associated with this network interface is down.

Unreachable

The node cannot communicate through this network interface. The reason for the failure is unknown.

Up

The network interface is active and can communicate with all other interfaces on the network (except those currently in Failed or Unavailable states). Up is the normal operational state.

Unknown

The state cannot be determined.
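The interface states in the table above can be read as the outcome of a small decision procedure. The following Python sketch is illustrative only and is not the Cluster service's actual algorithm. The `InterfaceEvidence` fields and the `classify_interface` function are hypothetical names, and the rule that a failed media check or a failed external ping marks the failure as isolated to the interface is an assumption made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InterfaceState(Enum):
    FAILED = "Failed"
    UNAVAILABLE = "Unavailable"
    UNREACHABLE = "Unreachable"
    UP = "Up"
    UNKNOWN = "Unknown"


@dataclass
class InterfaceEvidence:
    """Hypothetical inputs; None means the signal has not been observed."""
    node_active: Optional[bool] = None
    network_enabled_for_cluster: Optional[bool] = None
    heartbeats_ok: Optional[bool] = None
    media_connected: Optional[bool] = None    # e.g. reported through NDIS media sense
    external_ping_ok: Optional[bool] = None   # result of pinging external hosts


def classify_interface(e: InterfaceEvidence) -> InterfaceState:
    # Network disabled for cluster use, or the owning node is down.
    if e.network_enabled_for_cluster is False or e.node_active is False:
        return InterfaceState.UNAVAILABLE
    # Without node status or heartbeat results, the state cannot be determined.
    if e.node_active is None or e.heartbeats_ok is None:
        return InterfaceState.UNKNOWN
    if e.heartbeats_ok:
        return InterfaceState.UP
    # Heartbeats are missing on an active node. Assumption: a lost link or a
    # failed external ping isolates the problem to this interface.
    if e.media_connected is False or e.external_ping_ok is False:
        return InterfaceState.FAILED
    # Communication fails, but nothing points at the interface itself.
    return InterfaceState.UNREACHABLE
```

For example, an active node whose heartbeats are missing and whose adapter reports a disconnected cable is classified as Failed, while the same node with no further evidence is classified as Unreachable.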

Network states

State Description

Down

The network has failed. None of the active nodes in the cluster can communicate with one another using this network.

Unavailable

The network is disabled for cluster use, or all of the cluster nodes attached to this network are inactive.

Partitioned

The network has partially failed. Some active cluster nodes cannot communicate with one another over this network.

Up

The network is functioning normally. Up is the normal operational state.

Unknown

The state cannot be determined.
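The network states in the table above can be illustrated by checking which pairs of active nodes can communicate over the network. The following Python sketch is a simplified model under stated assumptions, not the Cluster service's implementation. The `classify_network` function and its parameters are hypothetical, and reporting Unknown when fewer than two active nodes are attached is a choice made for this example because no pair exists to test.

```python
from enum import Enum
from itertools import combinations
from typing import FrozenSet, Iterable, Set


class NetworkState(Enum):
    DOWN = "Down"
    UNAVAILABLE = "Unavailable"
    PARTITIONED = "Partitioned"
    UP = "Up"
    UNKNOWN = "Unknown"


def classify_network(enabled_for_cluster: bool,
                     active_nodes: Iterable[str],
                     reachable_pairs: Set[FrozenSet[str]]) -> NetworkState:
    """Derive a network state from pairwise reachability (hypothetical model)."""
    nodes = sorted(set(active_nodes))
    # Disabled for cluster use, or every attached node is inactive.
    if not enabled_for_cluster or not nodes:
        return NetworkState.UNAVAILABLE
    pairs = [frozenset(p) for p in combinations(nodes, 2)]
    # Assumption: with a single active node there is no pair to test.
    if not pairs:
        return NetworkState.UNKNOWN
    working = sum(1 for p in pairs if p in reachable_pairs)
    if working == len(pairs):
        return NetworkState.UP           # every active pair can communicate
    if working == 0:
        return NetworkState.DOWN         # no active pair can communicate
    return NetworkState.PARTITIONED      # only some pairs can communicate


# Usage: three active nodes, but only A and B can reach each other.
state = classify_network(True, ["A", "B", "C"], {frozenset({"A", "B"})})
print(state.value)
```

In the usage example, only one of the three node pairs can communicate, so the sketch reports the network as Partitioned.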