Understanding HPC Cluster Network Topologies

Microsoft HPC Pack supports five cluster topologies designed to meet a wide range of performance, scaling, and access requirements. The topologies are distinguished by how the nodes in the cluster are connected to each other and to the enterprise network.

In this topic:

- HPC cluster networks
- Cluster topologies
- Connecting broker nodes, workstation nodes, or unmanaged server nodes
- Additional references

HPC cluster networks

The following list describes the networks to which the nodes in an HPC cluster can be connected.

- Enterprise network: An organizational network connected to the head node and, in some cases, to other nodes in the cluster. The enterprise network is often the public or organization network that most users log on to in order to perform their work. All intra-cluster management and deployment traffic is carried on the enterprise network unless a private network (and optionally an application network) also connects the cluster nodes.
- Private network: A dedicated network that carries intra-cluster communication between nodes. If it exists, this network carries management and deployment traffic, and also carries application traffic if no application network exists.
- Application network: A dedicated network, preferably with high throughput and low latency. This network is normally used for parallel Message Passing Interface (MPI) application communication between cluster nodes.
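
These descriptions imply a simple fallback rule: application traffic prefers the application network, other intra-cluster traffic prefers the private network, and everything else falls back to the enterprise network. The following Python sketch makes that rule concrete; it is illustrative only, and the function and network names are not part of HPC Pack.

```python
# A minimal sketch of the traffic-routing rule described above.
# The names here are illustrative, not HPC Pack APIs.

def network_for(traffic: str, networks: frozenset) -> str:
    """Return the network that carries a given traffic class.

    traffic:  one of "management", "deployment", or "application"
    networks: the networks that the cluster nodes are connected to,
              e.g. frozenset({"enterprise", "private", "application"})
    """
    if traffic == "application" and "application" in networks:
        return "application"  # MPI traffic prefers the dedicated application network
    if "private" in networks:
        return "private"      # intra-cluster traffic prefers the private network
    return "enterprise"       # otherwise everything shares the enterprise network

# Example: with only enterprise and private networks (as in topology 2),
# application traffic is carried on the private network.
print(network_for("application", frozenset({"enterprise", "private"})))  # -> private
```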

Cluster topologies

The following are the five cluster network topologies that are supported by HPC Pack.

Topology 1: Compute nodes isolated on a private network

- Network traffic between compute nodes and resources on the enterprise network (such as databases and file servers) passes through the head node. Depending on the amount of traffic, this might impact cluster performance.
- The private network carries all communication between the head node and the compute nodes, including deployment, management, and application traffic (for example, MPI communication).
- A possible drawback is that compute nodes are not directly accessible by users on the enterprise network. This has implications when developing and debugging parallel applications for use on the cluster.

Topology 2: All nodes on enterprise and private networks

- Communication between nodes, including deployment, management, and application traffic, is carried on the private network.
- Traffic from the enterprise network can be routed directly to a compute node.
- This topology is well suited for developing and debugging applications because all compute nodes are connected to the enterprise network.
- This topology provides users on the enterprise network with direct access to compute nodes.
- This topology provides compute nodes with faster access to enterprise network resources.

Topology 3: Compute nodes isolated on private and application networks

- The private network carries deployment and management communication between the head node and the compute nodes. This offers more consistent cluster performance because intra-cluster communication is routed onto the private network while application communication is routed onto a separate, isolated network.
- MPI jobs running on the cluster use the high-performance application network for cross-node communication.
- A possible drawback is that compute nodes are not directly accessible by users on the enterprise network. This has implications when developing and debugging parallel applications for use on the cluster.

Topology 4: All nodes on enterprise, private, and application networks

- The private network carries deployment and management communication between the head node and the compute nodes.
- MPI jobs running on the cluster use the high-performance application network for cross-node communication.
- Traffic from the enterprise network can be routed directly to a compute node.
- This topology is well suited for developing and debugging applications because all compute nodes are connected to the enterprise network.
- This topology provides users on the enterprise network with direct access to compute nodes.
- This topology provides compute nodes with direct access to enterprise network resources.

Topology 5: All nodes only on an enterprise network

- All traffic, including enterprise, intra-cluster, and application traffic, is carried over the enterprise network.
- This topology provides users on the enterprise network with direct access to compute nodes.
- This topology provides compute nodes with direct access to enterprise network resources.
- This topology is well suited for developing and debugging applications because all cluster nodes are connected to the enterprise network.
- Because all nodes are connected only to the enterprise network, you cannot use the deployment tools in HPC Pack to deploy nodes from bare metal or over iSCSI.
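
One way to read these descriptions is that the topologies differ only in which networks the compute nodes join. The sketch below models that in Python and derives two of the properties noted above; the data structure and function names are hypothetical, not HPC Pack APIs.

```python
# Illustrative model of the five topologies: the set of networks that
# compute nodes are connected to in each one. Not an HPC Pack API.
TOPOLOGIES = {
    1: {"private"},                               # isolated on a private network
    2: {"enterprise", "private"},                 # enterprise and private networks
    3: {"private", "application"},                # isolated on private and application networks
    4: {"enterprise", "private", "application"},  # all three networks
    5: {"enterprise"},                            # only the enterprise network
}

def directly_accessible_from_enterprise(topology: int) -> bool:
    """Users on the enterprise network can reach compute nodes directly
    only if the compute nodes themselves join the enterprise network."""
    return "enterprise" in TOPOLOGIES[topology]

def bare_metal_deployment_possible(topology: int) -> bool:
    """Per the note on topology 5, this simplified model treats a private
    network as the prerequisite for bare-metal deployment."""
    return "private" in TOPOLOGIES[topology]

for t in sorted(TOPOLOGIES):
    print(t, directly_accessible_from_enterprise(t), bare_metal_deployment_possible(t))
```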

Connecting broker nodes, workstation nodes, or unmanaged server nodes

If you want to add broker nodes, workstation nodes, or unmanaged server nodes to your cluster, you must choose a network topology that works with the type of jobs and services that these nodes will be running. You must also connect the nodes to the HPC networks of the topology that you choose in such a way that they can communicate with all the nodes that they need to interact with.

Note

Unmanaged server nodes are supported starting in HPC Pack 2008 R2 with Service Pack 3 (SP3).

For example, broker nodes must be connected to the network where the clients that start service-oriented architecture (SOA) sessions are connected (usually the enterprise network) and to the network where the compute nodes that run the SOA services are connected (if different from the network where the clients are connected). In most cases, having a private network, and if possible also a high-throughput, low-latency application network, makes the work of broker nodes more efficient, because communication between the broker nodes and the compute nodes does not have to occur over the enterprise network, which in most organizations is a busy network.
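
As a rough sketch of this placement rule (with hypothetical names, not an HPC Pack API), a broker node placement is valid only if the broker shares a network with the SOA clients and with the compute nodes:

```python
def shares_network(a: set, b: set) -> bool:
    # Two nodes (or node groups) can communicate directly only if they
    # are connected to at least one network in common.
    return bool(a & b)

def broker_placement_ok(broker: set, clients: set, compute: set) -> bool:
    # A broker node must reach both the SOA clients and the compute nodes
    # that run the SOA services.
    return shares_network(broker, clients) and shares_network(broker, compute)

# Topology 2: the broker bridges clients on the enterprise network and
# compute nodes whose intra-cluster traffic uses the private network.
print(broker_placement_ok({"enterprise", "private"}, {"enterprise"}, {"private"}))  # True
```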

In the case of workstation nodes and unmanaged server nodes, topology 5 (all nodes only on an enterprise network) is the recommended topology, because in that topology the nodes (which are usually already connected to the enterprise network) can communicate with all other types of nodes in the cluster. Other topologies are supported for workstation nodes and unmanaged server nodes, but depending on the type and scope of the jobs that you want to run, there might be important limitations to consider. For example, if you choose topology 1 (compute nodes isolated on a private network) or topology 3 (compute nodes isolated on private and application networks), and the workstation nodes are connected only to the enterprise network, communication between compute nodes and workstation nodes will not be possible.
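
The workstation example is the same shared-network test. A minimal sketch, repeating the shares_network helper from the previous example:

```python
def shares_network(a: set, b: set) -> bool:
    return bool(a & b)  # direct communication requires at least one common network

workstation = {"enterprise"}  # workstation nodes usually stay on the enterprise network
compute_t1 = {"private"}      # topology 1: compute nodes isolated on a private network
compute_t5 = {"enterprise"}   # topology 5: all nodes only on the enterprise network

print(shares_network(workstation, compute_t1))  # False: no common network, no communication
print(shares_network(workstation, compute_t5))  # True: the recommended topology works
```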

For detailed information about network topologies, as well as information about advanced network configurations, see HPC Cluster Networking.

Additional references