Compute Cluster Network Requirements

Applies To: Windows Compute Cluster Server 2003

Supported Cluster Topologies

Windows Compute Cluster Server 2003 supports five cluster topologies. Each topology has implications for performance and accessibility. The topologies involve at least one and as many as three networks: Public, Private, and Message Passing Interface (MPI). In this documentation, these terms are defined as follows:

Public network

An organizational network connected to the head node and, optionally, to the cluster compute nodes. The public network is often the business or organizational network that most users log on to in order to perform their work. All intra-cluster management and deployment traffic is carried on the public network unless a private network (and optionally an MPI network) also connects the cluster nodes.

Private network

A dedicated network that carries intra-cluster communication between nodes. If it exists, this network carries management and deployment traffic, and it also carries MPI traffic if no MPI network exists.

MPI network

A dedicated network, preferably high bandwidth and low latency, that carries parallel MPI application communication between cluster nodes. This network, if it exists, is usually the highest bandwidth network of the three listed here.

If the jobs you intend to submit to the cluster do not use MPI libraries, no MPI traffic will be generated and an MPI network is not needed.
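
For reference, the following minimal sketch, written in C and assuming an MPI implementation such as MS-MPI with its mpi.h header is available on the nodes, shows the kind of point-to-point traffic that an MPI network carries. The message exchange itself is illustrative only.

    /* Minimal MPI sketch (assumes an MPI implementation such as MS-MPI;
       compile with mpicc or link against the MPI library you use). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank            */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes in the job */

        if (rank != 0) {
            /* Each non-root rank sends its rank number to rank 0. This
               point-to-point traffic travels over the MPI network if one
               exists; otherwise it uses the private or public network. */
            MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else {
            int i, value;
            for (i = 1; i < size; i++) {
                MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("Rank 0 received a message from rank %d\n", value);
            }
        }

        MPI_Finalize();
        return 0;
    }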

Examples of high speed networks include but are not limited to:

  • Gigabit Ethernet

  • 10 Gigabit Ethernet

  • Myrinet©

  • InfiniBand©

These networking technologies can be used for any of the three networks listed above, and they are recommended, but not required, for MPI and private cluster networks. If you plan to implement a high speed cluster network, we recommend that you make it the MPI network. If you do not plan to use an MPI network, implement your high speed network as the private network.

Note

If you plan to use a high speed network (such as InfiniBand© or Myrinet©) as a private network, note that some implementations of these technologies do not support the Pre-Boot Execution Environment (PXE). PXE is a prerequisite for using the Automated method of adding nodes to the cluster.

Choosing a Cluster Topology

The following graphic illustrates the five cluster topology scenarios supported by Windows Compute Cluster Server 2003.

Supported Network Topologies

Each topology offers varying degrees of cluster performance and accessibility from the public network.

In all topology scenarios except scenario 5, one or more separate dedicated networks carry intra-cluster and MPI traffic between the compute nodes and the cluster head node. This improves cluster performance and offloads that traffic from the public network.

In topology scenarios 2, 4, and 5, the compute nodes are connected directly to the public network. This facilitates application development and debugging. If your developers are writing applications for the cluster and want to run a debugger on each compute node (for example, when developing parallel applications), consider one of these topologies so that developers can access each compute node directly from the public network. Otherwise, the debugger must be run from a node on the private network; this can be the head node, a compute node, or a dedicated development machine.

The data requirements of the jobs run on the cluster also affect your choice of topology. If your users will submit jobs that read and write large amounts of data to resources on the public network, consider a topology in which the compute nodes have interfaces to the public network. If the jobs have more modest data requirements, topologies in which the compute nodes are isolated on a private network behind the head node are acceptable.

Considerations for each topology scenario

Scenario 1 - Compute nodes isolated on private network:

  • Improved cluster response because internal cluster network traffic is routed onto the private network.

  • Cluster compute nodes are not directly accessible by users on the public network.

Scenario 2 - All nodes on both public and private networks:

  • Improved cluster response because internal cluster network traffic is routed onto the private network.

  • Well-suited to applications development and debugging because all cluster nodes are connected to the public network.

Scenario 3 - Compute nodes isolated on private and MPI networks:

  • Improved cluster response because internal cluster network traffic is routed onto the private and MPI networks.

  • Cluster compute nodes are not directly accessible by users on the public network.

Scenario 4 - All nodes on public, private, and MPI networks:

  • Improved cluster response because internal cluster network traffic is routed onto the private and MPI networks.

  • Well-suited to applications development and debugging because all cluster nodes are connected to the public network.

Scenario 5 - All nodes only on public network:

  • Well-suited to applications development and debugging because all cluster nodes are connected to the public network.

  • The automated method of creating and adding compute nodes (which uses RIS) cannot be used with the topology described in scenario 5 because a private intra-cluster network is required for RIS.

Networking Services For Clusters

Whatever cluster topology you choose, consider how you will deliver DHCP and DNS services on each network to which the cluster nodes are connected. Each compute node network interface on the public or private network needs an IP address (statically or dynamically assigned) and may require name resolution.

In cluster topology scenarios where a private network exists and the cluster compute nodes are not connected to the public network, network address translation (NAT) is required on the cluster head node so that compute nodes can access domain controllers and services on the public network.

There are two ways to provide these services (NAT, DHCP, and DNS):

  • Enable Internet Connection Sharing (ICS) on the head node (Recommended)

  • Use Routing and Remote Access Service (RRAS) (Required if you have more than 254 compute nodes)

ICS and RRAS cannot be used at the same time. The relative advantages of ICS and RRAS are compared below:

ICS: Can be enabled through the Configure Cluster Topology wizard.
RRAS: Must be enabled manually (refer to the RRAS documentation).

ICS: Enables NAT between the private and public networks, providing mini DHCP and DNS proxy services on the head node.
RRAS: Enables NAT between the private and public networks, providing mini DHCP and DNS proxy services on the head node.

ICS: Automatically assigns the static IP address 192.168.0.1 (subnet mask 255.255.255.0) to the private network adapter on the head node.
RRAS: You must manually assign a static IP address to the private network adapter on the head node.

ICS: Supports up to 255 devices (class C network). Specifically, ICS distributes 192.168.0.x (subnet mask 255.255.255.0) addresses on the private network; this network address cannot be changed.
RRAS: Supports as many devices as you need (class B or class C networks). You can specify the network address that is distributed on the private network.

ICS: Integrated with Windows Firewall; the firewall is managed by Compute Cluster Server 2003.
RRAS: Does not support Windows Firewall (provides its own firewall).

Using ICS

Enabling ICS on the head node provides NAT, mini-DHCP, and mini-DNS services to compute nodes on a private network. Enabling ICS is recommended whenever a private network is present. If ICS is enabled on the head node, the private interface on the head node is set to 192.168.0.1 with a network mask of 255.255.255.0.
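
As a point of reference, the following minimal C sketch shows the subnet arithmetic behind that assignment: an address is on the ICS-managed private network when it matches 192.168.0.0 after the 255.255.255.0 mask is applied. The sample compute node address is illustrative only.

    /* Minimal sketch of the /24 subnet test for the ICS-assigned range.
       The sample compute node address below is illustrative only. */
    #include <stdio.h>

    /* Pack four octets into a 32-bit value (host byte order). */
    static unsigned long make_addr(int a, int b, int c, int d)
    {
        return ((unsigned long)a << 24) | ((unsigned long)b << 16) |
               ((unsigned long)c << 8)  |  (unsigned long)d;
    }

    int main(void)
    {
        unsigned long network = make_addr(192, 168, 0, 0);    /* ICS network address */
        unsigned long mask    = make_addr(255, 255, 255, 0);  /* ICS subnet mask     */
        unsigned long node    = make_addr(192, 168, 0, 37);   /* sample compute node */

        if ((node & mask) == (network & mask))
            printf("Address is inside the ICS-assigned 192.168.0.0/24 range.\n");
        else
            printf("Address is outside the ICS-assigned range.\n");

        return 0;
    }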

If ICS is not enabled as part of Compute Cluster Pack configuration, you will have to provide DHCP and DNS services to the private network interfaces of the compute nodes.

Note

ICS will not provide NAT, DHCP, or DNS services on an MPI network. If your implementation of an MPI network requires these services, provide them by whatever means is appropriate for the MPI network technology you've chosen. Some MPI network implementations require static IP addresses and might require name resolution services.

The Automated method of adding nodes to the cluster, which depends on Remote Installation Services (RIS), requires dynamically assigned IP addresses for the private network interfaces of the compute nodes. To provide DHCP services, you can:

  • Enable ICS on the head node during Compute Cluster Pack setup.

  • Configure the head node as a DHCP server.

  • Install a DHCP server on the private network.

Important

If the head node is a domain controller, you must install the ICS hotfix. ICS network address translation cannot be enabled if the ICS hotfix is not installed. To obtain the ICS hotfix, see Update for Windows Server x64 Edition (KB897616) (https://go.microsoft.com/fwlink/?linkid=55166).

For more information about ICS, see Description of Internet Connection Sharing (https://go.microsoft.com/fwlink/?LinkID=55073).

Active Directory

Active Directory® is a prerequisite for Windows Compute Cluster Server. Active Directory provides Kerberos for authentication and LDAP for directory services. Compute Cluster Server uses the Active Directory security infrastructure to manage user and resource security of compute nodes and jobs.
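
As an illustration of the directory role only, the following hedged C sketch uses the Windows LDAP client library (wldap32) to bind with the current user's credentials and count computer objects in a domain. The domain controller name and base DN are hypothetical placeholders; this sketch is not part of the Compute Cluster Server tooling.

    /* Hedged sketch: query an Active Directory domain over LDAP using the
       Windows LDAP client library (link with wldap32.lib). The domain
       controller name and base DN below are hypothetical placeholders. */
    #include <windows.h>
    #include <winldap.h>
    #include <stdio.h>

    int main(void)
    {
        LDAP *ld;
        LDAPMessage *result = NULL;
        ULONG rc;
        PCHAR attrs[] = { "cn", NULL };

        ld = ldap_init("dc1.hpc.example.com", LDAP_PORT);   /* hypothetical DC */
        if (ld == NULL) {
            fprintf(stderr, "ldap_init failed\n");
            return 1;
        }

        /* Bind with the credentials of the logged-on user (Negotiate/Kerberos). */
        rc = ldap_bind_s(ld, NULL, NULL, LDAP_AUTH_NEGOTIATE);
        if (rc != LDAP_SUCCESS) {
            fprintf(stderr, "ldap_bind_s failed: 0x%lx\n", rc);
            ldap_unbind(ld);
            return 1;
        }

        /* Count computer objects under a hypothetical base DN. */
        rc = ldap_search_s(ld, "DC=hpc,DC=example,DC=com", LDAP_SCOPE_SUBTREE,
                           "(objectClass=computer)", attrs, 0, &result);
        if (rc == LDAP_SUCCESS)
            printf("Found %lu computer objects in the domain.\n",
                   ldap_count_entries(ld, result));

        if (result != NULL)
            ldap_msgfree(result);
        ldap_unbind(ld);
        return 0;
    }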

Important

All cluster compute nodes must be in the same Active Directory domain as the head node.

It is strongly recommended that you join the cluster to an existing corporate Active Directory domain. If there is no existing corporate Active Directory domain, you are encouraged to set one up, particularly if you are installing multiple clusters or plan to add clusters in the future. If you are unable to set up a network-wide Active Directory domain, or if you want to test the cluster in an isolated state before joining it to an existing domain, the head node can be made a domain controller of a new domain in a new forest.

Because of the administrative complexities involved, the head node-as-domain controller solution is not recommended except for test purposes or unless there is no other option. For more information, see Using the Head Node as an Active Directory domain controller.

Note

We do not recommend moving existing clusters between domains.

Corporate Active Directory Domains with Internet Protocol Security

Clusters in corporate Active Directory domains with Internet Protocol Security (IPsec) may require special configuration. IPsec is a collection of IP protocols that provides secure communication over IP networks using cryptographic security services, supporting network-level peer authentication, data origin authentication, data integrity, data encryption, and replay protection. IPsec is integrated with Active Directory: IPsec policies can be assigned through Group Policy, which allows IPsec to be configured at the domain, site, or organizational unit level. Network administrators frequently use IPsec to isolate domains and servers from insecure communications.

When isolating a domain, administrators use IPsec, Group Policy settings, and membership in an Active Directory domain to require that computers in a domain accept only authenticated and secured communications from computers in that domain. This network policy isolates computers that are members of a domain from computers that are not members of that domain.

Similarly, when isolating servers, administrators use IPsec, Group Policy settings, and membership in an Active Directory domain to require that specific server computers that are members of a domain accept authenticated and secured communications only from other computers in that domain. This network policy isolates specific servers from computers that are not members of that domain. For example, to protect traffic to and from specific database servers, you would configure and deploy server isolation by using Group Policy settings that require secure network communication between client computers and the database servers.

Using Windows Compute Cluster Server 2003 with IPsec

When ICS is enabled on the head node, Windows Compute Cluster Server 2003 uses the IP address range 192.168.0.x for interfaces on the cluster private network. This range is often excluded by network administrators from the range of permitted originating addresses, since it is frequently used for home networks.

In cluster topologies 1 and 3, the compute nodes reside on a private network and all traffic to the public network passes through the head node (which has ICS enabled). If IPsec is implemented in your network environment and one of these cluster topologies is used, sessions initiated by the compute nodes will fail: although the head node and the resource computer on the public network mutually authenticate per policy, the compute node addresses fall within the excluded range. If this occurs, do one of the following:

  • Remove the 192.168.0.0/24 subnet from the exempt subnets in the IPsec policy.

  • Use another NAT technology like RRAS.

  • Create boundary servers.

Some administrators in a secured environment still need to allow untrusted computers limited access to a secure server, called a boundary server. A boundary server is an IPsec-enabled computer that accepts non-IPsec inbound connections. Using boundary servers permits flexibility in meeting users' needs while increasing network security, at the cost of increased security management for those servers.

Note

Because of the enhanced security measures in IPsec, we do not recommend setting up your head node as a domain controller on a corporate network that uses IPsec if you want to set up any form of access between the two domains.

Using the Head Node as an Active Directory domain controller

Important

Because of the potential administrative difficulties involved, we do not recommend using the head node as a domain controller unless isolation is required (for example, for test purposes) or no other option exists.

If there is a need to isolate the cluster from the existing corporate Active Directory domain, or if there is no existing corporate Active Directory domain, the head node can be made the domain controller of a new domain in a new forest. This usually involves the following tasks:

  • Providing name resolution

  • Providing user access from the corporate network

Providing name resolution

If the head node is the domain controller, you will usually want to set it up as the DNS server for all cluster networks that require name resolution. To do so, use the following steps (a sketch for verifying the result follows the list):

  1. On the head node interfaces, point the head node to itself as the DNS server using the loopback address 127.0.0.1.

  2. On the compute node interfaces, ensure that the head node IP address is used as the DNS address. On the private interface, this takes place automatically if ICS is enabled. If DHCP is used, you can distribute the head node IP address using the DHCP scope option DNS Servers. To do this, create your DHCP scope, set the DNS Servers scope option, and add your head node name and IP address to the accompanying form.

  3. If compute nodes need to access hosts on the public network, set a forwarder on the head node DNS server for access to the public DNS server.

  4. If possible, set an A-record on the public DNS server to allow access to the head node from the public network.
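
After completing these steps, you may want to confirm that name resolution works from a compute node. The following minimal C sketch (assuming the Winsock headers and a C compiler are available on the node) resolves a host name supplied on the command line; the default host name shown is a hypothetical placeholder.

    /* Minimal name-resolution check (Winsock; link with ws2_32.lib).
       The default host name below is a hypothetical placeholder. */
    #define _WIN32_WINNT 0x0501   /* Windows XP/Server 2003 or later, for getaddrinfo */
    #include <winsock2.h>
    #include <ws2tcpip.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        WSADATA wsa;
        struct addrinfo hints, *result = NULL, *ptr;
        const char *host = (argc > 1) ? argv[1] : "headnode.corp.example.com";

        if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0) {
            fprintf(stderr, "WSAStartup failed\n");
            return 1;
        }

        ZeroMemory(&hints, sizeof(hints));
        hints.ai_family   = AF_INET;        /* IPv4 only for this sketch */
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo(host, NULL, &hints, &result) != 0) {
            fprintf(stderr, "Name resolution failed for %s\n", host);
            WSACleanup();
            return 1;
        }

        for (ptr = result; ptr != NULL; ptr = ptr->ai_next) {
            struct sockaddr_in *addr = (struct sockaddr_in *)ptr->ai_addr;
            printf("%s resolves to %s\n", host, inet_ntoa(addr->sin_addr));
        }

        freeaddrinfo(result);
        WSACleanup();
        return 0;
    }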

Providing user access from the corporate network

Because no trust relationship exists between the corporate network and the cluster Active Directory domain, users on the corporate network must be given alternate credentials that are valid in the new domain in order to access the cluster. If desired, you can set up forest trusts to accept corporate credentials, though in that case you should probably reconsider whether the cluster needs to be in a separate domain at all. For more information about forest trusts, see Multiforest Deployments (https://go.microsoft.com/fwlink/?LinkID=66308) and Forest Trusts (https://go.microsoft.com/fwlink/?LinkID=66309) on the Microsoft Web site.

See Also

Concepts

Microsoft Networking Resources

Configuring Network Adapters on the Head Node

Enabling Remote Debugging