Cluster Installation

Applies To: Windows Server 2003 with SP1

Installation Overview

During the installation process, some nodes will be shut down while others are being installed. This step helps guarantee that data on disks attached to the shared bus is not lost or corrupted. This can happen when multiple nodes simultaneously try to write to a disk that is not protected by the cluster software. The default behavior for mounting new disks has changed in Windows Server 2003 from the behavior in the Microsoft® Windows® 2000 operating system. In Windows Server 2003, logical disks that are not on the same bus as the boot partition will not be automatically mounted and assigned a drive letter. This helps ensure that the server will not mount drives that could possibly belong to another server in a complex SAN environment. Although the drives will not be mounted, it is still recommended that you follow the procedures below to be certain the shared disks will not become corrupted.
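
If you want to confirm or control this mounting behavior explicitly, the mountvol tool in Windows Server 2003 can disable automatic mounting of new basic volumes. This is a brief, optional sketch and not a required step in this guide:

    REM Disable automatic mounting of new basic volumes (a precaution on nodes
    REM attached to shared storage); run from a command prompt on each node:
    mountvol /N

    REM To re-enable automatic mounting later:
    mountvol /E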

Use the table below to determine which nodes and storage devices should be turned on during each step.

The steps in this guide are for a two-node cluster. However, if you are installing a cluster with more than two nodes, the Node 2 column lists the required state of all other nodes.

| Step | Node 1 | Node 2 | Storage | Comments |
| --- | --- | --- | --- | --- |
| Setting up networks | On | On | Off | Verify that all storage devices on the shared bus are turned off. Turn on all nodes. |
| Setting up shared disks | On | Off | On | Shut down all nodes. Turn on the shared storage, then turn on the first node. |
| Verifying disk configuration | Off | On | On | Shut down the first node, turn on the second node. Repeat for nodes 3 and 4 if necessary. |
| Configuring the first node | On | Off | On | Turn off all nodes; turn on the first node. |
| Configuring the second node | On | On | On | Turn on the second node after the first node is successfully configured. Repeat for nodes 3 and 4 as necessary. |
| Post-installation | On | On | On | All nodes should be on. |

Several steps must be taken before configuring the Cluster service software. These steps are:

  • Installing the Windows Server 2003 Enterprise Edition or Windows Server 2003 Datacenter Edition operating system on each node.

  • Setting up networks.

  • Setting up disks.

Perform these steps on each cluster node before proceeding with the installation of cluster service on the first node.

To configure the cluster service, you must be logged on with an account that has administrative permissions to all nodes. Each node must be a member of the same domain. If you choose to make one of the nodes a domain controller, have another domain controller available on the same subnet to eliminate a single point of failure and enable maintenance on that node.

Installing the Windows Server 2003 Operating System

Refer to the documentation you received with the Windows Server 2003 operating system package to install the system on each node in the cluster.

Before configuring the cluster service, you must be logged on locally with a domain account that is a member of the local administrators group.

Note

The installation will fail if you attempt to join a node to a cluster that has a blank password for the local administrator account. For security reasons, Windows Server 2003 prohibits blank administrator passwords.
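
If the local Administrator account on a node currently has a blank password, set one before configuring the cluster. For example, from a command prompt (the command prompts for the new password):

    REM Set a non-blank password for the local Administrator account:
    net user Administrator *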

Setting Up Networks

Each cluster node requires at least two network adapters with two or more independent networks, to avoid a single point of failure. One is to connect to a public network, and one is to connect to a private network consisting of cluster nodes only. Servers with multiple network adapters are referred to as “multi-homed.” Because multi-homed servers can be problematic, it is critical that you follow the network configuration recommendations outlined in this document.

Microsoft requires that you have two Peripheral Component Interconnect (PCI) network adapters in each node to be certified on the Hardware Compatibility List (HCL) and supported by Microsoft Product Support Services. Configure one of the network adapters on your production network with a static IP address, and configure the other network adapter on a separate network with another static IP address on a different subnet for private cluster communication.

Communication between server cluster nodes is critical for smooth cluster operations. Therefore, you must ensure that the networks you use for cluster communication are configured optimally and follow all hardware compatibility list requirements.

The private network adapter is used for node-to-node communication, cluster status information, and cluster management. Each node’s public network adapter connects the cluster to the public network where clients reside and should be configured as a backup route for internal cluster communication. To do so, configure the roles of these networks as either "Internal Cluster Communications Only" or "All Communications" for the Cluster service.

Additionally, each cluster network must fail independently of all other cluster networks. This means that two cluster networks must not have a component in common that can cause both to fail simultaneously. For example, the use of a multiport network adapter to attach a node to two cluster networks would not satisfy this requirement in most cases because the ports are not independent.

To eliminate possible communication issues, remove all unnecessary network traffic from the network adapter that is set to Internal Cluster communications only (this adapter is also known as the heartbeat or private network adapter).

To verify that all network connections are correct, private network adapters must be on a different logical network from the public adapters. This can be accomplished by using a cross-over cable in a two-node configuration or a dedicated hub (not a smart hub) in a configuration of more than two nodes. Do not use a switch, smart hub, or any other routing device for the heartbeat network.

Note

Cluster heartbeats cannot be forwarded through a routing device because their Time to Live (TTL) is set to 1. The public network adapters must be connected only to the public network. If you have a virtual LAN, then the latency between the nodes must be less than 500 milliseconds (ms). Also, in Windows Server 2003, heartbeats in Server Clustering have been changed to multicast; therefore, you may want to make a MADCAP server available to assign the multicast addresses. For additional information, see the following article in the Microsoft Knowledge Base: 307962 Multicast Support Enabled for the Cluster Heartbeat.

Figure 1 below outlines a four-node cluster configuration.


Figure 1. Connections for a four-node cluster.

General Network Configuration

Note

This guide assumes that you are running the default Start menu. The steps may be slightly different if you are running the Classic Start menu. Also, which network adapter is private and which is public depends upon your wiring. For the purposes of this white paper, the first network adapter (Local Area Connection) is connected to the public network, and the second network adapter (Local Area Connection 2) is connected to the private cluster network. Your network may be different.

To rename the local area network icons

It is recommended that you change the names of the network connections for clarity. For example, you might want to change the name of Local Area Connection 2 to something like Private. Renaming will help you identify a network and correctly assign its role.

  1. Click Start, point to Control Panel, right-click Network Connections, and then click Open.

  2. Right-click the Local Area Connection 2 icon.

  3. Click Rename.

  4. Type Private in the textbox, and then press ENTER.

  5. Repeat steps 1 through 3, and then rename the public network adapter as Public.

    Figure 2. Renamed icons in the Network Connections window.

  6. The renamed icons should look like those in Figure 2 above. Close the Network Connections window. The new connection names will appear in Cluster Administrator and automatically replicate to all other cluster nodes as they are brought online.

To configure the binding order networks on all nodes

  1. Click Start, point to Control Panel, right-click Network Connections, and then click Open.

  2. On the Advanced menu, click Advanced Settings.

  3. In the Connections box, make sure that your bindings are in the following order, and then click OK:

    1. Public

    2. Private

    3. Remote Access Connections

Configuring the Private Network Adapter

  1. Right-click the network connection for your heartbeat adapter, and then click Properties.

  2. On the General tab, make sure that only the Internet Protocol (TCP/IP) check box is selected, as shown in Figure 3 below. Click to clear the check boxes for all other clients, services, and protocols.

    Figure 3. Click to select only the Internet Protocol check box in the Private Properties dialog box.

  3. If you have a network adapter that is capable of transmitting at multiple speeds, you should manually specify a speed and duplex mode. Do not use an auto-select setting for speed, because some adapters may drop packets while determining the speed. The speed for the network adapters must be hard set (manually set) to be the same on all nodes according to the card manufacturer's specification. If you are not sure of the supported speed of your card and connecting devices, Microsoft recommends you set all devices on that path to 10 megabits per second (Mbps) and Half Duplex, as shown in Figure 4 below. The amount of information that is traveling across the heartbeat network is small, but latency is critical for communication. This configuration will provide enough bandwidth for reliable communication. All network adapters in a cluster attached to the same network must be configured identically to use the same Duplex Mode, Link Speed, Flow Control, and so on. Contact your adapter's manufacturer for specific information about appropriate speed and duplex settings for your network adapters.

    Figure 4. Setting the speed and duplex for all adapters.

    Note

    Microsoft does not recommend that you use any type of fault-tolerant adapter or "teaming" for the heartbeat. If you require redundancy for your heartbeat connection, use multiple network adapters set to Internal Cluster Communications Only and define their network priority in the cluster configuration. Issues have been seen with early multi-port network adapters, so verify that your firmware and driver are at the most current revision if you use this technology. Contact your network adapter manufacturer for information about compatibility on a server cluster. For more information, see the following article in the Microsoft Knowledge Base: 254101 Network Adapter Teaming and Server Clustering.

  4. Click Internet Protocol (TCP/IP), and then click Properties.

  5. On the General tab, verify that you have selected a static IP address that is not on the same subnet or network as any other public network adapter. It is recommended that you put the private network adapter in one of the following private network ranges:

    • 10.0.0.0 through 10.255.255.255 (Class A)

    • 172.16.0.0 through 172.31.255.255 (Class B)

    • 192.168.0.0 through 192.168.255.255 (Class C)

    An example of a good IP address to use for the private adapters is 10.10.10.10 on node 1 and 10.10.10.11 on node 2 with a subnet mask of 255.0.0.0, as shown in Figure 5 below. Be sure that this is a completely different IP address scheme than the one used for the public network.

    Note

    For more information about valid IP addressing for a private network, see the following article in the Microsoft Knowledge Base: 142863 Valid IP Addressing for a Private Network.

    Figure 5. An example of an IP address to use for private adapters.

  6. Verify that there are no values defined in the Default Gateway box or under Use the Following DNS server addresses.

  7. Click the Advanced button.

  8. On the DNS tab, verify that no values are defined. Make sure that the Register this connection's addresses in DNS and Use this connection's DNS suffix in DNS registration check boxes are cleared.

  9. On the WINS tab, verify that there are no values defined. Click Disable NetBIOS over TCP/IP as shown in Figure 6 below.

    Figure 6. Verify that no values are defined on the WINS tab.

  10. When you close the dialog box, you may receive the following prompt: “This connection has an empty primary WINS address. Do you want to continue?” If you receive this prompt, click Yes.

  11. Complete steps 1 through 10 on all other nodes in the cluster with different static IP addresses.
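
If you prefer to script these settings, the netsh tool can apply most of them from a command prompt. The following is a sketch only, using the example addresses above and assuming the connection has been renamed Private; NetBIOS over TCP/IP still has to be disabled in the WINS properties as described in step 9.

    REM Static address for the heartbeat adapter on node 1 (no default gateway):
    netsh interface ip set address name="Private" source=static addr=10.10.10.10 mask=255.0.0.0

    REM No DNS servers, and do not register this connection in DNS:
    netsh interface ip set dns name="Private" source=static addr=none register=none

    REM No WINS servers:
    netsh interface ip set wins name="Private" source=static addr=none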

Configuring the Public Network Adapter

Note

If IP addresses are obtained via DHCP, access to cluster nodes may be unavailable if the DHCP server is inaccessible. For this reason, static IP addresses are required for all interfaces on a server cluster. Keep in mind that cluster service will only recognize one network interface per subnet. If you need assistance with TCP/IP addressing in Windows Server 2003, please see the Online Help.

Verifying Connectivity and Name Resolution

To verify that the private and public networks are communicating properly, ping all IP addresses from each node. You should be able to ping all IP addresses, locally and on the remote nodes.

To verify name resolution, ping each node from a client using the node’s machine name instead of its IP address. It should return only the IP address for the public network. You may also want to try a ping -a command to do a reverse lookup on the IP addresses.
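
The following is a minimal sketch of these checks from a command prompt, using the example private addresses from this guide and placeholders (not values from this guide) for the public addresses and node name:

    REM Private (heartbeat) addresses from the example above:
    ping 10.10.10.10
    ping 10.10.10.11

    REM Public addresses (substitute the static addresses you assigned):
    ping <public IP of node 1>
    ping <public IP of node 2>

    REM From a client, verify name resolution by machine name (should return
    REM only the public address); node1 is a placeholder for your node name:
    ping node1

    REM Reverse lookup on an address:
    ping -a <public IP of node 1>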

Verifying Domain Membership

All nodes in the cluster must be members of the same domain and be able to access a domain controller and a DNS server. They can be configured as member servers or domain controllers. You should have at least one domain controller on the same network segment as the cluster. For high availability, another domain controller should also be available to remove a single point of failure. In this guide, all nodes are configured as member servers.

There are instances where the nodes may be deployed in an environment where there are no pre-existing Microsoft® Windows NT® 4.0 domain controllers or Windows Server 2003 domain controllers. This scenario requires at least one of the cluster nodes to be configured as a domain controller. However, in a two-node server cluster, if one node is a domain controller, then the other node also must be a domain controller. In a four-node cluster implementation, it is not necessary to configure all four nodes as domain controllers. However, when following a “best practices” model and having at least one backup domain controller, at least one of the remaining three nodes should be configured as a domain controller. A cluster node must be promoted to a domain controller by using the DCPromo tool before the cluster service is configured.

The dependence of Windows Server 2003 on DNS further requires that every node that is a domain controller must also be a DNS server if another DNS server that supports dynamic updates and/or SRV records is not available (Active Directory-integrated zones are recommended).

The following issues should be considered when deploying cluster nodes as domain controllers:

  • If one cluster node in a two-node cluster is a domain controller, the other node must also be a domain controller.

  • There is overhead associated with running a domain controller. An idle domain controller can use anywhere between 130 and 140 MB of RAM, which includes having the Cluster service running. There is also increased network traffic from replication, because these domain controllers have to replicate with other domain controllers in the domain and across domains.

  • If the cluster nodes are the only domain controllers, then each must be a DNS server as well. They should point to each other for primary DNS resolution and to themselves for secondary resolution.

  • The first domain controller in the forest/domain will take on all Operations Master Roles. You can redistribute these roles to any node. However, if a node fails, the Operations Master Roles assumed by that node will be unavailable. Therefore, it is recommended that you do not run Operations Master Roles on any cluster node. This includes Schema Master, Domain Naming Master, Relative ID Master, PDC Emulator, and Infrastructure Master. These functions cannot be clustered for high availability with failover.

  • Clustering other applications such as Microsoft® SQL Server™ or Microsoft® Exchange Server in a scenario where the nodes are also domain controllers may not be optimal due to resource constraints. This configuration should be thoroughly tested in a lab environment before deployment.

Because of the complexity and overhead involved in making cluster nodes domain controllers, it is recommended that all nodes be member servers.

Setting Up a Cluster User Account

The Cluster service requires a domain user account that is a member of the Local Administrators group on each node, under which the Cluster service can run. Because setup requires a user name and password, this user account must be created before configuring the Cluster service. This user account should be dedicated only to running the Cluster service, and should not belong to an individual.

Note

The cluster service account does not need to be a member of the Domain Administrators group. For security reasons, granting domain administrator rights to the cluster service account is not recommended.

The cluster service account requires the following rights to function properly on all nodes in the cluster. The Cluster Configuration Wizard grants the following rights automatically:

  • Act as part of the operating system

  • Adjust memory quotas for a process

  • Back up files and directories

  • Increase scheduling priority

  • Log on as a service

  • Restore files and directories

For additional information, see the following article in the Microsoft Knowledge Base:

269229 How to Manually Re-Create the Cluster Service Account
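
The Cluster Configuration Wizard grants these rights for you, but if they ever need to be re-applied manually (for example, when re-creating the account as described in KB 269229), the ntrights.exe utility from the Windows Server 2003 Resource Kit Tools can assign them from a command line. This is a sketch only, assuming the account is named Cluster; DOMAIN is a placeholder for your domain name:

    REM Run on each node (or add -m \\nodename to target a remote node).
    REM Act as part of the operating system:
    ntrights -u DOMAIN\Cluster +r SeTcbPrivilege
    REM Adjust memory quotas for a process:
    ntrights -u DOMAIN\Cluster +r SeIncreaseQuotaPrivilege
    REM Back up files and directories:
    ntrights -u DOMAIN\Cluster +r SeBackupPrivilege
    REM Increase scheduling priority:
    ntrights -u DOMAIN\Cluster +r SeIncreaseBasePriorityPrivilege
    REM Log on as a service:
    ntrights -u DOMAIN\Cluster +r SeServiceLogonRight
    REM Restore files and directories:
    ntrights -u DOMAIN\Cluster +r SeRestorePrivilege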

To set up a cluster user account

  1. Click Start, point to All Programs, point to Administrative Tools, and then click Active Directory Users and Computers.

  2. Click the plus sign (+) to expand the domain if it is not already expanded.

  3. Right-click Users, point to New, and then click User.

  4. Type the cluster name, as shown in Figure 7 below, and then click Next.

    Figure 7. Type the cluster name.

  5. Set the password settings to User Cannot Change Password and Password Never Expires. Click Next, and then click Finish to create this user.

    Note

    If your administrative security policy does not allow the use of passwords that never expire, you must renew the password and update the cluster service configuration on each node before password expiration. For additional information, see the following article in the Microsoft Knowledge Base: 305813 How to Change the Cluster Service Account Password.

  6. Right-click Cluster in the left pane of the Active Directory Users and Computers snap-in, and then click Properties on the shortcut menu.

  7. Click Add Members to a Group.

  8. Click Administrators, and then click OK. This gives the new user account administrative privileges on this computer.

  9. Quit the Active Directory Users and Computers snap-in.
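
If you prefer the command line, the account can also be created and granted local administrative rights with the net commands. This is a sketch only, assuming the account name Cluster; DOMAIN and <password> are placeholders, and the "Password never expires" setting still has to be confirmed in the account's properties as described above:

    REM Create the domain account (run with sufficient domain rights);
    REM /passwordchg:no corresponds to "User Cannot Change Password":
    net user Cluster <password> /add /domain /passwordchg:no

    REM On each cluster node, add the account to the local Administrators group:
    net localgroup Administrators DOMAIN\Cluster /add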

Setting up Shared Disks

Important

To avoid corrupting the cluster disks, make sure that Windows Server 2003 and the Cluster service are installed, configured, and running on at least one node before you start an operating system on another node. It is critical to never have more than one node turned on until the Cluster service is configured. To proceed, turn off all nodes. Turn on the shared storage devices, and then turn on node 1.

About the Quorum Disk

The quorum disk is used to store cluster configuration database checkpoints and log files that help manage the cluster and maintain consistency. The following quorum disk procedures are recommended:

  • Create a logical drive with a minimum size of 50 MB to be used as a quorum disk; 500 MB is optimal for NTFS.

  • Dedicate a separate disk as a quorum resource.

    Important

    A quorum disk failure could cause the entire cluster to fail; therefore, it is strongly recommended that you use a volume on a hardware RAID array. Do not use the quorum disk for anything other than cluster management.

The quorum resource plays a crucial role in the operation of the cluster. In every cluster, a single resource is designated as the quorum resource. A quorum resource can be any Physical Disk resource with the following functionality:
  • It replicates the cluster registry to all other nodes in the server cluster. By default, the cluster registry is stored in the following location on each node: %SystemRoot%\Cluster\Clusdb. The cluster registry is then replicated to the MSCS\Chkxxx.tmp file on the quorum drive. These files are exact copies of each other. The MSCS\Quolog.log file is a transaction log that maintains a record of all changes to the checkpoint file. This means that nodes that were offline can have these changes appended when they rejoin the cluster.

  • If there is a loss of communication between cluster nodes, the challenge response protocol is initiated to prevent a "split brain" scenario. In this situation, the owner of the quorum disk resource becomes the only owner of the cluster and all the resources. The owner then makes the resources available for clients. When the node that owns the quorum disk functions incorrectly, the surviving nodes arbitrate to take ownership of the device. For additional information, see the following article in the Microsoft Knowledge Base: 309186 How the Cluster Service Takes Ownership of a Disk on the Shared Bus.

During the cluster service installation, you must provide the drive letter for the quorum disk. The letter Q is commonly used as a standard, and Q is used in the example.

To configure shared disks

  1. Make sure that only one node is turned on.

  2. Right-click My Computer, click Manage, and then expand Storage.

  3. Double-click Disk Management.

  4. If you connect a new drive, then it automatically starts the Write Signature and Upgrade Disk Wizard. If this happens, click Next to step through the wizard.

    Note

    The wizard automatically sets the disk to dynamic. To reset the disk to basic, right-click Disk n (where n specifies the disk that you are working with), and then click Revert to Basic Disk.

  5. Right-click unallocated disk space.

  6. Click New Partition.

  7. The New Partition Wizard begins. Click Next.

  8. Select the Primary Partition partition type. Click Next.

  9. The default is set to maximum size for the partition size. Click Next. (Multiple logical disks are recommended over multiple partitions on one disk.)

  10. Use the drop-down box to change the drive letter. Use a drive letter that is farther down the alphabet than the default enumerated letters. Commonly, the drive letter Q is used for the quorum disk, then R, S, and so on for the data disks. For additional information, see the following article in the Microsoft Knowledge Base:

    318534 Best Practices for Drive-Letter Assignments on a Server Cluster

    Note

    If you are planning on using volume mount points, do not assign a drive letter to the disk. For additional information, see the following article in the Microsoft Knowledge Base: 280297 How to Configure Volume Mount Points on a Clustered Server.

  11. Format the partition using NTFS. In the Volume Label box, type a name for the disk. For example, Drive Q, as shown in Figure 8 below. It is critical to assign drive labels for shared disks, because this can dramatically reduce troubleshooting time in the event of a disk recovery situation.

    Figure 8. It is critical to assign drive labels for shared disks.

If you are installing a 64-bit version of Windows Server 2003, verify that all disks are formatted as MBR. GUID Partition Table (GPT) disks are not supported as clustered disks. For additional information, see the following article in the Microsoft Knowledge Base:

284134 Server Clusters Do Not Support GPT Shared Disks

Verify that all shared disks are formatted as NTFS and designated as MBR Basic.
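
For reference, the partitioning and formatting steps above can also be scripted from a command prompt. This is a sketch only, assuming the quorum disk appears as disk 1 on the node and uses drive letter Q; Disk Management remains the method documented in this guide:

    REM Contents of a DiskPart script file (for example, quorum.txt):
    REM     select disk 1
    REM     create partition primary
    REM     assign letter=Q
    REM Run the script:
    diskpart /s quorum.txt

    REM Format the new partition as NTFS with a descriptive volume label:
    format Q: /FS:NTFS /V:DriveQ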

To verify disk access and functionality

  1. Start Windows Explorer.

  2. Right-click one of the shared disks (such as Drive Q:\), click New, and then click Text Document.

  3. Verify that you can successfully write to the disk and that the file was created.

  4. Select the file, and then press the Del key to delete it from the clustered disk.

  5. Repeat steps 1 through 4 for all clustered disks to verify they can be correctly accessed from the first node.

  6. Turn off the first node, turn on the second node, and repeat steps 1 through 4 to verify disk access and functionality. Assign drive letters to match the corresponding drive labels. Repeat again for any additional nodes. Verify that all nodes can read and write from the disks, turn off all nodes except the first one, and then continue with this white paper.
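
Before continuing, you can optionally repeat the write test from a command prompt on each node. This is a sketch, assuming drive Q:

    REM Create, display, and delete a small test file on the shared disk:
    echo cluster disk test > Q:\test.txt
    type Q:\test.txt
    del Q:\test.txt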