Step 12: Create Failover Clusters Using WCF Broker Nodes

Applies To: Windows HPC Server 2008 R2

For an HPC cluster running Windows HPC Server 2008 R2, you can make Windows Communication Foundation (WCF) broker nodes highly available by grouping them into one or more failover clusters, where each WCF broker node belongs to a specific failover cluster. Before starting the procedures in this topic, be sure you have completed the steps that are described in Step 10: Create WCF Broker Nodes Running Windows HPC Server 2008 R2 and Step 11: Set Up Shared Storage for WCF Broker Nodes.

Important

To create failover clusters using broker nodes, perform the following steps, which are described in this topic:

  1. Install the Failover Clustering feature

  2. Validate the failover cluster configuration

  3. Create a failover cluster

  4. Verify the configuration of the shared storage

  5. Configure networks for communication within the failover cluster

  6. Configure servers that can run head node services to recognize WCF broker nodes in failover clusters

  7. Configure high availability for the WCF broker node or nodes

Install the Failover Clustering feature

In this step, you ensure that the Failover Clustering feature has been installed on the WCF broker nodes that you will be placing together in a failover cluster.

To install the Failover Clustering feature on the WCF broker nodes

  1. Confirm that the server you are configuring is a WCF broker node that is not already in a failover cluster.

  2. If you recently installed Windows Server 2008 R2, the Initial Configuration Tasks interface is displayed. If this interface is displayed, under Customize This Server, click Add features. Then skip to step 4.

  3. If the Initial Configuration Tasks interface is not displayed and Server Manager is not running, click Start, click Administrative Tools, and then click Server Manager. (If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.)

    In Server Manager, under Features Summary, click Add Features.

  4. In the Add Features Wizard, click Failover Clustering, click Next, and then click Install.

  5. Follow the instructions in the wizard to complete the installation of the feature.

  6. Repeat the process for each WCF broker node that you want to include in the failover cluster.
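The feature can also be installed from Windows PowerShell instead of the wizard. The following is a minimal sketch, assuming an elevated Windows PowerShell prompt and the ServerManager module that ships with Windows Server 2008 R2. Run it locally on each WCF broker node.

```powershell
# Load the Server Manager cmdlets (included with Windows Server 2008 R2).
Import-Module ServerManager

# Install the Failover Clustering feature on the local server if it is missing.
$feature = Get-WindowsFeature -Name Failover-Clustering
if (-not $feature.Installed) {
    Add-WindowsFeature -Name Failover-Clustering
}

# Confirm the result.
Get-WindowsFeature -Name Failover-Clustering | Format-List Name, Installed
```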

Validate the failover cluster configuration

Before creating a failover cluster, we strongly recommend that you validate your configuration. Validation helps you confirm that the configuration of your servers, network, and storage meets a set of specific requirements for failover clusters.

To validate the failover cluster configuration

  1. On one of the newly installed WCF broker nodes, open the failover cluster snap-in. To do this, click Start, click Administrative Tools, and then click Failover Cluster Manager. (If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.)

  2. Confirm that Failover Cluster Manager is selected, and then in the center pane under Management, click Validate a Configuration.

  3. Follow the instructions in the wizard to specify a set of servers that you plan to place together in a failover cluster, and then specify the validation tests. Then run the tests. To fully validate your configuration, run all tests before you create a failover cluster. This may take several minutes. The storage tests are extensive, but they are worthwhile when you are setting up failover clustering for the first time.

  4. The Summary page appears after the tests run. To view Help topics that can help you interpret the results, click More about cluster validation tests.

  5. While still on the Summary page, click View Report and read the test results.

    Note

    To view the results of the tests after you close the wizard, see

    SystemRoot\Cluster\Reports\Validation Report date and time.html

    (where SystemRoot is the folder in which the operating system is installed, for example, C:\Windows).

  6. As necessary, make changes in the configuration, and then rerun the tests.
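Validation can also be started from Windows PowerShell. This is a sketch only, assuming the FailoverClusters module that is installed with the Failover Clustering feature. The server names are hypothetical placeholders.

```powershell
# Load the failover clustering cmdlets.
Import-Module FailoverClusters

# Hypothetical server names; replace them with your own WCF broker nodes.
$brokerNodes = "BROKER01", "BROKER02"

# Run the validation tests against the servers that will form the failover cluster.
# With no -Include or -Ignore arguments, Test-Cluster runs all applicable tests
# and produces a validation report.
Test-Cluster -Node $brokerNodes
```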

Create a failover cluster

To create a failover cluster, you run the Create Cluster Wizard from the Failover Cluster Manager snap-in on one of the WCF broker nodes. Use an account that has local administrator rights on all WCF broker nodes that you want to include in this failover cluster.

To run the Create Cluster Wizard

  1. Confirm that the servers you plan to include in the failover cluster are WCF broker nodes that are not already in a failover cluster.

  2. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Manager. (If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.)

  3. Confirm that Failover Cluster Manager is selected, and then in the center pane under Management, click Create a Cluster.

    Follow the instructions in the wizard to specify the following:

    • All servers that you want to include in the failover cluster (if the server that you are currently running will be in the cluster, be sure to include it in your list). Some checking occurs, to confirm that the servers can contact one another. If the checking fails, review and correct your network setup. (If you ran the validation tests described in Validate the failover cluster configuration, earlier in this topic, the checking probably will not fail.)

    • An access point (a network name and associated IP address information) for administering the cluster. To create an access point, choose a network name for the failover cluster, for example, Broker_Cluster1. This action creates a corresponding computer account (object) in Active Directory that uses this name, and it grants full control of that account to your account. The access point includes one or more IP addresses, which can be IPv6 addresses, IPv4 addresses supplied through DHCP, or static IPv4 addresses.

  4. After the wizard runs and the Summary page appears, to view a report of the tasks that the wizard performed, click View Report.
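The wizard steps above correspond roughly to the New-Cluster cmdlet. The following sketch assumes the FailoverClusters module. The server names and IP address are hypothetical, and Broker_Cluster1 is the example name used in this topic.

```powershell
Import-Module FailoverClusters

# Hypothetical values; replace them with your own.
$clusterName = "Broker_Cluster1"
$brokerNodes = "BROKER01", "BROKER02"
$staticIpv4  = "10.0.0.50"

# Create the failover cluster and its administrative access point.
# Leave out -StaticAddress if the address is to be supplied through DHCP.
New-Cluster -Name $clusterName -Node $brokerNodes -StaticAddress $staticIpv4

# List the nodes to confirm that every WCF broker node joined the cluster.
Get-ClusterNode -Cluster $clusterName
```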

Verify the configuration of the shared storage

In this step, you confirm that the shared storage is accessible to the failover cluster and that the failover cluster quorum is configured appropriately.

To verify the configuration of the shared storage

  1. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Manager. (If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.)

  2. In the console tree, if the failover cluster that you created is not displayed, right-click Failover Cluster Manager, click Manage a Cluster, and then select the appropriate cluster.

  3. In the console tree, expand the failover cluster that you created.

  4. Click Storage. You should see that the shared storage is added to the Storage container. If not, in the Action pane, click Add a Disk and add the appropriate disks. (If you do not see your disks in the resulting dialog box, rerun the storage validation tests.)

  5. If either of the following is true, you can skip the remaining steps in this procedure:

    • Your failover cluster has, or will have, an odd number of nodes

    • One of the disks has the label Disk Witness in Quorum

    (Figure: Failover Cluster Manager, Summary of Storage)

  6. If the failover cluster has an even number of nodes, ensure that the failover cluster quorum includes a disk witness. (Do not configure a disk witness if the failover cluster has an odd number of nodes.)

    1. In the console tree, right-click the fully qualified domain name of the failover cluster, click More Actions, and then click Configure Cluster Quorum Settings.

    2. In the Configure Cluster Quorum Wizard, if the Before You Begin page appears, click Next.

    3. In Select Quorum Configuration, view the selection. If Node and Disk Majority is already selected, click Next. Otherwise, select Node and Disk Majority, and then click Next. (If Node and Disk Majority is already selected, a disk witness was added to the quorum configuration during setup.)

    4. In Configure Storage Witness, if needed, expand the entries to see the drive letter or other information about a disk. Then ensure that the disk that is chosen is the one you want to use as the disk witness. You can use a relatively small disk for the disk witness, but no smaller than 512 MB.

    5. If you have made changes to the quorum configuration, follow the instructions to finish the wizard. Otherwise, click Cancel.
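The quorum rule in this procedure (a disk witness for an even number of nodes, none for an odd number) can be sketched in Windows PowerShell. Both function names are hypothetical helpers, not part of the FailoverClusters module, and the sketch assumes that module is installed.

```powershell
# Hypothetical helper: a disk witness is called for only when the node count is even.
function Test-DiskWitnessNeeded([int]$NodeCount) {
    return ($NodeCount % 2) -eq 0
}

# Hypothetical helper: lists the clustered disks, applies Node and Disk Majority
# when it is needed and not already set, and reports the quorum configuration.
function Confirm-BrokerClusterQuorum([string]$ClusterName, [string]$WitnessDisk) {
    Import-Module FailoverClusters

    # Show the disks that the failover cluster already manages.
    Get-ClusterResource -Cluster $ClusterName |
        Where-Object { $_.ResourceType.Name -eq "Physical Disk" }

    $nodeCount = @(Get-ClusterNode -Cluster $ClusterName).Count
    $quorum    = Get-ClusterQuorum -Cluster $ClusterName

    if ((Test-DiskWitnessNeeded $nodeCount) -and
        ("$($quorum.QuorumType)" -ne "NodeAndDiskMajority")) {
        # Even number of nodes and no disk witness yet: add the chosen disk.
        Set-ClusterQuorum -Cluster $ClusterName -NodeAndDiskMajority $WitnessDisk
    }

    # Report the resulting quorum configuration.
    Get-ClusterQuorum -Cluster $ClusterName
}
```

For example, Confirm-BrokerClusterQuorum -ClusterName "Broker_Cluster1" -WitnessDisk "Cluster Disk 1" checks the example cluster from this topic. "Cluster Disk 1" is an assumed disk resource name.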

Configure networks for communication within the failover cluster

In this step, you configure the networks for communication within the failover cluster to ensure that they provide client access.

To configure the networks for communication within the failover cluster

  1. In Failover Cluster Manager, to view the networks, in the console tree, select Networks. In the Cluster Use column, if a network that you want the failover cluster to use for communication with other computers is not listed as “Enabled,” right-click the network, click Properties, and then make sure that Allow cluster network communication on this network and Allow clients to connect through this network are selected.

    (Figure: Failover cluster Network Properties dialog box)

  2. Click OK twice.
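The same settings can be inspected and changed from Windows PowerShell through the Role property of each cluster network. This is a sketch, assuming the FailoverClusters module. The network name is a hypothetical placeholder.

```powershell
Import-Module FailoverClusters

# Role values: 0 = not used by the cluster, 1 = cluster communication only,
# 3 = cluster communication and client access.
Get-ClusterNetwork | Format-Table Name, Role, Address -AutoSize

# Hypothetical network name; allow both cluster and client communication on it.
(Get-ClusterNetwork "Cluster Network 1").Role = 3
```

A Role value of 3 corresponds to selecting both Allow cluster network communication on this network and Allow clients to connect through this network.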

Configure servers that can run head node services to recognize WCF broker nodes in failover clusters

In this procedure, you provide the head node with configuration information that allows it to recognize a WCF broker node that is running in a failover cluster instead of on a standalone server.

To configure servers that can run head node services to recognize WCF broker nodes in failover clusters

  1. On a server that is configured to run the head node (not a WCF broker node), click Start, right-click Notepad, and then click Run as administrator. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

  2. In Notepad, click File, click Open, and navigate to:

    SystemDrive\Program Files\Microsoft HPC Pack 2008 R2\Bin

    (where SystemDrive is the drive on which the operating system is installed, for example, C:\).

  3. In the Open dialog box, in the list box above the Open button, make sure that All Files is selected. Scroll down and select HpcSession.exe.config, which has a file type of CONFIG File. Click Open.

  4. In the file, find the <appSettings> element, which initially contains the following line:

    <add key="failoverClusterName" value="" />
    

    Modify that line to read as follows:

    <add key="failoverClusterName" value="<WCF_failover_cluster_name>" />
    

    Where <WCF_failover_cluster_name> is the name of the failover cluster (not an individual server name) that will run one or more WCF broker nodes. If you will have multiple failover clusters that run WCF broker nodes, specify the name of each failover cluster, separating the names with semicolons (;). Enclose the name or string of names with double quotation marks ("), as shown.

  5. Save and close the file.

  6. On the other server that can run the head node services, repeat steps 1-5 of this procedure.

  7. After the file has been modified on both of the servers that can run the head node services, on one of those servers, open Failover Cluster Manager.

  8. In the console tree, under Services and applications, click the clustered instance that runs the head node services.

  9. In the center pane, under Other Resources, right-click Microsoft HPC Session Launcher Service, and then click Take this resource offline. Then right-click Microsoft HPC Session Launcher Service again, and click Bring this resource online.
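The edit to HpcSession.exe.config can also be scripted. The following sketch defines a hypothetical helper function (not part of HPC Pack) that sets the failoverClusterName value and joins multiple failover cluster names with semicolons. It assumes the file contains the <appSettings> entry shown above and that you pass the full path to the file.

```powershell
# Hypothetical helper: writes the failover cluster name(s) into HpcSession.exe.config.
function Set-FailoverClusterName([string]$ConfigPath, [string[]]$ClusterNames) {
    $xml = New-Object System.Xml.XmlDocument
    $xml.PreserveWhitespace = $true
    $xml.Load($ConfigPath)

    # Locate the <add key="failoverClusterName" ... /> entry under <appSettings>.
    $node = $xml.SelectSingleNode("//appSettings/add[@key='failoverClusterName']")
    if ($node -eq $null) {
        throw "failoverClusterName key not found in $ConfigPath"
    }

    # Multiple failover clusters are separated with semicolons.
    $node.SetAttribute("value", ($ClusterNames -join ";"))
    $xml.Save($ConfigPath)
}
```

From an elevated Windows PowerShell prompt on each server that can run the head node services, you might call Set-FailoverClusterName "$env:SystemDrive\Program Files\Microsoft HPC Pack 2008 R2\Bin\HpcSession.exe.config" "Broker_Cluster1". Afterward, take the Microsoft HPC Session Launcher Service resource offline and bring it back online as described in the steps above.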

Configure high availability for the WCF broker node or nodes

In this procedure, you configure clustered instances of the WCF broker nodes. You use several wizards and commands to do this. The overall purpose is to provide the information that the failover cluster needs in order to start, stop, and fail over the WCF broker node or nodes in a coordinated way.

To configure high availability for the WCF broker node or nodes

  1. On a server that will run a WCF broker node (not the head node), click Start, click Administrative Tools, and then click Services. (If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.)

  2. In the list of services, right-click HPC Broker Service, click Properties, click the Stop button and then, in the list for Startup type, click Disabled. Click OK.

  3. Repeat the previous two steps on the other servers that will run a WCF broker node in the same failover cluster.

  4. In the Failover Cluster Manager snap-in, if the cluster that you want to configure is not displayed, in the console tree, right-click Failover Cluster Manager, click Manage a Cluster, and then select or specify the cluster that you want to configure.

  5. If the console tree is collapsed, expand the tree under the cluster that you want to configure.

  6. Click Services and Applications and then, under Actions (on the right), click Configure a Service or Application.

  7. If the Before You Begin page of the High Availability Wizard appears, click Next.

  8. On the Select Service or Application page, click Message Queuing, and then click Next.

  9. Follow the instructions in the wizard to specify the following details:

    • A name for this clustered WCF broker node (also known as a "resource group" in the failover cluster). This name will be registered in DNS and associated with the IP address for this clustered WCF broker node.

    • Any IP address information that is not automatically supplied by your DHCP settings—for example, a static IPv4 address for this clustered WCF broker node.

    • The storage volume that this clustered instance of a WCF broker node will use for its Message Queuing files. (You can also add a volume to this clustered instance later, if needed.)

      Important

      When you reach the Confirmation page of the wizard, carefully review the IP address information. All the networks that you expect to see (the networks that you want to use for communication between the WCF broker node and other nodes or clients) should be listed. If a network is missing from the list, click Cancel and review Configure networks for communication within the failover cluster, earlier in this topic.

  10. After the wizard runs and the Summary page appears, if you want to view a report of the tasks that the wizard performed, click View Report.

  11. Under Services and Applications, click the clustered instance of Message Queuing that you just created.

  12. Under Actions (on the right), click Add a resource, and then click 2 - Generic Application.

  13. In the Command line text box, type or paste the following command:

    %SystemDrive%\Program Files\Microsoft HPC Pack 2008 R2\Bin\HpcBroker.exe

  14. In the Parameters text box, type or paste the following parameter:

    -failover

    (Figure: New Resource Wizard in a failover cluster)

    When HpcBroker.exe is configured in this way with the -failover parameter, it runs as an application within the failover cluster, rather than running as a service on an individual server.

  15. Complete the wizard.

  16. In the center pane, ensure that the listing for the clustered WCF broker node is expanded.

  17. Right-click the Generic Application resource that you added (be sure to click the resource, not the clustered WCF broker node that contains the resource), and then click Properties.

  18. Click the Dependencies tab, and then add the Message Queuing resource (the resource with a name that starts with MSMQ) to the list. Click Apply.

    This ensures that the Message Queuing resource will always be started before the Generic Application resource is started.

  19. Click the General tab, and select the check box for Use Network Name for computer name. Then click OK.

  20. In the console tree, select the clustered WCF broker node (not an individual resource), and then under Actions (on the right), click Show Dependency Report.

    Scroll down to the dependency diagram and review the dependencies to ensure that the Generic Application resource is shown as being dependent on the Message Queuing resource (which in turn has other dependencies). Close the dependency report.

  21. To open a Windows PowerShell Command Prompt window, click Start, click Administrative Tools, and then click Windows PowerShell Modules. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Yes.

  22. At the Windows PowerShell command prompt, run the following command to view the properties of the clustered instance of the WCF broker node:

    # Show every property of the clustered instance (cluster group).
    Get-ClusterGroup "<Clustered_WCF_Broker_Name>" | Format-List *

    Where <Clustered_WCF_Broker_Name> is the name you gave to the clustered instance of the WCF broker node. In the resulting list, notice that the value for the AntiAffinityClassNames property is null {} (unless you previously set this value).

  23. At the Windows PowerShell command prompt, run the following series of commands:

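    # Build the string collection that holds the anti-affinity class name.
    $s = New-Object System.Collections.Specialized.StringCollection
    # Add() returns the index of the new item; discard it to keep the output clean.
    [void]$s.Add("HPC Server Broker Launcher")
    # Assign the collection to the clustered instance, and then confirm the change.
    (Get-ClusterGroup "<Clustered_WCF_Broker_Name>").AntiAffinityClassNames = $s
    Get-ClusterGroup "<Clustered_WCF_Broker_Name>" | Format-List *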

    Where <Clustered_WCF_Broker_Name> is the name you gave to the clustered instance of the WCF broker node. When you give each clustered instance the same value for the AntiAffinityClassNames property (in this case, "HPC Server Broker Launcher"), it causes each clustered instance to be started on a different failover cluster node, where possible.

  24. In Failover Cluster Manager, right-click the new clustered WCF broker node (not an individual resource), and then click Bring this service or application online.

  25. For each active node that you want to have in this failover cluster, repeat this procedure to create another clustered instance of Message Queuing with the correct resources, resource dependencies, and properties.

    Leave at least one passive node in the failover cluster. In other words, do not configure as many clustered instances as there are servers in the failover cluster. Otherwise, failover cannot occur.

  26. Open the HPC Cluster Manager snap-in and click Node Management.

  27. In the Navigation Pane, under Nodes, click By Group, and then click WCFBrokerNodes. Right-click each of the new WCF broker nodes, and then click Bring Online.
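When a failover cluster hosts several clustered instances, the anti-affinity setting and the passive-node rule from this procedure can be applied in one pass. The following is a sketch under stated assumptions: both function names are hypothetical helpers, the FailoverClusters module is installed, and each clustered instance has already been created with its Generic Application resource and dependencies.

```powershell
# Hypothetical helper: failover needs at least one passive node, so the number of
# clustered instances must be smaller than the number of failover cluster nodes.
function Test-PassiveNodeAvailable([int]$NodeCount, [int]$InstanceCount) {
    return $InstanceCount -lt $NodeCount
}

# Hypothetical helper: gives every listed clustered instance the same
# AntiAffinityClassNames value and then brings it online.
function Start-ClusteredBrokers([string]$ClusterName, [string[]]$GroupNames) {
    Import-Module FailoverClusters

    $nodeCount = @(Get-ClusterNode -Cluster $ClusterName).Count
    if (-not (Test-PassiveNodeAvailable $nodeCount $GroupNames.Count)) {
        throw "Leave at least one passive node: $($GroupNames.Count) instances, $nodeCount nodes."
    }

    foreach ($name in $GroupNames) {
        $classes = New-Object System.Collections.Specialized.StringCollection
        [void]$classes.Add("HPC Server Broker Launcher")
        (Get-ClusterGroup -Name $name -Cluster $ClusterName).AntiAffinityClassNames = $classes
        Start-ClusterGroup -Name $name -Cluster $ClusterName
    }
}
```

For example, Start-ClusteredBrokers "Broker_Cluster1" @("Broker1", "Broker2") would configure two clustered instances with the assumed names Broker1 and Broker2, provided that the failover cluster has at least three nodes.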

Important

For each of the failover clusters that you want to create using WCF broker nodes, repeat all the procedures in this topic.

Additional references

Step 13: Test Failover on WCF Broker Nodes

Performing Maintenance on WCF Broker Nodes in a Failover Cluster with Windows HPC Server 2008 R2