6 Administer the Hybrid Operating System Cluster Prototype

Applies To: Windows HPC Server 2008

6.1 HOSC setup checking

Cluster checking is done as if there were two independent clusters: the fact that an HOSC is used changes nothing at this level, so the usual cluster diagnostic tests should be used.

For HPCS, this means that the basic services and connectivity tests should be run first, followed by the automated diagnostic tests of the “cluster management” MMC.

For XBAS, sanity checks can be done with basic Linux commands (ping, pdsh, etc.) and monitoring tools such as Nagios (see [23] and [24] for details).
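
For example, a quick sanity check from the XBAS management node could look like the following (a minimal sketch using only standard commands; hostnames are those of our prototype):

[xbas0:root] pdsh -w xbas[1-4] date
[xbas0:root] ping -c 1 xbas1

The first command verifies that all XBAS compute nodes answer over ssh; the second checks low-level network connectivity to a single node.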

6.2 Remote reboot command

A reboot command can be sent remotely to compute nodes by the management nodes.

The HPCS head node can send a reboot command to its HPCS compute nodes only (soft reboot) with “clusrun”. For example:

C:\> clusrun /nodes:hpcs1,hpcs2 shutdown /r /f /t 5 /d p:2:4

Use “clusrun /all” for rebooting all HPCS compute nodes (the head node should not be declared as a compute node; otherwise this command would reboot it too).

The XBAS management node can send a reboot command to its XBAS compute nodes only (soft reboot) with pdsh. For example:

[xbas0:root] pdsh -w xbas[1-4] reboot

The XBAS management node can also reboot any compute node (HPCS or XBAS) with the NovaScale control “nsctrl” command (hard reboot). For example:

[xbas0:root] nsctrl reset xbas[1-4]
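
After sending a reboot command, you may want to wait until a node is reachable again before going on. Polling with standard tools is enough; for example, from the XBAS MN (a minimal sketch, assuming node xbas1 was rebooted):

# wait until xbas1 answers ping again, polling every 5 seconds
until ping -c 1 -W 2 xbas1 >/dev/null 2>&1; do
    sleep 5
done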

6.3 Switch a compute node OS type from XBAS to HPCS

To switch a compute node OS from XBAS to HPCS, run the from_XBAS_to_HPCS.sh command on the XBAS management node (you must be logged in as “root”). See Appendix D.2.3 for details of this command's implementation. For example, to switch the OS type of node xbas2, type:

[xbas0:root] from_XBAS_to_HPCS.sh xbas2

The compute node is then automatically rebooted with the HPCS OS type.
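
The implementation itself is given in Appendix D.2.3. As a rough sketch, the script has to chain the same building blocks used elsewhere in this chapter; the switch_dhcp_host call and the dhcpd restart are described in Section 6.4, while the argument handling and the final reboot step shown here are assumptions:

#!/bin/bash
# Hypothetical sketch of from_XBAS_to_HPCS.sh (the real implementation is in
# Appendix D.2.3); argument handling and the reboot step are assumptions.
node=$1                      # XBAS hostname of the target node, e.g. xbas2
switch_dhcp_host "$node"     # switch the node's DHCP entry to its HPCS identity
service dhcpd restart        # activate the new DHCP configuration
ssh "$node" reboot           # soft reboot: the node comes back up running HPCS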

6.4 Switch a compute node OS type from HPCS to XBAS

Without sshd on the HPCS compute nodes

To switch a compute node OS from HPCS to XBAS, first execute the switch_dhcp_host command on the XBAS management node and restart the DHCP service. This can be done locally on the XBAS MN console, or remotely from the HPCS HN using a secure shell client (e.g., PuTTY or OpenSSH). Type:

[xbas0:root] switch_dhcp_host hpcs2
[xbas0:root] service dhcpd restart

Then take the node offline in the MMC and type the from_HPCS_to_XBAS.bat command in a “command prompt” window on the HPCS head node. See Appendix D.1.3 for details of this command's implementation. For example, to switch the OS type of node hpcs2, type:

C:\> from_HPCS_to_XBAS.bat hpcs2

The compute node is then automatically rebooted with the XBAS OS type.

With sshd on the HPCS compute nodes

If you installed an SSH server daemon (e.g., FreeSSHd) on the HPCS CNs, then you can instead type the following command on the XBAS management node. It executes all the commands listed in the previous section from the XBAS MN, without having to log on to the HPCS HN. Type:

[xbas0:root] from_HPCS_to_XBAS.sh hpcs2

The compute node is then automatically rebooted with the XBAS OS type.

This script was mainly implemented for use with a meta-scheduler, since it is otherwise not recommended to switch the OS type of an HPCS CN by sending a command from the XBAS MN (see Section 4.3).
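
As with the opposite direction, the real implementation is given in the appendices; a plausible sketch of what from_HPCS_to_XBAS.sh chains together is shown below (the remote Windows shutdown call is an assumption and requires the ssh daemon mentioned above):

#!/bin/bash
# Hypothetical sketch of from_HPCS_to_XBAS.sh; switch_dhcp_host and the dhcpd
# restart are the commands of this section, the remote shutdown is an assumption.
node=$1                            # HPCS hostname of the target node, e.g. hpcs2
switch_dhcp_host "$node"           # switch the node's DHCP entry back to XBAS
service dhcpd restart              # activate the new DHCP configuration
ssh "$node" "shutdown /r /f /t 5"  # soft reboot: the node comes back up running XBAS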

6.5 Re-deploy an OS

The goal is to be able to re-deploy an OS on an HOSC without impacting the other OS that is already installed. Do not forget to save your MBR, since it can be overwritten during the installation phase (see Appendix C.2).

For re-deploying XBAS compute nodes, the ksis tools cannot be used (they would erase the existing Windows partitions); the “preparenfs” command is the only usable tool. The partition declarations in the kickstart file should then be edited so that existing partitions are reused rather than removed or recreated. These modifications differ slightly from those done for the first install. Suppose the existing partitions are those created with the kickstart file shown as an example in Appendix D.2.1:

Device      Mount point   Size    File system   OS
/dev/sda1   /boot         100MB   ext3          (Linux)
/dev/sda2   /             50GB    ext3          (Linux)
/dev/sda3   SWAP          16GB                  (Linux)
/dev/sda4   C:\           50GB    ntfs          (Windows)

Then the new kickstart file used for re-deploying an XBAS compute node should include the lines below:

/release/ks/kickstart.<identifier>
…
part /boot --fstype="ext3" --onpart sda1
part /     --fstype="ext3" --onpart sda2 
part swap  --noformat      --onpart sda3 
… 

In the PXE file stored on the MN (e.g., /tftpboot/C0A80002 for node xbas1), the DEFAULT label should be set back to ks instead of local_primary. The CN can then be rebooted to start the re-deployment process.
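
Only the DEFAULT line of the PXE file needs to change; for instance (the label definitions themselves are left untouched):

# /tftpboot/C0A80002 (node xbas1)
DEFAULT ks
# was: DEFAULT local_primary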

For re-deploying Windows HPC Server 2008 compute nodes, check that the partition number in the unattend.xml file is consistent with the existing partition table and edit it if necessary (in our example: <PartitionID>4</PartitionID>). Edit the diskpart.txt file so that it only re-formats the NTFS Windows partition, without cleaning or removing the existing partitions (see Appendix D.1.1). Manually update or delete the previous computer and hostname declarations in Active Directory before re-deploying the nodes, and then play the compute node deployment template as for the first install.
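
For illustration, with the partition table above, such a diskpart.txt could reduce to something like the following (a sketch only; the actual file used for our prototype is given in Appendix D.1.1):

rem re-format only the existing Windows partition (number 4 in our example);
rem no "clean" and no partition creation, so the Linux partitions are preserved
select disk 0
select partition 4
format fs=ntfs quick
exit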

6.6 Submit a job with the meta-scheduler

For detailed explanations about using PBS Professional and submitting jobs, read the PBS Professional User's Guide [32]. This section just gives an example illustrating the specifics of our meta-scheduler environment.

Let us suppose we have a user named test_user. This user has two applications to run, one for each OS type, and two job submission scripts: my_job_Win.sub for the Windows application and my_job_Lx.sub for the Linux application:

my_job_Win.sub

#PBS -l select=2:ncpus=4:mpiprocs=4
#PBS -q windowsq

C:\Users\test_user\my_windows_application

my_job_Lx.sub

#!/bin/bash
#PBS -l select=2:ncpus=4:mpiprocs=4
#PBS -q linuxq

/home/test_user/my_linux_application

Whichever OS type the application should run on, the scripts can be submitted from any Windows or Linux computer with the same qsub command. The only requirement is that the computer has credentials to connect to the PBS Professional server.

The command lines can be typed from a Windows system:

C:\> qsub my_job_Win.sub
C:\> qsub my_job_Lx.sub

or the command lines can be typed from a Linux system:

[xbas0:test_user] qsub my_job_Win.sub
[xbas0:test_user] qsub my_job_Lx.sub

You can check the PBS queue status with the qstat command. Here is an example of its output:

[xbas0:root] qstat -n
xbas0: 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
129.xbas0       thomas   windowsq my_job_Win   3316   2   8    --    --  R 03:26
   hpcs3/0*4+hpcs4/0*4
130.xbas0       laurent  linuxq   my_job_Lx.  21743   2   8    --    --  R 01:23
   xbas1/0*4+xbas2/0*4
131.xbas0       patrice  linuxq   my_job_Lx.    --    2   8    --    --  Q   -- 
    -- 
132.xbas0       patrice  linuxq   my_job_Lx.    --    1   4    --    --  Q   -- 
    -- 
133.xbas0       laurent  windowsq my_job_Win    --    2   8    --    --  Q   -- 
    -- 
134.xbas0       thomas   windowsq my_job_Win    --    1   4    --    --  Q   -- 
    -- 
135.xbas0       thomas   windowsq my_job_Win    --    1   1    --    --  Q   -- 
    -- 
136.xbas0       patrice  linuxq   my_job_Lx.    --    1   1    --    --  Q   -- 
    -- 

6.7 Check node status with the meta-scheduler

The status of the nodes can be checked with the PBS Professional monitoring tool. Each physical node appears twice in the PBS monitor window, once for each OS type; for example, the first node appears with two hostnames (xbas1 and hpcs1). The hostname associated with the running OS type is flagged as “free” or “busy”, while the other hostname is flagged as “offline”. This gives a complete view of the OS type distribution on the HOSC.

Figure 21 shows that the first two CNs run XBAS while the other two run HPCS. It also shows that all four CNs are busy; this corresponds to the qstat output shown as an example in the previous section. Figure 22 shows that there are three free CNs running XBAS and one busy CN running HPCS on our HOSC prototype. Since we do not want to run applications on the XBAS MN (xbas0), we disabled its MOM (see Section 5.6); that is why it is shown as “down” in both figures.
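
The same information is also available from the command line with the standard PBS Professional pbsnodes command (the state names it prints may differ slightly from the labels used in the GUI):

[xbas0:root] pbsnodes -a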


Figure 21 - PBS monitor with all 4 compute nodes busy (2 with XBAS and 2 with HPCS)

 


Figure 22 - PBS monitor with 1 busy HPCS compute node and 3 free XBAS compute nodes