Appendix D: Files Used in Examples

Here are the files (scripts, configuration files, etc.) written or modified to build the HOSC prototype and to validate information given in this document.

D.1 Windows HPC Server 2008 files

The first two files are used by the deployment template and must be modified to fulfill the HOSC requirements. The third file (XML) is used to deploy the template to CNs identified by their MAC addresses.

D.1.1 Files used for compute node deployment

 

C:\Program Files\Microsoft HPC Pack\Data\InstallShare\unattend.xml

Program file

 

C:\Program Files\Microsoft HPC Pack\Data\InstallShare\Config\diskpart.txt

Program file

 

my_cluster_nodes.xml

<?xml version="1.0" encoding="utf-8"?>
<Nodes xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schemas.microsoft.com/HpcNodeConfigurationFile/2007/12">
  <Node Name="hpcs1" Domain="WINISV">
    <MacAddress>003048334cf6</MacAddress>
  </Node>
  <Node Name="hpcs2" Domain="WINISV">
    <MacAddress>003048334d04</MacAddress>
  </Node>
  <Node Name="hpcs3" Domain="WINISV">
    <MacAddress>003048334d3c</MacAddress>
  </Node>
  <Node Name="hpcs4" Domain="WINISV">
    <MacAddress>003048347990</MacAddress>
  </Node>
</Nodes>
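
A node file of this shape can be generated from a list of host names and MAC addresses instead of being typed by hand. The following is a minimal sketch, assuming Python 3 and its standard library only; the helper name build_node_xml and the input dictionary are illustrative and are not part of HPC Pack, and the namespace URI is assumed to be the http:// form.

```python
import xml.etree.ElementTree as ET

# Assumption: http:// form of the HPC node configuration namespace URI.
HPC_NS = "http://schemas.microsoft.com/HpcNodeConfigurationFile/2007/12"

def build_node_xml(nodes, domain):
    """Return node-file XML text for a {hostname: mac} mapping (hypothetical helper)."""
    ET.register_namespace("", HPC_NS)
    root = ET.Element("{%s}Nodes" % HPC_NS)
    for name, mac in nodes.items():
        node = ET.SubElement(root, "{%s}Node" % HPC_NS, Name=name, Domain=domain)
        # As in the example file, the MAC address is written as 12 hex digits
        # without separators.
        compact = mac.replace(":", "").replace("-", "").lower()
        ET.SubElement(node, "{%s}MacAddress" % HPC_NS).text = compact
    return ET.tostring(root, encoding="unicode")

xml_text = build_node_xml(
    {"hpcs1": "00:30:48:33:4c:f6", "hpcs2": "00:30:48:33:4d:04"}, "WINISV")
print(xml_text)
```

The sketch omits the XML declaration and the xsi/xsd namespace declarations; they would have to be added if the import tool requires them.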

D.1.2 Script for IPoIB setup

 

setIPoIB.vbs

 

set objargs=wscript.arguments
Set fs=CreateObject("Scripting.FileSystemObject")
Set WshNetwork = WScript.CreateObject("WScript.Network")
wscript.sleep(10000)
hostname=WshNetwork.ComputerName
ip=GetIP(hostname)
Set logFile = fs.opentextfile("c:\netconfig.log",8,True)
WScript.Echo "Computername: " & hostname
WScript.Echo "IP: " & ip
logfile.writeline("Computername: " & hostname)
logfile.writeline("IP: " & ip)
res=setIPoIB(ip)
logfile.writeline(res)
wscript.echo res
'-------------------------------------------------------------------------
Function GetIP(hostname)
   set sh = createobject("wscript.shell")
   set fso = createobject("scripting.filesystemobject")
   workfile = "c:\PrivateIPadress.txt"
   sh.run "%comspec% /c netsh interface ip show addresses private > " & workfile,0,true
   Set ts = fso.opentextfile(workfile)
   data = split(ts.readall,vbcr)
   ts.close
   fso.deletefile workfile
   ' Default result, kept when no address line is found in the netsh output
   GetIP = "could not resolve IP address"
   for n = 0 to ubound(data)
      if instr(data(n),"Address") then
         parts = split(data(n),":")
         GetIP= trim(cstr(parts(1)))
      end if
   Next
End Function
'---------------------------------------------------------------------
Function setIPoIB(IPAddress)
   PartialIP=Split(ipaddress,".")
   strIPAddress = Array("10.1.0." & PartialIP(3))
   strSubnetMask = Array("255.255.255.0")
   strGatewayMetric = Array(1)
   WScript.Echo "IB: " & strIPAddress(0)
   strComputer = "."
   Set objWMIService = GetObject("winmgmts:" _
   & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
   Set colNetAdapters = objWMIService.ExecQuery _
   ("select * from win32_networkadapterconfiguration where IPEnabled=true and description like 'Mellanox%'")
   For Each objNetAdapter in colNetAdapters
      errEnable = objNetAdapter.EnableStatic(strIPAddress, strSubnetMask)
      If errEnable = 0 Then
         SetIPoIB="The IP address on Infiniband has been changed"
      Else
         SetIPoIB="The IP address on IB could not be changed. Error: " & errEnable
      End If
   Next
End Function
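
The address rule applied by setIPoIB is simple: the IPoIB address reuses the last octet of the node's private-network address under the 10.1.0. prefix. A minimal Python sketch of that mapping follows; the function name ipoib_address and its validation step are illustrative additions, not part of the original script.

```python
def ipoib_address(private_ip, ib_prefix="10.1.0."):
    """Map a private-network IPv4 address to an IPoIB address by reusing the last octet."""
    octets = private_ip.strip().split(".")
    # Reject anything that is not four decimal octets (the VBScript has no such check).
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError("not an IPv4 address: %r" % private_ip)
    return ib_prefix + octets[3]

print(ipoib_address("192.168.1.2"))
```

With this rule, a node whose private address ends in .2 gets an IPoIB address ending in .2, so the two networks stay aligned without a separate lookup table.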

D.1.3 Scripts used for OS switch

Here are the scripts developed on the HPCS head node to switch the OS type of a compute node from HPCS to XBAS:

C:\hosc\activate_partition_XBAS.bat

@echo off
rem the argument is the head node hostname for shared file system mount. For example: \\HPCS0
echo ... Partitioning disk...
diskpart.exe /s %1\hosc\diskpart_commands.txt
echo ... Shutting down node %COMPUTERNAME% ...
shutdown /r /f /t 20 /d p:2:4

 

C:\hosc\diskpart_commands.txt

select disk 0
select partition 1
active

 

C:\hosc\from_HPCS_to_XBAS.bat

@echo off
rem the argument is the node hostname. For example: hpcs1
echo Check that file dhcpd.conf is updated on the XBAS management node !
if NOT "%1"=="" clusrun /nodes:%1 %LOGONSERVER%\hosc\activate_partition_XBAS.bat %LOGONSERVER%
if "%1"=="" echo "usage: from_HPCS_to_XBAS.bat <hpcs_hostname>"

D.2 XBAS files

D.2.1 Kickstart and PXE files

Here is an example of the modifications that must be made to the kickstart file generated by the preparenfs tool in order to fulfill the HOSC requirements:

/release/ks/kickstart.<identifier> (for example kickstart.22038)

…
part / --asprimary --fstype="ext3" --ondisk=sda --size=10000
part /usr --asprimary --fstype="ext3" --ondisk=sda --size=10000
part /opt --fstype="ext3" --ondisk=sda --size=10000
part /tmp --fstype="ext3" --ondisk=sda --size=10000

is replaced by:

…
part /boot --asprimary --fstype="ext3" --ondisk=sda --size=100
part /     --asprimary --fstype="ext3" --ondisk=sda --size=50000

Here is an example of a PXE file generated by preparenfs for node xbas1. Before deployment, the DEFAULT label is set to ks and after deployment the DEFAULT label is set to local_primary automatically.

/tftpboot/C0A80002 (complete file before compute node deployment)

# GENERATED BY PREPARENFS SCRIPT
TIMEOUT 10
DEFAULT ks
PROMPT 1

LABEL local_primary
        KERNEL chain.c32
        APPEND hd0

LABEL ks
        KERNEL RHEL5.1/vmlinuz
        APPEND console=tty0 console=ttyS1,115200 ksdevice=eth0 lang=en ip=dhcp ks=nfs:192.168.0.99:/release/ks/kickstart.22038 initrd=RHEL5.1/initrd.img driverload=igb

LABEL rescue
        KERNEL RHEL5.1/vmlinuz
        APPEND console=ttyS1,115200 ksdevice=eth0 lang=en ip=dhcp method=nfs:192.168.0.99:/release/RHEL5.1 initrd=RHEL5.1/initrd.img rescue driverload=igb

 

/tftpboot/C0A80002 (head of the file after compute node deployment)

# GENERATED BY PREPARENFS SCRIPT
TIMEOUT 10
DEFAULT local_primary
PROMPT 1

 

The remainder of the file is unchanged. Set TIMEOUT and PROMPT to 0 to make the nodes boot more quickly.
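
The file name C0A80002 is the node's IPv4 address (192.168.0.2) written as eight uppercase hexadecimal digits, which is the name PXELINUX looks up for a client with that address. The Python sketch below computes that name and shows how the DEFAULT label of such a file could be rewritten. Both function names are illustrative; in the prototype the label is switched automatically, so this only illustrates the mechanism.

```python
import ipaddress

def pxe_config_name(ipv4):
    """Return the uppercase hexadecimal file name for an IPv4 address."""
    return "%08X" % int(ipaddress.IPv4Address(ipv4))

def set_default_label(config_text, label):
    """Return config_text with its DEFAULT line pointing at the given label."""
    lines = []
    for line in config_text.splitlines():
        if line.split()[:1] == ["DEFAULT"]:
            line = "DEFAULT " + label
        lines.append(line)
    return "\n".join(lines) + "\n"

head = "TIMEOUT 10\nDEFAULT ks\nPROMPT 1\n"
print(pxe_config_name("192.168.0.2"))
print(set_default_label(head, "local_primary"))
```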

D.2.2 DHCP configuration

The initial DHCP configuration file must be changed for HPCS CN deployment: the global next-server field must be deleted, and each CN host section must be modified as shown below:

/etc/dhcpd.conf

next-server        192.168.0.99;
########### END GLOBAL PARAMETERS 
subnet 192.168.0.0 netmask 255.255.0.0{
   authoritative;

host xbas1 { 
  filename           "pxelinux.0";
  fixed-address      192.168.0.2;
  hardware ethernet  

is replaced by:

# global “next-server” entry is removed.
########### END GLOBAL PARAMETERS 
subnet 192.168.0.0 netmask 255.255.0.0{
   authoritative;

host hpcs1 { 
  filename           "Boot\\x64\\WdsNbp.com";
  fixed-address      192.168.1.2;
  hardware ethernet  00:30:48:33:4c:f6;
  option host-name   "hpcs1";
  next-server        192.168.1.1;

 

Note

This modification can be done by the switch_dhcp_host script.

The NBP file path must be written with doubled backslashes (\\) so that it is interpreted correctly during the PXE boot.
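
When the filename entry is produced by a program, the backslashes are escaped twice: once for dhcpd.conf and once more in the source code of the generating program. The short Python sketch below makes the first level explicit. The helper name dhcpd_quote is hypothetical.

```python
def dhcpd_quote(path):
    """Quote a Windows-style path for dhcpd.conf by doubling each backslash."""
    return '"' + path.replace("\\", "\\\\") + '"'

# A raw string keeps the single backslashes of the real NBP path readable.
nbp_path = r"Boot\x64\WdsNbp.com"
print("  filename           " + dhcpd_quote(nbp_path) + ";")
```

Using a raw string for the real path and doubling the backslashes in one place avoids hand-written literals with four backslashes in a row.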

D.2.3 Scripts used for OS switch

Here are the scripts developed on the XBAS management node to switch the OS of a compute node:

/opt/hosc/switch_dhcp_host

#!/usr/bin/python -t
import os, os.path, sys
############## Cluster characteristics must be written here ################
xbas_hostname_base='xbas'
hpcs_hostname_base='hpcs'
field_dict  = {hpcs_hostname_base:{'filename':'"Boot\\\\x64\\\\WdsNbp.com";\n',
                                    'fixed-address':'192.168.1.',
                                    'next-server':'192.168.1.1;\n',
                                    'server-name':'"192.168.1.1";\n'},
                xbas_hostname_base:{'filename':'"pxelinux.0";\n',
                                    'fixed-address':'192.168.0.',
                                    'next-server':'192.168.0.1;\n',
                                    'server-name':'"192.168.0.1";\n'}}
if (len(sys.argv) != 2):
    print ('usage: switch_dhcp_host <current compute node hostname>')
    sys.exit(1)
elif (len(str(sys.argv[1]))>1) and (str(sys.argv[1])[-2:].isdigit()):
    node_base = str(sys.argv[1])[:-2]
    node_rank = str(sys.argv[1])[-2:]
else:
    node_base = str(sys.argv[1])[:-1]
    node_rank = str(sys.argv[1])[-1:]
if (node_base == xbas_hostname_base ):
    old_hostname= xbas_hostname_base + node_rank
    new_hostname=hpcs_hostname_base + node_rank
    new_node_base = hpcs_hostname_base
elif (node_base == hpcs_hostname_base):
    old_hostname=hpcs_hostname_base + node_rank
    new_hostname= xbas_hostname_base + node_rank
    new_node_base = xbas_hostname_base
else:
    print ('unknown hostname: ' + sys.argv[1])
    sys.exit(1)
file_name = '/etc/dhcpd.conf'
if not os.path.isfile(file_name):
    print (file_name + ' does not exist!')
    sys.exit(1)
status = 'File ' + file_name + ' was not modified'
file_name_save = file_name + '.save'
file_name_temp = file_name + '.temp'
old_file = open(file_name,'r')
new_file = open(file_name_temp,'w')

S = old_file.readline()    
while S:
    if (S[0:11] == 'next-server'): S = old_file.readline() # Removes global next-server line
    if (S.find('host ' + old_hostname) != -1):
        while (S.find('hardware ethernet') == -1): 
            S = old_file.readline() # Skips old host section lines
        hardware_ethernet=S.split()[2]  # Gets host Mac address
        while (S.find('}') == -1): 
            S = old_file.readline() # Skips old host section lines
        # Writes new host section lines:
        new_file.write('   host ' + new_hostname + ' {\n')
        new_file.write('        filename          ' + field_dict[new_node_base]['filename'])    
        new_file.write('        fixed-address     ' + field_dict[new_node_base]['fixed-address']
                                                    + str(int(node_rank)+1) + ';\n') 
        new_file.write('        hardware ethernet ' + hardware_ethernet + '\n')     
        new_file.write('        option host-name  ' + '"' + new_hostname + '";\n') 
        new_file.write('        next-server       ' + field_dict[new_node_base]['next-server']) 
        new_file.write('        server-name       ' + field_dict[new_node_base]['server-name'])
        if (new_node_base == hpcs_hostname_base):
           new_file.write('option domain-name-servers '+field_dict[new_node_base]['next-server'])
        new_file.write('   }\n')
        status = 'File ' + file_name + ' is updated with host ' + new_hostname
    else: new_file.write(S)  # Copies the line from the original file without modifications
    S = old_file.readline()    
# End while loop
old_file.close()
new_file.close()
if os.path.isfile(file_name_save): os.remove(file_name_save)
os.rename(file_name,file_name_save)
os.rename(file_name_temp,file_name)
print (status)
print ('Do not forget to validate changes by typing: service dhcpd restart')
sys.exit(0)
# End of switch_dhcp_host script
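
The core of switch_dhcp_host is the mapping from the current host name to the new host name and fixed address: the trailing one or two digits give the node rank, the base name selects the other OS, and the last octet is the rank plus one. The sketch below isolates that mapping in Python 3 so that it can be checked without touching /etc/dhcpd.conf. The names split_hostname and switched_host are illustrative.

```python
NETWORKS = {"hpcs": "192.168.1.", "xbas": "192.168.0."}
OTHER_OS = {"hpcs": "xbas", "xbas": "hpcs"}

def split_hostname(hostname):
    """Split 'hpcs12' into ('hpcs', '12'), using a one- or two-digit rank."""
    if len(hostname) > 1 and hostname[-2:].isdigit():
        return hostname[:-2], hostname[-2:]
    return hostname[:-1], hostname[-1:]

def switched_host(hostname):
    """Return (new_hostname, new_fixed_address) for the opposite OS type."""
    base, rank = split_hostname(hostname)
    if base not in OTHER_OS or not rank.isdigit():
        raise ValueError("unknown hostname: " + hostname)
    new_base = OTHER_OS[base]
    return new_base + rank, NETWORKS[new_base] + str(int(rank) + 1)

print(switched_host("xbas1"))
```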

 

/opt/hosc/activate_partition_HPCS.sh

#!/bin/sh
#the argument is the node hostname. For example: xbas1
ssh $1 fdisk /dev/sda < /opt/hosc/fdisk_commands.txt

 

/opt/hosc/fdisk_commands.txt

a
4
a
1
w
q

 

/opt/hosc/from_XBAS_to_HPCS.sh

#!/bin/sh
#the argument is the node hostname. For example: xbas1
/opt/hosc/switch_dhcp_host $1
/sbin/service dhcpd restart
/opt/hosc/activate_partition_HPCS.sh $1
ssh $1 shutdown -r -t 20 now

 

/opt/hosc/from_HPCS_to_XBAS.sh

#!/bin/sh
#this script requires a ssh server daemon to be installed on the HPCS compute nodes
#the argument is the compute node hostname. For example: hpcs1
#HPCS head node hostname is hard coded in this script as: hpcs0
/opt/hosc/switch_dhcp_host $1
/sbin/service dhcpd restart
ssh $1 -l root cmd /c \\\\hpcs0\\hosc\\activate_partition_XBAS.bat \\\\hpcs0

D.2.4 Network interface bridge configuration

To configure two network interface bridges, xenbr0 and xenbr1, replace the following line in the file:

/etc/xen/xend-config.sxp

(network-script network-bridge)

with:

(network-script my-network-bridges)

 

Then create file:

/etc/xen/scripts/my-network-bridges

#!/bin/bash
XENDIR="/etc/xen/scripts"
$XENDIR/network-bridge "$@" netdev=eth0 bridge=xenbr0 vifnum=0
$XENDIR/network-bridge "$@" netdev=eth1 bridge=xenbr1 vifnum=1

D.2.5 Network hosts

The hosts file declares the IP addresses of the network interfaces of the Linux nodes. All XBAS CNs need to have the same hosts file. Here is an example for our HOSC cluster:

/etc/hosts

127.0.0.1     localhost.localdomain localhost
192.168.0.1   xbas0
192.168.0.2   xbas1
192.168.0.3   xbas2
192.168.0.4   xbas3
192.168.0.5   xbas4
172.16.0.1    xbas0-ic0
172.16.0.2    xbas1-ic0
172.16.0.3    xbas2-ic0
172.16.0.4    xbas3-ic0
172.16.0.5    xbas4-ic0
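
Because every XBAS CN needs an identical hosts file, the entries can be generated from the number of compute nodes rather than maintained by hand. The following Python sketch assumes the numbering used in the example (xbasN has the address ending in N+1 on both networks). The function name and its parameters are illustrative.

```python
def hosts_lines(node_count, admin_net="192.168.0.", ic_net="172.16.0."):
    """Return hosts-file lines for xbas0 (management node) through xbas<node_count>."""
    lines = ["127.0.0.1\tlocalhost.localdomain\tlocalhost"]
    for i in range(node_count + 1):
        lines.append("%s%d\txbas%d" % (admin_net, i + 1, i))
    for i in range(node_count + 1):
        lines.append("%s%d\txbas%d-ic0" % (ic_net, i + 1, i))
    return lines

print("\n".join(hosts_lines(4)))
```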

D.2.6 IB network interface configuration

To configure the IB interface on each node, create or edit the following file with the node's IP address. Here is an example for the compute node xbas1:

/etc/sysconfig/network-scripts/ifcfg-ib0

DEVICE=ib0
ONBOOT=yes
BOOTPROTO=static
NETWORK=192.168.220.0
IPADDR=192.168.220.2

D.2.7 ssh host configuration

/etc/ssh/ssh_known_hosts

xbas0,192.168.0.1 ssh-rsa AAAB3NzaC1yc2EAAABIwAAAQE/yiPG/x5gl+dq5XXhffF456fggDFt … lC92dxQUE5qQ==
xbas1,192.168.0.2 ssh-rsa AAAB3NzaC1yc2EAAABIwAAAQE/yiPG/x5gl+dq5XXhffF456fggDFt … lC92dxQUE5qQ==
xbas2,192.168.0.3 ssh-rsa AAAB3NzaC1yc2EAAABIwAAAQE/yiPG/x5gl+dq5XXhffF456fggDFt … lC92dxQUE5qQ==
xbas3,192.168.0.4 ssh-rsa AAAB3NzaC1yc2EAAABIwAAAQE/yiPG/x5gl+dq5XXhffF456fggDFt … lC92dxQUE5qQ==
xbas4,192.168.0.5 ssh-rsa AAAB3NzaC1yc2EAAABIwAAAQE/yiPG/x5gl+dq5XXhffF456fggDFt … lC92dxQUE5qQ==

D.3 Meta-scheduler setup files

D.3.1 PBS Professional configuration files on XBAS

Here is an example of a PBS Professional configuration file for the PBS server on the XBAS MN:

/etc/pbs.conf

PBS_EXEC=/opt/pbs/default
PBS_HOME=/var/spool/PBS
PBS_START_SERVER=1
PBS_START_MOM=0
PBS_START_SCHED=1
PBS_SERVER=xbas0
PBS_SCP=/usr/bin/scp

Here is an example of a PBS Professional configuration file for the PBS MOM on the XBAS CNs:

/etc/pbs.conf

PBS_EXEC=/opt/pbs/default
PBS_HOME=/var/spool/PBS
PBS_START_SERVER=0
PBS_START_MOM=1
PBS_START_SCHED=0
PBS_SERVER=xbas0
PBS_SCP=/usr/bin/scp

D.3.2 PBS Professional configuration files on HPCS

Here is an example of the lmhosts file needed on HPCS nodes:

C:\Windows\System32\drivers\etc\lmhosts

192.168.0.1    xbas0    #PBS server for HOSC

D.3.3 OS load balancing files

This script gets information from the PBS server and switches the OS type of compute nodes according to the rule defined in Section 5.7:

“Let us define η as the smallest number of nodes requested by a queued job for a given OS type A. Let us define α (respectively β) as the number of free nodes with the OS type A (respectively B). If η>α (i.e., there are not enough free nodes to run the submitted job with OS type A) and if β≥η-α (at least η-α nodes are free with the OS type B) then the OS type of η-α nodes should be switched from B to A”.
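
The rule can be written as a small function, which makes its boundary cases easy to check. This is a minimal Python sketch of the quoted rule only. The function name nodes_to_switch is illustrative, and eta, alpha and beta stand for η, α and β.

```python
def nodes_to_switch(eta, alpha, beta):
    """Return how many nodes to switch from OS type B to OS type A.

    eta:   smallest number of nodes requested by a queued job for OS type A
    alpha: number of free nodes with OS type A
    beta:  number of free nodes with OS type B
    """
    shortfall = eta - alpha
    # Switch only if the job cannot run now and B has enough free nodes to cover the gap.
    if shortfall > 0 and beta >= shortfall:
        return shortfall
    return 0

print(nodes_to_switch(eta=4, alpha=1, beta=3))
```

For example, a queued job that needs 4 nodes when only 1 is free triggers a switch of 3 nodes if at least 3 nodes of the other OS type are free. With only 2 such nodes free, nothing is switched.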

/opt/hosc/pbs_hosc_os_balancing.pl

 

#!/usr/bin/perl
#use strict;

#Gets information with pbsnodes about free nodes
$command_pbsnodes = "/usr/pbs/bin/pbsnodes -a |";
open (PBSC, $command_pbsnodes ) or die "Failed to run command: $command_pbsnodes";
@cmd_output = <PBSC>;
close (PBSC);
foreach $line (@cmd_output) {
   if (($line !~ /^(\s+)\w+/) && ($line !~ /^(\s+)$/) &&($line =~ /^(.*)\s+/)) {
      $nodename = $1;              
      push (@pbsnodelist, $nodename);
      $pbsnodes->{$nodename}->{state} = 'unknown';
      $pbsnodes->{$nodename}->{arch} = 'unknown';
   } elsif ($line =~ "state") {
      $pbsnodes->{$nodename}->{state} = (split(' ', $line))[2];
   } elsif ($line =~ "arch") {
      $pbsnodes->{$nodename}->{arch} = (split(' ', $line))[2];
   }
}
foreach my $node (@pbsnodelist) {
   if ($pbsnodes->{$node}->{state}=~"free") {
      if ($pbsnodes->{$node}->{arch}=~"linux") {
         push (@free_linux_nodes, $node);
      } else {
         push (@free_windows_nodes, $node);
      }
   }
}

#Gets information with qstat about the number of nodes requested by queued jobs
$command_qstat    = "/usr/pbs/bin/qstat -a |";
open (PBSC, $command_qstat ) or die "Failed to run command: $command_qstat";
@cmd_output = <PBSC>;
close (PBSC);
$nb_windows_nodes_of_smallest_job = 1e09;
$nb_linux_nodes_of_smallest_job = 1e09;
foreach $line (@cmd_output) {
   if ((split(' ', $line))[9] =~ "Q") {
      $nb_nodes = (split(' ', $line))[5];
      if ($line =~ "windowsq") {
         $nb_windows_nodes_queued += $nb_nodes;
         if ($nb_nodes < $nb_windows_nodes_of_smallest_job) {
            $nb_windows_nodes_of_smallest_job = $nb_nodes;
         }
      } elsif ($line =~ "linuxq") {
         $nb_linux_nodes_queued += $nb_nodes;
         if ($nb_nodes < $nb_linux_nodes_of_smallest_job) {
            $nb_linux_nodes_of_smallest_job = $nb_nodes;
         }
      }
   }
}
#STDOUT is redirected to a LOG file
open LOG, ">>/tmp/pbs_hosc_log.txt";
select LOG; 

#Compute the number of possible requested nodes whose OS type should be switched
$requested_windows_nodes = $nb_windows_nodes_of_smallest_job - scalar @free_windows_nodes;
$requested_linux_nodes = $nb_linux_nodes_of_smallest_job - scalar @free_linux_nodes;

#The decision rule based on previous information is applied 
if (($nb_windows_nodes_of_smallest_job > scalar @free_windows_nodes) && 
    (scalar @free_linux_nodes >= $requested_windows_nodes)){
   #switch $requested_windows_nodes nodes from XBAS to HPCS
   for ($i = 0; $i < $requested_windows_nodes; $i++) {
      $command_offline = "/usr/pbs/bin/pbsnodes -o $free_linux_nodes[$i]";
      system ($command_offline);
      $command_switch_to_HPCS = "/opt/hosc/from_XBAS_to_HPCS.sh $free_linux_nodes[$i]";
      system ($command_switch_to_HPCS);
      ($new_node = $free_linux_nodes[$i]) =~ s/xbas/hpcs/;
      $command_online = "/usr/pbs/bin/pbsnodes -c $new_node";
      system ($command_online);
      print "switch OS type from XBAS to HPCS: $free_linux_nodes[$i] -> $new_node\n";
   }
} elsif (($nb_linux_nodes_of_smallest_job > scalar @free_linux_nodes) && 
         (scalar @free_windows_nodes >= $requested_linux_nodes)) {
   #switch $requested_linux_nodes nodes from HPCS to XBAS
   for ($i = 0; $i < $requested_linux_nodes; $i++) {
      $command_offline = "/usr/pbs/bin/pbsnodes -o $free_windows_nodes[$i]";
      system ($command_offline);
      $command_switch_to_XBAS= "/opt/hosc/from_HPCS_to_XBAS.sh $free_windows_nodes[$i]";
      system ($command_switch_to_XBAS);
      ($new_node = $free_windows_nodes[$i]) =~ s/hpcs/xbas/;
      $command_online = "/usr/pbs/bin/pbsnodes -c $new_node";
      system ($command_online);
      print "switch OS type from HPCS to XBAS: $free_windows_nodes[$i] -> $new_node\n";
   }
}
close LOG;
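
The first half of the script classifies free nodes by OS type from the node listing. The Python sketch below reproduces that step on a hard-coded sample so that the parsing logic can be tested without a PBS server. The sample text is hypothetical and only follows the layout the Perl script assumes (node name at the start of a line, indented "name = value" attribute lines); the function name is illustrative.

```python
# Hypothetical listing, laid out as the Perl script expects.
SAMPLE = """\
xbas1
     state = free
     resources_available.arch = linux

hpcs2
     state = job-busy
     resources_available.arch = windows

hpcs3
     state = free
     resources_available.arch = windows
"""

def free_nodes_by_os(listing):
    """Return {'linux': [...], 'windows': [...]} for the nodes whose state is free."""
    nodes, current = {}, None
    for line in listing.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new node record.
            current = line.strip()
            nodes[current] = {"state": "unknown", "arch": "unknown"}
        elif current is not None:
            fields = line.split()
            if len(fields) >= 3 and fields[1] == "=":
                key = fields[0].rsplit(".", 1)[-1]
                if key in ("state", "arch"):
                    nodes[current][key] = fields[2]
    free = {"linux": [], "windows": []}
    for name, info in nodes.items():
        if "free" in info["state"]:
            # As in the Perl script, any architecture other than linux counts as Windows.
            os_type = "linux" if "linux" in info["arch"] else "windows"
            free[os_type].append(name)
    return free

print(free_nodes_by_os(SAMPLE))
```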

 

The above script is run every 10 minutes, as defined by the crontab file:

/var/spool/cron/root

# run HOSC Operating System balancing script every 10 minutes (noted */10)
*/10 * * * * /opt/hosc/pbs_hosc_os_balancing.pl