Guidelines for Backing Up a Windows HPC Server 2008 R2 Cluster


Updated: July 29, 2016

Applies To: Microsoft HPC Pack 2008 R2

This section provides an overview of the guidelines and methods for backing up a Windows HPC Server 2008 R2 cluster. It focuses on backing up the data that is stored in the HPC databases (on the head node or on remote servers that are running at least Microsoft® SQL Server® 2008 SP1) and backing up the cluster configuration settings on the head node. If this data is backed up regularly, it is generally possible to restore a cluster to normal operations with minimal disruption after a hardware or software failure on the head node or on a remote server for the HPC databases. For more information about restoring a Windows HPC Server 2008 R2 cluster in different situations, see Scenarios for Restoring a Windows HPC Server 2008 R2 Cluster in this Back Up and Restore guide.

Note

Often the existing cluster nodes other than the head node can continue to operate after a cluster or head node is restored, or they can be easily redeployed by using the data that is stored in the HPC databases and the cluster configuration settings.

The critical data for the cluster is stored in the HPC databases and in configuration files and settings on the head node.

HPC databases

The HPC databases in the following table store data that is critical to the operation of Windows HPC Server 2008 R2 and should be backed up regularly. The HPC databases are much more dynamic than other cluster data: they change continuously while jobs are submitted and run on the cluster.

| Database | Default name | Data |
| --- | --- | --- |
| Cluster management database | HPCManagement | Cluster users, network configuration, nodes, node groups, node templates, operations history, performance counter metric history |
| Job scheduling database | HPCScheduler | Nodes, job templates, job history, scheduler configuration |
| Diagnostics database | HPCDiagnostics | Results of diagnostic tests |
| Reporting database | HPCReporting | Raw and aggregated reporting data |

In Windows HPC Server 2008 R2, the cluster databases can be installed on the head node or on remote servers that are running SQL Server. When the databases are installed on the head node, they are installed by default in the COMPUTECLUSTER instance in SQL Server. The database files are located by default in the %PROGRAMFILES%\Microsoft HPC Pack 2008 R2\SQLDB folder.
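
As an illustration of the default database names and instance described above, the following Python sketch generates one T-SQL `BACKUP DATABASE` statement per HPC database. It is a minimal sketch, not a prescribed procedure: the backup folder `D:\HpcBackups`, the file-naming scheme, the function name, and the `WITH INIT, CHECKSUM` options are assumptions chosen for the example, and the database names must be adjusted if your cluster does not use the defaults.

```python
from datetime import datetime

# Default database names for Windows HPC Server 2008 R2.
HPC_DATABASES = ["HPCManagement", "HPCScheduler", "HPCDiagnostics", "HPCReporting"]


def backup_statements(backup_dir, stamp):
    """Return one T-SQL BACKUP DATABASE statement per HPC database."""
    statements = []
    for name in HPC_DATABASES:
        # One timestamped .bak file per database; backup_dir is a hypothetical path.
        target = f"{backup_dir}\\{name}_{stamp:%Y%m%d_%H%M%S}.bak"
        statements.append(
            f"BACKUP DATABASE [{name}] TO DISK = N'{target}' WITH INIT, CHECKSUM;"
        )
    return statements


if __name__ == "__main__":
    # Print a script that could be saved to a file and run with, for example:
    #   sqlcmd -S .\COMPUTECLUSTER -E -i backup_hpc.sql
    for line in backup_statements(r"D:\HpcBackups", datetime.now()):
        print(line)
```

If the databases are hosted on remote servers, the server and instance name passed to `sqlcmd -S` would differ from the default `.\COMPUTECLUSTER` shown in the comment.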

Cluster configuration settings

In addition to the data that is stored in the HPC databases, the following table lists important cluster configuration settings and data that are stored on the head node.

Important

This list is not exhaustive, but it indicates settings that are important to restore in many environments. Many settings depend on or affect the data that is stored in the HPC databases. Not all items apply to all clusters.

| Configuration setting or data | Location | Notes |
| --- | --- | --- |
| Store of files for node setup, including operating system images and drivers | REMINST file share, including the REMINST\setup\images and REMINST\setup\drivers folders | |
| Configuration files for service-oriented architecture (SOA) services | HpcServiceRegistration file share | Present if SOA services are installed. The DLLs for the SOA services that are specified in the service registration files are stored separately according to the preferences of the cluster administrator, and they should also be backed up. |
| Output spool share for compute nodes | CcpSpoolShare file share | |
| Results of diagnostic tests | Diagnostics file share | |
| Submission and activation filters | Paths configured either in the job scheduler configuration options, or by job template (in Windows HPC Server 2008 R2 SP2 or later) | Present if installed by the cluster administrator |
| Custom diagnostic tests | %CCP_HOME%bin\DiagTests folder | Present if installed by the cluster administrator |
| Other customizable files | Folders under %CCP_HOME% | Includes CcpPower.cmd, startnet.cmd, unattend.xml, HpcSession.exe.config |
| Environment variables | On the head node: CCP_DATA, CCP_HOME, CCP_JOBTEMPLATE, CCP_SCHEDULER | |
| Hosts file | %SystemRoot%\System32\drivers\etc\hosts | |
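
Two of the items listed above, the environment variables and the hosts file, are not file shares and are easy to overlook. The following Python sketch shows one possible way to capture them alongside a file-share backup. The function name, the output file names (`hpc_env.json`, `hosts`), and the destination folder are hypothetical and not part of HPC Pack.

```python
import json
import os
import shutil
from pathlib import Path

# Environment variables listed for the head node.
HPC_ENV_VARS = ["CCP_DATA", "CCP_HOME", "CCP_JOBTEMPLATE", "CCP_SCHEDULER"]


def save_head_node_settings(dest, hosts_file, env=None):
    """Copy the hosts file and record the HPC environment variables under dest."""
    env = os.environ if env is None else env
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # Record only the variables that are actually set on this machine.
    values = {name: env[name] for name in HPC_ENV_VARS if name in env}
    (dest / "hpc_env.json").write_text(json.dumps(values, indent=2))

    # Copy the hosts file if it exists; shutil.copy2 also preserves timestamps.
    hosts_file = Path(hosts_file)
    if hosts_file.is_file():
        shutil.copy2(hosts_file, dest / "hosts")
    return values
```

On a head node, a call might pass the path built from `%SystemRoot%\System32\drivers\etc\hosts` as `hosts_file` and a folder on the backup volume as `dest`. The file shares themselves would still be backed up with one of the standard backup solutions.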

To help recover your cluster if the head node or the HPC databases fail, you should regularly back up the full system of the head node and any remote database servers, the HPC databases (on the head node or on one or more remote servers), and the cluster configuration settings on the head node. The following table provides guidelines for these three backup types.

| Backup type | Description | Recommended frequency of backups |
| --- | --- | --- |
| Full server | A backup of all volumes so that you can recover the full server, including all the files, data, applications, and the system state. The system state includes the boot file, the COM+ class registration database, and the registry. | Make a full server backup at regular intervals, and before and after you make major configuration changes. In a stable cluster, you can schedule full server backups once a week or even less frequently. |
| Databases | A full backup of each HPC database. Database backups represent the whole database at the time the backup finished. | The HPC databases should be backed up much more often than the full system state backup. Back up all the HPC databases at the same time. Depending on the activity level in your cluster, you might want to back up the databases daily, multiple times per day, or even use a continual backup method. To ensure consistency between the databases, back up all of the databases after making configuration changes such as adding or deleting nodes. |
| Cluster configuration settings | A backup of all the file shares on the head node that contain configuration settings that are critical for the operation of your cluster. | Back up all of the configuration settings at the same time. To ensure consistency between the databases and the configuration settings, back up the configuration settings at the same time that you back up the databases. |
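
The guidance to back up the databases and configuration settings at the same time can be checked after the fact. The following Python sketch tests whether every item in a backup set finished within a chosen window. The 15-minute tolerance, the function name, and the example timestamps are assumptions for illustration; an appropriate window depends on how quickly data changes in your cluster.

```python
from datetime import datetime, timedelta


def backup_set_is_consistent(finished_at, tolerance=timedelta(minutes=15)):
    """Return True if all backups in the set finished within `tolerance` of each other."""
    if not finished_at:
        return False
    times = list(finished_at.values())
    return max(times) - min(times) <= tolerance


if __name__ == "__main__":
    start = datetime(2016, 7, 29, 2, 0)
    # Hypothetical completion times for one backup run.
    finished = {
        "HPCManagement": start,
        "HPCScheduler": start + timedelta(minutes=3),
        "HPCDiagnostics": start + timedelta(minutes=4),
        "HPCReporting": start + timedelta(minutes=6),
        "configuration shares": start + timedelta(minutes=10),
    }
    print(backup_set_is_consistent(finished))  # True: the spread is 10 minutes
```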

To create a backup of your head node, the HPC databases, and the cluster configuration settings, you can choose among several standard backup solutions, which include Microsoft and non-Microsoft backup and restore solutions. These methods include the following:

| Backup solution | More information |
| --- | --- |
| Windows Server Backup | Windows Server Backup |
| SQL Server Management Studio Backup | Backing Up and Restoring Databases in SQL Server; Backing Up and Restoring How-to Topics (SQL Server Management Studio) |
| Microsoft System Center Data Protection Manager (DPM) | DPM 2010 Product Overview and Roadmap; Protecting SQL Server with DPM 2007 |
| Non-Microsoft backup and recovery solutions, such as Symantec NetBackup | Consult the documentation of the vendor. |

DISCLAIMER: Reference to any non-Microsoft products is intended solely for informational purposes and does not constitute or imply any endorsement by Microsoft.

Note

If your system is running at least Windows HPC Server 2008 R2 Service Pack 2, you can also use the Export-HpcConfiguration.ps1 and Import-HpcConfiguration.ps1 HPC PowerShell® scripts to back up and restore certain cluster configuration settings on the head node. Some of these cluster configuration settings, such as node templates and job templates, are stored in the HPC databases. These scripts can also be used to migrate critical configuration settings from one cluster to another, for example, to maintain cluster operations in case of a disaster. For more information, see Export and Import Windows HPC Cluster Configuration Settings in this Back Up and Restore guide.

Important

Regardless of which method you choose for backup and restore, to bring the cluster to a consistent state, you need to perform additional steps when you restore the cluster. For an overview of the restore steps in several scenarios, see Scenarios for Restoring a Windows HPC Server 2008 R2 Cluster.

Example: Create a protection group for the cluster in DPM

You can use DPM to create a protection group for the cluster that includes the HPC databases (on the head node or on one or more remote computers) and the configuration settings that are stored in shared folders on the head node. For example, the protection group could include the following sources:

  • The SQL Server instance or instances that host the HPC databases

  • REMINST file share

  • HpcServiceRegistration file share

  • CcpSpoolShare file share

  • Diagnostics file share

The protection group could be expanded to include other data sources depending on the needs of the cluster administrator.

For information about creating a protection group in DPM, see Creating a Protection Group for File and Application Servers.