Monitoring Nodes

Updated: January 13, 2014

Applies To: Microsoft HPC Pack 2008 R2, Microsoft HPC Pack 2012, Microsoft HPC Pack 2012 R2, Windows HPC Server 2008 R2

A key step in monitoring and maintaining cluster health is to identify any deviation from the normal operational state or performance. HPC Cluster Manager enables you to view cluster and node status at a glance, identify problem nodes, and drill down into node details for further investigation.

In Node Management, you can monitor your cluster at a glance using the node List view or the node Heat Map view. In Charts and Reports, the monitoring charts display current and recent data about node health and cluster utilization. For more information, see:

The List and Heat Map views provide a starting point for identifying problem areas. Double-click a compute node to see detailed information such as hardware, operating system properties, and current performance metrics. You can also select one or more nodes, then drill down into the node details to investigate performance.

Tracking recent or ongoing cluster operations is another aspect of monitoring that is critical to administering a cluster. For more information, see:

In HPC Cluster Manager, you can use the Pivot To actions to correlate the monitoring information between nodes, jobs, operations, and diagnostics. For example, you can select one or more nodes in the views pane, and then pivot to the Jobs for the Selected Nodes. This takes you to a job list view that is filtered by the nodes that you selected.

The supported pivot paths are:

  • Nodes: pivot to jobs, test results, and operations.

  • Jobs: pivot to nodes.

  • Test results: pivot to failed nodes and operations.
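
The pivot paths above can be pictured as a small lookup table, and the "Jobs for the Selected Nodes" pivot as a filter over a job list. The following Python sketch is only a conceptual illustration: it does not call any HPC Pack API, and the names `PIVOT_PATHS`, `can_pivot`, `jobs_for_selected_nodes`, and the sample job records are all hypothetical.

```python
# Hypothetical model of the pivot paths listed above; not an HPC Pack API.
PIVOT_PATHS = {
    "nodes": ["jobs", "test results", "operations"],
    "jobs": ["nodes"],
    "test results": ["failed nodes", "operations"],
}


def can_pivot(source, target):
    """Return True if the list above allows pivoting from source to target."""
    return target in PIVOT_PATHS.get(source, [])


def jobs_for_selected_nodes(jobs, selected_nodes):
    """Keep only the jobs associated with at least one of the selected nodes."""
    selected = set(selected_nodes)
    return [job for job in jobs if selected & set(job["nodes"])]


# Made-up sample data for illustration only.
sample_jobs = [
    {"id": 1, "nodes": ["NODE01", "NODE02"]},
    {"id": 2, "nodes": ["NODE03"]},
    {"id": 3, "nodes": ["NODE02", "NODE04"]},
]

filtered = jobs_for_selected_nodes(sample_jobs, ["NODE02"])
print([job["id"] for job in filtered])  # prints [1, 3]
```

In this sketch, selecting NODE02 and pivoting to jobs leaves only the jobs that include NODE02, which mirrors the filtered job list view described above.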

HPC Cluster Manager provides several built-in charts and reports to monitor and analyze cluster resource usage and job and node statistics over time. The HPCReporting database also supports custom reporting. For more information, see Charts and Reports: HPC Cluster Manager.

© 2014 Microsoft