Install and Use the Network Troubleshooting Report Diagnostic Test
Updated: January 23, 2013
Applies To: Microsoft HPC Pack 2008 R2, Microsoft HPC Pack 2012, Windows HPC Server 2008 R2
This topic contains information about how to install and use the Network Troubleshooting Report diagnostic test for Microsoft® HPC Pack. The Network Troubleshooting Report diagnostic test collects and analyzes network information to help you troubleshoot networking issues on your Windows HPC cluster.
In this topic:
- About the Network Troubleshooting Report diagnostic test
- Install the diagnostic test
- Install the diagnostic test files on workstation nodes
- Install vstat.exe (optional)
- Run the Network Troubleshooting Report diagnostic test
- Copy test data to Microsoft Excel for custom analysis
- Redistribute the diagnostic test files to cluster nodes
- Edit a node template to automatically deploy test files to new nodes
- Uninstall the diagnostic test
About the Network Troubleshooting Report diagnostic test
- The Network Troubleshooting Report diagnostic test is installed using an installation (MSI) file, which automates the tasks of adding the test to HPC Cluster Manager and copying the necessary binary files and scripts to the nodes. This installation method is not required for a new or custom diagnostic test, but it automates tasks that would otherwise be manual. (For more information about adding custom diagnostic tests to a cluster, see Add New and Custom Diagnostic Tests.)
- During the installation, the following binary files are copied to each node:

  | Node | Files | Path | Function |
  |---|---|---|---|
  | Head node | NetTroubleshoot.exe, NetWrench.exe, NetTSAdapter.dll | %REMINST%\NetTroubleshoot | Analyze the data relayed by the NetWrench.exe and NetTSAdapter.dll files that run on the compute nodes, and generate the HTML report for the test |
  | Compute nodes | NetWrench.exe, NetTSAdapter.dll | %CCP_HOME%bin | Run when the test runs, to collect network information and relay it to the head node for analysis |

  Important |
  |---|
  | If a node does not have these files (for example, they failed to be distributed to that node during the installation of the test or the node was redeployed recently), the test will fail to run on that node. No results for that node will appear in the report for the test. |
- If you have an InfiniBand network, you can choose to install vstat.exe on the nodes in that network to collect additional information for the report. The NetWrench.exe and NetTSAdapter.dll files on the compute node use vstat.exe to collect information about the status and the capabilities of the host channel adapter (HCA) cards for that network. This information is then analyzed on the head node and is included in the HTML report. For more information, see Install vstat.exe (optional), later in this topic.
- A script (RedistributeCN.cmd) is provided to redistribute the binary files to the nodes. You can run this script at any time. The script should be run after a new node is added to the cluster, or if an existing node is redeployed. For more information, see Redistribute the diagnostic test files to cluster nodes, later in this topic.
- If you redeploy nodes often, you can avoid repeatedly redistributing the binary files by adding a task to your node templates that runs a script (CopyCN.cmd). This script copies the binary files for the test during node deployment. For more information, see Edit a node template to automatically deploy test files to new nodes, later in this topic.
- Unlike the basic HTML reports generated by the default tests installed with HPC Pack, the Network Troubleshooting Report test report includes custom tables that contain all of the test data. These tables can be copied to Microsoft® Excel® or a similar application for analysis. For more information, see Copy test data to Microsoft Excel for custom analysis, later in this topic.
Install the diagnostic test
- Download the Network Troubleshooting Report diagnostic test installation program (NetTroubleshoot.msi) from the Microsoft Download Center (http://go.microsoft.com/fwlink/?LinkId=207765) to the head node of your Windows HPC cluster or to a network location.
- Make sure that as many nodes as possible in your cluster are started and can be reached from the head node. One way to check this is to view the Node Health monitoring chart:
  - If HPC Cluster Manager is not already open on the head node, open it.
  - In Charts and Reports, view the Node Health monitoring chart.
- On the head node computer, run NetTroubleshoot.msi. Follow the steps in the installation wizard.
- To copy the NetWrench.exe and NetTSAdapter.dll files to the compute nodes in the cluster, the wizard automatically opens a Command Prompt window and runs a script that copies the files. When the script completes, review the Summary and then press a key to continue.

  Important |
  |---|
  | If an error occurs during the copying of the files to the compute nodes or there is a problem with your credentials, an error message appears in the command output that guides you to correct the problem. If you are prompted, type the password for your account. |
- On the final page of the installation wizard, click Finish.
Additional considerations
- The Network Troubleshooting Report diagnostic test is installed in HPC Cluster Manager using the following metadata:

  | Item | Description |
  |---|---|
  | Suite | Network Troubleshooting |
  | Name | Network Troubleshooting Report |
  | Alias | netTroubleshoot |
Install the diagnostic test files on workstation nodes
If the HPC cluster administrator credentials that you use to install the test also provide administrative credentials on workstation nodes (and unmanaged server nodes, if supported in your version of HPC Pack), the installation program automatically copies the necessary test files to the workstation computers.
In many organizations, however, HPC cluster administrators do not have administrative credentials on the workstation nodes. If you do not have administrative credentials on the workstations, the installation program cannot copy the binary files for the test to those nodes. It is also not possible to redistribute the test files to the workstation nodes by running the RedistributeCN.cmd script.
If you are not a workstation administrator for your organization, you will need to coordinate the installation of the diagnostic test files with the workstation administrator. The administrator will need to follow the deployment practices in the organization to copy the following files from the head node computer (or a network location) to the %CCP_HOME%bin folder on the workstation node computers: NetWrench.exe and NetTSAdapter.dll. For more information about these files and their locations, see About the Network Troubleshooting Report diagnostic test, earlier in this topic.
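A workstation administrator who copies the two files by hand could script the step roughly as follows. This is a minimal sketch and not part of HPC Pack: the function name and both folder arguments are hypothetical, and it assumes that the target folder (the expanded %CCP_HOME%bin path) already exists on the workstation.

```python
import shutil
from pathlib import Path

# File names come from this topic; everything else here is illustrative.
TEST_FILES = ("NetWrench.exe", "NetTSAdapter.dll")


def copy_test_files(source_dir, target_dir):
    """Copy the diagnostic test files from source_dir into an existing target_dir."""
    source = Path(source_dir)
    target = Path(target_dir)
    if not target.is_dir():
        # Do not create the folder: it is expected to exist where HPC Pack is installed.
        raise NotADirectoryError(f"Target folder not found: {target}")
    copied = []
    for name in TEST_FILES:
        # shutil.copy2 raises FileNotFoundError if a source file is absent.
        copied.append(Path(shutil.copy2(source / name, target / name)))
    return copied
```

The function refuses to create the target folder on purpose, so that a mistyped path fails loudly instead of leaving the files somewhere the test will not look.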
Install vstat.exe (optional)
If you have an InfiniBand network and you want the Network Troubleshooting Report to display the status and the capabilities of the host channel adapter (HCA) cards in that network, the vstat.exe tool must be installed on each node that has an HCA card. The Network Troubleshooting Report diagnostic test does not install vstat.exe.
- Download the InfiniBand driver or tools installation program from the appropriate hardware vendor or system builder to the head node of your Windows HPC cluster or to a network location.
- On the head node computer, run the installation program.
- (Important) When prompted, choose to install the program files in a folder in the C:\Program Files folder on the head node computer. In most cases, this is a default installation option.
- Depending on how vstat.exe is installed on the head node, do one of the following:
  - If vstat.exe is installed as a stand-alone application, you can run a clusrun command on the head node to copy vstat.exe to the nodes.

    Important |
    |---|
    | vstat.exe must be copied to a folder in the C:\Program Files folder on the nodes. This ensures that the diagnostic test can use vstat.exe to collect information on the nodes. |
  - If vstat.exe is installed with the driver package, you can add the drivers to the operating system images that are deployed to the nodes. In HPC Cluster Manager, in Configuration, in the Deployment To-do List, click Manage drivers.
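For the stand-alone case, the clusrun command line could be assembled as in the sketch below. The topic does not give the exact command, so treat this as one possible form under stated assumptions: it reuses the clusrun /all pattern shown in the uninstall procedure of this topic, wraps the Command Prompt copy command with its /Y switch, and uses a share path and target folder that are placeholders only. It also assumes that the target folder already exists on every node and that the nodes can read the share.

```python
def quote_arg(arg):
    """Wrap an argument in double quotes if it contains a space."""
    return f'"{arg}"' if " " in arg else arg


def build_vstat_copy_command(source_file, target_dir):
    """Return a clusrun command line that copies source_file into target_dir on all nodes."""
    parts = ["clusrun", "/all", "copy", "/Y", quote_arg(source_file), quote_arg(target_dir)]
    return " ".join(parts)


# Both paths are hypothetical placeholders; substitute the real share and folder.
print(build_vstat_copy_command(r"\\HEADNODE\Tools\vstat.exe", r"C:\Program Files\ExampleHCA"))
```

Printing the command instead of running it lets you review the quoting before you paste it into an elevated Command Prompt window on the head node.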
Run the Network Troubleshooting Report diagnostic test
You can use the following procedure to run the Network Troubleshooting Report diagnostic test on all nodes in the cluster and to view the report.
- If HPC Cluster Manager is not already open on the head node, open it.
- In Diagnostics, in the Navigation Pane, expand Tests, expand Network, and then click Network Troubleshooting.
- In the view pane, right-click Network Troubleshooting Report, and then click Run.
- In the Run Diagnostic Tests dialog box, in Nodes to test, select All nodes, and then click Run.
- In Diagnostics, in the Navigation Pane, click Test Results.
- In the view pane, verify that the status of the Network Troubleshooting Report test is no longer Running.
- To view the report, in the view pane, double-click Network Troubleshooting Report. The report will open in your default web browser.
- To export the report, right-click Network Troubleshooting Report and then click Export Results. You can then open the report using a browser or Microsoft Excel.
Note |
|---|
| If you see an error message in the Nodes Excluded from this Report section of the report similar to “Netwrench.exe is not recognized as an internal or external command, operable program or batch file”, the binary files for the diagnostic test are not found on the indicated nodes. This can occur if there is a problem distributing the files to the nodes, or if a node was redeployed. To redistribute the binary files for the test to the nodes, see Redistribute the diagnostic test files to cluster nodes, later in this topic. |
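Before a test run, a check along the following lines could confirm that a node has the files the test needs. This is only a sketch: the function name is hypothetical, the folder is whatever %CCP_HOME%bin expands to on the node, and it does not replace the redistribution script described later in this topic.

```python
from pathlib import Path

# The two files that must be present on each compute node, per this topic.
REQUIRED_FILES = ("NetWrench.exe", "NetTSAdapter.dll")


def missing_test_files(bin_dir):
    """Return the required file names that are not present in bin_dir."""
    folder = Path(bin_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

An empty list means that both files were found. Any names returned point to a node that would be listed in the Nodes Excluded from this Report section.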
Copy test data to Microsoft Excel for custom analysis
You can copy the data in any report table to Microsoft Excel and perform your own analysis of the data. The All Data by Network tab in the report is created specifically for that purpose. It contains summary tables of the data in the different categories in the Analysis by Category tab.
- Open a new Excel workbook.
- On the test report, click All Data by Network.
- Select or highlight the table that you want to copy.
- In Excel, click a cell and paste the data.
- You can use the tools in Excel to sort, filter, and analyze the data.
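If you prefer to script the analysis instead of pasting into Excel, the report tables can also be pulled out programmatically. The sketch below assumes that the report is an HTML file that uses ordinary table, tr, th, and td markup without nested tables. The class and function names are hypothetical, and the sample markup is invented for illustration and does not reflect the real report layout.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in an HTML document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def _finish_cell(self):
        # Store the current cell, collapsing runs of whitespace to single spaces.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._finish_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._finish_cell()
        elif tag == "tr" and self._row is not None:
            self._finish_cell()
            self.tables[-1].append(self._row)
            self._row = None


def tables_from_html(html_text):
    """Return all tables found in html_text, in document order."""
    parser = TableExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.tables


def table_to_csv(rows):
    """Serialize one table (a list of rows) as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Invented sample markup; a real report would be read from the saved HTML file.
sample = "<table><tr><th>Node</th><th>Adapter</th></tr><tr><td>NODE01</td><td>Enterprise</td></tr></table>"
print(table_to_csv(tables_from_html(sample)[0]))
```

The CSV text can be saved to a file and opened in Excel or any similar application.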
Redistribute the diagnostic test files to cluster nodes
At any time, you can redistribute the files necessary to collect information for the diagnostic test to all nodes that can be reached in the cluster.
- Make sure that as many nodes as possible in your cluster are started and can be reached from the head node.
- Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
- At the elevated Command Prompt window, type the following command:
  %ccp_home%bin\RedistributeCN.cmd
Edit a node template to automatically deploy test files to new nodes
If you are deploying nodes from bare metal, you can edit an existing node template to automatically deploy the Network Troubleshooting diagnostic test files to new nodes.
- In HPC Cluster Manager, in Configuration, in the Navigation Pane, click Node Templates.
- In the view pane, select a node template.
- In the Actions pane, click Edit. The Node Template Editor dialog box appears.
- To add a task that will copy the files for the Network Troubleshooting Report diagnostic test to each node, click Add Task, point to Deployment, and then click Run OS command.
- Ensure that the new task that you created is selected in the Node template tasks list, and then click Move Down until that task is listed as the last task under Deployment. This will make the new task run after all the other deployment tasks have finished running.
- Specify the following properties for the new task:
  - Set the ContinueOnFailure property to True.
  - Optionally, in the text box for the Description property, type a description for the task. For example: Copy test report files command.
  - In the text box for the Command property, type the following command:
    \\%ccp_scheduler%\REMINST\NetTroubleshootSetup\CopyCN.cmd
- To save the node template with the new task, click Save.
Uninstall the diagnostic test
To uninstall the Network Troubleshooting Report diagnostic test, do the following:
- Uninstall the diagnostic test on the head node
- Delete the diagnostic test files from the nodes (optional)
- Delete vstat.exe from the nodes (optional)
Uninstall the diagnostic test on the head node
- On the head node, close HPC Cluster Manager, if it is currently open.
- Open Control Panel. Click Start, and then click Control Panel.
- In Control Panel, under Programs, click Uninstall a program.
- On the list of installed programs, right-click Microsoft HPC Pack 2008 R2 Network Troubleshooting Report, and then click Uninstall. Follow the steps of the wizard.
Important |
|---|
| To uninstall the Network Troubleshooting Report diagnostic test, you must use the domain credentials of an HPC cluster administrator. If you use local administrator credentials, the uninstallation will fail. |
Delete the diagnostic test files from the nodes (optional)
- Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
- At the elevated Command Prompt window, type the following two commands:
  clusrun /all del "%CCP_HOME%bin\NetWrench.exe"
  clusrun /all del "%CCP_HOME%bin\NetTSAdapter.dll"
Delete vstat.exe from the nodes (optional)
- Follow the instructions of your vendor of HCA cards or system builder. If an unattended uninstallation program is provided, you can run a clusrun command on the head node to uninstall vstat.exe on the nodes.