Resolve a Heartbeat Alert

Applies To: Operations Manager 2007 R2, Operations Manager 2007 SP1

The Health Service sends a heartbeat to a management server to verify that the system is still responding. When a specified number of heartbeats fail to arrive, Microsoft System Center Operations Manager 2007 displays an alert.
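The missed-heartbeat rule described above can be illustrated with a minimal sketch. This is not how Operations Manager is implemented; the function name and parameters are hypothetical, and the interval and threshold values are examples rather than the settings of any particular management group.

```python
from datetime import datetime, timedelta


def heartbeat_failed(last_heartbeat: datetime, now: datetime,
                     interval_seconds: int, missed_threshold: int) -> bool:
    """Return True once `missed_threshold` heartbeat intervals have
    elapsed since the last heartbeat was received.

    Hypothetical helper for illustration only; the parameter names are
    assumptions, not Operations Manager settings.
    """
    silence = now - last_heartbeat
    allowed = timedelta(seconds=interval_seconds * missed_threshold)
    return silence >= allowed
```

With an example interval of 60 seconds and a threshold of 3 missed heartbeats, this sketch reports a failure once 180 seconds pass without a heartbeat, which is why an alert can take a few minutes to appear.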

This section uses a Health Service Heartbeat Failure alert as an example of how to investigate an alert. Other alerts have different causes and different resolutions.

If you want to walk through these procedures, you can cause this alert by stopping the System Center Management service on a test system.

To cause a Health Service heartbeat failure for testing

  1. On a system with an agent installed, open Control Panel.

  2. Double-click Administrative Tools.

  3. Double-click Services.

  4. Right-click the System Center Management service, and then click Stop.

    Note

    When you are done testing, repeat this procedure, clicking Start instead of Stop in step 4.
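If you prefer to script the test, the procedure above might be sketched as follows. This is an illustration under assumptions: it assumes that the short service name behind the System Center Management display name is HealthService (confirm this in the Services console), that the script runs on the test agent with administrative rights, and that the Windows sc command is available. The function names are hypothetical.

```python
import subprocess

# Assumption: "HealthService" is the short name of the service shown as
# "System Center Management" in the Services console. Verify before use.
SERVICE_NAME = "HealthService"


def service_command(action: str, service: str = SERVICE_NAME) -> list:
    """Build the Windows `sc` command line that stops or starts a service."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    return ["sc", action, service]


def run_service_command(action: str, service: str = SERVICE_NAME) -> int:
    """Run the command on the local Windows system and return its exit code.

    Intended for a test system only.
    """
    result = subprocess.run(service_command(action, service),
                            capture_output=True, text=True)
    return result.returncode
```

Calling run_service_command("stop") corresponds to step 4, and run_service_command("start") restores the service when you are done testing.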

How to Investigate Agent Heartbeat Issues

The Monitoring pane displays active alerts. Selecting an alert displays information about it and provides tools for investigating it.

To investigate an active alert

  1. Open the Operations console.

  2. Click Monitoring.

  3. If necessary, in the Monitoring pane, click Monitoring to expand it.

  4. Click Active Alerts to view the Health Service Heartbeat Failure alert.

    Note

    Depending on the heartbeat interval and the number of missing heartbeats, a few minutes might be required to see the alert.

  5. Click the alert to highlight it and read the information in the Alert Details area. The Alert Details area provides information about the alert, including a description and knowledge about the cause and resolution.

How to Troubleshoot Agent Heartbeat Issues

Use the tasks in the Action pane to diagnose the cause of the alert. Different alerts have different tasks. For a Health Service Heartbeat Failure alert, the tasks deal with pinging the system and verifying or restarting the service.

To use the action tasks in troubleshooting

  1. If necessary, click Actions to make the Actions pane visible.

  2. In the Actions pane, under Health Service Watcher Tasks, click Ping Computer. The task opens a dialog box to display its progress.

    Note

    If the ping fails, verify that the system is turned on, and then use standard network troubleshooting to identify the connectivity problem.

  3. Click Close to close the dialog box.

  4. Under Health Service Watcher Tasks, click Computer Management. A Computer Management dialog box for the target system opens.

  5. Click Services and Applications to expand it.

  6. Click Services to display services.

  7. Right-click the System Center Management service, and then click Start.

    Note

    After the connection with the agent is restored, the alert is resolved automatically and the computer status returns to healthy.
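The ping and service checks in the procedure above can also be approximated from a command prompt. The following sketch is not what the console tasks run internally; it only builds the equivalent Windows ping and sc query command lines and reads the service state from the sc output. The host name and the HealthService short name are assumptions, and the function names are hypothetical.

```python
import re


def ping_command(host: str, count: int = 4) -> list:
    """Build a Windows ping command; -n sets the number of echo requests."""
    return ["ping", "-n", str(count), host]


def query_service_command(host: str, service: str = "HealthService") -> list:
    """Build an `sc` query against a remote system (sc \\\\host query name).

    Assumption: "HealthService" is the short name of the
    System Center Management service.
    """
    return ["sc", r"\\" + host, "query", service]


def parse_service_state(sc_query_output: str):
    """Return the state word (for example RUNNING or STOPPED) from
    `sc query` output, or None if no STATE line is present."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_query_output)
    return match.group(1) if match else None
```

If the parsed state is STOPPED, starting the service as in step 7 is the likely fix. If the query cannot reach the system at all, the problem is more likely connectivity.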

The preceding procedure fixes the test failure created in this topic and addresses a number of possible causes of a Health Service Heartbeat Failure. If an actual failure is not resolved by these steps, use standard troubleshooting techniques to identify the cause. For instance, the alert displayed in Active Alerts shows how old the alert is. Check for events that occurred around the time the alert was raised to see what might have caused the issue.
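As a rough sketch of that last check, the alert's age can be converted to a timestamp and used to filter events. The event list here is a hypothetical structure of (timestamp, message) pairs that you would export or collect yourself; it is not an Operations Manager API, and the 10-minute window is an arbitrary example.

```python
from datetime import datetime, timedelta


def alert_raised_at(now: datetime, age: timedelta) -> datetime:
    """Convert the age shown for an alert into the time it was raised."""
    return now - age


def events_near(events, alert_time: datetime, window_minutes: int = 10):
    """Return the (timestamp, message) pairs that fall within
    `window_minutes` before or after `alert_time`.

    `events` is a hypothetical list of (datetime, str) tuples.
    """
    window = timedelta(minutes=window_minutes)
    return [(ts, msg) for ts, msg in events
            if abs(ts - alert_time) <= window]
```

Events returned by events_near are candidates for the cause of the failure, such as a restart or a network change shortly before the heartbeats stopped.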