Heartbeat and Heartbeat Failure Settings in Operations Manager 2007

Applies To: Operations Manager 2007 R2, Operations Manager 2007 SP1

An agent periodically sends a packet of data, called the heartbeat, to its management server. By default, the agent sends a heartbeat once every 60 seconds, and a management server tolerates three missed heartbeats. If the management server registers four missed heartbeats, it generates an alert against the health service on the agent computer, indicating that the health service is no longer available. The management server then attempts to diagnose the problem by pinging the agent computer. If the ping is unsuccessful, another alert is generated, indicating that the computer is no longer reachable. If the ping is successful, no further action is taken.
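
The sequence described above can be sketched as a small model. This is only an illustration of the decision flow, not Operations Manager code: the function name `evaluate_agent`, its parameters, and the `ping` callback are hypothetical, and the alert labels simply reuse the monitor names mentioned in this topic.

```python
from typing import Callable, List


def evaluate_agent(missed_heartbeats: int,
                   ping: Callable[[], bool],
                   missed_allowed: int = 3) -> List[str]:
    """Return the alerts raised for one agent (hypothetical model of the flow)."""
    alerts: List[str] = []
    # The management server tolerates `missed_allowed` missed heartbeats.
    if missed_heartbeats <= missed_allowed:
        return alerts
    # One miss beyond the tolerated count raises the first alert.
    alerts.append("Health Service Heartbeat Failure")
    # Diagnostic ping: only an unsuccessful ping raises the second alert.
    if not ping():
        alerts.append("Computer Not Reachable")
    return alerts
```

For example, `evaluate_agent(4, ping=lambda: False)` returns both alert labels, while `evaluate_agent(3, ping=lambda: False)` returns an empty list because three missed heartbeats are still tolerated by default.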

Note

By default, alerts for missed heartbeats and computer not reachable are disabled for client operating systems. To receive alerts for client operating systems, override the Health Service Heartbeat Failure and Computer Not Reachable monitors for the class Windows Client Operating System to set the Generates Alert parameter to True.

If the management server and the agent are separated by a slow connection, it might be normal for three minutes to pass without the management server receiving a heartbeat. To prevent unnecessary alerts, you can increase the number of missed heartbeats that a management server will tolerate.

It is also possible that you might be monitoring critical applications in your environment and service-level agreements might not allow waiting three minutes before alerts are generated. In this situation, you can decrease the heartbeat interval, thus increasing how often an agent sends a heartbeat.
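
The trade-off between the two settings comes down to simple arithmetic. The sketch below follows the three-minute figure used above for the defaults (60 seconds times three missed heartbeats). It is an approximation: the function name is hypothetical, and the exact moment an alert is raised depends on when the last heartbeat arrived.

```python
def approx_alert_window_seconds(heartbeat_interval: int = 60,
                                missed_allowed: int = 3) -> int:
    """Approximate time, in seconds, that can pass before a heartbeat alert."""
    return heartbeat_interval * missed_allowed


# Defaults: roughly three minutes.
default_window = approx_alert_window_seconds()

# Slow connection: tolerate more missed heartbeats to avoid unnecessary alerts.
slow_link_window = approx_alert_window_seconds(missed_allowed=5)

# Critical application: shorten the interval so failures surface sooner.
critical_window = approx_alert_window_seconds(heartbeat_interval=20)
```

With these illustrative values, the windows are 180, 300, and 60 seconds respectively.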

There are two settings you can adjust that relate to the heartbeat: the heartbeat interval and the number of missed heartbeats. Heartbeat interval refers to how often an agent sends a heartbeat. Number of missed heartbeats refers to how many heartbeats a management server will tolerate before running a diagnostic ping. Heartbeat interval and number of missed heartbeats can be configured at a global level and thus affect every agent and management server in the management group. In addition, the number of missed heartbeats can be overridden at the management server level and heartbeat interval can be overridden at the agent level.
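
One way to picture how the global values and the overrides combine is the following sketch. It assumes only what is stated above: the heartbeat interval can be overridden per agent, and the number of missed heartbeats can be overridden per management server. The class and function names are hypothetical and do not correspond to any Operations Manager API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GlobalHeartbeatSettings:
    """Management group defaults as described in this topic."""
    interval_seconds: int = 60
    missed_allowed: int = 3


def effective_interval(global_settings: GlobalHeartbeatSettings,
                       agent_override: Optional[int] = None) -> int:
    # An agent-level override, when present, replaces the global interval.
    if agent_override is not None:
        return agent_override
    return global_settings.interval_seconds


def effective_missed_allowed(global_settings: GlobalHeartbeatSettings,
                             server_override: Optional[int] = None) -> int:
    # A management server override, when present, replaces the global count.
    if server_override is not None:
        return server_override
    return global_settings.missed_allowed
```

For instance, with the global defaults, an agent overridden to 30 seconds reports to a management server overridden to tolerate 5 missed heartbeats; every other agent and management server keeps 60 seconds and 3 missed heartbeats.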

In addition to adjusting these failure settings, you can disable heartbeat monitoring for all agents or only for specific agents, such as agents:

  • That connect to the network intermittently.

  • That connect to the network over poor connections or use dial-up connections.

  • That run on systems that are frequently restarted.

Global Heartbeat Settings

The following heartbeat settings are set at a global level and affect all management servers and agents in the management group.

How to Globally Change the Heartbeat Interval

The following procedure shows how to change the heartbeat interval at the global level. Changes made in this procedure affect all the agents in the management group.

To configure agent heartbeat interval settings

  1. Log on to the computer with an account that is a member of the Operations Manager Administrators role for the Microsoft System Center Operations Manager 2007 management group.

  2. In the Operations console, click the Administration button.

  3. In the Administration pane, expand Administration, and then click Settings.

  4. In the Settings pane, expand Type: Agent, right-click Heartbeat, and then click Properties.

  5. In the Global Agent Settings - Heartbeat dialog box, on the General tab, enter a value in the Heartbeat interval box to specify how often an agent generates a heartbeat, and then click OK.

    Note

    The maximum value is 86,400 seconds (1 day). The minimum value is 5 seconds.

How to Globally Change the Number of Missed Heartbeats a Management Server Will Tolerate

The following procedure shows how to change the number of missed heartbeats at the global level. Changes made in this procedure affect all the management servers in the management group.

To change the number of missed heartbeats a management server will tolerate

  1. Log on to the Operations console with an account that is a member of the Administrators role for the Operations Manager 2007 management group.

  2. Click the Administration button.

  3. In the Administration pane, expand Administration, and then click Settings.

  4. In the Settings pane, expand Type: Server, right-click Heartbeat, and then click Properties.

  5. In the Global Management Server Settings - Heartbeat dialog box, on the General tab, in the Number of missing heartbeats allowed box, enter or select the number of missing heartbeats the management server will allow before it starts to ping the agent, and then click OK.

    Note

    The maximum value allowed for the number of missing heartbeats is 100. The minimum value allowed is 1.
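
If you plan values before entering them in the console, the limits stated in the notes above can be checked with a short script. This is a planning aid only, with hypothetical names; it does not read or change any Operations Manager setting.

```python
from typing import List

# Limits taken from the notes in this topic.
HEARTBEAT_INTERVAL_RANGE = (5, 86_400)   # seconds (5 seconds to 1 day)
MISSED_HEARTBEATS_RANGE = (1, 100)


def validate_heartbeat_settings(interval_seconds: int,
                                missed_allowed: int) -> List[str]:
    """Return a list of problems; an empty list means both values are in range."""
    problems: List[str] = []
    low, high = HEARTBEAT_INTERVAL_RANGE
    if not low <= interval_seconds <= high:
        problems.append(f"Heartbeat interval must be {low}-{high} seconds.")
    low, high = MISSED_HEARTBEATS_RANGE
    if not low <= missed_allowed <= high:
        problems.append(f"Missed heartbeats must be {low}-{high}.")
    return problems
```

Calling `validate_heartbeat_settings(60, 3)` returns an empty list, while `validate_heartbeat_settings(2, 0)` reports both values as out of range.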

Management Server-Specific Heartbeat Settings and Agent-Specific Heartbeat Settings

The following heartbeat settings can be configured for an individual agent or for an individual management server.

How to Override the Heartbeat Interval

Use the following procedure to override the agent heartbeat interval settings for a specific agent.

To override the heartbeat interval setting

  1. Log on to the Operations console with an account that is a member of the Administrators role for the Operations Manager 2007 management group.

  2. Click the Administration button.

  3. In the Administration pane, expand Administration, expand Device Management, and then click Agent Managed.

  4. In the results pane, right-click the agent-managed computer whose heartbeat interval you want to override, and then click Properties.

  5. In the Agent Properties dialog box, select Override global agent settings.

  6. Change the Heartbeat interval. The maximum value allowed for the heartbeat interval is 86,400 seconds (1 day). The minimum value allowed is 5 seconds.

  7. Click OK.

How to Override the Number of Missed Heartbeats a Management Server Will Tolerate

Use the following procedure to override the management group heartbeat failure setting for a specific management server. The override sets the number of missed heartbeats that the management server allows for an agent before it changes the state of the respective computer.

To override the number of missed heartbeats

  1. Log on to the Operations console with an account that is a member of the Administrators role for the Operations Manager 2007 management group.

  2. Click the Administration button.

  3. In the Administration pane, expand Administration, expand Device Management, and then click Management Servers.

  4. In the results pane, right-click the management server whose setting you want to override, and then click Properties.

  5. In the Management Server Properties dialog box, click the Heartbeat tab.

  6. On the Heartbeat tab, do the following:

    1. Select Override global server settings.

    2. Change the Number of missing heartbeats to the number that you want.

  7. Click OK.

Disabling Heartbeat Monitoring

You can disable heartbeat monitoring for all agents or for specified agents.

To disable monitoring for all agents

  1. Click Authoring to open the Authoring area.

  2. Expand Management Pack Objects, and then click Monitors.

  3. Find and right-click the Health Service Heartbeat Failure monitor for Health Service Watcher (Agent).

    Note

    You can also use Find Now to find the monitor.

  4. On the shortcut menu, point to Overrides, point to Disable the Monitor, and then click For all objects of the type: Health Service Watcher (Agent).

  5. Expand Management Pack Objects, and then click Rules.

  6. Find and right-click Heartbeat Failure - Success under Health Service Watcher Group (Agent).

  7. On the shortcut menu, point to Overrides, point to Disable the Rule, and then click For all objects of the type: Health Service Watcher (Agent).

  8. Repeat steps 6 and 7 to disable the rules Heartbeat Failure - Warning and Heartbeat Failure - Error.

To disable monitoring for a subset of managed systems, create a group by using the Create Group Wizard. For instance, to disable monitoring for the systems Server01.contoso.com, Server78.contoso.com, and Server99.contoso.com, create a group called Heartbeat Monitor Disabled Agents containing members of the type Health Service Watcher Group (Agent) and add the systems to that group.

For information about creating groups, see How to Create Groups in Operations Manager 2007.
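
The difference in scope between the two kinds of override can be modeled as follows. This is a conceptual sketch with hypothetical names: it only illustrates which agents remain subject to heartbeat monitoring and does not interact with Operations Manager. The group members are the example systems named above.

```python
from typing import AbstractSet, Optional


def heartbeat_monitoring_enabled(agent: str,
                                 disabled_for_all: bool = False,
                                 disabled_group: Optional[AbstractSet[str]] = None) -> bool:
    """Return True if the agent is still covered by heartbeat monitoring."""
    # An override for all objects of the type disables monitoring everywhere.
    if disabled_for_all:
        return False
    # An override for a group affects only the members of that group.
    if disabled_group is not None and agent in disabled_group:
        return False
    return True


# Example group: Heartbeat Monitor Disabled Agents
heartbeat_monitor_disabled_agents = frozenset({
    "Server01.contoso.com",
    "Server78.contoso.com",
    "Server99.contoso.com",
})
```

With the group override in place, `Server78.contoso.com` is no longer monitored for heartbeats, while a system outside the group is unaffected.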

To disable monitoring for a group of agents

  1. Click Authoring to open the Authoring area.

  2. Expand Management Pack Objects, and then click Monitors.

  3. Find and right-click the Health Service Heartbeat Failure monitor for Health Service Watcher (Agent).

    Note

    Use Find Now, if necessary, to find the monitor.

  4. On the shortcut menu, point to Overrides, point to Disable the Monitor, and then click For a group.

  5. In the Select Object dialog box, click the group to be used (in this example, Heartbeat Monitor Disabled Agents), and then click OK.

  6. Expand Management Pack Objects, and then click Rules.

  7. Find and right-click Heartbeat Failure - Success under Health Service Watcher Group (Agent).

  8. On the shortcut menu, point to Overrides, point to Disable the Rule, and then click For a group.

  9. In the Select Object dialog box, click the group to be used (in this example, Heartbeat Monitor Disabled Agents), and then click OK.

  10. Click Yes to confirm.

  11. Repeat steps 7 through 10 to disable the rules Heartbeat Failure - Warning and Heartbeat Failure - Error.