Automating Monitoring with the Exchange Management Pack

 

The components in the Exchange Management Pack can help you detect and respond to critical events. Frequently, timely alerts help prevent Exchange service outages, but preventing outages is not the only objective of system monitoring. You can also use the Exchange Management Pack to automate the following monitoring tasks:

  • Collecting Event and Performance Information   The Exchange Management Pack collects monitoring information from several sources, which are defined by individual rules. For example, a rule that checks available disk space might require information from System Monitor, event data generated by other scripts, and Windows event logs. Each rule defines the provider of the information that it uses. Information providers to Microsoft Operations Manager 2005 include the following:

    • Event logs   Warnings and errors, in addition to some informational events, are used to collect information. Rules search for predefined event numbers to obtain system information.

    • Event and performance information gathered through MOM and MOM scripts   The Exchange Management Pack includes its own methods and scripts to collect information about the Exchange 2003 organization and system.

    • System Monitor   The Exchange Management Pack can collect information from System Monitor objects and counters. The counters and their uses are explained in topic (App B).

    • Simple Network Management Protocol (SNMP)   SNMP is an application layer protocol that is part of the TCP/IP protocol suite. It provides a means to manage network performance, resolve problems, and improve capacity.

    • Windows Management Instrumentation (WMI)   WMI is a component of Windows Server 2003 and Windows XP that provides information about software and hardware components. Through WMI, you can query and set information about hardware, systems, services and applications, links, networks, and components of your infrastructure.

  • Maintaining System Health   To keep your Exchange 2003 organization running smoothly and with high availability, you must continuously maintain the overall health of your infrastructure. This involves not only the components specific to Exchange 2003, but also the underlying physical network topology, Active Directory deployment, and operating system configuration. MOM can examine your network's configuration and produce reports and alerts to help improve system health. At a minimum, you should deploy the Active Directory Management Pack together with the Exchange Management Pack, so that you can monitor Active Directory in addition to the components specific to Exchange 2003. A functioning Active Directory environment is required for Exchange Server 2003 operations. Also consider deploying supplemental management packs such as the DNS and Internet Information Services (IIS) management packs.

    Some methods to monitor system health have been discussed already. You can use reports specific to the Exchange Management Pack, such as the reports from the Health Monitoring and Operations, Capacity Planning, and Traffic Analysis categories in the Exchange Server reporting category of MOM Reporting.

    Another method of monitoring system health is to develop an escalation ticket system to deal with and resolve critical system states.

    MOM includes several default alert resolution states. You can view or modify these states in the MOM 2005 Administrator Console by navigating to Microsoft Operations Manager\Administration\Global Settings and opening the properties of the Alert Resolution States object. The default alert resolution states are shown in the following table.

    Resolution state                                  ID     Service level agreement
    New                                               0      10 minutes
    Acknowledged                                      85     20 minutes
    Level 1: Assigned to helpdesk or local support    170    4 hours
    Level 2: Assigned to subject matter expert        180    2 days
    Level 3: Requires scheduled maintenance           190    7 days
    Level 4: Assigned to external group or vendor     200    30 days
    Resolved                                          255    --

    A ticket system helps make sure that critical issues are tracked and their resolution assigned for the most appropriate response.
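As a rough illustration of how a ticket system might use the default resolution states, the following Python sketch encodes the table above and checks whether an alert has stayed in a state longer than its service level agreement. This is not MOM code or a MOM API: the names `RESOLUTION_STATES` and `sla_breached` are hypothetical, and the sketch assumes the agreement is measured from the moment the alert enters the state.

```python
from datetime import datetime, timedelta

# Default alert resolution states from the table above:
# state ID -> (name, service level agreement). "Resolved" has no agreement.
RESOLUTION_STATES = {
    0: ("New", timedelta(minutes=10)),
    85: ("Acknowledged", timedelta(minutes=20)),
    170: ("Level 1: Assigned to helpdesk or local support", timedelta(hours=4)),
    180: ("Level 2: Assigned to subject matter expert", timedelta(days=2)),
    190: ("Level 3: Requires scheduled maintenance", timedelta(days=7)),
    200: ("Level 4: Assigned to external group or vendor", timedelta(days=30)),
    255: ("Resolved", None),
}


def sla_breached(state_id, entered_state_at, now):
    """Return True if an alert has stayed in a state longer than its agreement.

    Assumption: the clock starts when the alert enters the state.
    """
    _name, sla = RESOLUTION_STATES[state_id]
    if sla is None:  # Resolved alerts cannot breach an agreement.
        return False
    return now - entered_state_at > sla


if __name__ == "__main__":
    entered = datetime(2005, 1, 3, 9, 0)
    print(sla_breached(0, entered, entered + timedelta(minutes=11)))
```

A scheduled job could run a check like this over open tickets and escalate any that return True to the next level.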

    You can also use warnings and notifications to alert you to future problems. For example, you can track CPU usage and create an alert when usage exceeds a certain threshold. However, not every such alert is valuable: a temporary spike in user activity can trigger the alert even though no underlying problem exists.
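One common way to keep a brief spike from raising an alert is to require several consecutive samples above the threshold. The sketch below shows that idea in Python. It is an illustration only, not how the Exchange Management Pack rules are implemented; the function name, the threshold of 90, and the run length of 3 are assumptions chosen for the example.

```python
def sustained_breach(samples, threshold, min_consecutive):
    """Return True if `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run on a breach; otherwise start over.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


if __name__ == "__main__":
    brief_spike = [35, 97, 40, 42]        # one high sample, then back to normal
    sustained = [35, 40, 91, 93, 95, 96]  # usage stays high
    print(sustained_breach(brief_spike, threshold=90, min_consecutive=3))
    print(sustained_breach(sustained, threshold=90, min_consecutive=3))
```

The run length trades detection speed for fewer false alarms: a longer run suppresses more spikes but delays the alert for a real problem.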

  • Addressing issues proactively   The Exchange Management Pack makes it easier to address problem areas before an actual problem occurs. You can monitor e-mail flow, logon failures, service disruptions, database availability, queue abnormalities, routing and transport failures, and the availability of the underlying network and hardware.

  • Capacity planning   Monitoring system growth is especially important in large enterprise environments with many users. The Exchange Management Pack aids capacity planning by enabling you to store reports and generate baselines for comparison. Because it tracks disk use, processor use, user logons, incoming and outgoing messages delivered, and other Exchange components in a standardized way, side-by-side comparisons are easier to make. For example, you can view the free disk space available each month to understand the growth rate of your organization, and use that data to anticipate and plan for future requirements.
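To show how monthly free-disk-space figures can feed a capacity estimate, the following Python sketch fits a least-squares line to the samples and projects when free space would reach zero. It assumes the figures have already been read from a report; the function name and the sample numbers are hypothetical, and a straight-line fit is a simplification that real growth may not follow.

```python
def months_until_full(free_gb_by_month):
    """Estimate months until free space reaches zero, from the last sample.

    Fits a least-squares line to equally spaced monthly samples.
    Returns None when free space is not shrinking.
    """
    n = len(free_gb_by_month)
    if n < 2:
        raise ValueError("need at least two monthly samples")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(free_gb_by_month) / n
    covariance = sum((x - mean_x) * (y - mean_y)
                     for x, y in zip(xs, free_gb_by_month))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance  # change in free GB per month
    if slope >= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Month index where the fitted line crosses zero, relative to the last sample.
    return -intercept / slope - (n - 1)


if __name__ == "__main__":
    # Hypothetical samples: free space shrinking by 10 GB each month.
    print(months_until_full([100, 90, 80, 70]))
```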

  • Alerting administrators about critical states   As previously mentioned, the Exchange Management Pack can send alerts about critical system states, such as Exchange service problems or denial-of-service attacks, to an administrator. The default notification group for rule responses in the Exchange Management Pack is Mail Administrators, found under Rules/Notification Groups/. You can add operators to this group and specify how each operator is notified: by page, by e-mail, or by external command. You can also configure the times at which each operator is notified. This is useful when you must have continuous monitoring in an organization that operates on a shift basis. In that case, each operator can be notified only during a specific time range and on specified days (that is, during their shift).
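The shift-based notification described above can be pictured with a small Python model. In MOM these schedules are configured per operator in the Administrator console rather than in code, so everything here (the `Operator` class, `on_shift`, `operators_to_notify`, and the sample shifts) is a hypothetical sketch of the selection logic only.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Operator:
    name: str
    days: frozenset  # weekdays on which a shift starts, Monday=0 ... Sunday=6
    start: time      # shift start (inclusive)
    end: time        # shift end (exclusive); earlier than start for overnight shifts


def on_shift(op, when):
    """Return True if `when` falls inside one of the operator's shifts."""
    t = when.time()
    if op.start <= op.end:
        return when.weekday() in op.days and op.start <= t < op.end
    # Overnight shift: hours after midnight belong to the previous day's shift.
    if t >= op.start:
        return when.weekday() in op.days
    if t < op.end:
        return (when.weekday() - 1) % 7 in op.days
    return False


def operators_to_notify(operators, when):
    """Return the names of the operators who are on shift at `when`."""
    return [op.name for op in operators if on_shift(op, when)]


if __name__ == "__main__":
    weekdays = frozenset({0, 1, 2, 3, 4})
    team = [
        Operator("day", weekdays, time(8), time(16)),
        Operator("night", weekdays, time(22), time(6)),
    ]
    # 3 January 2005 was a Monday.
    print(operators_to_notify(team, datetime(2005, 1, 3, 9, 0)))
```

Modeling the overnight case separately matters: a shift that starts on Friday at 22:00 still covers early Saturday morning even though Saturday is not a scheduled day.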