Service Monitoring and Control Overview

Service Monitoring and Control (SMC) is the real-time observation of and alerting about health conditions (characteristics that indicate success or failure) in an IT environment. It helps to ensure that deployed services are operated, maintained, and supported in line with the service level agreement (SLA) targets agreed to between the business and IT. For more information about SLAs and operations level agreements (OLAs), see the Business/IT Alignment SMF.

This SMF describes what is required to successfully implement SMC. The components of this process are:

  • Establishing a service monitoring function.
  • Understanding the nature of new and existing IT services.
  • Understanding the requirements for successful service monitoring tools.
  • Ensuring that all relevant information from service monitoring is acted upon by the appropriate people.
  • Generating all the information required by other SMFs.
  • Improving the quality of service information.

The importance of effective service monitoring cannot be overemphasized. If a service can’t be monitored, it can’t be measured, and if it can’t be measured, it can’t be managed.

Service Monitoring and Control Role Types

The primary team accountability that applies to the Service Monitoring and Control SMF is the Operations Accountability. The role types within that accountability and their primary activities within this SMF are displayed in the following table.

Table 1. Operations Accountability and Its Attendant Role Types

Role Type: Monitoring Manager

Responsibilities:

  • Responsible for SMC SMF tasks
  • Ensures that the right systems are monitored
  • Facilitates effective monitoring mechanism
  • Is expert on how to monitor, not what to monitor

Role in This SMF:

  • Monitors IT service health
  • Helps define IT service to be monitored
  • Helps prepare service component health model

Role Type: Scheduling Manager

Responsibilities:

  • Plans schedule of individual activities within operations
  • Owns timing decisions
  • Plans operational work, including maintenance

Role in This SMF:

  • Avoids conflicting work

Role Type: Operations Manager

Responsibilities:

  • Is accountable for Operations SMF and Service Monitoring and Control

Role in This SMF:

  • Oversees
  • Drives definition of IT service to be monitored
  • Drives preparation of service component health model

Goals of Service Monitoring and Control

The goals of service monitoring and control include the following:

  • Observe the health of IT services.
  • Take remedial actions that minimize the impact of service incidents and system events.
  • Understand the infrastructure components responsible for the delivery of services.
  • Provide data on component or service trends that can be used to optimize the performance of IT services.

Table 2. Outcomes and Measures of the Service Monitoring and Control SMF Goals

  • Outcome: Improved overall availability of services
    Measure: Percent of time service is available

  • Outcome: A reduction in the number of SLA and OLA breaches
    Measure: Number of breaches to SLAs and OLAs

  • Outcome: A reduction or prevention of service incidents through the use of proactive remedial action
    Measure: Number of service incidents
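
As an illustration of the first two measures in Table 2, the following minimal Python sketch computes the percent of time a service is available over a reporting period and checks it against an availability target. The names `Outage`, `availability_percent`, and `breaches_target`, the minute-based time unit, and the 99.5 percent target are assumptions made for this example and do not come from this SMF. The sketch also assumes that outages do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outage:
    """One period of unavailability, in minutes from the start of the reporting period."""
    start_minute: int
    end_minute: int


def availability_percent(period_minutes: int, outages: list[Outage]) -> float:
    """Return the percent of the reporting period during which the service was available.

    Assumes the outages do not overlap and fall inside the period.
    """
    downtime = sum(o.end_minute - o.start_minute for o in outages)
    return 100.0 * (period_minutes - downtime) / period_minutes


def breaches_target(availability: float, target_percent: float) -> bool:
    """Return True when measured availability falls below the agreed target."""
    return availability < target_percent


# Example: a 30-day period (43,200 minutes) with two outages totaling 432 minutes.
period = 30 * 24 * 60
outages = [Outage(1000, 1300), Outage(20000, 20132)]
measured = availability_percent(period, outages)
breached = breaches_target(measured, target_percent=99.5)  # hypothetical SLA target
```

In this example the service was unavailable for 1 percent of the period, so a hypothetical 99.5 percent target would count as one breach toward the second measure.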

Key Terms

The following table contains definitions of key terms found in this guide.

Table 3. Key Terms

Term: Definition

  • Aggregation: A function that makes it possible to treat a series of similar events as a single event
  • Alert: A notification that an event requiring attention has occurred
  • Configuration item (CI): An IT component that is under configuration management control
  • Correlation: A function that groups events together or defines an event’s relationship with other events that together represent an impact
  • Event: An occurrence within the IT environment detected by a monitoring tool
  • Health model: A definition of CI health categorized by availability, configuration, performance, or security
  • IT Control: A specific activity performed by people or systems designed to ensure that business objectives are met
  • Reporting: The collection, production, and distribution of information about IT services
  • Resolution completion: The point in the control process where manual/automatic action has been taken and all recording and incident management have been completed
  • Rules: A predetermined policy that describes the provider (the source of data), the criteria (used to identify a matching condition), and the response (the execution of an action)
  • Threshold/criteria: A configurable value above which something is true and below which it is not
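
To make several of these terms concrete, the following minimal Python sketch shows one way that events, rules, thresholds, alerts, and aggregation could fit together. It is an illustration only, not the design of any particular monitoring tool. The class and function names (`Event`, `Rule`, `above_threshold`, `evaluate`, `aggregate`), the CI names `web01` and `db01`, the `cpu_percent` metric, and the threshold of 90 are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    """An occurrence detected by a monitoring tool."""
    source: str   # the CI that produced the event (hypothetical names below)
    metric: str
    value: float


@dataclass
class Rule:
    """A policy made up of a provider, criteria, and a response."""
    provider: str                      # the source of data the rule applies to
    criteria: Callable[[Event], bool]  # identifies a matching condition
    response: Callable[[Event], str]   # the action taken; here it builds an alert


def above_threshold(metric: str, threshold: float) -> Callable[[Event], bool]:
    """Build criteria that are true above the threshold and false at or below it."""
    def check(event: Event) -> bool:
        return event.metric == metric and event.value > threshold
    return check


def evaluate(rules: list[Rule], events: list[Event]) -> list[str]:
    """Run each rule's response for every event from its provider that meets its criteria."""
    alerts = []
    for event in events:
        for rule in rules:
            if rule.provider == event.source and rule.criteria(event):
                alerts.append(rule.response(event))
    return alerts


def aggregate(alerts: list[str]) -> Counter:
    """Treat a series of identical alerts as a single entry with a count."""
    return Counter(alerts)


cpu_rule = Rule(
    provider="web01",
    criteria=above_threshold("cpu_percent", 90.0),
    response=lambda e: f"ALERT {e.source}: {e.metric} above threshold",
)
events = [
    Event("web01", "cpu_percent", 95.0),
    Event("web01", "cpu_percent", 97.0),
    Event("web01", "cpu_percent", 40.0),  # below the threshold, no alert
    Event("db01", "cpu_percent", 99.0),   # different provider, rule does not apply
]
alerts = evaluate([cpu_rule], events)
summary = aggregate(alerts)
```

Here the two matching events from `web01` produce the same alert text, so aggregation reports them as a single entry with a count of two rather than as two separate notifications.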
