Service Monitoring and Control Overview
Service Monitoring and Control (SMC) is the real-time observation of and alerting about health conditions (characteristics that indicate success or failure) in an IT environment. It helps to ensure that deployed services are operated, maintained, and supported in line with the service level agreement (SLA) targets agreed to between the business and IT. For more information about SLAs and operations level agreements (OLAs), see the Business/IT Alignment SMF.
This SMF describes what is required to successfully implement SMC. The components of this process are:
- Establishing a service monitoring function.
- Understanding the nature of new and existing IT services.
- Understanding the requirements for successful service monitoring tools.
- Ensuring that all relevant information from service monitoring is acted upon by the appropriate people.
- Generating all the information required by other SMFs.
- Improving the quality of service information.
The importance of effective service monitoring cannot be overemphasized. If a service can’t be monitored, it can’t be measured, and if it can’t be measured, it can’t be managed.
Service Monitoring and Control Role Types
The primary team accountability that applies to the Service Monitoring and Control SMF is the Operations Accountability. The role types within that accountability and their primary activities within this SMF are displayed in the following table.
Table 1. Operations Accountability and Its Attendant Role Types
Role Type: Monitoring Manager
Responsibilities:
- Responsible for SMC SMF tasks
- Ensures that the right systems are monitored
- Facilitates effective monitoring mechanism
- Is expert on how to monitor, not what to monitor
Role in This SMF:
- Monitors IT service health
- Helps define IT service to be monitored
- Helps prepare service component health model
Role Type: Scheduling Manager
Responsibilities:
- Plans schedule of individual activities within operations
- Owns timing decisions
Role in This SMF:
- Plans operational work, including maintenance
- Avoids conflicting work
Role Type: Operations Manager
Responsibilities:
- Is accountable for Operations SMF and Service Monitoring and Control
- Oversees
Role in This SMF:
- Drives definition of IT service to be monitored
- Drives preparation of service component health model
Goals of Service Monitoring and Control
The goals of service monitoring and control include the following:
- Observe the health of IT services.
- Take remedial actions that minimize the impact of service incidents and system events.
- Understand the infrastructure components responsible for the delivery of services.
- Provide data on component or service trends that can be used to optimize the performance of IT services.
Table 2. Outcomes and Measures of the Service Monitoring and Control SMF Goals
Outcome: Improved overall availability of services
Measure: Percent of time service is available
Outcome: A reduction in the number of SLA and OLA breaches
Measure: Number of breaches to SLAs and OLAs
Outcome: A reduction or prevention of service incidents through the use of proactive remedial action
Measure: Number of service incidents
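As a rough illustration of how the first two measures might be computed, the following Python sketch works from a hypothetical outage log and hypothetical availability targets. The `Outage` record, the function names, the "mail" service, and the 99.9 percent target are assumptions made for this example and are not defined by this SMF.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One period of service unavailability (hypothetical record)."""
    service: str
    minutes: float


def availability_percent(period_minutes: float, outages: list[Outage]) -> float:
    """Percent of the reporting period during which the service was available."""
    downtime = sum(o.minutes for o in outages)
    return 100.0 * (period_minutes - downtime) / period_minutes


def count_breaches(measured: dict[str, float], targets: dict[str, float]) -> int:
    """Number of services whose measured availability is below the agreed target.

    Assumption: a service with a target but no measurement counts as a breach.
    """
    return sum(
        1 for name, target in targets.items()
        if measured.get(name, 0.0) < target
    )


# A 30-day reporting period with two outages for a hypothetical "mail" service.
period = 30 * 24 * 60  # minutes in the reporting period
mail_outages = [Outage("mail", 90), Outage("mail", 126)]
mail_availability = availability_percent(period, mail_outages)

# Compare the measured value with an assumed 99.9 percent SLA target.
breaches = count_breaches({"mail": mail_availability}, {"mail": 99.9})
```

In this sketch, 216 minutes of downtime in a 43,200-minute period gives 99.5 percent availability, which falls below the assumed target and is counted as one breach.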
Key Terms
The following table contains definitions of key terms found in this guide.
Table 3. Key Terms
Aggregation: A function that makes it possible to treat a series of similar events as a single event
Alert: A notification that an event requiring attention has occurred
Configuration item (CI): An IT component that is under configuration management control
Correlation: A function that groups events together or defines an event’s relationship with other events that together represent an impact
Event: An occurrence within the IT environment detected by a monitoring tool
Health model: A definition of CI health categorized by availability, configuration, performance, or security
IT Control: A specific activity performed by people or systems designed to ensure that business objectives are met
Reporting: The collection, production, and distribution of information about IT services
Resolution completion: The point in the control process where manual/automatic action has been taken and all recording and incident management have been completed
Rules: A predetermined policy that describes the provider (the source of data), the criteria (used to identify a matching condition), and the response (the execution of an action)
Threshold/criteria: A configurable value above which something is true and below which it is not
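To show how several of these terms relate to one another, the following Python sketch models a rule as a provider, a criteria, and a response, uses a threshold as the criteria, and aggregates the resulting alerts. It is one possible reading of the definitions, not a description of any particular monitoring tool. All class names, field names, provider names, and the 90 percent threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Event:
    """An occurrence detected by a monitoring tool (fields are illustrative)."""
    source: str   # provider that reported the event
    ci: str       # configuration item the event relates to
    metric: str
    value: float


@dataclass
class Rule:
    """A rule: provider (data source), criteria (match test), response (action)."""
    provider: str
    criteria: Callable[[Event], bool]
    response: Callable[[Event], str]

    def evaluate(self, event: Event) -> Optional[str]:
        """Return the response for a matching event, or None if it does not match."""
        if event.source == self.provider and self.criteria(event):
            return self.response(event)
        return None


def above_threshold(metric: str, threshold: float) -> Callable[[Event], bool]:
    """Criteria that is true when the named metric exceeds the threshold."""
    return lambda e: e.metric == metric and e.value > threshold


def aggregate(alerts: list[str]) -> dict[str, int]:
    """Aggregation: treat identical alerts as one entry with an occurrence count."""
    counts: dict[str, int] = {}
    for alert in alerts:
        counts[alert] = counts.get(alert, 0) + 1
    return counts


cpu_rule = Rule(
    provider="perf-counter",
    criteria=above_threshold("cpu_percent", 90.0),
    response=lambda e: f"ALERT: {e.metric} high on {e.ci}",
)

events = [
    Event("perf-counter", "web01", "cpu_percent", 95.0),
    Event("perf-counter", "web01", "cpu_percent", 97.5),
    Event("perf-counter", "web01", "cpu_percent", 42.0),  # below the threshold
    Event("event-log", "web01", "cpu_percent", 99.0),     # different provider
]

# Keep only the events that the rule turns into alerts, then aggregate them.
alerts = [a for e in events if (a := cpu_rule.evaluate(e)) is not None]
summary = aggregate(alerts)
```

Here two of the four events match the rule, so `summary` holds a single alert text with a count of 2. Correlation, which relates different events to a common impact, is deliberately left out to keep the sketch short.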