Appendix A - Microsoft Operations Framework
Microsoft Operations Framework (MOF) is a collection of best practices, principles, and models. It provides comprehensive technical guidance for achieving mission-critical production system reliability, availability, supportability, and manageability for solutions and services built on Microsoft products and technologies. This guidance is presented in the form of white papers, service management guides, assessment tools, operations kits, best practices, case studies, and support tools that address the people, processes, and technologies for effectively managing production systems within today's complex distributed IT environment.
MOF Core Concepts
MOF directly addresses the two core concepts that IT organizations need to effectively manage in order to be successful. Those core concepts are service solutions and IT service management.
Service solutions include those business services that IT provides to customers and users. These may include line of business (LOB) applications, messaging, e-commerce infrastructure, print services, data storage, and others.
IT service management entails the IT functions or processes that must be performed in order to manage and maintain each of the aforementioned service solutions. These IT functions or processes are termed service management functions (SMFs) and include configuration management, change management, service desk, capacity management, and others. MOF currently recognizes a total of 20 individual service management functions, each of which is thoroughly described in an SMF-specific technical document. A brief description of each of the SMFs is provided later in this overview.
MOF Core Models
MOF principles and guidance are also organized around three core models, each of which is manifested within the individual SMFs. These three models (the MOF team model, the MOF process model, and the MOF risk model) are introduced in the following sections. In-depth discussions of each model are available through technical guides located at http://www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx.
The MOF Process Model
IT operations encompass a complex, dynamic set of procedures and processes that are extremely difficult to define and capture with a high degree of accuracy. Modeling such endeavors to an extreme level of precision would be cost prohibitive and generally inappropriate because the processes must typically be adapted locally anyway.
MOF reduces this complex set of dynamics to an easy-to-understand framework whose principles and practices are straightforward to incorporate and apply in the IT environment. This simplified approach enables the operations staff of an enterprise of any size, regardless of maturity level, to realize tangible benefits for existing or proposed operations.
The MOF process model supports the successful provision of IT services by addressing four key principles.
Structured architecture. The process model provides the structure for process integration, life cycle management, mapping of roles and responsibilities, and overall management command and control. It also provides the underlying foundation for process automation and technology-specific operations.
Rapid life cycle, iterative improvement. To help organizations stay competitive in an aggressive business environment, MOF uses an iterative life cycle that supports both incorporating change quickly and continuously assessing and improving the overall operations environment. Recognizing that operations do not follow a sequential set of phases, as the typical IT development project (and MSF) does, the MOF process model categorizes key operational activities into quadrants that emphasize a spiral life cycle, with parallel processes occurring simultaneously.
Review-driven management. To aid in managing the operations environment, MOF recommends and describes many methods and techniques and delivers them through the service management functions (SMFs). Despite this comprehensive prescriptive guidance, rote application of these SMFs alone is insufficient to extract maximum benefit from IT investments. To provide this benefit, MOF institutes higher-level management reviews at key points in the life cycle. These reviews are held to evaluate performance for release-based activities as well as steady-state or daily operational activities.
Embedded risk management. Implementing IT service management functions can be seen as a form of risk management. Service management focuses on implementing functions that, among other benefits, reduce the risk of service outages. However, this provides too narrow a view of risk and how it needs to be managed. To enable a broader view, MOF has adopted a comprehensive risk model, whose concepts are incorporated throughout IT operations.
The Four MOF Quadrants
Applying these key principles, the MOF process model is divided into four highly integrated quadrants of operational activity: Changing, Operating, Supporting, and Optimizing.
Each of the quadrants has a unique service mission that is related to specific aspects of the IT life cycle. Each quadrant's service mission is accomplished through the implementation and execution of underlying operational processes and activities called service management functions. Each of the SMFs is primarily associated with a particular quadrant (although overlap can occur); this association is presented later in this appendix.
MOF incorporates specifically tailored management reviews into the process model to assess the operational effectiveness of each quadrant, including the underlying service management functions. The management reviews are:
Release Readiness Review
Service Level Agreement (SLA) Review
Release Approved Review
The following diagram illustrates the MOF process model. It shows the SMFs and the operational reviews that are associated with each of the four quadrants.
The MOF Team Model
The evolution of the MOF team model began with a review of the objectives of IT management and the best practices applicable to meeting those objectives. Quality goals were established to provide a means of measuring an organization's success in achieving them. The goals and the roles developed to achieve them are detailed as follows:
Table A.1 MOF Team Role Quality Goals
Team role cluster: Quality goal
Release: Well-developed release and change management processes and accurate inventory tracking of all IT services and systems.
Infrastructure: Efficient management of physical environments and infrastructure tools.
Support: Quality, cost-effective customer support and a service culture.
Operations: Predictable, repeatable, and automated day-to-day system management.
Security: Protected corporate assets, controlled access to systems and information, and proactive planning for emergency response.
Partner: Efficient and cost-effective, mutually beneficial relationships with service and supply partners.
Note that the team model that MOF developed for this environment looks and runs very differently from traditional, centralized-computing, hierarchical data center operations teams. The roles balance process ownership and functional organization to reflect the conflicting requirements imposed in today's decentralized environments, where both processes and functions may cross geographical, business unit, time zone, and even company boundaries.
The MOF team model describes:
Best practices for using role clusters to structure operations teams.
The key activities and competencies of each role cluster.
How to scale the teams for different sizes and types of organizations.
Which role combinations work well and which do not.
Which guiding principles are the most successful for running and operating distributed computing environments on the Microsoft platform.
How the MOF Team Model relates to the MSF Team Model.
These principles, together with the quality goals noted previously, combine to define the individual roles comprising the MOF Team Model. The six roles of the team model should be interpreted as high-level operations role clusters, each of which is responsible for performing several core operations duties.
Most often, the roles are distributed among different groups within the IT organization and, sometimes, within the business user community, as well as among external consultants and partners. As a general good practice for any team structure, roles and responsibilities need to be carefully defined and explicitly communicated so that deliverables and expectations are clear to everyone, thereby creating an environment conducive to focused individual work and quality contribution to the team effort.
Detailed descriptions of each of the MOF team roles and responsibilities are provided in the white paper, MOF Team Model, available in the MOF Resource Library at http://www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx.
The MOF Risk Model
The MOF Risk Model is essentially the same as the risk model employed in the MSF Risk Management Discipline, which is described in the risk management section of the overview of MSF disciplines. A common approach to risk management is one aspect of the complementary nature of the two frameworks.
MOF Quadrants, SMFs, and Migration Operations
This section of the document describes each quadrant and the service management functions that fall within it. It then discusses operating the migration in the context of the appropriate SMFs.
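The association between quadrants and SMFs that this section describes can be captured in a small lookup structure. The following is a minimal sketch, not part of MOF itself: the names MOF_QUADRANTS, quadrant_of, and total_smfs are illustrative assumptions, and the SMF names are taken from the descriptions in this appendix.

```python
# Quadrant-to-SMF association as listed in this appendix.
# The dictionary name and layout are illustrative, not defined by MOF.
MOF_QUADRANTS = {
    "Changing": [
        "Change Management",
        "Configuration Management",
        "Release Management",
    ],
    "Operating": [
        "System Administration",
        "Security Administration",
        "Service Monitoring and Control",
        "Job Scheduling",
        "Network Administration",
        "Directory Services Administration",
        "Print and Output Management",
        "Storage Management",
    ],
    "Supporting": [
        "Service Desk",
        "Incident Management",
        "Problem Management",
    ],
    "Optimizing": [
        "Service Level Management",
        "Capacity Management",
        "Availability Management",
        "Financial Management",
        "Workforce Management",
        "Service Continuity Management",
    ],
}


def quadrant_of(smf: str) -> str:
    """Return the quadrant an SMF is primarily associated with."""
    for quadrant, smfs in MOF_QUADRANTS.items():
        if smf in smfs:
            return quadrant
    raise KeyError(f"unknown SMF: {smf}")


# Count the SMFs across all four quadrants.
total_smfs = sum(len(smfs) for smfs in MOF_QUADRANTS.values())
```

Summing the list lengths gives the total of 20 SMFs stated earlier in this appendix. Because overlap between quadrants can occur in practice, a real mapping might need to allow an SMF to appear under more than one quadrant.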
The Changing Quadrant includes the processes and procedures required to identify, review, approve, and incorporate change into a managed IT environment. Changes can affect hardware and software assets as well as specific processes and procedures. The objective of the change process is to introduce changes into the IT environment quickly and with minimal disruption to service.
Service Management Functions
Change Management has the objective of introducing change into the IT environment quickly and with minimal disruption to service. Change management should be applied to any asset in the environment that is necessary for meeting the service level requirements of the solution. The change management SMF utilizes such processes, artifacts, and authorities as change controls, requests for changes, and the change advisory board.
Configuration Management is responsible for identifying, recording, tracking, and reporting of key IT components or assets called configuration items (CIs). CIs typically correspond to each of the assets placed under the control of the change management SMF. The configuration management SMF is concerned with establishing, maintaining, and managing the CIs and the configuration management database (CMDB) where CI information is recorded.
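To make the relationship between CIs, the CMDB, and change management more concrete, the following is a minimal in-memory sketch. It is an illustration under stated assumptions rather than a description of any MOF tool: the class names, field names, and the "RFC-1" change reference are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    """One tracked asset (CI). Field names are illustrative assumptions."""
    ci_id: str
    kind: str  # for example "server" or "application"
    attributes: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # (change_ref, old, new)


class CMDB:
    """A toy in-memory configuration management database."""

    def __init__(self) -> None:
        self._items: dict[str, ConfigurationItem] = {}

    def register(self, item: ConfigurationItem) -> None:
        """Identify and record a new CI; duplicate IDs are rejected."""
        if item.ci_id in self._items:
            raise ValueError(f"duplicate CI: {item.ci_id}")
        self._items[item.ci_id] = item

    def get(self, ci_id: str) -> ConfigurationItem:
        return self._items[ci_id]

    def apply_change(self, ci_id: str, change_ref: str, **updates) -> None:
        """Record attribute updates made under a change request."""
        item = self._items[ci_id]
        old = {key: item.attributes.get(key) for key in updates}
        item.attributes.update(updates)
        item.history.append((change_ref, old, dict(updates)))

    def report(self, kind: str) -> list[str]:
        """Return the sorted IDs of all CIs of the given kind."""
        return sorted(ci.ci_id for ci in self._items.values() if ci.kind == kind)


# Hypothetical usage: record a server, then track a change made to it.
cmdb = CMDB()
cmdb.register(ConfigurationItem("srv-001", "server", {"memory_gb": 2}))
cmdb.apply_change("srv-001", "RFC-1", memory_gb=4)
```

Keeping the change reference alongside the old and new values is one way to let every CI modification be traced back to the request that authorized it.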
Release Management is the process of coordinating and managing releases to a live environment. This process ensures that releases are implemented in the live environment as quickly as possible to meet business requirements. It also ensures that releases are implemented in a controlled and systematic way that limits negative impacts to the IT environment.
The Operating Quadrant includes the IT operating standards, processes, and procedures that are applied regularly to service solutions to achieve and maintain service levels within predetermined parameters. The goal of the Operating Quadrant is highly predictable manual and automated execution of day-to-day tasks. Eight SMFs support this goal.
Service Management Functions
System Administration is responsible for keeping the enterprise systems running. This SMF oversees the entire distributed processing environment. It coordinates the activities of the other SMFs within the quadrant and ensures that the SMFs are performed efficiently and effectively.
Security Administration maintains a safe computing environment, ensuring that all data within the system is secure and complete. The SMF is responsible for the six basic requirements that help ensure data confidentiality, integrity, and availability. These six basic requirements are: user identification, authentication, access control or authorization, confidentiality, system integrity, and non-repudiation.
The Service Monitoring and Control SMF allows the operations staff to observe the health of an IT service in real time. System components that are typically monitored to ensure that IT service remains available include process heartbeats, job and queue status, server response times and resource loading, and others. The control portion of the SMF refers to the notifications or actions it provides to ensure that appropriate actions are taken in response to indicators of system failure or diminished performance.
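The monitoring and control halves of this SMF can be illustrated with a small threshold check. This is a sketch only: the Sample fields, the threshold values, and the alert wording are hypothetical, and in practice the limits would be derived from the applicable service level agreement.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of a monitored component (illustrative fields)."""
    component: str
    seconds_since_heartbeat: float
    response_time_ms: float


# Hypothetical thresholds; real values would come from the SLA or OLA.
MAX_HEARTBEAT_AGE_S = 60.0
MAX_RESPONSE_TIME_MS = 500.0


def evaluate(sample: Sample) -> list:
    """Return the notifications that this sample should trigger."""
    alerts = []
    if sample.seconds_since_heartbeat > MAX_HEARTBEAT_AGE_S:
        alerts.append(f"{sample.component}: heartbeat missed")
    if sample.response_time_ms > MAX_RESPONSE_TIME_MS:
        alerts.append(f"{sample.component}: slow response")
    return alerts
```

A healthy sample yields an empty list, while a component that has both stopped reporting and slowed down yields two notifications that operations staff or an automated action could respond to.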
Job Scheduling involves the continuous organization of jobs and processes into the most efficient sequence. The intent of this function is to maximize system throughput and utilization to meet service level agreement and user requirements, and to optimize the use of available capacity. Job scheduling is closely tied to service monitoring and control and to capacity management.
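As one illustration of how sequencing affects throughput, the sketch below orders jobs shortest first on a single resource, a classic heuristic that minimizes the average time jobs wait to start. MOF does not prescribe this or any other algorithm; the job names and run-time estimates are hypothetical, and real schedulers also weigh dependencies, priorities, and capacity.

```python
def shortest_job_first(jobs: dict) -> list:
    """Order job names by ascending estimated run time (ties by name)."""
    return sorted(jobs, key=lambda name: (jobs[name], name))


def mean_wait(order: list, jobs: dict) -> float:
    """Average time each job waits before starting on a single resource."""
    elapsed = 0.0
    total_wait = 0.0
    for name in order:
        total_wait += elapsed  # this job waited for everything before it
        elapsed += jobs[name]
    return total_wait / len(order)


# Hypothetical nightly batch jobs with estimated run times in minutes.
batch = {"backup": 30.0, "report": 5.0, "index": 10.0}
order = shortest_job_first(batch)
```

With these estimates the short report and index jobs run before the long backup, so the average wait is lower than it would be if the jobs ran in the order they were submitted.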
Network Administration is the process of managing and running all networks within an organization. A comprehensive discipline that must manage people, processes, and the technologies with which they interact, it is responsible for the maintenance of the physical components (such as servers, routers, switches, and firewalls) that make up the organization's network.
Directory Services Administration allows users and applications to find such network resources as users, servers, applications, tools, services, and other information over the network. The goal of directory services administration is to ensure that information is accessible through the network by any authorized requester by means of a simple and organized process.
Print and Output Management is concerned with all data that is printed or compiled into reports that are distributed to various members of the organization. The print and output management team must ensure that any sensitive printed material is properly secured. Its goal is to control the production and distribution of data and to report output in line with service level agreements.
Storage Management deals with on-site and off-site data storage for the purposes of data restoration and historical archiving. The storage management team must ensure the physical security of backups and archives. Its goal is to define, track, and maintain data and data resources in the production IT environment through appropriate planning, policy setting, and monitoring for the management of storage assets.
The key objective of the Supporting Quadrant is the timely resolution of incidents, problems, and inquiries for end users of the IT services provided. SMFs within the Supporting Quadrant use both reactive and proactive functions to manage services in accordance with service level agreements. Three SMFs belong in this quadrant.
Service Management Functions
The Service Desk is the key SMF of the Supporting Quadrant. It coordinates all activities and customer communications about incidents, problems, and inquiries related to production systems. It is the single point of contact between service providers and customers/users on a day-to-day basis. In particular, Service Desk interacts closely with Incident Management in performing their respective functions and procedures.
Incident Management is the process of managing and controlling faults and disruptions in the use or implementation of IT services as reported by customers or IT partners. The primary goal of this SMF is to restore normal service operation as quickly as possible and minimize the adverse impact on business operations, thus maintaining the best possible levels of service quality and availability within the limits of the service level agreement.
Problem Management seeks to ensure stability in service solutions by identifying and removing errors in the IT infrastructure. Problem Management is responsible for clearly defining the overall support model used, escalation procedures, incident correlation, root cause analysis, problem resolution, and reporting of incidents and their resolution.
The Optimizing Quadrant includes six SMFs for managing (decreasing) costs while maintaining or improving service levels. This includes review of outages/incidents, examination of cost structures, staff assessments, and availability and performance analysis, as well as capacity forecasting. The optimizing functions described are typically performed in an iterative fashion over time: modifications are performed, performance and cost are noted, and then the cycle is repeated.
Service Management Functions
Service Level Management provides a structured way for consumers and providers of IT services to meaningfully discuss and assess how well a service is being delivered. The primary objective of service level management is to provide a mechanism for setting clear expectations with the customer and user groups with respect to the service being delivered, as well as negotiating how to measure performance against these requirements.
Capacity Management is the process of planning, sizing, and controlling service solution capacity to satisfy user demand at a reasonable cost within the performance levels stated in the service level agreement (SLA) or stated internally to IT as an operating level agreement (OLA).
The singular goal of Availability Management is to ensure that the customer can use a given IT service at any time. This requires heavy involvement through the requirements and planning phases of the project in order to carefully evaluate requests for change within the production environment and to minimize adverse effects on availability.
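Availability is commonly reported as the percentage of agreed service time during which the service was actually usable. The sketch below shows that calculation and a comparison against a target; the function names and the example figures are assumptions for illustration, and the target itself would come from the service level agreement.

```python
def availability_percent(agreed_hours: float, downtime_hours: float) -> float:
    """Percentage of the agreed service time during which the service was up."""
    if agreed_hours <= 0:
        raise ValueError("agreed_hours must be positive")
    if not 0 <= downtime_hours <= agreed_hours:
        raise ValueError("downtime_hours must be between 0 and agreed_hours")
    return 100.0 * (agreed_hours - downtime_hours) / agreed_hours


def meets_target(agreed_hours: float, downtime_hours: float, target: float) -> bool:
    """True when measured availability is at least the target percentage."""
    return availability_percent(agreed_hours, downtime_hours) >= target
```

For example, 3.6 hours of downtime in a 720-hour month corresponds to roughly 99.5 percent availability, which would satisfy a 99 percent target but not a 99.9 percent one.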
Financial Management ensures that any solution proposed by a foundational SMF (service continuity, availability, capacity, or workforce) meets the requirements defined in service level management (SLM) and is justified from a cost and budget standpoint. This is often referred to as a cost-benefit analysis. Activities performed within the function include such standard accounting practices as budgeting, cost allocations, and others.
Achieving the objectives described for each of the service management functions requires an adequately skilled and trained workforce. The Workforce Management SMF implements best practices to continuously assess key aspects of the IT workforce and make appropriate investments and changes as necessary. Activities performed within this function include recruiting, skills development, knowledge transfer, competency levels, team building, process improvements, and resource deployment.
Service Continuity Management, also known as contingency management, focuses on minimizing the disruptions to business by the failure of mission-critical systems. This process deals with planning to cope with and recover from an IT disaster and considers what activities need to be performed in the event of a service outage not attributed to a full-blown disaster. It also provides guidance on safeguarding the existing systems