Process 1: Planning

Published: April 25, 2008   |   Updated: October 10, 2008



Figure 4. Planning

Activities: Planning

Thorough planning is the first process in achieving reliability and consists of the following activities:

  • Define service requirements.
  • Plan and analyze business and technical requirements.

The first activity in planning for Reliability Management is to clearly understand and document the business requirements for the service. Understanding the business objectives allows IT to prioritize and allocate resources to the service and to better align technology investment decisions with the organization’s priorities. IT gathers these requirements by engaging the business through an ongoing relationship management process.

The second activity in planning focuses on the effort and investment that IT must make to ensure that business expectations are met. Doing this successfully involves understanding both the target IT environment and the specifications for the new service: how these align with each other, how the new service will affect the current environment, and where there are significant technical or resource capability gaps.

These activities should ideally occur during the design phase of a new service so that IT operations can influence the specifications and ensure that the service is designed to operate reliably. A regular dialogue between IT and the business is crucial because trade-off decisions will usually need to be made between an ideal state and a practical, cost-effective one. The following table describes these planning activities in more detail.

Table 4. Activities and Considerations for Planning



Define service requirements

Key questions:

  • What is the definition of the service? How does the business define the service as “working properly”?
  • What is the business impact if the service is unavailable for a short or extended period? If the service is degraded?
  • Are there any financial penalties to IT or the business if the service is degraded or unavailable? Are there any legal risks?
  • Is this service (or the data) subject to regulatory constraints or internal policies?
  • Does demand for this service vary according to daily, weekly, monthly, or annual patterns? Are there times when this service is not needed?
  • What are the growth plans for this service over the next 6, 12, and 24 months? Will it remain static? Will additional functionality be added later, and how will this affect our planning?


Inputs:

  • Service level agreements (SLAs)
  • Applicable laws and regulations
  • Internal policies
  • Risk and impact analysis
  • Future plans and anticipated changes/growth
  • Vendor support and delivery operating level agreements (OLAs)
  • Service dependencies, both internal and external
  • Reliability parameters for the service


Outputs:

  • Service level requirements (hours of operation, maintenance windows, recoverability time frames, continuity requirements, and so on)
  • Data classification and corresponding data handling policy
  • Service priority
  • Growth plans
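
The outputs listed above can be easier to review and compare across services if they are recorded in a structured form. The following Python sketch is one possible way to do that and is not part of the guidance itself. The class name, the field names, the sample values, and the convention that priority 1 is the highest are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ServiceLevelRequirements:
    """Hypothetical record of the outputs of "Define service requirements"."""
    service_name: str
    hours_start: time            # start of the agreed hours of operation
    hours_end: time              # end of the agreed hours of operation
    maintenance_start: time      # start of the agreed maintenance window
    maintenance_end: time        # end of the agreed maintenance window
    recovery_time_hours: float   # recoverability time frame
    data_classification: str     # determines the data handling policy
    priority: int                # service priority (1 = highest, an assumption)
    growth_plan: str             # short summary of expected growth


def _in_window(moment: time, start: time, end: time) -> bool:
    """Return True if moment is in [start, end), allowing windows past midnight."""
    if start <= end:
        return start <= moment < end
    return moment >= start or moment < end


def is_expected_up(req: ServiceLevelRequirements, moment: time) -> bool:
    """The service is expected up during its hours and outside maintenance."""
    in_hours = _in_window(moment, req.hours_start, req.hours_end)
    in_maintenance = _in_window(moment, req.maintenance_start, req.maintenance_end)
    return in_hours and not in_maintenance


# Made-up example values for a hypothetical service.
example = ServiceLevelRequirements(
    service_name="order-entry",
    hours_start=time(6, 0),
    hours_end=time(22, 0),
    maintenance_start=time(21, 0),
    maintenance_end=time(23, 0),
    recovery_time_hours=4.0,
    data_classification="confidential",
    priority=1,
    growth_plan="static for 12 months",
)

print(is_expected_up(example, time(9, 0)))    # inside hours, outside maintenance
print(is_expected_up(example, time(21, 30)))  # inside the maintenance window
```

Keeping the maintenance window separate from the hours of operation lets the same record answer both "when is the service needed?" and "when may IT take it down?".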

Best practices:

  • State reliability in business terms—for example, “When a user queries the service, he or she should receive a response within 20 seconds.”
  • Document, clarify, and verify as many of the business requirements as possible. (Whether the goals are rolled into a single reliability plan or separate plans, the business requirements are mostly similar.)
  • Discuss the service’s financial impacts as early in the planning process as possible. By ensuring that the business understands the cost implications, IT can avoid situations where plans are made only to be rejected later for being too costly.
  • Plan for early involvement by IT operations and include “design for operations” requirements in the service specifications. This practice will reduce the amount of rework and design changes, which in turn means fewer tradeoffs later.
  • To encourage consistency of operational design, ensure that the development team understands the infrastructure standards and strategy.
  • Schedule regular business and IT reviews to improve communication, foster understanding, and encourage a unified approach to planning and design.
  • Create a communications plan with clearly defined ownership and accountability roles to reduce confusion about how decisions are made for this service.
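
A requirement stated in business terms, such as the 20-second response example above, can be checked directly against measured response times. The sketch below is a minimal illustration under stated assumptions: the function names, the sample measurements, and the 95 percent threshold are hypothetical, and the business would set the real threshold.

```python
def fraction_within_target(response_times_s, target_s=20.0):
    """Return the share of responses that arrived within target_s seconds."""
    if not response_times_s:
        raise ValueError("no response times supplied")
    within = sum(1 for t in response_times_s if t <= target_s)
    return within / len(response_times_s)


def meets_requirement(response_times_s, target_s=20.0, required_share=0.95):
    """Compare the measured share with a required share.

    required_share is a hypothetical threshold chosen for illustration.
    """
    return fraction_within_target(response_times_s, target_s) >= required_share


# Made-up measurements, in seconds.
sample = [3.2, 7.9, 18.4, 25.1, 4.0]

print(fraction_within_target(sample))  # 4 of the 5 responses are within 20 s
print(meets_requirement(sample))       # falls short of the assumed 95 percent
```

Expressing the check as a share of responses, rather than as an average, keeps the measure close to what an individual user experiences.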

Plan and analyze

Key questions:

  • Do we have sufficient information to translate business requirements into technical requirements for the service?
  • Can the service be delivered to meet the required expectations, or are there gaps in the design or build that might create a roadblock? Can we overcome these gaps, or should we document them and communicate them back to the business?
  • Do we have budget approval and business support to proceed with building and implementing the reliability plans?
  • What other services does this service affect? What are the technical considerations? How does this affect support services?
  • What information, data, or trend analyses will help us understand how this service will perform under various conditions? Is there a known error database, with corresponding workarounds, for this service or its components? (For more information, see the Customer Service SMF.)


Inputs:

  • Outputs from previous process
  • Service map
  • Policy and regulation requirements
  • Historical trend information and known errors
  • Technical information about the design and build of the service
  • Operating guides


Outputs:

  • Reliability specification model and templates

Best practices:

  • To build the most complete picture of what the service needs, involve all relevant people in assessment and planning. This group might include individuals from Solution Development, Enterprise Architecture, Operations, Service Desk, and the business.
  • Engage vendors, suppliers, and service providers to give advice on best use of the technology, design reviews, performance, and scalability behaviors.
  • Consider looking at usage scenarios, particularly if they have been tested and evaluated in a lab environment; they are a practical way to understand the service’s expected performance.
  • Eliminate confusion by defining and communicating an enterprise-wide standard for operations requirements. A clear understanding of exactly what the operations environment needs removes uncertainty and promotes consistency of standards.
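
The gap question in the "Plan and analyze" activity (can the service be delivered as required, or are there gaps to overcome or report back to the business?) can be approached as a simple comparison of required and current capability. The sketch below is illustrative only. The function name, the metric names, and the numbers are assumptions, and it treats every metric as "higher is better", which will not hold for measures such as recovery time.

```python
def find_gaps(required, current):
    """Return {metric: (required, current)} for every unmet requirement.

    Both arguments map a metric name to a number where higher is better.
    A metric missing from `current` is reported as a gap with value None.
    """
    gaps = {}
    for metric, needed in required.items():
        available = current.get(metric)
        if available is None or available < needed:
            gaps[metric] = (needed, available)
    return gaps


# Hypothetical technical requirements derived from the business requirements.
required = {
    "peak_requests_per_minute": 1200,
    "storage_gb": 500,
    "support_staff_on_call": 2,
}

# Hypothetical capability of the current environment.
current = {
    "peak_requests_per_minute": 1500,
    "storage_gb": 350,
}

gaps = find_gaps(required, current)
for metric, (needed, available) in gaps.items():
    print(f"{metric}: need {needed}, have {available}")
```

Each reported gap is then either closed through design or investment, or documented and communicated back to the business.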