
Managing Line of Business Applications using Microsoft System Center Operations Manager 2007

Technical Solution Brief

Published: March 19, 2007

Download

Download Technical Solution Brief, 1.24 MB, Microsoft Word file

PowerPoint Presentation, 2.09 MB, Microsoft PowerPoint file

Situation

Microsoft IT established service level agreements with 99.99 percent availability for infrastructure services. Yet, LOB applications depend on interactions between components and services that are not directly related to infrastructure services. Accordingly, tracking availability and performance for these distributed business solutions is difficult.

Solution

Using System Center Operations Manager 2007, Microsoft IT can implement new end-to-end monitoring solutions in a centralized monitoring infrastructure. The new solutions are the basis for proactive management of LOB applications.

Benefits

  • Lower operating costs
  • Increased service levels
  • Alignment of business and IT services
  • Increased productivity of employees

Products & Technologies

  • Microsoft Operations Manager 2005
  • Microsoft System Center Operations Manager 2007
  • Microsoft SQL Server 2005
  • Microsoft ADO.NET
  • Message Queuing

Microsoft® System Center Operations Manager 2007 provides the foundation for the Microsoft Information Technology (Microsoft IT) group to increase operational efficiency and advance the alignment of IT services with the needs of Microsoft. Microsoft IT achieves these goals by providing business units with greater control over line-of-business (LOB) applications based on a new generation of end-to-end service management solutions.

LOB applications are at the core of all areas of business at Microsoft. They facilitate strategic planning and internal business processes, in addition to collaboration and communication with customers, partners, and vendors. They also enable employees to manage their careers and human resources (HR) benefits programs. Overall, more than 2,500 LOB applications exist in the corporate production environment. Managing these complex and distributed applications is a vital task for Microsoft IT.

This technical solution brief explains how Microsoft IT used System Center Operations Manager 2007 to develop an end-to-end service management solution for a mission-critical LOB application in the corporate production environment. Microsoft IT conducted this pilot project in November 2006 by using the Release Candidate 1 version of the product.

This technical solution brief contains information for IT decision makers who are evaluating the benefits of System Center Operations Manager and for application designers and IT implementers who are planning to design and implement end-to-end enterprise management solutions. This paper assumes that the audience is already familiar with the concepts of Microsoft Windows Server® 2003, the Active Directory® directory service, and Microsoft Operations Manager (MOM) 2005. A high-level understanding of the new features and technologies included in System Center Operations Manager is also helpful. Detailed product information is available at http://go.microsoft.com/fwlink/?LinkId=64017.

Note: For security reasons, the sample names of internal resources, application components, services, and solutions used in this document do not represent actual names used within Microsoft and are for illustration purposes only. In addition, the contents of this document describe how Microsoft IT runs its enterprise data centers. The procedures and processes included in this document are not intended to be prescriptive guidance on how to run a generic data center and may not be supported by Microsoft Customer Support Services.

Introduction

Microsoft IT maintains a complex enterprise environment that consists of 16 data centers, 441 office locations in 98 countries, and more than 121,000 users. The corporate network includes approximately 350,000 workstations and 11,000 production servers, housing more than 1,000 terabytes of data. On top of this infrastructure reside 2,500 LOB applications that Microsoft IT maintains to support the business units of the company, including finance, HR, marketing, operations, purchasing, and sales.

Maintaining this large-scale environment is primarily the task of the following two main divisions in Microsoft IT:

  • Infrastructure Services (IS): Maintains and supports the telephony and network infrastructure, the server operating systems, the core infrastructure services (such as Domain Name System [DNS] and Active Directory), the global messaging environment, and the monitoring infrastructure. More than 2,500 IT professionals work in the IS division, organized in teams according to technology services.

  • Business IT: Develops, maintains, and supports the LOB applications in the corporate production environment. For example, the HR IT group is responsible for the LOB applications of the HR department, whereas the Commercial IT group is in charge of all sales, marketing, and product support applications. Another group provides support for the central enterprise resource planning (ERP) system that Microsoft uses for all financial and supply-chain management and for storing the HR master data. The Financial IT group manages the payroll and accounting applications. Overall, the business IT groups include more than 1,500 IT professionals.

Figure 1 illustrates how IS and business IT groups view the corporate production environment. Reflecting these different perspectives, IS uses monitoring tools that focus on the infrastructure, whereas business IT groups need tools that include clients, servers, applications, and synthetic transaction monitoring of the end-user perspective to manage and improve the end-to-end delivery of IT services.

Figure 1. IS and business IT views of the corporate production environment


Improving Service Levels for LOB Applications

By using MOM 2005 and a limited number of third-party solutions, Microsoft IT established 99.99 percent availability for business-critical infrastructure services, such as Microsoft Exchange Server, Microsoft SharePoint® Products and Technologies, and Active Directory. First-level operators discovered about 87 percent of all infrastructure issues before a user recognized the issues. About 98 percent of all alerts were detected within two minutes of occurrence. However, as internal support statistics revealed, users still found the majority of application problems, putting the business IT groups in a reactive position. Business IT groups needed effective monitoring solutions that supported a proactive service management approach beyond the capabilities of MOM 2005 in order to identify and resolve issues that affect the health of distributed IT services before users discover these issues.

To remove the obstacles that prevented Microsoft IT from increasing service levels in the LOB application space prior to System Center Operations Manager 2007, Microsoft IT devised the following strategy:

  1. Consolidate separate monitoring environments in a centralized System Center Operations Manager infrastructure that covers the entire corporate production environment.

  2. Develop cost-efficient management packs and end-to-end service management solutions for standard and custom LOB applications by using new authoring and editing tools that are readily available in System Center Operations Manager.

  3. Delegate monitoring responsibilities to the individual Microsoft IT teams and groups according to their areas of control so that they can integrate the relevant pieces of health information into their operational processes.

Centralizing the Monitoring Infrastructure

Highly distributed LOB applications consist of loosely coupled components that share common resources, some distributed worldwide. To achieve effective end-to-end service management for these types of applications, the monitoring infrastructure must cover the entire corporate production environment, including hardware devices, Microsoft products, custom Microsoft IT solutions, and non-Microsoft components. Any gaps in the monitoring infrastructure represent gaps in the end-to-end service management scenario.

The following System Center Operations Manager features enabled Microsoft IT to establish a centralized global monitoring infrastructure:

  • Performance and scalability enhancements: Support of high-availability technologies, such as Windows® Clustering in Windows Server 2003, security-enhanced and automated agent deployments based on Microsoft Systems Management Server (SMS) packages, and automated agent configuration discovery based on Active Directory, enable Microsoft IT to deploy System Center Operations Manager efficiently. System Center Operations Manager automatically discovers new systems and applications and deploys the available and appropriate monitoring policies. Months before the product's release to manufacturing (RTM), Microsoft IT already managed more than 20,000 computers by using System Center Operations Manager 2007 Release Candidate 2.

  • Security enhancements: Active Directory integration, role-based authorization, and Run As accounts enable Microsoft IT to support individual teams and groups that have varying monitoring needs and security requirements without having to deploy and maintain redundant management servers and reporting databases.

  • Simple Network Management Protocol (SNMP): Direct support of SNMP-enabled devices provides an opportunity to reduce costs by replacing third-party solutions. It also enables Microsoft IT to include these devices in end-to-end service management solutions for LOB applications to provide operators with a complete view of the entire application infrastructure.

  • Operations Manager Connector Framework: Based on Operations Manager Connector Framework, Microsoft IT integrated System Center Operations Manager with external systems, such as the Helpdesk ticketing system for incident management; the configuration management database (CMDB) to track server purchases, configurations, and retirements; and previous third-party solutions to manage network devices. Bi-directionally synchronized integration enables centralized enterprise management through the Operations Console.

Note: As a direct result of the consolidation efforts, Microsoft IT established a new Enterprise Monitoring team responsible for maintaining the centralized monitoring infrastructure.

Developing Cost-Efficient End-to-End Service Management Solutions

System Center Operations Manager includes more than 50 management packs to provide prescriptive knowledge and automated in-line tasks based on best practices directly from Microsoft operating system, server, client, and application development teams. Moreover, Microsoft IT imported management packs provided by Microsoft partners to manage third-party business solutions, such as the central ERP system and the customer relationship management (CRM) system. Microsoft IT also used the conversion tools available in System Center Operations Manager to import custom management packs developed earlier for LOB applications. These management packs extend the monitoring capabilities of System Center Operations Manager to cover all relevant LOB application dependencies.

In addition to an improved management pack design, System Center Operations Manager includes templates for common services, such as messaging and custom Microsoft ASP.NET applications, and graphical design tools, such as in-line wizards, to create new custom management solutions. These solutions require significantly lower development efforts in comparison to previous Operations Manager versions. Customizing and authoring diagnostic and planning reports that are integrated directly into the Operations Console are also straightforward processes. Service-oriented views and reports enable both the operations teams and Microsoft IT management to get the information that they need to quickly identify and resolve issues that affect service levels.

For LOB applications that use Windows Server, Microsoft SQL Server™, Microsoft .NET, Internet Information Services (IIS), and other technologies as building blocks, Microsoft IT creates end-to-end service management solutions that include:

  • Application health model: By using the Distributed Application Designer in the Operations Console, Microsoft IT defines a health model for the infrastructure components of the distributed application based on templates and discovered relationships and component types, and then creates the monitors, rules, views, and reports necessary to manage these components. The health model describes how the state of the individual components influences the state of the LOB application.

  • Synthetic transactions: By using Web Application Editor in the Operations Console, Microsoft IT records a sequence of user actions, such as connecting to a site and browsing through Web pages, which are then played back to provide information about how the LOB application is performing. Agent-managed computers, located anywhere in the corporate production environment, can perform these synthetic transactions at regular intervals to achieve genuine monitoring of the end-user perspective. Synthetic transactions also enable Microsoft IT to stress-test LOB applications and to see whether monitoring settings, such as alerts and notifications, perform as expected.
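The idea that component states influence the state of the whole application can be illustrated with a small sketch. The following Python example is not how System Center Operations Manager stores or evaluates health models; it assumes a simple "worst state wins" rollup, and the class, state, and component names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Health(IntEnum):
    """Health states ordered by severity (illustrative names)."""
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class Component:
    """A node in a hypothetical health model tree."""
    name: str
    own_state: Health = Health.HEALTHY
    children: list["Component"] = field(default_factory=list)

    def rolled_up_state(self) -> Health:
        """Assumed policy: a node is as unhealthy as its worst dependency."""
        states = [self.own_state]
        states.extend(child.rolled_up_state() for child in self.children)
        return max(states)


# Hypothetical components of a distributed LOB application.
web = Component("Web tier")
db = Component("Databases", own_state=Health.WARNING)
services = Component("Core services")
app = Component("LOB application", children=[web, db, services])

print(app.rolled_up_state().name)
```

Other rollup policies are possible, for example a percentage-based policy for load-balanced servers, so the worst-state rule here should be read as one option among several.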

Delegating Monitoring Responsibilities

By exploiting new features, such as role-based authorization through Active Directory, Microsoft IT can define various levels of access to the monitoring environment based on the principle of least privilege in order to provide security for the execution of tasks. For example, System Center Operations Manager supports various user roles, including read-only operators, operators, and authors. Microsoft IT uses these user roles to provide support specialists and group managers with access to monitoring solutions according to their responsibilities, as follows:

  • Service desk and Tier 1 support: Users who are experiencing problems with an LOB application first contact the internal service desk. Service desk and Tier 1 support groups provide assistance concerning functional questions and problems related to the user interface. These support groups have the permissions of read-only operators so that they can use the Operations Console to view relevant alerts and check the health of application components. This information facilitates escalation decisions.

  • Tier 2 support: Application problems that service desk and Tier 1 support groups cannot solve are escalated to a Tier 2 support engineer in the business IT group that is responsible for the affected LOB application. This engineer must locate and solve technical problems in a highly distributed LOB application space. Accordingly, Tier 2 support engineers are assigned the Operator role so that they can interact with alerts, run tasks, and access views to identify malfunctioning components or bottlenecks in the infrastructure and solve problems quickly.

  • Group managers: Based on the Author role, Microsoft IT provides individual teams and groups with control over subsets of resources in the monitoring infrastructure. In a first step, an administrator from the Enterprise Monitoring team defines the basic monitoring resources and configurations for each group. Within the configured scope, group managers who have authoring permissions can then use the Operations Console to perform administrative tasks, such as creating rules, alert streams, monitors, and views, without having to depend on an Enterprise Monitoring administrator. In this way, the business units gain more control over service management functions.
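The mapping of support tiers to user roles described above can be sketched as a scoped permission check. This Python example is only an illustration of the least-privilege idea. The action names, the assumption that each role includes the permissions of the role before it, and the `RoleAssignment` type are hypothetical and do not reflect the product's actual security model or API.

```python
from dataclasses import dataclass

# Hypothetical action sets; each role is assumed to extend the previous one.
_VIEW = {"view_alerts", "view_health"}
_OPERATE = _VIEW | {"update_alerts", "run_tasks"}
_AUTHOR = _OPERATE | {"create_rules", "create_monitors", "create_views"}

ROLE_ACTIONS = {
    "read_only_operator": _VIEW,   # service desk and Tier 1 support
    "operator": _OPERATE,          # Tier 2 support engineers
    "author": _AUTHOR,             # group managers
}


@dataclass(frozen=True)
class RoleAssignment:
    """A user, a role, and the groups of monitored objects the role covers."""
    user: str
    role: str
    scope: frozenset


def is_allowed(assignment: RoleAssignment, action: str, target_group: str) -> bool:
    """Allow an action only inside the configured scope and role."""
    in_scope = target_group in assignment.scope
    return in_scope and action in ROLE_ACTIONS[assignment.role]


tier1 = RoleAssignment("service-desk", "read_only_operator",
                       frozenset({"Benefits Enrollment"}))
manager = RoleAssignment("hr-it-manager", "author",
                         frozenset({"Benefits Enrollment"}))

print(is_allowed(tier1, "view_alerts", "Benefits Enrollment"))
print(is_allowed(tier1, "run_tasks", "Benefits Enrollment"))
print(is_allowed(manager, "create_rules", "Commercial IT"))
```

The scope check is what lets a central team host many groups in one environment: an author in one group cannot change monitoring resources that belong to another group.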

Figure 2 illustrates how Enterprise Monitoring uses role-based authorization to support individual teams and groups, such as HR IT and Commercial IT, in a consolidated monitoring environment.

Figure 2. Role-based authorization in a centralized monitoring infrastructure


Line-of-Business Applications at Microsoft

Microsoft spends more than $500 million each year on the acquisition, design, development, implementation, support, and maintenance of server systems and LOB applications. About 20 percent of the Microsoft IT LOB applications are mission critical or used for executive decision making.

Among the most critical are the following solutions that establish a core infrastructure for other LOB applications:

  • E-Mail Notifications Web service: This is an automated system that implements action-based and time-based notification logic to notify users of tasks to complete. For example, if users do not reply to an e-mail notification within a period specified by the business rules, the E-Mail Notifications system triggers reminders that ask the users to take action. Centralizing e-mail notification handling in a dedicated Web service helps to eliminate duplicated business logic in LOB applications, reduces overhead associated with the management of business rules, and leads to leaner and more robust application code.

  • Error Logging Web service: This service provides a centralized solution for error reporting in the corporate production environment. Structured exception handling is an integral part of all Microsoft IT LOB applications. If an exception is raised in an LOB application, the application code serializes the Exception object and passes the debugging information in XML format to the Error Logging Web service to report that a critical error has occurred.

  • Role-Based Permissions (RBP) system: This service centralizes security and permissions management for LOB applications through automated processes that replicate changes of master data from the company's ERP system. All Microsoft IT LOB applications support single sign-on based on Integrated Windows authentication. Most internally developed LOB applications also rely on RBP to help ensure that only authorized users with appropriate roles can access sensitive data, such as personally identifiable information. For example, an LOB application for HR might allow a manager to see social security numbers and salaries of direct employees, whereas employees can see only their own data. Currently, RBP defines roles for about 14,000 managers.

  • Digital Asset Store: Microsoft IT manages personally identifiable information and other highly sensitive data in a centralized and encrypted database called Digital Asset Store. The Enterprise Data Services (EDS) group within Microsoft IT created this database solution based on Microsoft SQL Server 2005 to isolate highly sensitive data from the LOB application space. To provide this information to subscribers, Digital Asset Store integrates with FeedStore.

  • FeedStore: This is a 2-terabyte data warehouse that pulls data from 39 internal sources, including the company's ERP system and other key databases from business units, and feeds more than 500 subscribing LOB applications worldwide via three distribution servers. By using transactional replication, the distribution servers in Redmond, Dublin, and Singapore provide subsets of the FeedStore data to subscribers. The EDS group developed and maintains FeedStore.
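The error reporting pattern described for the Error Logging Web service, in which application code serializes an exception into XML before reporting it, can be sketched as follows. The LOB applications in this paper are .NET applications; Python is used here only as a stand-in, and the element names, the `exception_to_xml` helper, and the application name are hypothetical. The sketch builds the XML payload and does not call any Web service.

```python
import traceback
import xml.etree.ElementTree as ET


def exception_to_xml(exc: BaseException, application: str) -> str:
    """Serialize an exception into an XML string (hypothetical schema)."""
    root = ET.Element("error", attrib={"application": application})
    ET.SubElement(root, "type").text = type(exc).__name__
    ET.SubElement(root, "message").text = str(exc)
    # Include the stack trace so that support engineers can debug the issue.
    ET.SubElement(root, "stackTrace").text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    return ET.tostring(root, encoding="unicode")


try:
    int("not a number")
except ValueError as err:
    # In a real application, this payload would be sent to the logging service.
    payload = exception_to_xml(err, application="BenefitsEnrollment")
    print(payload)
```

Keeping the serialization in one helper mirrors the motivation given above: each application reports errors in the same structured format instead of implementing its own.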

Note: Microsoft IT tracks all LOB applications in a database with information about each application's purpose, status of changes, versions that are in production, and other key data, such as availability and performance statistics.

Application Interdependencies

By reusing core services across a large number of other LOB applications, Microsoft IT can reduce data duplication in the corporate production environment, keep sensitive information in an encrypted central location to maintain confidentiality, reduce the overhead associated with security and permissions management, apply common business rules (such as e-mail notification logic) consistently across LOB applications, and streamline error reporting and error handling. However, the downside of reusing common components across a large number of LOB applications in a distributed service-oriented architecture (SOA) is increased complexity in service management, maintenance, and support.

For example, a complex Microsoft IT LOB application might rely on ASP.NET to implement the basic business logic, use a variety of SQL Server 2005 databases through Microsoft ADO.NET, and consume a number of separate Web services based on Simple Object Access Protocol (SOAP) and XML to pull in data in a standardized way from the ERP system or other sources. Due to the synchronous nature of the data communication, this LOB application now depends on the availability and performance of its own components in addition to the availability and performance of the Web services. LOB applications can also use asynchronous communication methods based on transactional or message queuing systems, such as SQL Server 2005 Service Broker or Message Queuing (also known as MSMQ). Asynchronous communication enables distributed components to interact even if individual components are temporarily unavailable, but it introduces a new dependency on the availability and reliability of the queuing system.
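The effect of synchronous dependencies on availability can be made concrete with a short calculation. The sketch below assumes that the application needs every dependency to be available at the same time and that failures are independent, which is a simplification. The availability figures are hypothetical and are not measurements from Microsoft IT.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a chain in which every element must be up.

    Assumes independent failures, so the individual values multiply.
    """
    return prod(availabilities)


def expected_downtime_hours(availability, hours_per_year=365 * 24):
    """Convert an availability fraction into expected downtime per year."""
    return (1 - availability) * hours_per_year


# Hypothetical figures: the application's own components plus three
# synchronously called Web services, each available 99.9 percent of the time.
dependencies = [0.999, 0.999, 0.999, 0.999]
combined = serial_availability(dependencies)

print(f"Combined availability: {combined:.4%}")
print(f"Expected downtime: {expected_downtime_hours(combined):.1f} hours per year")
```

Under these assumptions, the combined availability is always lower than that of the weakest dependency, which is why each additional synchronous Web service call adds to the monitoring and support burden.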

Benefits Enrollment Application

The Benefits Enrollment application is an example of a mission-critical LOB application that incorporates SQL Server databases, Web services, and asynchronous communication methods based on Message Queuing. HR IT maintains this application for nearly 30,000 Microsoft employees in the United States. Each year, during the employee open enrollment period, U.S. employees can use this mission-critical Web application to change coverage for medical, dental, employee life, and long-term disability benefits after qualifying changes in family or employment status. The open enrollment period starts on November 1 and ends on November 30. Employees who do not initiate a change during this period will automatically continue their current benefits package into the next year. During the remainder of the calendar year, employees can use the Benefits Enrollment application to review current benefits information.

Figure 3 shows the architecture of the Benefits Enrollment application. Employees access this Web-based solution through one of five front-end Web servers, clustered through Network Load Balancing (NLB) to ensure high availability and scalability. The required client browser is Windows Internet Explorer®. Communication between client and server relies on Secure Sockets Layer (SSL), Hypertext Transfer Protocol Secure (HTTPS), and trusted connections for network security. The Web servers run IIS version 6.0 and host the application's ASP.NET pages. The application's databases reside on a SQL Server 2005 failover cluster based on Windows Clustering. Another server cluster hosts two virtual servers in an Active/Active configuration that runs core LOB services for data exchange with FeedStore and E-Mail Notifications, RBP, and other application support services. The communication between servers occurs through a mixture of technologies, including ADO.NET, SOAP/XML, and Message Queuing.

Figure 3. Benefits Enrollment application architecture


Distributed Application Monitoring

Using System Center Operations Manager 2007 to pilot an end-to-end service management solution for the Benefits Enrollment application enabled Microsoft IT to highlight new distributed operational processes. In these processes, Microsoft IT takes advantage of role-based authorization in a consolidated environment to implement effective monitoring of loosely coupled LOB applications from the end-user perspective.

Distributed Enterprise Monitoring Strategy

At Microsoft IT, individual teams and groups view their responsibilities from the perspective of providing business-critical services. This directly leads to Microsoft IT's philosophy of measuring service levels: Keeping servers running is not sufficient. Instead of focusing on individual server status, Microsoft IT considers all components that are required to deliver a service to the business user.

Whereas the Enterprise Monitoring team maintains the global monitoring infrastructure and health information for all IT groups, the individual groups use specifically designed monitoring tools to view this information, create reports, and configure alert streams according to specific requirements of their LOB applications and operational processes. When LOB applications share common resources and components, individual end-to-end service management tools overlap and rely on the same health information, but without duplicating the underlying data, management agents, or management servers. Customizations that one group applies to a monitoring solution (such as custom alerts) in order to accommodate group-specific needs do not affect the monitoring solutions of other groups that include the same resources and components.

Figure 4 shows an example that illustrates the distributed enterprise monitoring principle. The EDS group maintains the FeedStore data warehouse that a large number of other LOB applications use, such as the Benefits Enrollment application that the HR IT group maintains. By using an end-to-end service management solution, an EDS operator can observe the mission-critical FeedStore warehouse to ensure its availability and performance. An HR IT operator can use another monitoring solution to keep track of the Benefits Enrollment application. Because this is an end-to-end scenario, the HR IT monitoring solution covers all components that the Benefits Enrollment application depends on, including FeedStore. HR IT does not maintain FeedStore, yet to the end user, this is irrelevant. From the end user's point of view, any problem concerning the Benefits Enrollment application is the jurisdiction of HR IT—end to end, including FeedStore and all other core services and components.
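The principle of overlapping monitoring solutions that share health information without duplicating it can be sketched in a few lines. In this Python illustration, a single dictionary stands in for the centrally maintained health data, and each group's view applies its own alert settings to it. The `MonitoringView` class, the state names, and the component names are hypothetical.

```python
from dataclasses import dataclass, field

# One shared source of health data, maintained once by a central team.
SHARED_HEALTH = {
    "FeedStore": "warning",
    "BenefitsWeb": "healthy",
    "RBP": "healthy",
}


@dataclass
class MonitoringView:
    """A group-specific view over the shared health data."""
    group: str
    components: set
    # Group-specific customization: which states raise an alert.
    alert_on: set = field(default_factory=lambda: {"critical"})

    def alerts(self, health):
        """Return the components in this view whose state triggers an alert."""
        return sorted(
            name for name in self.components
            if health.get(name) in self.alert_on
        )


# EDS watches only FeedStore; HR IT watches everything its application needs
# and chooses to be alerted on warnings as well.
eds_view = MonitoringView("EDS", {"FeedStore"})
hr_view = MonitoringView("HR IT", {"BenefitsWeb", "FeedStore", "RBP"},
                         alert_on={"warning", "critical"})

print(eds_view.alerts(SHARED_HEALTH))
print(hr_view.alerts(SHARED_HEALTH))
```

Both views read the same FeedStore state, yet the stricter alert setting in the HR IT view has no effect on what the EDS view reports.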

Figure 4. End-to-end service management of distributed LOB applications at Microsoft


Implementing distributed monitoring based on System Center Operations Manager has the following advantages for Microsoft IT:

  • Reuse of existing investments in Windows technologies: Microsoft IT groups can monitor their specific LOB applications and components without having to deploy or maintain redundant Active Directory, SQL Server, or monitoring infrastructures.

  • Proactive end-to-end management of distributed IT services: Any issue that affects the availability or performance of an LOB application is directly visible within that LOB application's monitoring solution. Operators can quickly locate trouble spots and bottlenecks, whether they reside within the application itself or in any of the devices, systems, and components that the application depends on.

  • Rich, new reports and an easy-to-customize reporting environment: Diagnostic and planning reports include comprehensive information and reveal the true causes of incidents. The business units can recognize the causes of issues, whether they are the responsibility of the group maintaining the LOB application or that of another group, bringing additional insight to troubleshooting and planning.

  • Increased service levels: Overlapping monitoring solutions enable individual groups within Microsoft IT to support each other to ensure the highest service levels across the entire IT organization.

Benefits Enrollment Monitoring Solution

To create the monitoring solution for the Benefits Enrollment application according to the distributed monitoring strategy of Microsoft IT, the Enterprise Monitoring team and the HR IT group closely collaborated and performed the following steps based on guidelines outlined in the Microsoft Solutions Framework (MSF):

  1. Clarifying business requirements and project scope: To establish a common understanding of the intended monitoring solution, Enterprise Monitoring demonstrated the capabilities of System Center Operations Manager to the HR IT group. After this presentation and based on the architecture of the Benefits Enrollment application, HR IT and Enterprise Monitoring defined the scope and deliverables for the pilot version of the monitoring solution.

  2. Planning the monitoring solution: Within the defined scope of the project, Enterprise Monitoring and HR IT analyzed how employees work with the Benefits Enrollment application and how the individual components and dependencies within the application's architecture affect users. Among other activities, the two teams established baselines that defined levels of health states for the Benefits Enrollment application. For example, if response times exceeded 10 seconds, the state of the application would be considered unhealthy.

  3. Creating the monitoring solution: Using standard authoring tools available in System Center Operations Manager, the Enterprise Monitoring team created the components of the monitoring solution according to HR IT specifications. The Enterprise Monitoring team first developed the solution on a test system and then manually re-created the solution in the corporate production environment. Re-creating the monitoring solution required less than one hour of work.

  4. Stabilizing and deploying the monitoring solution: Following a quick functionality check in the corporate production environment, the Enterprise Monitoring team delivered the solution to HR IT by granting authoring permissions to the product manager responsible for the Benefits Enrollment application. According to the defined scope, this step concluded the project for the Enterprise Monitoring team. Granting authoring permissions enabled HR IT to customize the solution further, such as by defining alert streams, notifications, custom views, and reports.

Note: Detailed information about the MSF, including an MSF Resource Kit and case studies, is available on Microsoft TechNet at http://www.microsoft.com/technet/solutionaccelerators/msf/default.mspx.

Clarifying Business Requirements and Project Scope

Taking beta issues and time constraints into consideration, Enterprise Monitoring and HR IT agreed that a basic, reliable, and usable monitoring solution had more immediate value than a deluxe version delivered after the end of the employee open enrollment period. By focusing on core functionality, Enterprise Monitoring delivered the base solution swiftly, providing HR IT with a solid foundation to apply customizations in subsequent steps.

Figure 5 shows the decision tree that Enterprise Monitoring and HR IT followed to determine the scope of the project. In a first step, the teams decided to include only the actual application elements and the core services in the monitoring solution. The teams considered hardware, network, and infrastructure elements (such as Active Directory, DNS, and Dynamic Host Configuration Protocol [DHCP]) less important for the pilot version because Microsoft IT guarantees 99.99 percent availability in these areas. Adding these components to the monitoring solution would have given HR IT a very detailed view of the application infrastructure, yet it would have required Microsoft IT to fully deploy System Center Operations Manager 2007 in the corporate production environment, which did not correspond to the plans of Microsoft IT for the Release Candidate 1 version of the product.

Figure 5. Clarifying monitoring scope for the Benefits Enrollment application


Enterprise Monitoring and HR IT defined the project scope based on the following decisions:

  • Included components: The base solution must monitor the Web servers that host the application, the HR databases on MSSQLDB, and the core services running on MSMQCLUST01 and MSMQCLUST02. (Figure 3 earlier in this paper showed the position of these servers in the Benefits Enrollment application architecture.)

  • Component health monitoring: The base solution must include a distributed application model and health rollup configuration so that HR IT can monitor the health of each application component.

  • Application performance monitoring: The base solution must include a Web application monitor to perform synthetic transactions so that HR IT can track availability and response times of the Benefits Enrollment application based on the actual end-user perspective. The Web application monitor must use a special Run As account because System Center Operations Manager system accounts are not authorized to access HR LOB applications.

  • Role-based authorization: The base solution needs to include only standard views and reports. For alert streams, notifications, and other customizations, Enterprise Monitoring grants authoring permissions to HR IT so that HR IT can create these elements without the involvement of Enterprise Monitoring.

Planning the Monitoring Solution

To create an effective monitoring design, Enterprise Monitoring and HR IT analyzed the impact of the individual application components on the overall user experience, specifically how employees would notice component failures and performance bottlenecks while working with the Benefits Enrollment application. For example, the Benefits Enrollment application is hosted on five Web servers in an NLB cluster (as shown in Figure 3 earlier in this paper). Because the NLB cluster provides automatic failover capabilities, unavailability of a single Web server does not affect users. The remaining Web servers are available and have sufficient capacity to handle the user requests without performance impact. Accordingly, Enterprise Monitoring and HR IT decided that a critical situation occurs only if more than 50 percent of the servers in the NLB cluster are unavailable.

Table 1 summarizes the key monitoring criteria that Enterprise Monitoring and HR IT established for the Benefits Enrollment application.

Table 1. Monitoring Plan for the Benefits Enrollment Application

| Category | Components | Monitoring criteria |
| --- | --- | --- |
| Web site | WEB01, WEB02, WEB03, WEB04, and WEB05 | The health state is critical if 50 percent of the servers in the NLB cluster are unavailable. Because the NLB cluster provides automatic failover capabilities, unavailability of a single Web server does not represent a critical state. |
| HR databases | MiscData, DataReplication, General, IssueTracking, Metadata, and Payments | The health state is critical if any of the databases is unavailable. |
| Core LOB services | E-Mail Notifications, Issue Tracking, RBP, and data exchange services (FeedStore) | The health state is critical if any of the core LOB services is unavailable. |
| Web application | Response times | The health state reaches the warning level if the response times exceed 10 seconds and the critical level if response times exceed 20 seconds. User tolerance for performance is around 20 seconds response time for a page. |
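The response-time criteria in Table 1 can be illustrated with a minimal Python sketch. It is not part of the System Center Operations Manager product or of the Microsoft IT solution; the names `HealthState` and `classify_response_time` are hypothetical, and the sketch assumes that "exceed" means strictly greater than the threshold.

```python
from enum import Enum


class HealthState(Enum):
    """Hypothetical health states used for illustration only."""
    HEALTHY = "healthy"
    WARNING = "warning"
    CRITICAL = "critical"


# Thresholds taken from Table 1 (seconds per page).
WARNING_THRESHOLD_SECONDS = 10.0
CRITICAL_THRESHOLD_SECONDS = 20.0


def classify_response_time(seconds: float) -> HealthState:
    """Map a measured response time to a health state.

    Assumption: a value equal to a threshold does not exceed it.
    """
    if seconds > CRITICAL_THRESHOLD_SECONDS:
        return HealthState.CRITICAL
    if seconds > WARNING_THRESHOLD_SECONDS:
        return HealthState.WARNING
    return HealthState.HEALTHY


if __name__ == "__main__":
    for sample in (4.2, 12.0, 25.0):
        print(sample, classify_response_time(sample).value)
```

Under these assumptions, a page that takes 12 seconds to load is reported at the warning level, and a page that takes 25 seconds is reported at the critical level.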

Creating the Monitoring Solution

To implement the monitoring plan according to the scope of the pilot version, the Enterprise Monitoring team created the following elements by using the Operations Console:

  • Custom management pack System Center Operations Manager uses the concept of management packs to save the rules and configuration settings of a monitoring solution. Enterprise Monitoring delivered the Benefits Enrollment monitoring solution in a single custom management pack to HR IT.

  • Distributed application model The application model is a structural representation of the distributed application, outlining the application's organization in terms of component groups and group relationships. To create the application model for the Benefits Enrollment solution, Enterprise Monitoring used the Distributed Application Designer directly available in the Operations Console.

  • Health rollup configuration Each component group within the distributed application model provides access to health rollup settings. Enterprise Monitoring configured these settings to describe the dynamic behavior of the Benefits Enrollment application according to the monitoring criteria that Enterprise Monitoring and HR IT established during the planning phase.

  • Web application monitor To check availability and measure response times of the Benefits Enrollment application, Enterprise Monitoring created a Web application monitor. This monitor performs synthetic Hypertext Transfer Protocol (HTTP) requests in a recorded browser session by using the account credentials that HR IT provided.

Creating the Distributed Application

Enterprise Monitoring created the application model for the Benefits Enrollment solution by using the Distributed Application Designer and the Line of Business Web Application template. This template contains predefined container groups for Web sites and databases. In addition to the predefined groups, Enterprise Monitoring created a third container for the core LOB services and established a relationship between the Web site component group and the application services component group. Because the core LOB services run on Windows Clustering virtual servers, Enterprise Monitoring restricted the new component group to the Windows Cluster Resource object class. Subsequently, Enterprise Monitoring used the Objects pane in the Distributed Application Designer to automatically discover all relevant objects and added them to the corresponding component groups through a drag-and-drop operation. Figure 6 displays the resulting application model.

Figure 6. Benefits Enrollment application model

Defining Health Rollup Settings

Table 2 summarizes the health rollup configuration that Enterprise Monitoring applied to the individual component groups in the distributed application model. The Enterprise Monitoring team configured only the most basic parameters according to the monitoring plan discussed in the section "Planning the Monitoring Solution" earlier in this paper. Authoring permissions enable the HR IT team to define further settings on a group-specific basis.

Table 2. Health Rollup Configuration for the Benefits Enrollment Application

| Component group | Parameter | Setting | Effective value |
| --- | --- | --- | --- |
| Benefits Enrollment Web application Web sites | Rollup algorithm | Worst state of a percentage of members in good health state | Show the worst state of any member |
| | Percentage | 50 | |
| Benefits Enrollment Web application databases | Rollup algorithm | Worst health state of any member | Show the worst state of any member |
| Benefits Enrollment Web application services | Rollup algorithm | Worst health state of any member | Show the worst state of any member |
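The two rollup algorithms in Table 2 can be sketched in a few lines of Python. This is an illustration of one plausible reading of the settings, not the product's implementation: the names `Health`, `worst_of_any`, and `worst_of_percentage` are hypothetical, and rounding the member count up is an assumption. Under this reading, the percentage-based rollup looks only at the healthiest 50 percent of members and reports the worst state among them.

```python
import math
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical health states; a higher value means worse health."""
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def worst_of_any(members):
    """Roll up to the worst state found in any member."""
    return max(members)


def worst_of_percentage(members, percentage):
    """Roll up to the worst state among the healthiest `percentage` of members.

    Assumption: the number of members considered is rounded up.
    """
    if not members:
        raise ValueError("at least one member is required")
    count = max(1, math.ceil(len(members) * percentage / 100))
    healthiest_first = sorted(members)
    return max(healthiest_first[:count])


if __name__ == "__main__":
    # Five Web servers, two of them unavailable: the group stays healthy.
    two_down = [Health.HEALTHY] * 3 + [Health.CRITICAL] * 2
    print(worst_of_percentage(two_down, 50).name)
    # Three of five unavailable: more than half are down, so the group is critical.
    three_down = [Health.HEALTHY] * 2 + [Health.CRITICAL] * 3
    print(worst_of_percentage(three_down, 50).name)
    # Databases and services: a single unavailable member is critical.
    print(worst_of_any([Health.HEALTHY, Health.CRITICAL, Health.HEALTHY]).name)
```

With five Web servers, this reading reproduces the planning decision that the Web site group becomes critical only when more than half of the servers are unavailable, whereas the database and service groups become critical as soon as any member fails.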

Creating a Web Application Monitor

To measure availability and performance of the Benefits Enrollment application from the end-user perspective, Enterprise Monitoring added a Web application monitor object to the custom management pack for HR IT. Based on the configuration settings defined in this monitor object, System Center Operations Manager agents running on managed computers in the corporate production environment perform synthetic transactions. An agent-managed computer that is running an application monitor is called a watcher node. To track availability and performance comprehensively across the corporate production environment, Enterprise Monitoring selected computers in all U.S. locations to act as watcher nodes.

Enterprise Monitoring and HR IT performed the following steps to create the Web application monitor:

  1. Added an HR IT system account to the custom management pack as a Run As account to provide an appropriate identity for the Web application monitor.

  2. Created the Web application monitor in the Operations Console by using the Add Monitoring Wizard and the Web Application management pack template. During this configuration step, Enterprise Monitoring assigned agent-managed computers from all relevant locations as watcher nodes.

  3. Recorded a browser session to perform synthetic transactions. The transactions simulate typical user actions, such as logging on and browsing Web pages. Because of legal constraints, Enterprise Monitoring and HR IT did not configure the Web application monitor to complete actual transactions in the HR databases.

  4. Defined content matching and response time criteria for the warning and critical error levels for the Web application monitor.

  5. Activated Windows authentication and specified the HR IT account to access the pages.
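The idea behind a synthetic transaction can be sketched in Python as follows. This is only a simplified illustration of timing a request and matching content, not how System Center Operations Manager agents work internally. The names `CheckResult`, `fetch_with_urllib`, and `run_synthetic_check` are hypothetical, and the sketch omits Windows authentication and the Run As account, which the real monitor requires.

```python
import time
import urllib.request
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one synthetic request (hypothetical structure)."""
    available: bool
    content_matched: bool
    elapsed_seconds: float


def fetch_with_urllib(url: str) -> str:
    """Fetch a page body with the standard library (no authentication)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def run_synthetic_check(url, expected_text, fetch=fetch_with_urllib,
                        clock=time.monotonic):
    """Request a page, time the request, and check for expected content.

    `fetch` and `clock` are injectable so that the check can be exercised
    without network access.
    """
    start = clock()
    try:
        body = fetch(url)
    except OSError:
        # urllib.error.URLError and socket timeouts derive from OSError.
        return CheckResult(False, False, clock() - start)
    elapsed = clock() - start
    return CheckResult(True, expected_text in body, elapsed)


if __name__ == "__main__":
    # Simulated page so that the example runs without a real Web server.
    result = run_synthetic_check(
        "http://benefits.example.invalid/",  # placeholder URL
        "Benefits Enrollment",
        fetch=lambda url: "<title>Benefits Enrollment</title>",
    )
    print(result)
```

A watcher node would run such a check on a schedule and compare the elapsed time with the warning and critical thresholds defined for the monitor.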

Deploying the Monitoring Solution

Following a functional test of the solution in the corporate production environment, Enterprise Monitoring created separate Author, Operator, and Read-Only Operator user roles for HR IT by using the Create User Role Wizard in the Operations Console. Enterprise Monitoring also restricted the authoring scope to the resources created in the custom management pack for HR IT. The Author user role provides HR IT with control over the Benefits Enrollment monitoring solution. The Operator user role includes a set of rights to interact with alerts, execute tasks, and access views according to their configured scope. The Read-Only Operator user role allows the corresponding users to view alerts and access views.

Enterprise Monitoring assigned the user roles to security groups. Assigning to security groups instead of individual user accounts enables HR IT to assign user roles by adding or removing group members in Active Directory without further involvement of an Enterprise Monitoring administrator. Through the corresponding security groups, HR IT assigned the Author user role to the program manager who is responsible for the Benefits Enrollment application, the Operator user role to Tier 2 support engineers, and the Read-Only Operator user role to Tier 1 specialists.
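The benefit of assigning user roles to security groups can be illustrated with a small Python model. The group names and user names are hypothetical, and the sketch models only the lookup logic, not Active Directory or the Operations Manager SDK. The role-to-group mapping is defined once, and afterward a change in group membership alone changes a user's effective role.

```python
# Defined once by the monitoring administrators (hypothetical group names).
ROLE_ASSIGNMENTS = {
    "HRIT-Benefits-Authors": "Author",
    "HRIT-Benefits-Tier2": "Operator",
    "HRIT-Benefits-Tier1": "Read-Only Operator",
}


def roles_for_user(user, group_members, role_assignments=ROLE_ASSIGNMENTS):
    """Return the set of user roles a user holds through group membership."""
    return {
        role
        for group, role in role_assignments.items()
        if user in group_members.get(group, set())
    }


if __name__ == "__main__":
    # Group membership is maintained by the application team.
    groups = {
        "HRIT-Benefits-Authors": {"program_manager"},
        "HRIT-Benefits-Tier2": {"engineer_a"},
        "HRIT-Benefits-Tier1": {"specialist_b"},
    }
    print(roles_for_user("specialist_b", groups))

    # Moving a user to another group changes the role;
    # ROLE_ASSIGNMENTS itself stays untouched.
    groups["HRIT-Benefits-Tier1"].discard("specialist_b")
    groups["HRIT-Benefits-Tier2"].add("specialist_b")
    print(roles_for_user("specialist_b", groups))
```

In this model, only the `groups` dictionary changes when staff move between tiers, which mirrors how HR IT manages role membership without involving an Enterprise Monitoring administrator.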

Pilot Project Review

The HR IT group used the solution to monitor the Benefits Enrollment application very closely during the open enrollment period in 2006 and created daily performance and availability reports. These reports included the number of concurrent connections, processor time, processor queue length, available memory, memory pages swapped per second, and free disk space for each server. Based on these reports, HR IT communicated statistics to stakeholders in the Microsoft IT and HR departments that showed 100 percent application availability with an average server utilization of 4.42 percent (Web servers), 13.04 percent (database servers), and 7.92 percent (reporting servers). The Benefits Enrollment application had 26,687 unique visitors that saved 17,046 records.

After the end of the open enrollment period, Enterprise Monitoring and HR IT reviewed the pilot project and summarized their findings as follows:

  • Increased flexibility to meet service levels The end-to-end service management solution gave HR IT flexible control over all application-monitoring aspects. These aspects included synthetic transaction monitoring and the customization of alert streams and reports according to application-specific support requirements and the promotion of service desk, Tier 1, and Tier 2 operators. Essentially, the monitoring solution appeared as owned exclusively by HR IT, yet without the overhead of infrastructure maintenance and administration. HR IT was promptly able to satisfy business unit requests for custom availability and performance reports without the involvement or assistance of Enterprise Monitoring.

  • Clear separation of responsibilities Whereas HR IT focused on satisfying the needs of the business user through highly customized monitoring and support solutions, Enterprise Monitoring focused on maintaining a centralized monitoring infrastructure in the corporate production environment. Enterprise Monitoring can increasingly concentrate on overall enterprise monitoring, and a large number of small application support teams (each containing four IT specialists or fewer) can take care of each individual LOB application, quickly identifying and resolving issues that affect service levels.

  • Low development costs Enterprise Monitoring created the base version of the Benefits Enrollment monitoring solution by using standard tools and wizards readily available in the System Center Operations Manager Operations Console. Providing authoring permissions to HR IT freed Enterprise Monitoring from having to determine detailed business unit requirements during the initial solution deployment.

  • Rapid deployment The end-to-end service management solution did not require a specialized monitoring environment. All required resources are centrally available. Reusing existing technology investments eliminates the need to deploy additional management agents, servers, or data warehouses for reporting purposes. Deploying end-to-end service management solutions is mainly a task of delegating appropriate user roles to the individual business IT groups.

Best Practices

A well-defined project scope and monitoring plan, intuitive authoring tools and wizards, and a clear separation of administration, authoring, and operational tasks enabled Microsoft IT to deliver the end-to-end service management solution for the Benefits Enrollment application in time to run the first pilot during the open enrollment period of 2006. Based on the experiences gained during the pilot phase, Microsoft IT developed the following best practices to use in planning and deploying end-to-end service management solutions based on System Center Operations Manager 2007:

  • Centralize the monitoring infrastructure By monitoring the entire corporate production environment in a centralized infrastructure, Microsoft IT can include all components that affect the availability and performance of LOB applications in end-to-end service management solutions without having to duplicate the underlying health information. Standardizing on System Center Operations Manager and reusing existing investments helps Microsoft IT to lower costs.

  • Decentralize application monitoring By giving individual business IT groups targeted monitoring tools (for their specific LOB applications) that include all components that the LOB applications depend on, Microsoft IT can create an overlapping monitoring environment where individual teams can support each other to ensure the highest service levels across the entire IT organization.

  • Focus on immediate value By providing basic, solid, and usable monitoring solutions for mission-critical LOB applications with the most business value, Microsoft IT can quickly realize improved service levels in the LOB application space. In subsequent steps, individual business IT groups can customize their solutions according to specific needs.

  • Analyze application behavior and dependencies By defining a clear monitoring plan that reflects the application behavior under critical and error conditions, Microsoft IT can ensure that the end-to-end solutions effectively monitor the availability and performance of the corresponding LOB applications.

  • Monitor availability and performance from all relevant locations By using synthetic transactions and agent-managed computers in all relevant locations as watcher nodes, Microsoft IT can track availability and performance from the perspective of all employees who use the LOB application.

  • Use Windows security groups for role-based authorization By granting the Author, Operator, or other permissions to security groups instead of individual user accounts in System Center Operations Manager, Microsoft IT can delegate permissions management to the individual teams responsible for LOB applications.

  • Document the solution By using the standard authoring tools available in the Operations Console and a basic set of documentation that outlines the components and health settings to be monitored, Microsoft IT can quickly recreate monitoring solutions if necessary.

Conclusion

System Center Operations Manager 2007 provides the foundation for Microsoft IT to reduce costs while increasing service levels for LOB applications in the corporate production environment. Performance and scalability improvements and new features, such as role-based authorization, enable Microsoft IT to streamline IT operations by consolidating management groups. Centralizing the monitoring infrastructure eliminates redundant management servers and reporting databases. Consolidating all health information in a centralized monitoring infrastructure is also the basis for developing end-to-end service management solutions for individual teams and groups that maintain LOB applications and other systems in the corporate production environment. Individual monitoring solutions can overlap to include shared devices, servers, components, and services, without duplicating the underlying health data.

Reusing existing technology investments is one key element for Microsoft IT to lower costs and operations overhead. Another element is reusing knowledge and expertise from Microsoft operating system, server, client, and application development teams, as well as from third-party vendors through management packs. Yet another key element is the integration of System Center Operations Manager with Helpdesk ticketing and other systems to automate routine tasks in order to increase the efficiency of IT operations. Microsoft IT furthermore automates redundant administration tasks, such as the deployment of management agents and the discovery of new systems and applications in the corporate production environment.

System Center Operations Manager provides end-to-end service management that is easy to customize and extend. Service templates and intuitive design tools, such as the Distributed Application Designer and the Add Monitoring Wizard in the Operations Console, enable Microsoft IT to create end-to-end service management solutions in short development cycles. The solutions include synthetic transaction monitoring to measure application availability and performance from the end-user perspective.

The groups that actually use the monitoring solutions have authoring permissions to customize the solutions further according to the specific needs of the business units. For example, business IT groups can customize performance and availability reports to give business units convenient access to statistics and other information that shows achieved service levels and application performance.

The new service-oriented views and availability reporting enable both the Microsoft IT operations teams and management to get the information that they need to identify and resolve issues that affect the end-to-end delivery of IT services. In this way, System Center Operations Manager enables Microsoft IT to perfect the alignment of IT services with the needs of the company.

For More Information

For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information Centre at (800) 563-9048. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information through the World Wide Web, go to:

http://www.microsoft.com

http://www.microsoft.com/technet/itshowcase
