Managing Line of Business Applications using Microsoft System Center Operations
Manager 2007
Technical Solution Brief
Published: March 19, 2007
Situation
Microsoft IT established service level agreements with 99.99 percent availability
for infrastructure services. Yet, LOB applications depend on interactions between
components and services that are not directly related to infrastructure services.
Accordingly, tracking availability and performance for these distributed business
solutions is difficult.
Solution
Using System Center Operations Manager 2007, Microsoft IT can implement new end-to-end
monitoring solutions in a centralized monitoring infrastructure. The new solutions
are the basis for proactive management of LOB applications.
Benefits
- Lower operating costs
- Increased service levels
- Alignment of business and IT services
- Increased productivity of employees
Products & Technologies
- Microsoft Operations Manager 2005
- Microsoft System Center Operations Manager 2007
- Microsoft SQL Server 2005
- Microsoft ADO.NET
- Message Queuing
Microsoft® System Center Operations Manager 2007 provides the foundation for
the Microsoft Information Technology (Microsoft IT) group to increase operational
efficiency and advance the alignment of IT services with the needs of Microsoft.
Microsoft IT achieves these goals by providing business units with greater control
over line-of-business (LOB) applications based on a new generation of end-to-end
service management solutions.
LOB applications are at the core of all areas of business at Microsoft. They facilitate
strategic planning and internal business processes, in addition to collaboration
and communication with customers, partners, and vendors. They also enable employees
to manage their careers and human resources (HR) benefits programs. Overall, more
than 2,500 LOB applications exist in the corporate production environment. Managing
these complex and distributed applications is a vital task for Microsoft IT.
This technical solution brief explains how Microsoft IT used System Center Operations
Manager 2007 to develop an end-to-end service management solution for a mission-critical
LOB application in the corporate production environment. Microsoft IT conducted
this pilot project in November 2006 by using the Release Candidate 1 version of
the product.
This technical solution brief contains information for IT decision makers who are
evaluating the benefits of System Center Operations Manager and for application
designers and IT implementers who are planning to design and implement end-to-end
enterprise management solutions. This paper assumes that the audience is already
familiar with the concepts of Microsoft Windows Server® 2003, the Active Directory®
directory service, and Microsoft Operations Manager (MOM) 2005. A high-level understanding
of the new features and technologies included in System Center Operations Manager
is also helpful. Detailed product information is available at
http://go.microsoft.com/fwlink/?LinkId=64017.
Note: For security reasons, the sample names of internal resources,
application components, services, and solutions used in this document do not represent
actual names used within Microsoft and are for illustration purposes only. In addition,
the contents of this document describe how Microsoft IT runs its enterprise data
centers. The procedures and processes included in this document are not intended
to be prescriptive guidance on how to run a generic data center and may not be supported
by Microsoft Customer Support Services.
Introduction
Microsoft IT maintains a complex enterprise environment that consists of 16 data
centers, 441 office locations in 98 countries, and more than 121,000 users. The
corporate network includes approximately 350,000 workstations and 11,000 production
servers, housing more than 1,000 terabytes of data. On top of this infrastructure
reside 2,500 LOB applications that Microsoft IT maintains to support the business
units of the company, including finance, HR, marketing, operations, purchasing,
and sales.
Maintaining this large-scale environment is primarily the task of the following
two main divisions in Microsoft IT:
- Infrastructure Services (IS): Maintains and supports the telephony and network
infrastructure, the server operating systems, the core infrastructure services (such
as Domain Name System [DNS] and Active Directory), the global messaging environment,
and the monitoring infrastructure. More than 2,500 IT professionals work in the
IS division, organized in teams according to technology services.
- Business IT: Develops, maintains, and supports the LOB applications in the
corporate production environment. For example, the HR IT group is responsible for
the LOB applications of the HR department, whereas the Commercial IT group is in
charge of all sales, marketing, and product support applications. Another group
provides support for the central enterprise resource planning (ERP) system that
Microsoft uses for all financial and supply-chain management and for storing the
HR master data. The Financial IT group manages the payroll and accounting applications.
Overall, the business IT groups include more than 1,500 IT professionals.
Figure 1 illustrates how IS and business IT groups view the corporate production
environment. Reflecting their different perspectives, IS uses monitoring tools that
focus on the infrastructure, whereas business IT groups need tools that include clients,
servers, applications, and synthetic transaction monitoring of the end-user perspective
to manage and improve the end-to-end delivery of IT services.
Figure 1. IS and business IT views of the corporate production environment
Improving Service Levels for LOB Applications
By using MOM 2005 and a limited number of third-party solutions, Microsoft IT established
99.99 percent availability for business-critical infrastructure services, such as
Microsoft Exchange Server, Microsoft SharePoint® Products and Technologies,
and Active Directory. First-level operators discovered about 87 percent of all infrastructure
issues before a user recognized the issues. About 98 percent of all alerts were
detected within two minutes of occurrence. However, as internal support statistics
revealed, users still found the majority of application problems, putting the business
IT groups in a reactive position. Business IT groups needed effective monitoring
solutions that supported a proactive service management approach beyond the capabilities
of MOM 2005 in order to identify and resolve issues that affect the health of distributed
IT services before users discover these issues.
To remove the obstacles that prevented Microsoft IT from increasing service levels
in the LOB application space prior to System Center Operations Manager 2007, Microsoft
IT devised the following strategy:
- Consolidate separate monitoring environments in a centralized System Center Operations
Manager infrastructure that covers the entire corporate production environment.
- Develop cost-efficient management packs and end-to-end service management solutions
for standard and custom LOB applications by using new authoring and editing tools
that are readily available in System Center Operations Manager.
- Delegate monitoring responsibilities to the individual Microsoft IT teams and groups
according to their areas of control so that they can integrate the relevant pieces
of health information into their operational processes.
Centralizing the Monitoring Infrastructure
Highly distributed LOB applications consist of loosely coupled components that share
common resources, some distributed worldwide. To achieve effective end-to-end service
management for these types of applications, the monitoring infrastructure must cover
the entire corporate production environment, including hardware devices, Microsoft
products, custom Microsoft IT solutions, and non-Microsoft components. Any gaps
in the monitoring infrastructure represent gaps in the end-to-end service management
scenario.
The following System Center Operations Manager features enabled Microsoft IT to
establish a centralized global monitoring infrastructure:
- Performance and scalability enhancements: Support of high-availability technologies,
such as Windows® Clustering in Windows Server 2003, security-enhanced and automated
agent deployments based on Microsoft Systems Management Server (SMS) packages, and
automated agent configuration discovery based on Active Directory, enable Microsoft
IT to deploy System Center Operations Manager efficiently. System Center Operations
Manager automatically discovers new systems and applications and deploys the available
and appropriate monitoring policies. Months before the product's release to manufacturing
(RTM), Microsoft IT already managed more than 20,000 computers by using System Center
Operations Manager 2007 Release Candidate 2.
- Security enhancements: Active Directory integration, role-based authorization,
and Run As accounts enable Microsoft IT to support individual teams and groups that
have varying monitoring needs and security requirements without having to deploy
and maintain redundant management servers and reporting databases.
- Simple Network Management Protocol (SNMP): Direct support of SNMP-enabled
devices provides an opportunity to reduce costs by replacing third-party solutions.
It also enables Microsoft IT to include these devices in end-to-end service management
solutions for LOB applications to provide operators with a complete view of the
entire application infrastructure.
- Operations Manager Connector Framework: Based on Operations Manager Connector
Framework, Microsoft IT integrated System Center Operations Manager with external
systems, such as the Helpdesk ticketing system for incident management; the configuration
management database (CMDB) to track server purchases, configurations, and retirements;
and previous third-party solutions to manage network devices. Bi-directionally synchronized
integration enables centralized enterprise management through the Operations Console.
Note: As a direct result of the consolidation efforts, Microsoft IT
established a new Enterprise Monitoring team responsible for maintaining the centralized
monitoring infrastructure.
Developing Cost-Efficient End-to-End Service Management Solutions
System Center Operations Manager includes more than 50 management packs to provide
prescriptive knowledge and automated in-line tasks based on best practices directly
from Microsoft operating system, server, client, and application development teams.
Moreover, Microsoft IT imported management packs provided by Microsoft partners
to manage third-party business solutions, such as the central ERP system and the
customer relationship management (CRM) system. Microsoft IT also used the conversion
tools available in System Center Operations Manager to import custom management
packs developed earlier for LOB applications. These management packs extend the
monitoring capabilities of System Center Operations Manager to cover all relevant
LOB application dependencies.
In addition to an improved management pack design, System Center Operations Manager
includes templates for common services, such as messaging and custom Microsoft ASP.NET
applications, and graphical design tools, such as in-line wizards, to create new
custom management solutions. These solutions require significantly less development
effort than solutions built with previous Operations Manager versions. Customizing and authoring
diagnostic and planning reports that are integrated directly into the Operations
Console are also straightforward processes. Service-oriented views and reports enable
both the operations teams and Microsoft IT management to get the information that
they need to quickly identify and resolve issues that affect service levels.
For LOB applications that use Windows Server, Microsoft SQL Server™, Microsoft .NET,
Internet Information Services (IIS), and other technologies as building blocks,
Microsoft IT creates end-to-end service management solutions that include:
- Application health model: By using the Distributed Application Designer in
the Operations Console, Microsoft IT defines a health model for the infrastructure
components of the distributed application based on templates and discovered relationships
and component types, and then creates the monitors, rules, views, and reports necessary
to manage these components. The health model describes how the state of the individual
components influences the state of the LOB application.
- Synthetic transactions: By using Web Application Editor in the Operations
Console, Microsoft IT records a sequence of user actions, such as connecting to
a site and browsing through Web pages, which are then played back to provide information
about how the LOB application is performing. Agent-managed computers, located anywhere
in the corporate production environment, can perform these synthetic transactions
at regular intervals to achieve genuine monitoring of the end-user perspective.
Synthetic transactions also enable Microsoft IT to stress-test LOB applications
and to see whether monitoring settings, such as alerts and notifications, perform
as expected.
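The way component state influences application state can be illustrated with a minimal sketch. It assumes a worst-state rollup policy, in which the application is only as healthy as its least healthy component. The brief does not state which rollup policy Microsoft IT configured, and the component names and the `rollup` function are hypothetical.

```python
from enum import IntEnum


class Health(IntEnum):
    """Health states ordered from best to worst."""
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def rollup(component_states):
    """Worst-state rollup (an assumed policy): return the worst component state.

    An application with no components is treated as healthy.
    """
    return max(component_states.values(), default=Health.HEALTHY)


# Hypothetical component groups for an LOB application.
components = {
    "web_tier": Health.HEALTHY,
    "hr_databases": Health.WARNING,
    "core_services": Health.HEALTHY,
}

application_state = rollup(components)
print(application_state.name)
```

Other policies, such as a percentage-based rollup for load-balanced servers, fit the same structure by replacing the body of `rollup`.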
Delegating Monitoring Responsibilities
By exploiting new features, such as role-based authorization through Active Directory,
Microsoft IT can define various levels of access to the monitoring environment based
on the principle of least privilege in order to provide security for the execution
of tasks. For example, System Center Operations Manager supports various user roles,
including read-only operators, operators, and authors. Microsoft IT uses these user
roles to provide support specialists and group managers with access to monitoring
solutions according to their responsibilities, as follows:
- Service desk and Tier 1 support: Users who are experiencing problems with
an LOB application first contact the internal service desk. Service desk and Tier
1 support groups provide assistance concerning functional questions and problems
related to the user interface. These support groups have the permissions of read-only
operators so that they can use the Operations Console to view relevant alerts and
check the health of application components. This information facilitates escalation
decisions.
- Tier 2 support: Application problems that service desk and Tier 1 support
groups cannot solve are escalated to a Tier 2 support engineer in the business IT
group that is responsible for the affected LOB application. This engineer must locate
and solve technical problems in a highly distributed LOB application space. Accordingly,
Tier 2 support engineers are assigned the Operator role so that they can interact
with alerts, run tasks, and access views to identify malfunctioning components or
bottlenecks in the infrastructure and solve problems quickly.
- Group managers: Based on the Author role, Microsoft IT provides individual
teams and groups with control over subsets of resources in the monitoring infrastructure.
In a first step, an administrator from the Enterprise Monitoring team defines the
basic monitoring resources and configurations for each group. Within the configured
scope, group managers who have authoring permissions can then use the Operations
Console to perform administrative tasks, such as creating rules, alert streams,
monitors, and views, without having to depend on an Enterprise Monitoring administrator.
In this way, the business units gain more control over service management functions.
Figure 2 illustrates how Enterprise Monitoring uses role-based authorization to
support individual teams and groups, such as HR IT and Commercial IT, in a
consolidated monitoring environment.
Figure 2. Role-based authorization in a centralized monitoring infrastructure
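The delegation model described in this section can be sketched as a scoped permission check. This is an illustration only: the role names follow the brief, but the action names, the `Assignment` type, and the `is_allowed` function are hypothetical and do not represent the Operations Manager API.

```python
from dataclasses import dataclass

# Assumed permission sets; the brief names the roles but not their exact rights.
ROLE_ACTIONS = {
    "read_only_operator": {"view_alerts", "view_health"},
    "operator": {"view_alerts", "view_health", "update_alerts", "run_tasks"},
    "author": {
        "view_alerts", "view_health", "update_alerts", "run_tasks",
        "create_rules", "create_monitors", "create_views",
    },
}


@dataclass(frozen=True)
class Assignment:
    """A user's role plus the monitored groups the role applies to."""
    role: str
    scope: frozenset


def is_allowed(assignment, action, target_group):
    """Allow an action only inside the assigned scope and only if the role grants it."""
    in_scope = target_group in assignment.scope
    granted = action in ROLE_ACTIONS.get(assignment.role, set())
    return in_scope and granted


tier1 = Assignment("read_only_operator", frozenset({"HR IT"}))
hr_manager = Assignment("author", frozenset({"HR IT"}))

print(is_allowed(tier1, "view_alerts", "HR IT"))
print(is_allowed(tier1, "run_tasks", "HR IT"))
print(is_allowed(hr_manager, "create_rules", "Commercial IT"))
```

Checking scope and role together reflects the principle of least privilege: an author in HR IT can create rules for HR IT resources but not for resources that belong to Commercial IT.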
Line-of-Business Applications at Microsoft
Microsoft spends more than $500 million each year on the acquisition, design, development,
implementation, support, and maintenance of server systems and LOB applications.
About 20 percent of the Microsoft IT LOB applications are mission critical or used
for executive decision making.
Among the most critical are the following solutions that establish a core infrastructure
for other LOB applications:
- E-Mail Notifications Web service: This is an automated system that implements
action-based and time-based notification logic to notify users of tasks to complete.
For example, if users do not reply to an e-mail notification within a period specified
by the business rules, the E-Mail Notifications system triggers reminders that ask
the users to take action. Centralizing e-mail notification handling in a dedicated
Web service helps to eliminate duplicated business logic in LOB applications, reduces
overhead associated with the management of business rules, and leads to leaner and
more robust application code.
- Error Logging Web service: This service provides a centralized solution for
error reporting in the corporate production environment. Structured exception handling
is an integral part of all Microsoft IT LOB applications. If an exception is raised
in an LOB application, the application code serializes the Exception object
and passes the debugging information in XML format to the Error Logging Web service
to report that a critical error has occurred.
- Role-Based Permissions (RBP) system: This service centralizes security and
permissions management for LOB applications through automated processes that replicate
changes of master data from the company's ERP system. All Microsoft IT LOB applications
support single sign-on based on Integrated Windows authentication. Most internally
developed LOB applications also rely on RBP to help ensure that only authorized
users with appropriate roles can access sensitive data, such as personally identifiable
information. For example, an LOB application for HR might allow a manager to see
social security numbers and salaries of direct employees, whereas employees can
see only their own data. Currently, RBP defines roles for about 14,000 managers.
- Digital Asset Store: Microsoft IT manages personally identifiable information
and other highly sensitive data in a centralized and encrypted database called Digital
Asset Store. The Enterprise Data Services (EDS) group within Microsoft IT created
this database solution based on Microsoft SQL Server 2005 to isolate highly sensitive
data from the LOB application space. To provide this information to subscribers,
Digital Asset Store integrates with FeedStore.
- FeedStore: This is a 2-terabyte data warehouse that pulls data from 39 internal
sources, including the company's ERP system and other key databases from business
units, and feeds more than 500 subscribing LOB applications worldwide via three
distribution servers. By using transactional replication, the distribution servers
in Redmond, Dublin, and Singapore provide subsets of the FeedStore data to subscribers.
The EDS group developed and maintains FeedStore.
Note: Microsoft IT tracks all LOB applications in a database with information
about each application's purpose, status of changes, versions that are in production,
and other key data, such as availability and performance statistics.
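The error-reporting pattern used with the Error Logging Web service, in which application code serializes an exception and passes it on as XML, can be sketched as follows. The brief describes .NET applications; this Python sketch shows only the general idea. The XML element names, the `exception_to_xml` function, and the application name are hypothetical, and the call to the Web service itself is omitted.

```python
import traceback
import xml.etree.ElementTree as ET


def exception_to_xml(exc, application):
    """Serialize an exception into an XML error report (assumed schema)."""
    root = ET.Element("ErrorReport", application=application)
    ET.SubElement(root, "Type").text = type(exc).__name__
    ET.SubElement(root, "Message").text = str(exc)
    ET.SubElement(root, "StackTrace").text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    return ET.tostring(root, encoding="unicode")


try:
    int("not a number")
except ValueError as exc:
    # A real application would send this document to the central error service.
    report = exception_to_xml(exc, "BenefitsEnrollment")

print(report)
```

Keeping the serialization in one shared helper is what lets every application report errors in the same structure.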
Application Interdependencies
By reusing core services across a large number of other LOB applications, Microsoft
IT can reduce data duplication in the corporate production environment, keep sensitive
information in an encrypted central location to maintain confidentiality, reduce
the overhead associated with security and permissions management, apply common business
rules (such as e-mail notification logic) consistently across LOB applications,
and streamline error reporting and error handling. However, the downside of reusing
common components across a large number of LOB applications in a distributed service-oriented
architecture (SOA) is increased complexity in service management, maintenance, and
support.
For example, a complex Microsoft IT LOB application might rely on ASP.NET to implement
the basic business logic, use a variety of SQL Server 2005 databases through Microsoft
ADO.NET, and consume a number of separate Web services based on Simple Object Access
Protocol (SOAP) and XML to pull in data in a standardized way from the ERP system
or other sources. Due to the synchronous nature of the data communication, this
LOB application now depends on the availability and performance of its own components
in addition to the availability and performance of the Web services. LOB applications
can also use asynchronous communication methods based on transactional or message
queuing systems, such as SQL Server 2005 Service Broker or Message Queuing (also
known as MSMQ). Asynchronous communication enables distributed components to interact
even if individual components are temporarily unavailable, but it introduces a new
dependency on the availability and reliability of the queuing system.
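The effect of synchronous dependencies on availability can be made concrete with a short calculation. The sketch assumes that component failures are independent and that the application is available only when every component it calls synchronously is available, so the availabilities multiply. The count of five components is illustrative and does not come from the brief.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(availabilities):
    """Availability of a chain of synchronous dependencies (independence assumed)."""
    return math.prod(availabilities)


def downtime_minutes_per_year(availability):
    """Expected unavailable minutes per year for a given availability."""
    return (1 - availability) * MINUTES_PER_YEAR


single = 0.9999                            # 99.99 percent, as for infrastructure services
chain = serial_availability([single] * 5)  # hypothetical: five synchronous dependencies

print(f"single component: {downtime_minutes_per_year(single):.1f} minutes per year")
print(f"chain of five:    {downtime_minutes_per_year(chain):.1f} minutes per year")
```

Under these assumptions, a single component at 99.99 percent is unavailable for about 52.6 minutes per year, while a chain of five such components is unavailable for about 262.7 minutes per year. This is why infrastructure-level service levels alone do not guarantee the same level for a distributed application.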
Benefits Enrollment Application
The Benefits Enrollment application is an example of a mission-critical LOB application
that incorporates SQL Server databases, Web services, and asynchronous communication
methods based on Message Queuing. HR IT maintains this application for nearly 30,000
Microsoft employees in the United States. Each year, during the employee open enrollment
period, U.S. employees can use this mission-critical Web application to change coverage
for medical, dental, employee life, and long-term disability benefits after qualifying
changes in family or employment status. The open enrollment period starts on November
1 and ends on November 30. Employees who do not initiate a change during this period
will automatically continue their current benefits package into the next year. During
the remainder of the calendar year, employees can use the Benefits Enrollment application
to review current benefits information.
Figure 3 shows the architecture of the Benefits Enrollment application. Employees
access this Web-based solution through one of five front-end Web servers, clustered
through Network Load Balancing (NLB) to ensure high availability and scalability.
The required client browser is Windows Internet Explorer®. Communication between
client and server relies on Secure Sockets Layer (SSL), Hypertext Transfer Protocol
Secure (HTTPS), and trusted connections for network security. The Web servers run
IIS version 6.0 and host the application's ASP.NET pages. The application's databases
reside on a SQL Server 2005 failover cluster based on Windows Clustering. Another
server cluster hosts two virtual servers in an Active/Active configuration that
runs core LOB services for data exchange with FeedStore and E-Mail Notifications,
RBP, and other application support services. The communication between servers occurs
through a mixture of technologies, including ADO.NET, SOAP/XML, and Message Queuing.
Figure 3. Benefits Enrollment application architecture
Distributed Application Monitoring
Using System Center Operations Manager 2007 to pilot an end-to-end service management
solution for the Benefits Enrollment application enabled Microsoft IT to highlight
new distributed operational processes. In these processes, Microsoft IT takes advantage
of role-based authorization in a consolidated environment to implement effective
monitoring of loosely coupled LOB applications from the end-user perspective.
Distributed Enterprise Monitoring Strategy
At Microsoft IT, individual teams and groups view their responsibilities from the
perspective of providing business-critical services. This directly leads to Microsoft
IT's philosophy of measuring service levels: Keeping servers running is not sufficient.
Instead of focusing on individual server status, Microsoft IT considers all components
required to deliver a service to the business user.
Whereas the Enterprise Monitoring team maintains the global monitoring infrastructure
and health information for all IT groups, the individual groups use specifically
designed monitoring tools to view this information, create reports, and configure
alert streams according to specific requirements of their LOB applications and operational
processes. When LOB applications share common resources and components, individual
end-to-end service management tools overlap and rely on the same health information,
but without duplicating the underlying data, management agents, or management servers.
Customizations that one group applies to a monitoring solution (such as custom alerts)
in order to accommodate group-specific needs do not affect the monitoring solutions
of other groups that include the same resources and components.
Figure 4 shows an example that illustrates the distributed enterprise monitoring
principle. The EDS group maintains the FeedStore data warehouse that a large number
of other LOB applications use, such as the Benefits Enrollment application that
the HR IT group maintains. By using an end-to-end service management solution, an
EDS operator can observe the mission-critical FeedStore warehouse to ensure its
availability and performance. An HR IT operator can use another monitoring solution
to keep track of the Benefits Enrollment application. Because this is an end-to-end
scenario, the HR IT monitoring solution covers all components that the Benefits
Enrollment application depends on, including FeedStore. HR IT does not maintain
FeedStore, yet to the end user, this is irrelevant. From the end user's point of
view, any problem concerning the Benefits Enrollment application is the jurisdiction
of HR IT—end to end, including FeedStore and all other core services and components.
Figure 4. End-to-end service management of distributed LOB applications at Microsoft
Implementing distributed monitoring based on System Center Operations Manager has
the following advantages for Microsoft IT:
- Reuse of existing investments in Windows technologies: Microsoft IT groups
can monitor their specific LOB applications and components without having to deploy
or maintain redundant Active Directory, SQL Server, or monitoring infrastructures.
- Proactive end-to-end management of distributed IT services: Any issue that
affects the availability or performance of an LOB application is directly visible
within that LOB application's monitoring solution. Operators can quickly locate
trouble spots and bottlenecks, whether they reside within the application itself
or in any of the devices, systems, and components that the application depends on.
- Rich, new reports and an easy-to-customize reporting environment: Diagnostic
and planning reports include comprehensive information and reveal the root causes
of incidents. The business units can recognize the causes of issues, whether they
are the responsibility of the group maintaining the LOB application or that of another
group, bringing additional insight to troubleshooting and planning.
- Increased service levels: Overlapping monitoring solutions enable individual
groups within Microsoft IT to support each other to ensure the highest service levels
across the entire IT organization.
Benefits Enrollment Monitoring Solution
To create the monitoring solution for the Benefits Enrollment application according
to the distributed monitoring strategy of Microsoft IT, the Enterprise Monitoring
team and the HR IT group closely collaborated and performed the following steps
based on guidelines outlined in the Microsoft Solutions Framework (MSF):
- Clarifying business requirements and project scope: To establish a common
understanding of the intended monitoring solution, Enterprise Monitoring demonstrated
the capabilities of System Center Operations Manager to the HR IT group. After this
presentation and based on the architecture of the Benefits Enrollment application,
HR IT and Enterprise Monitoring defined the scope and deliverables for the pilot
version of the monitoring solution.
- Planning the monitoring solution: Within the defined scope of the project,
Enterprise Monitoring and HR IT analyzed how employees work with the Benefits Enrollment
application and how the individual components and dependencies within the application's
architecture affect users. Among other activities, the two teams established baselines
that defined levels of health states for the Benefits Enrollment application. For
example, if response times exceeded 10 seconds, the state of the application would
be considered unhealthy.
- Creating the monitoring solution: Using standard authoring tools available
in System Center Operations Manager, the Enterprise Monitoring team created the
components of the monitoring solution according to HR IT specifications. The Enterprise
Monitoring team first developed the solution on a test system and then manually
re-created the solution in the corporate production environment. Re-creating the
monitoring solution required less than one hour of work.
- Stabilizing and deploying the monitoring solution: Following a quick functionality
check in the corporate production environment, the Enterprise Monitoring team delivered
the solution to HR IT by granting authoring permissions to the product manager responsible
for the Benefits Enrollment application. According to the defined scope, this step
concluded the project for the Enterprise Monitoring team. Granting authoring permissions
enabled HR IT to customize the solution further, such as by defining alert streams,
notifications, custom views, and reports.
Note: Detailed information about the MSF, including an MSF Resource
Kit and case studies, is available on Microsoft TechNet at
http://www.microsoft.com/technet/solutionaccelerators/msf/default.mspx.
Clarifying Business Requirements and Project Scope
Taking beta issues and time constraints into consideration, Enterprise Monitoring
and HR IT agreed that a basic, reliable, and usable monitoring solution had more
immediate value than a deluxe version delivered after the end of the employee open
enrollment period. By focusing on core functionality, Enterprise Monitoring delivered
the base solution swiftly, providing HR IT with a solid foundation to apply customizations
in subsequent steps.
Figure 5 shows the decision tree that Enterprise Monitoring and HR IT followed to
determine the scope of the project. In a first step, the teams decided to include
only the actual application elements and the core services in the monitoring solution.
The teams considered hardware, network, and infrastructure elements (such as Active
Directory, DNS, and Dynamic Host Configuration Protocol [DHCP]) less important for
the pilot version because Microsoft IT guarantees 99.99 percent availability in these areas.
Adding these components to the monitoring solution would have given HR IT a very
detailed view of the application infrastructure, yet it would have required Microsoft
IT to fully deploy System Center Operations Manager 2007 in the corporate production
environment, which did not correspond to the plans of Microsoft IT for the Release
Candidate 1 version of the product.
Figure 5. Clarifying monitoring scope for the Benefits Enrollment application
Enterprise Monitoring and HR IT defined the project scope based on the following
decisions:
- Included components: The base solution must monitor the Web servers that host
the application, the HR databases on MSSQLDB, and the core services running on MSMQCLUST01
and MSMQCLUST02. (Figure 3 earlier in this paper showed the position of these servers
in the Benefits Enrollment application architecture.)
- Component health monitoring: The base solution must include a distributed
application model and health rollup configuration so that HR IT can monitor the
health of each application component.
- Application performance monitoring: The base solution must include a Web application
monitor to perform synthetic transactions so that HR IT can track availability and
response times of the Benefits Enrollment application based on the actual end-user
perspective. The Web application monitor must use a special Run As account because
System Center Operations Manager system accounts are not authorized to access HR
LOB applications.
-
Role-based authorization The base solution needs to include only standard
views and reports. For alert streams, notifications, and other customizations, Enterprise
Monitoring grants authoring permissions to HR IT so that HR IT can create these
elements without the involvement of Enterprise Monitoring.
Planning the Monitoring Solution
To create an effective monitoring design, Enterprise Monitoring and HR IT analyzed
the impact of the individual application components on the overall user experience,
specifically how employees would notice component failures and performance bottlenecks
while working with the Benefits Enrollment application. For example, the Benefits
Enrollment application is hosted on five Web servers in an NLB cluster (as shown
in Figure 3 earlier in this paper). Because the NLB cluster provides automatic failover
capabilities, unavailability of a single Web server does not affect users. The remaining
Web servers are available and have sufficient capacity to handle the user requests
without performance impact. Accordingly, Enterprise Monitoring and HR IT decided
that a critical situation occurs only if more than 50 percent of the servers in
the NLB cluster are unavailable.
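The threshold arithmetic behind this decision can be sketched in a few lines. This is only an illustration of the "more than 50 percent" rule described above; the function names are hypothetical and are not part of System Center Operations Manager.

```python
def min_unavailable_for_critical(server_count: int) -> int:
    """Smallest number of unavailable servers that exceeds 50 percent."""
    return server_count // 2 + 1


def is_critical(server_count: int, unavailable: int) -> bool:
    """True only when more than half of the servers are unavailable."""
    # Comparing 2 * unavailable with server_count avoids fractional arithmetic.
    return 2 * unavailable > server_count
```

For the five Web servers in the NLB cluster, two unavailable servers (40 percent) do not trigger a critical state, whereas three unavailable servers (60 percent) do.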
Table 1 summarizes the key monitoring criteria that Enterprise Monitoring and HR
IT established for the Benefits Enrollment application.
Table 1. Monitoring Plan for the Benefits Enrollment Application
| Category | Components | Monitoring criteria |
|---|---|---|
| Web site | WEB01, WEB02, WEB03, WEB04, and WEB05 | The health state is critical if more than 50 percent of the servers in the NLB cluster are unavailable. Because the NLB cluster provides automatic failover capabilities, unavailability of a single Web server does not represent a critical state. |
| HR databases | MiscData, DataReplication, General, IssueTracking, Metadata, and Payments | The health state is critical if any of the databases is unavailable. |
| Core LOB services | E-Mail Notifications, Issue Tracking, RBP, and data exchange services (FeedStore) | The health state is critical if any of the core LOB services is unavailable. |
| Web application | Response times | The health state reaches the warning level if the response times exceed 10 seconds and the critical level if response times exceed 20 seconds. User tolerance for performance is around 20 seconds response time for a page. |
Creating the Monitoring Solution
To implement the monitoring plan according to the scope of the pilot version, the
Enterprise Monitoring team created the following elements by using the Operations
Console:
- Custom management pack: System Center Operations Manager uses the concept
  of management packs to save the rules and configuration settings of a monitoring
  solution. Enterprise Monitoring delivered the Benefits Enrollment monitoring solution
  in a single custom management pack to HR IT.
- Distributed application model: The application model is a structural representation
  of the distributed application, outlining the application's organization in terms
  of component groups and group relationships. To create the application model for
  the Benefits Enrollment solution, Enterprise Monitoring used the Distributed Application
  Designer directly available in the Operations Console.
- Health rollup configuration: Each component group within the distributed application
  model provides access to health rollup settings. Enterprise Monitoring configured
  these settings to describe the dynamic behavior of the Benefits Enrollment application
  according to the monitoring criteria that Enterprise Monitoring and HR IT established
  during the planning phase.
- Web application monitor: To check availability and measure response times
  of the Benefits Enrollment application, Enterprise Monitoring created a Web application
  monitor. This monitor performs synthetic Hypertext Transfer Protocol (HTTP) requests
  in a recorded browser session by using the account credentials that HR IT provided.
Creating the Distributed Application
Enterprise Monitoring created the application model for the Benefits Enrollment
solution by using the Distributed Application Designer and the Line of Business
Web Application template. This template contains predefined container groups for
Web sites and databases. In addition to the predefined groups, Enterprise Monitoring
created a third container for the core LOB services and established a relationship
between the Web site component group and the application services component group.
Because the core LOB services run on Windows Clustering virtual servers, Enterprise
Monitoring restricted the new component group to the Windows Cluster Resource
object class. Subsequently, Enterprise Monitoring used the Objects pane in the Distributed
Application Designer to automatically discover all relevant objects and added them
via a drag-and-drop operation to the corresponding component groups. Figure 6 displays
the resulting application model.
Figure 6. Benefits Enrollment application model
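As a rough illustration of what such an application model captures, the following sketch represents the three component groups and the one relationship described above as plain Python data. It is not the format that the Distributed Application Designer stores. The class and field names are hypothetical, the object class names "Web Site" and "Database" are placeholders, and the direction of the relationship (Web sites depend on the core LOB services) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentGroup:
    """One container (component group) in the application model."""
    name: str
    object_class: str  # class that members of the group must belong to
    members: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)  # names of related groups


def build_model():
    """Return the three component groups, keyed by group name."""
    groups = [
        ComponentGroup(
            "Web sites", "Web Site",  # object class name is a placeholder
            members=["WEB01", "WEB02", "WEB03", "WEB04", "WEB05"],
            depends_on=["Core LOB services"],  # direction is an assumption
        ),
        ComponentGroup(
            "Databases", "Database",  # object class name is a placeholder
            members=["MiscData", "DataReplication", "General",
                     "IssueTracking", "Metadata", "Payments"],
        ),
        ComponentGroup(
            "Core LOB services", "Windows Cluster Resource",
            members=["E-Mail Notifications", "Issue Tracking", "RBP", "FeedStore"],
        ),
    ]
    return {group.name: group for group in groups}
```

Keeping the relationships as group names rather than object references makes the model easy to print, compare, and document.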
Defining Health Rollup Settings
Table 2 summarizes the health rollup configuration that Enterprise Monitoring applied
to the individual component groups in the distributed application model. The Enterprise
Monitoring team configured only the most basic parameters according to the monitoring
plan discussed in the section "Planning the Monitoring Solution" earlier in this
paper. Authoring permissions enable the HR IT team to define further settings on
a group-specific basis.
Table 2. Health Rollup Configuration for the Benefits Enrollment Application
| Component group | Parameter | Setting | Effective value |
|---|---|---|---|
| Benefits Enrollment Web application Web sites | Rollup algorithm | Worst state of a percentage of members in good health state | Show the worst state of any member |
| | Percentage | 50 | |
| Benefits Enrollment Web application databases | Rollup algorithm | Worst health state of any member | Show the worst state of any member |
| Benefits Enrollment Web application services | Rollup algorithm | Worst health state of any member | Show the worst state of any member |
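The two rollup algorithms named in Table 2 can be sketched as follows. This is a simplified model rather than the product's implementation. The state names, the function names, and in particular the rule for rounding the percentage to a member count (rounding up) are assumptions, chosen so that the result matches the monitoring plan for five Web servers.

```python
# Health states ordered from best to worst (names are illustrative).
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def worst_of_any(states):
    """Rollup: the group takes the worst state of any member."""
    return max(states, key=SEVERITY.__getitem__)


def worst_of_best_percentage(states, percentage):
    """Rollup: worst state among the given percentage of healthiest members."""
    ranked = sorted(states, key=SEVERITY.__getitem__)  # best members first
    # Round the member count up (assumption); always consider at least one member.
    count = max(1, (len(ranked) * percentage + 99) // 100)
    return ranked[count - 1]
```

With five members and a setting of 50 percent, this sketch evaluates the three healthiest members. The group therefore stays healthy while two servers are unavailable and becomes critical when three are unavailable.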
Creating a Web Application Monitor
To measure availability and performance of the Benefits Enrollment application from
the end-user perspective, Enterprise Monitoring added a Web application monitor
object to the custom management pack for HR IT. Based on the configuration settings
defined in this monitor object, System Center Operations Manager agents running
on managed computers in the corporate production environment perform synthetic transactions.
An agent-managed computer that is running an application monitor is called a watcher
node. To track availability and performance comprehensively across the corporate
production environment, Enterprise Monitoring selected computers in all U.S. locations
to act as watcher nodes.
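To illustrate why several watcher nodes are useful, the following sketch aggregates the outcomes of synthetic transactions by location. The data layout and the function name are hypothetical. System Center Operations Manager collects and reports this information itself, so the code only shows the underlying calculation.

```python
from collections import defaultdict


def availability_by_location(results):
    """Return the percentage of successful checks for each location.

    `results` is an iterable of (location, succeeded) pairs, one pair per
    synthetic transaction that a watcher node performed.
    """
    totals = defaultdict(lambda: [0, 0])  # location -> [successes, attempts]
    for location, succeeded in results:
        totals[location][1] += 1
        if succeeded:
            totals[location][0] += 1
    return {loc: 100.0 * ok / attempts for loc, (ok, attempts) in totals.items()}
```

A location whose percentage drops while the others stay at 100 points to a problem between that location and the application rather than in the application itself.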
Enterprise Monitoring and HR IT performed the following steps to create the Web
application monitor:
- Added an HR IT system account to the custom management pack as a Run As account
  to provide an appropriate identity for the Web application monitor.
- Created the Web application monitor in the Operations Console by using the Add Monitoring
  Wizard and the Web Application management pack template. During this configuration
  step, Enterprise Monitoring assigned agent-managed computers from all relevant locations
  as watcher nodes.
- Recorded a browser session to perform synthetic transactions. The transactions simulate
  typical user actions, such as logging on and browsing Web pages. Because of legal
  constraints, Enterprise Monitoring and HR IT did not configure the Web application
  monitor to complete actual transactions in the HR databases.
- Defined content matching and response time criteria for the warning and critical
  error levels for the Web application monitor.
- Activated Windows authentication and specified the HR IT account to access the pages.
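The following sketch shows the general idea of a synthetic transaction with content matching and response time criteria. It is not how the Web application monitor is implemented. It issues a single unauthenticated HTTP GET request instead of replaying a recorded browser session with Windows authentication. The function names are hypothetical, the 10-second and 20-second defaults come from the monitoring plan in Table 1, and treating a content mismatch or a failed request as critical is an assumption.

```python
import time
import urllib.request


def fetch_page(url, timeout=30):
    """Fetch a page body as text (plain HTTP GET, no authentication)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def run_synthetic_check(url, expected_text, fetch=fetch_page,
                        warning_seconds=10.0, critical_seconds=20.0,
                        clock=time.monotonic):
    """Request a page, match its content, and classify the response time."""
    start = clock()
    try:
        body = fetch(url)
    except OSError:  # urllib's URLError and socket timeouts derive from OSError
        return {"state": "critical", "elapsed": clock() - start, "content_ok": False}
    elapsed = clock() - start
    content_ok = expected_text in body
    if not content_ok or elapsed > critical_seconds:
        state = "critical"
    elif elapsed > warning_seconds:
        state = "warning"
    else:
        state = "healthy"
    return {"state": state, "elapsed": elapsed, "content_ok": content_ok}
```

Passing the fetch function and the clock as parameters keeps the check testable without network access. A watcher node would call the function on a schedule and forward the result to the monitoring infrastructure.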
Deploying the Monitoring Solution
Following a functional test of the solution in the corporate production environment,
Enterprise Monitoring created separate Author, Operator, and Read-Only Operator
user roles for HR IT by using the Create User Role Wizard in the Operations Console.
Enterprise Monitoring also restricted the authoring scope to the resources created
in the custom management pack for HR IT. The Author user role provides HR IT with
control over the Benefits Enrollment monitoring solution. The Operator user role
includes a set of rights to interact with alerts, execute tasks, and access views
according to their configured scope. The Read-Only Operator user role allows the
corresponding users to view alerts and access views.
Enterprise Monitoring assigned the user roles to security groups. Assigning to security
groups instead of individual user accounts enables HR IT to assign user roles by
adding or removing group members in Active Directory without further involvement
of an Enterprise Monitoring administrator. Through the corresponding security groups,
HR IT assigned the Author user role to the program manager who is responsible for
the Benefits Enrollment application, the Operator user role to Tier 2 support engineers,
and the Read-Only Operator user role to Tier 1 specialists.
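The delegation model can be sketched as two lookups: user roles are bound to security groups once, and a user's rights follow from group membership. The group names and right names below are hypothetical. The rights of the Operator and Read-Only Operator roles follow the description above, whereas the exact rights of the Author role are an assumption.

```python
# Rights per user role (the Author rights are an assumption).
ROLE_RIGHTS = {
    "Author": {"author_monitoring", "view_alerts", "interact_with_alerts",
               "execute_tasks", "access_views"},
    "Operator": {"view_alerts", "interact_with_alerts", "execute_tasks",
                 "access_views"},
    "Read-Only Operator": {"view_alerts", "access_views"},
}

# Role assignment made once by Enterprise Monitoring (hypothetical group names).
GROUP_ROLES = {
    "HRIT-Benefits-Authors": "Author",
    "HRIT-Benefits-Tier2": "Operator",
    "HRIT-Benefits-Tier1": "Read-Only Operator",
}


def rights_for(user, group_members):
    """Return the union of rights granted through the user's group memberships."""
    rights = set()
    for group, members in group_members.items():
        role = GROUP_ROLES.get(group)
        if role is not None and user in members:
            rights |= ROLE_RIGHTS[role]
    return rights
```

Because only the group membership data changes when a person joins or leaves a team, the role assignments in `GROUP_ROLES` never need to be touched by an Enterprise Monitoring administrator.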
Pilot Project Review
The HR IT group used the solution to monitor the Benefits Enrollment application
very closely during the open enrollment period in 2006 and created daily performance
and availability reports. These reports included the number of concurrent connections,
processor time, processor queue length, available memory, memory pages swapped per
second, and free disk space for each server. Based on these reports, HR IT communicated
statistics to stakeholders in the Microsoft IT and HR departments that showed 100
percent application availability with an average server utilization of 4.42 percent
(Web servers), 13.04 percent (database servers), and 7.92 percent (reporting servers).
The Benefits Enrollment application had 26,687 unique visitors that saved 17,046
records.
After the end of the open enrollment period, Enterprise Monitoring and HR IT reviewed
the pilot project and summarized their findings as follows:
- Increased flexibility to meet service levels: The end-to-end service management
  solution gave HR IT flexible control over all application-monitoring aspects. These
  aspects included synthetic transaction monitoring and the customization of alert
  streams and reports according to application-specific support requirements and the
  promotion of service desk, Tier 1, and Tier 2 operators. Essentially, the monitoring
  solution appeared as owned exclusively by HR IT, yet without the overhead of infrastructure
  maintenance and administration. HR IT was able to promptly satisfy business unit
  requests for custom availability and performance reports without the involvement
  or assistance of Enterprise Monitoring.
- Clear separation of responsibilities: Whereas HR IT focused on satisfying
  the needs of the business user through highly customized monitoring and support
  solutions, Enterprise Monitoring focused on maintaining a centralized monitoring
  infrastructure in the corporate production environment. Enterprise Monitoring can
  increasingly concentrate on overall enterprise monitoring, and a large number of
  small application support teams (each containing four IT specialists or fewer) can
  take care of each individual LOB application, quickly identifying and resolving
  issues that affect service levels.
- Low development costs: Enterprise Monitoring created the base version of the
  Benefits Enrollment monitoring solution by using standard tools and wizards readily
  available in the System Center Operations Manager Operations Console. Providing
  authoring permissions to HR IT freed Enterprise Monitoring from having to determine
  detailed business unit requirements during the initial solution deployment.
- Rapid deployment: The end-to-end service management solution did not require
  a specialized monitoring environment. All required resources are centrally available.
  Reusing existing technology investments eliminates the need to deploy additional
  management agents, servers, or data warehouses for reporting purposes. Deploying
  end-to-end service management solutions is mainly a task of delegating appropriate
  user roles to the individual business IT groups.
Best Practices
A well-defined project scope and monitoring plan, intuitive authoring tools and
wizards, and a clear separation of administration, authoring, and operational tasks
enabled Microsoft IT to deliver the end-to-end service management solution for the
Benefits Enrollment application in time to run the first pilot during the open enrollment
period of 2006. Based on the experiences gained during the pilot phase, Microsoft
IT developed the following best practices to use in planning and deploying end-to-end
service management solutions based on System Center Operations Manager 2007:
- Centralize the monitoring infrastructure: By monitoring the entire corporate
  production environment in a centralized infrastructure, Microsoft IT can include
  all components that affect the availability and performance of LOB applications
  in end-to-end service management solutions without having to duplicate the underlying
  health information. Standardizing on System Center Operations Manager and reusing
  existing investments helps Microsoft IT to lower costs.
- Decentralize application monitoring: By giving individual business IT groups
  targeted monitoring tools (for their specific LOB applications) that include all
  components that the LOB applications depend on, Microsoft IT can create an overlapping
  monitoring environment where individual teams can support each other to ensure the highest
  service levels across the entire IT organization.
- Focus on immediate value: By providing basic, solid, and usable monitoring
  solutions for mission-critical LOB applications with the most business value, Microsoft
  IT can quickly realize improved service levels in the LOB application space. In
  subsequent steps, individual business IT groups can customize their solutions according
  to specific needs.
- Analyze application behavior and dependencies: By defining a clear monitoring
  plan that reflects the application behavior under critical and error conditions,
  Microsoft IT can ensure that the end-to-end solutions effectively monitor the availability
  and performance of the corresponding LOB applications.
- Monitor availability and performance from all relevant locations: By using
  synthetic transactions and agent-managed computers in all relevant locations as
  watcher nodes, Microsoft IT can track availability and performance from the perspective
  of all employees who use the LOB application.
- Use Windows security groups for role-based authorization: By granting the
  Author, Operator, or other permissions to security groups instead of individual
  user accounts in System Center Operations Manager, Microsoft IT can delegate permissions
  management to the individual teams responsible for LOB applications.
- Document the solution: By using the standard authoring tools available in
  the Operations Console and a basic set of documentation that outlines the components
  and health settings to be monitored, Microsoft IT can quickly recreate monitoring
  solutions if necessary.
Conclusion
System Center Operations Manager 2007 provides the foundation for Microsoft IT to
reduce costs while at the same time increasing service levels for LOB applications
in the corporate production environment. Performance and scalability improvements
and new features, such as role-based authorization, enable Microsoft IT to streamline
IT operations by consolidating management groups. Centralizing the monitoring infrastructure
eliminates redundant management servers and reporting databases. Consolidating all
health information in a centralized monitoring infrastructure is also the basis
for developing end-to-end service management solutions for individual teams and groups
that maintain LOB applications and other systems in the corporate production environment.
Individual monitoring solutions can overlap to include shared devices, servers,
components, and services, without duplicating the underlying health data.
Reusing existing technology investments is one key element for Microsoft IT to lower
costs and operations overhead. Another element is reusing knowledge and expertise
from Microsoft operating system, server, client, and application development teams,
as well as from third-party vendors through management packs. Yet another key element
is the integration of System Center Operations Manager with Helpdesk ticketing and
other systems to automate routine tasks in order to increase the efficiency of IT
operations. Microsoft IT furthermore automates redundant administration tasks, such
as the deployment of management agents and the discovery of new systems and applications
in the corporate production environment.
System Center Operations Manager provides end-to-end service management that is
easy to customize and extend. Service templates and intuitive design tools, such
as the Distributed Application Designer and the Add Monitoring Wizard in the Operations
Console, enable Microsoft IT to create end-to-end service management solutions in
short development cycles. The solutions include synthetic transaction monitoring
to measure application availability and performance from the end-user perspective.
The groups that actually use the monitoring solutions have authoring permissions
to customize the solutions further according to the specific needs of the business
units. For example, business IT groups can customize performance and availability
reports to give business units convenient access to statistics and other information
that shows achieved service levels and application performance.
The new service-oriented views and availability reporting enable both the Microsoft
IT operations teams and management to get the information that they need to identify
and resolve issues that affect the end-to-end delivery of IT services. In this way,
System Center Operations Manager enables Microsoft IT to perfect the alignment of
IT services with the needs of the company.
For More Information
For more information about Microsoft products or services, call the Microsoft Sales
Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information
Centre at (800) 563-9048. Outside the 50 United States and Canada, please contact
your local Microsoft subsidiary. To access information through the World Wide Web,
go to:
http://www.microsoft.com
http://www.microsoft.com/technet/itshowcase