Reducing Costs and Improving Systems Management with Hyper-V and System Center Operations Manager
Technical Case Study
Published: June 2010
Is your company looking for ways to better manage your computer systems and reduce hardware costs? Learn how Microsoft IT uses the Hyper-V™ feature available in the Windows Server® 2008 R2 operating system to virtualize its development, test, and production systems, reduce physical hardware demand, and improve CPU utilization. Microsoft IT is also using System Center Operations Manager 2007 to comprehensively monitor and respond to system issues, enabling Support teams to proactively address incidents and reduce user support calls.
Products & Technologies
The Microsoft Entertainment and Devices (E&D) division was encountering long lead times of up to two months to provision physical hardware for development and test purposes. Delays for production servers could reach up to three months.
In addition, the division’s manual monitoring of critical jobs made it difficult to quickly identify system health issues. Most problems were called in by users, which resulted in a suboptimal reactive support mode.
Microsoft IT built a new virtual development/test environment for E&D using Hyper-V and designed a highly granular monitoring solution based on Microsoft System Center Operations Manager 2007.
Microsoft IT migrated Microsoft Operations Manager (MOM) 2005 management packs to the System Center Operations Manager 2007 platform, and standardized server deployment with virtual server templates.
The Entertainment and Devices (E&D) division at Microsoft is responsible for bringing a large array of products to market. In any given quarter, almost 100 updates to existing products and new products and technologies are being developed, tested, and prepared for release.
The Microsoft Information Technology (Microsoft IT) team responsible for E&D's infrastructure was struggling with the ever-increasing demand for development, testing, and production servers that needed to be provisioned for each product release. Not only were system hardware costs escalating, but also the physical space required to house all these new systems was in short supply. Delays in bringing new systems online were starting to impact release schedules.
In the rush to deliver the new development, test, and production servers, systems were not always configured the same way. Over time, Microsoft IT discovered a large discrepancy between environments. The debugging and testing processes were inefficient due to the number of system variances, and in many cases, rollbacks to previous states were unavailable.
Microsoft IT was also challenged by its existing support model, which was primarily a manual process: employees would typically call Support and ask Microsoft IT to respond. Not only were manually created Service Requests (SRs) expensive to resolve, but in many cases multiple people would call Support about the same issue, which generated many SRs for a single system failure.
Microsoft IT needed to evolve its operations from a reactive mode to a proactive one; to do so required implementing an automated monitoring solution that could monitor the entire E&D environment at a very granular level and notify Support at the first detection of trouble.
Microsoft IT is implementing a broad-reaching new system operations model for E&D that:
- Uses the Hyper-V feature available in Windows Server 2008 R2 to virtualize its development, test, and production systems in order to reduce overhead and facilitate rapid deployment of standardized development and test environments.
- Incorporates System Center Operations Manager (Operations Manager) 2007 to automate system monitoring and provide greater insight into the entire environment's operational status.
Operations Manager Implementation at Microsoft
This section of the document provides an overview of how Microsoft IT implemented Operations Manager 2007 in order to better monitor the entire E&D environment and establish the means to respond proactively to operational issues.
Why Operations Manager 2007?
System Center Operations Manager 2007 R2 enables businesses to reduce the cost of data center management across server operating systems and hypervisors through a single, familiar, and easy-to-use interface. Numerous views show state, health, and performance information, along with alerts that are generated when an availability, performance, configuration, or security condition is identified. With these views, operators can gain rapid insight into the state of the IT environment and the IT services running across physical and virtual systems and workloads.
As previously mentioned, E&D had been dependent on a manual support model where over 80 percent of support issues were expensive and time-consuming SRs, which are created when customers call Support. Microsoft IT wanted to upgrade its monitoring environment from Microsoft Operations Manager (MOM) 2005 to gain the enhanced application performance management that Operations Manager 2007 provides—especially its ability to granularly define service-level objectives that can be targeted against the different components that comprise an IT service. This capability was the foundation of E&D's new monitoring platform that was designed to enable timely, low-cost, system-generated alerts known as infrastructure requests (IRs) to become the standard means of notifying Support of a problem, and consequently enable Microsoft IT to respond proactively to situations as early as possible.
New Monitoring Platform
To create a system with the most granular and automated means of system monitoring, Microsoft IT's implementation plan was to upgrade from MOM 2005 to Operations Manager 2007 across the entire E&D environment.
Microsoft IT also wanted to incorporate Operations Manager's support of remote gateways into their new monitoring platform. A new feature in Operations Manager 2007, gateway servers gave Microsoft IT the ability to monitor systems in untrusted domains and consolidate traffic between locations. The following illustration shows the ability of gateway servers to aggregate data from systems in untrusted domains and pass their details to the primary domain.
Figure 1: Operations Manager 2007 gateway servers can pass system data in untrusted domains to Operations Manager 2007 management servers in the primary domain
The new monitoring platform was designed from the outset to have redundancy throughout the system, including:
- Clustered database servers
- Deployment of multiple management servers to the primary domain to allow for a manual failover of the root management server (RMS) as well as for agent failovers
- Installation of multiple gateways in remote domains to allow agents in remote locations to fail over gracefully
Microsoft IT also designed the new monitoring infrastructure to support the installation of additional monitoring servers as needed, as well as the ability to bring more gateways online when future remote locations need to be added.
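The agent failover behavior that this redundancy enables can be sketched in a few lines of Python. This is only an illustration of the idea, not Operations Manager code: the function name, the server names, and the notion of a simple ordered preference list are all assumptions made for the example.

```python
from typing import Iterable, Optional, Set


def choose_management_server(failover_order: Iterable[str],
                             reachable: Set[str]) -> Optional[str]:
    """Return the first reachable server in the agent's preference order, or None."""
    for server in failover_order:
        if server in reachable:
            return server
    return None


# Hypothetical server names; the first entry plays the role of the agent's primary.
order = ["ms-primary", "ms-secondary", "ms-tertiary"]

# Primary is up, so the agent stays on it.
print(choose_management_server(order, {"ms-primary", "ms-secondary"}))
# Primary is down, so the agent falls back to the next server in its list.
print(choose_management_server(order, {"ms-secondary", "ms-tertiary"}))
# Nothing is reachable, so there is no server to report to.
print(choose_management_server(order, set()))
```

With several management servers in the primary domain and several gateways in each remote domain, an agent always has another entry to try when its first choice is unavailable.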
Deployment of DEV/SIT/UAT and Production Monitoring Servers
The new Operations Manager 2007–based monitoring servers were deployed into two types of servers:
- DEV/SIT/UAT: These servers monitored development (DEV), system integration testing (SIT), and user acceptance testing (UAT) environments. By incorporating a custom set of management packs as described in the following section, three different versions of the same management pack (one each for DEV, SIT, and UAT) could reside on a single server.
- Production: A separate set of Operations Manager 2007 monitoring servers was allocated to monitor applications that are deployed for production use. Management packs deployed to these servers have been tested and tuned, and alerts are constantly being reviewed for further reduction of high-firing issues.
The following illustration provides a high-level schematic of the new Operations Manager 2007–based monitoring platform, including DEV/SIT/UAT servers, production servers, and gateways. It also highlights the ability of Operations Manager agents to monitor all environments, irrespective of whether they are virtual or physical.
Figure 2: Schematic of the new E&D monitoring platform, based on Operations Manager 2007
Management Pack Development
To ensure a standardized management pack development methodology, Microsoft IT designed a template that is used as the basis for all new E&D management packs. A batch file parses the management pack XML and updates fields to identify the management pack as DEV, SIT, or UAT. The batch file then executes a Windows PowerShell™ cmdlet to install the management pack into the appropriate environment (DEV/SIT/UAT).
The result is the automatic installation of a set of three unique versions of the same management pack that can all reside on a single Operations Manager server and monitor the three different types of systems (DEV, SIT, and UAT) without interacting with each other.
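The template-stamping step can be illustrated with a short Python sketch. It is not the batch file that Microsoft IT used, and the case study does not say which fields that file rewrites. The sketch assumes that the management pack ID and display name are the fields that change. The trimmed XML template, the `Contoso` names, and the `make_variant` helper are hypothetical, and the install step (a Windows PowerShell cmdlet in the original process) is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed template; a real management pack has many more sections.
TEMPLATE = """<ManagementPack>
  <Manifest>
    <Identity>
      <ID>Contoso.Sample.App</ID>
      <Version>1.0.0.0</Version>
    </Identity>
    <Name>Contoso Sample App</Name>
  </Manifest>
</ManagementPack>"""

ENVIRONMENTS = ("DEV", "SIT", "UAT")


def make_variant(template_xml: str, env: str) -> str:
    """Return a copy of the template whose ID and Name carry the environment tag."""
    root = ET.fromstring(template_xml)
    pack_id = root.find("./Manifest/Identity/ID")
    name = root.find("./Manifest/Name")
    # Assumption: a distinct ID per environment is what lets the copies coexist.
    pack_id.text = f"{pack_id.text}.{env}"
    name.text = f"{name.text} ({env})"
    return ET.tostring(root, encoding="unicode")


# One variant per environment, all derived from the same template.
variants = {env: make_variant(TEMPLATE, env) for env in ENVIRONMENTS}
for env, xml_text in variants.items():
    print(env, ET.fromstring(xml_text).findtext("./Manifest/Identity/ID"))
```

Because every variant is generated from one template, the three copies differ only in the stamped fields, which is the property the standardized methodology relies on.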
The following illustration displays the three different versions of the same management pack successfully operating with the same Operations Manager 2007 monitoring server.
Figure 3: Template from which multiple versions of the same Operations Manager 2007 management pack are developed
Since implementing their new monitoring platform based on Operations Manager 2007 for the E&D division, Microsoft IT has:
- Reduced the frequency of expensive, manual SRs from 88 percent of all support incidents down to 15 percent.
- Established a 24x7 monitoring team that is fully dedicated to monitoring the E&D environments with Operations Manager 2007.
- As of January 2008, achieved 100 percent adoption of standardized management pack development: all development efforts must use the E&D management pack template when creating their applications' monitoring agents.
As shown in the following illustration, implementing Operations Manager 2007 monitoring resulted in an increase of the timely, low-cost system-generated infrastructure requests (IRs) and a reduction in the number of expensive user-called service requests (SRs). With the new monitoring platform in place, Microsoft IT can respond to IRs proactively and focus on refining the process for overall alert reduction and increased customer satisfaction.
Figure 4: Trend showing increase in timely, low-cost, automated system-based alerts (yellow line) versus expensive user-phoned or email support calls (red line)
Implementing Server Virtualization with Hyper-V
This section of the document provides an overview of how Microsoft IT implemented Hyper-V to virtualize E&D's development, test, and production systems.
Hyper-V in Windows Server 2008 R2 offers increased flexibility in deployment and life cycle management of applications. Hyper-V is commonly used to consolidate workloads and reduce data center power consumption. Additionally, Hyper-V's clustering technologies provide a robust IT infrastructure with high availability and quick disaster recovery.
Windows Server 2008 R2 Hyper-V also provides greater flexibility. With Hyper-V live migration, administrators can move running virtual machines (VMs) from one Hyper-V physical host to another without any disruption or perceived loss of service. The live migration feature provides the core technology required for dynamic load balancing, VM placement, and high availability for virtualized workloads during physical computer maintenance.
Implementing a New Virtualized Server Environment
Microsoft IT wanted to evaluate Hyper-V virtualization in E&D's preproduction environments as a method of increasing operational efficiencies and reducing costs. Microsoft IT also wanted to see if they could use Hyper-V to shorten their system delivery lead times.
Microsoft IT saw Hyper-V as the best means to optimize the E&D division's infrastructure, both from an asset utilization standpoint as well as for its ability to balance workloads across different resources. In addition, the ability to use Hyper-V as a platform that could help standardize E&D's development, system integration testing, and user acceptance testing (DEV/SIT/UAT) environments was highly desirable.
In addition to Hyper-V, Microsoft IT also wanted to investigate incorporating System Center Virtual Machine Manager (SCVMM) as the centralized means to control the deployment and management of virtual systems. If the performance benchmarks for the new virtual servers were acceptable, Microsoft IT planned to expand its Hyper-V implementation to cover all environments as deemed appropriate.
The proof of concept (PoC) work began in 2008 with six virtual guests built on a single host. The primary evaluation criteria were performance, speed of deployment, and manageability, both from an IT perspective and for the E&D product teams.
The PoC was based on a pre-release version of Windows Server 2008 with Hyper-V. Even at this early stage, the Hyper-V technology demonstrated a significant improvement over Windows® Virtual Server 2005, and so the program expanded its scope by steadily increasing the number of virtual systems in use.
As shown in the following illustration, over the following two years as the Hyper-V technology matured with the release of Windows Server 2008 R2, the virtualization implementation has grown to over 600 VMs, with additional virtual systems coming online on a regular basis.
Figure 5: Number of virtual machines operating in the E&D division
Microsoft IT has also successfully used Hyper-V to reduce E&D's demand for new physical hardware while simultaneously improving existing hardware utilization. E&D achieved a compression ratio of 30:1 on many hosts and improved average CPU utilization from 3–5 percent to 65–80 percent. The following illustration charts the number of rack units that did not have to be deployed because VMs were used instead of physical servers.
Figure 6: Number of physical servers that avoided deployment due to virtualization
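To make the compression ratio concrete, the following Python sketch works through the arithmetic. It is a rough illustration, not a figure from this case study. It assumes that every host reaches the 30:1 ratio (the study reports it only for "many hosts") and that each VM would otherwise have needed its own physical server. The function names are hypothetical.

```python
import math


def hosts_needed(vm_count: int, vms_per_host: int) -> int:
    """Physical hosts required when VMs are packed at a uniform ratio."""
    return math.ceil(vm_count / vms_per_host)


def servers_avoided(vm_count: int, vms_per_host: int) -> int:
    """Servers not purchased, assuming one physical server per workload otherwise."""
    return vm_count - hosts_needed(vm_count, vms_per_host)


# Illustrative inputs: 600 VMs at a uniform 30:1 ratio (both are simplifying assumptions).
print(hosts_needed(600, 30))     # 20
print(servers_avoided(600, 30))  # 580
```

Real deployments will land below this upper bound, because not every host runs at the highest ratio and some capacity is held back for failover.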
From a cost perspective, the Hyper-V effort has resulted in savings derived from a number of factors. The following illustration shows the cumulative savings achieved through reduced hardware costs. Reductions in rack space, power consumption, space requirements, and hosting charges for discrete servers add to the savings.
Figure 7: Savings realized by utilizing virtual machines
Finally, implementing Hyper-V has enabled Microsoft IT to reduce the time required to deliver new server environments while ensuring the deployment of standardized DEV/SIT/UAT and production environments. In addition, by incorporating SCVMM into the DEV/SIT/UAT environments, users can build virtual systems out of a private pool without any IT interaction. Once the user has completed using the VM, they can release the resources back into that same pool for the next user.
In the course of working with Hyper-V and Operations Manager to design, implement, and operate the new Entertainment and Devices (E&D) infrastructure, Microsoft IT followed these best practices:
- Ensure that development, test, and corporate domains are separated to maximize security and performance. Allocate resources to these separate domains as appropriate—redundant servers in production environments for failover, and potentially isolated networks or subnetworks to support development and test activities.
- If you have untrusted domains, use Operations Manager gateway servers to communicate with the corporate network. Gateway servers can either pass monitoring agent data directly to a monitoring server in a trusted domain or be chained together, with one gateway server aggregating data from other downstream gateway servers and then passing all the data to a monitoring server.
- Adhere to standard Microsoft security recommendations and SDL practices to enhance server security. Adopt the Trustworthy Computing Security Development Lifecycle (SDL) to help develop software that can withstand malicious attacks. For more information, see http://msdn.microsoft.com/library/ms995349.aspx.
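The two gateway arrangements mentioned in the list above (direct and chained) can be pictured with a small Python model. This is a conceptual sketch only: the node names and the `NEXT_HOP` table are hypothetical and do not correspond to any Operations Manager configuration format.

```python
from typing import Dict, List

# Hypothetical topology: each node names the next hop toward the management server.
NEXT_HOP: Dict[str, str] = {
    "agent-a": "gateway-remote-1",        # agent behind a chained gateway
    "gateway-remote-1": "gateway-hub",    # downstream gateway forwards to a hub gateway
    "gateway-hub": "mgmt-server",         # hub gateway aggregates and forwards
    "agent-b": "gateway-direct",          # agent behind a directly connected gateway
    "gateway-direct": "mgmt-server",
}


def route(node: str, next_hop: Dict[str, str]) -> List[str]:
    """Follow next-hop links from a node until no further hop is defined."""
    path = [node]
    seen = {node}
    while path[-1] in next_hop:
        nxt = next_hop[path[-1]]
        if nxt in seen:
            # A chain that loops back on itself would never deliver data.
            raise ValueError(f"routing loop at {nxt}")
        path.append(nxt)
        seen.add(nxt)
    return path


print(" -> ".join(route("agent-a", NEXT_HOP)))
print(" -> ".join(route("agent-b", NEXT_HOP)))
```

In the chained case, the hub gateway is the only system in the remote chain that needs a connection to the monitoring server, which is how chaining consolidates traffic between locations.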
Development and Testing
- Use virtual machine templates to standardize virtual environments and enforce consistency across your infrastructure. Templates eliminate the human error that can affect builds, much as a Sysprep image automatically prepares a system for deployment to other environments.
- Establish a policy of including resolution steps for every alert that a management pack can fire. Instead of embedding the resolution steps directly in the rule, post them on a wiki page in a Microsoft SharePoint® site and link to that page, which makes the steps easier to manage and update.
- Use the Windows PowerShell cmdlets supplied with SCVMM to perform offline VM patching. The Offline Virtual Machine Servicing Tool 2.1 uses Windows PowerShell scripts to manage updating large numbers of offline virtual machines. For more information, see http://technet.microsoft.com//library/cc501231.aspx.
By implementing Hyper-V and Operations Manager 2007 in its Entertainment and Devices division, Microsoft IT has derived a number of benefits:
- Improved Delivery Time: With Hyper-V, Microsoft IT can deploy new virtual environments to DEV, SIT, and UAT teams far more quickly than the traditional time required to build and configure physical servers.
- Proactive Support: Operations Manager 2007's granular level of system monitoring has enabled Microsoft IT to identify and respond to system issues (IRs) in a much more proactive manner than the previous model, which was primarily based on manual and expensive phone calls (SRs) from users.
- Cost Savings: The E&D division is realizing savings of more than a million dollars from two sources: reduced infrastructure costs, because Hyper-V reduces demand for physical servers, hosting charges, rack space, and power consumption; and reduced support costs, because Operations Manager generates automated IR alerts.
- Consistency: Operations Manager management pack templates help ensure consistent development of management packs across all development teams. In addition, Microsoft uses Hyper-V snapshots to roll back changes to applications that did not meet the product's quality criteria.
- Simplified View and Management of Complex Environments: When used together, Operations Manager and SCVMM can provide an in-depth view on the utilization of the virtualization resources. This information can help you make the right decisions about hardware procurement, right-sizing virtual machines according to usage and performance, and augmenting an archival strategy to ensure productive use of resources.
Microsoft System Center Operations Manager (Operations Manager) 2007 and the Hyper-V feature available in Windows Server 2008 R2 have dramatically improved operations in Microsoft's Entertainment and Devices (E&D) division.
Before developing the new Operations Manager 2007–based monitoring platform, over 80 percent of E&D support issues were logged as expensive support requests (SRs) that were called in by users. With the new monitoring platform in place, system infrastructure requests (IRs) comprise the majority of identified issues. These automated alerts enable Support to respond proactively and have reduced SR calls to 15 percent. Microsoft IT continues to see regular reductions in the number of SRs and expects to eventually drive them down to only 5 percent of total service calls.
Similarly, incorporating Hyper-V as a virtualization platform for E&D development and test environments has significantly improved Microsoft IT's delivery time for new systems, helped enforce standardization across environments, and reduced costs. From an initial six virtual systems at the start of the pilot in 2008, Microsoft IT has deployed over 600 virtual machines and is bringing more online each month. Hyper-V has helped Microsoft IT achieve a 30:1 compression ratio on many server hosts and has improved average CPU utilization from 3–5 percent to 65–80 percent. In addition to helping Microsoft IT streamline its server deliveries, Hyper-V's ability to reduce hardware, rack space, and power demands has saved Microsoft more than $3 million in cumulative costs as of May 2010.
For More Information
For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information Centre at (800) 563-9048. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:
This document supports a preliminary release of a software product that may be changed substantially prior to final commercial release. This document is provided for informational purposes only and Microsoft makes no warranties, either express or implied, in this document. Information in this document, including URL and other Internet website references, is subject to change without notice. The entire risk of the use or the results from the use of this document remains with the user. Unless otherwise noted, the companies, organizations, products, domain names, email addresses, logos, people, places, and events depicted in examples herein are fictitious. No association with any real company, organization, product, domain name, email address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
© 2010 Microsoft Corporation. All rights reserved.
Microsoft, Hyper-V, SharePoint, SQL Server, Windows, Windows PowerShell, and Windows Server are trademarks of Microsoft Corporation in the United States and/or other countries.
All other trademarks are property of their respective owners.