Microsoft Avoids Millions of Dollars in Revenue Loss and Penalties with 99.95 Percent SAP Uptime
Published January 2014
To facilitate continuous operations, Microsoft needs its SAP enterprise resource planning (ERP) system to be available every day, around the clock. By using built-in features in Windows Server 2012 and SQL Server 2012, the company maintains 99.95 percent uptime, avoids spending millions of dollars, reduces risk, cuts disk space use by a factor of three, and boosts productivity.
To manage people, assets, services, and projects at Microsoft, employees rely on the company's SAP ERP system. To increase availability, reduce downtime during outages, and scale to meet evolving requirements, Microsoft IT upgraded the SAP ERP system to run on Windows Server 2012 and SQL Server 2012.
The move has helped dramatically reduce planned downtime and costs, improve efficiency, and minimize business risk. New functionality, including Windows Failover Clustering and SQL Server 2012 AlwaysOn, has given Microsoft IT the ability to provide the flexibility and reliability employees have come to expect.
"We can now install software security patches on our mission-critical SAP ERP system and accrue no more than one minute of downtime by using SQL Server 2012 AlwaysOn and Windows Failover Clustering."
Elke Bregler, Principal Service Architect, Microsoft IT
Microsoft is the worldwide leader in software, services, and solutions. To manage people, assets, services, and projects, every employee uses the company's SAP ERP system. Customers also use it when they access the Microsoft Marketplace. Because SAP ERP is mission-critical, engineers need to keep it available. "The cost of SAP ERP downtime really depends on the time of day and when it occurs on the calendar," says Elke Bregler, Principal Service Architect at Microsoft IT. "Five minutes might not cost us very much if it happens in the middle of the night during a slow period. However, five minutes of downtime during a busy period could cost us millions of dollars."
Previously, IT personnel made required system changes—including installing software patches or adding new business processes—only on weekends, in the middle of the night, because any modification elevated risk and required 30 minutes to several hours of downtime. Bregler says, "Whenever we made a system change, I would be nervous during the whole process because there was so much business risk involved. I felt like I aged five years each time."
The system's size and load also make it more difficult to keep it continuously available. For example, in peak months, SAP ERP processes more than 125 million transactions. In an average month, the system manages 4.6 million batch processes. It also requests about 120 terabytes (TB) of data from its 5.4 TB database that resides on a storage area network (SAN) with 402 petabytes of data. To increase uptime and still scale the system to meet evolving requirements, IT personnel sought tools that could boost efficiency and reduce downtime and risk.
Upgrading the SAP ERP system to run on the Windows Server 2012 Enterprise operating system and Microsoft SQL Server 2012 Enterprise Edition software helped engineers meet these goals. For example, IT personnel eliminate a single point of failure by using Windows Server 2012 Failover Clustering to run two replicated instances of SAP ERP. The cluster for the SAP Central Services Instances (CI), which is the single point of failure for SAP systems, runs on HP ProLiant DL380 G5 servers in an active/passive configuration. The same configuration exists in the disaster-recovery site. If a disaster (real or test) occurs, the failover of the SAP CI is expedited by using the same alias on both clusters. As a result, no one has to modify SAP ERP to recognize a different server name as the primary when failover occurs. Instead, engineers need only point the alias at the IP address of the primary cluster in the Domain Name System (DNS) in Windows Server.
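The shared-alias idea can be illustrated with a minimal Python sketch. It is a toy model, not Microsoft IT's tooling or a real DNS client: the AliasDirectory class, the alias name, and both IP addresses (taken from the reserved documentation ranges) are hypothetical.

```python
class AliasDirectory:
    """Toy stand-in for a DNS zone: maps an alias to a single IP address."""

    def __init__(self):
        self._records = {}

    def point(self, alias, ip_address):
        """Create or overwrite the record for an alias."""
        self._records[alias] = ip_address

    def resolve(self, alias):
        """Return the IP address the alias currently points at."""
        return self._records[alias]


# Hypothetical values, chosen only for illustration.
SAP_ALIAS = "sap-ci"
PRIMARY_SITE_IP = "192.0.2.10"
DR_SITE_IP = "198.51.100.10"


def connect(directory, alias):
    """Clients know only the alias; they never hard-code a cluster name."""
    return f"{alias} -> {directory.resolve(alias)}"


dns = AliasDirectory()
dns.point(SAP_ALIAS, PRIMARY_SITE_IP)
before_failover = connect(dns, SAP_ALIAS)

# Failover: the only change is the record behind the alias.
dns.point(SAP_ALIAS, DR_SITE_IP)
after_failover = connect(dns, SAP_ALIAS)

print(before_failover)
print(after_failover)
```

The client-side call is identical before and after the switch, which is why the application configuration does not need to change when the primary moves to the other site.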
At the database level, the IT team uses page-level compression in SQL Server 2012 to save disk space. Engineers also use SQL Server AlwaysOn availability groups to maintain three database replicas, each running on a separate HP ProLiant DL580 G7 server. The primary and secondary replicas, which are synchronous mirrors, each connect to an EMC Symmetrix VMAX enterprise SAN. The two SANs are also mirrored, and each attaches to a different cluster to reduce risk. The third database replica, which is at a remote data center for disaster-recovery purposes, is updated via asynchronous replication.
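The difference between the synchronous secondary and the asynchronous disaster-recovery replica can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the SQL Server AlwaysOn API: the class names, replica names, and the ship_async method are all hypothetical.

```python
from collections import deque


class Replica:
    """Holds the ordered list of records this copy of the database has applied."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def apply(self, record):
        self.log.append(record)


class AvailabilityGroupSketch:
    """Toy model: one primary, one synchronous and one asynchronous secondary."""

    def __init__(self):
        self.primary = Replica("primary")
        self.sync_secondary = Replica("sync-secondary")
        self.async_secondary = Replica("async-dr")
        self._pending_async = deque()

    def commit(self, record):
        """A commit completes only once the synchronous secondary has the record."""
        self.primary.apply(record)
        self.sync_secondary.apply(record)   # stands in for waiting on its acknowledgement
        self._pending_async.append(record)  # shipped to the remote site later

    def ship_async(self, max_records):
        """Deliver up to max_records queued records to the remote replica."""
        shipped = 0
        while self._pending_async and shipped < max_records:
            self.async_secondary.apply(self._pending_async.popleft())
            shipped += 1
        return shipped

    def async_lag(self):
        """Number of committed records the remote replica has not yet applied."""
        return len(self.primary.log) - len(self.async_secondary.log)


group = AvailabilityGroupSketch()
for record in ["txn-1", "txn-2", "txn-3"]:
    group.commit(record)

group.ship_async(max_records=2)
print(group.sync_secondary.log == group.primary.log)  # synchronous copy never lags
print(group.async_lag())                              # remote copy is one record behind
```

In this model the synchronous secondary always matches the primary, while the remote replica can trail by whatever is still queued, which is the trade-off that keeps commits from waiting on the link to the remote data center.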
To make a change to the SAP ERP system, engineers take the passive cluster or database replica offline and make the modification. After verifying that the offline cluster or system is working, an engineer adds it back into production as the passive cluster or replica, and it automatically synchronizes with the active cluster or replica. To update the active system, an engineer makes it the passive one and then follows the same procedure. To be safe, engineers also maintain several complete replicas of the SAP ERP environment for test and development.
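The procedure above can be sketched as a small Python simulation. It is an illustrative model only: the Node and ClusterPair types, the version strings, and the verify callback are hypothetical names, and real role changes are performed with the Windows Failover Clustering and SQL Server tools rather than code like this.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str
    online: bool = True


class ClusterPair:
    """Toy active/passive pair; the active node is never taken offline."""

    def __init__(self, active, passive):
        self.active = active
        self.passive = passive
        self.role_switches = 0

    def patch_passive(self, new_version, verify):
        """Take the passive node offline, change it, verify it, bring it back."""
        node = self.passive
        node.online = False
        node.version = new_version
        if not verify(node):
            # The node stays offline; the active node keeps serving users.
            raise RuntimeError(f"{node.name} failed verification")
        node.online = True

    def switch_roles(self):
        """Make the passive node active. This is the brief interruption."""
        if not self.passive.online:
            raise RuntimeError("passive node is offline; cannot switch roles")
        self.active, self.passive = self.passive, self.active
        self.role_switches += 1


def rolling_update(pair, new_version, verify):
    pair.patch_passive(new_version, verify)  # 1. update the passive side
    pair.switch_roles()                      # 2. swap roles
    pair.patch_passive(new_version, verify)  # 3. update the former active side


pair = ClusterPair(Node("node-a", "v1"), Node("node-b", "v1"))
rolling_update(pair, "v2", verify=lambda node: node.version == "v2")
print(pair.active, pair.passive, pair.role_switches)
```

Both nodes end up on the new version after a single role switch, so in this model the only user-visible interruption is the switch itself.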
By using features in SQL Server 2012 and Windows Server 2012, engineers increased the uptime of the mission-critical system; avoided unnecessary fees, risk, and stress; and boosted savings and efficiency.
Avoids Millions in Expenses with 99.95 Percent Uptime
Today, the SAP ERP system is available 99.95 percent of the time. As a result, Microsoft avoids paying millions of dollars in fees and lost productivity. "We can now install software security patches on our mission-critical SAP ERP system and accrue no more than one minute of downtime by using SQL Server 2012 AlwaysOn and Windows Failover Clustering," says Bregler. "We also replaced some of the CPUs in the SAP ERP database servers and migrated our SAN to new hardware, and, in both cases, we did it with just one minute of downtime." Engineers are using the same high-availability design to achieve similar results in other environments.
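To put the 99.95 percent figure in perspective, the short Python sketch below converts an availability percentage into a downtime allowance. The function name and the 365-day and 30-day periods are assumptions made for illustration; the article does not say over which period the figure is measured.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_percent, days):
    """Minutes of downtime allowed over `days` at the given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * days * MINUTES_PER_DAY


per_year = downtime_budget_minutes(99.95, 365)
per_month = downtime_budget_minutes(99.95, 30)
print(f"per 365-day year: {per_year:.1f} minutes")
print(f"per 30-day month: {per_month:.1f} minutes")
```

Under these assumptions the allowance is roughly 263 minutes per year, or about 22 minutes per 30-day month, so a change that costs one minute of downtime uses only a small share of it, whereas a single change that takes 30 minutes or more would exceed a month's allowance.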
Minimizes Business Risk
Although unplanned downtime is rare, it can happen. Engineers have dramatically reduced the impact of an unexpected error just by using features in Windows Server 2012 and SQL Server 2012. For example, running SAP ERP on failover clusters means that servers or even a cluster can fail and the application will continue to run. In addition, engineers can easily maintain three copies of the production database. On the off chance that the primary and secondary replicas go offline simultaneously, engineers can fail over to the third copy: it will not yet reflect the error because it's updated asynchronously. IT personnel also reduce risk and streamline the approval process for system modifications by first making changes in the test environment. Engineers have also expedited disaster recovery processes by requiring ongoing drills using the test environment. "We have reduced our recovery point objective to just a few seconds and our disaster-recovery processes to just four hours," says Bregler.
Cuts Disk Space Threefold
Microsoft has minimized its data-center requirements and costs with database compression. Because IT staff have used different levels of compression for years, it is difficult to measure how much space and money engineers have saved with it. However, Bregler says, "We recently created a custom table for a new business process that had 1.5 billion rows and was 1.5 terabytes in size. By using page-level compression in SQL Server 2012, the table now spans 6 billion rows and yet it is only 400 gigabytes in size."
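The figures in the quote can be turned into a rough per-row comparison. The sketch below takes the quoted numbers at face value and assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes); the article does not state which convention applies, and the table's schema and contents may have changed between the two measurements.

```python
BYTES_PER_TB = 10**12  # assumed decimal units
BYTES_PER_GB = 10**9


def bytes_per_row(total_bytes, rows):
    """Average storage used per row."""
    return total_bytes / rows


size_before = 1.5 * BYTES_PER_TB   # 1.5 TB at 1.5 billion rows
size_after = 400 * BYTES_PER_GB    # 400 GB at 6 billion rows

row_before = bytes_per_row(size_before, 1.5e9)
row_after = bytes_per_row(size_after, 6e9)

footprint_ratio = size_before / size_after
per_row_ratio = row_before / row_after

print(f"before: {row_before:.0f} bytes/row, after: {row_after:.1f} bytes/row")
print(f"total footprint ratio: {footprint_ratio:.2f}")
print(f"per-row ratio: {per_row_ratio:.1f}")
```

On these assumptions the table's total footprint shrank by a factor of about 3.75 even though it holds four times as many rows, which works out to roughly 15 times less storage per row.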
Improves Efficiency and Peace of Mind
By running SAP ERP on Windows Server 2012 and SQL Server 2012, business employees have become more productive because they can dependably access a core tool. IT personnel are also more efficient and less stressed because they can work on the system while the replica application, database, SAN, or server supports operations. As a result, "We can now make changes to our SAP ERP system during regular business hours, which makes us more efficient," says Bregler. Not only are fewer errors made because people are rested, but more IT personnel are also available for questions, and spare parts are available faster. Bregler adds, "Windows Failover Clustering and SQL Server 2012 AlwaysOn have really given us much greater freedom and peace of mind. I'm also not aging as fast anymore because we have greatly reduced risk, and we have a lot more options for maintaining our mission-critical systems."
For More Information
For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:
© 2014 Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.