Microsoft Deploys Innovative Solution That Reports Near-Real-Time Data
Business Case Study
Published: June 2011
The following content may no longer reflect Microsoft’s current position or infrastructure. This content should be viewed as reference documentation only, to inform IT business decisions within your own company or organization.
Always looking for ways to improve efficiency, Microsoft wanted its executives and finance managers to be able to create near-real-time operating reports. The company also sought to increase system availability and cut costs. In 2011, Microsoft deployed a mission-critical reporting solution that uses high-availability features in Microsoft SQL Server 2008 R2 to move data between online transaction processing systems, data marts, and a reporting server—instead of using the traditional extract, transform, and load operations. The highly scalable solution also splits each 250-gigabyte data mart based on the age of information. As a result, data marts are updated every 7 minutes so that users can create accurate reports around the clock. The company has also increased data availability, reduced server footprint by 72 percent, sped system performance, and improved agility.
Business Case Study, 374 KB, Microsoft Word file
Software and Services
Based in Redmond, Washington, Microsoft is a Fortune Global 500 software company that employs 88,000 people and earned U.S. $62 billion in revenue in 2010.
Executives and finance managers needed a solution that could generate revenue and expense reports with near-real-time data.
|Microsoft deployed a reporting solution that separates information into current and historical data marts and uses only native features in Microsoft SQL Server 2008 R2 Enterprise to extract and move data.||
Microsoft provides comprehensive software, services, and IT solutions to people and companies around the globe. Since its founding in 1975, the company has grown into a multibillion-dollar organization that in 2010 ranked 36 in the Fortune 500 and 115 in the Fortune Global 500. Its 88,000 employees develop and support products that include operating systems, database and development technologies, and productivity tools.
To facilitate the corporate forecasting process, about 1,500 global executives and finance managers generate mission-critical revenue and expense reports during the first 10 days of each fiscal month. Before 2011, they created the reports by querying four data marts using Microsoft Excel 2010 spreadsheet software and a custom add-in similar to Microsoft SQL Server PowerPivot for Microsoft Excel. The data marts, which ran on Microsoft SQL Server 2008 Enterprise data management software and the Windows Server 2008 Enterprise operating system, each contained about 60 million records or 250 gigabytes (GB) of information about revenue and expense forecasts and budgets. The data marts did not have a reporting capability, so the data had to be exported to other reporting systems. Overall, the solution supported about 1,000 custom queries each day.
On average, the information in the data marts was updated once every 24 hours. More frequent updates were not possible because the solution used extract, transform, and load (ETL) operations to obtain information from mission-critical online transaction processing (OLTP) systems. The ETL operations slowed the performance of OLTP systems and so were conducted only periodically, during off-peak times.
To increase efficiency, the Microsoft IT department needed a reporting solution that could provide access to near-real-time data. This way, reports would contain accurate information regardless of what time of day they were created. In addition, users would not have to spend time waiting to validate the numbers in the report. The validation process was particularly inefficient because reports often included information from multiple data marts that were updated at different times each day. "Users were frustrated because they could enter data in an OLTP system but then not be able to see it in the reporting solution for a day or more," says Celestine Joseph, Senior Software Development Engineer in Business Intelligence Engineering at Microsoft.
Employee productivity also suffered from occasional system outages. The day, time, and length of an outage determined the level of employee stress, need for overtime, or limit on the number of reports that were generated during that time.
Another challenge was that the solution was expensive to maintain and scale. Each data mart used four to six server computers to support the distributed processing of data to create reports. Any significant increases in data growth, performance requirements, or number of users required additional hardware. Likewise, to create any new data marts, Microsoft had to make similar investments in server computers, data-center space, and IT resources.
Microsoft decided to implement a reporting solution that could increase efficiency, improve data accuracy, and simplify scalability.
After evaluating how it could use the Microsoft platform to address its challenges, IT personnel designed an innovative mission-critical reporting solution that uses database mirroring, change data capture (CDC), table compression, table partitioning, and the compressed backup and restore capabilities in Microsoft SQL Server 2008 R2. "People usually use SQL Server 2008 R2 database mirroring for high availability," says Joseph. "In this reporting solution, we use it as a tool to move data from OLTP databases to a processing server. This approach not only gets up-to-date information to the processing server, but it also has a negligible impact on the OLTP servers."
Another major change from the previous architecture includes the location and content of the data marts. Rather than using four to six server computers to support each data mart, the new solution consolidates the processing for revenue and expense data marts into one physical system. The solution then moves the data mart that the users access for their reporting needs into another physical system. In addition, each data mart is physically divided into two databases: one contains the data from the current month and the other contains data for the previous months.
Even though this solution doubles the number of data marts in this configuration, splitting the data marts based on the age of information significantly speeds overall solution performance, because the active data sets are only about 40–60 GB. "Usually the transactions on the OLTP systems involve the data about the current month," explains Joseph. "Because 80 to 90 percent of the data in the data marts almost never changes, there is no point in processing that static information over and over again. Even if there is an occasional change in the static data, it comes through a restatement process."
In January 2011, IT personnel deployed two new HP ProLiant DL580 G5 servers at the data center in Tukwila, Washington, to support the new solution. Each system includes 24 processors and 128 GB of RAM—and runs Windows Server 2008 R2 Enterprise and SQL Server 2008 R2 Enterprise. One server extracts and processes data from the OLTP systems. The second server includes read-only copies of the data marts that are used to generate reports. A third SQL Server instance on the processing server stores metadata about the data marts and it tracks which replicated servers are the primary sources of data.
To obtain information from the OLTP systems, IT personnel configured the processing server so that it contains mirrored copies of the revenue and expense databases, as shown in Figure 1.
Figure 1. The solution moves data between systems using high-availability features in SQL Server 2008 R2
Because the volume of transactions on the OLTP systems is low, IT personnel took advantage of the high-performance, asynchronous mirror setting so that data is written to the OLTP systems first and then to the mirrored copy. By using this setting, IT personnel reduced the impact on the OLTP systems. In addition, the tables in the OLTP databases have CDC enabled so that all new information is identified and tracked by SQL Server.
"By using SQL Server 2008 R2 to process data from the snapshots of the mirrored databases every seven minutes, we facilitate the creation of near-real-time reports and automatically ensure data integrity."
Senior Software Development Engineer in Business Intelligence Engineering
Every 7 minutes, SQL Server 2008 R2 takes snapshots of the mirrored databases and uses the information from CDC to identify all data that users entered into the OLTP systems since the last snapshots were taken. SQL Server 2008 R2 then processes the new data by applying business rules, structuring the data in the appropriate table format, and compressing it. The data is then stored in the appropriate data mart on the processing server. Processing all of the new data takes about three to four minutes.
To move the updated data marts to the reporting server so that users can query the information, every seven minutes SQL Server 2008 R2 initiates a compressed backup of the data marts using compressed striped backup. It then restores the data marts on the reporting server. The compressed striped backup uses the multiple processors available on the server efficiently, thereby increasing the performance of the backup and restore.
The above process is seamless to the users querying the reports because the architecture maintains primary and secondary copies of each data mart. Users always access the primary data mart, and the restore process always updates the secondary data mart. As soon as a restore process is complete, the secondary data mart becomes the primary and any new user requests are then automatically routed to the new primary data mart, which contains the up-to-date information.
To help promote the highest levels of availability, IT personnel clustered the processing and reporting servers. If either system becomes unavailable, the other system in the cluster automatically takes over so that users can still create near-real-time reports.
In February 2011, about 40 employees began to test the new solution. By September 2011, the 1,500 executives and finance managers who create operating reports will use it.
By taking advantage of built-in features in SQL Server 2008 R2 Enterprise, Microsoft has gained near-real-time reporting for corporate forecasting. The company has also improved availability, cut costs, sped data processing and system response times, simplified scalability, and improved agility.
Speeds Data Access and Improves Accuracy
Today, the revenue and expense data marts are updated 192 times each day. As a result, executives and finance managers can create reports that contain near-real-time information from OLTP systems. "By using SQL Server 2008 R2 to process data from the snapshots of the mirrored databases every seven minutes, we facilitate the creation of near-real-time reports and automatically ensure data integrity,"; says Joseph. "By using only native SQL Server features, the data quality and integrity is maintained even while moving across systems. This reduces the need for the users to do in-depth data validations thereby increasing efficiency and productivity."
Increases Data Availability
With redundancy built in at every step of the process, executives and finance managers have around-the-clock access to the reporting solution and no longer struggle with system outages. Not only does this ability reduce frustration for the users, but it also helps to ensure that they meet reporting deadlines that affect critical business decisions.
The system availability of 99.99 percent also helps IT personnel. "Because this solution is used by global employees, when system outages occurred in the past we used to send email messages to notify people about any system issue, indicate when we thought it would be available, and then send constant updates on the status," explains Joseph. "Now, because the new solution uses redundant systems and clustering, users will not be affected if a system outage occurs. If one server fails, the next available node begins to process data in two minutes or less—without any manual intervention from IT personnel."
Reduces IT Expenses
By changing the design of its reporting infrastructure, Microsoft has compressed information in the data marts by 39 percent and cut data-center and IT administration costs. "We reduced our hardware footprint from 18 systems to just 5–or 72 percent–by consolidating infrastructure and using features in SQL Server 2008 R2 Enterprise for high availability and data compression," says Joseph. "IT personnel also have fewer systems to manage."
Accelerates System Performance
The new solution not only minimizes the hardware footprint, it also puts less stress on the user-facing OLTP systems and speeds the processing of mission-critical data. As a result, users can access near-real-time data for reports, rather than using information that was entered into OLTP systems 24 hours earlier. The new solution also offers greater performance capacity: it can support up to 350 simultaneous reports from users without any degradation in system response times.
"Using change data capture dramatically reduces data processing, since we are only processing the data that was changed," Joseph explains. "Also, moving information around on the same physical system takes less time than transferring it between numerous physical systems. We also reduced the time needed to restore data marts on the reporting server by splitting the data marts and using compressed striped backup and data compression."
Improves Efficiency, Agility, and Scalability
With the new reporting solution, executives and finance managers can access the information they need—when they need it. The enhanced data transparency makes it easier to identify business trends, create innovative offerings, increase efficiency—which in turn, improves agility. Microsoft can also easily scale its solution to support more users, data, and OLTP systems without an automatic investment in additional hardware. "We are already looking at incorporating our budget and incentive compensation systems into this solution infrastructure," says Joseph. "To do so, we will add new database mirrors and data mart instances on the existing processing server. It can then push this additional data to the reporting server for users to consume."
For More Information
For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:
© 2011 Microsoft Corporation. All rights reserved.
Microsoft, SQL Server 2008, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.