Export (0) Print
Expand All

Reducing the Costs of Enterprise Data Warehouse Creation and Data Acquisition

Published: November 2010

The following content may no longer reflect Microsoft’s current position or infrastructure. This content should be viewed as reference documentation only, to inform IT business decisions within your own company or organization.

For Microsoft developers in the Business Intelligence Engineering group, creating an Enterprise Data Warehouse (EDW) was challenging, due to inefficient processes. To change that, the BI Engineering group developed METaL 3.0, which provides fast design, deployment, and management of data warehouses. As a result, customers have increased ETL performance by 120 percent, lowered development time, simplified project management, and improved data acquisition.

Download

Download Business Case Study, 203 KB, Microsoft Word file

Customer Profile

Situation

Solution

Benefits

The Microsoft Business Intelligence Engineering (BIE) group develops software tools that help internal Microsoft customers more efficiently deliver business intelligence solutions.

The BIE group's developers needed a new tool to help manage the build flow, control the dependencies needed for integrating and transforming data, and manage the dependencies among all the steps in EDW creation.

The BIE group created METaL, a high-performance metadata driven tool supporting rapid configuration, deployment, and operational management of enterprise data warehouses

  • Boosts Performance by 120 Percent
  • Reduces Development Time
  • Eases Management
  • Improves Data Acquisition

Situation

Building an EDW is an incredibly important task. Because organizations use the EDW to gather data required to make critical business decisions, a solid data warehouse is essential to business success.

Unfortunately, creating the EDW can be one of the most complex and error-prone tasks that developers and engineers face on a daily basis. Specifically, the (ETL) process can involve hundreds of steps and dependencies that need to be controlled. These steps need to be well-organized and executed smoothly. If problems arise, manual intervention is required and the time-to-market could be delayed.

These were some of the challenges facing Microsoft developers and software engineers in the BI Engineering group. "There was a great need internally for a new tool to help manage the build flow, control the dependencies needed for integrating and transforming data, and manage the dependencies among all the steps in EDW creation," says Amar Dabbara, a Microsoft BI Engineering (BIE) engineer who is also the Lead Developer of the METaL 3.0 enhancement.

One of the biggest problems was dependency control, which was attributed to ETL processes being scheduled by time-based SQL Server® 2005 jobs. Data acquisition and data integration processes, for instance, were running on schedules. But if one of those processes encountered an error, performance issues often ensued and developers were forced to manually intervene. The lag time between these different scheduled jobs also caused performance issues. "If the first scheduled job finished early but the next job didn't start until its scheduled time, there could be 30 minutes or more of wasted time," says Dabbara.

Developers had to tediously set up data acquisition in ETL processes, and manually write scripts to acquire data for each table. Additionally, there was no function for moving data at the table level, which meant developers were often forced to copy an entire database for only a handful of tables. Also, there was a lack of data lineage functions for impact assessment, which often led to manual and time-consuming deployment of those tasks.

This resulted in low efficiency and high risk during the EDW creation process for developers and database administrators.

To solve this problem, the Microsoft BIE group decided to create a new tool that would give developers an easy way to manage all tasks in the EDW creation process. "We wanted to create a tool for the rest of BIE that enabled them to efficiently deliver business intelligence solutions for business partners," says Dabbara.

Solution

In late 2007, the BIE group delivered METaL, a high-performance metadata driven tool supporting rapid configuration, deployment, and operational management of enterprise data warehouses. METaL, which stands for Metadata-driven Extract, Transform and Load, provides a method for developers and database managers to easily manage all EDW activities. Designed to integrate with Windows Server® 2003, Windows Server 2008, and Windows® 7, as well as SQL Server 2008 and the .NET Framework 3.5, the tool was created primarily to help the BIE group build the EDW.

In December 2009, BIE introduced METaL 3.0, the newly-enhanced version of the tool that includes many new features designed to increase developer agility and further ease the ETL process. The new features include:

  • Group and Flow Layers. These layers sit on top of existing orchestration to enhance dependency control.

  • Table Copy. This feature activates the non-blocking copy of table data from source to destination, and provides row-based versioning.

  • Auto Acquire. This capability speeds up the process of defining an acquisition in the ETL process by using SSIS data flow to increase data movement efficiency.

  • Data Lineage. This feature maps the lineage foundation to show the end-to-end flow of metadata, and offers impact analysis that provides a list of potential metadata that would be impacted when a metadata object is changed.

  • Metadata Promotion. This feature allows users to promote schema metadata from one environment to another.

  • Source to Target Mapping (STM) UI. This new user interface helps developers ease the importing of schema metadata, organize metadata into applications and folders, and create and manage mappings. A mapping source can be the metadata of a SQL table, flat file, or Excel file.

  • ETL Templates Framework. This feature provides metadata definition for a task, and offers an extensible framework so developers can add custom ETL functions.

Using these new features, developers and database administrators can easily set up and manage ETL processes for data warehouse construction. "METaL 3.0 integrates closely with central metadata structures, and improves developer agility and productivity," says Elizabeth Pickering, Program Manager, Microsoft BIE. "It gives developers a very simple interface with which they can more easily acquire data."

Currently, METaL 3.0 is being used for 20 different solutions within the Microsoft BIE group and other groups at Microsoft.

Benefits

Boosts Performance by 120 Percent

By using the new features in METaL 3.0, EDW and BIE developers are more agile and efficient overall, and ETL activities can be managed and maintained easier. For example, dependency control can be better managed through Group and Flow Layers. "METaL 3.0 introduces two more layers of ETL structure, which enhances the dependency control of all ETL processes," says Pickering. "Now, different ETL processes can be scheduled based on their dependency relationship and there is a reduced need for manual intervention."

Developers taking advantage of these new features have already reported a 120 percent increase in ETL process performance, according to Dabbara. "Developers are much more efficient as a result," he says.

Reduces Development Time

Because METaL 3.0 enhances dependency control within ETL processes, scheduled project lag time is reduced and the overall time needed to create an EDW is lowered. Additionally, the solution's Table Copy function gives developers the ability to copy selected tables instead of copying an entire database, which further reduces time spent building the EDW. Data acquisition is also faster because the STM UI and Auto Acquisition features contribute to a faster overall ETL process.

The solution also helps development teams reduce QA times, specifically by using the new ETL Templates Framework. "Once you have a template to work from, you can define the project's scope much faster and easier, which helps cut down QA time," says Pickering.

Eases Management

For IT operations teams, the solution's standardization and enhanced dependency control helps reduce the total number of disconnected data-gathering environments to manage. "Instead of having some ETL processes controlled by SQL Server, some controlled by METaL, and some controlled manually, it's all integrated within METaL 3.0 now," says Pickering. "This makes managing the whole process much faster and easier for operations managers and teams."

Also, the standardized environment—enabled through features like Auto Acquire, the STM UI, and the ETL Templates Framework—reduces the amount of training needed for operational support staff.

Improves Data Acquisition

METaL 3.0 introduces the standardization of critical ETL processes through features like Auto Acquire, Group and Flow Layers, Data Lineage, and Metadata Promotion. As a result, the general consistency of data is significantly improved. "Developers can use the Data Lineage feature, for example, to more easily and consistently see the flow of data through the ETL process," says Pickering. "This is leading to more accurate data, and improved overall data quality in the enterprise so business managers can make better business decisions. That's the ultimate goal of any data warehouse—to benefit the business and business managers."

Products & Technologies

  • Microsoft Windows Server 2003

  • Microsoft Windows Server 2008

  • Windows 7

  • SQL Server 2008

  • .NET Framework 3.5

For More Information

For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:

http://www.microsoft.com

http://www.microsoft.com/technet/itshowcase

© 2010 Microsoft Corporation. All rights reserved.

Microsoft, .NET, SQL Server, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.

Was this page helpful?
(1500 characters remaining)
Thank you for your feedback
Show:
© 2014 Microsoft