Creative Integrated Data Management: Linking Data Sources Generates Insights
Historically, databases that contain human resources (HR) data are maintained and queried separately from databases that manage system configuration information. By linking these sources, Microsoft IT addressed a multitude of business challenges, including identifying deployment improvements, developing targeted user communications, and enhancing internal software publishing.
Article, 769 KB, Microsoft Word file
Over the past year, Microsoft IT has developed a business intelligence (BI) solution that extracts and transforms data from both its HR repository and System Center Configuration Manager. IT then integrated this BI data to produce reports that add more specificity to its operations:
Much is written about the "big data" phenomenon and the ability of SQL Server to keep pace with the huge data volume increase that modern data centers are experiencing. However, not all IT workloads are directly affected by data proliferation. Medium and large enterprises also use the SQL Server toolset to manage slower-growing data repositories, such as those that collect data about the company's network-attached computers and the people who use them. SQL Server supports authoring queries that can combine data from each repository in ways that are especially useful to an IT organization, but not all IT environments take advantage of this capability. While enterprise developers continue building applications that use the data these individual systems contain, the data sources themselves are often kept separate and not integrated.
In particular, databases that track human resources (HR) data provide opportunities for integrated queries with databases that support systems management software, such as System Center Configuration Manager (Config Manager). Recently, Microsoft IT explored ways to combine this data to obtain more insight into certain operational areas. This seemingly simple exploration, which began as part of a small project, ultimately delivered new benefits in multiple IT focus areas, including deployment operations, issue management, and software publishing.
Constraints from the Lack of Integrated HR Data
The exploration began when IT started gathering requirements for redesigning the online portal employees use for downloading corporate licensed software and internal applications. The updated site required an intuitive design that would provide maximum value to visitors. Specifically, IT wanted to organize the site by users' roles within the company—for example, by creating an area labeled "Engineering Applications" where engineers would look first for needed downloads.
But at the time IT began its portal redesign, it maintained job-role data in an HR database, while data about users' installed applications was maintained in Config Manager. No method existed for combining data from these separate sources; each database supported separate applications and business processes.
Figure 1. Before IT developed the integrated data management solution, data in the Config Manager and HR data repositories was used for entirely separate purposes.
Without data to drive the portal design, IT would be relegated to making assumptions about each visitor's user experience to answer important questions. In the engineering example, these questions would be: Of all applications in the portal, which ones do engineers actually use? Which downloads are most popular among engineers, and why?
As it began considering a solution, IT soon identified other areas where its operations were constrained by using Config Manager data on its own. In particular, IT lacked effective communication with users about each user's need to update his or her network-attached systems. Using data from Config Manager only, IT sent email to large groups of users describing their compliance status in general terms and encouraging them to opt in to installing the update. Overall, IT experienced a low response rate to these communications, for three key reasons:
- The message did not speak to nuances in the user's specific situation—for example, developers who own multiple computers could not be sure which one(s) were out of compliance.
- Recipients, understanding that the mail was sent to a large number of users, assumed that the information did not apply to them, and so they disregarded the call to action.
- The message lacked specific steps for the recipient to take to become compliant.
As with the portal project, these communication activities were hampered by imprecision and an overall loss of efficiency. In fact, IT soon found, communications were only part of the problem; actual software deployments, including everything from patches and updates, to new operating systems and productivity applications, were also constrained. In configuration management, deployments happen to computers, not to people, and as such, Config Manager data yields user-to-device relationships that support getting deployments to all users. However, just as with communications, this data lacks the HR-related details needed to target deployments more precisely.
By failing to target specific groups—whether for deployment, communication, or user-experience purposes—IT was losing an opportunity to provide optimal effective service. To remedy this, proceeding from the portal design exercise, IT broadened its scope of inquiry to ask the following: What might be the benefits of querying both HR and Config Manager data, and then defining business rules for any operational activities based on the results?
To learn the answers to these questions, and with the goal of improving IT and business productivity through useful data relationships among multiple isolated sources, Microsoft IT undertook a project to combine SQL Server queries of its HR and Config Manager databases and thereby better understand the computer environment at Microsoft.
The benefits of using this approach would be significant, IT found. Integrated querying of HR and Config Manager databases provides a rich matrix of characteristics about employees, including the configurations of computers assigned to them, as well as individuals' job roles, geographic locations, and status as full-time employees (FTEs) or vendors.
Although the work to build this solution is iterative and ongoing, the initial project comprised four main phases:
- Understand the HR and Config Manager data sources in greater detail, including the identification of how the data was structured in each source, and determine which data from one source could be related, or keyed, to data in the other source.
- Analyze and prioritize the reporting requirements of all data based on business needs and database characteristics, such as refresh timing, data structure, and record volume.
- Determine the best mechanisms for accessing each required data set, and write SQL Server Integration Services (SSIS) queries for all data extractions and transformations.
- Develop preformatted, customizable reports for the data based on business requirements, and perform validation on the data in all reports before they are delivered to customers.
IT performed all steps in this process, particularly reporting, with a careful accounting of security requirements. The security and privacy features of SQL Server, along with manual checks during the data validation phase, allowed IT to ensure that personally identifiable information (PII) included in any report did not exceed the need-to-know parameters of the report's business owners.
Figure 2. The new solution strategically queries and integrates data from multiple sources to produce BI that serves many benefits to IT, business groups, service managers, application owners, and end users.
The HR data this solution uses is the official feed, accessed by all corporate applications that require user information, and the data is updated on a daily basis. Over time, through ongoing scrutiny of the data relationships and execution of queries that capitalize on these relationships, the overall quality of the data has improved and will continue to do so. This is accomplished both by adding new metadata to enrich the data already stored in the repositories, and by completing the data where it is currently missing information. By encompassing both data sources, the master data to use for queries and the SQL Server 2012 report presentation capabilities, the new dual-integrated querying solution further optimizes use of this high-quality data. The ability to produce detailed reports from the solution gives business groups powerful opportunities to benefit from its combined specificity.
Applying creativity to the queries of HR and Config Manager data produced benefits in the areas of deployment and issue management, and also enabled IT to meet the original business challenge—improving usability of the software publishing portal.
Smarter End-to-End Deployment Cycles
Targeting the deployment of a software upgrade produces considerable time savings and provides more productivity to users than a generic update that is blindly pushed to the entire organization. Merging HR with Config Manager data positively affected IT deployments in multiple ways.
Efficient Targeting of Deployments
As in many companies, the deployment of configuration changes to users' computers at Microsoft are often driven by legal mandates and security concerns. The higher the percentage of users whose computers have been updated to the latest operating system, for example, the lower the overall incidence of security issues. Additionally, many companies, including Microsoft, are becoming increasingly strict about licensing issues related to software running on the non-FTE vendors' network-attached computers. In both of these areas, integrating HR data with configuration data helps to maintain full compliance with best practices.
For example, when Windows 8.1 became available, IT took steps to ensure that it was deployed to a feasible number of computers on the corporate network. However, defining "feasible" proved complicated: The optimal result, IT determined, would be to require installation of Windows 8.1 on the primary computers of all FTEs. This complex requirement exceeded the capability of performing the deployment using Config Manager data alone. To target the deployments exactly to those computers, IT queried both the Config Manager data and HR data to generate a more refined list of targets, and used this list to guide the deployment activities.
To achieve better targeting of the Windows 8.1 deployments, IT performed the following activities:
- Filtered the upgrade list to include only computers owned by FTE users.
- Identified business rules for determining which computers would be considered each user's "primary" computer, using Config Manager data about existing installed operating systems and when the user was last logged in.
- For each deployment phase, queried HR and Config Manager databases to identify users whose primary computers were already upgraded, and removing those users' remaining computers from the upgrade list for that phase.
The value of querying both databases in this scenario was that IT could leverage the one-to-many relationship of users to computers in ways that were not possible using only the Config Manager data. In the end, IT accomplished the deployment in five single-day phases, with all computers upgraded that met the requirement and no additional effort expended. Future IT deployments of new and updated software will incorporate similar processes using this dual querying approach.
Because operating system deployments are key to the overall stability and growth of company assets, deployment stakeholders benefit greatly from knowing up-to-the-minute status about how the deployment has progressed. In this case, the Windows team was able to request near-real-time reports that provided views into deployment success and included clear decision points for issue resolution.
Figure 3. Integrated BI data from the solution drives various new custom reports for both IT and the business groups it serves.
More Precise Targeted Communications
The availability of more precise, detailed reports on the existing state of users' computer configurations enabled IT to be more agile and specific in the communications sent to users regarding updates.
With the goal of personalizing update-related communications to provide the maximum information relevant to each recipient, IT now runs queries to consolidate HR and Config Manager data, and then refines the target user list for each communication. Rather than providing a general status and call to action, communications can now offer prescriptive guidance: "Take this action on this computer, for these reasons."
Communications with group managers are likewise improved, because the messages include specific and even competitive insights into the update process: "Here is your team's compliance status; here is how your peer teams are doing by comparison." On a related note, Microsoft leadership has new capabilities for driving healthy IT cooperative activities, including upgrade compliance. For example, a senior IT manager can offer incentives that reward the managers or teams who are the first to achieve 100 percent compliance with an update goal. Before the dual data source queries occurred, no method existed for measuring progress at these levels.
Targeted Feedback Loops and User Testing
The improved precision also helps IT gather feedback from selected installation bases within the enterprise. For example, IT can send a survey to all users in Western Europe who recently upgraded to the latest version of Microsoft Office and ask them about their overall satisfaction with the new system. Based on the query structure for producing the communication list, the responses can yield geographically specific information about the success of the update, to determine which users are happiest with the upgrade, what teams and business areas they work in, and what questions or input they have for IT or the Office product team.
Likewise, new-software user testing benefits from richer querying. IT frequently releases early versions of software to internal personnel to test on behalf of the company and obtain useful feedback for subsequent development cycles. By analyzing and vetting the list of participants in a test cycle using combinations of HR and Config Manager data, IT can assist the product groups in targeting specific cross-sections of the company to yield the most holistic input. This is particularly useful when IT wants to test the software on a complete variety of hardware and software configurations.
Improved Issue Management: Diagnostic Support and Targeted Fixes
When diagnosing certain user issues, IT now has more data it can use to detect patterns among the issue reports and take action accordingly. For example, if dozens of support calls come in for connection-related issues, IT can query to find information that the calls might have in common, such as whether calls are coming from a single building or geographic area. If the data shows that the problem is isolated to certain locations or conditions, IT can act more quickly to remedy the situation.
In cases where fixes must be applied to user's computers, IT has increased agility in how it targets the fixes by using the combined HR and Config Manager data. For example, if a computer manufacturer alerts IT to a known issue with one of the drivers on its 64-bit machines sold in Germany, IT can upgrade the driver on the affected computers by querying and synthesizing the configuration (manufacturer, 64-bit) and geographic (Germany) data from their respective sources. By isolating this data set and pushing the new driver version only to those computers, IT acts more efficiently and reduces the risk of performing unnecessary or potentially damaging work.
Data-Driven Software Publishing Model
The original goal of performing dual queries and matching data between the HR and Config Manager data sources was to improve the usability of the internal software installation portal. Just as the principles of improved targeting were applied creatively in areas of deployment management and issue tracking, the same idea also served a larger purpose in the portal—guiding best practices not only for organizing the downloads, but also for maintaining the portal and providing valuable business intelligence to application owners.
Improved Portal Design and Usability
To first resolve the issue of designing the portal for usability, IT arranged the applications into categories based on two basic queries: job-role data for all users, accessed from the HR database; and applications installed by each user, also sorted by job role, accessed from the Config Manager database. This process yielded business rules for aligning each application to the category where it was most likely to be discovered by users. In this way, software publishing in the portal was driven by the inventory and user-behavior statistics rather than by guesswork. Usability was further enhanced by the ability to communicate trending statistics to drive further adoption, such as: "These are the top 10 engineering applications downloaded in the past week" or "More than 75 percent of users in the Marketing organization are using these applications."
IT also defined business rules for maintaining the portal based on the relative popularity of each application among users, by geography or job role. Going forward, IT will conduct quarterly reviews of all applications in the portal, analyze the download statistics, and take various actions based on the findings: removing the application altogether, moving it to an archive location, or contacting the application owner for additional guidance.
More, Better Data to Drive Business Decisions
Application owners gain valuable business insight from the new solution by receiving reports on activities related to the adoption of their downloadable offerings. With Config Manager, a limited version of this reporting was possible, but the new querying style adds the ability to filter results by job role, reporting structure, geographic location, and other HR-related information.
In a recent example, IT released a new version of a popular internal training application, updated with all new courses, in a specific geographic region. Managers were encouraged to drive adoption of the courseware by their employees. Different business groups were required to enroll in different courses: some for Sales, some for IT, and some for Finance, as appropriate. At any point, IT could run a query to determine which business groups in each area had downloaded the training, and take action to reward or remind each group accordingly. The application was also new to the portal, and had previously been available only via a dedicated web-based installation application. Reports about the download activities advised IT about the number of downloads at each location, and helped to support decision making about the right time to shut down the old install point and drive all download traffic to the portal.
Dual integrated querying of HR and Config Manager data continues to provide new value to IT and the Microsoft business groups it supports. What began as a portal design support exercise grew to encompass many additional business benefits, and the creative application of dual querying will continue to influence new activities and enhance productivity. Meanwhile, IT is following its own best practices to document the progress it has made thus far in charting the data structures of its HR and Config Manager repositories. IT also continues to ensure data quality in reporting and eventually will add more self-service capabilities to business users using SQL Server tools such as Power Query and Power View.
Benefits for Microsoft IT
- Smarter end-to-end deployment cycles by using dual integrated querying
- Increased productivity by driving deployments that match business requirements
- Increased efficiency through targeting deployments, communications, and user feedback
- Better overall IT understanding of the corporate user and computer ecosystem at Microsoft
Benefits for Service Managers and Application Owners
- Data-driven software publishing model including detailed insights for application owners
- Richer, user-specific business intelligence about downloads and deployments
- Near-real-time views for service managers to learn about the progress of their deployments
Benefits for Employees
- Reduced risk associated with software configurations that might contain security vulnerabilities
- Increased adoption of deployed software due to more detailed targeting
- More timely diagnostic support and targeted fixes for improved issue management
Related Case Studies
How Microsoft IT India’s Test Organization Enabled Efficient Business Intelligence (Published December 26, 2013)
Using Visualization to Bring out the Power of BI within Microsoft (Published December 20, 2013)
IT Showcase On: Big Data Analytics and Platform Services (Published July 08, 2013)
IT Showcase On: Business Intelligence (BI) (Updated June 20, 2013)
Microsoft IT Builds a Field Dashboard to Improve Business and Marketing Operations (Published May 07, 2013)
For More Information
For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:
© 2014 Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.