To prevent the inadvertent disclosure of High Business Impact (HBI) information,
Microsoft IT designed and implemented a system using Microsoft technologies in conjunction
with a third-party solution that automatically identifies and classifies HBI information
at risk, and then starts the remediation process.
The security of sensitive information is one of the greatest concerns facing many
companies today. The loss or theft of HBI information is of particular concern as
it could expose the company to an information breach that could potentially cause
a loss in revenue, productivity, reputation, brand value, or even a company's competitive
advantage if the information includes key intellectual property (IP).
This paper describes the approach, design, implementation, and benefits of such
a technical solution at Microsoft. The paper also provides suggested best practices
so that Microsoft customers can benefit from the lessons that the project team learned.
This paper is intended for IT professionals who design and manage compliance systems,
in addition to risk managers and compliance auditors.
Situation
At Microsoft, about 100 terabytes of data is dispersed over 110,000 managed Microsoft®
SharePoint® sites and over 30,000 file shares across the company where HBI information
could reside. Microsoft IT needed to create a solution that relied on technology
to identify information that could be at risk and then to help prevent the unauthorized
disclosure—whether inadvertent or malicious—of this information.
Microsoft has long had content policies in place in accordance with a number of
regulatory and corporate mandates. The missing component was an automated identification
and monitoring mechanism through which line-of-business owners and end users could
confirm their compliance with policies and guidelines at a detailed level for handling
that content. The sheer volume of information at Microsoft made manual inspection
of all SharePoint sites and file shares, and manual notification of policy compliance
issues to information owners or custodians of HBI information, an impossible task.
The challenge was how to deliver this missing capability across the global organization
without installing a huge new IT infrastructure or incurring enormous costs. Prior
to the development of the information classification solution, there were concerns
about the potential for unintended accessibility of HBI information to a wide range
of Microsoft personnel. These personnel included those who simply used basic search
tools to gather information in the daily course of their work. At the same time,
it was important to raise awareness among the many end users about the risks of non-secure
HBI information and their role in helping to ensure the security of such sensitive
information.
Considering the volume of systems, content, employees, and business processes potentially
affected by the implementation of a content loss prevention (CLP) solution, many
chief information officers, chief information security officers, and chief security
officers struggle with identifying how and where to start. Implementing a technology
solution to prevent the loss or misuse of sensitive content is just one part of
the picture. In fact, an organization must address an entire set of business processes
and operations to prepare for such an implementation and to manage the resulting
incidents and intelligence that arise from the use of this technology. The most
effective content loss prevention efforts are those that an organization meticulously
plans and executes based on a deep understanding of its most important content governance,
risk, and compliance challenges.
For large enterprises, the technical aspects of a CLP solution can play a critical
role in enabling automation of discovery and remediation activities. CLP solutions
that use Microsoft technologies can provide a solid foundation that enables an enterprise
to scan and classify enormous volumes of information in a timely and regularly scheduled
manner. The enterprise can then focus valuable human resources on remediation efforts.
Automation of these otherwise time-intensive activities also enables the creation
of repeatable, service-oriented operations processes with the lowest possible total
cost of ownership (TCO) for the solution.
Solution
In 2006, Microsoft IT initiated a CLP project with the objective of addressing content
security and compliance objectives at Microsoft regarding HBI information, while
minimizing impact to business operations. By using Microsoft technologies such as
Microsoft Office SharePoint Server 2007 in conjunction with a third-party application,
Microsoft IT ultimately designed and implemented the HBI Information Classification
solution. This solution automates the identification and classification of HBI information
at risk, as well as a portion of the subsequent remediation process. It enables
users to effectively classify and help protect HBI information contained in SharePoint
sites and file shares according to Microsoft data-handling standards. The third-party
part of the solution, Tablus Content Sentinel, was itself built on the Microsoft
.NET Framework, Windows® Compute Cluster Server 2003, and Microsoft SQL
Server® 2005.
Prior to embarking on designing and implementing the technical solution, the project
team spent a considerable amount of time and effort defining the scope and approach
of the project. The team's primary goals were to:
- Develop an initial solution for HBI data, which could then be expanded to various
classifications of data in various locations.
- Establish an effective, repeatable service while minimizing impact to daily business
activities.
- Make the HBI project widely recognizable across the business.
Solution Approach
As part of a global corporation with approximately 71,000 employees working in more
than 500 Microsoft offices around the world, Microsoft IT realized that it needed
to approach HBI information security in incremental steps and address multiple,
sometimes competing requirements. Microsoft IT's overall information security policies
had to take into consideration regulatory compliance requirements as well as the
protection of intellectual property, which considerably broadened the scope of the
HBI project.
Many organizations focus first on network monitoring–based solutions to prevent
unwanted transmission of HBI information. But the project team realized that at
the scale Microsoft had to address, catching content in motion would be overly costly
and ineffective without first addressing the root of the problem: gaining visibility
and control over HBI content at rest.
The team established five objectives for the initial project:
- Identify the location of HBI content across the network.
- Reduce the volume of HBI content that could move across the network or be used on
workstations.
- Implement ownership and access controls for HBI content.
- Understand and address the business processes that contribute to the sprawl of HBI
content.
- Establish the content loss prevention capability as a valued service within the
larger IT organization in accordance with the Microsoft Operations Framework (MOF).
"One of the primary project objectives was to establish the content loss prevention
capability as a valued service within the larger IT organization in accordance with
the Microsoft Operations Framework."
Olav Opedal, Security Program Manager
On a strategic level, the MOF posits that IT groups, including IT security groups,
must clearly focus on supporting the business objectives of the organization and
emphasizing the business value that IT provides. The idea is that IT can help reduce
risks and enable new ways of doing business. In addition, IT systems and services
are more effectively managed when regarded as an asset to the development and execution
of key business strategies. This approach requires IT groups to demonstrate how
their services make specific, tangible, and critical contributions to achieving
business outcomes.
In the context of content security at Microsoft, the project team created a project
plan. The goal of this plan was to demonstrate that the proposed content loss prevention
strategies and technologies were the most effective means of achieving compliance
and maintaining policy mandates now and well into the future. The plan focused on
rapidly advancing the maturity of the content loss prevention service, within the
first three months following implementation, from a basic level, where the new IT
infrastructure is generally considered a cost center, to a fully mature level, where
the business value of the IT infrastructure is clearly understood and viewed as a
strategic business asset and enabler.
From a scope perspective, the project team decided to start by inventorying HBI
information within the enormous volume of content stored across the network file
shares and SharePoint sites at Microsoft. The team decided to develop an automated
scanning tool and apply it to those data repositories to identify HBI information.
Remediation of issues with HBI information would then follow, including limiting
broad access, asset classification, asset lockdown, asset removal, and data rights
management encryption. This approach would also become the framework that would
eventually include all data assets in motion and, potentially, additional classifications
of data, such as Medium Business Impact (MBI) or Low Business Impact (LBI) information.
In defining the scope and approach for the project, the team adopted the following
methodology:
- Develop proof of concept
- Conduct risk analysis
- Design and build
- Pilot and deploy
- Provide service management
The approach that the team followed for the initial project supported control requirements
to mitigate critical information security risk for data at rest. It required the
design of several compliance modules and deployment of an existing incident tracking
and remediation tool integrated with the MSE ticketing system used throughout Microsoft.
Each compliance module would focus on specific types of search and remediation activities,
such as scanning and locking down SharePoint sites or file shares, or identifying
specific types of intellectual property, such as source code in various locations.
The deployment of these combined elements would provide precise, automated detection
of HBI data at rest in documents located on SharePoint sites and file shares, or
elsewhere, and methods to quickly remediate potential issues. In general, after
an organization identifies issues with HBI information, it has a duty to address
those issues and safeguard that information. Therefore, the remediation component
of the solution was crucial to implementing a complete solution.
Because of the large volume of content at Microsoft, and the heavy reliance on SharePoint
sites that facilitate the sharing of information and that have varying levels of
data owners and users, the project team had to transform high-level corporate policies
into detailed guidelines for how to apply content security to IT services. This
transformation required close collaboration between IT service owners, the corporate
legal department, and various other stakeholders to determine appropriate remediation
steps.
The parameters that Microsoft IT developed for its initial discovery required the
ability to define stringent criteria for automated content evaluation. With enormous
data loads and thousands of locations to scan, enterprise scalability, performance,
and accuracy were all top considerations. Precision of content detection, in particular,
was a concern. Microsoft IT wanted a system that would reliably catch most at-risk content
while maintaining a very low rate of false positives. Previous research that the
project team conducted indicated that systems that generate high false positives
require much higher levels of human intervention, resulting in a much higher TCO.
Finally, the project team developed an education campaign to improve end users' awareness
and understanding of their role in helping to ensure compliance and the security
of the company's sensitive digital information assets. Indeed, the implementation
of the HBI project is a step in Microsoft IT's evolving governance model. With a
strong focus on enforcing Microsoft data-handling policies and standards, the compliance
framework that the project established is a major component for the articulation
of IT governance.
Solution Technical Design
Accuracy, performance, and scalability are the three most important attributes in
an enterprise content scanning solution and the HBI project in particular. The project
team evaluated the third-party Tablus Content Sentinel application in a proof-of-concept
phase to determine firsthand how well the technology could meet its needs. In November
2006, after a successful proof of concept followed by extensive risk analysis in
conjunction with the business owners, the project team selected Tablus Content Sentinel
as the core content scanning tool for the solution.
This application enabled Microsoft IT to help identify and classify HBI data within
the Microsoft environment. At a very high level, the core intellectual property
of the application is a content analysis engine used to recognize confidential information.
The content analysis engine evaluates information assets by using a variety of techniques
to identify protected data. These techniques include searching for specific keywords,
phrases, or entities; identifying patterns in data; and analyzing the context in
which a suspicious string is detected.
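The three techniques named above (keyword search, pattern identification, and context analysis) can be illustrated with a minimal sketch. This is not the Tablus Content Sentinel engine, whose internals the paper does not describe. The keyword list, the identifier pattern, the context terms, and the window size are all hypothetical values chosen only for illustration.

```python
import re

# All rule values below are hypothetical; none of them come from the paper.
KEYWORDS = {"confidential", "do not distribute"}
# Illustrative pattern: a string shaped like ddd-dd-dddd.
PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Terms that, when found near a pattern match, make the match more credible.
CONTEXT_TERMS = {"ssn", "social security"}
CONTEXT_WINDOW = 40  # characters inspected on each side of a match


def analyze(text: str) -> dict:
    """Report which of the three techniques fired for a piece of text."""
    lowered = text.lower()

    # Technique 1: specific keywords or phrases.
    keyword_hits = sorted(k for k in KEYWORDS if k in lowered)

    pattern_hits = []
    context_confirmed = []
    for match in PATTERN.finditer(text):
        # Technique 2: a pattern in the data.
        pattern_hits.append(match.group())
        # Technique 3: the context in which the suspicious string appears.
        start = max(0, match.start() - CONTEXT_WINDOW)
        end = min(len(text), match.end() + CONTEXT_WINDOW)
        if any(term in lowered[start:end] for term in CONTEXT_TERMS):
            context_confirmed.append(match.group())

    return {
        "keywords": keyword_hits,
        "patterns": pattern_hits,
        "context_confirmed": context_confirmed,
    }
```

In this sketch, a string such as "Part number 123-45-6789" matches the pattern but is not confirmed by context, which is one simple way a scanner might keep false positives low.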
With enormous volumes of data to scan and remediate, the infrastructure supporting
the automated scanning tool must be high performance and highly scalable. The Tablus
Content Sentinel engine is built on the Microsoft .NET Framework. It is capable
of running on the Windows Compute Cluster Server 2003 operating system to create
a grid computing architecture that allows for capacity expansion by adding servers
to the grid of compute clusters. Figure 1 shows how compute clusters are located
in various regions where significant amounts of data reside.
Figure 1: Windows Compute Cluster Server 2003 grid architecture for Tablus Content
Sentinel
Microsoft IT dedicated 10 load-balanced grid computers to scan all of the SharePoint
sites and file shares connected to its storage area network (SAN). A lightweight
agent was automatically deployed to scan contractor workstations in Asia and the
stand-alone file shares not connected to the SAN. Tablus Content Sentinel scanning
activities are coordinated through the enterprise controller. The site connectors
for each location manage both grid computers and lightweight agents.
The grid computers are permanent components of the infrastructure and are used to
scan large, centrally located data stores. The lightweight agents deploy temporarily
to workstations or servers where content resides in remote locations, and then remove
themselves after each scanning activity. The results of the scans are stored in
a SQL Server 2005 database at each location, and then combined into the SQL
Server 2005 Enterprise results database. Approximately 1 percent of the data
at rest changes daily. Incremental scans of the systems analyze only new, moved,
edited, or renamed files.
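One way to picture an incremental scan is as a comparison of two snapshots of a data store. The sketch below assumes each snapshot maps a file path to a content hash. That representation, and the function name, are assumptions made for illustration, and the paper does not say how the product detects changes.

```python
from typing import Dict, List


def files_to_rescan(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify paths in the current snapshot that an incremental scan must visit.

    Each snapshot maps path -> content hash (a hypothetical representation).
    """
    # Hashes of files whose old path no longer exists: candidates for a move or rename.
    vanished_hashes = {h for p, h in previous.items() if p not in current}

    result: Dict[str, List[str]] = {"new": [], "edited": [], "moved_or_renamed": []}
    for path, digest in sorted(current.items()):
        if path in previous:
            if previous[path] != digest:
                result["edited"].append(path)
            # Same path and same hash: unchanged, so it is skipped.
        elif digest in vanished_hashes:
            result["moved_or_renamed"].append(path)
        else:
            result["new"].append(path)
    return result
```

With roughly 1 percent of data changing daily, a comparison of this kind lets the scanner skip the large unchanged majority of files.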
Compliance Modules
Although Tablus Content Sentinel provides a content analysis engine for the solution,
the project team needed to create or customize additional components to automate
as many processes as possible. Based on the business requirements, the project team
identified the need for the following technical components, called compliance modules,
for automated tool development:
- SharePoint Lockdown module
- File Share Lockdown module
- WinSE IP Identification module
The project team created modules based on custom Web services and Office SharePoint
Server 2007 workflow capabilities. These modules enable content classification,
automated lockdown, and remediation notification for the two main content sources,
SharePoint sites and file shares. This part of the solution bridges the gap between
the Tablus Content Sentinel content scanning engine and end users by providing relevant
workflow for security operators. The Web service and workflow engine solution integrates
file share and SharePoint classification and lockdown modules with the content scanning
solution to lock down and reclassify content appropriately. Figure 2 shows the high-level
automated workflow actions that both the SharePoint and file share compliance modules
accomplish.
Figure 2: Automated workflow for SharePoint and File Share compliance modules
Apart from the specific automated lockdown remediation actions that each of the
SharePoint and File Share compliance modules performs, the additional automated
workflow activities are generally the same for SharePoint sites and file shares.
The content for a particular site or share is scanned based on well-defined criteria,
and HBI information is identified. Results are compared to a known list of exceptions
and previously identified false positives. Content owners are then automatically
notified by e-mail to take appropriate actions, including notification to Microsoft
IT in the event of a false positive, visual classification of the site or share,
or deletion of the information.
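The workflow just described (scan, compare against exceptions and known false positives, then notify owners) can be sketched as a small triage step. The `Finding` type, its fields, and the `triage` function are hypothetical names for illustration. The actual modules were built on custom Web services and Office SharePoint Server 2007 workflows, which this sketch does not reproduce.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Finding:
    location: str  # SharePoint site URL or file share path
    document: str  # document flagged by the scan
    owner: str     # e-mail address of the content owner


def triage(
    findings: Iterable[Finding],
    exceptions: Set[str],
    false_positives: Set[Tuple[str, str]],
) -> Dict[str, List[Finding]]:
    """Drop findings covered by an exception or a known false positive,
    then group what remains by owner so each owner gets one notification."""
    notifications: Dict[str, List[Finding]] = {}
    for finding in findings:
        if finding.location in exceptions:
            continue  # the whole site or share has an approved exception
        if (finding.location, finding.document) in false_positives:
            continue  # previously reviewed and reported as a false positive
        notifications.setdefault(finding.owner, []).append(finding)
    return notifications
```

Each key of the returned dictionary corresponds to one owner to notify by e-mail. Sending the message and recording the owner's response are left out of the sketch.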
SharePoint Lockdown Module
The SharePoint Lockdown module provides the capability to help lock down IT-managed
SharePoint sites by using a three-pronged strategy:
- Content monitoring to identify sensitive content
- Classifying data by classifying SharePoint sites
- Enforcing higher levels of access controls on HBI data
File Share Lockdown Module
The File Share Lockdown module provides the capability to help lock down IT-managed
file shares and achieves the following:
- Classifying each file share and tracking the owners of all managed file shares
- Enabling the administrator to specify a list of disallowed user groups (from policy,
Microsoft Windows NT®-authenticated users, etc.)
- Removing those groups from the access control lists (ACLs) in file shares and directories
every 24 hours
- Removing groups larger than a specified size from shares that are classified at
a higher level
- Notifying share owners of the removal with information about compliance policy
- Providing workflow for security operators and end users to perform remediation to
comply with standards
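The two ACL rules in the list above can be sketched as a pure function that decides which groups to keep and which to remove for a single share. The function name, the group names, and the choice of "HBI" as the "higher level" classification are assumptions for illustration. The sketch does not call any Windows ACL API.

```python
from typing import Dict, List, Set, Tuple


def lock_down_acl(
    acl: List[str],
    disallowed: Set[str],
    group_sizes: Dict[str, int],
    classification: str,
    max_group_size: int,
) -> Tuple[List[str], List[str]]:
    """Return (kept, removed) groups for one share's access control list."""
    kept: List[str] = []
    removed: List[str] = []
    for group in acl:
        # Rule 1: groups on the administrator's disallowed list are always removed.
        is_disallowed = group in disallowed
        # Rule 2: on higher-classified shares (assumed here to mean "HBI"),
        # groups above the size threshold are removed as well.
        too_broad = (
            classification == "HBI"
            and group_sizes.get(group, 0) > max_group_size
        )
        if is_disallowed or too_broad:
            removed.append(group)
        else:
            kept.append(group)
    return kept, removed
```

The `removed` list is what a notification to the share owner would report. Running the function on a 24-hour schedule would mirror the cadence described for the module.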
WinSE IP Identification Module
The project team also developed a compliance module to detect and remediate nonsecure
source code on vendor-assigned desktop computers. The WinSE IP Identification module
includes the following capabilities:
- Rules to identify Windows source code
- A workflow to approve identification of source code
- Tools to lock down data
Figure 3 shows the high-level automated workflow actions that the WinSE IP Identification
module accomplishes.
Figure 3: Automated workflow for WinSE IP Identification compliance module
The automated workflow activities for the WinSE IP Identification compliance module
begin with content scans of specific workstations based on well-defined criteria.
Potential Windows source code is identified, and the results are compared to a known
list of exceptions and previously identified false positives. The appropriate security
operators are then automatically notified by e-mail to take appropriate actions,
including notifying Microsoft IT in the event of a false positive, locking down
access to data and potentially investigating the source of that data, or deleting
data and potentially investigating the source of that data.
Solution Implementation
The initial content scan to locate and remediate HBI content focused on 12 terabytes
of content across the file shares and SharePoint sites located in a single data
center, the Redmond data center. That initial scan took only nine days to complete.
After three months, the total volume scanned was up to 75 percent of the HBI content
across the file shares and SharePoint sites worldwide. The project team completed
100 percent of scanning for the HBI portion of the project in September 2007, when
the total scanned content exceeded 100 terabytes.
The project team progressed from initial deployment to an established IT service
in just 90 days. Incremental scans now occur on a scheduled basis, and end users
routinely use remediation tools that the solution provides when they are notified
of issues with HBI information.
As a critical part of the implementation, Microsoft IT pursued a range of awareness
and outreach efforts to internal customers. Because long-term success would depend
on building a culture of compliance across the company, Microsoft IT planned to
create a grass-roots awareness of, and ultimately demand for, content discovery
and other services built around remediation. The internal promotional tactics included
poster campaigns, e-mail, and newsletter notices that educated users on HBI, MBI,
and LBI data.
In all cases, these marketing messages educated end users on compliance priorities
and emerging capabilities. For instance, Microsoft IT sent e-mail that alerted users
to the availability of content scanning and remediation capabilities for individual
business users as Tablus Content Sentinel scanning capabilities came online. Ultimately,
all these efforts fostered awareness among end users that they are frontline data
custodians and play a lead role in maintaining policy compliance.
The solution further empowers a culture of compliance within Microsoft by involving
line-of-business and content owners and others within the company in remediation
of security issues. For instance, when a Tablus Content Sentinel scan of a particular
network share reveals a highly sensitive document that has been misclassified as
Low Business Impact, the system automatically notifies the owner of that document
that a problem needs to be addressed. The system can then monitor the problem and
recognize whether it is adequately resolved within a certain period. The project
team measures risk reduction and success rate by using key performance indicators
(KPIs)—for instance, the time needed to remediate, how many notifications
are sent until an incident is closed, and how many incidents are uncovered in each
content category.
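The three example KPIs can be computed from a simple incident log. The record layout below (`category`, `opened`, `closed`, `notifications`) is a hypothetical structure for illustration, not the schema of the incident tracking tool that Microsoft IT used.

```python
from collections import Counter
from datetime import date
from statistics import mean
from typing import Dict, List


def compute_kpis(incidents: List[dict]) -> Dict[str, object]:
    """Compute the three example KPIs from a list of incident records.

    Each record is a dict with keys: category (str), opened (date),
    closed (date or None), notifications (int). Hypothetical layout.
    """
    closed = [i for i in incidents if i["closed"] is not None]
    return {
        # Average time needed to remediate, in days, over closed incidents.
        "avg_days_to_remediate": (
            mean((i["closed"] - i["opened"]).days for i in closed) if closed else None
        ),
        # Average number of notifications sent until an incident was closed.
        "avg_notifications_until_closed": (
            mean(i["notifications"] for i in closed) if closed else None
        ),
        # Number of incidents uncovered in each content category.
        "incidents_per_category": dict(Counter(i["category"] for i in incidents)),
    }
```

Open incidents count toward the per-category totals but are excluded from the two averages, because their remediation time is not yet known.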
In the near future, Microsoft IT plans to implement a non-compliance amnesty program.
Users will be able to use Tablus Content Sentinel to scan their laptops, desktop
computers, or other kinds of systems on their own, and then remediate any issues
that might arise. By using subtle societal pressure, the company can progress toward
its goal of cultural change. Rather than trying to implement technology unilaterally,
the self-scan empowers users across the company to support security objectives.
It also encourages people who would otherwise be hard to reach through direct on-network
scanning to appropriately manage the sensitive content on their systems in compliance
with corporate policy.
Best Practices
Prioritize Content According to Governance, Risk, and Compliance
The first step in a content loss prevention effort is to assess enterprise content:
what it is, how much of it the organization has, how it is used, and where it is
located. Table 1 provides some basic guidelines for evaluating content.
Table 1. Content Evaluation Guidelines
| Inventories | Purpose |
| --- | --- |
| Types of content that are or should be classified as sensitive | Begin to understand what content requires protection |
| Locations where content resides | Outline and quantify the systems that will need to be monitored |
| Business functions that require access to this content | Understand how the content is currently used to keep business flowing |
| Individuals, by business function, who require access to this content | Learn which individuals can potentially access and expose sensitive content |
To understand what content must be protected and how it should be protected, an
organization first needs to clearly understand any industry or government regulations
with which it must comply. The organization should start by listing the regulations
that pertain to the business and then any business governance requirements that
exist for the protection of content that is most sensitive. In other words, each
type of content requires an evaluation of the impacts of a potential breach. The
goal is to prioritize risks and address the most serious threats first. An organization
best accomplishes prioritization through a thorough understanding of risk in the
context of business impact and content type.
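As a rough illustration of prioritizing by business impact and content type, the sketch below ranks content types with a simple score. The scoring formula (impact multiplied by exposure, doubled for regulated content), the 1-to-5 scales, and all names are assumptions made for this example. The paper does not prescribe a scoring method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContentType:
    name: str
    regulated: bool  # subject to an industry or government regulation
    impact: int      # estimated business impact of a breach, 1 (low) to 5 (high)
    exposure: int    # how widely the content is accessible, 1 (narrow) to 5 (broad)


def risk_score(content: ContentType) -> int:
    """Hypothetical score: impact times exposure, doubled when regulated."""
    score = content.impact * content.exposure
    return score * 2 if content.regulated else score


def prioritize(content_types: List[ContentType]) -> List[ContentType]:
    """Order content types so the most serious risks come first."""
    return sorted(content_types, key=risk_score, reverse=True)
```

Any monotonic scoring scheme would serve the same purpose. The point is that each content type gets an explicit, comparable estimate before remediation work is scheduled.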
Build a Project Plan to Establish the Solution as an Operational Service
After an organization has defined an initial set of content protection goals,
the next step is to create an overall project plan with clearly delineated benchmarks
and steps to reach these goals. The plan should drive the team beyond a proof of
concept or initial implementation with a special IT project team toward a complete
operational solution that is fully integrated with the day-to-day business operations
at all appropriate points throughout the organization. This effort will require
mapping content protection policies into guidelines that will determine how to handle
the myriad content protection situations that may arise.
Start at the Root of the Problem
After an organization develops an understanding of its sensitive content according
to business and policy priorities, and develops a basic understanding of how that
content is stored and where it travels on the network, the next logical step is
to create an inventory of this content stored across the network. Starting with
content discovery enables the organization to understand the magnitude of the sprawl
of sensitive content that has organically accumulated over time. This aids greatly
in estimating and focusing subsequent efforts.
It is also wise to approach a content inventory activity with a narrow initial scope,
or content vector, that scans for a single class of content or a limited number
of classes—for example, high-impact personal information or PCI-regulated
content. Limiting the class of content for discovery initially enables IT and compliance
executives to keep tighter control over discovery and remediation.
Use Cross-Functional Teams
One of the most important aspects of a successful content security strategy is to
obtain the involvement of the key business team members from across the organization
in the effort. The reason is that different employees handle sensitive content for
different purposes and in different ways. The flow of content across the company
varies from one business process to the next. Because predicting where content might
ultimately go in or outside the network can be difficult, an organization needs
the support of staff from all departments—for example, IT, privacy/compliance,
human resources, legal, marketing/communications, and business operations—to
act on policies and remediate any incidents that are discovered.
Promote a Culture of Content Protection and Awareness
Technology and policies alone will not protect an organization. The organization
needs to continuously evangelize the importance of protecting sensitive content,
and provide training on the do's and don'ts of sharing content. Establishing who
within the organization has ownership over content is just the first step in promoting
an attentive and vigilant culture of content security. Training and ongoing oversight
are also key, and are just as important as the technical safeguards and solutions
that organizations implement.
Expand Coverage
After an organization completes the process of implementing and rolling out content
discovery for the highest-priority segment of the sensitive content, it should expand
the program. There are two directions for this expansion:
- Implementing additional safeguards, such as network and desktop monitoring of HBI
segments
- Expanding content segments covered, such as MBI and LBI data classifications
Benefits
Microsoft IT estimates that the return on investment (ROI) for the HBI project is
as high as 600 percent since the project's implementation. The automated solution
has significantly reduced the number of operators required to conduct manual search
and notification efforts, while performing a far more comprehensive analysis of
all digital information assets. In fact, Microsoft IT estimates that manual scanning
reached less than 1 percent of all these assets over the course of a year, whereas
the initial automated comprehensive scan of huge volumes of shares and sites finished
in just 14 days.
Perhaps the greatest benefit from the solution has been the reduction in risk to
Microsoft.
Conclusion
Adequate safeguarding of HBI information in large organizations is a critical but
daunting undertaking. The sheer volume of information at Microsoft made manual methods
of identifying and classifying HBI information a challenging task. To streamline
the effort, Microsoft IT developed a comprehensive approach that includes clear
articulation and enforcement of IT governance, thorough engagement of business owners
to prioritize risks, and service-oriented operational processes.
By using Microsoft technologies and the third-party Tablus Content Sentinel application,
Microsoft IT implemented automated discovery scanning and remediation methods for
HBI information. These methods can examine and classify enormous volumes of information
in a short period of time. This solution has resulted in significant ROI, increased
compliance with data-handling standards, and an impressive reduction in the overall
risk associated with the loss of sensitive information.
For More Information
For more information about Microsoft products or services, call the Microsoft Sales
Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information
Centre at (800) 563-9048. Outside the 50 United States and Canada, please contact
your local Microsoft subsidiary. To access information via the World Wide Web, go
to:
http://www.microsoft.com
http://www.microsoft.com/technet/itshowcase
© 2008 Microsoft Corporation. All rights reserved.
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES,
EXPRESS OR IMPLIED, IN THIS SUMMARY. Microsoft, SharePoint, Windows, and Windows
NT are either registered trademarks or trademarks of Microsoft Corporation in the
United States and/or other countries. The names of actual companies and products
mentioned herein may be the trademarks of their respective owners.