Security

Behind The Scenes: How Microsoft Built a Unified Approach to Windows Security

Robert Hensing

As you can imagine, security is a pretty important topic at Microsoft, and a pretty complex one, too.

In this article, I’ll give you an inside look at the internal teams and processes behind the security efforts at Microsoft. We’ll take a look at the various groups involved in keeping software as secure as possible, seeing how they work, how they respond to new threats, and how they keep you up to date on all the important security issues.

Security Technology Unit

At a Glance: Responsible for ensuring that Windows is the most secure operating system for all users, providing security leadership and acting as the security conscience within Microsoft.

The Security Technology Unit (STU) is responsible for ensuring that Microsoft® Windows® is the most secure operating system it can be, providing security leadership and continuing to help all of its users to be more secure. Within this group is the Secure Windows Initiative (SWI) team, whose original goal was to create engineering practices that would lead to the development of more secure software. The result of this effort became known as the Security Development Lifecycle (SDL). (You can read more about this at "The Trustworthy Computing Security Development Lifecycle".)

Today the SWI team maintains the SDL, creating and documenting secure development processes and guidelines, providing ongoing education about security and privacy best practices to the product groups, and enforcing various aspects of the SDL during the product groups’ development cycles (design reviews, threat models, adherence to secure coding policy, code reviews, final security reviews, penetration testing, and so on). Because the security landscape is constantly changing, it has taken many years to streamline the process and ensure that it is implemented to better protect customers on a consistent basis. Some of the earliest products to undergo aspects of the SDL were Microsoft Exchange Server 2000 SP2, SQL Server™ 2000 SP3a, Windows XP SP2, Windows Server™ 2003, and Windows Server 2003 SP1. These products have achieved measurably improved security—a true testament to the effectiveness of the SDL. For example, Windows Server 2003 was the first operating system released from Microsoft that implemented large portions of the SDL, and compared to Windows 2000, it had 63 percent fewer vulnerabilities in the first year. The number of bulletins released for SQL Server 2000 during the 24 months prior to the release of SP3 (December 2000 through December 2002) was 16, versus 3 in the 27 months following the release of SP3.

Microsoft Security Response Center

At a Glance: Responsible for investigating reports of security vulnerabilities and providing resources and guidance to help protect customers from malicious threats. The center offers information and issues monthly fixes.

The Microsoft Security Response Center (MSRC) is responsible for investigating reports of security vulnerabilities from the security community and for developing resources and guidance to help protect customers from malicious threats such as zero-day vulnerabilities, for which no security update is available. In addition, the MSRC (which works 7 days a week, 365 days a year) is responsible for developing security advisories and issuing updates on a monthly basis to fix vulnerabilities in Microsoft software. (More information on the MSRC can be found at "The Microsoft Security Response Center: Overview" and at the Microsoft Security Response Center Blog.)

On any given day the MSRC receives a number of security vulnerability reports at its secure@microsoft.com address. One of the MSRC duty officers must analyze an incoming report to determine whether more information is needed to understand and reproduce the problem. If no further information is needed, the duty officer opens an investigation in the MSRC bug database (for tracking purposes), notifies the finder that an investigation is being initiated, and alerts the SWI React team. For critical vulnerabilities (such as a potential remote anonymous code execution vulnerability in a default OS component), the SWI React team will usually begin attempts to reproduce the vulnerability within minutes of receiving the report.

Sometimes the MSRC gets to work with very friendly security researchers who are professional and provide very detailed repro steps (usually including a list of the vulnerable products, disassembly of the vulnerable code explaining what they found, and possibly a repro tool that can be used to reproduce the vulnerability, or some source code that can be compiled to create a repro tool). Other times the process is a bit more complicated when the MSRC and SWI are only given vague information on the nature of the vulnerability.

SWI React

At a Glance: Drives the investigation that leads to the monthly security updates and ensures that fixes cover all attack vectors.

The SWI React team drives the security investigation that leads to comprehensive monthly security updates. Each security investigation opened by the MSRC team is assigned an SWI React technical advisor (known as an ethical hacker) to help product teams build rock-solid security updates. The SWI advisor is the owner of the MSRC investigation, responsible for finding any other bugs related to the externally reported issue and also for making sure the fix produced by the product team is comprehensive and covers all attack vectors.

The core process has several distinct phases (illustrated in Figure 1). As soon as an issue is reported and assigned to an SWI React member, the owner goes to work reproducing the issue internally and making an initial determination regarding the proper team to handle the issue as well as the initial severity of the issue. That’s generally fairly easy as external researchers usually provide enough information to reproduce the issue in our lab.

Figure 1 SWI Reactive Process

Then comes the fun part—variation hunting! After the core issue at fault is identified, the React team begins looking in the surrounding code for similar issues. This happens via code review, protocol or file fuzzing (fuzz testing), threat model analysis, bug scrub (looking in our bug database for similar or identical bugs already fixed in a future version of the product but not back-ported to the down-level platform), and general probing around the product area looking for weakness. Some of our most rewarding successes come from identifying new attack vectors that reach the same vulnerable code, vectors that sometimes even the product team did not realize existed. Often this results in an increase in the severity rating of the resulting MSRC bulletin and subsequent attack-surface reduction in the security update. The React team is always looking for any Web-based or e-mail-based vector to hit vulnerable code.

After time has been spent hacking for variations, the React team member moves on to the next issue in the queue as the product team goes to work building the fix for the reported issue and any variations or additional attack surface that has been discovered. Product teams give the React team early "privates," which are initial builds of a fix containing only the updated binaries without the update.exe infrastructure used in the final package. The React team advisor tests the privates and verifies that the reported bug and any other issues that were identified are fixed. The advisor then gives the go-ahead to build the official package, which will be tested again as the time to release draws closer.

One of the final phases of the React team’s work is to review the security advisory or security bulletin that is published along with the fixed bits. It’s very important that Microsoft maintain a consistent rating scale and disclose the severity level of every issue so that users can make informed deployment decisions. Security bulletins go through a rigorous review cycle to ensure that the information is correct and that thorough guidance is being provided to users. The security bulletins are written by the MSRC owner in cooperation with the SWI React member to help ensure that every security update is accompanied by a high quality bulletin.

The React team has security experts who deal with specific products or product areas. For example, one React member is the foremost Microsoft authority on browser-based security issues and probably knows more about security issues in Internet Explorer® than anyone else in the world. Another member is very good at building network protocol test tools. Another focuses mostly on image file formats and continually refines the art of image manipulation. The team works in a highly collaborative environment where ideas are brainstormed daily.

SWI Defense

At a Glance: Provides the MSRC with workarounds so customers can take steps to protect themselves from a security vulnerability. These workarounds are published in security advisories and bulletins.

The SWI Defense team focuses primarily on providing the MSRC with mitigations and workarounds that can then be supplied to users in security advisories and bulletins produced by the MSRC. Workarounds are temporary steps users can take to protect themselves from attempts to exploit a vulnerability. Sometimes these workarounds are provided in security advisories in response to a zero-day exploit for which a security update has not yet been released. Most often, workarounds are provided in public security bulletins along with the security update (see microsoft.com/technet/security/current.aspx). The workarounds published in these security bulletins are rigorously tested and made public because we realize that some organizations have extensive testing procedures that security updates must go through before they can be widely deployed. Users can deploy these workarounds to protect themselves from immediate threats until the security updates are ready to be deployed.

Some workarounds are very strong and can address the vulnerability directly, protecting systems from all possible attack vectors (for example, an e-mail attack vector or a browser attack vector), while others are attack vector-specific and only offer protection from one or two angles. All identified workarounds are documented with preference given to core workarounds that work for all attack vectors.

Mitigations and mitigating factors are different from workarounds. Mitigations are usually configurations that, by definition, are not vulnerable. Consider, for example, a named pipe that can be used to exploit a security vulnerability in Windows: the pipe is anonymously accessible on Windows 2000, but on Windows XP SP2 it has been secured to allow only administrative access. If the vulnerability exists on both platforms, a mitigating factor for Windows XP SP2 is that only administrators could remotely exploit the vulnerable named pipe.
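
To make the distinction concrete, here is a minimal sketch in C (illustrative only, not actual Windows code; the pipe name is hypothetical) of the kind of hardening described above: a named pipe is created with a security descriptor whose DACL grants access only to the built-in Administrators group, so anonymous callers cannot connect.

/* Sketch: create a named pipe whose DACL allows only Administrators.
 * Compile with:  cl pipe_sketch.c advapi32.lib
 */
#include <windows.h>
#include <sddl.h>
#include <stdio.h>

int main(void)
{
    HANDLE hPipe;
    SECURITY_ATTRIBUTES sa = { sizeof(sa), NULL, FALSE };

    /* SDDL: grant Generic All to Built-in Administrators only; with no ACE
       for Everyone or Anonymous, other callers are denied by default. */
    if (!ConvertStringSecurityDescriptorToSecurityDescriptorA(
            "D:(A;;GA;;;BA)", SDDL_REVISION_1,
            &sa.lpSecurityDescriptor, NULL))
    {
        printf("Could not build security descriptor: %lu\n", GetLastError());
        return 1;
    }

    /* "ExamplePipe" is a placeholder name used only for illustration. */
    hPipe = CreateNamedPipeA("\\\\.\\pipe\\ExamplePipe",
        PIPE_ACCESS_DUPLEX,
        PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
        1, 4096, 4096, 0, &sa);

    if (hPipe == INVALID_HANDLE_VALUE)
    {
        printf("CreateNamedPipe failed: %lu\n", GetLastError());
        LocalFree(sa.lpSecurityDescriptor);
        return 1;
    }

    printf("Pipe created; only administrators can open it.\n");
    CloseHandle(hPipe);
    LocalFree(sa.lpSecurityDescriptor);
    return 0;
}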

A good example of a workaround is one that was originally documented in the security advisory (microsoft.com/technet/security/advisory/912840.mspx) for the recent WMF zero-day security vulnerability. This information is now in the security bulletin released with the security update MS06-001. Within 24 hours of the WMF zero-day being reported to the MSRC, a security advisory was released with a recommended workaround to protect systems from the active Internet Explorer-based attack vector. Even though it was the SWI Defense team that identified, tested, and documented this attack vector-specific workaround (which amounted to unregistering shimgvw.dll), it should be noted that the Defense team does not work alone; the team receives great support from the product groups. Many times the product groups will propose workarounds or mitigating factors immediately in response to a vulnerability report and the Defense team will validate and test the workarounds while looking for more. At other times, the Defense team asks the product groups to validate workarounds or mitigating factors discovered by the Defense team.

For security vulnerabilities that are responsibly disclosed, SWI Defense generally has more time to reproduce the vulnerability, identify additional attack vectors, look for workarounds for all of the identified attack vectors, and then test them. This is obviously the preferred scenario. This process can take anywhere from a few hours to a few weeks—it all depends on the complexity of the product and the vulnerability.

For their investigative work, the Defense team uses virtualization technology extensively. They have an arsenal of Virtual PC images ranging from fully patched installations of Windows NT® 4.0 SP6a through Windows Server 2003 SP1 and everything in between. When looking for workarounds they always start by reproducing the vulnerability in a fully patched Virtual PC image for the affected platform while the affected process is running under a debugger using internally created repro tools or files. During the debugging process a call stack for the thread encountering the exception is observed. (The call stack is a list of functions that were called prior to the vulnerable function being called.) If the list of functions is large enough, we can usually find a way to avoid calling into the vulnerable code by making some changes that affect code earlier in the call stack.

Having access to debugging symbols and source code makes looking for workarounds easier, but often the functions that are called are very small and their behavior cannot easily be configured or influenced in any way (in other words, there is no way to induce an error condition that causes the function to exit rather than call the next function in the chain). In situations like that it becomes necessary to start looking for other dependencies. If you can’t change or influence a function’s behavior via some configuration change, then what happens if it is removed entirely? (Sometimes this is the only way to force an error condition and prevent a later function from being called.)
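
The following sketch, written with entirely hypothetical function names, illustrates the pattern just described: when a dependency early in the call chain can be removed or disabled (by unregistering a handler, for instance), an outer function hits an error path and exits before the vulnerable function deeper in the stack is ever reached.

/* Hypothetical call chain; none of these functions exist in Windows. */
#include <stdio.h>

static int handler_registered = 1;   /* stands in for a DLL/COM registration */

static void vulnerable_parse(const char *data)
{
    /* The flaw lives here; the workaround aims to keep execution out. */
    printf("parsing untrusted data: %s\n", data);
}

static int dispatch_content(const char *data)
{
    if (!handler_registered)
        return -1;                   /* error exit breaks the chain early */
    vulnerable_parse(data);          /* otherwise the flaw is reachable */
    return 0;
}

static void open_document(const char *data)
{
    if (dispatch_content(data) != 0)
        printf("handler unavailable; content not rendered\n");
}

int main(void)
{
    open_document("attacker-controlled bytes");  /* reaches the flawed code */
    handler_registered = 0;                      /* e.g. the DLL is unregistered */
    open_document("attacker-controlled bytes");  /* flawed code never runs */
    return 0;
}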

In the case of the WMF vulnerability, it was observed that for the most common attack vectors (Internet Explorer, Windows Explorer, and so forth), prior to calling the vulnerable function, a function in shimgvw.dll was called. So unregistering the DLL to block these attack vectors was tested and recommended.

Another good example of this type of work is the vulnerability addressed in Security Bulletin MS05-050 (microsoft.com/technet/security/Bulletin/MS05-050.mspx). It was discovered that an attacker could send a user a malformed video file that could exploit a vulnerability in Microsoft DirectShow®, allowing the attacker to run code of his choice in the context of the user. While analyzing the vulnerability, it was determined that the only known attack vector for this issue depended on the user being able to access a registry key under HKEY_CLASSES_ROOT ({1B544C20-FD0B-11CE-8C63-00AA0044B51E}). If access to this key was prevented, the vector was blocked.
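
As an illustration only (this is not code from the bulletin, and the exact subkey path shown is an assumption), a small program could confirm that such a workaround is in effect by attempting to open the key and checking for an access-denied result:

/* Check whether the DirectShow CLSID key is still readable.
 * Compile with:  cl check_key.c advapi32.lib
 */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HKEY hKey;
    LONG rc = RegOpenKeyExA(HKEY_CLASSES_ROOT,
        "CLSID\\{1B544C20-FD0B-11CE-8C63-00AA0044B51E}",
        0, KEY_READ, &hKey);

    if (rc == ERROR_SUCCESS)
    {
        printf("Key is readable; this attack vector is not blocked.\n");
        RegCloseKey(hKey);
    }
    else if (rc == ERROR_ACCESS_DENIED)
    {
        printf("Access denied; the workaround appears to be in place.\n");
    }
    else
    {
        printf("RegOpenKeyEx returned %ld.\n", rc);
    }
    return 0;
}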

Other issues the Defense team looks for are non-workarounds and non-mitigating factors. Sometimes there are configuration steps that may seem like they would block a given attack vector but that actually won’t. A good example of this is the TNEF vulnerabilities that affected Microsoft Outlook® and Exchange Server. TNEF is an encoding format that is used to send rich text e-mail messages. During testing, the Defense team discovered that the Outlook "read e-mail as plain text" feature did not mitigate this vulnerability. In addition, Exchange Server offers an option to prevent it from sending rich text-formatted e-mail messages but this does not protect Exchange Servers from inbound attempts to exploit the vulnerability. This is the kind of information that is important to call out in the security bulletin so that users do not assume that seemingly related configuration changes will protect them.

Finally, this team is responsible for providing feedback to the product teams. Often during an investigation, SWI Defense will find a potential workaround that should provide a solution but doesn’t, or they will find ways a current feature of the product could be extended to improve its ability to withstand or block attacks. In situations like this, the team may file a bug report against the affected product and then inform the product team that they need to implement a change.

SWI Program Manager Team

At a Glance: Enforces the requirements of the Security Development Lifecycle and works closely with various Microsoft product teams to ensure security requirements are met.

The SWI Program Manager team is responsible for evangelizing and enforcing the requirements of the SDL throughout Microsoft. This team works closely with product teams through their development process to ensure the dev teams understand and meet all of the security requirements.

Each product team is responsible for registering its upcoming releases with SWI and is then assigned a security advisor from the SWI Program Manager team. This SWI PM works with the product team to ensure that all SDL requirements will be met prior to the release date. The SWI PM also assists the team with security engineering practices such as developing security plans, threat modeling, and running the various security tools Microsoft has developed. Before a team releases its product to customers, it must pass a Final Security Review (FSR) from SWI. The FSR ensures that all of the mandatory security checks have been completed and the required tools have been run. The goal is to ensure that products ship with no known MSRC-class vulnerabilities.

SWI PMs act as liaisons between the MSRC and the product teams to ensure that the teams fix all known vulnerabilities and are aware of trends in security research. Arming teams with this knowledge helps reduce exposure to vulnerabilities.

SWI Tools

At a Glance: Creates and maintains fuzzing tools for various technologies, as well as attack surface, code, and run-time analysis tools, and promotes the latest security best practices.

The SWI Tools team writes some fantastic tools. Among the most important tools created and maintained by this team are fuzzing tools for various technologies, as well as attack surface, code, and run-time analysis tools. Fuzzing is one of the most effective and popular approaches professional security researchers use to find exploitable input validation flaws and other vulnerabilities in numerous operating system components and applications. The team has created and continues to refine a versatile range of fuzzing tools that we use to find input validation flaws in various products before they ship. Fuzzing tools, for example, helped identify weaknesses in many file parsing components of Windows that were, as a result, fixed as part of the Windows XP SP2 security push.
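
To give a flavor of the technique (this is a simplified sketch, not one of the SWI tools; the seed file and target command are whatever the tester supplies), a dumb file-format mutation fuzzer can be as small as the following: load a well-formed seed file, flip a few random bytes, hand the result to the parser under test, and watch for crashes or unexpected exits.

/* Minimal mutation fuzzer: fuzz <seed_file> <target_command> */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(int argc, char **argv)
{
    FILE *f;
    long size;
    unsigned char *seed;
    int iter, i;

    if (argc != 3) {
        fprintf(stderr, "usage: %s seed_file target_command\n", argv[0]);
        return 1;
    }

    /* Load the known-good input into memory. */
    f = fopen(argv[1], "rb");
    if (!f) { perror("seed"); return 1; }
    fseek(f, 0, SEEK_END);
    size = ftell(f);
    fseek(f, 0, SEEK_SET);
    seed = malloc(size);
    fread(seed, 1, size, f);
    fclose(f);

    srand((unsigned)time(NULL));

    for (iter = 0; iter < 1000; iter++) {
        unsigned char *copy = malloc(size);
        FILE *out;
        char cmd[512];
        int rc;

        memcpy(copy, seed, size);

        /* Mutate a handful of random bytes. */
        for (i = 0; i < 8; i++)
            copy[rand() % size] = (unsigned char)(rand() % 256);

        out = fopen("fuzzed.bin", "wb");
        fwrite(copy, 1, size, out);
        fclose(out);
        free(copy);

        /* Feed the mutated file to the parser; a crash or nonzero exit
           code is a lead worth investigating under a debugger. */
        snprintf(cmd, sizeof(cmd), "%s fuzzed.bin", argv[2]);
        rc = system(cmd);
        if (rc != 0)
            printf("iteration %d: target returned %d\n", iter, rc);
    }

    free(seed);
    return 0;
}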

Another part of the SWI Tools team’s charter is to create quality tools that can be used to support the Security Development Lifecycle throughout Microsoft. Whenever a new set of engineering best practices or recommendations is added to the SDL, the Tools team tries to build a tool that will help product groups quickly analyze their SDL compliance and follow the most up-to-date security best practices. One good example is the Threat Modeling Tool, which is available for download.

SWI Penetration Testing

At a Glance: Looks for design weaknesses that could pave a way for attackers to subvert security guarantees.

The SWI Penetration Testing team operates on several fronts (see Figure 2). It consists of a small number of world-class security researchers. A big part of the team’s job is to put products getting ready to ship through in-depth security audits, looking for implementation and design weaknesses that could pave a way for attackers to subvert security guarantees. The Penetration Testing team pays attention to new and legacy code, managed and native components, and an entire range of areas where security weaknesses can manifest themselves, including user interface code, kernel-mode code, cryptography, network and remote procedure call code, and so forth. When problems are identified, the Penetration Testing team works with the product team to ensure the problems are properly addressed. This can range from verifying simple coding fixes to driving engineering process improvements to having the product team create its own security assurance team. On occasion the Penetration Testing team has to convince the product team of the severity of a product security hole. To illustrate the weakness, an attack tool is developed and demonstrated to the product team. The feedback loop is completed by feeding the findings back to folks who provide training and education to product teams.

Figure 2 Prerelease Security Process

The final stage of development is too late in the game to be finding and fixing design-level issues. Thus, this team is plugged in increasingly early in the development pipeline. The Penetration Testing team has the necessary skills to do in-depth design reviews and help product teams identify and eliminate serious shortcomings before even a line of code has been written.

Finally, members of the team occasionally spend their time on pure security research. Some of the most notable accomplishments of late include significant strengthening of the protections afforded by the guard stack (GS) flag in the Visual Studio® 2005 C/C++ compiler and in-depth analysis of heap corruption safeguards provided by the Windows Vista™ heap manager.
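
For context, this is a minimal example in C of the class of bug that the /GS buffer security check targets. Built with the check enabled (the default in Visual Studio 2005), the overrun corrupts the cookie the compiler places between the local buffer and the return address, and the check in the function epilogue terminates the process instead of letting a smashed return address be used.

/* Classic stack buffer overrun; compile with:  cl /GS gs_demo.c */
#include <stdio.h>
#include <string.h>

static void copy_name(const char *untrusted)
{
    char name[16];
    strcpy(name, untrusted);   /* no bounds check on the copy */
    printf("hello, %s\n", name);
}

int main(void)
{
    /* Far longer than the 16-byte buffer above. */
    copy_name("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA");
    return 0;
}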

Not surprisingly, Windows Vista presents a significant challenge to our team due to the sheer size of the product, in addition to the number of new security technologies it includes. Windows Vista is undergoing the largest internal prerelease penetration test of any product in Microsoft history. I’m sure all the security teams involved are up to the challenge.

Wrapping Up

I hope this article has given you a better understanding of the people and processes involved in making Microsoft software as secure as possible. For more information, be sure to visit the Microsoft Security Web site.

Robert Hensing has been at Microsoft for over eight years and is a Software Security Engineer on the Secure Windows Initiative team, providing assistance to the Microsoft Security Response Center. Robert currently holds MCSE and CISSP certifications. He can be reached at rhensing@microsoft.com.

© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.