On Measuring Progress

By Jeffrey R. Jones, Senior Director - Microsoft Security Business Unit

This month I want to spend some time discussing two questions that I get asked a lot: do I think Microsoft is making progress on improving security for customers, and what metrics are we using to measure that progress? Ultimately, I think those metrics should be defined by IT professionals and other customers, but I thought I could start the discussion by sharing some of my own thoughts.

First, so that we as a community can take the discussion beyond the theoretical and into the practical realm, I propose the following guidelines:

  • The criteria should be measurable and distinguishable.

  • The methodology should be repeatable by objective third parties.

  • The criteria should support real-world scenarios.

  • The criteria should be useful for business decisions.

  • The criteria should support measuring version over version improvements.

  • The criteria should support comparisons between vendors.

My proposal for security criteria would be to have the following high-level areas: assurance, security quality, protective capabilities, manageability and automation, security update tools, and security update policies. I see the first three areas as a group that could provide some idea of a product's base security strength, while the latter three give IT departments some guidelines for measuring how security is managed operationally.
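
To make the grouping concrete, here is a minimal sketch in Python of how the six areas might be rolled up into a scorecard that supports version-over-version and vendor-to-vendor comparisons, in line with the guidelines above. The area names come from my proposal; the scorecard structure, field names, and 0-to-5 scale are hypothetical illustrations, not an established methodology.

    # Hypothetical scorecard for the six proposed criteria areas.
    # The 0 (weak) to 5 (strong) scale and all sample scores are illustrative assumptions.
    from dataclasses import dataclass, field
    from typing import Dict

    BASE_STRENGTH_AREAS = ("assurance", "security quality", "protective capabilities")
    OPERATIONAL_AREAS = ("manageability and automation", "security update tools",
                         "security update policies")

    @dataclass
    class Scorecard:
        product: str
        version: str
        scores: Dict[str, int] = field(default_factory=dict)  # area name -> 0..5

        def base_strength(self) -> float:
            # Average across the three areas suggesting base security strength.
            return sum(self.scores[a] for a in BASE_STRENGTH_AREAS) / 3

        def operational(self) -> float:
            # Average across the three operational-management areas.
            return sum(self.scores[a] for a in OPERATIONAL_AREAS) / 3

    # Version-over-version comparison for a made-up product.
    v1 = Scorecard("ExampleOS", "1.0", {"assurance": 2, "security quality": 3,
                   "protective capabilities": 2, "manageability and automation": 2,
                   "security update tools": 3, "security update policies": 3})
    v2 = Scorecard("ExampleOS", "2.0", {"assurance": 3, "security quality": 4,
                   "protective capabilities": 4, "manageability and automation": 4,
                   "security update tools": 4, "security update policies": 4})
    print(f"Base strength: {v1.base_strength():.1f} -> {v2.base_strength():.1f}")
    print(f"Operational:   {v1.operational():.1f} -> {v2.operational():.1f}")

The same structure would work for comparing two different vendors' products at a single point in time, which is the other comparison the guidelines call for.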

Assurance is an area that those experienced in security will recognize. In the days of the NSA's Trusted Computer System Evaluation Criteria, or Orange Book, different assurance levels were tied to different feature sets. The Common Criteria, in use today by many countries, introduced a more flexible model that separates assurance levels from the feature set; EAL4, for example, stands for Evaluation Assurance Level 4. So, in the assurance area, I can think of several measurable sub-criteria: third-party certifications, source integrity protections, established change control processes, quality assurance processes, security architecture definitions, threat modeling, adherence to coding guidelines, use of automated code review tools, and penetration testing.

Security Quality is the term I have begun using for things like the number of vulnerabilities that are found in code. In this category, I can think of several sub-criteria: number of vulnerabilities fixed, number of publicly reported vulnerabilities unfixed, average time to fix a publicly reported vulnerability, number of open listening ports, and number of listening services or daemons.
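
To show that these sub-criteria really are measurable in the sense of the guidelines, here is a minimal sketch, again in Python, that computes two of them from a hypothetical list of publicly reported vulnerability records: the number still unfixed and the average time to fix. The record fields, identifiers, and dates are invented purely for illustration.

    # Hypothetical vulnerability records; identifiers, fields, and dates are made up.
    from datetime import date

    vulns = [
        {"id": "VULN-001", "disclosed": date(2005, 1, 10), "fixed": date(2005, 2, 8)},
        {"id": "VULN-002", "disclosed": date(2005, 3, 2), "fixed": date(2005, 3, 30)},
        {"id": "VULN-003", "disclosed": date(2005, 4, 15), "fixed": None},  # still open
    ]

    unfixed = [v for v in vulns if v["fixed"] is None]
    days_to_fix = [(v["fixed"] - v["disclosed"]).days for v in vulns if v["fixed"]]

    print(f"Publicly reported and unfixed: {len(unfixed)}")
    print(f"Average days to fix:           {sum(days_to_fix) / len(days_to_fix):.1f}")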

Protective Capabilities are the capabilities that might protect the software from attack even in the presence of a vulnerability. Sub-criteria include: firewall capabilities, antivirus/worm capabilities, strong authentication, behavior blocking, and security policy enforcement features.

Manageability and Automation is a category aimed at ensuring that configuration mistakes are less likely to decrease security and that less expertise is required to deploy and manage a product with low security risk. Sub-criteria would include: security configuration wizards/tools, lockdown tools, vulnerability assessment capabilities, security reporting capabilities, and central management capabilities.

Security Update Packages and Tools are a complementary part of measuring security until either security quality or protective capabilities reach 100% effectiveness. Sub-criteria should include: automatic update capabilities, patch approval capability for enterprise updates, patch deployment tool, uninstall capabilities, recompile requirements, reboot requirements, size of packages, and the ability to work with third-party deployment tools.

Security Update Policies is an interesting area that will warrant a lot of discussion. The sub-criteria for this area put the emphasis on vendors. Do they define a security support lifecycle for their products? Do they patch all publicly disclosed issues? Do they issue security advisories, follow responsible disclosure, and offer mitigation guidance? Do they publish patch test quality criteria, patch all versions and languages at the same time, and limit the number of patch recalls?

After reviewing my proposed list, there is one anticipated topic that I left out—security features. I did not include it because, in some sense, I think we should be able to measure “more secure” and have it apply to any general type of software product. If a product supports remote, encrypted tunneling, that feature is more about the remote access scenario than it is about the fundamental security of the product. I believe we should be able to have metrics for the base security and then separately be able to discuss business scenarios that may be enabled by additional security features.

So, what do you think of my straw man proposal? Please send me your feedback. As a next step, I am going to go to the security newsgroups and begin a similar discussion so that we can be interactive in developing measurement criteria for progress. I hope some of you will join the discussions there. When we can make further progress as a community in establishing measurement criteria, I will report back to you with an update.

Best regards,

Jeff