Why Create a Security Incident Response Process

By Christopher Budd, Security Program Manager, Microsoft Corporation

Combating malicious software in your environment isn’t just a matter of implementing the right technology solutions. Like all things in the IT world, effectively combating malicious software requires a solution that combines three classic, critical elements: people, processes, and technology.

This means that if your strategy for combating malicious software is based solely on technology, your strategy is missing the equally critical elements of people and processes. In this column, I want to focus on the process element and talk about the incident response process. In many ways, this is the most important process element in a comprehensive strategy for dealing with malicious software.

First, it’s important to understand what an incident response process is. At Microsoft, we have spent years developing and refining our own Software Security Incident Response Process (SSIRP), which we use, among other things, when malicious users put our customers at risk by deliberately seeking to exploit vulnerabilities. With this structured, regulated process, we can effectively investigate, analyze, and resolve these situations with the goal of protecting our customers. Our SSIRP is one of the many kinds of incident response processes that organizations have developed in answer to intrusions, malicious software outbreaks, or other threats. Regardless of the specific threats that are addressed, the key thing is that an incident response process must be structured so that you can use it to respond to situations and events that put your environment at risk. An incident response process can be thought of as a process-based (rather than technology-based) way to mitigate that risk.

By their very nature, incidents are unpredictable. They arise unexpectedly, they unfold in ways that cannot be fully anticipated, and you cannot predict going into them what the results will be when the incident has passed. In fact, the greatest challenge with an incident is its inherent chaos. Chaos compounds the damage of the incident itself (regardless of the type) by slowing responses, causing incorrect or ineffective paths of response to be chosen, or, in the worst case, causing responses to stop altogether, allowing the situation to rage unabated.

An incident response process helps mitigate the risk posed by that chaos. In a way, an incident response process can be thought of as an attempt to impose some degree of order and control over what is fundamentally a disordered and uncontrolled situation. It is a constant and steady answer to the panicked question, “What do I do?” when bad things start to happen. In malicious software outbreaks, for example, the events observed in the early stages can differ radically depending on the specific nature of the malicious software in your environment. In fact, many outbreaks begin in such a way that it’s not clear a malicious software outbreak is in progress: it could just as plausibly be another kind of threat, such as an intrusion or a highly disruptive outage. Most importantly, in these early stages, quickly choosing a course of action based on an incorrect understanding of the problem can lead to a longer, more damaging incident. You lose time following the wrong course of action before you have to reevaluate the situation and decide on an effective solution. An incident response process strives to put order and consistency into handling these situations and, in so doing, minimize the damaging effects of chaos during the incident.

What an incident response process brings to the situation is a structured sequence of data gathering, analysis, and decision making that progresses through clearly defined stages with clearly specified supporting processes and procedures. Using our SSIRP as an example, we can follow a defined process of gathering and analyzing data during our “Alert” phase to determine if we should invoke our full process and move to the “Mobilize” phase. By following a similar structured process in responding to malicious software outbreak incidents, you can respond more effectively and limit damage because all resources are focused in an ordered and controlled manner.
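The idea of gating one phase on the data gathered in the previous one can be sketched in a few lines of Python. This is only an illustration, not a description of how SSIRP is implemented: the “Alert” and “Mobilize” names come from the column, while the `Incident` class, its methods, and the `min_findings` threshold are hypothetical stand-ins for whatever criteria your own process defines.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    ALERT = "Alert"        # phase name taken from the column
    MOBILIZE = "Mobilize"  # phase name taken from the column


@dataclass
class Incident:
    """Hypothetical record of one incident as it moves through the phases."""
    phase: Phase = Phase.ALERT
    findings: list[str] = field(default_factory=list)

    def record(self, finding: str) -> None:
        # Data gathering: keep every observation so the decision is auditable.
        self.findings.append(finding)

    def evaluate(self, min_findings: int = 2) -> Phase:
        # Assumed decision rule: move from Alert to Mobilize only once
        # enough corroborating findings have been recorded.
        if self.phase is Phase.ALERT and len(self.findings) >= min_findings:
            self.phase = Phase.MOBILIZE
        return self.phase
```

The point of the sketch is that the decision to escalate is an explicit, repeatable step that runs on recorded data, rather than a judgment made under pressure.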

A containment phase is one example of the benefit that this structure can bring to a response process for malicious software outbreaks. This phase focuses first on understanding what needs to be done to prevent further spreading or damage. Actions focused on returning already compromised or infected systems to service are left for a later phase, once containment has been achieved. Without a focused, structured process like this in place, many organizations will try to solve the problem all at once, mixing containment and recovery processes. Combining them often hampers overall success by diluting resource effectiveness across multiple tasks. By focusing all resources first on containment and then on recovery, the issue is contained more quickly, which leads to fewer affected systems, which in turn minimizes the overall recovery time. This staged approach can seem to delay recovery for some systems, but it actually speeds up the overall recovery of all systems. In an incident response process, it is important to always keep the big picture in mind when making decisions.
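As a rough illustration of keeping containment and recovery separate, the following Python sketch orders a mixed task list so that every containment task comes before any recovery task. The `Stage`, `Task`, and `plan` names are hypothetical, and the task descriptions in the usage example are invented purely for illustration.

```python
from enum import IntEnum
from typing import NamedTuple


class Stage(IntEnum):
    CONTAINMENT = 1  # stop further spreading or damage first
    RECOVERY = 2     # return affected systems to service afterwards


class Task(NamedTuple):
    stage: Stage
    description: str


def plan(tasks: list[Task]) -> list[Task]:
    """Return tasks with all containment work ahead of all recovery work.

    sorted() is stable, so tasks keep their original relative order
    within each stage.
    """
    return sorted(tasks, key=lambda task: task.stage)


if __name__ == "__main__":
    mixed = [
        Task(Stage.RECOVERY, "rebuild infected workstation"),
        Task(Stage.CONTAINMENT, "isolate affected network segment"),
        Task(Stage.RECOVERY, "restore file share from backup"),
        Task(Stage.CONTAINMENT, "update antivirus signatures"),
    ]
    for task in plan(mixed):
        print(task.stage.name, "-", task.description)
```

A real process would also need a way to decide when containment has actually been achieved before recovery begins; the sketch only captures the ordering.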

An incident response process is not, by itself, a complete strategy for dealing with malicious software in your environment. It must be part of a solution that also addresses the technology element, through things like update management, antivirus, and antispyware, and the people element, through user education and training to foster better security awareness. Incident response is a vital part of that strategy, even though it is often still thought of as something that only large organizations need. The reality is that any organization big enough to have an incident involving malicious software is big enough to need an incident response process for dealing with that threat, even if it’s only a small office and the response is to call your IT support vendor.