Responding to IT Security Incidents

On This Page

Introduction
Before You Begin
Minimizing the Number and Severity of Security Incidents
Assembling the Core Computer Security Incident Response Team
Defining an Incident Response Plan
Containing the Damage and Minimizing the Risks
Related Information

Introduction

How prepared is your information technology (IT) department or administrator to handle security incidents? Many organizations learn how to respond to security incidents only after suffering attacks. By then, incidents are often far more costly than they needed to be. Proper incident response should be an integral part of your overall security policy and risk mitigation strategy.

There are clearly direct benefits in responding to security incidents. However, there might also be indirect financial benefits. For example, your insurance company might offer discounts if you can demonstrate that your organization is able to quickly and cost-effectively handle attacks. Or, if you are a service provider, a formal incident response plan might help win business, because it shows that you take information security seriously.

This document provides a recommended process and procedures to use when responding to intrusions identified in a small to medium-sized business (SMB) network environment. It explains the value of forming a security incident response team with explicit team member roles, and how to define a security incident response plan.

To successfully respond to incidents, you need to:

  • Minimize the number and severity of security incidents.

  • Assemble the core Computer Security Incident Response Team (CSIRT).

  • Define an incident response plan.

  • Contain the damage and minimize risks.

Before You Begin

This document assumes that system administrators, who spend a lot of time with network environments and know them well, have already documented those environments and put backups in place. An auditing process should already be in place to monitor performance and utilization, and a basic level of security awareness should have been achieved before you implement an incident response team.

No matter how much detail you know about the network environment, the risk of being attacked remains. Any sensible security strategy must include details on how to respond to different types of attacks.

Minimizing the Number and Severity of Security Incidents

In most areas of life, prevention is better than cure, and security is no exception. Wherever possible, you will want to prevent security incidents from happening in the first place. However, it is impossible to prevent all security incidents. When a security incident does happen, you will need to ensure that its impact is minimized. To minimize the number and impact of security incidents, you should:

  • Clearly establish and enforce all policies and procedures. Many security incidents are accidentally created by IT personnel who have not followed or not understood change management procedures or have improperly configured security devices, such as firewalls and authentication systems. Your policies and procedures should be thoroughly tested to ensure that they are practical and clear and provide the appropriate level of security.

  • Gain management support for security policies and incident handling.

  • Routinely assess vulnerabilities in your environment. Assessments should be done by a security specialist with the appropriate clearance to perform these actions (that is, one who is bondable and has been given administrator rights to the systems).

  • Routinely check all computer systems and network devices to ensure that they have all of the latest patches installed.

  • Establish security training programs for both IT staff and end users. The largest vulnerability in any system is the inexperienced user. The ILOVEYOU worm effectively exploited that vulnerability among IT staff and end users.

  • Post security banners that remind users of their responsibilities and restrictions, along with a warning of potential prosecution for violation. These banners make it easier to collect evidence and prosecute attackers. You should obtain legal advice to ensure that the wording of your security banners is appropriate.

  • Develop, implement, and enforce a policy requiring strong passwords. You can learn more about passwords in "Enforcing Strong Password Usage Throughout Your Organization" in the Security Guidance Kit.

  • Routinely monitor and analyze network traffic and system performance.

  • Routinely check all logs and logging mechanisms, including operating system event logs, application-specific logs, and intrusion detection system logs.

  • Verify your back-up and restore procedures. You should be aware of where backups are maintained, who can access them, and your procedures for data restoration and system recovery. Make sure that you regularly verify backups and media by selectively restoring data.

  • Create a Computer Security Incident Response Team (CSIRT) to deal with security incidents. You can learn more about CSIRT in the following section of this document.
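
The selective restore check in the list above can be partly automated. The following Python sketch is illustrative only. It assumes that a sample of files has been restored to a scratch directory and that the live copies have not changed since the backup was taken; it then compares SHA-256 digests of the two sets. The function names `sha256_of` and `verify_restore` and the directory layout are hypothetical and do not belong to any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Return a report line for each restored file that is missing or differs.

    Hypothetical helper: source_dir holds the reference copies and
    restored_dir holds the files restored from backup media.
    """
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty result suggests that the sampled files were restored intact. A "differs" line needs human review, because a file that was legitimately edited after the backup will also differ.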

Assembling the Core Computer Security Incident Response Team

The CSIRT is the focal point for dealing with computer security incidents in your environment. Your team should consist of a group of people with responsibilities for dealing with any security incident. Team members should have clearly defined duties to ensure that no area of your response is left uncovered.

Assembling a team before an incident occurs is very important to your organization and will positively influence how incidents are handled. A successful team will:

  • Monitor systems for security breaches.

  • Serve as a central communication point, both to receive reports of security incidents and to disseminate vital information to appropriate entities about the incident.

  • Document and catalog security incidents.

  • Promote security awareness within the company to help prevent incidents from occurring in your organization.

  • Support system and network auditing through processes such as vulnerability assessment and penetration testing.

  • Learn about new vulnerabilities and attack strategies employed by attackers.

  • Research new software patches.

  • Analyze and develop new technologies for minimizing security vulnerabilities and risks.

  • Provide security consulting services.

  • Continually hone and update current systems and procedures.

When you create a CSIRT, prepare the team so they are equipped to handle incidents. To prepare the team, you should:

  • Train them on the proper use and location of critical security tools. You should also consider providing portable computers that are preconfigured with these tools to ensure that no time is wasted installing and configuring tools so they can respond to an incident. These systems and the associated tools must be properly protected when not in use.

  • Assemble all relevant communication information. You should ensure that you have contact names and phone numbers for people within your organization who need to be notified (including members of the CSIRT, those responsible for supporting all of your systems, and those in charge of media relations). You will also need details for your Internet service provider (ISP) and local and national law enforcement agencies. Consult your legal counsel about contacting local law enforcement before an incident happens. This will help you to ensure that you understand proper procedures for communicating incidents and collecting evidence. Legal counsel should be informed of any contacts with law enforcement.

  • Place all emergency system information in a central, offline location, such as a physical binder or an offline computer. This emergency information includes passwords to systems, Internet Protocol (IP) addresses, router configuration information, firewall rule set lists, copies of certification authority keys, contact names and phone numbers, escalation procedures, and so on. This information must both be readily available and be kept extremely physically secure. One method of securing and making this information readily available is to encrypt it on a dedicated security portable computer that is placed in a secure vault and limit access to the vault to authorized individuals such as the CSIRT leader and the CIO or CTO.

The ideal CSIRT membership and structure depend on the type of your organization and your risk management strategy. However, the CSIRT should generally form part or all of your organization's security team. Inside the core team are security professionals responsible for coordinating a response to any incident. The number of members in the CSIRT will typically depend on the size and complexity of your organization. However, you should ensure that there are enough members to adequately cover all of the duties of the team at any time.

Establishing Team Roles

A successful CSIRT consists of several key members.

CSIRT Team Leader. The CSIRT must have an individual in charge of its activities. The CSIRT Team Leader will generally be responsible for the activities of the CSIRT and will coordinate reviews of its actions. This might lead to changes in policies and procedures for dealing with future incidents.

CSIRT Incident Lead. In the event of an incident, you should designate one individual responsible for coordinating the response. The CSIRT Incident Lead has ownership of the particular incident or set of related security incidents. All communication about the event is coordinated through the Incident Lead, and when speaking with those outside the CSIRT, he or she represents the entire CSIRT. The Incident Lead might vary depending on the nature of the incident, and is often a different person than the CSIRT Team Leader.

CSIRT Associate Members. Besides the core CSIRT, you should have a number of specific individuals who handle and respond to particular incidents. Associate members will come from a variety of different departments in your organization. They should specialize in areas that are affected by security incidents but that are not dealt with directly by the core CSIRT. Associate members can either be directly involved in an incident or serve as entry points to delegate responsibility to a more appropriate individual within their departments. The following table shows some suggested associate members and their roles.

CSIRT Associate Members

Associate Member

Role Description

IT Contact

This member is primarily responsible for coordinating communication between the CSIRT Incident Lead and the rest of the IT group. The IT Contact might not have the particular technical expertise to respond to the particular incident; however, he or she will be primarily responsible for finding people in the IT group to handle particular security events.

Legal Representative

This member is a lawyer who is very familiar with established incident response policies. The Legal Representative determines how to proceed during an incident with minimal legal liability and maximum ability to prosecute offenders.

Before an incident occurs, the Legal Representative should have input on monitoring and response policies to ensure that the organization is not being put at legal risk during a cleanup or containment operation. It is very important to consider the legal implications of shutting down a system and potentially violating service level agreements or membership agreements with your customers, or of not shutting down a compromised system and being liable for damages caused by attacks launched from that system.

Any communication to outside law enforcement or external investigative agencies should also be coordinated with the Legal Representative.

Public Relations Officer

Generally, this member is part of the public relations department and is responsible for protecting and promoting the image of the organization.

This individual might not be the actual face to the media and customers, but he or she is responsible for crafting the message (the content and objective of the message is generally the responsibility of management). All media inquiries should be directed to Public Relations.

Management

Depending on the particular incident, you might involve only departmental managers, or you might involve managers across the entire organization. The appropriate management individual will vary according to the impact, location, severity, and type of incident.

If you have a managerial point of contact, you can quickly identify the most appropriate individual for the specific circumstances. Management is responsible for approving and directing security policy.

Management is also responsible for determining the total impact (both financial and otherwise) of the incident on the organization. Management directs the Communications Officer regarding which information should be disclosed to the media and determines the level of interaction between the Legal Representative and law enforcement agencies.

Responding to an Incident

In the event of an incident, the CSIRT will coordinate a response from the core CSIRT and will communicate with the associate members of the CSIRT. The following table shows the responsibilities of these individuals during the incident response process.

Responsibilities of CSIRT During the Incident Response Process

Activity                                 CSIRT Incident Lead  IT Contact  Legal Representative  Communications Officer  Management
Initial Assessment                       Owner                Advises     None                  None                    None
Initial Response                         Owner                Implements  Updates               Updates                 Updates
Collects Forensic Evidence               Implements           Advises     Owner                 None                    None
Implements Temporary Fix                 Owner                Implements  Updates               Updates                 Advises
Sends Communication                      Advises              Advises     Advises               Implements              Owner
Check with Local Law Enforcement         Updates              Updates     Implements            Updates                 Owner
Implements Permanent Fix                 Owner                Implements  Updates               Updates                 Updates
Determines Financial Impact on Business  Updates              Updates     Advises               Updates                 Owner
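
The responsibilities table above can also be kept in machine-readable form so that a response checklist or paging tool can look up who owns each activity. The following Python sketch is one possible encoding, not a prescribed format. The names `ROLES`, `MATRIX`, `owner_of`, and `duties_for` are hypothetical; the cell values are copied from the table.

```python
# Column order of the responsibilities table.
ROLES = [
    "CSIRT Incident Lead",
    "IT Contact",
    "Legal Representative",
    "Communications Officer",
    "Management",
]

# One row per activity; values follow the ROLES column order.
MATRIX = {
    "Initial Assessment": ["Owner", "Advises", "None", "None", "None"],
    "Initial Response": ["Owner", "Implements", "Updates", "Updates", "Updates"],
    "Collects Forensic Evidence": ["Implements", "Advises", "Owner", "None", "None"],
    "Implements Temporary Fix": ["Owner", "Implements", "Updates", "Updates", "Advises"],
    "Sends Communication": ["Advises", "Advises", "Advises", "Implements", "Owner"],
    "Check with Local Law Enforcement": ["Updates", "Updates", "Implements", "Updates", "Owner"],
    "Implements Permanent Fix": ["Owner", "Implements", "Updates", "Updates", "Updates"],
    "Determines Financial Impact on Business": ["Updates", "Updates", "Advises", "Updates", "Owner"],
}


def owner_of(activity: str) -> str:
    """Return the role that owns the given activity."""
    return ROLES[MATRIX[activity].index("Owner")]


def duties_for(role: str) -> dict[str, str]:
    """Return every activity in which the role has a part, with its duty."""
    column = ROLES.index(role)
    return {
        activity: row[column]
        for activity, row in MATRIX.items()
        if row[column] != "None"
    }
```

For example, `owner_of("Collects Forensic Evidence")` returns "Legal Representative", which matches the table.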

Defining an Incident Response Plan

All members of your IT environment should be aware of what to do in the event of an incident. The CSIRT will perform most actions in response to an incident, but all levels of your IT staff should be aware of how to report incidents internally. End users should report suspicious activity to the IT staff directly or through a help desk rather than directly to the CSIRT.

Every team member should review the incident response plan in detail. Having the plan easily accessible to all IT staff will help to ensure that when an incident does occur, the right procedures are followed.

To carry out a successful incident response plan, you should:

  • Make an initial assessment.

  • Communicate the incident.

  • Contain the damage and minimize the risk.

  • Identify the type and severity of the compromise.

  • Protect evidence.

  • Notify external agencies if appropriate.

  • Recover systems.

  • Compile and organize incident documentation.

  • Assess incident damage and cost.

  • Review the response and update policies.

These steps are not purely sequential. Rather, they happen throughout the incident. For example, documentation starts at the very beginning and continues throughout the entire life cycle of the incident; communication also happens throughout the entire incident.

Other aspects of the process will work alongside each other. For example, as part of your initial assessment, you will gain an idea of the general nature of the attack. It is important to use this information to contain the damage and minimize risk as soon as possible. If you act quickly, you can help to save time and money, and your organization's reputation.

However, until you understand the type and severity of the compromise in more detail, you will not be able to be truly effective in containing the damage and minimizing the risk. An overzealous response could even cause more damage than the initial attack. By working these steps alongside each other, you will get the best compromise between swift and effective action.

Note: It is very important that you thoroughly test your incident response process before an incident occurs. Without thorough testing, you cannot be confident that the measures that you have in place will be effective in responding to incidents.

Making an Initial Assessment

Many activities could indicate a possible attack on your organization. For example, a network administrator performing legitimate system maintenance might appear similar to someone launching some form of attack. In other cases, a badly configured system might lead to a number of false positives in an intrusion detection system, which could make it more difficult to spot genuine incidents.

As part of your initial assessment, you should:

  • Take steps to determine whether you are dealing with an actual incident or a false positive.

  • Gain a general idea of the type and severity of attack. You should gather at least enough information to begin communicating it for further research and to begin containing the damage and minimizing the risk.

  • Record your actions thoroughly. These records will later be used for documenting the incident (whether actual or false).

Note: You should avoid false positives whenever possible; however, it is always better to act on a false positive than fail to act on a genuine incident. Your initial assessment should, therefore, be as brief as possible, yet still eliminate obvious false positives.

Communicating the Incident

Once you suspect that there is a security incident, you should quickly communicate the breach to the rest of the core CSIRT. The incident lead, along with the rest of the team, should quickly identify who needs to be contacted outside of the core CSIRT. This will help to ensure that appropriate control and incident coordination can be maintained, while minimizing the extent of the damage.

Be aware that damage can come in many forms, and that a headline in the newspaper describing a security breach can be much more destructive than many system intrusions. For this reason, and to prevent an attacker from being tipped off, only those playing a role in the incident response should be informed until the incident is properly controlled. Based on the unique situation, your team will later determine who needs to be informed of the incident. This could be anyone from specific individuals up to the entire company and external customers. Communication externally should be coordinated with the Legal Representative.

Containing the Damage and Minimizing the Risks

By acting quickly to reduce the actual and potential effects of an attack, you can make the difference between a minor incident and a major one. The exact response will depend on your organization and the nature of the attack that you face. However, the following priorities are suggested as a starting point:

  1. Protect human life and people's safety. This should, of course, always be your first priority.

  2. Protect classified and sensitive data. As part of your planning for incident response, you should clearly define which data is classified and which is sensitive. This will enable you to prioritize your responses in protecting the data.

  3. Protect other data, including proprietary, scientific, and managerial data. Other data in your environment might still be of great value. You should act to protect the most valuable data first before moving on to other, less useful, data.

  4. Protect hardware and software against attack. This includes protecting against loss or alteration of system files and physical damage to hardware. Damage to systems can result in costly downtime.

  5. Minimize disruption of computing resources (including processes). Although uptime is very important in most environments, keeping systems up during an attack might result in greater problems later on. For this reason, minimizing disruption of computing resources should generally be a relatively low priority.
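
The priority order above can be applied mechanically when several containment tasks compete for attention. The following Python sketch shows one way to do that. The category keys, the `ContainmentTask` class, and the `order_tasks` function are hypothetical names chosen for illustration; only the ranking itself comes from the list above.

```python
from dataclasses import dataclass

# Ranks mirror the numbered priorities above (1 is the most urgent).
# The key names are illustrative assumptions.
PRIORITY = {
    "human_safety": 1,
    "classified_or_sensitive_data": 2,
    "other_data": 3,
    "hardware_and_software": 4,
    "computing_availability": 5,
}


@dataclass(frozen=True)
class ContainmentTask:
    description: str
    category: str  # must be one of the PRIORITY keys


def order_tasks(tasks: list[ContainmentTask]) -> list[ContainmentTask]:
    """Return tasks sorted by priority; ties keep their original order."""
    return sorted(tasks, key=lambda task: PRIORITY[task.category])
```

Because Python's sort is stable, tasks in the same category stay in the order in which they were reported, which leaves room for the Incident Lead's judgment within a category.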

There are a number of measures that you can take to contain the damage and minimize the risk to your environment. At a minimum, you should:

  • Try to avoid letting attackers know that you are aware of their activities. This can be difficult, because some essential responses might alert attackers. For example, if there is an emergency meeting of the CSIRT, or you require an immediate change of all passwords, any internal attackers might know that you are aware of an incident.

  • Compare the cost of taking the compromised and related systems offline against the risk of continuing operations. In the vast majority of cases, you should immediately take the system off the network. However, you might have service agreements in place that require keeping systems available even with the possibility of further damage occurring. Under these circumstances, you can choose to keep a system online with limited connectivity in order to gather additional evidence during an ongoing attack.
    In some cases, the damage and scope of an incident might be so extensive that you might have to take action that invokes the penalty clauses specified in your service level agreements. In any case, it is very important that the actions that you will take in the event of an incident are discussed in advance and outlined in your response plan so that immediate action can be taken when an attack occurs.

  • Determine the access point(s) used by the attacker and implement measures to prevent future access. Measures might include disabling a modem, adding access control entries to a router or firewall, or increasing physical security measures.

  • Consider rebuilding a fresh system with new hard disks (the existing hard disks should be removed and put in storage as these can be used as evidence if you decide to prosecute attackers). Ensure that you change any local passwords. You should also change administrative and service account passwords elsewhere in your environment.

Identifying the Severity of the Compromise

To be able to recover effectively from an attack, you need to determine how seriously your systems have been compromised. This will determine how to further contain and minimize the risk, how to recover, how quickly and to whom you communicate the incident, and whether to seek legal redress.

You should attempt to:

  • Determine the nature of the attack (this might be different than the initial assessment suggests).

  • Determine the attack point of origin.

  • Determine the intent of the attack. Was the attack specifically directed at your organization to acquire specific information, or was it random?

  • Identify the systems that have been compromised.

  • Identify the files that have been accessed and determine the sensitivity of those files.

By performing these actions, you will be able to determine the appropriate responses for your environment. A good incident response plan will outline specific procedures to follow as you learn more about the attack. Generally, the nature of the attack symptoms will determine the order in which you follow the procedures defined in your plan. Since time is crucial, less time-consuming procedures should generally be carried out before more lengthy ones. To help determine the severity of the compromise, you should:

  • Contact other members of the response team to inform them of your findings, have them verify your results, determine whether they are aware of related or other potential attack activity, and help identify whether the incident is a false positive. In some cases, what might appear to be a genuine incident on initial assessment will prove to be a false positive.

  • Determine whether unauthorized hardware has been attached to the network or whether there are any signs of unauthorized access through the compromise of physical security controls.

  • Examine key groups (domain administrators, administrators, and so on) for unauthorized entries.

  • Search for security assessment or exploitation software. Cracking utilities are often found on compromised systems during evidence gathering.

  • Look for unauthorized processes or applications currently running or set to run using the startup folders or registry entries.

  • Search for gaps in, or the absence of, system logs.

  • Review intrusion detection system logs for signs of intrusion, which systems might have been affected, methods of attack, time and length of attack, and the overall extent of potential damage.

  • Examine other log files for unusual connections; security audit failures; unusual security audit successes; failed logon attempts; attempts to log on to default accounts; activity during nonworking hours; file, directory, and share permission changes; and elevated or changed user permissions.

  • Compare systems to previously conducted file/system integrity checks. This enables you to identify additions, deletions, modifications, and permission and control modifications to the file system and registry. You can save a lot of time when responding to incidents if you identify exactly what has been compromised and what areas need to be recovered.

  • Search for sensitive data, such as credit card numbers and employee or customer data, that might have been moved or hidden for future retrieval or modifications. You might also have to check systems for non-business data, illegal copies of software, and e-mail or other records that might assist in an investigation. If there is a possibility of violating privacy or other laws by searching on a system for investigative purposes, you should contact your legal department before you proceed.

  • Match the performance of suspected systems against their baseline performance levels. This of course presupposes that baselines have been created and properly updated.
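
One item in the list above, searching for gaps in system logs, lends itself to a simple automated check. The following Python sketch assumes that log entry timestamps have already been parsed into `datetime` objects; the function name `find_log_gaps` and the silence threshold are hypothetical. It flags any period in which consecutive entries are further apart than the threshold.

```python
from datetime import datetime, timedelta


def find_log_gaps(
    timestamps: list[datetime], max_silence: timedelta
) -> list[tuple[datetime, datetime]]:
    """Return (last_entry_before_gap, first_entry_after_gap) pairs.

    A gap is any pair of consecutive entries separated by more than
    max_silence. The threshold is an assumption to tune per system.
    """
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_silence:
            gaps.append((earlier, later))
    return gaps
```

A gap is not proof of tampering, because quiet systems log little outside working hours. It is a prompt to compare the period against other evidence, such as intrusion detection system logs.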

When determining which systems have been compromised and how, you will generally be comparing your systems against a previously recorded baseline of the same system before it was compromised. Assuming that a recent system shadow copy is sufficient for comparison might put you in a difficult situation if the previous shadow copy comes from a system that has already been attacked.
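
The baseline comparison described above can be sketched as a comparison of two manifests, each mapping a file path to a content digest. The following Python example assumes that such manifests already exist (for instance, SHA-256 digests recorded before the compromise and stored offline); the function name `compare_to_baseline` and the manifest format are hypothetical, and this is not how any particular integrity product works.

```python
def compare_to_baseline(
    baseline: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """Classify paths as added, deleted, or modified relative to a baseline.

    Both arguments map a file path to a content digest. The baseline is
    assumed to have been recorded before the system was compromised.
    """
    baseline_paths = set(baseline)
    current_paths = set(current)
    return {
        "added": sorted(current_paths - baseline_paths),
        "deleted": sorted(baseline_paths - current_paths),
        "modified": sorted(
            path
            for path in baseline_paths & current_paths
            if baseline[path] != current[path]
        ),
    }
```

The result is only as trustworthy as the baseline. If the baseline was recorded after the attacker gained access, malicious files will appear as unchanged.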

Note: Tools such as EventCombMT, DumpEL, and Microsoft Operations Manager (MOM) can help you determine how much a system has been attacked. Third-party intrusion detection systems give advance warning of attacks, and other tools will show file changes on your systems.

Protecting Evidence

In many cases, if your environment has been deliberately attacked, you may want to take legal action against the perpetrators. In order to preserve this option, you should gather evidence that can be used against them, even if a decision is ultimately made not to pursue such action. It is extremely important to back up the compromised systems as soon as possible. Back up the systems prior to performing any actions that could affect data integrity on the original media.

Someone skilled in computer forensics should make at least two complete bit-for-bit backups of the entire system using new, never-before-used media. At least one backup should be on a write-once, read-many media such as a CD-R or DVD-R. This backup should be used only for prosecution of the offender and should be physically secured until needed.

The other backup can be used for data recovery. The backup reserved for prosecution should not be accessed except for legal purposes, so you should physically secure it. You will also need to document information about the backups, such as who backed up the systems, at what time, how they were secured, and who had access to them.
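
The backup documentation described above (who made the backup, when, how it was secured, and who had access) can be kept as a structured record alongside a cryptographic digest of the image. The following Python sketch is an illustration under stated assumptions: the `CustodyRecord` class and its field names are hypothetical, and a real disk image would be hashed from the media in chunks rather than passed in memory as bytes.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CustodyRecord:
    """Hypothetical record of one forensic backup and who has handled it."""

    image_sha256: str
    created_by: str
    created_at: datetime
    storage_location: str
    access_log: list[tuple[datetime, str, str]] = field(default_factory=list)

    def record_access(self, when: datetime, person: str, reason: str) -> None:
        """Append one access event; entries are never edited or removed."""
        self.access_log.append((when, person, reason))

    def matches(self, image_bytes: bytes) -> bool:
        """Check that the image still has the digest recorded at creation."""
        return hashlib.sha256(image_bytes).hexdigest() == self.image_sha256
```

Recomputing the digest each time the media is handled gives a simple way to show that the evidence has not changed since it was collected. It does not replace signed, dated paper documentation where your legal counsel requires it.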

Once the backups are performed, you should remove the original hard disks and store them in a physically secure location. These disks can be used as forensic evidence in the event of a prosecution. New hard disks should be used to restore the system.

In some cases, the benefit of preserving data might not equal the cost of delaying the response and recovery of the system. The costs and benefits of preserving data should be compared to those of faster recovery for each event.

For extremely large systems, comprehensive backups of all compromised systems might not be feasible. Instead, you should back up all logs and selected, breached portions of the system.

If possible, back up system state data, as well. It can take months or years until prosecution takes place, so it is important to have as much detail of the incident archived for future use.

Often the most difficult legal aspect of prosecuting a cyber crime is collecting evidence in a manner acceptable to the particular jurisdiction's laws of evidence submission. Hence, the most critical component to the forensic process is detailed and complete documentation of how systems were handled, by whom, and when, in order to demonstrate reliable evidence. Sign and date every page of the documentation.

Once you have working, verified backups, you can wipe the infected systems and rebuild them. This will enable you to begin running your operation again. The backups provide the critical, untainted evidence required for prosecution. A different backup than the forensic backup should be used to restore data.

Notifying External Agencies

After the incident has been contained and data preserved for potential prosecution, you should consider whether you need to start notifying appropriate external entities. All external disclosures should be coordinated with your Legal Representative. Potential agencies include local and national law enforcement, external security agencies, and virus experts. External agencies can provide technical assistance, offer faster resolution and provide information learned from similar incidents to help you fully recover from the incident and prevent it from occurring in the future.

For particular industries and types of breaches, you might have to notify customers and the general public, particularly if customers might be affected directly by the incident.

If the event caused substantial financial impact, you might want to report the incident to law enforcement agencies.

For higher profile companies and incidents, the media might be involved. Media attention to a security incident is rarely desirable, but it is often unavoidable. Media attention can enable your organization to take a proactive stance in communicating the incident. At a minimum, the incident response procedures should clearly define the individuals authorized to speak to media representatives.

Normally the public relations department within your organization will speak to the media. You should not attempt to deny to the media that an incident has occurred, because doing so is likely to damage your reputation more than proactive admission and visible responses ever will. This does not mean that you need to notify the media for each and every incident regardless of its nature or severity. You should assess the appropriate media response on a case-by-case basis.

Recovering Systems

How you recover your system will generally depend on the extent of the security breach. You will need to determine whether you can restore the existing system while leaving intact as much as possible, or if it is necessary to completely rebuild the system.

Restoring data presumes, of course, that you have clean backups, that is, backups made before the incident occurred. File integrity software can help pinpoint the first occurrence of damage. If the software alerts you to a changed file, then you know that the backup you made before the alert is a good one and should be preserved for use when rebuilding the compromised system.

An incident could potentially corrupt data for many months prior to discovery. It is, therefore, very important that as part of your incident response process, you determine the duration of the incident. (File/system integrity software and intrusion detection systems can assist you in this.) In some cases, the latest or even several prior backups might not be long enough to get to a clean state, so you should regularly archive data backups in a secure off-site location.
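
Once the start of the incident has been estimated, choosing the backup to restore from is a simple comparison of dates. The following Python sketch assumes that you have a list of backup completion times and an estimate of when the first damage occurred; the function name `last_clean_backup` is hypothetical.

```python
from datetime import datetime
from typing import Optional


def last_clean_backup(
    backup_times: list[datetime], first_compromise: datetime
) -> Optional[datetime]:
    """Return the newest backup taken before the first sign of compromise.

    Returns None when every available backup is at or after that time,
    which means that an older archived backup must be located.
    """
    candidates = [taken for taken in backup_times if taken < first_compromise]
    return max(candidates) if candidates else None
```

Treat the estimated compromise time as a latest possible start rather than a confirmed one. If later analysis moves the start of the incident earlier, rerun the selection before restoring.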

Compiling and Organizing Incident Evidence

The CSIRT should thoroughly document all processes when dealing with any incident. This should include a description of the breach and details of each action taken (who took the action, when they took it, and the reasoning behind it). Everyone who had access to the affected systems or the evidence must be noted throughout the response process.

Afterward, the documentation should be chronologically organized, checked for completeness, and signed and reviewed with management and legal representatives. You will also need to safeguard the evidence collected in the protect evidence phase. You should consider having two people present during all phases who can sign off on each step. This will help reduce the likelihood of evidence being inadmissible and the possibility of evidence being modified afterward.
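
The chronological ordering and completeness check described above can be sketched in a few lines. The following Python example assumes that each action was recorded with a time, a person, a description, and a reason, as the preceding paragraphs recommend; the `ActionEntry` class and the `organize` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ActionEntry:
    when: datetime
    who: str
    action: str
    reason: str


def organize(
    entries: list[ActionEntry],
) -> tuple[list[ActionEntry], list[ActionEntry]]:
    """Return (entries in time order, entries missing who, action, or reason)."""
    ordered = sorted(entries, key=lambda entry: entry.when)
    incomplete = [
        entry
        for entry in ordered
        if not (entry.who.strip() and entry.action.strip() and entry.reason.strip())
    ]
    return ordered, incomplete
```

Entries returned in the second list should be completed by the person who took the action while the details are still fresh, before the documentation is signed.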

Remember that the offender might be an employee, contractor, temporary employee, or other insider within your organization. Without thorough, detailed documentation, identifying an inside offender will be very difficult. Proper documentation also gives you the best chance of prosecuting offenders.

Assessing Incident Damage and Cost

When determining the damage to your organization, you should consider both direct and indirect costs. Incident damage and costs will be important evidence needed if you decide to pursue any legal action. These could include:

  • Costs due to the loss of competitive edge from the release of proprietary or sensitive information.

  • Legal costs.

  • Labor costs to analyze the breaches, reinstall software, and recover data.

  • Costs relating to system downtime (for example, lost employee productivity, lost sales, replacement of hardware, software, and other property).

  • Costs relating to repairing and possibly updating damaged or ineffective physical security measures (locks, walls, cages, and so on).

  • Other consequential damages such as loss of reputation or customer trust.
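
A rough first estimate of the direct costs listed above is simple arithmetic. The following Python sketch is illustrative only; the function name `incident_cost`, its parameters, and the figures in the usage example are hypothetical, and indirect costs such as loss of reputation are not captured by it.

```python
def incident_cost(
    labor_hours: float,
    hourly_rate: float,
    downtime_hours: float,
    revenue_per_hour: float,
    other_costs: dict[str, float],
) -> dict[str, float]:
    """Return a breakdown of direct incident costs and their total.

    labor:    staff time spent analyzing, reinstalling, and recovering
    downtime: sales or productivity lost while systems were unavailable
    other:    itemized costs such as legal fees or replacement hardware
    """
    labor = labor_hours * hourly_rate
    downtime = downtime_hours * revenue_per_hour
    other = sum(other_costs.values())
    return {
        "labor": labor,
        "downtime": downtime,
        "other": other,
        "total": labor + downtime + other,
    }
```

For example, 40 hours of labor at 75 per hour, 6 hours of downtime at 2,000 per hour, and 6,500 in legal and hardware costs give a total of 3,000 + 12,000 + 6,500 = 21,500.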

Reviewing Response and Updating Policies

Once the documentation and recovery phases are complete, you should review the process thoroughly. Determine with your team which steps were executed successfully and which mistakes were made. In almost all cases, you will find some processes that need to be modified so you can better handle future incidents.

You will find weaknesses in your incident response plan. That is the point of this post-mortem exercise: you are looking for opportunities for improvement, which should initiate a whole new round of the incident response planning process.

Much of this document has dealt with measures that you can take to minimize the risk of being attacked. However, organizations have the most success at reaching their security goals when they do everything that they can to minimize their chances of attack, and then plan what they will do when they are attacked. Part of this process is to audit carefully for attack. Another equally important part is to have a clearly defined, well-rehearsed set of responses that you can put into place if an attack does occur.

For more information about security, see the following:

  • The Internet Security Guidebook: From Planning to Deployment by Juanita Ellis and Tim Speed (Academic Press, ISBN: 0223747).