Hacking: Fight Back
The Day After: Your First Response To A Security Breach
Kelly J. Cooper
At a Glance:
- Defining a post-mortem
- Types of incidents to submit to a post-mortem review
- Organizing the post-mortem
- Managing and facilitating the meeting
- Topics to cover
- Results and follow-ups
- Integration of newfound knowledge into your company
The security incident is over. The techs have all gone home and are snug in their beds, dreaming of flawless code trees and buffer-overflow repellent. Upper management has done all the damage control they can. Everyone's shifting back into their normal activities and schedules. Everyone, that is, except you. What can you do to prevent this from ever happening again?
The best way to understand how a security incident happened is to conduct a post mortem. Incidents can range from an internal configuration error that resulted in system downtime, all the way up through an attack on your company, or even a natural disaster that impacted your company's physical location. Any event that didn't go as well as you hoped, or any set of processes that need to be checked, is a perfect candidate for a post mortem.
A post mortem is a review of what happened; a good post mortem delves into the who, what, how, when, and why of the incident. Even if the incident was clearly documented at the time, you're still going to need to review how things could have gone better in order to improve your processes, tools, and training for the future. These improvements may not prevent all future attacks, but they will allow you to prepare your business for the next incident.
You need to schedule your post mortem as soon as possible after the incident. Give everyone the opportunity to recover first (especially if people need to catch up on sleep), but don't wait too long. Get everyone who was actively involved in the incident, or at least a representative from each person's group, into a room. You may not be able to schedule time with any upper-level executives who participated, but you can touch base with them later. In fact, their presence can hinder an open dialogue, so carbon-copy them on the invite, but don't require their presence.
Have all participants bring whatever notes they may have made. If you have a trouble ticket or a timeline or any kind of documentation of the incident, print it out and provide copies to everyone. Be sure to mark the printouts as confidential. At the beginning of the meeting, tell people either to hand them back at the conclusion of the meeting or keep the materials in a locked drawer. Many people would rather hand the papers back.
Make the confidentiality issue clear at the beginning of the meeting so that any notes the participants might take for themselves aren't written on the printout that they then decide they want to hand back. After the meeting, shred any returned documentation. You don't want dumpster divers getting their hands on the details of your security problems. If you don't have a shredder, buy one. You'll be surprised by how many people will use shredders once they're available to them.
Post mortems can be extremely emotional. There's a tendency to fling blame around the table and around the company. Your job is to minimize the emotional outbursts by steering the meeting so that everyone can draw as much useful information as possible. You may need to discuss an amnesty agreement with the group, where you trade a promise of no firings in exchange for honesty. This may sound impractical in some work environments, but it should be seriously considered. You might also think about modeling such a policy on existing whistleblower policies and legislation.
The first thing to consider when designing your agenda is the structure. The easiest and the simplest to follow is chronological: what happened first, next, and last? Who did what and when? How long between the first and second event?
A structured agenda is useful for bringing people back on topic when they start to go off on tangents. Of course, remember that nothing ever goes according to plan. Leave room for complex discussions and be willing to follow up with specific individuals outside of the meeting in order to put a tangent aside, at least temporarily.
As the meeting coordinator, you also have your own agenda. Aside from whatever political pressure you may be under, you also have a responsibility to compile data that will allow your company to be better prepared for the next incident. You may have to create an incident response process from scratch, including training and documentation. Look at what worked and what didn't throughout the post mortem as the basis for your process. If an incident handling process already exists, look for areas where education or improvements are needed.
Starting Your Analysis
First, you have to get the meeting started. The sidebar "Five Starter Questions" can help kick off the discussion.
Five Starter Questions
- Does anyone have a timeline representing the incident?
- When was the incident first reported? That is, when was a problem reported, not necessarily as a security issue?
- What was the mechanism of discovery? Was it a user complaint, system malfunction, or monitoring notice?
- When was the situation recognized as a security incident?
- How was it determined to be a security incident?
If there is no timeline, you'll probably have to piece this material together from time stamps on trouble ticket entries and e-mail messages. If none of this material is available in the meeting, discuss the generalities and try to establish specifics after the meeting is over. Make a note for the future that creating a timeline and fully documenting the incident should be part of the incident response process.
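As a minimal sketch of piecing a timeline together after the fact, assume the trouble ticket entries and e-mail messages have been exported as (timestamp, source, note) records. That record layout, the sample data, and the `build_timeline` name are hypothetical and used only for illustration.

```python
from datetime import datetime

def build_timeline(*sources):
    """Merge timestamped records from several sources into one chronological list.

    Each record is an (iso_timestamp, source, note) tuple. This layout is an
    assumption for illustration, not a real ticketing or mail export format.
    """
    merged = [
        (datetime.fromisoformat(stamp), source, note)
        for records in sources
        for stamp, source, note in records
    ]
    # Sort by timestamp only, so records with equal times keep their input order.
    return sorted(merged, key=lambda event: event[0])

# Hypothetical sample data, invented for this example.
ticket_entries = [
    ("2008-03-04T09:15:00", "ticket", "User reports slow Web site"),
    ("2008-03-04T10:40:00", "ticket", "Escalated to security team"),
]
email_messages = [
    ("2008-03-04T09:55:00", "e-mail", "Operator notes saturated uplink"),
]

timeline = build_timeline(ticket_entries, email_messages)
for when, source, note in timeline:
    print(f"{when:%Y-%m-%d %H:%M}  [{source}]  {note}")
```

Keeping the source of each entry next to its timestamp makes it easier to go back to the original ticket or message when someone in the meeting disputes the order of events.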
Once you've determined when and how the issue was recognized as a security incident, you may be able to parlay that information into some sort of early warning system and teach the rest of the staff to recognize the symptoms.
For instance, in the early days of Denial of Service (DoS) attacks, the monitoring centers of ISPs noticed an upswing in outages due to sudden bandwidth saturation. Most were due to one or two types of DoS attacks. Once this was recognized, all operators were trained to look for a DoS attack in the course of their normal investigation of an outage or reported problem.
As different attacks evolved, the symptoms of each were broken down and provided to the operators. This became an early warning system that often allowed ISPs to notice that customers were under attack before the customers themselves were aware of the escalating problem.
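The kind of symptom check an operator might be taught can be sketched in a few lines. This is only an illustration, assuming per-minute utilization samples expressed as a fraction of link capacity. The function name, the thresholds, and the sample readings are hypothetical and would need tuning for a real network.

```python
def find_saturation_spikes(samples, baseline_window=5, jump=0.4, ceiling=0.9):
    """Return indexes of readings that look like sudden bandwidth saturation.

    samples: utilization readings between 0.0 and 1.0 (an assumed layout).
    A reading is flagged when it is at or above `ceiling` and exceeds the
    average of the preceding `baseline_window` readings by at least `jump`.
    """
    spikes = []
    for i in range(baseline_window, len(samples)):
        window = samples[i - baseline_window:i]
        baseline = sum(window) / len(window)
        if samples[i] >= ceiling and samples[i] - baseline >= jump:
            spikes.append(i)
    return spikes

# Hypothetical per-minute readings: steady traffic, then sudden saturation.
readings = [0.30, 0.32, 0.29, 0.31, 0.33, 0.35, 0.97, 0.99, 0.98]
spikes = find_saturation_spikes(readings)
print(spikes)
```

A flagged reading is a prompt for an operator to consider a DoS attack during a normal outage investigation, not proof that one is under way.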
Discussing the Incident
Once you have a good idea of what happened and how it was recognized, assess the quality of the response by asking some of the questions in the "Post-Mortem Discussion Points" sidebar.
Listen closely to people's complaints. Gripe sessions are the best source for understanding company friction, whether it's interpersonal issues, tool malfunctions, or arcane and frustrating processes. If people don't like a process or tool, they won't use it. They'll even circumvent it without considering the possible consequences.
Once the security incident was recognized, how much time elapsed before it was resolved? This is a simple question, but it may have a very complex answer. Depending on the type of incident (virus infestation, e-mailed Trojan, DoS attack, insider exploitation) and how widely its impact was felt across the company, cleanup could take hours or days. In fact, cleanup may still be happening while you're having your meeting. You need to decide what marks the end of your incident. Otherwise, events that are holding you up, like waiting for a patch from a vendor, will continue to show up in the documentation of the incident. To avoid this, make sure you close this high-priority ticket and open a separate ticket at a lower priority to track any long-term events.
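Once you have agreed on what marks the end of the incident, the elapsed times fall out of simple arithmetic. The sketch below assumes three milestones have been established in the meeting. The milestone names and timestamps are hypothetical.

```python
from datetime import datetime

# Hypothetical milestones for one incident; names and times are illustrative.
milestones = {
    "first_reported": datetime(2008, 3, 4, 9, 15),
    "recognized_as_security_incident": datetime(2008, 3, 4, 10, 40),
    "resolved": datetime(2008, 3, 4, 16, 5),
}

def elapsed_between(milestones, start, end):
    """Return the timedelta between two named milestones."""
    return milestones[end] - milestones[start]

time_to_recognize = elapsed_between(
    milestones, "first_reported", "recognized_as_security_incident")
time_to_resolve = elapsed_between(
    milestones, "recognized_as_security_incident", "resolved")

print("Report to recognition:", time_to_recognize)
print("Recognition to resolution:", time_to_resolve)
```

Recording the same intervals for every incident gives you a baseline against which later process changes can be judged.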
For a software exploit, you should examine your company's patch process and possibly your firewall configuration. For a viral or Trojan-based infection, you should scrutinize your antivirus software as well as its update schedule on individual computers. You may also want to assess whether viruses and Trojans are filtered at the mail server and if so, how it's done.
Post-Mortem Discussion Points
- Could the problem have been identified faster?
- Could you have realized it was a security incident sooner?
- Could you have stopped the problem earlier?
- What would have helped speed up any of these processes?
- Are you lacking a run book? A process? A tool? The right skill set? The right people on-call?
- Do you have sufficient resources to handle these attacks? Do you have enough people to look at the system logs, the firewall logs, and Intrusion Detection System (IDS) reports?
- Are you using software to analyze these logs and pull out relevant data to minimize the mind-crushing boredom of going through each by hand?
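The last discussion point, pulling relevant data out of logs with software, can be as simple as a pattern filter. The following is a minimal sketch, not a substitute for a real log analysis tool. The patterns, the `extract_relevant` name, and the log excerpt are hypothetical.

```python
import re

# Hypothetical patterns; the real ones depend on what your systems log.
SUSPICIOUS_PATTERNS = [
    re.compile(r"failed password", re.IGNORECASE),
    re.compile(r"denied", re.IGNORECASE),
    re.compile(r"port scan", re.IGNORECASE),
]

def extract_relevant(lines, patterns=SUSPICIOUS_PATTERNS):
    """Return (line_number, line) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in patterns):
            hits.append((number, line.rstrip("\n")))
    return hits

# Hypothetical log excerpt, invented for illustration.
sample_log = [
    "Mar  4 09:14:58 gw sshd: Accepted password for alice\n",
    "Mar  4 09:15:02 gw sshd: Failed password for root from 192.0.2.10\n",
    "Mar  4 09:15:07 fw kernel: DENIED tcp 192.0.2.10 -> 198.51.100.5:22\n",
    "Mar  4 09:16:00 gw cron: job completed\n",
]

hits = extract_relevant(sample_log)
for number, line in hits:
    print(f"{number}: {line}")
```

Keeping the original line numbers lets a reviewer jump back to the surrounding entries in the full log when a hit needs context.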
Each kind of incident may have a best response, but what is best can vary based on the company's network architecture and information technology design. Consider what might be needed to improve the prevention, detection, and response processes. The first answers that come to mind to solve these problems are more personnel, more education, and better tools.
Did any one person coordinate your company's response to this incident? If you don't have an incident coordinator, consider training several employees to handle this job. Designating a single point of contact for updates is very helpful. For instance, the coordinator can collect data from the various people working on the problem and report it to upper management. This means staff members only have to report to one person, instead of dozens calling them to ask for a status. One person can maintain a timeline and update the trouble ticket, keeping a consistent voice in customer and/or company communication. If you do have an incident coordinator, work with this person to train others. One person cannot be on call constantly to handle any incident that might occur.
Were You Targeted?
Was your company targeted specifically or was this a random attack? The answer could be crucial for future prevention, but difficult to determine. If the attack was targeted, it could be because of your company's politics or affiliations. In the current climate, companies supporting Genetically Modified Organisms (GMO) or the World Trade Organization (WTO) are common targets. Anyone seen as supporting spam may also become a target. High-profile partnerships with controversial organizations can often bring negative attention. For instance, the Electronic Disturbance Theater, which created the Floodnet program, was active through 2003 publicizing various causes by using Floodnet to overwhelm high-profile Web sites. Mexican government Web sites were targeted in support of labor and indigenous rights in Mexico, and biotech-related Web sites were attacked to protest GMO foods. But many attacks are never publicized.
Your company may also have been targeted because of a particular individual. Many DoS attacks occur when one person gets angry while chatting with another over Internet Relay Chat (IRC). The angry party may sign up the target of his ire for a barrage of e-mail lists, give the target's e-mail address to spammers, or launch a DoS attack against the target's IP address, which also happens to be one of your company's IP addresses. Some employees run whole IRC servers on their company's networks, which tends to attract many attacks. Most ISPs and many companies can tell war stories about any of these types of attack, although very few accounts are actually published. Data privately gathered from chat channels shows that these trends persist despite changes in technology and politics.
Look at the target of the attack and study the possible reasons that it was chosen. Was it your Web page? If so, it's likely the attack was directed at your company specifically. Run some searches and look for calls to action against your company. Consider any public announcements your company has made recently.
Was the target your e-mail server? If so, is it possible that your marketing department sent out a large number of unsolicited e-mails, particularly ones that might be considered spam? Often if a company doesn't provide double opt-in (usually using a confirmation e-mail to make sure that the owner of the e-mail address really wants to be on your mailing list), it can find itself the target of much anger. If you haven't sent out e-mail for a long time, people may have forgotten that they provided their e-mail addresses to your company. Or perhaps your mail server was exploited to relay spam, framing your company for the deed. Spam, actual or perceived, makes people cranky.
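For readers unfamiliar with double opt-in, the idea can be sketched as follows. This is an in-memory illustration under stated assumptions, not a production design: the `MailingList` class and its method names are hypothetical, and actually sending the confirmation e-mail is left out.

```python
import secrets

class MailingList:
    """In-memory sketch of double opt-in; not a production design."""

    def __init__(self):
        self.pending = {}       # confirmation token -> e-mail address
        self.confirmed = set()  # addresses whose owners have confirmed

    def request_subscription(self, address):
        """Record a pending request and return the token to e-mail out."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address
        return token

    def confirm(self, token):
        """Add the address only if the token from the e-mail comes back."""
        address = self.pending.pop(token, None)
        if address is None:
            return False
        self.confirmed.add(address)
        return True

mailing_list = MailingList()
token = mailing_list.request_subscription("user@example.com")
# The address is not on the list until its owner confirms.
was_listed_before = "user@example.com" in mailing_list.confirmed
confirmed_ok = mailing_list.confirm(token)
print(was_listed_before, confirmed_ok)
```

Because the token is sent only to the address being subscribed, someone who signs up another person's address out of spite cannot complete the subscription.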
Was the target an individual user's IP address? Talk to that user about what he or she was doing at the time the attack commenced. If necessary, mirror the hard drive of the machine and perform a forensic analysis if it wasn't already done during the incident handling process.
Hopefully, if a particular machine encouraged the attack, was targeted by the attack, or started the viral or worm infection, then that machine has already been taken offline and had its hard drive mirrored and examined. If this happens so often that you don't have time to look at all of these problem machines, then you've got a larger issue on your hands.
If the target was a particular machine, look closely at the logs, checking to see whether there were any messages left in them. Sometimes attackers design their attacks so that their angry comments appear in the log files.
Was it a random e-mail that started an infection in your company? Chances are that this is a nonspecific attack, although if it's the type of Trojan that grabs files off users' computers and e-mails them to a specific address, it is possible that a competitor hoped to get sensitive documents from your company. Do you have a copy of the Trojan or virus? If so, either have someone on staff look at the code of the malware or run some searches online and read through any documentation of the malware's innards compiled by a reputable security team. When a piece of malware spreads across the Internet and achieves a certain amount of notoriety, it's commonplace that a number of individuals and teams will go over the code line by line and annotate it or write a general report about what nefarious goals the malware is trying to accomplish.
More Discussion Topics
What was the impact of this incident on the company? If there was no actual damage but the company either disconnected itself from the Internet or was forcibly denied service to the Internet, the only real damage might be to the company's reputation. But a loss of trust on the part of customers or investors can very easily translate to loss of business.
In some intrusions, files aren't deleted, but the attacker may have made copies. When files are deleted, you may be able to recover them either from the hard drive or backup media. When files have been copied, there's the possibility that intellectual property has been stolen, which is difficult to detect.
It's always important that your employees have a thorough understanding of why security should be crucial to each of them and what the impact to the whole company could be when they choose to ignore or circumvent security.
Closing the Meeting
The meeting may not come to any sort of natural close, especially if discussions become heated, so make a list of action items. Assign items to the people from whom you need information (such as e-mail timestamps and log files). Follow up with individuals and continue the summary and discussion via e-mail or other group-viewable software. Be willing to provide summarized updates on a regular basis, especially for upper management, but don't overwhelm your constituency with superfluous e-mail.
In closing the meeting, schedule any necessary companywide changes, like a general changing of passwords, which is especially important if the attacker managed to get onto your networks and watch your traffic go by. Even if some of your questions remain unanswered, end the meeting promptly. You and your team will have more than enough work going through the data you've gathered in the meeting and working on creating or improving incident response processes.
If there are questions or issues whose resolution needs active participation from multiple groups, you will have to call another meeting. But if you just need confirmations or input from various groups, try to limit follow-up to e-mail or personal calls to each group. Your main concern should be to address gaps in the various processes, to outline problems (like a lack of tools or a communication failure), and to document any other issues that slow or impede incident response. When reporting your findings, focus on identifying areas for improvement, not on placing blame.
Follow up by educating employees, especially the incident coordinators. Having a group of people who know all the processes and who can guide the various parts of the company to cooperate in response to an issue is important. Work with incident coordinators to fix processes or create new ones. They may also be able to help educate the rest of the company on these processes. You definitely want everyone in the organization to understand at least where to report a suspected problem or concern.
The Big Picture
At the heart of all these issues is one important question: do you have an incident response process that works? The answer is probably yes, although you might not believe it at first. An incident response process can be anything from the phone number of your ISP written on a whiteboard (because you can't access their Web page if your connection is down) to a complex set of steps to follow in an emergency.
Instead of fighting to overlay a whole new process onto a set of people who are probably already working too hard, integrate incident recognition and response handling into the daily work procedure. Going back to the example of ISPs finding DoS attacks, notice that their responses to such events were worked into the normal processes of the operators.
If you've never had a security incident, but you want to apply the lessons of this article, consider having a drill. Invent a fictional security event. Keep it as simple as possible and see how the processes work. You can even conduct a post mortem on the drill. It may be difficult to get people to take you seriously, but it helps if you have the support of management.
Turn your security incident from a possible disaster into a galvanizing event. Let it energize your company and encourage it to improve its incident response processes. It can show your company where the flaws are, not so blame can be apportioned, but instead to allow problems to be fixed. And when you're finished, you'll be prepared for next time. And there will be a next time.
Kelly J. Cooper is a CISSP with nine years' experience in the Internet Service Provider business, specializing in operations security and incident response. She is a founding member of The CooperCain Group, Inc.
© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.