Recovering a Customer from an Active Directory 'Denial of Service'

02/20/2014

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

Background

I got the call one afternoon - I heard "one of our customers has been hacked" and promptly dropped what I was doing. I asked the person on the other end of the phone what was currently down and what they'd done so far to restore service, but it sounded bad, really bad. Email was down, no one could log in to their workstations - even the domain controllers refused all logins. The Active Directory-dependent web application had been down all day, and if it couldn't be revived by the next morning, there would be serious consequences for the entire business.

I raced over to the customer site and sat down with their surprisingly calm and rational triage team. We walked through the symptoms, steps they'd taken to resolve the issue, and what they thought had happened:

The first sign of trouble was that their Exchange email was unavailable as of 4am that day
Their chief sysadmin came in and discovered he couldn't log into any of the three DCs interactively, using any of the administrative accounts at his disposal
The employees weren't able to log into the domain when they got into work that morning
Systems left logged in from the day before were still usable, but only for local applications (e.g. MS Word)
They'd confirmed that all servers were still up and running, but all access to remote resources was failing - no new connections could be made to file shares, Exchange mailboxes, and even Web access was down
The administrators of Active Directory were unable to connect to the domain controllers using the usual MMC tools (e.g. Computer Management, Active Directory Users & Computers, Group Policy) and Terminal Server clients
Worst of all, their hosted web sites were either performing very slowly (with many "Server too busy" errors) or weren't letting users log in at all

By the time I arrived, our customer had spent the entire day on the phone with Product Support Services, and had exhausted all the phone-based diagnoses available in a situation like this. They were preparing to restore one of their domain controllers from backup tape and perform an Authoritative Restore.

How did we proceed to fix the issue?

My customer hadn't yet determined the root cause of the problem, nor had they found a way to restore services, so our goals at this point were to:

Get the web sites up and authenticating users once again
Restore corporate network access
Determine the root cause of the problem
Take steps to prevent future outages

I was pretty sure the problem originated with the Security Policy templates that are stored in the SYSVOL (and are identical in layout to the security templates that you can create using the Security Templates MMC), but I didn't know what had actually happened (we didn't necessarily suspect a targeted attack at the time). I had brought along a copy of WinPE, which we booted on one of the affected DCs, and we searched the %systemroot%\SYSVOL directory for all *.inf files.

We discovered that two of the GptTmpl.inf files - specifically, the two templates that were stored under the GUIDs for the Default Domain Controllers Policy and Default Domain Policy GPOs - had been altered only the previous night (and in fact, altered within three minutes of each other). We immediately loaded them into Notepad (which is available in the WinPE boot image), and quickly confirmed the problem:

Two entries in the security template of the Default Domain Policy had been altered - the SeInteractiveLogonRight and SeNetworkLogonRight entries were both now blank
Two entries in the security template of the Default Domain Controllers Policy had been altered - the SeInteractiveLogonRight was now blank, and the SeDenyInteractiveLogonRight was now populated with *S-1-1-0 [i.e. the Everyone group]
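
The two findings above lend themselves to a quick automated check. What follows is only a minimal sketch, not the procedure we used on site (we read the files in Notepad). It assumes the template text has already been read into a string (GptTmpl.inf files are normally saved as Unicode, so opening them with encoding="utf-16" is a reasonable starting assumption). The function names, the sample text, and the inclusion of SeDenyNetworkLogonRight in the check are illustrative choices, not anything taken from the customer's files.

```python
EVERYONE_SID = "*S-1-1-0"
ALLOW_RIGHTS = ("SeInteractiveLogonRight", "SeNetworkLogonRight")
DENY_RIGHTS = ("SeDenyInteractiveLogonRight", "SeDenyNetworkLogonRight")


def parse_privilege_rights(template_text):
    """Return {right: [principals]} from the [Privilege Rights] section."""
    rights = {}
    in_section = False
    for raw in template_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("["):
            in_section = line.lower() == "[privilege rights]"
            continue
        if in_section and "=" in line:
            name, _, value = line.partition("=")
            rights[name.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return rights


def find_lockout_settings(template_text):
    """List settings that would lock users out in the way described above."""
    findings = []
    rights = parse_privilege_rights(template_text)
    for right in ALLOW_RIGHTS:
        # A right that is defined but empty grants the logon type to nobody.
        if right in rights and not rights[right]:
            findings.append(f"{right} is defined but empty")
    for right in DENY_RIGHTS:
        # Denying a logon type to Everyone overrides any allow entry.
        if EVERYONE_SID in rights.get(right, []):
            findings.append(f"{right} includes Everyone ({EVERYONE_SID})")
    return findings


# Hypothetical template fragment resembling the altered DC policy.
sample = """[Unicode]
Unicode=yes
[Privilege Rights]
SeInteractiveLogonRight =
SeDenyInteractiveLogonRight = *S-1-1-0
SeNetworkLogonRight = *S-1-5-11,*S-1-5-32-544
"""

for finding in find_lockout_settings(sample):
    print(finding)
```

Run against every GptTmpl.inf found under the SYSVOL, a check like this would point at the suspect templates without opening each one by hand.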

Fortunately the AD systems administration team had performed a file-level backup the previous night (before this change had been introduced), so we simply copied the known good templates over these altered templates; this is possible because WinPE mounts the NTFS partitions (including the one containing the SYSVOL) as fully writeable partitions. We also incremented the "version=" parameter in the corresponding gpt.ini files so that all Active Directory clients (i.e. all Windows 2000 member computers) would consider the policies to be new and would download & apply them locally.
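
The version bump can be sketched in a few lines. This is an illustration only, assuming the gpt.ini text has already been read into a string; on the day we edited the files by hand, and the function name and sample value here are hypothetical. The version value combines a user counter in the upper 16 bits with a computer counter in the lower 16 bits, so adding 1 advances the computer side, which is where security template settings belong.

```python
import re

# Matches a single "Version=<number>" line, keeping any trailing carriage
# return so Windows line endings survive the edit unchanged.
_VERSION_LINE = re.compile(
    r"^([ \t]*version[ \t]*=[ \t]*)(\d+)([ \t]*\r?)$",
    re.IGNORECASE | re.MULTILINE,
)


def bump_gpt_version(ini_text, step=1):
    """Return ini_text with the Version= value increased by `step`."""

    def _bump(match):
        new_value = int(match.group(2)) + step
        return f"{match.group(1)}{new_value}{match.group(3)}"

    new_text, count = _VERSION_LINE.subn(_bump, ini_text)
    if count != 1:
        raise ValueError(f"expected exactly one Version= line, found {count}")
    return new_text


# Hypothetical gpt.ini contents.
sample_ini = "[General]\nVersion=65539\n"
print(bump_gpt_version(sample_ini))
```

Refusing to proceed unless exactly one Version= line is found is a deliberate choice: a file that looks unexpected is better inspected by hand than rewritten blindly.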

We rebooted the domain controller and within about an hour we had the network back up and running.

What happened?

In a situation where precise administrator-level changes have been made, it is difficult to be sure (a) precisely how access was gained, (b) how the changes were made and (c) what other undiscovered changes still exist. Audit records (to show where entry was made and what tools were used to make the changes) may have been erased by someone with full Domain Admin privileges, so looking into logs after the fact may be unsuccessful (as it was in this case). As in cases of virus or Trojan attacks, it's impossible to be sure that additional changes weren't made, and that "backdoors" hadn't been left behind for attackers to exploit at a later date.

However, we can make a few educated guesses:

It appears that password-based VPN services were used to gain access to the network, since physical access to those servers was tightly controlled and there were no direct network access links available (either via dial-up modems or inbound router access)
There were many accounts with Domain Administrators membership whose passwords had not been changed after a disgruntled system administrator left - this included service accounts for Exchange Server
All domain controllers had been left with default membership in the "access this computer from the network" privilege and the Terminal Server remote access

A number of recommendations were made to this customer: some were specific to the kind of situation they'd just experienced, but many should be acted on in advance of any such network compromise. The following are just the highlights; many more recommendations are available in the documents noted at the end of this article.

Change all passwords for admin-level accounts simultaneously - this includes accounts in the Administrators, Domain Administrators, Account Operators groups, or any other group or user who has been granted "Change Password" permissions on any User Objects in any domain in Active Directory
- This ensures that the attacker cannot use an unchanged account to create new accounts or to reverse password changes on newly-changed accounts
- The result of this would be to contain any sweeping powers the attacker may have been using
Tighten remote access privileges for as many users as you can, especially for unused accounts
- this can be revoked using the "Dial-in tab" on each specific user account in Active Directory Users & Computers, or
- through the Remote Access Policies that can be set more globally through the Internet Authentication Service that comes with Windows 2000
Start proactively monitoring admin-level access to the network
- E.G. audit all Logon and Account Logon successes and failures on your key servers, and use tools such as Microsoft Operations Manager to collect and report on audited Event Log activity - alerting when unusual activity takes place, such as
Turn off unnecessary remote access services
- E.G. Remove all the modems that allowed inbound calls, and/or reconfigure RAS properties to use the IAS Server Remote Access Policies
Follow the Windows 2000 Server Baseline Security Policy best practices checklist
Tighten the privileges on servers and domain controllers to restrict at least the "Access this computer from the network" privilege
- E.G. grant the "Deny access to this computer from the network" privilege to a custom local group on each server which contains the administrators who should *not* have access to each server (leaving only those who should have such access via the "Access this computer from the network" privilege which is normally granted to the Everyone or Authenticated Users group)
Tighten the permissions on the RDP-TCP connection in Terminal Services Manager, to permit only the bare minimum necessary functionality
- E.G. grant only "Guest Access" (= the Logon permission) to a custom local group you create which contains the
Develop and follow a patch management strategy
- E.G. configure and run the Microsoft Software Update Service or Microsoft Systems Management Server to distribute all security-related patches to all servers
- E.G. Run the Microsoft Baseline Security Analyzer on a regular basis to monitor servers for any unsuccessfully applied patches, and for any misconfigured security settings that can be audited by MBSA
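
As an illustration of the first recommendation, the sketch below flags admin-level accounts whose passwords predate a given cutoff, such as the day an administrator left. It assumes the account names and password-last-set dates have already been exported from the directory by whatever means you prefer; the account names, dates, and function name are all hypothetical.

```python
from datetime import date


def stale_password_accounts(accounts, cutoff):
    """Return names of accounts whose password was last set on or before cutoff."""
    return sorted(name for name, last_set in accounts.items() if last_set <= cutoff)


# Hypothetical export: account name -> date the password was last set.
admin_accounts = {
    "svc-exchange": date(2001, 3, 14),
    "admin-alice": date(2003, 6, 2),
    "admin-bob": date(2002, 11, 20),
}
departure = date(2003, 5, 30)  # hypothetical date the administrator left

print(stale_password_accounts(admin_accounts, departure))
```

Any account this returns, service accounts included, still carries a password the departed administrator may know, and belongs in the simultaneous password change described above.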

Conclusion

If you've read this far, chances are you're thinking, "That's a lot of work". GOOD - you're not one of the ones I'm worried about. The one I'm worried about is the person thinking "this'll never happen to me, and though this sounds like a good idea, I don't have time to worry about this". Please don't kid yourself, and don't give a key business enabler (Active Directory) less than its due.

As your organization grows and evolves, technologies like Active Directory will become more and more critical to the smooth running of the operations - and any loss of availability will mean greater consequences for your business. Consider the risk of losing your Active Directory for an hour, or a day - what will happen if people can't read their email, or gain access to those "single sign-on business" applications, or even sign on to their workstations? In such an environment, you really can't afford not to protect your technological investments.