Chapter 5 - Protection

Updated : October 11, 2001

Microsoft® Exchange 2000 Server

This chapter is part of the Exchange 2000 Server Operations Guide.

To protect your Microsoft® Exchange 2000 Server environment from failure, you need good protection from intrusion and attack, along with a documented and tested disaster recovery procedure to cope with failure. This chapter shows how to ensure that your servers running Exchange are protected against these potential failures.

On This Page

Introduction
Protection Against Hacking
Anti-virus Measures
Disaster Recovery Procedures
Summary

Introduction

By its very nature, Microsoft® Exchange 2000 Server has a public face. You will be offering e-mail and other functionality to a large number of users. In many cases those users will not only be able to collaborate with other users in their own company, but also with others across the Internet. This high visibility makes it potentially more subject to attack than other services. You need to make sure that Exchange is well protected against potential attacks, including hacking attempts and viruses.

This chapter also examines disaster recovery scenarios. If you are to meet your service level agreements (SLAs) on availability, you must first ensure that your system is down as infrequently as possible. This is covered in Chapter 2, "Capacity and Availability Management," but you must also make sure that if you do suffer downtime, it is kept to the bare minimum required to restore service. Disaster recovery procedures for Exchange 2000 are detailed in this chapter.

Chapter Start Point

At the start of this chapter, you should be familiar with basic security concepts and different types of backup and restore hardware.

Chapter End Point

By the end of this chapter, you should be aware of appropriate measures to take when guarding against hacker attacks and e-mail-borne viruses. You will also be aware of disaster recovery procedures in the event of a failure.

Chapter Sections

This chapter covers the following procedures:

  • Protection against hacking

  • Anti-virus protection

  • Disaster recovery

  • Recovery testing

  • Backup

  • Restore

Protection Against Hacking

Whenever you consider protecting your organization against malicious attack, it is worth recalling one of the golden (and most disillusioning) rules of security: the majority of attacks on network security come from inside. The reasons for this are obvious. Security is typically more relaxed on the inside of an organization than on the outside, and employees generally have far more knowledge of the workings of a company than outsiders.

Security of an e-mail system is extremely important, because of the power associated with it. Envisage a scenario where an unhappy employee (it is possible that even your company contains some of these people) manages to gain access to their manager's e-mail account. The unhappy employee then sends various e-mails posing as their manager, authorizing various decisions that adversely affect the company (and thus their manager's position).

To gain access to another person's e-mail account, you need to either log in as that person or gain administrative access to the Active Directory™ service, which allows you to grant Send As and Receive As permissions on the mailbox. (Specifically, you require Account Operator or greater access on the user object and Exchange administrative permissions on the mailbox itself to make the changes.)

The problem with the former method of attack is that it is almost impossible for operations to spot, as the user is successfully logging in as the other party. However, there are steps you can take. In particular, you should have a method for users to report any unusual activity with their e-mail accounts, and you should teach the users how to report any such activity. Typically this would be to notify the help desk. Any reported unusual activity on e-mail should be treated as a security violation and investigated immediately.

Mailboxes that are being accessed by someone other than the primary mailbox owner are reported in the Event Log. Wherever possible, you should ensure that you are notified whenever a security descriptor on a mailbox is changed. If you are also able to maintain a list of users who should be able to access each mailbox, then you will be able to compare any changes against this list. At the very least, you should try to collect Event Log information that you can consult in the event of a security problem.
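The comparison against a list of approved users can be sketched in Python. This is a minimal illustration only: it assumes you have already exported, by whatever means your monitoring tools provide, the accounts that currently hold access to each mailbox. The mailbox names, account names, and the `find_unauthorized_access` helper are hypothetical and are not part of Exchange.

```python
def find_unauthorized_access(authorized, observed):
    """Return {mailbox: sorted accounts} for access not on the approved list.

    Both arguments map a mailbox name to a set of account names.
    """
    violations = {}
    for mailbox, accounts in observed.items():
        extra = set(accounts) - set(authorized.get(mailbox, set()))
        if extra:
            violations[mailbox] = sorted(extra)
    return violations


# Hypothetical data: the approved list and the access actually found.
authorized = {
    "jsmith": {"jsmith", "assistant1"},
    "mjones": {"mjones"},
}
observed = {
    "jsmith": {"jsmith", "assistant1"},
    "mjones": {"mjones", "tempuser"},
}

print(find_unauthorized_access(authorized, observed))
```

Any mailbox that appears in the result has been granted access beyond the approved list and would warrant investigation as a possible security violation.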

To keep your Exchange Server computers secure, look carefully at group memberships. One of the most critical groups you should monitor is the Exchange Domain Servers group. Any user or computer account that is a member of the Exchange Domain Servers group has full control of the Exchange organization, so it is extremely important to secure membership of this group. You should also ensure that the membership of the Built-in/Administrators group on the Exchange Server computers is tightly locked down. Members of this group automatically have Send As permissions on all mailboxes for that server. The most efficient way to control membership of these groups is through Group Policy.

You should also audit for configuration changes to Exchange. A good change and configuration management system ensures that no changes are made to the system without prior authorization. Regular checks of your Event Logs (or any other monitoring system you have chosen) then allow you to see whether unauthorized changes have been made.

Your Exchange operations department should ensure that it receives security bulletins from Microsoft. To receive these bulletins, visit the following Web site:

https://www.microsoft.com/technet/security/bulletin/notify.mspx

In cases where a security breach has been exposed and a new hot fix needs to be applied, the change should generally be considered urgent and should travel through the change configuration process accordingly.

One of the best ways of protecting against malicious use of e-mail is to use Key Management Server. This allows you to digitally sign and seal messages so that you can determine whether a message has actually come from the person who claims to have sent it and that it has not been altered in transit. Of course, for this to work, the security of Key Management Server itself is paramount. Your operations practice should ensure very high security for this server, controlling very tightly who is in the local groups on the server. A password is used to start the Key Management Server, and this should be kept on a floppy disk that is stored physically separate from the server after the service has been started.

Of course, you still need to protect your Exchange Server computer against external attack. The rest of this section examines what you need to consider when you are operating one or more firewalls in your environment.

Firewall Operations

Exchange can exist in a variety of different firewall configurations. As part of your planning and deployment you will have chosen how to deploy your firewall solutions around Exchange. Possible deployments range from a single firewall in front of servers running Exchange to multiple firewalls in front of and behind front-end servers.

Firewall configuration is typically rather complex, so it is very important that operations personnel know exactly how firewalls are configured within their organization, and what the firewalls should keep out and let in when they are operating correctly. In a multi-firewall environment, the firewalls are often manufactured by a number of different vendors, which can make management issues even more complex.

The responsibilities of the operations department in these circumstances will include the following:

  • Ensuring that the firewalls continue to operate properly, implementing authorized changes, and ensuring that there are no unintended effects of authorized changes

  • Ensuring that firewalls only admit the traffic they are supposed to

  • Monitoring to detect hacker intrusion

  • Managing security breaches

Maintaining Firewall Availability

Unless your firewall(s) are up and running properly you will be unable to exchange e-mail messages with the outside world. You should therefore place high importance on maintaining the availability of your firewalls. Monitor your firewall availability just as you would monitor the servers running Exchange and ensure that in the event of a failure, notifications are sent to the appropriate parties.
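As a rough illustration of availability monitoring, the Python sketch below tests whether a TCP connection can be opened to each monitored address. It is a minimal sketch under stated assumptions: the host names in the usage comment are hypothetical, and a successful connection shows only that something is listening on the port, not that the firewall is filtering traffic correctly. A production monitor would normally be part of your existing monitoring system.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_firewalls(targets, timeout=3.0):
    """Return the (host, port) targets that could not be reached."""
    return [t for t in targets if not is_reachable(t[0], t[1], timeout)]


# Example usage (hypothetical host names; not executed here):
#   failed = check_firewalls([("fw-outer.example.com", 25),
#                             ("fw-inner.example.com", 25)])
#   if failed:
#       notify the appropriate parties about each entry in `failed`
```

Running a check of this kind on a schedule, and sending a notification whenever the returned list is not empty, gives a basic version of the monitoring described above.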

Typically, when an Exchange Server computer sends messages via Simple Mail Transfer Protocol (SMTP), it actually drops the e-mail message at the firewall, which in turn forwards it to an external SMTP server. This means that, as far as Exchange is concerned, the message is delivered the moment it is sent to the firewall. So, if the firewall fails, Exchange does not detect this as a problem and does not reroute messages accordingly. If a firewall is down for any period of time, you must manually alter the configuration of your environment to make sure that messages route to other firewalls (this typically involves altering the configuration of the SMTP Gateway on the affected route). Make sure that you plan for firewall failure through training, exercises, and drills so you can recover your environment quickly.

Ensuring That Only the Correct Traffic Passes Through the Firewall

Your firewall is only secure if it remains configured as it should be over time. A typical configuration of Exchange is to use front-end servers and have them sitting on a screened subnet with one firewall in front and one behind them, protecting the back-end servers. For example, envisage a situation where you use a front-end server for Microsoft Outlook® Web Access, which you perform over a Secure Sockets Layer (SSL) connection. Table 5.1 shows which ports need to be let through each firewall. (This scenario does not deal with outgoing SMTP mail.)

Table 5.1 Port-to-Firewall Configuration Example

Source             Destination               Service         Protocol and port
Internet/External  Screened Subnet           HTTPS           TCP 443
Screened Subnet    Internal/Private Network  DNS             TCP, UDP 53
Screened Subnet    Internal/Private Network  HTTP            TCP 80
Screened Subnet    Internal/Private Network  RPC EP Mapper   TCP 135
Screened Subnet    Internal/Private Network  KERBEROS        TCP, UDP 88
Screened Subnet    Internal/Private Network  LDAP            TCP 389
Screened Subnet    Internal/Private Network  NETLOGON        TCP 445
Screened Subnet    Internal/Private Network  DSAccess (GC)   TCP 3268
Screened Subnet    Internal/Private Network  TCP High Ports  TCP 1024+

You should regularly check your firewalls to ensure that the settings have not been altered to allow traffic that should not pass. The outside firewall should only be allowing traffic on port 443 specifically to the front-end servers, and only these front-end servers should be allowed to communicate with the back-end servers on the ports you have defined. You may also want to perform network monitoring to monitor the nature of the traffic that goes through the firewall.
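A regular check of this kind can be sketched as a comparison between the rules you expect and the rules actually configured. The Python below is a minimal sketch: the expected rules are transcribed from Table 5.1, while the tuple layout, the zone names, and the `audit_rules` helper are assumptions made for illustration. How you export the live rule set depends entirely on your firewall product and is not shown.

```python
# Expected rules from Table 5.1 as (source, destination, protocol, port spec).
EXPECTED_RULES = {
    ("external", "screened", "tcp", "443"),
    ("screened", "internal", "tcp", "53"),
    ("screened", "internal", "udp", "53"),
    ("screened", "internal", "tcp", "80"),
    ("screened", "internal", "tcp", "135"),
    ("screened", "internal", "tcp", "88"),
    ("screened", "internal", "udp", "88"),
    ("screened", "internal", "tcp", "389"),
    ("screened", "internal", "tcp", "445"),
    ("screened", "internal", "tcp", "3268"),
    ("screened", "internal", "tcp", "1024+"),
}


def audit_rules(expected, actual):
    """Return (unexpected, missing) rules, each as a sorted list."""
    unexpected = sorted(set(actual) - set(expected))
    missing = sorted(set(expected) - set(actual))
    return unexpected, missing


# Hypothetical exported rule set: one rule was added and one was removed.
actual_rules = (EXPECTED_RULES - {("screened", "internal", "tcp", "389")}) | {
    ("external", "screened", "tcp", "25"),
}

unexpected, missing = audit_rules(EXPECTED_RULES, actual_rules)
print("Unexpected rules:", unexpected)
print("Missing rules:", missing)
```

An unexpected rule indicates traffic that should not be allowed through; a missing rule is more likely to show up as a service failure, but is still worth reporting.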

Monitoring Against Hacker Intrusion

No matter how good your firewall setup is, there is still a risk that a hacker may manage to infiltrate it. You should ensure that you have a good intrusion detection system in place to notify you of any firewall breach, and you should make sure that you always have the ability to shut down services if necessary.

Dealing With Security Breaches

In the event of a security breach, your priority should be to protect the system. In the majority of corporate e-mail systems, the stores will contain extremely sensitive information and should be protected. This means that, in the case of a security risk, the initial response may be to prevent access to the internal network from the outside world. Provided you manage to catch the intrusion early enough, you will still in most cases be able to allow internal mail to flow.

Once you have contained the breach, you should inform firewall vendors and/or Microsoft about the nature of the breach, so that they can come up with a fix. At this point you can revert the system to its state prior to the breach and apply the fixes supplied.

Anti-virus Measures

As part of your planning and deployment of Exchange 2000, you will have put in place appropriate measures against virus attack. However, regardless of how much protection you put in place, it is quite possible that viruses may affect Exchange. It is therefore very important that you have measures to deal with this possibility.

You are likely to be protecting against viruses at several levels. These may include the firewall, a point at or outside the SMTP gateway, each Exchange server, and the client. Bear in mind that viruses that do not arrive by e-mail can also affect Exchange, so all your servers running Exchange should be protected against viruses in the same way that clients are.

Virus scanning at the gateway means scanning each inbound message (and perhaps all outbound messages as well) to detect and clean any infected content. Several vendors provide such software. The neatest technical solution is to use an anti-virus product that integrates as an SMTP event sink. Some vendors do not integrate with the SMTP engine and require the use of the vendor's proprietary SMTP engine. These solutions can work very well, but troubleshooting an additional type of SMTP engine adds complexity to your troubleshooting procedures.

As with gateway virus scanning, a number of vendors provide Exchange Server scanning software. These products scan and disinfect content at the Exchange server and come in one of two varieties:

  • Scanning software based on the Anti Virus API (AVAPI V2.0). This kind of anti-virus software scans and disinfects virus-laden content before it is added to the Exchange information store.

  • Scanning software that uses undocumented Exchange store interfaces. These products generally work well, but there is additional support risk in using them because they use an unsupported interface. If there is a store-related incident on a server with this kind of product, Microsoft Product Support Services (PSS) will recommend that the anti-virus software be disabled early in troubleshooting.

Both gateway-based and information store-based scanning products should provide an automated mechanism for updating the virus scanning patterns. Having timely updates is the best way to ensure that your Exchange implementation remains free of new viruses. Additionally, some scanning products offer the optional use of more than one scanning engine, further increasing the likelihood of catching a virus before it infects your systems.

As part of your operations you must ensure the following:

  • The virus protection is completely up to date at all levels.

  • You have defined procedures in the event of a virus infection.

  • You have a mechanism for handling attachments that pose a virus risk.

Staying Current

New viruses are constantly emerging, and they have the potential to spread worldwide within a period of hours. If you are not fully up to date in your protection, then you run a real risk of viruses infecting your organization. Your operations procedure should ensure that all areas where you scan for viruses are fully up to date. You must make sure that you receive regular security updates from your anti-virus vendors.

In some cases you will receive a warning about a new virus before an update to your anti-virus software is available. The first thing to do here is to verify that the virus is genuine. Many problems are in fact caused by hoax virus notifications. Ensure that the virus is a genuine problem by checking with your anti-virus vendors. After you have verified that the virus is indeed a genuine threat, you should notify users so they know what to do if they receive e-mail messages that may contain the virus. You should have a pre-defined mechanism, of which the user base is fully aware, for reporting any suspected viruses. As a short-term measure, most anti-virus software will allow you to block messages with particular subject lines or from particular sources. This can act as a blocking mechanism until you receive an update for your anti-virus software.
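The short-term blocking logic can be illustrated with a small Python sketch. This is not how any particular anti-virus product implements its rules; the `should_block` helper, the matching policy (case-insensitive substring match on the subject, case-insensitive exact match on the sender), and the sample values are all assumptions made for illustration.

```python
def should_block(subject, sender, blocked_subjects, blocked_senders):
    """Return True if a message matches a temporary block rule.

    Subjects match as case-insensitive substrings; senders must match
    exactly, ignoring case.
    """
    subject_lower = subject.lower()
    if any(phrase.lower() in subject_lower for phrase in blocked_subjects):
        return True
    return sender.lower() in {s.lower() for s in blocked_senders}


# Hypothetical temporary rules put in place while waiting for an update.
blocked_subjects = ["ILOVEYOU"]
blocked_senders = ["spoofed@example.com"]

print(should_block("Fw: ILOVEYOU", "colleague@example.com",
                   blocked_subjects, blocked_senders))
print(should_block("Quarterly report", "colleague@example.com",
                   blocked_subjects, blocked_senders))
```

Rules of this kind are deliberately crude, which is why they should be treated as a stopgap and removed once updated virus signatures are in place.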

Dealing With Virus Infection

Assuming the worst does happen, and you are infected with a virus, the steps you take next are extremely important. You should of course issue an advisory to the user community, so that they know what to do if they receive the virus. You should also notify any partners that you regularly associate with, along with the anti-virus vendors themselves.

Your biggest threat in spreading viruses is the user community, who are, incidentally, also your best weapon in defending against viruses after they have attacked. It is vital that you find a way of communicating with all users, in such a way that all are likely to listen and take notice. If there is a new virus threat, you should e-mail a high-priority message to the users detailing the threat and the recommended action. Make sure that the subject line of the message prominently displays the nature of the threat. You should also advertise the problem prominently on your intranet and use any real-time notification system you have to notify the users, such as voicemail or public address systems. You should even consider having a mechanism in place for putting posters in public parts of your building, for example reception areas and elevators. If the users know what to do when they receive a particular message, you can severely restrict the flow of the messages within and outside your organization.

After you have notified the relevant parties, you must do all you can to ensure that the virus does not spread. If a fix is not yet available, in a worst-case scenario this could involve restricting the flow of e-mail within your organization and outside of it (that is, disabling connectors and possibly network connections).

As soon as a fix is available, you must have a mechanism for deploying updates from each of the anti-virus vendors. In some cases, you may use the e-mail system as a means of distributing hot fixes to local administrators, but you must also have an alternative mechanism, because it is possible that you have had to shut down e-mail communication between servers to prevent the virus from spreading.

Blocking Attachments at the Client

One of the best ways of protecting against virus infection is to block particular attachments from running. Attachments may be blocked at the server level, but they may also be blocked at the client. You can install a security patch on Microsoft Outlook 98 and Outlook 2000. This patch is built into Outlook 2000 Service Release 2 and Outlook 2002 (a component of Microsoft Office XP). The effect of this patch is to prevent certain attachments from running directly from the client (instead they must be saved first) and to prevent other attachments (those considered more dangerous) from being downloaded at all.

Table 5.2 shows the attachments that are prevented from running.

Table 5.2 File Extensions and File Types

File Extension  File Type
.ade            Microsoft Access project extension
.adp            Microsoft Access project
.bas            Microsoft Visual Basic class module
.bat            Batch file
.chm            Compiled HTML help file
.cmd            Microsoft Windows NT® command script
.com            Microsoft MS-DOS program
.cpl            Control Panel extension
.crt            Security certificate
.exe            Executable
.hlp            Help file
.hta            HTML program
.inf            Setup information
.ins            Internet naming service
.isp            Internet communication settings
.js             Javascript file
.jse            Javascript encoded script file
.lnk            Shortcut
.mdb            Microsoft Access program
.mde            Microsoft Access MDE database
.msc            Microsoft Common Console document
.msi            Microsoft Windows Installer package
.msp            Microsoft Windows Installer patch
.mst            Microsoft Visual Test source files
.pcd            Photo CD image, Microsoft Visual compiled script
.pif            Shortcut to MS-DOS programs
.reg            Registration entries
.scr            Screen saver
.sct            Windows script component
.shb            Shell Scrap object
.shs            Shell Scrap object
.url            Internet shortcut
.vb             VBscript file
.vbe            VBscript encoded script file
.vbs            VBscript file

Note: Not all attachments considered to be dangerous are blocked by this patch. For example, the Microsoft Access file types .mda and .mdz are not blocked, nor are zipped versions of any of the above files.
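The extension-based check can be illustrated with a short Python sketch. The set of extensions is transcribed from Table 5.2; the `is_blocked` helper is hypothetical and only demonstrates the matching logic, not how Outlook implements it. As the note above points out, a check of this kind does not catch .mda or .mdz files, or zipped copies of blocked files.

```python
import os

# Extensions listed in Table 5.2.
BLOCKED_EXTENSIONS = {
    ".ade", ".adp", ".bas", ".bat", ".chm", ".cmd", ".com", ".cpl", ".crt",
    ".exe", ".hlp", ".hta", ".inf", ".ins", ".isp", ".js", ".jse", ".lnk",
    ".mdb", ".mde", ".msc", ".msi", ".msp", ".mst", ".pcd", ".pif", ".reg",
    ".scr", ".sct", ".shb", ".shs", ".url", ".vb", ".vbe", ".vbs",
}


def is_blocked(filename):
    """Return True if the file's final extension appears in Table 5.2."""
    _, extension = os.path.splitext(filename.lower())
    return extension in BLOCKED_EXTENSIONS


for name in ["report.VBS", "invoice.doc.exe", "notes.mda", "payload.zip"]:
    print(name, is_blocked(name))
```

Only the final extension is examined, so a name such as invoice.doc.exe is treated as an executable, while a blocked file inside a .zip archive passes unnoticed.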

It is good practice to quarantine all suspect content so that it can be examined individually before you decide whether it can be safely passed on.

While this security patch can be useful in preventing the use of unauthorized attachments, it is important to remember that for it to work across the user community, it depends on everyone using a client with the patch. Therefore, to be fully protected you would need not only to ensure that every MAPI client contained the patch, but also to prevent access via POP3, IMAP4, or HTTP.

For more information about the Outlook Security Patch, see Microsoft Knowledge Base article 262631.

Many organizations prohibit the receipt of scripts written in Microsoft Visual Basic® Scripting Edition (VBScripts) through e-mail. If you choose to do this, it will not prohibit those who want to receive and run VBScripts from doing so, for they can simply ask the sender to use a different file extension and then change it back to .vbs on arrival. It will, however, prevent the running of VBScripts that have not been pre-arranged. If you wish to go further in preventing the effects of VBScripts, you will need to prevent them from running at the client at all.

Again, the best way of dealing with the threat of attachments is to educate the user community.

Disaster Recovery Procedures

Chapter 2, "Capacity and Availability Management," examined ways of minimizing system failures. As mentioned there, to reduce overall downtime you need to look at how frequently a system is down, alongside how long it takes to bring a system back up again. For more information on Availability Management, refer to Chapter 2.

It makes no sense to have a sound, well-exercised backup strategy unless it's matched with a similarly mature recovery strategy. Backups are meaningless if you can't restore service using them.

Exactly how you perform disaster recovery will depend on how much money you are willing to spend alongside which backup products you want to use. Third parties offer a variety of solutions, including mailbox level backup, and even message level backup in some cases. Another alternative is real-time byte level replication. You will find a list of third party vendors offering backup solutions for Exchange 2000 at the following Web site:

https://www.microsoft.com/exchange/partners/E2Ksolutions.asp

Whichever tools you use, the Operations Manager will need to ensure that the disaster recovery procedures meet the following criteria:

  • Backup is performed regularly and reliably.

  • Your data is protected against fire/theft/natural disaster.

  • Recovery can be performed reliably and quickly.

  • You regularly do test restorations on an offline server to ensure that backups are being correctly created.

  • You regularly run disaster recovery drills to keep skill levels high and ensure that your recovery procedures are up to date.

Backing Up

The rest of this section assumes that you are using Windows NT Backup as your backup and restore software. However, much of the information contained here is relevant whichever backup solution you choose to adopt.

One of the major considerations when performing a backup is how long it will take. When performing an online backup of stores on your servers running Exchange, you suspend other online maintenance activities. Therefore, to allow appropriate time for the other online maintenance activities to occur each night, you need to minimize the length of time backup takes. Also, if it takes a long time to back up these servers, it will almost certainly take a long time to restore them, and to meet your SLAs you will want to keep your recovery times to a minimum.

Your servers running Exchange will potentially consist of multiple stores and storage groups. When backing up stores online, your backup utility will ensure that the appropriate .stm, .edb and .log files are backed up. Although you can back up stores individually, you should back up storage group by storage group. Each store within a storage group is backed up in series, one immediately after the other. You therefore need to size your stores to ensure that you can back up all of the stores in a storage group within the backup window you have defined. By contrast, you can back up your storage groups in parallel, so if you do this, you will not necessarily add to the length of time your backup takes.
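The sizing arithmetic can be sketched in Python. This is a rough estimate under stated assumptions: the store sizes and throughput figure are hypothetical, and each storage group is assumed to have its own backup stream running at the full stated throughput, which will not hold if the groups share a single backup device or network link.

```python
def backup_duration_hours(storage_groups, throughput_gb_per_hour):
    """Estimate elapsed backup time and per-group times in hours.

    storage_groups maps a storage group name to a list of store sizes in GB.
    Stores within a group are backed up in series, so their times add up;
    groups run in parallel, so the elapsed time is that of the slowest group.
    """
    group_hours = {
        name: sum(sizes) / throughput_gb_per_hour
        for name, sizes in storage_groups.items()
    }
    elapsed = max(group_hours.values(), default=0.0)
    return elapsed, group_hours


# Hypothetical server: two storage groups, backed up at 25 GB per hour each.
groups = {"SG1": [20, 30], "SG2": [40]}
elapsed, per_group = backup_duration_hours(groups, 25)

backup_window_hours = 3  # hypothetical backup window
print(per_group)
print("Fits in window:", elapsed <= backup_window_hours)
```

In this example the first storage group determines the elapsed time, because its two stores are backed up one after the other. Adding a store to that group lengthens the backup, whereas adding a new storage group of similar size would not.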

One of the best ways of minimizing your backup time is to back up to disk rather than to tape. You should then perform a file backup of the resulting files to an offsite location. Typically these files would be backed up to disk, and multiple copies of the disks made in separate locations. Backing up to disk also has the benefit of ensuring that your restore times are quicker if your disk backup is available at restore time.

Another way of reducing backup times is to perform incremental and/or differential backups. However, wherever possible you should not perform these, because they are likely to increase restore time, and keeping restore times to a minimum is critical in meeting SLAs.

The Operations Manager must ensure that the backups are safely stored in locations that are well protected from natural disaster, fire, and theft.

Note: Think carefully about the offsite storage location you choose. In particular, you may encounter legal difficulties if you choose to store data in another country. Some Exchange data (such as Outlook contacts) are considered private data and may be covered under data protection acts of certain countries.

To make sure you have full recovery of your server running Exchange, it is not enough to simply back up the stores and log files. Exchange is unusable without Active Directory, so you must ensure that Active Directory is being backed up properly, even if the backup itself is not part of your remit.

In Exchange System Manager, when you make configuration changes for a server in the Protocols container, most of those changes are written to the Microsoft Internet Information Services (IIS) metabase (some of the same information is also kept in Active Directory). So as well as backing up Active Directory, you should also back up the IIS metabase.

Backing up the metabase by using the Internet Service Manager Microsoft Management Console (MMC) snap-in simply backs up the metabase.bin file. However, you will also need metabase and installation-specific security keys to allow you to start the metabase upon restore. These are backed up when you back up the system state of the server.

The metabase changes frequently during routine Exchange operations, so you should back up your metabase as often as you back up your server running Exchange. Successful backup of the metabase will prevent you from having to reconfigure settings when restoring the server.

If you are making use of the Key Management Service (KMS) functionality in Exchange to give you secure e-mail, it is absolutely vital that you successfully back up the right components. If you do not, you could end up losing mail across your entire enterprise. You will need to ensure that the following components are backed up:

  • Certification Authority (CA) certificates for each of your CA servers

  • The passwords protecting the CA certificates

  • The KMS database itself

You will also need to ensure that the KMS database startup password is kept in a secure location.

If Exchange 2000 is co-existing with Microsoft Exchange Server 5.5, you may also want to back up the Site Replication Service (SRS). However, this is not essential, and is beyond the scope of this guide.

Finally, you must ensure that you have a complete record of the setup of each of your servers running Exchange. This will allow you to configure similar hardware specifications for similar servers. All of this information would normally be present in the Configuration Management Database. For more information about change and configuration management, see Chapter 3.

Offline Backup

As well as an online backup of Exchange, it may be appropriate under some circumstances to take an offline backup. An offline backup is not possible in many production environments, because taking databases offline affects whether you can meet your service level agreements. However, where they can be performed, offline backups provide a useful alternative method for restoring an Exchange server if an online backup does not work as it should.

Each Exchange 2000 store consists of an .edb file and an .stm file. With up to 20 stores, public and private, on a server, you may need to back up as many as 40 files per server. When you shut down the information store correctly, all the log file information is written to the corresponding .edb and .stm databases, so you do not need to back up the log files when backing up offline.

Disk Imaging

You may want to consider taking a disk image of the server immediately prior to installing Exchange on it. This will allow a recovery server to be built very quickly with all the appropriate settings in the event of a complete server failure.

Restoring

To ensure a swift restore of Exchange 2000, you will need the following items:

  • Available Hardware.

  • Microsoft Windows 2000 Server and Exchange 2000 Server software, plus any appropriate service packs and hot fixes.

  • Any other required Microsoft or third-party software.

  • Full drive backups of the system drives and other logical drives where critical applications or data are installed.

  • System state backups.

  • Exchange database backups. Along with backups of the information store database, you may also need backups of ancillary databases such as the SRS databases and KMS databases.

There are obviously different levels of restore which you may have to perform, varying from recovering single messages to recovering the Exchange Configuration Database (in other words recovering Active Directory).

Recovering Individual Messages

Exchange 2000 has a deleted item retention time setting. By default, this is set to zero. The easiest way of allowing the recovery of individual messages is to increase this value. While there is backup software available offering individual message recovery, you are advised to set a uniform value for deleted item retention time and offer that value as your SLA on message-level retention.

Recovering Lost Mailboxes

Exchange 2000 also has a setting to allow deleted mailbox retention time. The default for this is set at 30 days (although the policy default is set to zero). When you delete an Exchange 2000 mailbox, the mailbox contents are no longer immediately deleted from the information store database. Instead they are preserved for the time you have defined. During the time that the deleted mailbox is on the disconnected list, you can connect that mailbox to another user.

To connect an Exchange 2000 mailbox to another user:

  1. Start Exchange System Manager.

  2. Locate the database that contains the disconnected mailbox, and then click the Mailboxes object for the database.

  3. If the mailbox is not already marked as disconnected, right-click the Mailboxes object, and then force the Mailbox Cleanup Agent to run by clicking Run Cleanup Agent.

  4. Right-click the disconnected mailbox, and then click Reconnect. A dialog box is displayed in which you can choose the new mailbox owner.

Once again, you would be well advised to define your SLAs so that mailbox recovery is not offered outside the deleted mailbox retention period you have configured. Mailbox recovery is still possible outside this period, but, depending on your backup software, you may have to restore an entire Exchange database to a server in a different Windows 2000 forest to retrieve the missing mailbox.
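The retention window described above can be sketched as a simple date calculation. This is a planning aid only, not part of Exchange: the function names are hypothetical, and the sketch assumes that the mailbox becomes unrecoverable as soon as the retention period has elapsed, whereas the actual purge happens when the store next runs its maintenance.

```python
from datetime import date, timedelta


def recoverable_until(deleted_on: date, retention_days: int) -> date:
    """Return the last day on which the deleted mailbox should still be reconnectable."""
    return deleted_on + timedelta(days=retention_days)


def can_reconnect(deleted_on: date, retention_days: int, today: date) -> bool:
    """True if 'today' still falls inside the deleted mailbox retention window."""
    return today <= recoverable_until(deleted_on, retention_days)


if __name__ == "__main__":
    deleted = date(2001, 10, 1)
    # With a 30-day retention time, the window closes at the end of October.
    print(recoverable_until(deleted, 30))
    print(can_reconnect(deleted, 30, date(2001, 11, 15)))
```

A request that falls outside the window is the case in which a restore to a separate forest would be needed.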

Recovering Exchange Stores and Storage Groups

In this scenario, there has been a problem with one or more databases, for example corruption in one of them, and you need to restore the affected databases from backup. The rest of your environment has not been affected.

Before doing anything else here, if you are recovering from tape, you should ensure that you have made copies of your existing database files. You may discover, after you have recovered from tape, that the tape is bad. In that case, you may still be able to recover the files that you copied by using troubleshooting techniques, even though the database has a problem. If you never let your database drive become more than half full, you can quickly save a copy of a failed database to the same logical drive, which dramatically decreases the time it takes to copy the files, and therefore your recovery time.
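The "never more than half full" guideline can be checked before a failure occurs. The following is a minimal sketch, assuming you supply the paths of the database files and the drive that holds them; the function names and the 5 percent safety margin are illustrative choices, not values from this guide.

```python
import os
import shutil


def has_room_for_copy(db_bytes: int, free_bytes: int, margin_percent: int = 5) -> bool:
    """True if a copy of db_bytes fits into free_bytes with a safety margin.

    Integer arithmetic is used so that the comparison is exact.
    """
    return free_bytes * 100 >= db_bytes * (100 + margin_percent)


def check_drive(db_files, drive: str) -> bool:
    """Compare the combined size of the database files with free space on the drive."""
    db_bytes = sum(os.path.getsize(path) for path in db_files)
    free_bytes = shutil.disk_usage(drive).free
    return has_room_for_copy(db_bytes, free_bytes)


if __name__ == "__main__":
    # Hypothetical example: a 40 GB database on a drive with 50 GB free.
    gigabyte = 1024 ** 3
    print(has_room_for_copy(40 * gigabyte, 50 * gigabyte))
```

Running a check like this on a schedule gives early warning that an in-place copy will no longer be possible.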

When you come to do the restore, you will need to ensure that the information store service is started, and the databases you want to restore are dismounted. You will need to select a temporary folder for the restore. This will contain restored log and patch files, alongside restore.env, a binary file which ensures that the log and patch files are replayed properly at the end of the restore.

Assuming you are recovering a full backup (as opposed to an incremental or differential backup), you should ensure that you select the Last Restore Set option for the backup set. This ensures that the log files and patch files are replayed after the restore, taking you more or less to the point of failure. You will not be able to mount the databases unless the Last Restore Set option is selected, so if you forget to do this, you will either have to run the restore again, or use Eseutil to specify that it is the last restore set.

If you are restoring multiple storage groups simultaneously, you must specify a different temporary folder for each one. This ensures that the different restore.envs do not overwrite one another.
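One way to avoid accidental reuse of a folder is to derive each temporary folder from the storage group name. The sketch below is only an illustration of that idea: the base folder and storage group names are hypothetical, and the function does not create any folders or call any backup software.

```python
import os


def restore_temp_folders(base_dir: str, storage_groups):
    """Map each storage group name to its own temporary restore folder.

    Raises ValueError if two names would resolve to the same folder
    (names are compared case-insensitively, as on a Windows file system).
    """
    folders = {}
    seen = set()
    for name in storage_groups:
        key = name.strip().lower()
        if key in seen:
            raise ValueError("duplicate storage group name: " + name)
        seen.add(key)
        folders[name] = os.path.join(base_dir, name.strip())
    return folders


if __name__ == "__main__":
    plan = restore_temp_folders("restore_temp", ["First Storage Group", "Second Storage Group"])
    for group, folder in plan.items():
        print(group, "->", folder)
```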

Restoring an offline backup is not generally recommended because it does not allow you to roll forward to the current status. However, it can be useful in a situation where a restore from an online backup has failed. The important thing to realize here is that the .edb and .stm files should be regarded as one and restored together in the same directory. Also, when performing the restore, you should delete all log and database files on the recovery server before copying over the new ones. The recovery server will create its own new log files when you start the services up again.

Full Exchange Server Recovery

At any location where a server running Exchange could fail, you will need hardware on which to perform the restore. Lack of redundant hardware can often be the most significant factor in downtime resulting from a full server failure. If you standardize your hardware for each Exchange 2000 server role, you can significantly reduce the number of standby Exchange 2000 computers required, and make recovery more of a standard procedure. In most cases the standby servers will need to be physically located at the data centers, so reducing the number of data centers can also reduce the amount of redundant hardware required to ensure fast recovery time.

Even if your Exchange Server computer suffers a catastrophic hardware failure, you should not lose the majority of its configuration information because this is stored in Active Directory and Active Directory will be available on many other servers.

You will need to ensure that you can quickly build the server to a point where you can install Exchange on it. This means that the server will need to be running the same version of Windows 2000 that Exchange was previously running on, have the same name as the previous server, and be a member of the same domain. One of the fastest ways of achieving this is to use disk imaging (as mentioned in a previous section).

After you have done this, it is not enough to simply re-install Exchange. This would fail, because the Exchange configuration information is already in Active Directory and a reinstall would try (and fail) to overwrite it. So instead, you should run setup with the /disasterrecovery switch. This switch assumes that the configuration information for Exchange is already in place and just installs the program files and registry settings. It searches Active Directory for information about the Exchange Server Object and reconfigures local settings according to what it finds.

When running the /disasterrecovery switch, it is important to ensure that you are aware of which components are installed on the system, because you need to explicitly state these when performing the recovery. This information should be in your Configuration Management database, but it will also be visible, of course, under the server object in Active Directory.

After you have recovered the Exchange Server, it will then be a matter of recovering stores (as described earlier), recovering the IIS metabase and potentially recovering the SRS, KMS, and CA databases.

Recovering Exchange After Active Directory Failure

One of your main concerns in operations should be to ensure that this never happens. As you will have already noticed, Exchange 2000 is completely dependent on Active Directory. If you are to fully protect your Exchange environment, you should do all you can to ensure that Active Directory is as resilient as possible.

However, this does not mean that an Active Directory failure makes Exchange 2000 completely unusable. You should be able to recover an Exchange Organization, provided you have access to information about the Exchange configuration, including the Exchange storage group and store names for each server, plus a key item, known as the legacyExchangeDN attribute for each administrative group. Although this attribute is designed to allow servers running Exchange 5.5 and Exchange 2000 to communicate with one another, the attribute is used by all servers running Exchange, and the legacyExchangeDN of the server must match that of its administrative group. It is worth noting however that this procedure across an enterprise will cost a huge amount of time, pain and money and should be avoided if at all possible.

Alternate Server Recovery

You may wish to perform alternate server recovery for a number of reasons. One of the most common is that you need to perform some sort of database maintenance and you want to check that the maintenance will not cause any problems with the database (you should note however, that unless the hardware of the alternate server is identical, you cannot guarantee that the same results will occur in the live environment). Another reason for alternate server recovery is to recover a mailbox that has expired from the Exchange Server.

Alternate server recovery is like rebuilding Exchange when you have lost Active Directory except on a much smaller scale. If you are recovering Exchange stores onto another server while the original server exists, the second server must be in a separate Windows 2000 forest. You will need to know the storage group and database names on the original server and the legacyExchangeDN of the administrative group the server belongs to.

You may end up performing alternate server recovery quite regularly. For example, you may undergo this process and then perform offline defragmentation of the database. If the offline defragmentation produces significant reduction in the database size, you could then take the production server offline to defragment it (depending on the terms of your SLA).

If alternate server recovery is a regular part of your operations routine (and it probably should be, as discussed in the next section), you should have a separate Windows 2000 forest set up permanently. This forest would contain administrative groups with all the correct legacyExchangeDNs already set up, so that you do not have to recreate them each time you perform an alternate server restore.

Recovery Testing

The key to reducing downtime during a restore procedure is to assume that system failure will happen and to be fully prepared for it when it does. This involves having hardware, software, and backup sets available. It also involves having staff trained in restore procedures available at all times.

In training your staff, you should note that restoring to another server while the original server is online is a very different procedure from the majority of disaster recovery procedures, because you have to recover to a different forest under those circumstances. The best way to simulate the type of restore you may have to perform in an emergency is to use a test network that is completely separate from the main network. This allows you to simulate anything from failures of stores to total hardware failures of servers, and to learn what to do under those circumstances.

However, this does not mean you should not perform alternate server restores. These restores give you other information, such as confirmation that the backup software, tapes, and storage procedures are working properly, and that a particular live database can be backed up and restored with no hitches. After all, there is no point in having highly trained staff to do a restore if the restores themselves will fail because of faulty tapes. You should ensure that every one of your databases has been restored to an alternate server at least once every six months.
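The six-month rule is easy to track if you record the date of each successful alternate server restore. The following is a minimal sketch, assuming such a record exists as a simple mapping of database name to date; the database names are hypothetical, and six months is approximated as 182 days.

```python
from datetime import date, timedelta

# Assumption: "six months" is approximated as 182 days.
MAX_AGE = timedelta(days=182)


def overdue_restore_tests(last_tested, today: date):
    """Return the sorted names of databases whose last test restore is
    older than MAX_AGE, or that have never been test-restored (None)."""
    return sorted(
        name
        for name, tested_on in last_tested.items()
        if tested_on is None or today - tested_on > MAX_AGE
    )


if __name__ == "__main__":
    history = {
        "Mailbox Store 1": date(2001, 1, 15),
        "Mailbox Store 2": None,
        "Public Folder Store": date(2001, 9, 1),
    }
    for name in overdue_restore_tests(history, date(2001, 10, 11)):
        print("Restore test overdue:", name)
```

A report like this could be reviewed as part of the regular restore schedule.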

The Operations Manager should be responsible for ensuring that the organization is fully prepared for disaster recovery. This involves regular restores being performed for each Exchange server and each backup device, by the staff who would be involved in restores when they are actively required.

Summary

In an ideal world, Exchange would never suffer problems. However, we live in a world of very diverse hardware and software, viruses and hackers, so it is inevitable that sometimes you are going to run into difficulties with your Exchange configuration.

As you have seen here, to help you meet your SLAs it is vital to minimize the recovery time in the event of a failure. However, in some cases it is very important that you shut down services to protect the system, even though this will affect the availability measurement defined in your SLA.

To protect operations against the unforeseen, it is important to factor unusual circumstances into your service level agreements. For example, you may have established that you have such resilient hardware and such efficient restore technology that you are able to achieve 99.998 percent uptime in your organization (around 10 minutes downtime across your organization per year). However, if a new virus hits your company because the anti-virus vendor hasn't informed you about it, then you may end up having to shut down Exchange services just to prevent more damage. You can deal with this eventuality in two ways. Either you can carry out a risk assessment on the effect of such unforeseen circumstances and reduce the SLA accordingly, or you can simply modify the SLA so it states that if the corporation is victim to an unforeseen hacker attack or virus attack and you are able to show that you used your best efforts to combat the problem, then this downtime will not count against the service level agreement.
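The relationship between an availability percentage and the downtime it allows is simple arithmetic. The sketch below assumes a 365-day year of continuous service; the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by the given availability target."""
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


def availability_percent(downtime_minutes: float) -> float:
    """Availability achieved over a year with the given total downtime."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # 99.998 percent allows roughly ten and a half minutes per year.
    print(round(allowed_downtime_minutes(99.998), 1))
    # A single four-hour shutdown to contain a virus outbreak
    # brings the yearly figure down to about 99.95 percent.
    print(round(availability_percent(4 * 60), 3))
```

The second calculation shows why a single protective shutdown can exceed a very tight availability target many times over, which is the case for handling such events explicitly in the SLA.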

Availability Management and Change Management (Chapters 2 and 3)

More Information

Microsoft Operations Framework (MOF) provides technical guidance and industry best practices that encompass the complete IT service management environment including security administration, disaster recovery (service continuity management), backup and restore (storage management), availability management, and change management.

For more information about Microsoft Operations Framework, go to the following Web site:

https://www.microsoft.com/technet/itsolutions/cits/mo/mof/default.mspx

For prescriptive MOF information on security administration, disaster recovery (service continuity management), backup and restore (storage management), availability management, and change management, please review the detailed operations guides that can be found at the following Web site:

https://www.microsoft.com/technet/prodtechnol/windows2000serv/default.mspx