Introduction to the Exchange 2003 High Availability Guide

 

Messaging systems are mission-critical components for many companies. However, circumstances such as component failure, power outages, operator errors, and natural disasters can affect a messaging system's availability. To help guard against such circumstances, it is crucial that companies plan and implement reliable strategies for maintaining high availability. As an added benefit, a highly available messaging system can save money by providing consistent messaging functionality to users.

Whether you are deploying a new Microsoft® Exchange Server 2003 installation or upgrading from a previous version of Exchange Server, this guide will help you plan and deploy a highly available Exchange Server 2003 messaging system.

Note

Many of the high availability recommendations in this guide are related directly to the planning recommendations in Planning an Exchange Server 2003 Messaging System. Before using this guide to implement your high availability strategy, you should first read Planning an Exchange Server 2003 Messaging System.

Who Should Read This Guide?

This guide is designed to benefit information technology (IT) professionals who are responsible for planning and designing Exchange messaging systems. These professionals include:

  • System architects   Those people who are responsible for designing the overall server infrastructure, developing server deployment strategies and policies, and contributing to networking connectivity design.

  • Information technology (IT) managers   Those people who are the technical decision makers and who manage the IT staff responsible for the infrastructure, the desktop and server deployment, and server administration and operations across sites.

  • System administrators   Those people who are responsible for planning and deploying technology for servers running Microsoft Windows Server™ 2003 or Microsoft Windows® 2000 Server and evaluating and recommending new technology solutions.

  • Messaging administrators   Those people who are responsible for implementing and managing organizational messaging.

What Technologies Does This Guide Cover?

The following is a list of technologies related to high availability. This guide provides general information about how these technologies relate to Exchange 2003 messaging systems. For specific information about each technology, refer to the corresponding URLs.

Terminology

Before reading this guide, familiarize yourself with the following terms.

  • Availability
    Availability refers to a level of service provided by applications, services, or systems. Highly available systems have minimal downtime, whether planned or unplanned. Availability is often expressed as the percentage of time that a service or system is available; for example, 99.9 percent for a service that is unavailable for approximately 8.76 hours per year.
  • Fault tolerance
    Fault tolerance is the ability of a system to continue functioning when part of the system fails. Fault tolerance is achieved by designing the system with a high degree of redundancy. If any single component fails, the redundant component takes its place with no appreciable downtime.
  • Mean time between failures (MTBF)
    Mean time between failures (MTBF) is the average time interval that elapses before a component fails and requires service. It is usually expressed in thousands or tens of thousands of hours (sometimes called power-on hours or POH).
  • Mean time to repair (MTTR)
    Mean time to repair (MTTR) is the average time interval, usually expressed in hours, that it takes to repair a failed component.
  • Messaging Dial Tone
    Messaging Dial Tone is a recovery strategy that provides users with temporary mailboxes so they can send and receive messages immediately after a disaster. This strategy quickly restores e-mail service in advance of recovering historical mailbox data. Typically, recovery will be completed by merging historical and temporary mailbox data.
  • Network Load Balancing (NLB)
    Available in all editions of the Windows Server 2003 operating system, Network Load Balancing (NLB) load balances incoming Internet Protocol (IP) traffic across server computers that are included in an NLB cluster. NLB enhances both the scalability and performance of IP-based programs such as Web servers, streaming media servers, firewall servers, and Exchange Server 2003 front-end servers.
  • Network-attached storage
    Network-attached storage refers to products that use a server-attached approach to data storage. In this approach, the storage hardware connects directly to the Ethernet network through Small Computer System Interface (SCSI) or Fibre Channel connections. A network-attached storage product is a specialized server that contains a file system and scalable storage. In this model, data storage is decentralized. The network-attached storage appliance connects locally to department servers, and therefore the data is accessible only by local servers. Because storage access and its management are removed from the department server, information contained on network-attached storage can be transferred more quickly: storage operations and server applications are not competing for the same processor resources.
  • Planned downtime
    Planned downtime is downtime that occurs when an administrator shuts down the system at a scheduled time. Because the downtime is scheduled, administrators can plan for it to occur at a time that least affects productivity.
  • Redundant array of independent disks (RAID)
    Redundant array of independent disks (RAID) is a method used to standardize and categorize fault-tolerant disk systems. RAID levels provide varying combinations of performance, reliability, and cost. Some servers provide three RAID levels: Level 0 (striping), Level 1 (mirroring), and Level 5 (striping with parity).
  • Reliability
    Reliability is an attribute of any computer-related component (for example, software, hardware, or network components) that consistently performs according to its specifications. You measure reliability by calculating the probability of failure for a single solution component.
  • Scalability
    Scalability is a measure of how well a computer, service, or application can grow to meet increasing performance demands. For server clusters, scalability is the ability to incrementally add one or more systems to an existing cluster when the overall load of the cluster exceeds its capabilities.
  • Server clustering
    A server cluster is a group of independent computers that work together to provide a common set of services. The Cluster service is the clustering feature provided with Windows Server 2003. If a cluster node (a computer in a cluster) fails, other nodes in the cluster assume the functions of the failed node.
  • Service level agreement (SLA)
    A service level agreement (SLA) is an agreement between a service provider and a customer that specifies, usually in measurable terms, what services the provider will furnish. More recently, IT departments in major enterprises have adopted the practice of writing an SLA so that services for their customers (users in other departments within the enterprise) can be measured, justified, and perhaps compared with those of outsourcing network providers.
  • Storage Area Network (SAN)
    A storage area network (SAN) is a private, sometimes high-performance network (or sub-network) that interconnects different kinds of data storage devices with associated data servers on behalf of a larger network of users. Typically, a SAN is part of the overall network of computing resources for an enterprise.
  • Unplanned downtime
    Unplanned downtime is downtime that occurs as a result of a failure (for example, a hardware failure or a system failure caused by improper server configuration). Because administrators do not know when unplanned downtime could occur, users are not notified of outages in advance (as they would be with planned downtime).
  • Windows Server 2003 clustering technologies
    Server clustering and Network Load Balancing (NLB) are two Windows Server 2003 clustering technologies that provide different availability and scalability solutions. For more information, see "Server clustering" and "Network Load Balancing" earlier in this list.
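
The arithmetic behind the "Availability", "Mean time between failures (MTBF)", and "Mean time to repair (MTTR)" entries can be illustrated with a short Python sketch. It is not part of the guide's own material. The function names are hypothetical, the year is assumed to be 8,760 hours (365 days), and the MTBF/MTTR relationship used is the commonly cited steady-state approximation for a single repairable component, which this guide does not itself state.

```python
# Illustrative sketch only: function names and the 365-day year are assumptions.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the hours per year a service is unavailable at a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


def availability_from_mtbf_mttr(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability (percent) for one repairable component.

    Uses the common approximation: availability = MTBF / (MTBF + MTTR).
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return 100.0 * mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Downtime budgets for several frequently quoted availability targets.
    for pct in (99.0, 99.9, 99.99, 99.999):
        print(f"{pct}% available -> {annual_downtime_hours(pct):.2f} hours down per year")

    # Hypothetical component: fails every 10,000 hours on average, 4 hours to repair.
    pct = availability_from_mtbf_mttr(mtbf_hours=10_000, mttr_hours=4)
    print(f"MTBF 10,000 h, MTTR 4 h -> {pct:.3f}% available")
```

Under these assumptions, a 99.9 percent target allows roughly 8.76 hours of downtime per year. Shortening the mean time to repair raises availability even when the failure rate stays the same, which is one reason recovery strategies such as Messaging Dial Tone matter alongside redundant hardware.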

For more information about Exchange terminology, see the Exchange Server 2003 Glossary.