Preventing Downtime with Redundant Components

Updated: November 14, 2002

Hardware redundancy prevents downtime caused by hardware failures in two ways: by detecting a failing component before it fails outright, and by bypassing a component when it does fail. To achieve server hardware redundancy, deploy server-class hardware, a redundant storage subsystem, and redundant network cards. This chapter discusses using redundant components within a server to prevent downtime. Chapter 2, "Overcoming Barriers to High Availability," discusses the importance of redundant network infrastructure components, such as switches and routers, in ensuring connectivity to the data center.

Using Server-Class Hardware

Server-class hardware monitors each critical server component for failure, notifies an administrator when a failure occurs, and includes redundant components that enable the server to work around the failure. Server-class hardware includes some or all of the following features:

  • Redundant power supplies — Redundant server and disk array power supplies provide a secondary power supply that takes over if the primary power supply fails.

  • Redundant fans — Redundant fans ensure that sufficient cooling exists inside the server if a cooling fan fails.

  • Redundant storage subsystem — A redundant storage subsystem provides protection against the failure of a single disk drive or controller. (See "Using a Redundant Storage Subsystem" later in this chapter.)

  • Redundant memory — Redundant memory provides a spare memory bank that takes over if an active memory bank fails.

  • ECC memory — Error-correcting code (ECC) memory detects and corrects single-bit errors and detects double-bit errors, taking the affected memory offline when a double-bit error occurs. (A toy sketch of this scheme appears after this list.)

  • Redundant network interface cards — Using redundant network interface cards (NICs) ensures that clients can connect to the data center if a NIC or a network connection fails. (See "Using Redundant Network Cards" later in this chapter.)

  • Power-on monitoring — When the server is initially turned on, it checks for startup failure conditions, such as abnormal temperatures or a failed fan.

  • Prefailure monitoring — While the server is running, prefailure conditions are monitored. If a component, such as a power supply, hard disk, fan, or memory, is beginning to fail, an administrator is notified before the failure actually occurs. For example, a single-bit error detected by ECC memory is corrected in place, or the failing bank is bypassed in favor of redundant memory, preventing a server failure. An administrator is immediately notified so that the memory problem can be rectified.

  • Power failure monitoring — When a power failure occurs, system shutdown software works in conjunction with an uninterruptible power supply (UPS) to ensure a graceful shutdown if necessary.
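
The correction and detection behavior of ECC memory can be illustrated with a toy Hamming code. The following is a minimal Python sketch, assuming 4 data bits protected by a Hamming(7,4) code plus an overall parity bit; production ECC memory protects 64-bit words with 8 check bits, and the encoding shown here is illustrative rather than any particular memory controller's scheme.

    # Minimal sketch of a single-error-correction, double-error-detection
    # (SECDED) code, the idea ECC memory is built on. This toy protects
    # 4 data bits; real ECC memory protects 64-bit words with 8 check bits.

    def encode(data):
        d1, d2, d3, d4 = data              # four data bits
        p1 = d1 ^ d2 ^ d4                  # covers codeword positions 1, 3, 5, 7
        p2 = d1 ^ d3 ^ d4                  # covers codeword positions 2, 3, 6, 7
        p4 = d2 ^ d3 ^ d4                  # covers codeword positions 4, 5, 6, 7
        word = [p1, p2, d1, p4, d2, d3, d4]
        overall = 0
        for bit in word:                   # overall parity enables double-error
            overall ^= bit                 # detection on top of Hamming(7,4)
        return word + [overall]

    def decode(word):                      # corrects `word` in place
        syndrome = 0
        for pos in range(1, 8):            # XOR the 1-based positions of set bits;
            if word[pos - 1]:              # a nonzero result locates a single error
                syndrome ^= pos
        overall = 0
        for bit in word:
            overall ^= bit                 # 1 means stored parity no longer matches
        data = lambda: [word[2], word[4], word[5], word[6]]
        if syndrome == 0 and overall == 0:
            return "no error", data()
        if overall == 1:                   # odd parity: exactly one bit flipped
            word[syndrome - 1 if syndrome else 7] ^= 1
            return "single-bit error corrected", data()
        return "double-bit error detected", data()

    stored = encode([1, 0, 1, 1])
    stored[5] ^= 1                         # simulate one flipped memory cell
    print(decode(stored))                  # corrected; data [1, 0, 1, 1] recovered
    stored[1] ^= 1
    stored[6] ^= 1                         # simulate two flipped cells
    print(decode(stored))                  # detected, but not correctable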

Advanced server-class hardware can also include the following data-protection features:

  • Lock-stepped processors, which are two processors that execute the same instruction stream and cross-check each other

  • End-to-end checksums on data sent to storage devices

  • Parity checking on internal buses

Server-class hardware minimizes the likelihood of hardware failures and uses monitoring and redundancy to limit the effect of a failure on data-center availability. It provides a number of notification options, including on-screen messages, server logs, e-mail, and pager notifications. Some servers also include a battery-powered card that stores failure information, which can be accessed even when a server has completely failed.

Using a Redundant Storage Subsystem

To ensure that your Microsoft SQL Server data center is highly available, disks containing the operating system, SQL Server system and user databases, and applications must be supported by a redundant storage subsystem. Even if the contents of a drive are read-only, it still takes time to replace a failed hard disk and to restore its contents. The data files and the transaction log files for these databases should be placed on separate disk arrays for maximum recoverability.

External disk arrays provide the best availability. The external disk array should contain redundant power supplies, redundant cooling fans, and redundant controllers. The disk array can be attached to the server by using either SCSI or Fibre Channel. Although SCSI is less expensive, Fibre Channel provides higher bandwidth and better performance. The disk array can be managed by the server or managed independently in a storage area network (SAN) using third-party software. A SAN is independent of any particular server or operating system and is attached to the servers that use it through a dedicated Fibre Channel network.

To provide disk redundancy in the event of a site failure, third-party hardware manufacturers provide remote mirror solutions that duplicate the locally redundant storage system to a remote site and provide transactional consistency between the sites. Remote mirror solutions require a SAN at each site and a dedicated fiber cable between sites that are no more than 100 kilometers apart. Although a remote mirror provides disk redundancy to a remote site, if you want a solution that provides automatic failover of the data center to a remote site, deploy a stretch cluster. For more information about stretch clusters, see Chapter 5, "Minimizing Downtime by Using Redundant Servers."

RAID

Configure the disks in an attached disk array or a SAN as a redundant array of independent disks (RAID). RAID has several implementations, each with a different level of fault tolerance and performance. The most common RAID implementations are:

  • Striping (RAID 0)

  • Mirroring (RAID 1)

  • Striping with distributed parity (RAID 5)

  • Striped mirroring (RAID 10 or RAID 1+0)

  • Mirrored striping (RAID 0+1)

Different vendors use different terminology for implementations of striped mirroring and mirrored striping; many use the terms RAID 10, RAID 1+0, and RAID 0+1 interchangeably and incorrectly.

Note: Many manufacturers support additional RAID implementations providing advanced capabilities. These additional implementations are referred to in different ways, including RAID 10E, RAID 30, RAID 50, and RAID 53.

Regardless of the RAID implementation you choose, have at least one drive installed as a hot spare so that there is continuous fault tolerance when an active drive fails. Disk array controllers and SAN devices can automatically substitute a failed drive with an installed spare. Replace the failed drive immediately so that you always have a hot spare.

Striping

Striping writes logical blocks of data across two or more disks, creating a single logical volume with no redundant information between the disks. Striping is very fast and inexpensive, but it provides no fault tolerance. Do not use striping in a high-availability data center.

Mirroring

Mirroring writes identical data to two or more disks, creating a single logical volume with completely redundant information on each disk. Mirroring provides a high level of availability; the mirrored volume can survive the failure of any disk in the mirror. Because each disk in the mirror holds a full copy of the data, the available capacity of the mirrored volume is equal to the size of a single disk. (In a two-disk mirror, 50 percent of the disk capacity is used for data protection.) With mirroring, read and write performance is asymmetrical. Read performance is very fast because data is available from each disk in the mirror, and it can be even faster if the RAID array or SAN controller supports simultaneous reads from each member of the mirror. Writing to a mirror is slower than writing to a single drive because the data must be written to multiple disks. If the disks are on separate I/O buses using separate controllers (a configuration known as duplexing), you can achieve higher availability by eliminating the controller as a single point of failure. Mirroring requires a minimum of two drives.

Striping with Distributed Parity

Striping with distributed parity writes logical blocks of data across multiple disks, creating a single volume with parity blocks spread equally among all disk drives. Parity blocks enable recovery of the data on the RAID volume if a single disk fails. If two disks fail, however, the RAID volume fails. Striping with distributed parity is less expensive than mirroring because only the capacity equal to one disk is used for data protection.

Read performance is faster than with mirroring because data is read simultaneously from multiple disks. Writes are slower than with mirroring because each write requires four I/O operations: reading the existing data block, reading the existing parity block, and then writing the new data and parity blocks. These I/O operations occur consecutively; each must wait for the previous operation to complete. Because striping with distributed parity is more economical than mirroring, it is the most common RAID configuration. When a disk fails, however, read and write performance immediately decreases: every read or write request that involves the failed drive must reconstruct the missing data from all of the remaining drives. This reduction in performance continues until the failed disk is replaced and the system completely rebuilds the RAID volume.
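
A short Python sketch makes the parity mechanics concrete. It shows only the XOR arithmetic, assuming one stripe of three data blocks plus a parity block; real controllers rotate the parity block across all drives, and the block contents here are illustrative.

    # Sketch of the XOR arithmetic behind distributed parity, assuming one
    # stripe of three data blocks plus a parity block.

    def xor_blocks(blocks):
        """XOR equal-length blocks together; used for parity and rebuild."""
        result = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                result[i] ^= byte
        return bytes(result)

    d0, d1, d2 = b"data-A", b"data-B", b"data-C"   # one stripe's data blocks
    p = xor_blocks([d0, d1, d2])                   # the stripe's parity block

    # If the disk holding d1 fails, XOR-ing all survivors rebuilds it. This
    # is why every request to a failed drive touches all remaining drives.
    assert xor_blocks([d0, d2, p]) == d1

    # The small-write penalty: updating one block costs four I/O operations.
    new_d1 = b"data-X"
    # I/O 1 and 2: read the old data block and the old parity block.
    # New parity = old parity ^ old data ^ new data, so the other data
    # blocks never need to be read.
    new_p = xor_blocks([p, d1, new_d1])
    # I/O 3 and 4: write new_d1 and new_p back to their disks.
    assert new_p == xor_blocks([d0, new_d1, d2])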

Striped Mirroring

Striped mirroring writes logical blocks of data across two or more mirrored sets of disks, creating a single volume with no redundant information between the mirrored sets. Each mirrored set consists of two or more disks. Striped mirroring provides the same high level of availability as mirroring. The available capacity of a striped mirror is equal to the capacity of one disk from each mirrored set. (With two-disk mirrored sets, 50 percent of the total disk capacity is used for data protection.) Unlike mirroring, however, striped mirroring provides symmetrical performance. Read performance is very fast because data is available from each disk in a mirrored set, and in striped mirroring implementations, RAID array and SAN controllers support simultaneous reads from each member of a mirrored set, which maximizes read performance. Write performance is also fast because writes occur simultaneously across the mirrored sets.

With striped mirroring, if a drive fails, read and write performance is slightly reduced, but only for data stored on the mirrored set containing the failed disk. Read and write performance to and from the other mirrored sets is not affected. Striped mirroring can survive the failure of multiple drives, provided at least one drive in each mirrored set survives.

Striped mirroring is the best solution for a database server that requires high performance and high availability. Cost can be a problem, however, because of the number of disks required to reach a given capacity.
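
The capacity trade-off among these implementations is simple arithmetic. The following sketch compares the usable capacity of a hypothetical array of six 72-GB disks under each implementation; the disk count and size are illustrative assumptions, and striped mirroring is assumed to use two-disk mirrored sets.

    # Sketch comparing usable capacity for the RAID implementations above,
    # assuming six identical 72-GB disks and two-disk mirrored sets.

    def usable_gb(implementation, disks, disk_gb):
        if implementation == "striping":               # RAID 0
            return disks * disk_gb                     # no redundancy at all
        if implementation == "mirroring":              # RAID 1
            return disk_gb                             # one copy is usable
        if implementation == "distributed parity":     # RAID 5
            return (disks - 1) * disk_gb               # one disk's worth of parity
        if implementation == "striped mirroring":      # RAID 1+0
            return (disks // 2) * disk_gb              # half the disks hold copies
        raise ValueError(implementation)

    for implementation in ("striping", "mirroring",
                           "distributed parity", "striped mirroring"):
        print(f"{implementation:20}: {usable_gb(implementation, 6, 72)} GB")

    # Output: striping 432 GB, mirroring 72 GB, distributed parity 360 GB,
    # striped mirroring 216 GB.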

Mirrored Striping

Mirrored striping writes logical blocks of data across two or more disks, creating a single logical volume with no redundant information between the disks, and then mirrors the volume. Read and write performance is very fast, but mirrored striping is not as fault tolerant as striped mirroring. When a single disk in a stripe fails, the data remains available in the other stripe, but the stripe is broken. With one broken stripe, the failure of a disk in the remaining stripe causes the data to become unavailable. In addition, when a stripe fails, read performance is slower because only one stripe is functional.

Note: Because many vendors confuse the terms RAID 10, RAID 1+0, and RAID 0+1, use the terms mirrored striping and striped mirroring. To maximize availability, use striped mirroring.
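
The difference in fault tolerance can be demonstrated by brute force. The following sketch enumerates every two-disk failure in a hypothetical six-disk array, laid out as three two-disk mirrored sets (striped mirroring) versus two mirrored three-disk stripes (mirrored striping); the layout is an illustrative assumption, not a vendor configuration.

    # Brute-force comparison of striped mirroring (RAID 1+0) and mirrored
    # striping (RAID 0+1) on the same six disks.
    from itertools import combinations

    MIRRORED_SETS = [(0, 1), (2, 3), (4, 5)]   # striped mirroring layout
    STRIPES = [(0, 1, 2), (3, 4, 5)]           # mirrored striping layout

    def striped_mirroring_survives(failed):
        # Survives unless some mirrored set loses all of its disks.
        return all(not set(pair) <= failed for pair in MIRRORED_SETS)

    def mirrored_striping_survives(failed):
        # Survives only while at least one stripe is fully intact.
        return any(not (set(stripe) & failed) for stripe in STRIPES)

    failures = [set(f) for f in combinations(range(6), 2)]
    print(sum(map(striped_mirroring_survives, failures)), "of", len(failures))
    print(sum(map(mirrored_striping_survives, failures)), "of", len(failures))

In this layout, striped mirroring survives 12 of the 15 possible two-disk failures, while mirrored striping survives only 6.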

Controller Cards

Use multiple SCSI or Fibre Channel controller cards in the external array for redundancy and to enhance performance through automatic load balancing. Choose cards with enough channels to split the logical grouping of disks — that is, data and logs — to reduce I/O contention.

Not all write caching is safe for a database server to use. Be sure that your disk controller prevents the uncontrolled reset of the caching controller, has on-board battery backup, and uses mirrored or ECC memory. Do not implement write caching unless the hardware vendor guarantees that the write cache includes all features required to prevent data loss.

If you are using failover clustering, an internal caching controller can cause data corruption unless the controller mirrors its cache to all nodes of the cluster. Without cache mirroring, the cache in the controller on a failed node may contain completed transactions that are unknown to the surviving node.

Data and Log File Placement

Place the data and log devices for each database on separate disks, using as many channels as possible to ensure redundancy in case a disk or mirrored set fails. Splitting the I/O between disks reduces performance problems caused by bottlenecks, which users can perceive as a lack of data-center availability. Use the following guidance when determining the type of storage redundancy for data and log files, as illustrated in the sketch that follows this list:

  • Data file drives — For databases larger than a single disk, use striped mirroring for all data files for maximum availability and maximum performance. For databases that can be contained on a single disk, use mirroring. For very large databases spanning many disks, use striping with distributed parity unless the additional cost of striped mirroring can be justified.

  • Transaction log file drive — Unless transaction log performance is an issue, use mirroring for all transaction log files. Where transaction log performance is an issue, use striped mirroring to increase performance. Do not use striping with distributed parity because of its poor write performance.

Note: The disks containing the operating system should be separate from the data and transaction log files. They should also be mirrored to ensure the operating system's availability.
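
This guidance can be summarized as a simple decision rule. The following sketch encodes it in Python; the function names and boolean inputs are hypothetical conveniences for illustration, not part of SQL Server or any vendor tool.

    # Sketch encoding the data and log file placement guidance above.

    def data_file_redundancy(fits_on_one_disk, very_large, mirroring_cost_justified):
        if fits_on_one_disk:
            return "mirroring"
        if very_large and not mirroring_cost_justified:
            return "striping with distributed parity"
        return "striped mirroring"

    def log_file_redundancy(log_performance_is_an_issue):
        # Never striping with distributed parity for logs: its four-I/O
        # write penalty hurts the write-heavy transaction log most of all.
        return "striped mirroring" if log_performance_is_an_issue else "mirroring"

    print(data_file_redundancy(False, False, True))   # -> striped mirroring
    print(log_file_redundancy(False))                 # -> mirroring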

Using Redundant Network Cards

To ensure that users can access the data center, use redundant network interface cards (NICs), and use NIC teaming to provide automatic failover between the NICs in the event of a failure. NIC teaming combines two or more physical NICs into a single logical NIC, which ensures that the data center always has an active link to the network. To use NIC teaming, connect each NIC to a different switch on the same subnet. NIC teaming requires software from the NIC vendor, and each NIC is configured to use a common virtual IP address. When all NICs are working properly, their combined bandwidth is pooled for increased performance. When a teamed NIC begins to fail, the software stops using the failing NIC and routes all network communication over the remaining NIC or NICs. This failover process is transparent to the operating system and to other devices on the network.
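
The failover behavior can be illustrated with a small simulation. The following Python sketch models the logic a teaming driver applies; the NicTeam class, NIC names, and IP address are hypothetical, and real teaming is implemented by vendor driver software, not application code.

    # Conceptual simulation of NIC teaming failover logic.

    class NicTeam:
        def __init__(self, virtual_ip, nics):
            self.virtual_ip = virtual_ip      # the one address clients ever see
            self.healthy = list(nics)         # physical NICs currently in use

        def mark_failed(self, nic):
            """Health checks detected a failing NIC: stop routing over it."""
            self.healthy.remove(nic)
            if not self.healthy:
                raise RuntimeError("all NICs in the team have failed")

        def send(self, packet_id):
            # Round-robin over healthy members pools their bandwidth; after
            # a failure, traffic transparently flows over the survivors.
            nic = self.healthy[packet_id % len(self.healthy)]
            return f"packet {packet_id} -> {nic} as {self.virtual_ip}"

    team = NicTeam("10.0.0.50", ["nic0", "nic1"])
    print(team.send(0))                       # spread across nic0 ...
    print(team.send(1))                       # ... and nic1
    team.mark_failed("nic0")                  # invisible to clients and the OS
    print(team.send(2))                       # all traffic now flows over nic1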