Adding redundancy to your storage design is critical to achieving high availability. RAID storage behind a battery-backed controller is highly recommended for all Exchange servers. There are many RAID types, and many proprietary modifications to the known RAID types. However, the four most common types used in server environments are RAID-1/0, RAID-5, RAID-6, and RAID-DP.
The following table compares RAID-1/0, RAID-5, and RAID-6 solutions based on speed, space utilization, and performance during rebuilds and failures.
Comparison of RAID solutions
| RAID type | Speed | Capacity utilization | Rebuild performance | Disk failure performance | Transactional I/O performance |
|---|---|---|---|---|---|
| RAID-1/0 | Best | Poor | Best | Best | Best |
| RAID-5 | Good | Best | Poor | Poor | Poor |
| RAID-6* | Poor | Good | Poor | Poor | Poor |
Note: *The performance for RAID-6 varies depending on disk layout, storage controller, and storage configuration. Talk to your storage vendor for detailed performance information for RAID-6 solutions.
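To make the capacity utilization column more concrete, the following minimal sketch computes the fraction of raw disk space left for data in a single RAID group. The function name `usable_fraction` and the eight-disk group size are illustrative assumptions, and the formulas are the generic textbook ones (half the disks for mirroring, one disk's worth of parity for RAID-5, two for RAID-6), not figures from any particular vendor.

```python
def usable_fraction(raid_type: str, disks: int) -> float:
    """Fraction of raw capacity available for data in one RAID group.

    Hypothetical helper for illustration; real arrays may also reserve
    space for hot spares or metadata, which is ignored here.
    """
    if raid_type == "RAID-1/0":
        if disks < 4 or disks % 2:
            raise ValueError("RAID-1/0 needs an even number of disks (at least 4)")
        return 0.5                      # every block is stored on two disks
    if raid_type == "RAID-5":
        if disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        return (disks - 1) / disks      # one disk's worth of parity
    if raid_type == "RAID-6":
        if disks < 4:
            raise ValueError("RAID-6 needs at least 4 disks")
        return (disks - 2) / disks      # two disks' worth of parity
    raise ValueError(f"unknown RAID type: {raid_type}")


# Assumed example: the same eight disks under each RAID type.
for raid in ("RAID-1/0", "RAID-5", "RAID-6"):
    print(f"{raid}: {usable_fraction(raid, 8):.1%} of raw capacity usable")
```

With eight disks this gives 50.0% for RAID-1/0, 87.5% for RAID-5, and 75.0% for RAID-6, which matches the Poor, Best, and Good ranking in the table.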
RAID-1/0
In RAID-1/0, data is striped (RAID-0) across mirrored (RAID-1) sets. RAID-0-1 is not the same as RAID-1/0, and we do not recommend RAID-0-1 for Exchange data. Transactional performance with RAID-1/0 is very good because either disk in a mirror can respond to read requests. No parity information needs to be calculated, so disk writes are handled efficiently; the only overhead is that each disk in the mirrored set must perform the same write.
When a disk fails in a RAID-1/0 array, write performance is not affected because the surviving member of the mirror can still accept writes. Reads are moderately affected because only one physical disk can now respond to read requests for that mirror. When the failed disk is replaced, the mirror is re-established, and the data must be copied or rebuilt.
RAID-5
RAID-5 involves calculating parity that can be used with surviving member data to re-create the data on a failed disk. Each write to a RAID-5 array can cause up to four physical I/Os (read the old data, read the old parity, write the new data, and write the new parity), and the parity calculation can consume controller or server resources. Transactional performance with RAID-5 can still be good, particularly when a storage controller calculates the parity.
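The four I/Os behind a single small write can be sketched as a read-modify-write on one stripe. This is a minimal illustration that assumes plain XOR parity and equally sized blocks; the names `Raid5Stripe`, `small_write`, and `xor_blocks` are hypothetical, and real controllers can avoid some of these I/Os with caching or full-stripe writes.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


class Raid5Stripe:
    """One stripe: several data blocks plus a single XOR parity block."""

    def __init__(self, data_blocks):
        self.data = [bytes(block) for block in data_blocks]
        self.parity = xor_blocks(*self.data)
        self.io_count = 0  # physical I/Os issued since the stripe was built

    def small_write(self, index: int, new_block: bytes) -> None:
        old_data = self.data[index]      # I/O 1: read the old data block
        old_parity = self.parity         # I/O 2: read the old parity block
        # Remove the old data from the parity, then fold in the new data.
        new_parity = xor_blocks(old_parity, old_data, new_block)
        self.data[index] = new_block     # I/O 3: write the new data block
        self.parity = new_parity         # I/O 4: write the new parity block
        self.io_count += 4


stripe = Raid5Stripe([b"AAAA", b"BBBB", b"CCCC"])
stripe.small_write(1, b"ZZZZ")
print(stripe.io_count)                            # 4
print(stripe.parity == xor_blocks(*stripe.data))  # True: parity is still consistent
```

The sketch updates the parity without reading the other data blocks in the stripe, which is why the cost stays at four I/Os regardless of how many disks are in the array.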
When a disk fails in a RAID-5 array, the array runs in a degraded state with lower performance and higher latencies. Most arrays spread the parity information evenly across all disks in the array, and the controller combines it with the surviving data blocks to reconstruct the missing data in real time. As a result, both reads and writes that involve the lost disk must access multiple physical disks, which increases latency and reduces performance for as long as the array is degraded. When the failed disk is replaced, the parity and surviving blocks are used to reconstruct the lost data, which is a lengthy process that can take hours or days. If a second member of the RAID-5 array fails during the Interim Data Recovery Mode or rebuild, the array is lost. Because of this vulnerability, RAID-6 was created.
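The degraded-mode behavior described above can be sketched with plain XOR parity, where a lost block is recovered by combining every surviving block in the stripe. This is an illustration under assumed values (three data blocks with made-up contents), not the algorithm of any specific controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Assumed stripe contents: three data blocks and their parity block.
data = [b"MAIL", b"BOX1", b"LOGS"]
parity = xor_blocks(data)

# The disk holding data[1] fails. Serving a read of that block now means
# reading every surviving block in the stripe, including the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])   # True
print(len(survivors))       # 3 physical reads instead of 1
```

A rebuild onto a replacement disk repeats the same calculation for every stripe in the array, which is why it competes with production I/O for so long.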
RAID-6
RAID-6 adds a second parity block and provides approximately double the data protection of RAID-5, but at the cost of even lower write performance. As physical disks grow larger, and RAID rebuild times consequently grow longer, RAID-6 is necessary in some cases to prevent logical unit number (LUN) failure if an uncorrectable error occurs during the rebuild, or if a second disk in the array group fails during the rebuild. Because of growing disk capacities, some vendors support RAID-6 instead of RAID-5.
Note: For more information about the Storage Network Industry Association definition of RAID-6, see SNIA Dictionary Links. The third-party Web site information in this topic is provided to help you find the technical information you need. The URLs are subject to change without notice.
RAID-DP
RAID-DP from NetApp is a proprietary implementation of RAID double parity for data protection. RAID-DP falls within the Storage Network Industry Association definition of RAID-6. RAID-DP is also a trademark of NetApp.
Unlike traditional RAID-6, RAID-DP uses diagonal parity with two dedicated parity disks in the RAID group. RAID-DP is similar to other RAID-6 implementations in its reliability metrics and its ability to survive the loss of any two disks; a third disk failure results in data loss. Whereas current RAID-6 implementations incur an I/O performance penalty as a result of introducing an additional parity block, RAID-DP is optimized to reduce read I/Os because of the way the NetApp controller handles parity write operations. Unlike other storage controllers that write changes to the original location, the NetApp controller always writes data to new blocks, which makes random writes appear to be written sequentially. It is important to follow NetApp best practices for sizing the array to ensure a consistent level of performance for Exchange implementations.
Note: For more information about RAID-DP, see "RAID-DP: Network Appliance Implementation of RAID Double Parity for Data Protection" at http://www.netapp.com/library/tr/3298.pdf and "Using NETAPP RAID-DP in Exchange Server 2007 Storage Designs" at http://www.netapp.com/library/tr/3574.pdf, or contact NetApp directly. The third-party Web site information in this topic is provided to help you find the technical information you need. The URLs are subject to change without notice.
Selecting a RAID Type
Selecting a RAID type is a balance of capacity, transactional I/O, and failure or rebuild performance characteristics. For example, mailbox size has a large impact on capacity, while smaller form factor disks impact performance. The type of RAID to select also depends on the data being stored and the controller being used. Transaction logs are the most important data set, and good write latency is critical for server performance. When using a storage controller that is RAID agnostic, transaction logs should be placed on RAID-1 or RAID-1/0 arrays with battery-backed write cache. For more information about the importance of quick, low-latency storage for the transaction logs, see Optimizing Storage for Exchange Server 2003. Likewise, when using a storage controller that is RAID agnostic, RAID-1/0 is the ideal configuration for databases, and it works well with large capacity disks.
In Exchange Server 2003, RAID-5 provided the best capacity efficiency, although its poorer performance seldom allowed the extra space to be used. As a result, in many Exchange 2003 deployments, more physical disks were required to meet the transactional performance requirements of RAID-5 than with RAID-1/0.
With Exchange 2007, database writes make up a larger percentage of database I/O, which causes RAID-5 LUNs to perform worse than in Exchange 2003. However, when you follow the recommendations for reducing transactional I/O, RAID-5 may be a good solution. RAID-5 works well with high-speed, smaller-capacity disks. In large mailbox solutions, RAID-5 may provide more transactional performance than you need while meeting the capacity requirements with fewer physical disks than RAID-1/0.
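The trade-off between capacity and transactional I/O can be sketched as a rough disk-count estimate: the array needs enough disks to satisfy whichever requirement is larger. Everything numeric below is an assumption for illustration (150 IOPS per disk, 300 GB disks, a 50 percent write ratio, and the two workloads), and the model treats the array as a single RAID group with no spares. The write penalties of 2 for RAID-1/0 and 4 for RAID-5 follow the descriptions earlier in this topic.

```python
import math

# Back-end I/Os generated by one host write.
WRITE_PENALTY = {"RAID-1/0": 2, "RAID-5": 4}


def disks_for_capacity(raid_type: str, usable_gb: float, disk_gb: float) -> int:
    """Disks needed to hold the data plus its redundancy (single group assumed)."""
    data_disks = math.ceil(usable_gb / disk_gb)
    if raid_type == "RAID-1/0":
        return data_disks * 2      # every data disk has a mirror
    return data_disks + 1          # RAID-5: one extra disk's worth of parity


def disks_for_iops(raid_type: str, host_iops: float,
                   write_fraction: float, disk_iops: float) -> int:
    """Disks needed to absorb the back-end I/O generated by the host load."""
    reads = host_iops * (1 - write_fraction)
    writes = host_iops * write_fraction * WRITE_PENALTY[raid_type]
    return math.ceil((reads + writes) / disk_iops)


def disks_required(raid_type, host_iops, write_fraction,
                   disk_iops, usable_gb, disk_gb) -> int:
    count = max(disks_for_capacity(raid_type, usable_gb, disk_gb),
                disks_for_iops(raid_type, host_iops, write_fraction, disk_iops))
    if raid_type == "RAID-1/0" and count % 2:
        count += 1                 # mirrors come in pairs
    return count


# Hypothetical workloads: (host IOPS, usable GB required).
workloads = {"small mailboxes, I/O bound": (2000, 1500),
             "large mailboxes, capacity bound": (500, 6000)}
for name, (iops, gigabytes) in workloads.items():
    for raid in ("RAID-1/0", "RAID-5"):
        n = disks_required(raid, iops, 0.5, 150, gigabytes, 300)
        print(f"{name}: {raid} needs {n} disks")
```

Under these assumed numbers, the I/O-bound workload needs 20 disks with RAID-1/0 but 34 with RAID-5, while the capacity-bound workload needs 40 disks with RAID-1/0 but only 21 with RAID-5. Real sizing should use measured I/O profiles and the storage vendor's guidance.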
For both RAID-5 and RAID-6, rebuild performance can have a significant effect on storage throughput. Depending on the storage array and configuration, a rebuild could cut storage throughput in half. Scheduling rebuilds outside of production hours can offset this performance drop, but doing so sacrifices reliability because the array stays unprotected for longer. In a CCR environment, you can keep the throughput reduction from affecting users by moving the Mailbox server to the passive node, which then becomes the active node. If neither option is available, additional I/O throughput should be designed into the architecture to accommodate RAID-5 or RAID-6 rebuild conditions during production hours. This additional I/O throughput can be up to twice the non-failed state I/O requirements.
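The rebuild headroom can be estimated with simple arithmetic: if a rebuild removes a fraction of the array's throughput, the healthy array must be sized so that the remainder still covers production load. The function name `design_iops` and the 3,000 IOPS workload below are assumptions for illustration; the actual throughput loss during a rebuild depends on the array and should come from the storage vendor.

```python
def design_iops(required_iops: float, rebuild_loss: float) -> float:
    """Healthy-state throughput needed to still deliver required_iops
    while a rebuild consumes rebuild_loss (0.0 to <1.0) of the array."""
    if not 0.0 <= rebuild_loss < 1.0:
        raise ValueError("rebuild_loss must be at least 0.0 and below 1.0")
    return required_iops / (1.0 - rebuild_loss)


# Assumed example: 3,000 IOPS of production load and a rebuild that
# halves throughput, the worst case mentioned above.
print(design_iops(3000, 0.5))   # 6000.0
```

A 50 percent loss therefore doubles the throughput that must be designed in, which corresponds to the upper bound of twice the non-failed state requirement.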