Storage Design for Exchange Server 2007

How Microsoft IT Exceeds High-Availability Targets with Large Mailboxes at Low Costs Based on New Storage Designs

Technical White Paper

Published: April 7, 2008

On This Page

Executive Summary
Introduction
Business Requirements
Advantages of DAS-Based Storage Designs
Microsoft IT Storage Design
Best Practices
Conclusion
For More Information

Executive Summary

More than 18 months after the first Microsoft® Exchange Server 2007 deployment in the corporate messaging environment and more than 12 months after completing the full production rollout across the entire company, the Microsoft Information Technology (Microsoft IT) group is able to report significant benefits such as:

  • Messaging service levels exceeding high-availability targets of 99.99 percent.
  • Cost reductions in excess of $10 million per year.
  • Increased mailbox quotas by up to a factor of 10.
  • Consolidation of the initial Exchange Server 2007 base by nearly a factor of two.

Microsoft IT was able to achieve these results by taking full advantage of new storage features and input/output (I/O) improvements in Exchange Server 2007, the latest advancements in 64-bit processor technology, and direct-attached storage (DAS)–based storage solutions.

One key strategy that accounts for more than $5 million in annual cost savings involved eliminating the need for backups to tape by relying on new high-availability features in Exchange Server 2007 such as cluster continuous replication (CCR) as the first level of protection, and Microsoft System Center Data Protection Manager 2007 as the second level of protection. Microsoft IT is not required to keep data on tape for archiving or other purposes. Moreover, according to an internal study conducted in 2006, Microsoft IT realized a 74 percent reduction of storage costs per gigabyte by replacing Storage Area Network (SAN) technology with DAS technology in the Mailbox server design. CCR enabled Microsoft IT to switch from SAN to DAS, which improved Microsoft IT's ability to support employee productivity by means of large mailboxes with quotas between 500 megabytes (MB) and 2 gigabytes (GB).

Microsoft IT pursued another key strategy that focused on driving down total cost of ownership (TCO) through server consolidation. Microsoft IT has already reduced the initial Mailbox server base in the corporate messaging environment by more than 45 percent, from 62 servers (124 cluster nodes) to 34 Mailbox servers (68 cluster nodes), and consolidation efforts continue. Before and after consolidation, Microsoft employees enjoy large mailbox capacities, fast server response times, and messaging services that exceed the required high-availability level of 99.99 percent and frequently reach 99.999 percent with no extra effort.

Exchange Server 2007 enables Microsoft IT not only to lower storage costs and increase mailbox quotas, but also to decrease storage complexity, regain full control over all aspects of the Mailbox server design (including the storage subsystem), eliminate maintenance overhead, and increase the availability of Mailbox servers. All storage-related issues that Microsoft IT has encountered since the initial production rollout of Exchange Server 2007 were recoverable without the need for backups. There have been no critical storage-related incidents affecting Mailbox server availability across the entire corporate messaging environment for more than 18 months.

The purpose of this white paper is to share Microsoft IT knowledge, experiences, and recommendations related to the architecture and design of Exchange Server 2007 Mailbox servers. This paper is not intended to serve as a procedural guide. Although many organizations have similar requirements, each enterprise environment also has unique requirements, making it necessary to adapt the information discussed in this paper.

This white paper assumes that readers are IT architects and technical decision makers who are already familiar with Windows Server® 2003, the Active Directory® directory service, and Exchange Server. Specifically, knowledge about SAN and DAS technologies, server clustering, and the high-availability features of Exchange Server 2007 is helpful. Detailed product information is available in the Microsoft Exchange Server 2007 Technical Library at http://technet.microsoft.com/en-us/library/bb124558.aspx.

Note: For security reasons, the sample names of forests, domains, organizations, and other internal resources mentioned in this paper do not represent real resource names used within Microsoft and are for illustration purposes only.

Introduction

“We learned through bitter experience that SAN redundancies cannot fully compensate for the critical single point of failure that shared storage represents in the clustered Mailbox server architecture. Exchange Server 2007 is the first product version that eliminates this critical single point of failure through CCR. We are further advancing this technology to provide our customers with even more flexibility in Exchange Server 2007 Service Pack 1 and future releases to continue the effort to decrease costs and increase service levels.”

Perry Clarke
Product Unit Manager
Exchange Server Product Group
Microsoft Corporation

Perry Clarke, the Product Unit Manager in the Exchange Server product group who is responsible for the technologies of the Mailbox server role, still remembers the time when CCR development began. The product team held CCR as a cornerstone in the vision for Exchange Server 2007 because this technology provides compelling answers to some of the most pressing enterprise customer needs, such as supporting significantly larger mailboxes at substantially lower costs. The development team was excited about this new technology and its potential to support larger mailboxes at lower storage costs, provide shorter failover times, reduce the need for restores from backup, and noticeably decrease storage complexity and eliminate maintenance overhead. Yet to the surprise of many in the product group, Microsoft IT did not share the enthusiasm. Failover clustering was never a question, but in the beginning of 2006, Microsoft IT was skeptical about the possibilities of using CCR on DAS in the Mailbox server design.

Microsoft IT hesitated to embrace CCR on DAS primarily for the following concerns:

  • Need to protect existing IT investments   The SAN environment at Microsoft IT represents considerable investment in technology that is not easily abandoned just because new technology emerges. Microsoft IT initially did not consider the shift to CCR on DAS as inevitable. In the beginning of 2006, Microsoft IT had not yet completed its plans to increase mailbox quotas from 200 MB up to 2 GB. Therefore, it was not immediately apparent that a properly designed SAN environment that accommodates these quotas requires approximately 30 times the existing storage capacity to hold 10 times more messaging data, including corresponding hardware Volume Shadow Copy Service (VSS)–based backups. The costs to increase SAN capacities by a factor of 30 would have been forbidding, especially when taking ongoing costs for capacity and performance management into consideration. In the absence of concrete numbers, it seemed more prudent to preserve the existing investment.
  • Desire to capitalize on existing expert knowledge   It is a strong Microsoft recommendation to use dedicated storage for Exchange Server to ensure a high transaction rate with low latencies and avoid unpredictable performance behavior, yet accommodating this requirement in a shared SAN environment poses complex configuration and performance optimization challenges. In close collaboration with storage vendors, Microsoft IT engineers developed best practices and actively helped enterprise customers with their SAN optimizations for Exchange Server. Microsoft IT engineers who had gained expert knowledge in the field of SAN optimizations for Exchange Server wanted to capitalize on this.
  • Perception that DAS was not an enterprise storage technology   Prior to Microsoft Exchange 2000 Server and SANs, Parallel Small Computer System Interface (Parallel SCSI) was state of the art in its various standards with thick cables, 50, 68, or 80-pin connectors, and performance, compatibility, scalability, and reliability issues. Serial Attached SCSI (SAS) began to replace Parallel SCSI by 2006, but for many at Microsoft IT, DAS was still synonymous with fragile connectors, bent pins, loose electrical contacts, and thick cables connecting a maximum number of only 16 devices. It was considered impossible to install 100 or 200 DAS drives in a Mailbox server to achieve high scalability. It was likewise unthinkable that a DAS hard disk drive could be more reliable than a SAN hard disk drive. By the end of 2007, SAS technology, surpassing Fibre Channel with higher interface speeds and lower failure rates, had come to market. In the beginning of 2006, however, when Microsoft IT started the Exchange Server 2007 production rollout, the SAS interface was still an emerging DAS technology.
  • Concerns that DAS would create storage silos and hidden operational costs   Another obstacle that prevented Microsoft IT from initially seeing CCR on DAS as a viable solution for Mailbox servers was the fact that DAS attaches directly to each cluster node, which creates individual storage silos. From a SAN point of view, it is an overwhelming proposition to create a large number of individual storage locations in the corporate messaging environment. In a SAN environment, ongoing costs for storage allocation, capacity management, performance management, and troubleshooting can quickly exceed the initial investment in hardware and installation. By assuming that this issue of hidden ongoing costs would also apply to DAS, Microsoft IT saw any initial DAS savings potential dwindle rapidly. Today, with the benefit of more than 18 months of operating CCR on DAS in production, it is easy to say that DAS storage is "designed once and never touched again." However, in early 2006, Microsoft IT was unable to verify that there is truly no need for DAS capacity and performance management beyond the initial storage design. Replacing broken disks, cables, or redundant array of independent disks (RAID) controllers is merely a part of standard hardware maintenance. Downtime due to storage or other node failures is less than two minutes of failover time in a properly designed, CCR-based Mailbox server, and data loss is greatly reduced due to redundant copies of messaging databases on individual cluster nodes. In fact, when CCR on DAS is compared with shared-storage clusters on SAN, it is noticeable that there is less chance for data loss and less need for database restores from backup because CCR eliminates the data instance used by the active node as a critical single point of failure. CCR on DAS also does not create new storage silos. It merely moves the existing storage silos (which dedicated, exclusive Exchange Server storage represents in a shared SAN environment) out of the high-maintenance, high-cost environment into a low-maintenance, low-cost alternative. Microsoft IT doubted these facts because they were unverifiable at the time.
  • Belief that CCR was not enterprise ready   It is an interesting proposal for an IT organization to commit fully to an emerging technology. However, this was the case for CCR in the beginning of 2006. CCR was a cornerstone in the Exchange Server 2007 vision as a key enabler of employee productivity through large mailboxes. Yet, Microsoft IT was concerned about possible implementation difficulties and delays because CCR was still in an early beta stage. Even without delays, Microsoft IT engineers did not readily take to the idea of relying on new technology with unknown scalability and reliability characteristics that would be at the very core of large Mailbox servers in the corporate messaging environment. CCR has proved its enterprise readiness over the past 18 months, enabling Microsoft IT to maintain 10 times more data on Mailbox servers with higher service availability levels. In early 2006, the development team could not yet prove the enterprise readiness of CCR for the simple reason that there was no enterprise deployment of CCR in existence.
  • Fear that replication latencies would introduce the potential for data loss   Microsoft IT concerns also revolved around the asynchronous nature of CCR, which can result in replication latencies and potential data loss. The scenario for data loss is straightforward: If the primary node receives a message and fails before Exchange Server 2007 replicates the data to the passive node, failover occurs, the passive node becomes active, and the Mailbox server has lost that e-mail message. When the development team suggested that the transport dumpster queue on Hub Transport servers addresses this issue by retaining and redelivering recent messages as needed, Microsoft IT insisted that the product group treat this feature as an intrinsic part of CCR. Microsoft IT did not want to take any chances with lost messages. The active node must be able to request redelivery from all Hub Transport servers in the local Active Directory site, and the Hub Transport servers must redeliver promptly so that no messages are lost after a failover.
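The data-loss window and the redelivery requirement described in the last concern can be illustrated with a toy model. The following Python sketch is not Exchange Server code. The class names `HubTransport` and `MailboxCluster`, the explicit `replicate()` step, and the duplicate check are all hypothetical stand-ins for asynchronous replication and the transport dumpster, chosen only to show why redelivery from every Hub Transport server closes the gap.

```python
from dataclasses import dataclass, field


@dataclass
class HubTransport:
    """Hypothetical Hub Transport server that retains recently delivered messages."""
    dumpster: list = field(default_factory=list)

    def deliver(self, cluster, message):
        self.dumpster.append(message)      # keep a copy for possible redelivery
        cluster.receive(message)


@dataclass
class MailboxCluster:
    """Toy two-node cluster with asynchronous replication to the passive node."""
    active: list = field(default_factory=list)
    passive: list = field(default_factory=list)

    def receive(self, message):
        self.active.append(message)        # only the active copy is updated at first

    def replicate(self):
        # Asynchronous replication brings the passive copy up to date.
        self.passive = list(self.active)

    def fail_over(self, hubs):
        # The passive copy becomes the active copy; unreplicated messages are missing.
        self.active = list(self.passive)
        # The new active node requests redelivery from every Hub Transport server.
        for hub in hubs:
            for message in hub.dumpster:
                if message not in self.active:   # skip messages already present
                    self.active.append(message)
        self.passive = []                  # the failed node still has to be repaired


hubs = [HubTransport(), HubTransport()]
cluster = MailboxCluster()

hubs[0].deliver(cluster, "msg-1")
cluster.replicate()                        # msg-1 reaches the passive node
hubs[1].deliver(cluster, "msg-2")          # the active node fails before msg-2 replicates

lost_without_dumpster = [m for m in cluster.active if m not in cluster.passive]
cluster.fail_over(hubs)
print(lost_without_dumpster)
print(sorted(cluster.active))
```

In this model, the message that had not yet replicated is recovered only because the new active node queries both Hub Transport servers, which mirrors the requirement that redelivery cover all Hub Transport servers in the site.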

During January and February of 2006, emotions ran high between the Exchange Server product group and Microsoft IT. Decisions changed literally every day. One day Microsoft IT would agree to deploy CCR on DAS, and the next day it would revert to its plans for SAN-based single copy clusters (SCCs). In the end, the question was not settled through debate. In the middle of the heated debate, a SAN storage array failure occurred, taking down multiple Mailbox servers, and causing an outage and the loss of 8,000 production mailboxes. It took three days to bring the systems back online, and the worst news was yet to come. Through a combination of unfortunate circumstances, the most recent tape-based backups were also irretrievably lost. Microsoft IT was unable to restore the most recent data, and 8,000 users, including employees, partners, contractors, and vendors, lost e-mail data. It was a horrible week for Microsoft IT and the Exchange Server product group alike. It showed not only the critical nature of shared storage as a single point of failure in the Mailbox server architecture, but also the vulnerability of an IT organization if it has to depend on tape-based backups as the primary means to recover from storage failures.

Konstantin Ryvkin is a Senior Technology Architect at Microsoft IT and a member of the Exchange Messaging Engineering team responsible for the design of the corporate messaging environment. Looking back at that time, he says that the disaster and the painful recovery made Microsoft IT a stronger IT organization. It highlighted areas where previous technology and designs were no longer meeting business needs, and it opened doors for more rapid innovation and adoption of new technology. It also renewed the spirit of Microsoft IT to be the first and foremost customer of Microsoft, deploying new technologies at full scale in the corporate messaging environment to provide real-time feedback to the product teams and then to deliver solid proof of the product's enterprise readiness to customers. Microsoft IT did not commit to CCR on DAS merely because of a storage failure on SAN-based Mailbox servers, but because it was important to demonstrate the enterprise readiness of CCR.

Despite a lingering sense of trepidation, a consensus was reached about the need to move forward. In the initial Exchange Server 2007 Mailbox server designs, Microsoft IT cautiously used CCR on DAS at a moderate scale of 2,000 mailboxes with 500-MB quotas. Six months later, that scale increased to 6,000 mailboxes on most servers, representing a total of approximately 5 terabytes of data. It has now reached more than 12 terabytes on Mailbox servers for 4,000 heavy users with 2-GB quotas. CCR on DAS is an absolute success at Microsoft IT, and the Exchange Messaging Engineering team continues to explore product capabilities with new Mailbox server designs. So far, Exchange Server 2007 has not reached its limits, yet there are factors, such as recovery time objectives (RTOs) and recovery point objectives (RPOs), that require Microsoft IT to revamp backup and disaster recovery procedures before placing more than 12 terabytes of messaging data on a Mailbox server.

Business Requirements

While preparing for the Exchange Server 2007 production rollout in 2006, Microsoft IT analyzed internal messaging statistics and trends, assessed user demographics, clarified legal and regulatory requirements, and reviewed existing service level agreements (SLAs) for messaging in order to identify important business requirements. The messaging statistics and trends clearly showed a need for increased storage capacities in the messaging environment. The mailbox count grew by approximately 75 percent over the five years prior to the deployment of Exchange Server 2007, as indicated in Table 1. It was clear that this trend would continue after the rollout. Furthermore, corporate management began to encourage employees to stop using personal folders to archive messages. Based on data gathered from surveys and during pilot projects, Microsoft IT determined that depending on job responsibilities, users would require mailbox capacities of up to 2 GB.

Table 1. Microsoft Messaging Statistics and Projections

| Category | 2002/2003 | 2003/2004 | 2004/2005 | 2005/2006 | 2006/2007 | 2007/2008 |
| --- | --- | --- | --- | --- | --- | --- |
| Total mailboxes | 71,000 | 80,000 | 95,000 | 110,000 | 130,000 | 147,000 |
| Microsoft Exchange ActiveSync® users per month | Not applicable | 6,000 | 13,000 | 21,000 | 31,000 | 48,000 |
| Outlook® Anywhere users per month | Not applicable | 20,000 | 25,000 | 60,000 | 60,000 | 100,000 |
| Internet message submissions per day (unfiltered) | 6,000,000 | 9,000,000 | 11,300,000 | 13,000,000 | 13,500,000 | 30,000,000 |
| Blocked message submissions per day | 2,500,000 | 7,500,000 | 10,000,000 | 10,500,000 | 11,000,000 | 28,000,000 |
| Maximum message size | 2 MB | 5 MB | 10 MB | 10 MB | 10 MB | 10 MB |
| E-mail volume per user per calendar day | 10 MB | 15 MB | 15 MB | 20 MB | 20 MB | 26 MB |
| Number of clustered Mailbox servers | 113* | 38 | 34 | 30 | 62 | 34 |
| Typical mailbox quota | 100 MB | 200 MB | 200 MB | 200 MB | 500 MB or 2 GB | 500 MB or 2 GB |
| Total mailbox data | 7 terabytes | 17 terabytes | 19 terabytes | 22 terabytes | 60 terabytes | Up to 300 terabytes |

*   Mostly non-clustered Mailbox servers

Note: This table is an updated version of Table 2 from the Microsoft IT Showcase Note on IT "Going 64-bit with Microsoft Exchange Server 2007," available at http://www.microsoft.com/downloads/details.aspx?FamilyID=f31e7541-f63a-4b7d-b8d2-3794c4dc3329&DisplayLang=en.

Increased Mailbox Capacities

Facing the requirement to increase mailbox capacities by up to a factor of 10 while maintaining existing SLAs, Microsoft IT looked closely at messaging demographics to assess individual user needs more accurately. Figure 1 shows the demographic profile of the corporate messaging environment based on the categories defined in the Exchange Server 2007 product documentation. The product documentation uses an average message size of 50 kilobytes (KB), which corresponds to the typical message size in the corporate messaging environment (43 KB to 50 KB). The user profiles enabled Microsoft IT to make reasonable assumptions regarding mailbox capacity requirements.

Figure 1. Demographic messaging profile at Microsoft

The demographic reveals that not all Microsoft users require the largest mailboxes to maintain their productivity levels. In fact, only about one-third of the users are very heavy and heaviest e-mail message recipients with the largest mailbox requirements, about 15 percent of the users are heavy users that can benefit from 2-GB mailboxes, and the majority are light and medium users with moderate messaging needs. For example, most Microsoft partners and vendors with internal accounts belong to the light and medium user groups. Accordingly, Microsoft IT defined that 33 percent of the users require 2-GB mailbox quotas and 66 percent require 500-MB quotas as a starting point. Full-time employees with 500-MB mailboxes have the ability to request 2-GB mailbox quotas if needed. This arrangement directly influenced the Mailbox server designs that Microsoft IT created for the initial production rollout in 2006.
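As a rough illustration of how this quota split translates into capacity, the following Python sketch sums the quotas across all mailboxes. It assumes the 130,000 mailboxes listed for 2006/2007 in Table 1 and decimal units (1 terabyte = 1,000 GB). The function name `quota_ceiling_tb` is hypothetical, and the result is only an upper bound on primary mailbox data, not a storage design figure.

```python
def quota_ceiling_tb(total_mailboxes, split):
    """Return the sum of all mailbox quotas in terabytes (1 TB = 1,000 GB assumed).

    split maps a quota in GB to the fraction of mailboxes that receive it.
    """
    total_gb = sum(total_mailboxes * fraction * quota_gb
                   for quota_gb, fraction in split.items())
    return total_gb / 1000


# 33 percent of users at 2 GB and 66 percent at 500 MB, as stated above.
split = {2.0: 0.33, 0.5: 0.66}
ceiling = quota_ceiling_tb(130_000, split)   # mailbox count taken from Table 1 (2006/2007)
print(f"{ceiling:.1f} TB")
```

The ceiling does not account for deleted-item retention, database overhead, or the second database copy that CCR maintains, all of which add to the storage that a Mailbox server design must provide.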

During the initial rollout in the production environment, Microsoft IT used three different Mailbox server designs to accommodate the user need for larger mailboxes. The most common server type, based on CCR on DAS, supported 2,000 users with mailbox quotas of 500 MB. The other two server types supported 2,400 and 3,600 users with 2-GB quotas. Microsoft IT deployed the server type based on CCR on SAN for 3,600 users with 2-GB quotas only during the Beta 1 stage and switched entirely to the CCR-on-DAS design for 2,400 users with the availability of Exchange Server 2007 Beta 2. For details about the three original Exchange Server 2007 Mailbox server designs, see the Note on IT "Going 64-bit with Microsoft Exchange Server 2007," available at http://www.microsoft.com/downloads/details.aspx?FamilyID=f31e7541-f63a-4b7d-b8d2-3794c4dc3329&DisplayLang=en.

Service Level Agreements

Several factors caused Microsoft IT to approach the initial server design cautiously. Most importantly, CCR was still in beta and therefore not yet a trusted technology. Additionally, the only backup solution available at the time for Microsoft IT Mailbox servers based on CCR on DAS relied on Windows® Backup and the streaming backup API, which has limited throughput capabilities and requires online backup operations to run on the active node. Performing online backups on the passive node requires VSS-based technology, such as System Center Data Protection Manager 2007, but that product was not available yet. Furthermore, the dual-core processors and server models that Microsoft IT used during the initial rollout limited server scalability. Microsoft IT was not yet able to place 6,000 or more mailboxes on a server without jeopardizing SLAs.

Table 2 summarizes the organization-wide SLAs with a business importance level (BIL) of Important that influence the design of Mailbox servers in the corporate production environment.

Table 2. Organization-Wide SLAs with Impact on Mailbox Server Designs

| Service level definition | Resolution target | Comments |
| --- | --- | --- |
| End-to-end availability of messaging services | 99.99 percent or greater | This SLA gives an end-to-end view of messaging as a managed service and includes Mailbox server availability as well as the availability of Client Access servers, Active Directory, and the network infrastructure. On Mailbox servers, Microsoft IT measures the availability of system services based on stop and start events and the availability of messaging databases based on events generated by the Exchange Information Store service. |
| End-to-end client availability | 99.5 percent or greater | This SLA defines client availability as the percentage of successful remote procedure call (RPC) activity in relationship to failed RPC activity between Microsoft Office Outlook clients or Client Access servers and Mailbox servers. |
| End-to-end client performance | 95 percent or greater | This SLA requires RPC client/server operations between Outlook clients or Client Access servers and Mailbox servers to finish in less than two seconds. |
| Business will continue with messaging service | One hour or less | This SLA requires individual database restores through reseeding or from backup to finish in less than one hour. |
| Retention of mailbox database backups | 14 days or more | This SLA enables Microsoft IT to discard Exchange Server backups after 14 days. Microsoft IT does not use database backups for archiving purposes. |
| Retention of deleted items | 14 days or more | This SLA influences the calculation of server capacity needs, as explained later in this white paper. |
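The availability and client targets in Table 2 reduce to simple ratios. The following Python sketch shows one way to express them. The function names are hypothetical, the year is taken as 365 days, and client availability is interpreted here as successful RPC operations divided by all RPC operations, which is an assumption about how the ratio in the table is computed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_percent):
    """Maximum downtime per year that still meets the availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def client_availability(successful_rpcs, failed_rpcs):
    """Percentage of successful RPC operations out of all RPC operations."""
    return 100 * successful_rpcs / (successful_rpcs + failed_rpcs)


def client_performance(latencies_seconds, threshold_seconds=2.0):
    """Percentage of RPC operations that finish in less than the threshold."""
    fast = sum(1 for latency in latencies_seconds if latency < threshold_seconds)
    return 100 * fast / len(latencies_seconds)


budget_four_nines = downtime_budget_minutes(99.99)
budget_five_nines = downtime_budget_minutes(99.999)
print(round(budget_four_nines, 2), round(budget_five_nines, 2))
```

Under these assumptions, a 99.99 percent target allows roughly 52.6 minutes of downtime per year, and 99.999 percent allows roughly 5.3 minutes, which puts the failover times of less than two minutes cited in this paper into perspective.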

 

It is important to note that Microsoft IT also maintains RPOs and RTOs as a best-effort commitment. Both the current RPO and the current RTO for complete server restorations are less than 24 hours. However, these RPOs and RTOs remain from the Exchange Server 2003 time frame and were defined long before continuous-replication technology became a reality. They do not address the new capabilities. For example, the RPOs and RTOs do not clearly define a maximum time for reseeding an entire server to re-enable full resiliency after a total node failure. This aspect is currently covered only informally by the complete server RTO of less than 24 hours.

The development of new RPOs and RTOs is a work in progress. Microsoft IT is also creating new disaster recovery procedures to take full advantage of CCR and possibly standby continuous replication (SCR). Microsoft IT is targeting an RTO of 12 hours and an RPO of less than one hour in case of a complete data-center loss. Although these targets seem very achievable, Microsoft IT technology architects still need to verify the new disaster recovery procedures within and across data centers with more than 12 terabytes of messaging data on a Mailbox server.
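Part of what remains to be verified is a throughput question: how fast data must move to reseed or restore a large server within a recovery window. The following Python sketch computes the sustained rate needed for a given data volume and window. It assumes decimal units (1 terabyte = 1,000,000 MB), the function name is hypothetical, and the result ignores transaction log replay, verification, and any other recovery steps.

```python
def required_throughput_mb_per_s(data_tb, window_hours):
    """Sustained throughput (MB/s) needed to copy data_tb terabytes within window_hours.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    return data_tb * 1_000_000 / (window_hours * 3600)


for hours in (24, 12):
    rate = required_throughput_mb_per_s(12, hours)
    print(f"12 TB in {hours} h needs about {rate:.0f} MB/s sustained")
```

The 24-hour window corresponds to the current complete server RTO. The 12-hour window is the target discussed above for a complete data-center loss and is included only to show how the required rate scales when the window is halved.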

Advantages of DAS-Based Storage Designs

“With more than 18 months of production use, I can say that CCR on DAS is a great solution for a Mailbox server platform running Exchange Server 2007, especially for environments looking for highly available and cost-efficient designs. Through the deployment at Microsoft IT, CCR on DAS proved to perform exceptionally well in high-availability scenarios at the impressive scale far exceeding our initial expectations. Now nobody in the Microsoft IT Messaging team would consider reverting back to the previous SAN-based architecture—we get everything that the business demands from e-mail on the DAS-based CCR platform.”

Konstantin Ryvkin
Senior Technology Architect
Microsoft IT Exchange Messaging
Microsoft Corporation

From the business point of view, the single most important advantage of CCR on DAS over any other storage solution for Mailbox servers is the possibility to have substantially increased mailbox sizes with substantially decreased storage costs while maintaining or exceeding existing high-availability levels. For example, DAS enables Microsoft IT to support mailbox quotas of up to 2 GB with a TCO that is comparable to maintaining 200-MB mailbox sizes on SAN. Microsoft employees can use large mailbox capacities in conjunction with optimized, fast, and reliable search capabilities that are built in to Exchange Server 2007. Microsoft employees can store more than a year's worth of e-mail messages in their mailboxes and do not have to use personal folder stores (in .pst files) to move out messages as data volumes approach mailbox limits. Having all messaging data directly on the Mailbox server facilitates data maintenance, backups, and server-based content indexing and search, and reduces security risks and support costs. Having all data on the server also means Microsoft IT can apply centrally maintained policies to ensure compliance with government and company regulations, and users can access all of their messages from any capable device in the office, at home, and on the road. Outlook Anywhere, Microsoft Office Outlook Web Access, and Exchange ActiveSync are preferred in locations with Internet connections, and Unified Messaging provides access in all other locations, as long as at least a stationary or mobile phone is available.

For the technical decision makers at Microsoft IT, one of the most important initial concerns about CCR on DAS is now one of the most compelling reasons for CCR on DAS: maintaining large data volumes on Mailbox servers with high availability. With 10 times more data on the servers and an ever-increasing number of users, it becomes increasingly difficult to rely on backup and restore operations as the primary means to recover from storage failures and maintain existing high-availability SLAs, specifically Microsoft IT RTOs that demand restoring a Mailbox server with all messaging databases in less than 24 hours. CCR shifts the focus from backups to failovers as the primary recovery mechanism after a server or storage failure and is a key element in Microsoft IT's long-term strategy around large mailboxes.

CCR on DAS provides Microsoft IT with the following advantages in the Mailbox server design (explained in more detail in subsequent sections):

  • Increased Exchange data resilience   Failover is the main method to recover from storage failures on the primary node. The primary node has a built-in application-level mechanism that replicates and maintains synchronized copies of messaging databases on separate server nodes and provides failover times of less than two minutes. The Mailbox server can continue to run on the second node with only a short service interruption, whereas none of the nodes on an SCC-based server cluster can keep the Mailbox server running after a storage failure.
  • Ongoing backup operations during regular business hours   Maintaining separate synchronized data copies on the active and the passive node implies that it is possible to perform ongoing backup operations for a Mailbox server on the passive node without affecting users accessing their mailboxes on the active node. Microsoft IT uses Data Protection Manager 2007, which is fully CCR aware and supports passive-node backup operations every 15 minutes.
  • Reduced reliance on traditional backups to restore data   With CCR-based clusters, Microsoft IT uses backup as a secondary tool to ensure the recoverability of data in the event that storage on the second node also fails. This is an unlikely scenario because Microsoft IT attends to any primary node failure immediately. This is possible because Microsoft IT continuously monitors all Mailbox servers in the corporate messaging environment by using Microsoft System Center Operations Manager 2007. In the event of a node failure, Operations Manager 2007 automatically alerts front-line operators, who take prompt action. With the Mailbox server running on the second node, Microsoft IT can repair the failed node and then reseed the messaging databases if necessary without having to rely on restoring from backups. Reseeding is seldom necessary because the Extensible Storage Engine (ESE) supports automatic recovery mechanisms based on transaction log file replay and can sustain most node failures.
  • Simplicity in the storage design and low maintenance overhead   Unlike SANs, which are complex storage systems maintained by highly specialized engineers, DAS is very straightforward and requires only minimal skills that every Exchange administrator can master. Microsoft IT uses external enclosures for hard disk drives and dedicated RAID controllers in each cluster node. It is not necessary to use identical hardware between cluster nodes. Only the drive letter assignments must match. In fact, Microsoft supports CCR-based Mailbox servers with any appropriate components and server models from the standard Hardware Compatibility List (HCL), up to the point that the cluster nodes can be from different server vendors. However, it is important to note that Microsoft IT uses identical configurations so that failover to the passive node can occur without sacrificing performance.
  • Predictable Mailbox server performance   The local nature of the DAS solution ensures optimal performance because the local cluster node can utilize 100 percent of the storage resources. There is no need for shared-capacity management because a single server application, such as Exchange Server 2007, uses the storage resources exclusively. The DAS-based Mailbox server design implicitly conforms to the Microsoft recommendation to provide dedicated storage for Exchange Server operations.

Increased Exchange Data Resilience through Redundancies

Figure 2 shows the architecture of the SAN-based server cluster configuration that Microsoft IT used prior to Exchange Server 2007 in the corporate messaging environment. Four active nodes correspond to four Exchange Server 2003 Mailbox servers, and one primary passive node is available to run one of these Mailbox servers without performance penalties after a failover. The remaining two passive nodes are less powerful and serve primarily as systems to perform tape-based backup operations. As illustrated, the SAN environment is fully redundant at the hardware level from the cluster nodes all the way down to the storage media. Any single component can fail along the path to the data without interrupting the service. After a timeout of the pending I/O operation, the SAN software automatically switches to the second path and the Mailbox server continues to run on the same cluster node.

Figure 2. Microsoft IT server cluster configuration for Exchange Server 2003 Mailbox servers

Despite the complexity and redundancy of SANs, Exchange databases remain critical single points of failure. The SAN-based cluster can provide a high level of availability for the messaging databases and the server cluster can recover from hardware failures on up to three cluster nodes through a failover, but this server cluster cannot recover from a critical storage failure. When SAN storage fails for any reason, none of the seven cluster nodes can run the affected Mailbox server or servers until Microsoft IT repairs the system and restores the messaging databases from backup. Although it is unlikely that both disks in a particular mirror set simultaneously break to cause a RAID 10 failure, hardware failures and firmware issues can lead to SAN outages. Human error is also a possibility. In fact, human error is the biggest risk factor due to the complex nature of SANs. It might take a long time until a storage failure happens, but when it happens, the best strategy is to have a second, synchronized data copy readily available. CCR provides this solution.

As illustrated in Figure 3, the CCR-based architecture is very straightforward and fully redundant at the Exchange database level. Microsoft IT connects multiple storage enclosures to each cluster node and creates RAID 10 drives with mirror sets across the enclosures, as explained in more detail later in this paper. At the Exchange database level, this results in twice the redundancy in comparison to the SAN-based Mailbox server configuration illustrated in Figure 2.

Figure 3. CCR-based Mailbox server configuration

Notably, CCR on DAS delivers twice the redundancy at the Exchange database level with less storage sophistication and complexity on the individual cluster nodes in comparison to SCC on SAN. As a Microsoft IT engineer put it, two simple storage systems can be better than one complex system. In a SAN-based configuration, every cluster node has two host bus adapters connected to separate Fibre Channel fabrics, switches, and controller pairs, and the Fibre Channel disks are dual ported. Yet with CCR on DAS, Microsoft IT uses SAS disks with a single port and only a single RAID controller in each cluster node per storage unit, as shown in Figure 3. At the Exchange database level in the overall Mailbox server design, the RAID controllers are redundant, but at the level of individual cluster nodes, the RAID controller is a local single point of failure. Eliminating this local single point of failure requires a CCR on SAN configuration with two host bus adapters in each cluster node connecting each node to a separate SAN storage array. Microsoft IT used this configuration in very early Exchange Server 2007 Mailbox server designs, but switched entirely to CCR on DAS with subsequent deployments during the initial production rollout for cost reasons.

Because a controller failure triggers a failover, CCR on DAS cannot sustain a controller failure on a cluster node without service interruption. However, RAID controller failure rates are low: Microsoft IT deployed 90 SAS RAID controller cards and had zero controller failures during the past 12 months. Considering this track record and the low impact on the end-to-end availability of messaging services, Microsoft IT finds a service interruption of less than two minutes of failover time acceptable. For Microsoft IT, the costs to deploy CCR on SAN far outweigh the benefit of eliminating these two minutes in the unlikely event that a RAID controller fails. SCC on SAN is not an alternative because SCC on SAN cannot recover from storage failures and requires Microsoft IT to restore data from backup, whereas CCR on DAS can sustain a storage failure because the data is readily available on the second node.

Note: Microsoft IT initiates the vast majority of failovers manually by using the Move-ClusteredMailboxServer cmdlet during planned and unplanned maintenance, such as to install mandatory security patches or update driver versions on cluster nodes. CCR on SAN has no advantage over CCR on DAS in this scenario because the failovers are unavoidable and the duration of the service interruption is comparable.
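
To put the two-minute failover into perspective, the following minimal Python sketch relates it to the 99.99 percent availability target cited in the executive summary. It assumes that availability is measured per clustered Mailbox server over a 365-day year. That measurement period and the function names are illustrative assumptions, not the Microsoft IT SLA definition.

```python
# Assumption: availability is measured over a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_target: float,
                            period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Return the downtime that the availability target allows in the period."""
    return period_minutes * (1.0 - availability_target)


def budget_share(outage_minutes: float, availability_target: float) -> float:
    """Return the fraction of the downtime budget that one outage consumes."""
    return outage_minutes / downtime_budget_minutes(availability_target)


budget = downtime_budget_minutes(0.9999)
share = budget_share(2.0, 0.9999)
print(f"Downtime budget at 99.99 percent: {budget:.2f} minutes per year")
print(f"One two-minute failover uses {share:.1%} of that budget")
```

Under these assumptions, the yearly budget is roughly 52.6 minutes, so a single two-minute failover consumes slightly less than 4 percent of it.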

Ongoing Backup Operations During Regular Business Hours

CCR technology does not automatically eliminate the need for backups. There is a lack of redundancy if the storage subsystem on the primary node of a clustered Mailbox server experiences a total failure because only one node remains with the data until Microsoft IT repairs and reseeds the databases on the affected node. Backups provide the required additional layer of protection, and the recommended approach is to perform VSS-based backups on the passive node, as illustrated in Figure 4.

Figure 4. Data Protection Manager 2007–based backup on passive nodes

Microsoft IT switched from streaming backups on the active node to software-based VSS backups on the passive node with the release of Data Protection Manager 2007. By minimizing the backup-related performance impact on the active node, Microsoft IT can accommodate more users per Mailbox server while at the same time performing backup operations at much more frequent intervals. Microsoft IT configures the Data Protection Manager server to receive transaction logs every 15 minutes. The server performs an express full backup once a day to maintain a complete and consistent image of the data on the Data Protection Manager server. The express full backup relies on block-level synchronization in conjunction with the Exchange VSS Writer to identify and replicate only those data blocks that have changed in the production databases since the last express full backup.
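
The idea behind block-level synchronization can be illustrated with a short Python sketch. The sketch does not reflect how Data Protection Manager 2007 tracks changes internally. It simply compares per-block hashes to find the blocks that differ and copies only those blocks to a replica. The block size and all function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration only


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split the data into fixed-size blocks and return one digest per block."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(previous: bytes, current: bytes,
                   block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indexes of blocks in `current` that differ from `previous`."""
    old = block_digests(previous, block_size)
    new = block_digests(current, block_size)
    return [i for i, digest in enumerate(new)
            if i >= len(old) or digest != old[i]]


def synchronize(replica: bytearray, current: bytes,
                block_size: int = BLOCK_SIZE) -> int:
    """Copy only the changed blocks into the replica; return how many were copied."""
    indexes = changed_blocks(bytes(replica), current, block_size)
    # Bring the replica to the same length as the current data.
    if len(replica) > len(current):
        del replica[len(current):]
    else:
        replica.extend(bytes(len(current) - len(replica)))
    for i in indexes:
        start = i * block_size
        replica[start:start + block_size] = current[start:start + block_size]
    return len(indexes)


# Example: an eight-block "database" in which two blocks change after the last sync.
database = bytearray(8 * BLOCK_SIZE)
replica = bytearray(database)
database[5] = 1                    # change inside block 0
database[3 * BLOCK_SIZE + 10] = 7  # change inside block 3
copied = synchronize(replica, bytes(database))
print(f"Blocks copied: {copied} of {len(database) // BLOCK_SIZE}")
```

In this example, only two of the eight blocks are transferred, which is the reason why an express full backup is much cheaper than copying the entire database again.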

For backup storage on Data Protection Manager servers, Microsoft IT also uses DAS technology, specifically RAID 10 on Serial Advanced Technology Attachment (SATA) disks with an individual disk capacity of more than 500 GB. In comparison to hardware VSS backups that Microsoft IT used in conjunction with SAN-based Mailbox servers, the solution based on Data Protection Manager 2007 on DAS helps Microsoft IT reduce the complexity of the backup environment, eliminate third-party dependencies, and achieve further storage cost savings while maintaining fast recovery objectives.

Reduced Need for Restores from Backup

Data Protection Manager 2007 enables Microsoft IT to restore mailbox data from any 15-minute point in time onto the original server or to a different server. However, there is practically no need to perform restores onto the original server for disaster recovery or any other purposes, as more than 18 months of Microsoft IT experience with CCR suggests. For Microsoft IT, restoring files from backup is a tool that can be helpful in software testing and in the validation of disaster recovery plans. Microsoft IT performs these restores to a different server to avoid affecting the Mailbox servers in the corporate messaging environment.

CCR entirely transformed the Microsoft IT approach to fast recovery. The previous method for SAN-based Mailbox servers relied on VSS clones. At midnight, Microsoft IT cloned the logical unit numbers (LUNs) for the Mailbox server to a new set of clone LUNs. Although this solution provided fast recovery of large amounts of data from backup in a matter of minutes, it required two additional LUNs for each Mailbox server LUN with associated high SAN costs and highly specialized storage engineers to perform the recovery procedures in the SAN environment. By switching to CCR for fast recovery with Exchange Server 2007, Microsoft IT eliminated these costs and dependencies. Restoring from backup is no longer the primary recovery mechanism. The fast recovery mechanism of a CCR-based Mailbox server is a straightforward failover to the passive node.

CCR effectively eliminates the need for restores from backup onto the original Mailbox servers because the cluster nodes in a CCR-based Mailbox server are mutual hot-standby systems. Active and passive nodes can change their roles at any time. CCR automatically reverses the replication direction to keep the messaging databases synchronized by means of transaction log shipping and replay on the passive node. This implies that any node can fail and be rebuilt by using the messaging databases that are still available on the other node. The cluster nodes share no hardware components. It is therefore unlikely that a storage failure on one node will affect the messaging databases on the other node. Messaging databases that are unaffected, mounted, and available online on the second node are the basis for recovery scenarios without backups.

The typical CCR-based recovery scenario includes the following four phases (depicted in Figure 5 after the list):

  1. Normal operation   The clustered Mailbox server is available, and new transactions, such as those caused by Hub Transport servers delivering messages, result in new transaction log files on the active node. Through file system notifications, the Microsoft Exchange Replication Service on the passive node learns that new transaction logs are pending replication. The NTFS file system generates these notifications on the active node whenever the ESE closes and renames the current transaction log file with a sequence number to make room for the next transaction log. The Exchange Replication Service on the passive node copies the new transaction log files via a security-enhanced file share from the active node into the local transaction log inspection folder. Exchange Server 2007 inspects these logs and moves them to the target storage group's transaction log folder for replay into the destination mailbox database. This asynchronous process of having the passive node perform the transaction log replication helps to keep CPU and I/O load on the active node at a minimum, but it also introduces the possibility of a lossy failover.
  2. Lossy failover and recovery   In the worst-case scenario where the active node fails right after a Hub Transport server delivered messages and before CCR had a chance to replicate the current transactions, failover occurs and the passive node becomes active without the most recent messages. To retrieve the missing messages, the Mailbox server requests redelivery from all Hub Transport servers in the local Active Directory site as part of the recovery procedure after a lossy failover. Hub Transport servers maintain a transport dumpster queue for each continuous replication-enabled storage group in order to retain recently delivered messages and redeliver these messages promptly to bring the Mailbox server up to date. Microsoft IT uses the Set-TransportConfig cmdlet to configure this feature on all Hub Transport servers with the MaxDumpsterSizePerStorageGroup parameter set to 15 MB and a MaxDumpsterTime value of 07.00:00:00, which corresponds to seven days. This is explained in more detail in the Microsoft IT Showcase Note on IT "Going 64-Bit with Microsoft Exchange Server 2007," available at http://www.microsoft.com/downloads/details.aspx?FamilyID=f31e7541-f63a-4b7d-b8d2-3794c4dc3329&DisplayLang=en.
  3. Node repair   At this point, the Mailbox server is up to date and running on the remaining cluster node while Microsoft IT performs the necessary repair activities on the failed node, such as replacing the public network interface card (NIC) or updating a faulty driver. The key issue is whether the node failure affected the messaging databases on the failed node. Typically, all messaging data is still intact and Microsoft IT only needs to restart the failed cluster node and catch up on transaction logs. In rare cases, node failures corrupt messaging data. In this situation, Microsoft IT only needs to reseed the affected databases from the remaining copy, which is an uncomplicated procedure that uses the ESE streaming backup API to perform online copy operations. Through reseeding, Microsoft IT can copy messaging databases individually from the currently active node to the repaired node while users are online and accessing their mailboxes. Users might notice slower server response times during the reseeding process, but the Mailbox server with all the messaging data is available and the server performance meets the Microsoft IT SLAs.
  4. Normal operation   The Mailbox server resumes normal operations after the failed node is repaired, and any affected messaging databases are reseeded. On the repaired node, Exchange Server 2007 automatically recognizes that it is now running in passive context, and CCR reverses to update the repaired passive node with all transactions that occur on the active node. There is no need to reconfigure the system. CCR simply continues to replicate the messaging data from the active node to the passive node. Furthermore, Microsoft IT does not need to perform a failback of the clustered Mailbox server to the original node because both nodes have an identical hardware and software configuration. The Mailbox server can continue to run on the second node without performance penalties, Data Protection Manager 2007 automatically switches backup processes to the current passive node, and Microsoft IT saves two minutes of failback time, which would otherwise count against the high-availability SLAs.

Figure 5. Recovering from a storage failure in a CCR-based Mailbox server
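
The four phases can also be traced in a deliberately simplified Python model. The model is a sketch under stated assumptions and not the Exchange Server 2007 implementation: each message produces exactly one transaction log, the transport dumpster has no size or time limit, and reseeding is modeled as copying all logs instead of copying database files. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    logs: list[str] = field(default_factory=list)  # one log entry per message


@dataclass
class HubTransport:
    dumpster: list[str] = field(default_factory=list)  # recently delivered messages

    def deliver(self, message: str, active: Node) -> None:
        self.dumpster.append(message)  # retain a copy for possible redelivery
        active.logs.append(message)    # delivery creates a log on the active node


def replicate(active: Node, passive: Node) -> None:
    """The passive node pulls the logs that it does not have yet."""
    passive.logs.extend(active.logs[len(passive.logs):])


def fail_over(new_active: Node, hub: HubTransport) -> list[str]:
    """Request redelivery from the dumpster; return the recovered messages."""
    present = set(new_active.logs)
    recovered = [m for m in hub.dumpster if m not in present]
    new_active.logs.extend(recovered)
    return recovered


node_a, node_b, hub = Node("node-a"), Node("node-b"), HubTransport()

# Phase 1: normal operation with node-a active.
hub.deliver("msg-1", node_a)
hub.deliver("msg-2", node_a)
replicate(active=node_a, passive=node_b)

# Phase 2: node-a fails before msg-3 is replicated (lossy failover).
hub.deliver("msg-3", node_a)
recovered = fail_over(node_b, hub)

# Phase 3: node-a is repaired and reseeded from node-b.
node_a.logs = []
replicate(active=node_b, passive=node_a)

# Phase 4: normal operation continues with node-b active; no failback is needed.
hub.deliver("msg-4", node_b)
replicate(active=node_b, passive=node_a)

print("Recovered after failover:", recovered)
print("Nodes in sync:", node_a.logs == node_b.logs)
```

In the model, the only message missing after the failover is the one delivered after the last replication cycle, and the dumpster redelivers it. In production, the MaxDumpsterSizePerStorageGroup and MaxDumpsterTime settings described above bound how much can be redelivered.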

Simplicity in the Storage Design and Low Maintenance Overhead

Who manages the storage subsystem is an important non-technical aspect of CCR on DAS in comparison to any SAN-based Mailbox server design. In SAN-based environments, dedicated and highly specialized storage engineers perform the necessary installation, configuration, optimization, and troubleshooting tasks. In DAS-based environments, regular Exchange administrators are typically capable of performing these storage-related tasks. Having full control over all aspects of the Mailbox server design (end to end, from the disks up to the client connections) is an important advantage for the Exchange Messaging team at Microsoft IT. Yet, it also requires Microsoft IT to design the storage subsystem with simplicity in mind so that Exchange administrators can perform all configuration and maintenance tasks without the help of highly specialized storage experts.

Microsoft IT achieves simplicity in the storage design primarily through the following approaches:

  • Taking advantage of SAS technology   Unlike SAN technology that requires strict configuration, SAS offers impressive configuration flexibility at the hardware layer. SAS uses common electrical SATA connections, and an SAS enclosure can accept different types of disks. The order of the disks in an array is not important, and it is possible to expand an SAS storage subsystem by daisy-chaining storage enclosures together. Microsoft IT recently tested the flexibility of SAS in a lab environment by shutting down a cluster node after moving the clustered Mailbox server to the other node, changing the position of the SAS disks in the storage enclosures, and replacing the RAID controller. During the restart of the cluster node, the system fully recognized the RAID configuration and drive letter assignments and the cluster node was immediately operational again. According to Seagate data sheets, SAS technology has emerged as an enterprise technology with small form factor (SFF) 2.5-inch disks surpassing Fibre Channel disks with an annualized failure rate (AFR) of 0.55 percent (versus 0.62 percent) while disk capacities continue to increase and hardware prices continue to fall. New generations of SAS RAID controllers appear every 12 to 18 months and typically come along with new server generations, whereas the product cycle of SAN counterparts is about three to four years. The SAN innovation rate is slower due to the higher technical complexities in comparison to DAS technology. For example, SFF disks appeared on the market more than two years ago, yet SAN vendors are still not able to integrate this technology into their systems. With CCR on DAS, Microsoft IT takes full advantage of the most dynamic products and technologies in the storage market.
  • Standardizing the storage layout by creating universal storage building blocks   To further minimize hardware maintenance complexities, help ensure reliability, and provide scalability, Microsoft IT developed a standardized storage layout that uses universal storage building blocks for scaling production Mailbox servers. A universal storage building block, or USBB as Microsoft IT calls it for short, is a self-contained unit of two physical storage enclosures, combined to provide database and transaction log drives for the specified set of Exchange databases. The number of drives—identified through individual LUNs at the SCSI level—that Microsoft IT can use in a USBB depends on the capacity of the storage enclosures and the required number of LUNs to provide the desired data capacity and I/O performance. The USBB count per server depends on the number of mailboxes and the mailbox quotas that Microsoft IT wants to maintain on the server. However, the conceptual storage layout per USBB remains unchanged. Among other things, this makes it easy to monitor the drives and replace failing disks. An operator can easily identify each disk through its position in the storage subsystem, as indicated in Figure 6, which shows a USBB configuration with 25 disks per enclosure. This particular USBB provides three RAID 10 LUNs for messaging databases, and one RAID 10 LUN for all transaction logs combined.

Figure 6. A universal storage building block for CCR on DAS

Predictable Mailbox Server Performance

An issue that Microsoft IT occasionally notices in large customer deployments on SAN-based Mailbox servers concerns the use of shared storage for messaging databases. Whereas Microsoft IT Mailbox server designs for Exchange Server 2003 always followed product recommendations to use dedicated storage arrays, customers occasionally ignore this recommendation and share the storage hardware between different types of server applications, such as Microsoft SQL Server® and Exchange Server, in an attempt to maximize capacity utilization. However, if Microsoft SQL Server databases are stored on the same physical media as Exchange Server databases, running large SQL Server jobs can lead to non-Exchange load surges and impaired Exchange Server performance. Microsoft IT engineers call this issue hot-spot contention to indicate poor performance of the storage subsystem resulting from different usage patterns caused by different server applications accessing the same storage media, as illustrated in Figure 7.

Figure 7. Different usage patterns on shared storage

The database I/O of Exchange Server 2007 is composed of large numbers of random page requests using a page size of 8 KB, whereas other server applications might access data more sequentially and in larger contiguous blocks. If all this data resides on the same physical media, the heads of the hard disk drive must frequently move out of the Exchange data region to service the non-Exchange data requests. The result is an unpredictable sharp decrease in Exchange Server performance due to increased response times at random intervals. For example, an Exchange read request that might have taken 8 milliseconds (ms) might now take 108 ms because the disk heads spend 100 ms retrieving non-Exchange data between the individual page requests of Exchange Server.
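
The effect of such delays on average response times can be estimated with a small Python sketch. It uses the 8 ms and 100 ms figures from the example above and the 20 ms response time that this paper describes elsewhere as tolerable only for short periods. The fraction of contended requests is a hypothetical parameter, and the linear model is an illustrative simplification.

```python
def mean_response_ms(base_ms: float, penalty_ms: float,
                     contended_fraction: float) -> float:
    """Average response time when a fraction of requests waits for other I/O."""
    if not 0.0 <= contended_fraction <= 1.0:
        raise ValueError("contended_fraction must be between 0 and 1")
    return base_ms + contended_fraction * penalty_ms


def max_contended_fraction(base_ms: float, penalty_ms: float,
                           threshold_ms: float) -> float:
    """Largest contended fraction that keeps the average at or below the threshold."""
    return max(0.0, min(1.0, (threshold_ms - base_ms) / penalty_ms))


limit = max_contended_fraction(base_ms=8.0, penalty_ms=100.0, threshold_ms=20.0)
print(f"Average with 25 percent contention: {mean_response_ms(8.0, 100.0, 0.25):.0f} ms")
print(f"Contended share that reaches 20 ms on average: {limit:.0%}")
```

Under this model, the average response time reaches 20 ms as soon as about one in eight Exchange read requests is delayed by non-Exchange I/O.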

Exchange Server administrators cannot analyze this problem because the non-Exchange LUN is not visible in the Exchange server configuration. Likewise, the Exchange Server LUN is invisible on the computer running SQL Server. Exchange Server administrators and SQL Server administrators have no idea that they share the same physical storage media. Furthermore, the storage engineer who created the LUNs is unaware of the Exchange Server and SQL Server requirements. As far as the SAN environment is concerned, the storage engineer followed common best practices to counterbalance SAN costs. All systems show an optimized configuration, and yet users occasionally complain about slow Office Outlook clients in online mode.

Hot-spot contention is hard to locate, and it keeps reappearing in customer environments because there is no certainty in a SAN that the system configuration does not change over time. All too often, unaware storage engineers cannot resist the temptation to optimize available storage capacities. CCR on DAS puts an ultimate end to this problem by taking Exchange Server storage out of the shared SAN environment. It puts full control over the storage design into the hands of Exchange Server experts and eliminates unpredictable system behavior due to the side effects of SAN capacity optimization.

“Microsoft IT is our ultimate touchstone of enterprise readiness. From Microsoft IT we learn not only how our technology performs under real-world conditions, but also what issues and constraints system engineers face when designing production-grade Mailbox servers. No test lab can deliver this valuable insight. It fuels our development activities and ensures that we stay focused on actual current and future needs of our customers.”

Matt Gossage
Sr. Program Manager
Exchange Server Product Group
Microsoft Corporation

Microsoft IT Storage Design

In comparison to previous product versions, Exchange Server 2007 provides increased design flexibility because this 64-bit messaging system takes full advantage of available processor and memory resources and includes numerous architectural advancements that help to lower the I/O footprint. Among other things, Exchange Server 2007 includes a database buffer cache that can substantially reduce the need for data reads from disk during normal operation. The size of the database buffer cache depends on the physical amount of random access memory (RAM) present in the system; therefore, the physical amount of memory directly influences the I/O footprint. This relationship between memory and I/O footprint opens new opportunities to achieve optimal server response times by balancing memory with disk performance. This design flexibility is especially noticeable when comparing Microsoft IT Mailbox server designs with the results of the Exchange 2007 Mailbox Server Role Storage Requirements Calculator. The storage requirements calculator performs calculations based on product recommendations, while Microsoft IT takes advantage of excess I/O performance in the storage subsystem to go beyond product recommendations. Microsoft IT arrives at acceptable response times with less memory per Mailbox server than recommended because the storage subsystem includes far more disks for capacity reasons than required for performance and can therefore compensate for the increased I/O footprint that results from having less RAM per user on the server. The Exchange 2007 Mailbox Server Role Storage Requirements Calculator and detailed information about its use are available on the Microsoft Exchange Team Blog at http://msexchangeteam.com/archive/2007/01/15/432207.aspx.

Kyryl Perederiy, Senior Systems Engineer in the Microsoft IT Exchange Messaging team, is responsible for the Mailbox server designs. He emphasizes that customers evaluating the Microsoft IT designs should keep in mind that Microsoft IT currently exceeds recommended configurations in its Mailbox server designs in an effort to help the product group verify performance capabilities of Exchange Server 2007 under real-world conditions. Specifically, Microsoft IT Mailbox servers use 1 MB to 2 MB of RAM per user instead of the recommended 3.5 MB to 5 MB, but Microsoft IT Mailbox servers still perform well because the DAS-based storage design of Microsoft IT provides plenty of headroom to make up for the difference, as explained later in this paper. By taking advantage of excess I/O capabilities, Microsoft IT was able to optimize memory capacities and switch from expensive high-end enterprise hardware to mainstream enterprise server models, such as a dual socket quad-core Intel Xeon processor X5355 server model with only eight slots for fully buffered dual inline memory modules (FB-DIMMs) to support 6,000 users per server on average. With a maximum available module size of 4 GB, this mainstream server model has a maximum capacity of 32 GB of memory, but 4-GB DIMMs currently do not offer an attractive capacity/price ratio for Microsoft IT. For cost reasons, Microsoft IT uses 2-GB memory modules—in other words, a total of 16 GB of memory.

Microsoft IT emphasizes the following aspects in the storage design for Exchange Server 2007 Mailbox servers:

  • Reliability   Microsoft IT uses enterprise storage equipment in Mailbox servers and specifically pays attention to unrecoverable errors per bits read, AFRs, and manufacturer warranties. It is also noteworthy that Microsoft IT prefers SFF 2.5-inch disks to the large form factor (LFF) 3.5-inch disks because of the lower power requirements, lower cost per gigabyte, and higher performance and reliability, as well as less heat emission and less vibration. Less heat and less vibration result in less degradation over time. Hard disk drives of choice are SFF SAS disks with one bad sector per 10^16 bits read, an AFR of 0.55 percent, and a three-year warranty. SATA disks are still not as reliable as SAS disks. However, Microsoft IT is beginning to consider SATA because reliability and performance are increasingly meeting enterprise standards, the capacity/price ratio is attractive, and CCR deemphasizes the need for highly reliable disks in the storage subsystem on each cluster node.
  • Availability   Redundancies are a primary means in the Mailbox server design to help ensure high availability. At the hardware layer, Microsoft IT uses multiple external storage enclosures with redundant power supplies and controller connections. As mentioned earlier in this paper, Microsoft IT mirrors the hard disk drives across enclosures and then includes these mirrors in stripe sets to implement a RAID 10 configuration. In this configuration, an entire enclosure can fail without affecting the availability of the storage subsystem. Microsoft IT prefers RAID 10 to RAID 5 because RAID 10 provides a higher level of redundancy and can tolerate multiple disk failures, provided that the failed disks do not belong to the same mirror set, whereas RAID 5 can only tolerate a single disk failure. RAID 5–based Mailbox server designs are still in an experimental stage at Microsoft IT. With the exception of the RAID controller, Microsoft IT ensures redundancies for all storage components on each cluster node, including enclosures, power supplies, hard disk drives, and cables. CCR provides the necessary redundancies at the Exchange database level and compensates very well for the missing RAID controller redundancy at the individual cluster nodes.
  • Performance   Microsoft IT also uses RAID 10 because RAID 10 performs better than RAID 5. With RAID 10, write operations are free of parity calculations, and RAID 10 uses more disks than RAID 5 to deliver the same capacity, which benefits I/O performance when the storage subsystem is being designed for capacity. For example, it takes six 146-GB disks to build a RAID 10 drive with 438 GB of usable capacity before formatting. RAID 5 only requires four 146-GB disks to accomplish the same. Assuming SFF enterprise disks with a spindle speed of 10,000 revolutions per minute (RPM), capable of performing approximately 160 I/O operations per second (IOPS), the RAID 10 drive can handle 960 read IOPS (6 * 160 = 960), whereas the RAID 5 drive can only reach 640 read IOPS (4 * 160 = 640). Disk throughput and server response times are important design criteria for Microsoft IT because slow Mailbox servers affect employee productivity. Response times above 20 ms are tolerable only for short periods of time on a highly loaded system, but not on an ongoing basis because users will notice slow Office Outlook client behavior in online mode, such as when switching folders or composing new messages. Especially with large mailboxes, users need to be able to work fast and perform reliable searches to locate information quickly.
  • Capacity   Performance requirements determine the minimum number of disks in the storage subsystem, yet more disks might be required to meet capacity needs. To ensure adequate storage capacity, Microsoft IT calculates maximum database sizes based on the number of mailboxes and the desired quotas, and then adds additional capacity for database overhead, content indexing overhead, and unexpected database growth. If the number of disks required to meet capacity needs exceeds the minimum number of disks required to ensure I/O performance, Microsoft IT calls the storage design capacity bound. On the other hand, if fewer disks are required to provide the necessary capacities, but more disks are needed to reach the required I/O performance, the design is performance bound. Due to the low I/O requirements of Exchange Server 2007 and the high performance of SFF SAS disks, Microsoft IT Mailbox server designs are currently capacity bound, and Microsoft IT uses any excess I/O performance to counterbalance low memory conditions, as mentioned earlier. This picture might change in the future if Microsoft IT decides to switch to RAID 5 with SFF SAS or use larger and slower SATA disks.
  • Costs   For Microsoft IT, cost efficiency is an integral part of enterprise readiness. Multiple solutions may meet Microsoft IT's reliability, performance, and capacity requirements, include similar support options and tools (such as management packs), and be on the standard data center hardware list. Microsoft IT chooses the least expensive technology for the corporate messaging environment to demonstrate the cost savings potential of Exchange Server 2007. High-end enterprise server models provide the flexibility to match product recommendations in the Mailbox server design, yet Microsoft IT deliberately selects mainstream enterprise hardware to build scaled-up servers with the lowest possible budget. CCR on DAS is a key enabler to drive down costs to unprecedented levels, and RAID 5 with SFF SAS disks or RAID 10 on SATA disks might offer further opportunities to continue this effort.
  • Simplicity   Microsoft IT capitalizes on the potential of CCR on DAS to simplify the storage design through straightforward RAID configurations and a standardized storage layout based on USBBs. Among other things, simplicity helps to keep maintenance overhead and TCO at a minimum. Exactly the same operations procedures apply to all Mailbox servers in the corporate messaging environment regardless of mailbox numbers and quotas. Simplicity also helps to ensure stable Mailbox server performance in the event of a component failure. For example, Microsoft IT does not use a hot spare in the RAID 10 configuration, so the RAID controller cannot automatically phase out a failed disk and rebuild the RAID 10 array. Disk failures decrease the RAID 10 performance to a certain degree, because fewer disks remain to handle the I/O load; yet rebuilding a RAID array affects I/O performance even more noticeably. By scheduling hardware maintenance during non-business hours, Microsoft IT minimizes the impact of disk failures on server performance. Manually replacing hard disk drives is not a problem for Microsoft IT. Most Microsoft data centers are staffed 24 hours per day, seven days per week. In the event of a disk failure, an IT specialist is available to replace the affected disk at the next appropriate maintenance window.
  • Recoverability   Hardware and database redundancies enable Microsoft IT to ensure system recoverability in the event of component failures as well as in the event of entire node failures. Furthermore, Microsoft IT relies on Data Protection Manager 2007 to help ensure recoverability in the event of a loss of both cluster nodes. Recovery time objectives (RTOs) demand individual database restoration in less than one hour. Accordingly, Microsoft IT distributes the mailbox data over a large number of messaging databases per server to keep the maximum file size of each individual messaging database below 200 GB. The maximum file size corresponds to the number of mailboxes and the desired quotas plus database overhead, but excludes content indexing overhead and the reserve for unexpected database growth. Keeping individual database files below 200 GB also helps Exchange Server complete online maintenance cycles regularly, which is critical to keep the Mailbox server healthy and the system performance stable.
  • Scalability   All messaging trends point upward at Microsoft. Message volumes are continuously rising, and the number of users has almost doubled since the Exchange Server 2003 time frame. Mailbox sizes are steadily growing, and even 2-GB quotas might soon be too small for agile users. CCR on DAS enables Microsoft IT to keep pace with these trends by increasing the scalability of the Mailbox servers in the corporate messaging environment. For example, hardware maintenance on the passive node does not affect online users. It is possible to replace server hardware when new processor technology becomes available or upgrade the storage subsystem on one node while the other node keeps the Mailbox server available to users. A typical SAS-based RAID controller can support up to 100 disks. Mainstream 2U enterprise server models can support up to three RAID controllers, and large 4U high-end enterprise servers can include up to seven RAID controllers, putting the ceiling at 700 disks. It is unlikely that Microsoft IT will reach this scalability limit in the near future.
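
The sizing rules in the Recoverability and Scalability items above reduce to simple arithmetic. The following Python sketch is illustrative only and is not a Microsoft IT tool. The function names, the overhead_fraction parameter, and the 20 percent overhead used in the example are assumptions, because the text gives no specific overhead value. The sketch also assumes that every mailbox fills its quota and that mailboxes are spread evenly across databases.

```python
import math

MAX_DB_SIZE_GB = 200          # per-database ceiling stated in the text
DISKS_PER_CONTROLLER = 100    # "typical SAS-based RAID controller" figure from the text


def min_databases(mailboxes, quota_mb, overhead_fraction):
    """Smallest database count that keeps every database file below MAX_DB_SIZE_GB.

    overhead_fraction is a hypothetical allowance for database overhead
    (for example, 0.2 for 20 percent); the text gives no specific value.
    """
    total_gb = mailboxes * quota_mb * (1 + overhead_fraction) / 1024
    # floor + 1 keeps each file strictly below the ceiling, even when
    # total_gb is an exact multiple of MAX_DB_SIZE_GB.
    return math.floor(total_gb / MAX_DB_SIZE_GB) + 1


def max_disks(raid_controllers):
    """Upper bound on attached disks for a given number of RAID controllers."""
    return raid_controllers * DISKS_PER_CONTROLLER


if __name__ == "__main__":
    # 6,000 mailboxes with 500-MB quotas and an assumed 20 percent overhead
    print(min_databases(6000, 500, 0.2))
    # Mainstream 2U server (three controllers) and 4U server (seven controllers)
    print(max_disks(3), max_disks(7))
```

With the assumed 20 percent overhead, 6,000 mailboxes with 500-MB quotas need at least 18 databases to stay below 200 GB each. Seven controllers yield the 700-disk ceiling mentioned in the text.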

Note: Microsoft IT system engineers have contributed to the Exchange 2007 Mailbox Server Role Storage Requirements Calculator and strongly recommend this design tool to customers. In addition, third-party vendors and original equipment manufacturers (OEMs) who want to develop and test storage solutions for Exchange Server 2007 should consider participating in the Microsoft Exchange Solution Reviewed Program (ESRP) – Storage v2.0. Detailed information about ESRP is available at http://technet.microsoft.com/en-us/exchange/bb412164.aspx.

Mailbox Server Performance

“The ultimate test of the storage design is the failover. When thousands of concurrent users hit that fresh active node, all read requests must go to disk because a cold database buffer cache cannot deliver the data. That decisive moment shows how much our Mailbox servers benefit from the high performance available with DAS.”

Kyryl Perederiy
Sr. Systems Engineer
Microsoft IT Exchange Messaging
Microsoft Corporation

In a reliable network environment with sufficient net-available bandwidth, the processor, memory, and storage subsystem are the main components that influence Mailbox server performance. Processor capabilities have the most significant impact. Microsoft IT used dual-core processors during the initial production rollout, which limited server scalability to 2,000–3,000 mailboxes. Currently, Microsoft IT uses server models with two quad-core Intel Xeon X5355 processors (a total of eight processor cores) to realize an increased density of 6,000 mailboxes per server for heavy users. Microsoft IT continues to monitor the processor market to take advantage of new models as soon as they become available at reasonable pricing.

Memory capacities are not as critical for Microsoft IT because Exchange Server 2007 Service Pack 1 (SP1) includes further ESE enhancements, and it is possible to counterbalance memory deficiency with disk I/O performance. According to the product recommendation of 3.5 MB to 5 MB of RAM per mailbox plus 2 GB of RAM per server, a server with 6,000 mailboxes requires roughly 24 GB to 32 GB of memory (6,000 * 3.5 MB / 1,024 + 2 GB ≈ 22.5 GB; 6,000 * 5 MB / 1,024 + 2 GB ≈ 31.3 GB). Memory requirements increase further with more mailboxes, yet even the product group does not recommend more than 32 GB per Mailbox server to remain cost efficient. As mentioned earlier, Microsoft IT optimized memory capacities by taking advantage of excess I/O performance and supports 6,000 heavy users and up to twice as many medium users in one scalability pilot with 16 GB of RAM. It is important to note, however, that these server designs are for users with 500-MB quotas. For heavy users who require 2-GB quotas, mostly full-time employees, Microsoft IT uses server designs for up to 4,000 mailboxes on the same server platform with two quad-core Intel Xeon X5355 processors.
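
The memory recommendation quoted above can be expressed as a short calculation. The following Python sketch is a minimal illustration; the function name recommended_ram_gb and its parameter names are hypothetical, while the per-mailbox and per-server figures come from the product recommendation cited in the text.

```python
def recommended_ram_gb(mailboxes, mb_per_mailbox, base_gb=2.0):
    """RAM in GB: per-mailbox allowance (MB, converted to GB) plus a per-server base."""
    return mailboxes * mb_per_mailbox / 1024 + base_gb


if __name__ == "__main__":
    # Lower and upper bound of the recommendation for 6,000 mailboxes
    for mb_per_mailbox in (3.5, 5.0):
        ram = recommended_ram_gb(6000, mb_per_mailbox)
        print(f"{mb_per_mailbox} MB per mailbox: {ram:.2f} GB")
```

For 6,000 mailboxes, the sketch yields 22.51 GB at 3.5 MB per mailbox and 31.30 GB at 5 MB per mailbox, which is well above the 16 GB that Microsoft IT uses in the scalability pilot.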

Figure 8 illustrates how Exchange Server 2007 uses the available physical memory to lower I/O requirements by caching frequently accessed data, such as the Inbox folder view for the top-most messages, calendar information, and any message-processing rules for each user. The less often the storage engine needs to reload this information, the lower the I/O footprint. In combination with further ESE enhancements, such as delayed write operations to avoid repeated writes to the same object on disk, Exchange Server 2007 puts less I/O demand on the storage subsystem than any previous Exchange Server version. Microsoft IT measured performance requirements by monitoring the performance counters Disk Transfers/sec, Disk Reads/sec, and Disk Writes/sec. Microsoft IT determined during the initial production rollout that Microsoft employees typically generate approximately 0.27 IOPS to 0.4 IOPS in a read/write mix of 1:1 on Mailbox servers with 5 MB of RAM per user.

Figure 8. Database buffer cache and database I/O
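
The per-user I/O figures measured during the initial production rollout can be turned into a rough per-server estimate. The following Python sketch is hypothetical: the function name server_iops and the choice of 6,000 mailboxes are assumptions for illustration, and the measured range of 0.27 IOPS to 0.4 IOPS applies to servers with 5 MB of RAM per user, so servers with less memory per user would be expected to generate more I/O.

```python
def server_iops(mailboxes, iops_per_mailbox, read_fraction=0.5):
    """Return (total, reads, writes) IOPS for a server.

    read_fraction=0.5 models the 1:1 read/write mix reported in the text.
    """
    total = mailboxes * iops_per_mailbox
    reads = total * read_fraction
    writes = total - reads
    return total, reads, writes


if __name__ == "__main__":
    # Measured per-user range from the text, applied to an assumed 6,000 mailboxes
    for per_user in (0.27, 0.4):
        total, reads, writes = server_iops(6000, per_user)
        print(f"{per_user} IOPS per user: total {total:.0f}, "
              f"reads {reads:.0f}, writes {writes:.0f}")
```

Under these assumptions, a 6,000-mailbox server would need to sustain approximately 1,620 to 2,400 IOPS, split evenly between reads and writes.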

Now, more than 12 months after the initial production rollout, Microsoft IT uses significantly less memory in Mailbox servers. It follows that I/O requirements increase because Exchange Serv