Executive Summary
More than 18 months after the first Microsoft® Exchange
Server 2007 deployment in the corporate messaging environment and more than
12 months after completing the full production rollout across the entire company,
the Microsoft Information Technology (Microsoft IT) group is able to report significant
benefits such as:
- Messaging service levels exceeding high-availability targets of 99.99
percent.
- Cost reductions in excess of $10 million per year.
- Increased mailbox quotas by up to a factor of 10.
- Consolidation of the initial Exchange Server 2007 base by nearly
a factor of two.
Microsoft IT was able to achieve these results by taking full advantage of new storage
features and input/output (I/O) improvements in Exchange Server 2007, the latest
advancements in 64-bit processor technology, and direct-attached storage (DAS)–based
storage solutions.
One key strategy that accounts for more than $5 million in annual cost savings involved
eliminating the need for backups to tape by relying on new high-availability features
in Exchange Server 2007 such as cluster continuous replication (CCR) as the
first level of protection, and Microsoft System Center Data Protection Manager 2007
as the second level of protection. Microsoft IT is not required to keep data on
tape for archiving or other purposes. Moreover, according to an internal study conducted
in 2006, Microsoft IT realized a 74 percent reduction of storage costs per
gigabyte by replacing Storage Area Network (SAN) technology with DAS technology
in the Mailbox server design. CCR enabled Microsoft IT to switch from SAN to DAS,
which improved Microsoft IT's ability to support employee productivity by means
of large mailboxes with quotas between 500 megabytes (MB) and 2 gigabytes
(GB).
Microsoft IT pursued another key strategy that focused on driving down total cost
of ownership (TCO) through server consolidation. Microsoft IT has already reduced
the initial Mailbox server base in the corporate messaging environment by more than
45 percent, from 62 servers (124 cluster nodes) to 34 Mailbox servers (68 cluster
nodes), and consolidation efforts continue. Before and after consolidation, Microsoft
employees enjoy large mailbox capacities, fast server response times, and messaging
services that exceed the required high-availability level of 99.99 percent and frequently
reach 99.999 percent with no extra effort.
Exchange Server 2007 enables Microsoft IT to not only lower storage costs and
increase mailbox quotas, but also decrease storage complexities, regain full control
over all aspects of the Mailbox server design (including the storage subsystem),
eliminate maintenance overhead, and increase high availability of Mailbox servers.
All storage-related issues that Microsoft IT encountered since the initial production
rollout of Exchange Server 2007 were recoverable without the need for backups.
There have been no critical storage-related incidents affecting Mailbox server availability
across the entire corporate messaging environment for more than 18 months.
The purpose of this white paper is to share Microsoft IT knowledge, experiences,
and recommendations related to the architecture and design of Exchange Server 2007
Mailbox servers. This paper is not intended to serve as a procedural guide. Although
many organizations have similar requirements, each enterprise environment also has
unique requirements, making it necessary to adapt the information discussed in this
paper.
This white paper assumes that readers are IT architects and technical decision makers
who are already familiar with Windows Server® 2003, the Active Directory® directory
service, and Exchange Server. Specifically, knowledge about SAN and DAS technologies,
server clustering, and the high-availability features of Exchange Server 2007
is helpful. Detailed product information is available in the Microsoft Exchange
Server 2007 Technical Library at http://technet.microsoft.com/en-us/library/bb124558.aspx.
Note: For security reasons, the sample names of
forests, domains, organizations, and other internal resources mentioned in this
paper do not represent real resource names used within Microsoft and are for illustration
purposes only.
“We learned through bitter experience that SAN redundancies cannot fully compensate
for the critical single point of failure that shared storage represents in the clustered
Mailbox server architecture. Exchange Server 2007 is the first product version
that eliminates this critical single point of failure through CCR. We are further
advancing this technology to provide our customers with even more flexibility in
Exchange Server 2007 Service Pack 1 and future releases to continue the effort
to decrease costs and increase service levels.”
Perry Clarke
Product Unit Manager
Exchange Server Product Group
Microsoft Corporation
Perry Clarke, the Product Unit Manager in the Exchange
Server product group who is responsible for the technologies of the Mailbox server
role, still remembers the time when CCR development began. The product team held
CCR as a cornerstone in the vision for Exchange Server 2007 because this technology
provides compelling answers to some of the most pressing enterprise customer needs,
such as supporting significantly larger mailboxes at substantially lower costs.
The development team was excited about this new technology and its potential to
support larger mailboxes at lower storage costs, provide shorter failover times,
reduce the need for restores from backup, and noticeably decrease storage complexity
and eliminate maintenance overhead. Yet to the surprise of many in the product group,
Microsoft IT did not share the enthusiasm. Failover clustering was never a question,
but in the beginning of 2006, Microsoft IT was skeptical about the possibilities
of using CCR on DAS in the Mailbox server design.
Microsoft IT hesitated to embrace CCR on DAS primarily because of the following concerns:
- Need to protect existing IT investments: The
SAN environment at Microsoft IT represents considerable investment in technology
that is not easily abandoned just because new technology emerges. Microsoft IT initially
did not consider the shift to CCR on DAS as inevitable. In the beginning of 2006,
Microsoft IT had not yet completed its plans to increase mailbox quotas from 200
MB up to 2 GB. Therefore, it was not immediately apparent that a properly designed
SAN environment that accommodates these quotas requires approximately 30 times the
existing storage capacity to hold 10 times more messaging data, including corresponding
hardware Volume Shadow Copy Service (VSS)–based backups. The costs to increase SAN
capacities by a factor of 30 would have been forbidding, especially when taking
ongoing costs for capacity and performance management into consideration. In the
absence of concrete numbers, it seemed more prudent to preserve the existing investment.
- Desire to capitalize on existing expert knowledge: Microsoft strongly recommends using
dedicated storage for Exchange Server to ensure a high transaction rate with low
latencies and avoid unpredictable performance behavior, yet accommodating this requirement
in a shared SAN environment poses complex configuration and performance optimization
challenges. In close collaboration with storage vendors, Microsoft IT engineers
developed best practices and actively helped enterprise customers with their SAN
optimizations for Exchange Server. Microsoft IT engineers who had gained expert
knowledge in the field of SAN optimizations for Exchange Server wanted to capitalize
on this.
- Perception that DAS was not an enterprise storage technology: Prior to Microsoft
Exchange 2000 Server and SANs, Parallel Small Computer System Interface (Parallel
SCSI) was state of the art in its various standards with thick cables, 50-,
68-, or 80-pin connectors, and performance, compatibility, scalability, and reliability
issues. Serial Attached SCSI (SAS) began to replace Parallel SCSI by 2006, but for
many at Microsoft IT, DAS was still synonymous with fragile connectors, bent pins,
loose electrical contacts, and thick cables connecting a maximum number of only
16 devices. It was considered impossible to install 100 or 200 DAS drives in a Mailbox
server to achieve high scalability. It was likewise unthinkable that a DAS hard
disk drive could be more reliable than a SAN hard disk drive. By the end of 2007,
SAS technology had come to market, surpassing Fibre Channel with higher interface
speeds and lower failure rates. In the beginning of 2006, however, when Microsoft IT
started the Exchange Server 2007 production rollout, the SAS interface was still an
emerging DAS technology.
- Concerns that DAS would create storage silos and hidden operational costs: Another obstacle that
prevented Microsoft IT from initially seeing CCR on DAS as a viable solution for
Mailbox servers was the fact that DAS attaches directly to each cluster node, which
creates individual storage silos. From a SAN point of view, it is an overwhelming
proposition to create a large number of individual storage locations in the corporate
messaging environment. In a SAN environment, ongoing costs for storage allocation,
capacity management, performance management, and troubleshooting can quickly exceed
the initial investment in hardware and installation. By assuming that this issue
of hidden ongoing costs would also apply to DAS, Microsoft IT saw any initial DAS
savings potential dwindle rapidly. Today, with the benefit of more than 18 months
of operating CCR on DAS in production, it is easy to say that DAS storage is
"designed once and never touched again." However, in early 2006, Microsoft
IT was unable to verify that there is truly no need for DAS capacity and performance
management beyond the initial storage design. Replacing broken disks, cables, or
redundant array of independent disks (RAID) controllers is merely a part of standard
hardware maintenance. Downtime due to storage or other node failures is less than
two minutes of failover time in a properly designed, CCR-based Mailbox server, and
data loss is greatly reduced due to redundant copies of messaging databases on individual
cluster nodes. In fact, when CCR on DAS is compared with shared-storage clusters
on SAN, it is noticeable that there is less chance for data loss and less need for
database restores from backup because CCR eliminates the data instance used by the
active node as a critical single point of failure. CCR on DAS also does not create
new storage silos. It merely moves the existing storage silos—which dedicated, exclusive
Exchange Server storage represents in a shared SAN environment—out of the high-maintenance,
high-cost environment into a low-maintenance, low-cost alternative. Microsoft IT
doubted these facts because they were unverifiable at the time.
- Belief that CCR was not enterprise ready: It is an interesting proposal for an IT organization
to commit fully to an emerging technology. However, this was the case for CCR in
the beginning of 2006. CCR was a cornerstone in the Exchange Server 2007 vision
as a key enabler of employee productivity through large mailboxes. Yet, Microsoft
IT was concerned about possible implementation difficulties and delays because CCR
was still in an early beta stage. Even without delays, Microsoft IT engineers did
not readily take to the idea of relying on new technology with unknown scalability
and reliability characteristics that would be at the very core of large Mailbox
servers in the corporate messaging environment. CCR has proved its enterprise readiness
over the past 18 months, enabling Microsoft IT to maintain 10 times more data on
Mailbox servers with higher service availability levels. In early 2006, the development
team could not yet prove the enterprise readiness of CCR for the simple reason that
there was no enterprise deployment of CCR in existence.
- Fear that replication latencies would introduce the potential for data loss: Microsoft IT concerns
also revolved around the asynchronous nature of CCR, which can result in replication
latencies and potential data loss. The scenario for data loss is straightforward:
If the primary node receives a message and fails before Exchange Server 2007
replicates the data to the passive node, failover occurs, the passive node becomes
active, and the Mailbox server has lost that e-mail message. When the development
team suggested that the transport dumpster queue on Hub Transport servers addresses
this issue by retaining and redelivering recent messages as needed, Microsoft IT
insisted that the product group treat this feature as an intrinsic part of CCR.
Microsoft IT did not want to take any chances with lost messages. The active node
must be able to request redelivery from all Hub Transport servers in the local Active
Directory site, and the Hub Transport servers must redeliver promptly so that no
messages are lost after a failover.
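The data-loss scenario and the transport dumpster safeguard described in the last concern can be illustrated with a minimal Python sketch. It is a simplified model and does not represent how Exchange Server 2007 is implemented: the names `Node`, `HubTransport`, `replicate`, and `simulate_failover` are hypothetical, replication is reduced to a single copy step, and a message is reduced to a string identifier.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node holding its own copy of the mailbox database."""
    messages: set = field(default_factory=set)


@dataclass
class HubTransport:
    """Hypothetical Hub Transport server that retains recently delivered messages."""
    dumpster: list = field(default_factory=list)

    def deliver(self, msg_id: str, active: Node) -> None:
        active.messages.add(msg_id)
        self.dumpster.append(msg_id)  # keep a copy for possible redelivery

    def redeliver(self, active: Node) -> int:
        """Resubmit retained messages; messages already present are not duplicated."""
        before = len(active.messages)
        active.messages.update(self.dumpster)
        return len(active.messages) - before


def replicate(source: Node, target: Node) -> None:
    """Simplified asynchronous replication: copy everything received so far."""
    target.messages.update(source.messages)


def simulate_failover():
    """Return the passive copy's contents right after failover and after redelivery."""
    hub = HubTransport()
    primary, passive = Node(), Node()

    hub.deliver("msg-1", primary)
    replicate(primary, passive)    # msg-1 reaches the passive copy
    hub.deliver("msg-2", primary)  # primary fails before msg-2 is replicated

    after_failover = set(passive.messages)  # passive node becomes active
    hub.redeliver(passive)                  # new active node requests redelivery
    return after_failover, set(passive.messages)
```

In this model, the message that arrived after the last replication step is missing immediately after failover and is restored once the Hub Transport server redelivers its retained messages, which is the behavior Microsoft IT required.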
During January and February of 2006, emotions ran high between the Exchange Server
product group and Microsoft IT. Decisions changed literally every day. One day Microsoft
IT would agree to deploy CCR on DAS, and the next day it would revert to the plans
for SAN-based single copy clusters (SCCs). In the end, the question was not settled
through debate. In the middle of the heated debate, a SAN storage array failure
occurred, taking down multiple Mailbox servers, and causing an outage and the loss
of 8,000 production mailboxes. It took three days to bring the systems back online,
and the worst news was yet to come. Through a combination of unfortunate circumstances,
the most recent tape-based backups were also irretrievably lost. Microsoft IT was
unable to restore the most recent data, and 8,000 users, including employees, partners,
contractors, and vendors lost e-mail data. It was a horrible week for Microsoft
IT and the Exchange Server product group alike. It showed not only the critical
nature of shared storage as a single point of failure in the Mailbox server architecture,
but also the vulnerability of an IT organization if it has to depend on tape-based
backups as the primary means to recover from storage failures.
Konstantin Ryvkin is a Senior Technology Architect at Microsoft IT and a member
of the Exchange Messaging Engineering team responsible for the design of the corporate
messaging environment. Looking back at that time, he says that the disaster and
the painful recovery made Microsoft IT a stronger IT organization. It highlighted
areas where previous technology and designs were no longer meeting business needs,
and it opened doors for more rapid innovation and adoption of new technology. It
also renewed the spirit of Microsoft IT to be the first and foremost customer of
Microsoft, deploying new technologies at full scale in the corporate messaging environment
to provide real-time feedback to the product teams and then to deliver solid proof
of the product's enterprise readiness to customers. Microsoft IT did not commit
to CCR on DAS merely because of a storage failure on SAN-based Mailbox servers,
but because it was important to demonstrate the enterprise readiness of CCR.
Despite a lingering sense of trepidation, a consensus was reached about the need
to move forward. In the initial Exchange Server 2007 Mailbox server designs,
Microsoft IT cautiously used CCR on DAS at a moderate scale of 2,000 mailboxes with 500-MB
quotas. Six months later, that scale increased to 6,000 mailboxes on most servers,
representing a total of approximately 5 terabytes of data. It has now reached more
than 12 terabytes on Mailbox servers for 4,000 heavy users with 2-GB quotas. CCR
on DAS is an absolute success at Microsoft IT, and the Exchange Messaging Engineering
team continues to explore product capabilities with new Mailbox server designs.
So far, Exchange Server 2007 has not reached its limits, yet there are factors,
such as recovery time objectives (RTOs) and recovery point objectives (RPOs), that
require Microsoft IT to revamp backup and disaster recovery procedures before placing
more than 12 terabytes of messaging data on a Mailbox server.
While preparing for the Exchange Server 2007 production rollout in 2006, Microsoft
IT analyzed internal messaging statistics and trends, assessed user demographics,
clarified legal and regulatory requirements, and reviewed existing service level
agreements (SLAs) for messaging in order to identify important business requirements.
The messaging statistics and trends clearly showed a need for increased storage
capacities in the messaging environment. The mailbox count grew by approximately
75 percent over the five years prior to the deployment of Exchange Server 2007,
as indicated in Table 1. It was clear that this trend would continue after the rollout.
Furthermore, corporate management began to encourage employees to stop using personal
folders to archive messages. Based on data gathered from surveys and during pilot
projects, Microsoft IT determined that depending on job responsibilities, users
would require mailbox capacities of up to 2 GB.
Table 1. Microsoft Messaging Statistics and Projections
| Category | 2002/2003 | 2003/2004 | 2004/2005 | 2005/2006 | 2006/2007 | 2007/2008 |
| --- | --- | --- | --- | --- | --- | --- |
| Total mailboxes | 71,000 | 80,000 | 95,000 | 110,000 | 130,000 | 147,000 |
| Microsoft Exchange ActiveSync® users per month | Not applicable | 6,000 | 13,000 | 21,000 | 31,000 | 48,000 |
| Outlook® Anywhere users per month | Not applicable | 20,000 | 25,000 | 60,000 | 60,000 | 100,000 |
| Internet message submissions per day (unfiltered) | 6,000,000 | 9,000,000 | 11,300,000 | 13,000,000 | 13,500,000 | 30,000,000 |
| Blocked message submissions per day | 2,500,000 | 7,500,000 | 10,000,000 | 10,500,000 | 11,000,000 | 28,000,000 |
| Maximum message size | 2 MB | 5 MB | 10 MB | 10 MB | 10 MB | 10 MB |
| E-mail volume per user per calendar day | 10 MB | 15 MB | 15 MB | 20 MB | 20 MB | 26 MB |
| Number of clustered Mailbox servers | 113* | 38 | 34 | 30 | 62 | 34 |
| Typical mailbox quota | 100 MB | 200 MB | 200 MB | 200 MB | 500 MB or 2 GB | 500 MB or 2 GB |
| Total mailbox data | 7 terabytes | 17 terabytes | 19 terabytes | 22 terabytes | 60 terabytes | Up to 300 terabytes |
* Mostly non-clustered Mailbox servers
Facing the requirement to increase mailbox capacities by up to a factor of 10
while maintaining existing SLAs, Microsoft IT looked closely at messaging demographics
to assess individual user needs more accurately. Figure 1 shows the demographic
profile of the corporate messaging environment based on the categories defined in
the Exchange Server 2007 product documentation. The product documentation uses
an average message size of 50 kilobytes (KB), which corresponds to the typical
message size in the corporate messaging environment (43 KB to 50 KB). The user profiles
enabled Microsoft IT to make reasonable assumptions regarding mailbox capacity requirements.
Figure 1. Demographic messaging profile at Microsoft
The demographic profile reveals that not all Microsoft users require the largest mailboxes
to maintain their productivity levels. In fact, only about one-third of the users
are very heavy and heaviest e-mail message recipients with the largest mailbox requirements,
about 15 percent of the users are heavy users that can benefit from 2-GB mailboxes,
and the majority are light and medium users with moderate messaging needs. For example,
most Microsoft partners and vendors with internal accounts belong to the light and
medium user groups. Accordingly, Microsoft IT defined that 33 percent of the users
require 2-GB mailbox quotas and 66 percent require 500-MB quotas as a starting point.
Full-time employees with 500-MB mailboxes have the ability to request 2-GB mailbox
quotas if needed. This arrangement directly influenced the Mailbox server designs
that Microsoft IT created for the initial production rollout in 2006.
During the initial rollout in the production environment, Microsoft IT used three
different Mailbox server designs to accommodate the user need for larger mailboxes.
The most common server type, based on CCR on DAS, supported 2,000 users with mailbox
quotas of 500 MB. The other two server types supported 2,400 and 3,600 users
with 2-GB quotas. Microsoft IT deployed the server type based on CCR on SAN for
3,600 users with 2-GB quotas only during the Beta 1 stage and switched entirely
to the CCR-on-DAS design for 2,400 users with the availability of Exchange Server 2007
Beta 2. For details about the three original Exchange Server 2007 Mailbox server
designs, see the Note on IT "Going 64-bit with Microsoft Exchange Server 2007,"
available at
http://www.microsoft.com/downloads/details.aspx?FamilyID=f31e7541-f63a-4b7d-b8d2-3794c4dc3329&DisplayLang=en.
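As a rough illustration of what the three initial designs imply for storage, the following Python sketch multiplies the user counts by the quotas stated above. It is a sketch under stated assumptions: the function name `quota_ceiling_gb` and the dictionary `SERVER_DESIGNS` are hypothetical, 1 GB is taken as 1,024 MB, and the result is only a ceiling on user mailbox data, not a storage design.

```python
MB_PER_GB = 1024  # assumption: binary units (1 GB = 1,024 MB)

# (users per server, quota per mailbox in MB), taken from the three initial designs.
SERVER_DESIGNS = {
    "CCR on DAS, 500-MB quotas": (2000, 500),
    "CCR on DAS, 2-GB quotas": (2400, 2 * MB_PER_GB),
    "CCR on SAN, 2-GB quotas": (3600, 2 * MB_PER_GB),
}


def quota_ceiling_gb(users: int, quota_mb: int) -> float:
    """Return the total quota, in GB, if every mailbox were filled to its limit."""
    return users * quota_mb / MB_PER_GB


if __name__ == "__main__":
    for design, (users, quota_mb) in SERVER_DESIGNS.items():
        print(f"{design}: {quota_ceiling_gb(users, quota_mb):,.0f} GB")
```

The figures produced this way cover quota limits only. Actual capacity planning must also account for factors outside this sketch, such as the retention of deleted items listed among the SLAs in Table 2.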
Several factors caused Microsoft IT to approach the initial server design cautiously.
Most importantly, CCR was still in beta and therefore not a trusted technology yet.
Additionally, the only backup solution available at the time for Microsoft IT Mailbox
servers based on CCR on DAS relied on Windows® Backup, the streaming backup API
with limited throughput capabilities, and the requirement to perform online backup
operations on the active node. Performing online backups on the passive node requires
VSS-based technology, such as System Center Data Protection Manager 2007, but
that product was not available yet. Furthermore, the dual-core processors and server
models that Microsoft IT used during the initial rollout limited server scalability.
Microsoft IT was not yet able to place 6,000 or more mailboxes on a server without
jeopardizing SLAs.
Table 2 summarizes the organization-wide SLAs with a business importance level (BIL)
of Important that influence the design of Mailbox servers in the corporate production
environment.
Table 2. Organization-Wide SLAs with Impact on Mailbox Server Designs
| Service level definition | Resolution target | Comments |
| --- | --- | --- |
| End-to-end availability of messaging services | 99.99 percent or greater | This SLA gives an end-to-end view of messaging as a managed service and includes Mailbox server availability as well as the availability of Client Access servers, Active Directory, and the network infrastructure. On Mailbox servers, Microsoft IT measures the availability of system services based on stop and start events and the availability of messaging databases based on events generated by the Exchange Information Store service. |
| End-to-end client availability | 99.5 percent or greater | This SLA defines client availability as the percentage of successful remote procedure call (RPC) activity in relationship to failed RPC activity between Microsoft Office Outlook clients or Client Access servers and Mailbox servers. |
| End-to-end client performance | 95 percent or greater | This SLA requires RPC client/server operations between Outlook clients or Client Access servers and Mailbox servers to finish in less than two seconds. |
| Business will continue with messaging service | One hour or less | This SLA requires individual database restores through reseeding or from backup to finish in less than one hour. |
| Retention of mailbox database backups | 14 days or more | This SLA enables Microsoft IT to discard Exchange Server backups after 14 days. Microsoft IT does not use database backups for archiving purposes. |
| Retention of deleted items | 14 days or more | This SLA influences the calculation of server capacity needs, as explained later in this white paper. |
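To make the first three targets in Table 2 more concrete, the following Python sketch shows one way such figures could be evaluated. It is illustrative only and does not describe how Microsoft IT measures its SLAs: the function names are hypothetical, a 365-day year is assumed, and client availability is interpreted here as successful RPC operations divided by all RPC operations, which is an assumption about the definition in the table.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year that an availability target still permits."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def client_availability(successful: int, failed: int) -> float:
    """Percentage of RPC operations that succeeded (assumed interpretation)."""
    total = successful + failed
    return 100.0 * successful / total if total else 100.0


def meets_performance_sla(latencies_seconds, threshold=2.0, target_percent=95.0) -> bool:
    """True if enough RPC operations finished in less than the threshold."""
    if not latencies_seconds:
        return True  # no operations observed, nothing violated
    fast = sum(1 for latency in latencies_seconds if latency < threshold)
    return 100.0 * fast / len(latencies_seconds) >= target_percent


if __name__ == "__main__":
    for target in (99.99, 99.999):
        print(f"{target} percent: {downtime_budget_minutes(target):.2f} minutes per year")
```

Under these assumptions, a 99.99 percent target leaves roughly 52.6 minutes of downtime per year, and 99.999 percent leaves roughly 5.3 minutes.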
It is important to note that Microsoft IT also maintains RPOs and RTOs as a best-effort
commitment. Both the current RPO and the current RTO for complete server restorations
are less than 24 hours. However, these RPOs and RTOs remain from the Exchange Server 2003
time frame and were defined long before continuous-replication technology became
a reality. They do not address the new capabilities. For example, the RPOs and RTOs
do not clearly define a maximum time for reseeding an entire server to re-enable
full resiliency after a total node failure. This aspect currently falls on an informal
basis under the complete server RTO of less than 24 hours.
The development of new RPOs and RTOs is a work in progress. Microsoft IT is also
creating new disaster recovery procedures to take full advantage of CCR and possibly
standby continuous replication (SCR). Microsoft IT is targeting an RTO of 12 hours
and an RPO of less than one hour in case of a complete data-center loss. Although
these targets seem very achievable, Microsoft IT technology architects still need
to verify the new disaster recovery procedures within and across data centers with
more than 12 terabytes of messaging data on a Mailbox server.
“With more than 18 months of production use, I can say that CCR on DAS is a great
solution for a Mailbox server platform running Exchange Server 2007, especially
for environments looking for highly available and cost-efficient designs. Through
the deployment at Microsoft IT, CCR on DAS proved to perform exceptionally well
in high-availability scenarios at the impressive scale far exceeding our initial
expectations. Now nobody in the Microsoft IT Messaging team would consider reverting
back to the previous SAN-based architecture—we get everything that the business
demands from e-mail on the DAS-based CCR platform.”
Konstantin Ryvkin
Senior Technology Architect
Microsoft IT Exchange Messaging
Microsoft Corporation
From the business point of view, the single most important advantage of CCR
on DAS over any other storage solution for Mailbox servers is the possibility to
have substantially increased mailbox sizes with substantially decreased storage
costs while maintaining or exceeding existing high-availability levels. For example,
DAS enables Microsoft IT to support mailbox quotas of up to 2 GB with a TCO
that is comparable to maintaining 200-MB mailbox sizes on SAN. Microsoft employees
can use large mailbox capacities in conjunction with optimized, fast, and reliable
search capabilities that are built in to Exchange Server 2007. Microsoft employees
can store more than a year's worth of e-mail messages in their mailboxes and do
not have to use personal folder stores (in .pst files) to move out messages as data
volumes approach mailbox limits. Having all messaging data directly on the Mailbox
server facilitates data maintenance, backups, and server-based content indexing
and search, and reduces security risks and support costs. Having all data on the
server also means Microsoft IT can apply centrally maintained policies to ensure
compliance with government and company regulations, and users can access all of
their messages from any capable device in the office, at home, and on the road.
Outlook Anywhere, Microsoft Office Outlook Web Access, and Exchange ActiveSync are
preferred in locations with Internet connections, and Unified Messaging provides
access in all other locations, as long as at least a stationary or mobile phone
is available.
For the technical decision makers at Microsoft IT, one of the most important initial
concerns about CCR on DAS is now one of the most compelling reasons for CCR on DAS:
maintaining large data volumes on Mailbox servers with high availability. With 10
times more data on the servers and an ever-increasing number of users, it becomes
increasingly difficult to rely on backup and restore operations as the primary
means to recover from storage failures and maintain existing high-availability SLAs,
specifically Microsoft IT RTOs that demand restoring a Mailbox server with all messaging
databases in less than 24 hours. CCR shifts the focus from backups to failovers
as the primary recovery mechanism after a server or storage failure and is a key
element in Microsoft IT's long-term strategy around large mailboxes.
CCR on DAS provides Microsoft IT with the following advantages in the Mailbox server
design (explained in more detail in subsequent sections):
- Increased Exchange data resilience: Failover
is the main method to recover from storage failures on the primary node. The primary
node has a built-in application-level mechanism that replicates and maintains synchronized
copies of messaging databases on separate server nodes and provides failover times
of less than two minutes. The Mailbox server can continue to run on the second node
with only a short service interruption, whereas none of the nodes on an SCC-based
server cluster can keep the Mailbox server running after a storage failure.
- Ongoing backup operations during regular business hours: Maintaining separate synchronized data copies
on the active and the passive node implies that it is possible to perform ongoing
backup operations for a Mailbox server on the passive node without affecting users
accessing their mailboxes on the active node. Microsoft IT uses Data Protection
Manager 2007, which is fully CCR aware and supports passive-node backup operations
every 15 minutes.
- Reduced reliance on traditional backups to restore data: With CCR-based clusters, Microsoft IT uses
backup as a secondary tool to ensure the recoverability of data in the event that
storage on the second node also fails. This is an unlikely scenario because Microsoft
IT attends to any primary node failure immediately. This is possible because Microsoft
IT continuously monitors all Mailbox servers in the corporate messaging environment
by using Microsoft System Center Operations Manager 2007. In the event of a
node failure, Operations Manager 2007 automatically alerts front-line operators,
who take prompt action. With the Mailbox server running on the second node, Microsoft
IT can repair the failed node and then reseed the messaging databases if necessary
without having to rely on restoring from backups. Reseeding is seldom necessary
because the Extensible Storage Engine (ESE) supports automatic recovery mechanisms
based on transaction log file replay and can sustain most node failures.
- Simplicity in the storage design and low maintenance overhead: Unlike SANs, which are complex
storage systems maintained by highly specialized engineers, DAS is very straightforward
and requires only minimal skills that every Exchange administrator can master. Microsoft
IT uses external enclosures for hard disk drives and dedicated RAID controllers
in each cluster node. It is not necessary to use identical hardware between cluster
nodes. Only the drive letter assignments must match. In fact, Microsoft supports
CCR-based Mailbox servers with any appropriate components and server models from
the standard Hardware Compatibility List (HCL), up to the point that the cluster
nodes can be from different server vendors. However, it is important to note that
Microsoft IT uses identical configurations so that failover to the passive node
can occur without sacrificing performance.
- Predictable Mailbox server performance: The
local nature of the DAS solution ensures optimal performance because the local cluster
node can utilize 100 percent of the storage resources. There is no need for
shared-capacity management because a single server application, such as Exchange
Server 2007, uses the storage resources exclusively. The DAS-based Mailbox
server design implicitly conforms to the Microsoft recommendation to provide dedicated
storage for Exchange Server operations.
Figure 2 shows the architecture of the SAN-based server cluster configuration that
Microsoft IT used prior to Exchange Server 2007 in the corporate messaging
environment. Four active nodes correspond to four Exchange Server 2003 Mailbox
servers, and one primary passive node is available to run one of these Mailbox servers
without performance penalties after a failover. The remaining two passive nodes
were less powerful and served primarily as systems to perform tape-based backup
operations. As illustrated, the SAN environment is fully redundant at the hardware
level from the cluster nodes all the way down to the storage media. Any single component
can fail along the path to the data without interrupting the service. After a timeout
of the pending I/O operation, the SAN software automatically switches to the second
path and the Mailbox server continues to run on the same cluster node.
Figure 2. Microsoft IT server cluster configuration for Exchange Server 2003
Mailbox servers
Despite the complexity and redundancy of SANs, Exchange databases remain critical
single points of failure. The SAN-based cluster can provide a high level of availability
for the messaging databases and the server cluster can recover from hardware failures
on up to three cluster nodes through a failover, but this server cluster cannot
recover from a critical storage failure. When SAN storage fails for any reason,
none of the seven cluster nodes can run the affected Mailbox server or servers until
Microsoft IT repairs the system and restores the messaging databases from backup.
Although it is unlikely that both disks in a particular mirror set fail
simultaneously and cause a RAID 10 failure, hardware failures and firmware issues can
lead to SAN outages. Human error is also a possibility. In fact, human error is
the biggest risk factor due to the complex nature of SANs. It might take a long
time until a storage failure happens, but when it happens, the best strategy is
to have a second, synchronized data copy readily available. CCR provides this solution.
As illustrated in Figure 3, the CCR-based architecture is very straightforward and
fully redundant at the Exchange database level. Microsoft IT connects multiple storage
enclosures to each cluster node and creates RAID 10 drives with mirror sets
across the enclosures, as explained in more detail later in this paper. At the Exchange
database level, this results in twice the redundancy in comparison to the SAN-based
Mailbox server configuration illustrated in Figure 2.
Figure 3. CCR-based Mailbox server configuration
Notably, CCR on DAS delivers twice the redundancy at the Exchange database level
with less storage sophistication and complexity on the individual cluster nodes
than SCC on SAN. As a Microsoft IT engineer
put it, two simple storage systems can be better than one complex system. In a SAN-based
configuration, every cluster node has two host bus adapters connected to separate
Fibre Channel fabrics, switches, and controller pairs, and the Fibre Channel disks
are dual ported. Yet with CCR on DAS, Microsoft IT uses SAS disks with a single
port and only a single RAID controller in each cluster node per storage unit, as
shown in Figure 3. At the Exchange database level in the overall Mailbox server
design, the RAID controllers are redundant, but at the level of individual cluster
nodes, this is a local single point of failure. Eliminating this local single point
of failure requires a CCR on SAN configuration with two host bus adapters in each
cluster node connecting each node to a separate SAN storage array. Microsoft IT
used this configuration in very early Exchange Server 2007 Mailbox server designs,
but switched entirely to CCR on DAS with subsequent deployments during the initial
production rollout for cost reasons. Because a controller failure forces a failover,
CCR on DAS cannot sustain a controller failure on a cluster node without service
interruption. However, failure rates at the RAID controller level are low, and so is
their impact on the end-to-end availability of messaging services: Microsoft IT
deployed 90 SAS RAID controller cards and had zero controller failures during the
past 12 months. Microsoft IT therefore finds the service interruption of less than
two minutes of failover time acceptable. For Microsoft IT, the costs to deploy CCR
on SAN far outweigh the benefit of eliminating these two minutes in the unlikely
event that a RAID controller
fails. SCC on SAN is not an alternative because SCC on SAN cannot recover from storage
failures and requires Microsoft IT to restore data from backup, whereas CCR on DAS
can sustain a storage failure because the data is readily available on the second
node.
Note: Microsoft IT initiates the vast majority
of failovers manually by using the Move-ClusteredMailboxServer
cmdlet during planned and unplanned maintenance, such as to install mandatory
security patches or update driver versions on cluster nodes. CCR on SAN has no advantage
over CCR on DAS in this scenario because the failovers are unavoidable and the duration
of the service interruption is comparable.
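To put the two-minute failover into perspective, the following minimal sketch computes the annual downtime budget implied by an availability target and the number of failovers of a given length that would fit into it. The 99.99 percent target and the two-minute failover time come from this paper. The function names, the 365-day year, and the simplification that every failover counts fully against the budget are illustrative assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the yearly downtime (in minutes) allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def failovers_within_budget(availability_percent: float, failover_minutes: float) -> int:
    """Return how many failovers of the given length fit into the yearly budget."""
    return int(downtime_budget_minutes(availability_percent) // failover_minutes)


budget = downtime_budget_minutes(99.99)
print(f"Downtime budget at 99.99 percent: {budget:.2f} minutes per year")
print(f"Two-minute failovers within budget: {failovers_within_budget(99.99, 2)}")
```

Under these assumptions, the budget is roughly 52.5 minutes per year, which is why each avoided failover or failback matters for the availability target.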
CCR technology does not automatically eliminate the need for backups. There is a
lack of redundancy if the storage subsystem on the primary node of a clustered Mailbox
server experiences a total failure because only one node remains with the data until
Microsoft IT repairs and reseeds the databases on the affected node. Backups provide
the required additional layer of protection, and the recommended approach is to
perform VSS-based backups on the passive node, as illustrated in Figure 4.
Figure 4. Data Protection Manager 2007–based backup on passive nodes
Microsoft IT switched from streaming backups on the active node to software-based
VSS backups on the passive node with the release of Data Protection Manager 2007.
By minimizing the backup-related performance impact on the active node, Microsoft
IT can accommodate more users per Mailbox server while at the same time performing
backup operations at much more frequent intervals. Microsoft IT configures the Data
Protection Manager server to receive transaction logs every 15 minutes. The server
performs an express full backup once a day to maintain a complete and consistent
image of the data on the Data Protection Manager server. The express full backup
relies on block-level synchronization in conjunction with the Exchange VSS Writer
to identify and replicate only those data blocks that have changed in the production
databases since the last express full backup.
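The synchronization schedule described above can be illustrated with a short sketch that lists the transaction log synchronization times within one day. The 15-minute interval comes from this paper. The function name and the example date are hypothetical, and the sketch does not call any Data Protection Manager interface.

```python
from datetime import datetime, timedelta


def recovery_points(day_start: datetime, interval_minutes: int = 15) -> list[datetime]:
    """List the log synchronization times within one 24-hour period."""
    count = (24 * 60) // interval_minutes
    return [day_start + timedelta(minutes=i * interval_minutes) for i in range(count)]


# Arbitrary example date; only the spacing of the points matters here.
points = recovery_points(datetime(2008, 1, 1))
print(len(points), "recovery points per day")
print("first:", points[0].time(), "last:", points[-1].time())
```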
For backup storage on Data Protection Manager servers, Microsoft IT also uses DAS
technology, specifically RAID 10 on Serial Advanced Technology Attachment (SATA)
disks with an individual disk capacity of more than 500 GB. In comparison to
hardware VSS backups that Microsoft IT used in conjunction with SAN-based Mailbox
servers, the solution based on Data Protection Manager 2007 on DAS helps Microsoft
IT reduce the complexity of the backup environment, eliminate third-party dependencies,
and achieve further storage cost savings while maintaining fast recovery objectives.
Data Protection Manager 2007 enables Microsoft IT to restore mailbox data from
any 15-minute point in time onto the original server or to a different server. However,
there is practically no need to perform restores onto the original server for disaster
recovery or any other purposes, as the more than 18 months of Microsoft IT experience
with CCR suggest. For Microsoft IT, restoring files from backup is a tool that can
be helpful in software testing and in the validation of disaster recovery plans.
Microsoft IT performs these restores to a different server to avoid affecting the
Mailbox servers in the corporate messaging environment.
CCR entirely transformed the Microsoft IT approach to fast recovery. The previous
method for SAN-based Mailbox servers relied on VSS clones. At midnight, Microsoft
IT cloned the logical unit numbers (LUNs) for the Mailbox server to a new set of
clone LUNs. Although this solution provided fast recovery of large amounts of data
from backup in a matter of minutes, it required two additional LUNs for each Mailbox
server LUN with associated high SAN costs and highly specialized storage engineers
to perform the recovery procedures in the SAN environment. By switching to CCR for
fast recovery with Exchange Server 2007, Microsoft IT eliminated these costs
and dependencies. Restoring from backup is no longer the primary recovery mechanism.
The fast recovery mechanism of a CCR-based Mailbox server is a straightforward failover
to the passive node.
CCR effectively eliminates the need for restores from backup onto the original
Mailbox servers because the cluster nodes in a CCR-based Mailbox server are
mutual hot-standby systems. Active and passive nodes can change
their roles at any time. CCR automatically reverses the replication direction to
keep the messaging databases synchronized by means of transaction log shipping and
replay on the passive node. This implies that any node can fail and rebuild by using
the messaging databases that are still available on the other node. The cluster
nodes share no hardware components. It is therefore unlikely that a storage failure
on one node will affect the messaging databases on the other node. Messaging databases
that are unaffected, mounted, and available online on the second node are the basis
for recovery scenarios without backups.
The typical CCR-based recovery scenario includes the following four phases (depicted
in Figure 5 after the list):
- Normal operation: The clustered
Mailbox server is available and new transactions, such as those caused by Hub Transport
servers delivering messages, result in new transaction log files on the active node. Through
file system notifications, the Microsoft Exchange Replication Service on the passive
node learns that new transaction logs are pending replication. The NTFS file system
generates these notifications on the active node whenever the ESE closes and renames
the current transaction log file with a sequence number to make room for the next
transaction log. The Exchange Replication Service on the passive node copies the
new transaction log files via a security-enhanced file share from the active node
into the local transaction log inspection folder. Exchange Server 2007 inspects
these logs and moves them to the target storage group's transaction log folder for
replay into the destination mailbox database. This asynchronous process of having
the passive node perform the transaction log replication helps to keep CPU and I/O
load on the active node at a minimum, but it also introduces a chance for a lossy
failover.
- Lossy failover and recovery: In
the worst-case scenario where the active node fails right after a Hub Transport
server delivered messages and before CCR had a chance to replicate the current transactions,
failover occurs and the passive node becomes active without the most recent messages.
To retrieve the missing messages, the Mailbox server requests redelivery from all
Hub Transport servers in the local Active Directory site as part of the recovery
procedure after a lossy failover. Hub Transport servers maintain a transport dumpster
queue for each continuous replication-enabled storage group in order to retain recently
delivered messages and redeliver these messages promptly to bring the Mailbox server
up to date. Microsoft IT uses the Set-TransportConfig
cmdlet to configure this feature on all Hub Transport servers with the
MaxDumpsterSizePerStorageGroup parameter set to 15 MB and a
MaxDumpsterTime value of 07.00:00:00, which corresponds to seven days. This
is explained in more detail in the Microsoft IT Showcase Note on IT "Going
64-Bit with Microsoft Exchange Server 2007," available at
http://www.microsoft.com/downloads/details.aspx?FamilyID=f31e7541-f63a-4b7d-b8d2-3794c4dc3329&DisplayLang=en.
- Node repair: At this point, the
Mailbox server is up to date and running on the remaining cluster node while Microsoft
IT performs the necessary repair activities on the failed node, such as replacing
the public network interface card (NIC) or updating a faulty driver. The key issue
is whether the node failure affected the messaging databases on the failed node.
Typically, all messaging data is still intact and Microsoft IT only needs to restart
the failed cluster node and catch up on transaction logs. In rare cases, node failures
corrupt messaging data. In this situation, Microsoft IT only needs to reseed the
affected databases from the remaining copy, which is an uncomplicated procedure
that uses the ESE streaming backup API to perform online copy operations. Through
reseeding, Microsoft IT can copy messaging databases individually from the currently
active node to the repaired node while users are online and accessing their mailboxes.
Users might notice slower server response times during the reseeding process, but
the Mailbox server with all the messaging data is available and the server performance
meets the Microsoft IT SLAs.
- Normal operation: The Mailbox server
resumes normal operations after the failed node is repaired, and any affected messaging
databases are reseeded. On the repaired node, Exchange Server 2007 automatically
recognizes that it is now running in passive context, and CCR reverses to update
the repaired passive node with all transactions that occur on the active node. There
is no need to reconfigure the system. CCR simply continues to replicate the messaging
data from the active node to the passive node. Furthermore, Microsoft IT does not
need to perform a failback of the clustered Mailbox server to the original node
because both nodes have an identical hardware and software configuration. The Mailbox
server can continue to run on the second node without performance penalties, Data
Protection Manager 2007 automatically switches backup processes to the current
passive node, and Microsoft IT saves two minutes of failback time, which would otherwise
count against the high-availability SLAs.
Figure 5. Recovering from a storage failure in a CCR-based Mailbox server
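The four phases can be walked through with a small toy model. This is a sketch under strong simplifying assumptions, not a description of how Exchange Server 2007 is implemented: sets of message identifiers stand in for transaction logs and database content, the `Node` class and the `replicate` and `redeliver` functions are hypothetical names, and the dumpster has no size or time limits.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of a CCR cluster node; 'messages' stands in for database content."""
    name: str
    messages: set[str] = field(default_factory=set)


def replicate(source: Node, target: Node) -> None:
    """Model log shipping: the target pulls whatever the source has and it lacks."""
    target.messages |= source.messages


def redeliver(dumpster: list[str], node: Node) -> list[str]:
    """Model transport dumpster redelivery: resend messages the node is missing."""
    missing = [m for m in dumpster if m not in node.messages]
    node.messages.update(missing)
    return missing


# Phase 1: normal operation. The passive node B pulls logs asynchronously from A.
node_a, node_b = Node("A"), Node("B")
dumpster: list[str] = []
for msg in ("m1", "m2", "m3"):
    node_a.messages.add(msg)
    dumpster.append(msg)
replicate(node_a, node_b)

# Phase 2: lossy failover. "m4" arrives on A, but A fails before B copies the log.
node_a.messages.add("m4")
dumpster.append("m4")
active, failed = node_b, node_a
recovered = redeliver(dumpster, active)
print("Redelivered after lossy failover:", recovered)

# Phase 3: node repair. Worst case: A lost its data and is reseeded from B.
failed.messages.clear()
replicate(active, failed)

# Phase 4: normal operation with reversed roles. B stays active, A is now passive.
active.messages.add("m5")
replicate(active, failed)
print("Nodes in sync:", active.messages == failed.messages)
```

The model shows why no failback is needed: after the repair, replication simply continues in the reversed direction.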
Who manages the storage subsystem is an important non-technical aspect of CCR on DAS
in comparison to any SAN-based Mailbox server design. In SAN-based environments,
dedicated and highly specialized storage engineers perform the necessary installation,
configuration, optimization, and troubleshooting tasks. In DAS-based environments,
regular Exchange administrators are typically capable of performing these storage-related
tasks. Having full control over all aspects of the Mailbox server design—end to
end, from the disks up to the client connections—is an important advantage for the
Exchange Messaging team at Microsoft IT. Yet, it also requires Microsoft IT to design
the storage subsystem with simplicity in mind so that Exchange administrators can
perform all configuration and maintenance tasks without the help of highly specialized
storage experts.
Microsoft IT achieves simplicity in the storage design primarily through the following
approaches:
- Taking advantage of SAS technology: Unlike
SAN technology that requires strict configuration, SAS offers impressive configuration
flexibility at the hardware layer. SAS uses common electrical SATA connections,
and an SAS enclosure can accept different types of disks. The order of the disks
in an array is not important, and it is possible to expand an SAS storage subsystem
by daisy-chaining storage enclosures together. Microsoft IT recently tested the
flexibility of SAS in a lab environment by shutting down a cluster node after moving
the clustered Mailbox server to the other node, changing the position of the SAS
disks in the storage enclosures, and replacing the RAID controller. During the restart
of the cluster node, the system fully recognized the RAID configuration and drive
letter assignments and the cluster node was immediately operational again. According
to Seagate data sheets, SAS technology has emerged as an enterprise technology with
small form factor (SFF) 2.5-inch disks surpassing Fibre Channel disks with an annualized
failure rate (AFR) of 0.55 percent (versus 0.62 percent) while disk capacities continue
to increase and hardware prices continue to fall. New generations of SAS RAID controllers
appear every 12 to 18 months and typically come along with new server generations,
whereas the product cycle of SAN counterparts is about three to four years. The
SAN innovation rate is slower due to the higher technical complexities in comparison
to DAS technology. For example, SFF disks appeared on the market more than two years
ago, yet SAN vendors are still not able to integrate this technology into their
systems. With CCR on DAS, Microsoft IT takes full advantage of the most dynamic
products and technologies in the storage market.
- Standardizing the storage layout by
creating universal storage building blocks: To further minimize
hardware maintenance complexities, help ensure reliability, and provide scalability,
Microsoft IT developed a standardized storage layout that uses universal storage
building blocks for scaling production Mailbox servers. A universal storage building
block, or USBB as Microsoft IT calls it for short, is a self-contained unit of two
physical storage enclosures, combined to provide database and transaction log drives
for the specified set of Exchange databases. The number of drives—identified through
individual LUNs at the SCSI level—that Microsoft IT can use in a USBB depends on
the capacity of the storage enclosures and the required number of LUNs to provide
the desired data capacity and I/O performance. The USBB count per server depends
on the number of mailboxes and the mailbox quotas that Microsoft IT wants to maintain
on the server. However, the conceptual storage layout per USBB remains unchanged.
Among other things, this makes it easy to monitor the drives and replace failing
disks. An operator can easily identify each disk through its position in the storage
subsystem, as indicated in Figure 6, which shows a USBB configuration with 25 disks
per enclosure. This particular USBB provides three RAID 10 LUNs for messaging
databases, and one RAID 10 LUN for all transaction logs combined.
Figure 6. A universal storage building block for CCR on DAS
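The effect of mirroring disks across the two enclosures of a USBB can be sketched as follows. The 25 slots per enclosure and the three database LUNs plus one log LUN come from the description of Figure 6. The slot ranges assigned to each LUN are hypothetical, as are all names in the code. The actual assignment is the one shown in the figure.

```python
SLOTS_PER_ENCLOSURE = 25  # from the Figure 6 description

# Hypothetical slot ranges per LUN; the real assignment is shown in Figure 6.
LUN_SLOTS = {
    "DB1": range(0, 7),
    "DB2": range(7, 14),
    "DB3": range(14, 21),
    "LOG": range(21, 25),
}


def mirror_pairs(slots):
    """Pair slot i of enclosure 0 with slot i of enclosure 1 (one mirror per slot)."""
    return [((0, slot), (1, slot)) for slot in slots]


def lun_available(lun: str, failed_disks: set) -> bool:
    """A RAID 10 LUN survives while every mirror keeps at least one working disk."""
    return all(
        any(disk not in failed_disks for disk in pair)
        for pair in mirror_pairs(LUN_SLOTS[lun])
    )


# Losing enclosure 0 entirely removes one disk from every mirror.
enclosure_0_down = {(0, slot) for slot in range(SLOTS_PER_ENCLOSURE)}
after_enclosure_loss = {lun: lun_available(lun, enclosure_0_down) for lun in LUN_SLOTS}
print(after_enclosure_loss)

# Losing both disks of a single mirror takes the affected LUN offline.
both_halves_down = {(0, 3), (1, 3)}
after_mirror_loss = {lun: lun_available(lun, both_halves_down) for lun in LUN_SLOTS}
print(after_mirror_loss)
```

Because each disk is identified by its enclosure and slot, the same addressing that drives the availability check also tells an operator which physical disk to replace.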
An issue that Microsoft IT occasionally notices in large customer deployments on
SAN-based Mailbox servers concerns the use of shared storage for messaging databases.
Whereas Microsoft IT Mailbox server designs for Exchange Server 2003 always
followed product recommendations to use dedicated storage arrays, customers occasionally
ignore this recommendation and share the storage hardware between different types
of server applications, such as Microsoft SQL Server® and Exchange Server, in an
attempt to maximize capacity utilization. However, if Microsoft SQL Server databases
are stored on the same physical media as Exchange Server databases, running large
SQL Server jobs can lead to non-Exchange load surges and impaired Exchange Server
performance. Microsoft IT engineers call this issue
hot-spot contention to indicate poor performance of the storage subsystem
resulting from different usage patterns caused by different server applications
accessing the same storage media, as illustrated in Figure 7.
Figure 7. Different usage patterns on shared storage
The database I/O of Exchange Server 2007 is composed of large numbers of random
page requests using a page size of 8 KB, whereas other server applications
might access data more sequentially and in larger contiguous blocks. If all this
data resides on the same physical media, the heads of the hard disk drive must frequently
move out of the Exchange data region to service the non-Exchange data requests.
The result is an unpredictable sharp decrease in Exchange Server performance due
to increased response times at random intervals. For example, an Exchange read request
that might have taken 8 milliseconds (ms) might now take 108 ms because the disk
heads spend 100 ms retrieving non-Exchange data between the individual page requests
of Exchange Server.
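The arithmetic of this example can be written down in a few lines. The 8 ms service time and the 100 ms of interference come from the example above. The function names and the assumption that 10 percent of reads are affected are hypothetical and only illustrate how occasional contention raises the average.

```python
def response_time_ms(service_ms: float, interference_ms: float = 0.0) -> float:
    """Response time of one read: its own service time plus time lost to other I/O."""
    return service_ms + interference_ms


def mean_response_time_ms(
    service_ms: float, interference_ms: float, contended_fraction: float
) -> float:
    """Average response time when only a fraction of the reads hit contention."""
    return service_ms + contended_fraction * interference_ms


print("Uncontended read:", response_time_ms(8), "ms")
print("Contended read:", response_time_ms(8, 100), "ms")
# Hypothetical: 10 percent of reads wait behind non-Exchange I/O.
print("Average with 10 percent contended:", mean_response_time_ms(8, 100, 0.10), "ms")
```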
Exchange Server administrators cannot analyze this problem because the non-Exchange
LUN is not visible in the Exchange server configuration. Likewise, the Exchange
Server LUN is invisible on the computer running SQL Server. Exchange Server administrators
and SQL Server administrators have no idea that they share the same physical storage
media. Furthermore, the storage engineer who created the LUNs is unaware of the
Exchange Server and SQL Server requirements. As far as the SAN environment is concerned,
the storage engineer followed common best practices to counterbalance SAN costs.
All systems show an optimized configuration, and yet users occasionally complain
about slow Office Outlook clients in online mode.
Hot-spot contention is hard to locate, and it keeps reappearing in customer environments
because there is no certainty in a SAN that the system configuration does not change
over time. All too often, unaware storage engineers cannot resist the temptation
to optimize available storage capacities. CCR on DAS puts an ultimate end to this
problem by taking Exchange Server storage out of the shared SAN environment. It
puts full control over the storage design into the hands of Exchange Server experts
and eliminates unpredictable system behavior due to the side effects of SAN capacity
optimization.
“Microsoft IT is our ultimate touchstone of enterprise readiness. From Microsoft
IT we learn not only how our technology performs under real-world conditions, but
also what issues and constraints system engineers face when designing production-grade
Mailbox servers. No test lab can deliver this valuable insight. It fuels our development
activities and ensures that we stay focused on actual current and future needs of
our customers.”
Matt Gossage
Sr. Program Manager
Exchange Server Product Group
Microsoft Corporation
In comparison to previous product versions, Exchange Server 2007 provides increased
design flexibility because this 64-bit messaging system takes full advantage of
available processor and memory resources and includes numerous architectural advancements
that help to lower the I/O footprint. Among other things, Exchange Server 2007
includes a database buffer cache that can substantially reduce the need for data
reads from disk during normal operation. The size of the database buffer cache depends
on the physical amount of random access memory (RAM) present in the system; therefore,
the physical amount of memory directly influences the I/O footprint. This relationship
between memory and I/O footprint opens new opportunities to achieve optimal server
response times by balancing memory with disk performance. This design flexibility
is especially noticeable when comparing Microsoft IT Mailbox server designs with
the results of the Exchange 2007 Mailbox Server Role Storage Requirements Calculator.
The storage requirements calculator performs calculations based on product recommendations,
while Microsoft IT takes advantage of excess I/O performance in the storage subsystem
to go beyond product recommendations. Microsoft IT arrives at acceptable response
times with less memory per Mailbox server than recommended because the storage subsystem
includes far more disks for capacity reasons than required for performance and can
therefore compensate for the increased I/O footprint that results from having less
RAM per user on the server. The Exchange 2007 Mailbox Server Role Storage Requirements
Calculator and detailed information about its use are available on the Microsoft
Exchange Team Blog at
http://msexchangeteam.com/archive/2007/01/15/432207.aspx.
Kyryl Perederiy, Senior Systems Engineer in the Microsoft IT Exchange Messaging
team, is responsible for the Mailbox server designs. He emphasizes that customers
evaluating the Microsoft IT designs should keep in mind that Microsoft IT currently
exceeds recommended configurations in its Mailbox server designs in an effort to
help the product group verify performance capabilities of Exchange Server 2007
under real-world conditions. Specifically, Microsoft IT Mailbox servers use 1 MB
to 2 MB of RAM per user instead of the recommended 3.5 MB to 5 MB,
but Microsoft IT Mailbox servers still perform well because the DAS-based storage
design of Microsoft IT provides plenty of headroom to make up for the difference,
as explained later in this paper. By taking advantage of excess I/O capabilities,
Microsoft IT was able to optimize memory capacities and switch from expensive high-end
enterprise hardware to mainstream enterprise server models, such as a dual socket
quad-core Intel Xeon processor X5355 server model with only eight slots for fully
buffered dual inline memory modules (FB-DIMMs) to support 6,000 users per server
on average. With a maximum available module size of 4 GB, this mainstream server
model has a maximum capacity of 32 GB of memory, but 4-GB DIMMs currently do
not offer an attractive capacity/price ratio for Microsoft IT. For cost reasons,
Microsoft IT uses 2-GB memory modules—in other words, a total of 16 GB of memory.
Microsoft IT emphasizes the following aspects in the storage design for Exchange
Server 2007 Mailbox servers:
- Reliability: Microsoft
IT uses enterprise storage equipment in Mailbox servers and specifically pays attention
to unrecoverable errors per bits read, AFRs, and manufacturer warranties. It is
also noteworthy that Microsoft IT prefers SFF 2.5-inch disks to the large form factor
(LFF) 3.5-inch disks because of the lower power requirements, lower cost per gigabyte,
and higher performance and reliability, as well as less heat emission and less vibration.
Less heat and less vibration result in less degradation over time. Hard disk drives
of choice are SFF SAS disks with one bad sector per 10E16 bits read, an AFR of 0.55
percent, and a three-year warranty. SATA disks are still not as reliable as SAS
disks. However, Microsoft IT is beginning to consider SATA because reliability and
performance are increasingly meeting enterprise standards, the capacity/price ratio
is attractive, and CCR deemphasizes the need for highly reliable disks in the storage
subsystem on each cluster node.
- Availability: Redundancies
are a primary means in the Mailbox server design to help ensure high availability.
At the hardware layer, Microsoft IT uses multiple external storage enclosures with
redundant power supplies and controller connections. As mentioned earlier in this
paper, Microsoft IT mirrors the hard disk drives across enclosures and then includes
these mirrors in stripe sets to implement a RAID 10 configuration. In this
configuration, an entire enclosure can fail without affecting the availability of
the storage subsystem. Microsoft IT prefers RAID 10 to RAID 5 because
RAID 10 provides a higher level of redundancy and can tolerate multiple disk
failures, whereas RAID 5 can only tolerate a single disk failure. RAID 5–based
Mailbox server designs are still in an experimental stage at Microsoft IT. With
the exception of the RAID controller, Microsoft IT ensures redundancies for all
storage components on each cluster node, including enclosures, power supplies, hard
disk drives, and cables. CCR provides the necessary redundancies at the Exchange
database level and compensates very well for the missing RAID controller redundancy
at the individual cluster nodes.
- Performance: Microsoft
IT also uses RAID 10 because RAID 10 performs better than RAID 5.
With RAID 10, write operations are free of parity calculations, and RAID 10
uses more disks than RAID 5 to deliver the same capacity, which benefits I/O
performance when the storage subsystem is being designed for capacity. For example,
it takes six 146-GB disks to build a RAID 10 drive with 438 GB of raw
capacity. RAID 5 only requires four 146-GB disks to accomplish the same. Assuming
SFF enterprise disks with a spindle speed of 10,000 revolutions per minute
(RPM), capable of performing approximately 160 I/O operations per second (IOPS),
the RAID 10 drive can handle 960 read IOPS (6 * 160 = 960), whereas the RAID 5
drive can only reach 640 read IOPS (4 * 160 = 640). Disk throughput and server response
times are important design criteria for Microsoft IT because slow Mailbox servers
affect employee productivity. Response times above 20 ms are tolerable only
for short periods of time on a highly loaded system, but not on an ongoing basis
because users will notice slow Office Outlook client behavior in online mode, such
as when switching folders or composing new messages. Especially with large mailboxes,
users need to be able to work fast and perform reliable searches to locate information
quickly.
- Capacity: Performance
requirements determine the minimum number of disks in the storage subsystem, yet
more disks might be required to meet capacity needs. To ensure adequate storage
capacity, Microsoft IT calculates maximum database sizes based on the number of
mailboxes and the desired quotas, and then adds additional capacity for database
overhead, content indexing overhead, and unexpected database growth. If the number
of disks required to meet capacity needs exceeds the minimum number of disks required
to ensure I/O performance, Microsoft IT calls the storage design
capacity bound. On the other hand, if fewer disks are required to provide
the necessary capacities, but more disks are needed to reach the required I/O performance,
the design is performance bound. Due to
the low I/O requirements of Exchange Server 2007 and the high performance of
SFF SAS disks, Microsoft IT Mailbox server designs are currently capacity bound,
and Microsoft IT uses any excess I/O performance to counterbalance low memory conditions,
as mentioned earlier. This picture might change in the future if Microsoft IT decides
to switch to RAID 5 with SFF SAS or use larger and slower SATA disks.
- Costs: For Microsoft
IT, cost efficiency is an integral part of enterprise readiness. Multiple solutions
may meet Microsoft IT's reliability, performance, and capacity requirements, include
similar support options and tools (such as management packs), and be on the standard
data center hardware list. Microsoft IT chooses the least expensive technology for
the corporate messaging environment to demonstrate the cost savings potential of
Exchange Server 2007. High-end enterprise server models provide the flexibility
to match product recommendations in the Mailbox server design, yet Microsoft IT
deliberately selects mainstream enterprise hardware to build scaled-up servers with
the lowest possible budget. CCR on DAS is a key enabler to drive down costs to unprecedented
levels, and RAID 5 with SFF SAS disks or RAID 10 on SATA disks might offer
further opportunities to continue this effort.
- Simplicity: Microsoft
IT capitalizes on the potential of CCR on DAS to simplify the storage design through
straightforward RAID configurations and a standardized storage layout based on USBBs.
Among other things, simplicity helps to keep maintenance overhead and TCO at a minimum.
Exactly the same operations procedures apply to all Mailbox servers in the corporate
messaging environment regardless of mailbox numbers and quotas. Simplicity also
helps to ensure stable Mailbox server performance in the event of a component failure.
For example, Microsoft IT does not use a hot spare in the RAID 10 configuration,
so the RAID controller cannot automatically phase out a failed disk and rebuild
the RAID 10 array. Disk failures decrease the RAID 10 performance to a
certain degree, because fewer disks remain to handle the I/O load; yet rebuilding
a RAID array affects I/O performance even more noticeably. By scheduling hardware
maintenance during non-business hours, Microsoft IT minimizes the impact of disk
failures on server performance. Manually replacing hard disk drives is not a problem
for Microsoft IT. Most Microsoft data centers are staffed 24 hours per day, seven
days per week. In the event of a disk failure, an IT specialist is available to
replace the affected disk at the next appropriate maintenance window.
- Recoverability: Hardware
and database redundancies enable Microsoft IT to ensure system recoverability in
the event of component failures as well as in the event of entire node failures.
Furthermore, Microsoft IT relies on Data Protection Manager 2007 to help ensure
recoverability in the event of a loss of both cluster nodes. Recovery time objectives (RTOs) demand
individual database restoration in less than one hour. Accordingly, Microsoft IT
distributes the mailbox data over a large number of messaging databases per server
to keep the maximum file size of each individual messaging database below 200 GB.
The maximum file size corresponds to the number of mailboxes and the desired quotas
plus database overhead, but excluding content indexing overhead and reserve for
unexpected database growth. Keeping individual database files below 200 GB
also helps Exchange Server complete online maintenance cycles regularly, which is
critical to keep the Mailbox server healthy and the system performance stable.
- Scalability: All messaging trends point upward at Microsoft. Message volumes are continuously rising,
and the number of users has almost doubled since the Exchange Server 2003 time
frame. Mailbox sizes are steadily growing, and even 2-GB quotas might soon be too
small for agile users. CCR on DAS enables Microsoft IT to keep pace with these trends
by increasing the scalability of the Mailbox servers in the corporate messaging
environment. For example, hardware maintenance on the passive node does not affect
online users. It is possible to replace server hardware when new processor technology
becomes available or upgrade the storage subsystem on one node while the other node
keeps the Mailbox server available to users. A typical SAS-based RAID controller
can support up to 100 disks. Mainstream 2U enterprise server models can support
up to three RAID controllers, and large 4U high-end enterprise servers can include
up to seven RAID controllers, putting the ceiling at 700 disks. It is unlikely that
Microsoft IT will reach this scalability limit in the near future.
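The 200-GB database limit from the Recoverability item and the controller arithmetic from the Scalability item can be sketched in a few lines of Python. This is only an illustration: the function names and the 20 percent database overhead are assumptions made for the example, not figures from Microsoft IT. The 200-GB limit, the mailbox counts and quotas, and the controller and disk counts come from the text.

```python
import math

MAX_DATABASE_GB = 200        # per-database ceiling stated in the text
DISKS_PER_CONTROLLER = 100   # typical SAS-based RAID controller, per the text


def min_databases(mailboxes, quota_gb, overhead_percent=0):
    """Smallest number of databases that keeps each file at or below the ceiling.

    overhead_percent is a hypothetical allowance for database overhead;
    the text does not give a value for it.
    """
    total_gb = mailboxes * quota_gb * (100 + overhead_percent) / 100
    return math.ceil(total_gb / MAX_DATABASE_GB)


def max_disks(raid_controllers):
    """Upper bound on attached disks for a given number of RAID controllers."""
    return raid_controllers * DISKS_PER_CONTROLLER


if __name__ == "__main__":
    # 4,000 mailboxes with 2-GB quotas, quotas only (no overhead)
    print(min_databases(4000, 2))
    # Same server with an assumed 20 percent database overhead
    print(min_databases(4000, 2, overhead_percent=20))
    # Three controllers (2U server) and seven controllers (4U server)
    print(max_disks(3), max_disks(7))
```

Because content indexing overhead and growth reserve are excluded from the database file size, they would have to be added separately when sizing the underlying volumes.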
Note: Microsoft IT system engineers have contributed
to the Exchange 2007 Mailbox Server Role Storage Requirements Calculator and
strongly recommend this design tool to customers. In addition, third-party vendors
and original equipment manufacturers (OEMs) who want to develop and test storage
solutions for Exchange Server 2007 should consider participating in the Microsoft
Exchange Solution Reviewed Program (ESRP) – Storage v2.0. Detailed information about
ESRP is available at
http://technet.microsoft.com/en-us/exchange/bb412164.aspx.
Mailbox Server Performance
“The ultimate test of the storage design is the failover. When thousands of concurrent
users hit that fresh active node, all read requests must go to disk because a cold
database buffer cache cannot deliver the data. That decisive moment shows how much
our Mailbox servers benefit from the high performance available with DAS.”
Kyryl Perederiy
Sr. Systems Engineer
Microsoft IT Exchange Messaging
Microsoft Corporation
In a reliable network environment with sufficient net-available bandwidth, the processor,
memory, and storage subsystem are the main components that influence Mailbox server
performance. Processor capabilities have the most significant impact. Microsoft
IT used dual-core processors during the initial production rollout, which limited
server scalability to 2,000–3,000 mailboxes. Currently, Microsoft IT uses server
models with two quad-core Intel Xeon X5355 processors (a total of eight processor
cores) to realize an increased density of 6,000 mailboxes per server for heavy users.
Microsoft IT continues to monitor the processor market to take advantage of new
models as soon as they become available at reasonable pricing.
Memory capacities are not as critical for Microsoft IT because Exchange Server 2007
Service Pack 1 (SP1) includes further ESE enhancements and it is possible to counterbalance
memory deficiency with disk I/O performance. According to the product recommendation
of 3.5 MB to 5 MB of RAM per mailbox plus 2 GB of RAM per server, a server
with 6,000 mailboxes requires 24 GB to 32 GB of memory (6,000 × 3.5 MB / 1,024
+ 2 GB ≈ 22.5 GB; 6,000 × 5 MB / 1,024 + 2 GB ≈ 31.3 GB). Memory requirements
increase further with more mailboxes, yet even the product group does not recommend
more than 32 GB per Mailbox server to remain cost efficient. As mentioned earlier,
Microsoft IT optimized memory capacities by taking advantage of excess I/O performance
and supports 6,000 heavy users and up to twice as many medium users in one scalability
pilot with 16 GB of RAM. It is important to note, however, that these server
designs are for users with 500-MB quotas. For heavy users that require 2-GB
quotas, mostly full-time employees, Microsoft IT uses server designs for up to 4,000
mailboxes on the same server platform with two quad-core Intel Xeon X5355 processors.
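The memory arithmetic in this paragraph can be reproduced with a short Python sketch. The function names are hypothetical, and the second helper assumes that the same 2-GB per-server baseline applies to the 16-GB scalability pilot, which the text does not state explicitly.

```python
MB_PER_GB = 1024
BASE_GB = 2  # per-server baseline from the product recommendation


def recommended_memory_gb(mailboxes, mb_per_mailbox):
    """Product recommendation: per-mailbox RAM plus a fixed per-server baseline."""
    return mailboxes * mb_per_mailbox / MB_PER_GB + BASE_GB


def mb_per_mailbox_available(mailboxes, installed_gb):
    """RAM left per mailbox after setting aside the baseline (assumed to be 2 GB)."""
    return (installed_gb - BASE_GB) * MB_PER_GB / mailboxes


if __name__ == "__main__":
    low = recommended_memory_gb(6000, 3.5)
    high = recommended_memory_gb(6000, 5)
    print(f"Recommended: {low:.1f} GB to {high:.1f} GB")
    # Per-mailbox memory in the 16-GB pilot with 6,000 heavy users
    print(f"Pilot: {mb_per_mailbox_available(6000, 16):.2f} MB per mailbox")
```

Under these assumptions, the 16-GB pilot leaves noticeably less than the recommended 3.5 MB per mailbox, which is the gap Microsoft IT covers with excess disk I/O performance.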
Figure 8 illustrates how Exchange Server 2007 uses the available physical memory
to lower I/O requirements by caching frequently accessed data, such as the Inbox
folder view for the top-most messages, calendar information, and any message-processing
rules for each user. The less often the storage engine needs to reload this information,
the lower the I/O footprint. In combination with further ESE enhancements, such
as delayed write operations to avoid repeated writes to the same object on disk,
Exchange Server 2007 puts less I/O demand on the storage subsystem than any
previous Exchange Server version. Microsoft IT measured performance requirements
by monitoring the performance counters Disk Transfers/sec, Disk Reads/sec, and Disk
Writes/sec. Microsoft IT determined during the initial production rollout that Microsoft
employees typically generate approximately 0.27 IOPS to 0.4 IOPS in a read/write
mix of 1:1 on Mailbox servers with 5 MB of RAM per user.
Figure 8. Database buffer cache and database I/O
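As a rough illustration of how the per-mailbox figures translate into a server-level I/O load, the following Python sketch multiplies the measured range by a mailbox count and splits it by the 1:1 read/write mix. The function names are hypothetical, and the 3,000-mailbox server is an assumption chosen to match the upper end of the initial rollout; the measured values apply only to servers with 5 MB of RAM per user.

```python
def server_iops(mailboxes, per_mailbox_iops, read_fraction=0.5):
    """Return (total, reads, writes) IOPS for a server.

    read_fraction=0.5 corresponds to the 1:1 read/write mix in the text.
    """
    total = mailboxes * per_mailbox_iops
    reads = total * read_fraction
    return total, reads, total - reads


def iops_per_mailbox(disk_transfers_per_sec, mailboxes):
    """Derive the per-mailbox figure from an observed Disk Transfers/sec value."""
    return disk_transfers_per_sec / mailboxes


if __name__ == "__main__":
    for rate in (0.27, 0.4):
        total, reads, writes = server_iops(3000, rate)
        print(f"{rate} IOPS per mailbox: total {total:.0f}, "
              f"reads {reads:.0f}, writes {writes:.0f}")
```

The inverse helper shows how a per-mailbox figure could be derived from the Disk Transfers/sec counter, assuming the counter value is sampled across all database disks of the server.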
Now, more than 12 months after the initial production rollout, Microsoft IT uses
significantly less memory in Mailbox servers. It follows that I/O requirements increase
because Exchange Serv |