Exchange Server 2007 Design and Architecture at Microsoft
How the Microsoft Information Technology organization designed the corporate Exchange
Server 2007 environment
Technical White Paper
Published: November 9, 2007
Situation
With Exchange Server 2003, Microsoft IT streamlined the messaging environment through
server and site consolidation. Server clusters based on Windows Clustering and a
highly available, shared storage solution based on storage area network (SAN) technology
helped to ensure 99.99 percent availability. However, the costs and general limitations
associated with the platforms and technologies used in the Exchange Server 2003
environment prevented Microsoft IT from efficiently meeting emerging messaging and
business needs.
Solution
By replacing all Exchange Server 2003 servers with servers running Exchange Server
2007, Microsoft IT created new opportunities to drive down costs and system complexities
while at the same time increasing security and deploying new features not available
in previous versions of Exchange Server.
Benefits
- Increased reliability through new high-availability technologies, such as
Cluster Continuous Replication.
- Larger mailbox sizes through Mailbox servers equipped with cost-efficient
storage solutions.
- Reduced total cost of ownership (TCO) through cost-efficient storage solutions
and elimination of tape backups.
- Further reduced TCO by replacing third-party unified messaging systems with
Unified Messaging servers in the Exchange organization.
- Increased protection against attacks, spam, and malicious messages through
Edge Transport servers.
- Reduced topology complexities and improved regulatory compliance through Hub
Transport servers.
- Enhanced remote access and mobility options through Client Access servers.
Products & Technologies
- Windows Server 2003
- Active Directory
- Microsoft Exchange Server 2007
- Microsoft Office Outlook 2007
- Enterprise Storage Technologies
Executive Summary
Microsoft Information Technology (Microsoft IT) maintains a
complex Microsoft® Exchange Server environment consisting of several geographic
locations and multiple Active Directory® forests. There are 16 data centers,
four of which host Exchange Mailbox servers, to support more than 515 office locations
in 102 countries with 121,000 users, including managers, employees, contractors,
business partners, and vendors. Site and server consolidation conducted with Microsoft
Exchange Server 2003 and new deployment features available in Microsoft Exchange
Server 2007 in combination with proven planning, design, and deployment methodologies
enabled Microsoft IT to transition this environment to Exchange Server 2007
in less than eight months. Microsoft IT decommissioned the last Mailbox servers
running Exchange Server 2003 in the corporate Active Directory forest shortly
after Microsoft released the release to manufacturing (RTM) version of Exchange
Server 2007 on December 7, 2006.
This technical white paper discusses the Exchange Server 2007 architectures,
designs, and technologies that Microsoft IT chose for the corporate environment
and the strategies, procedures, successes, and practical experiences that Microsoft
IT gained during the planning and design phase. In addition to common planning and
design tasks typical for many Exchange Server deployment projects, such as server
design, high-availability implementation, and capacity planning, transitioning a
complex messaging environment to run on Exchange Server 2007 also entails specific
planning considerations regarding directory integration, routing topology, Internet
connectivity, client access technologies, and unified messaging (UM).
The most important benefits Microsoft IT achieved with the production rollout of
Exchange Server 2007 included a substantial reduction of hardware, storage,
and backup costs while maintaining 99.99 percent availability of messaging services.
New features, such as cluster continuous replication (CCR), enabled Microsoft IT
to reduce single points of failure in the messaging environment, increase Mailbox
server resilience from storage failures, and eliminate tape backups to reduce costs.
Exchange Server 2007 also enabled Microsoft IT to overcome the scalability
limitations of the 32-bit platform and increase mailbox quotas in the corporate
production environment from 200 megabytes (MB) to 500 MB and 2 gigabytes (GB). Microsoft
IT also lowered maintenance overhead and associated costs by replacing third-party
unified messaging systems with Exchange Server 2007-based Unified Messaging
servers. Other important benefits included increased fault tolerance and simplified
server maintenance through load-balanced server configurations for middle-tier services
such as Client Access, Hub Transport, and Unified Messaging server roles. Microsoft
IT also increased messaging protection through Microsoft Forefront™ Security for
Exchange Server installed on all Hub Transport and Edge Transport servers.
This paper contains information for business and technical decision makers who are
planning to deploy Exchange Server 2007. This paper assumes the audience is
already familiar with the concepts of the Windows Server® 2003 operating
system, Active Directory, and previous versions of Exchange Server. A high-level
understanding of the new features and technologies included in Exchange Server 2007
is also helpful. Detailed product information is available in the Microsoft Exchange
Server 2007 Technical Library at
http://technet.microsoft.com/en-us/library/bb124558.aspx.
Note: For security reasons, the sample names of forests, domains, organizations,
and other internal resources mentioned in this paper do not represent real resource
names used within Microsoft and are for illustration purposes only.
Introduction
From the earliest days, e-mail messaging has been an important communication tool
for Microsoft. Microsoft established the first company-wide messaging environment
in July 1982 based on Microsoft XENIX (a UNIX version for the Intel 8088 platform).
This environment evolved over more than a decade into a large and distributed infrastructure
that was increasingly difficult to manage. By migrating to Microsoft Exchange Server
version 4.0 in 1996, and subsequently upgrading to Microsoft Exchange Server
version 5.0 and Microsoft Exchange Server version 5.5 in 1997, Microsoft
IT achieved significant improvements in terms of functionality, maintainability,
and reliability. At the beginning of the new millennium, Microsoft IT operated a
messaging environment that included approximately 200 Exchange 5.5 servers
in more than 100 server locations with approximately 59,000 users.
Many things changed for Microsoft IT with the upgrade to Microsoft Exchange 2000
Server, released in October 2000. Exchange 2000 Server was so tightly integrated
with the TCP/IP infrastructure, the Windows® operating system, and Active Directory
that the Exchange Messaging team could no longer manage the messaging environment
as an isolated infrastructure. A fundamental organizational change was necessary,
which manifested itself in a new approach that viewed Microsoft IT as a provider
of essential business services. Keeping Exchange servers running was no longer sufficient.
The Exchange Messaging team now owned messaging as a service, which included all
components upon which Exchange 2000 Server depended, such as Active Directory.
The shift of Microsoft IT toward a service-focused IT organization was also noticeable
in the designs and service level agreements (SLAs) that Microsoft IT established
with the rollout of Exchange Server 2003, released in October 2003. New
technologies, such as support for multi-node server clusters and cached Exchange
mode available in Microsoft Office Outlook® 2003, enabled Microsoft IT
to concentrate Mailbox servers in four data centers and reduce the number of Exchange
servers, including those for special purposes, from approximately 200 to roughly
100. The number of Mailbox servers dropped from 118 to 36. Individual Mailbox servers
hosted up to 4,000 mailboxes per active cluster node with quotas of 200 MB
per mailbox. Consolidating the corporate messaging environment yielded overall cost
savings of $20 million USD in the fiscal year 2003 alone.
Microsoft IT designed the Exchange Server 2003 environment for the scalability
and availability requirements of a fast-growing company. The consolidation included
upgrades to the network infrastructure and the deployment of large Mailbox servers
to support 85,000 mailboxes in total. Business-driven SLAs demanded 99.99 percent
availability, including both unplanned outages and planned downtime for maintenance,
patching, and so forth. To comply with the SLAs, Microsoft IT deployed almost 100
percent of the Mailbox servers in server clusters by using Windows Clustering
and a highly available, shared storage subsystem based on storage area network (SAN)
technology.
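To put the 99.99 percent target in perspective, the following minimal sketch converts an availability percentage into a yearly downtime budget. The function name and the assumption of a 365-day year are illustrative only and do not come from Microsoft IT.

```python
def downtime_budget_minutes(availability_percent: float, days: int = 365) -> float:
    """Return the maximum downtime, in minutes, that the target allows over the period."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    budget = downtime_budget_minutes(99.99)
    print(f"99.99 percent availability allows about {budget:.1f} minutes of downtime per year")
```

Because the SLA counts planned maintenance as well as unplanned outages, this budget of under an hour per year has to cover patching, hardware replacement, and failures combined.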
In 2006, prior to Exchange Server 2007, the environment had grown to include
130,000 mailboxes that handled 6 million internal messages and received 1 million
legitimate message submissions from the Internet daily. On average, every user sent
and received approximately 100 messages daily, amounting to an average e-mail volume
per user per day of 25 MB. As the demand for greater mailbox limits increased, new
technologies and cost-efficient storage solutions such as direct-attached storage
(DAS) were necessary to increase the level of messaging services in the corporate
environment.
"Our mission is to deliver value by enabling people with innovative and reliable
information technology solutions that seamlessly integrate with, and improve how
people work."
Jim DuBois
General Manager, MSIT
Microsoft Corporation
Reasons for Microsoft IT to Upgrade
A global survey of 1,400 chief information officers (CIOs) conducted by Gartner
Executive Programs in 2006 indicated the focus of IT is increasingly shifting from
cost-cutting to improving productivity, performance, and competitiveness. Following
a decade of unrestrained growth in IT and approximately five years of consolidation
and cutbacks thereafter, IT departments are now in a better position to refocus
their strategies on future-oriented business goals.
Microsoft is a good example. After a period of cost-cutting through server and site
consolidation, Microsoft IT used the deployment of Exchange Server 2007 as
an opportunity to shift gears and focus on implementing solutions that help improve
productivity and competitiveness, streamline system administration, and increase
messaging protection beyond the levels already possible with Exchange Server 2003
Service Pack 2.
For the deployment of Exchange Server 2007, Microsoft IT defined the following
key objectives:
- Increase employee productivity: This included increasing
mailbox quotas by as much as 1,000 percent, from 200 MB to 500 MB and
2 GB, and deploying new productivity features available in Exchange Server 2007,
such as unified messaging, which enables users to receive all messages in their
mailboxes, including e-mail, voice mail, and fax messages. In addition to desktop
and portable clients, users can use standard telephones to access these messages.
Increasing employee productivity also included deploying Microsoft Office Outlook 2007
as the primary messaging client so users can benefit from new and advanced information
management features such as instant search, managed folders, and more.
- Increase operational efficiency: This included reducing administrative
overhead associated with maintaining the messaging environment through features
that are directly available in Exchange Server 2007, such as Exchange Management
Shell. Based on the Microsoft .NET Framework, Exchange Management Shell enables
Microsoft IT to create custom scripts that facilitate mundane deployment tasks,
such as applying a consistent set of configuration settings per server role across
multiple servers.
- Decrease security risks: This included deploying Edge Transport
servers and Forefront Security for Exchange Server in the perimeter network to increase
security and messaging protection and to reduce the number of legitimate messages
incorrectly identified as spam (false positives). This also included encrypting
all internal server-to-server message traffic using Transport Layer Security (TLS)
to help protect confidentiality for messages in transit.
- Decrease costs: This included redesigning server architectures
and backup solutions for high availability to meet challenging SLAs. In redesigning
server architectures, Microsoft IT heavily focused on incorporating the features
directly available in Exchange Server 2007, replacing expensive SAN storage
with a more cost-efficient DAS solution for CCR-based Mailbox server clusters, and
eliminating tape backups. All of these considerations resulted in significant cost
savings. From backup changes alone, Microsoft IT realized cost savings of approximately
$5 million USD per year.
Note: The Technical Case Study "Enterprise Messaging with Microsoft Exchange
Server 2007" (http://technet.microsoft.com/en-us/library/bb687782.aspx)
provides detailed information about the business benefits and advantages that Microsoft
IT realized with the transition to Exchange Server 2007 in the corporate production
environment.
Environment Prior to Exchange Server 2007
The Microsoft IT deployment options and design decisions for the transition to Exchange
Server 2007 heavily depended on the characteristics of the existing network,
Active Directory, and the messaging environment. Among other things, it was important
to perform the transition from Exchange Server 2003 to Exchange Server 2007
without service interruptions or data loss. Like any organization operating a large
messaging environment, Microsoft IT also had to take a phase of coexistence into
account, because it was not possible to transition the entire environment in one
gigantic step. To understand the Microsoft IT design decisions in detail, it is
necessary to review the environment in which Microsoft IT performed the transition.
Figure 1 illustrates the locations of the data centers that contain Mailbox servers
in the corporate production environment and the overall wide area network (WAN)
connections between them. Concerning the WAN backbone, it is important to note that
Microsoft IT deliberately designed the network links to exceed capacity requirements. Only
10 percent of the theoretically available network bandwidth is dedicated to messaging
traffic. The vast majority of the bandwidth is for non-messaging purposes to support
the Microsoft product development teams.
Figure 1. The Exchange Server 2003 environment at Microsoft on March 31, 2006
Note: In addition to the data centers shown in Figure 1, there is one more
important site with Exchange servers in North America: Silicon Valley. The Exchange
servers in Silicon Valley provide a redundant path for sending and receiving Internet
mail. This site does not contain Mailbox servers.
Network Infrastructure
Each Microsoft data center is responsible for a different region defined along geographical
boundaries. Within each region, network connectivity between offices and the data
center varies widely. For example, a high-speed metropolitan area network (MAN)
based on Gigabit Ethernet and Synchronous Optical Network (SONET) connects more
than 70 buildings to the Redmond data center. These are the office buildings on
and off the main campus in the Puget Sound area. In other regions, such as Asia
and the South Pacific, Internet-connected offices, or ICOs for short, are more dominant.
Broadband connectivity solutions, such as digital subscriber line (DSL) or cable
modems, provide significant cost savings over leased lines as long as the decrease
in performance and maintainability is acceptable. Microsoft IT uses this type of
connectivity primarily for regional sales and marketing offices.
Figure 2 summarizes the typical regional connectivity scenarios at Microsoft. It
is important to note there are no Mailbox servers outside the data center, whereas
Active Directory domain controllers may exist in high-availability buildings and
medium-sized offices for local handling of user authentication, authorization, and
application requests.
Figure 2. Regional connectivity scenarios at Microsoft
For regional connectivity, Microsoft IT relies on a mix of Internet-based and privately
owned/leased connections, as follows:
- Regional data centers and main campus: The main campus
and regional data centers are connected together in a privately owned WAN based
on frame relay, asynchronous transfer mode (ATM), clear channel ATM links, and SONET
links.
- Office buildings with standard or high availability requirements: Office
buildings connect to regional data centers over fiber-optic network links with up
to eight wavelength division multiplexing (WDM) channels per fiber pair.
- Regional offices with up to 150 employees: Regional offices
use a persistent broadband connection or leased line to a local Internet service
provider (ISP) and then access their regional data centers through transparent
virtual private network over Internet Protocol security (VPN/IPsec) tunnels.
- Mobile users: Mobile users use a dial-up or non-persistent broadband
connection to a local ISP, and then access their mailboxes through VPN/IPsec tunnels,
or by using Microsoft Exchange ActiveSync®, remote procedure call (RPC) over
Hypertext Transfer Protocol (RPC over HTTP, also known as Outlook Anywhere), or
Microsoft Office Outlook Web Access over secure HTTP (HTTPS) connections.
Directory Infrastructure
Like many IT organizations that must keep business units strictly separated for
legal or other reasons, Microsoft IT has implemented an Active Directory environment
with multiple forests. Each forest provides a clear separation of resources and
establishes strict boundaries for effective security isolation. At Microsoft, some
forests exist for legal reasons, others correspond to business divisions within
the company, and yet others are dedicated to product development groups for early
pre-release deployments and testing without interfering with the rest of the corporate
environment. For example, by maintaining separate product development forests, Microsoft
IT can prevent uncontrolled Active Directory schema changes in the Corporate Forest.
The most important forests at Microsoft have the following purposes:
- Corporate: Seventy percent of the resources used at Microsoft
reside in this forest. The Corporate Forest includes approximately 140 domain controllers.
The Active Directory database is 11 GB in size, with over 1 million directory objects,
including users, groups, organizational units, workstations, servers, domain controller
accounts, and printer objects.
- Corporate Staging: Microsoft IT uses this forest to stage
software images, gather performance metrics, and create deployment documentation.
- Exchange Development: Microsoft uses this forest for running
pre-release Exchange Server versions in a limited production environment. Users
within this forest use beta or pre-beta versions in their daily work to help identify
issues prior to the release of the product. Microsoft IT manages and monitors this
forest, while the Exchange Server development group hosts the mailboxes in this
forest to validate productivity scenarios.
- Extranet: Microsoft IT has implemented this forest to provide
business partners with access to corporate resources. There are approximately 30,000
user accounts in this forest.
- MSN®: MSN is an online content provider through Internet
portals such as msn.com. Microsoft IT manages this forest jointly with the MSN technology
team.
- MSNBC: MSNBC is a news service and a joint venture between
Microsoft and NBC Universal News. Legal reasons require Microsoft to maintain a
separate forest for MSNBC. Microsoft IT manages this forest jointly with the MSNBC
technology team.
- Test Extranet: This forest enables the Extranet Technology
team to test new solutions for partner collaboration without interfering with the
Extranet Forest. Microsoft IT manages this forest jointly with the Extranet Technology
team.
- Windows Deployment: Microsoft IT created this forest to launch
pilot projects during the Windows Server 2003 deployment phase as a pre-staging
environment prior to deployment and feature configuration in the Corporate Forest.
It is a limited production forest. Users within this forest use beta or pre-beta
software in their daily work to help product development groups identify and eliminate
design flaws and other issues.
- Windows Legacy: This forest is used as a test environment
for compatibility testing of previous Windows Server versions with Exchange Server
(specifically Microsoft Windows 2000 service pack testing).
Note: Microsoft IT maintains a common global address list (GAL) across all
relevant forests that contain Exchange Server organizations by using Active Directory
GAL management agents, available in Microsoft Identity Integration Server (MIIS) 2003.
Domains in the Corporate Forest
Microsoft IT implemented nine domains in the Corporate Forest, separated into geographic
regions. At the time of the production rollout, all domains in the Corporate Forest
operated at the Windows Server 2003 functional level and contained between
seven and 30 domain controllers. The domain controllers are 64-bit multi-processor
systems with 16 GB of random access memory (RAM). The Microsoft IT Active Directory
database in the Corporate Forest is approximately 11 GB in size. With 16 GB of RAM,
domain controllers can load the entire 11 GB Active Directory database into
memory, which provides good performance for Exchange Server and other applications
that extensively perform directory lookups by using Lightweight Directory Access
Protocol (LDAP).
Note: Microsoft IT does not use domains to decentralize user account administration.
The human resources (HR) department centrally manages the user accounts, including
e-mail address information, in a separate line-of-business (LOB) application. The
HR system provides advanced business logic not readily available in Active Directory
to enforce consistency and compliance. It is the authoritative source of user account
information, synchronized with Active Directory through MIIS 2003.
Active Directory Sites
Overall, the corporate production environment (that is, the Corporate Forest) includes
202 Active Directory sites in a hub-and-spoke topology that closely mirrors the
network infrastructure. The authoritative source of IP address and subnet information
necessary for the Active Directory site definitions is an infrastructure database
that the Microsoft IT network team maintains. Using MIIS 2003, Microsoft IT
provisions site and subnet objects in Active Directory based on the data from the
IP address and subnet infrastructure database and ensures in this way an accurate
Active Directory site topology that mirrors the network layout. The MIIS solution
automatically calculates all site links during the import into Active Directory.
Based on this information, Knowledge Consistency Checker (KCC) updates the replication
topology for the forest. Microsoft IT does not maintain the Active Directory replication
topology manually.
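The relationship between subnet objects and sites can be illustrated with a minimal sketch. It assumes that an address belongs to the site of the most specific subnet that contains it. The subnet ranges and the helper name are hypothetical and do not reflect the actual Microsoft IT infrastructure database or the MIIS provisioning logic; only the site names appear in this paper.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical subnet-to-site assignments. In the environment described above,
# this data would originate from the network team's infrastructure database.
SUBNET_TO_SITE: Dict[str, str] = {
    "10.1.0.0/16": "ADSITE_REDMOND",
    "10.1.20.0/24": "ADSITE_REDMOND-EXCHANGE",
    "10.3.0.0/16": "ADSITE_NORTH CAROLINA",
}


def site_for_address(address: str, subnet_to_site: Dict[str, str]) -> Optional[str]:
    """Return the site of the most specific subnet containing the address, if any."""
    ip = ipaddress.ip_address(address)
    best_site = None
    best_prefix = -1
    for cidr, site in subnet_to_site.items():
        network = ipaddress.ip_network(cidr)
        # Keep the match with the longest prefix (the most specific subnet).
        if ip in network and network.prefixlen > best_prefix:
            best_site, best_prefix = site, network.prefixlen
    return best_site


if __name__ == "__main__":
    for host in ("10.1.20.5", "10.1.3.4", "192.168.1.1"):
        print(host, "->", site_for_address(host, SUBNET_TO_SITE))
```

Keeping the subnet data in one authoritative source, as Microsoft IT does, means a lookup like this stays consistent with the physical network without manual site maintenance.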
Site Topology and Exchange Server 2003
A highly granular site and subnet topology provides advantages in terms of Exchange
Server communication. Although Exchange Server 2003 relies primarily on routing
groups to describe the physical network topology, communication with Active Directory
and client referral logic is site-aware. The Directory Service Access (DSAccess)
component prefers to communicate with domain controllers and global catalog servers
in the local Active Directory site. Through an Active Directory site topology that
closely mirrors the physical network infrastructure, Microsoft IT confines DSAccess
communication to local network segments.
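The site preference described above can be sketched as a simple selection rule: use directory servers in the local site when any exist, and fall back to other sites only when none do. This is a simplified illustration, not the actual DSAccess algorithm, which applies additional suitability checks. The class, function, and server names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DirectoryServer:
    name: str                # hypothetical server name
    site: str                # Active Directory site the server belongs to
    is_global_catalog: bool  # True if the server hosts the global catalog


def preferred_servers(servers: Iterable[DirectoryServer], local_site: str,
                      need_global_catalog: bool = False) -> List[DirectoryServer]:
    """Prefer servers in the local site; fall back to other sites if none qualify."""
    candidates = [s for s in servers
                  if s.is_global_catalog or not need_global_catalog]
    in_site = [s for s in candidates if s.site == local_site]
    return in_site if in_site else candidates


if __name__ == "__main__":
    servers = [
        DirectoryServer("gc-01", "ADSITE_REDMOND-EXCHANGE", True),
        DirectoryServer("dc-02", "ADSITE_REDMOND", False),
    ]
    for server in preferred_servers(servers, "ADSITE_REDMOND-EXCHANGE"):
        print(server.name)
```

With a site topology that mirrors the network, the in-site branch is the normal case, which is what keeps directory lookups on local network segments.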
Figure 3 shows the Active Directory sites in the Corporate Forest that are relevant
for the Exchange Server organization.
Figure 3. Exchange Server 2003 Active Directory sites and site links at Microsoft
As shown in Figure 3, there are four sites with Mailbox servers and a fifth site
with Internet mail gateway servers running Exchange Server 2003. The remaining
Active Directory sites, ADSITE_REDMOND and ADSITE_NORTH CAROLINA, contain infrastructure
servers and domain controllers to handle authentication requests from client workstations
and LOB applications, but no Exchange servers. Although these sites are not very
relevant for Exchange Server 2003, they influence the future Exchange Server 2007
design, as explained later in this white paper. With respect to the Exchange Server 2007
design, it is important to note that all site links are bidirectional and IP-based.
Note: The Active Directory site topology at Microsoft mirrors the network
layout of the corporate production environment, with ADSITE_REDMOND as the hub site
in a hub-and-spoke arrangement of sites and site links. In contrast, the routing
topology of Exchange Server 2003, defined through a hub-and-spoke topology
of routing groups and routing group connectors, used as its hub a routing group containing
the servers that physically resided in the ADSITE_REDMOND-EXCHANGE site. This difference
is significant for the design of Exchange Server 2007 message routing, discussed
later in this paper.
Dedicated Exchange Site Design
It is also worth mentioning that the Active Directory site named ADSITE_REDMOND-EXCHANGE
contains only Exchange servers and domain controllers configured as global catalog
servers. Microsoft created this dedicated site design during the Exchange 2000
Server time frame to provide its Exchange 2000 and 2003 servers with
exclusive access to highly available Active Directory servers, shielded from client
authentication and other application traffic. For detailed information about the
implementation of a dedicated Active Directory site for Exchange Server 2003,
see the Microsoft IT Showcase Note on IT "Creating an Active Directory Site for
Exchange Server," available for download at
http://www.microsoft.com/downloads/details.aspx?FamilyID=6b263452-7a61-4253-9c9e-b337cb80d460&DisplayLang=en.
Microsoft IT continues to use the dedicated Exchange site in its Exchange Server 2007
environment for the following reasons:
- Performance assessments: Exclusive Active Directory servers
provide an opportunity to gather targeted performance data. Based on this data,
Microsoft IT and developers can assess the impact of Exchange Server versions and
service packs on domain controllers in a genuine large-scale production environment.
- Windows Server 2008 domain controllers: Microsoft IT
maintained Windows Server 2008 Beta domain controllers in the corporate production
environment. Microsoft IT decided to separate the Windows Server 2008 deployment
from the messaging environment by using Active Directory sites without Exchange
servers, such as ADSITE_REDMOND.
Warning: Implementing dedicated Active Directory sites for Exchange Server
increases the complexity of the directory replication topology and the required
number of domain controllers in the environment. To maximize the return on investment
(ROI), customers should weigh the business and technical needs for dedicated Active
Directory sites. Due to the Exchange Server 2007 reliance on Active Directory
sites, dedicated sites can increase the complexity of Exchange Server 2007
topologies and design.
Messaging Infrastructure
The design of an Exchange Server 2003 organization relies on two topologies:
a flat arrangement of administrative groups and a topology of routing groups and
routing group connectors. Figure 4 shows the administrative and routing group topology
of Microsoft IT's Exchange Server 2003 organization in the Corporate Forest
and connectivity to other production forests and Exchange organizations.
Figure 4. Administrative and routing groups in the Corporate Forest just prior to
the start of the transition to Exchange Server 2007 (March 31, 2006)
Administrative Topology
Microsoft IT uses a centralized administration model for Exchange servers. Although
there are four administrative groups (North America, Dublin, Singapore, and Sao
Paulo) in the legacy Exchange Server topology, Microsoft IT did not use custom permissions
based on administrative groups. All Exchange Server administrators are located in
Redmond and perform system administration and remote monitoring. The regional data
centers are only responsible for hardware maintenance.
Note: Microsoft data centers operate 24 hours a day, seven days a week. In the
event of a hardware problem, such as a disk failure, a local IT specialist is readily
available to replace the affected hardware with minimum delay. The only exception
is Sao Paulo, which observes regular business hours.
Routing Topology
The Exchange Server 2003 routing topology followed a hub/spoke architecture
that corresponded to the WAN links depicted in Figure 1 earlier in this white paper.
The data centers in Redmond, Dublin, Singapore, and Sao Paulo corresponded to routing
groups with Mailbox servers. Two additional routing groups existed with gateway
servers dedicated to external communication: RG_REDMOND PERIMETER and RG_SILICON
VALLEY PERIMETER. Silicon Valley provided Internet mail redundancy for the messaging
environment.
Between the routing groups, Microsoft IT used routing group connectors (RGCs) with
the default option "Any local server can send mail over this connector." Accordingly,
all Exchange servers were able to transfer messages to adjacent routing groups directly.
Using only one physical path to each routing group and all Exchange servers as local
bridgeheads eliminated the need to communicate link state changes within and across
routing groups. It also enabled Microsoft IT to keep the number of dedicated bridgehead
servers at a minimum.
To implement the hub/spoke topology, Microsoft IT deployed four central bridgehead
servers in North America, which Microsoft IT specified as remote bridgehead servers
in the RGC configuration for the regions. Exchange servers in RG_DUBLIN, RG_SINGAPORE,
and RG_SAO PAULO that wanted to transfer messages to other routing groups could
do so through one of these bridgehead servers. In this way, the bridgehead servers
created a bifurcation point for messages addressed to recipients in multiple locations.
By splitting messages into multiple copies (bifurcation) at the latest possible
point (the bridgeheads), Microsoft IT preserved network bandwidth on WAN links.
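The bandwidth effect of late bifurcation can be shown with a simplified model that counts how many copies of one message cross the WAN link from a regional routing group to the hub. The function name and the model itself are assumptions for illustration. The routing group names come from this paper, and the sketch does not reproduce the Exchange Server 2003 routing engine.

```python
from typing import Iterable

HUB = "RG_REDMOND-EXCHANGE"


def copies_on_source_link(source_group: str, recipient_groups: Iterable[str],
                          bifurcate_at_hub: bool) -> int:
    """Count message copies that cross the WAN link from a spoke to the hub."""
    remote_groups = set(recipient_groups) - {source_group}
    if not remote_groups or source_group == HUB:
        # Local delivery only, or the sender is already in the hub routing group.
        return 0
    # Late bifurcation sends one copy to the hub, which then splits it.
    # Early bifurcation sends one copy per remote routing group over the same link.
    return 1 if bifurcate_at_hub else len(remote_groups)


if __name__ == "__main__":
    recipients = ["RG_REDMOND-EXCHANGE", "RG_SINGAPORE", "RG_SAO PAULO"]
    late = copies_on_source_link("RG_DUBLIN", recipients, bifurcate_at_hub=True)
    early = copies_on_source_link("RG_DUBLIN", recipients, bifurcate_at_hub=False)
    print(f"Copies on the Dublin link: {late} (split at hub) vs. {early} (split at source)")
```

In this model, the saving on the regional link grows with the number of remote routing groups addressed by a single message.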
The four central bridgehead servers also performed message routing for communication
with external entities, such as Internet destinations, partner domains, and Exchange
organizations in other forests. For communication with Exchange organizations in
other forests, Microsoft IT configured Simple Mail Transfer Protocol (SMTP) connectors
directly on the bridgeheads (see Figure 4). Messages addressed to partners or Internet
recipients, on the other hand, reached their destinations through further bridgehead
servers (for antivirus scanning) and Internet mail gateways.
The Internet mail gateways in RG_DUBLIN and RG_SINGAPORE only handled outbound Internet
mail for their local routing groups, whereas the Internet mail gateways in RG_REDMOND
PERIMETER and RG_SILICON VALLEY PERIMETER were responsible for outbound and inbound
Internet mail transfer. All inbound Internet mail messages reached Microsoft through
two redundant locations in North America, which was the most efficient configuration
for Microsoft because it has a flat e-mail domain namespace (@microsoft.com) and
the majority of mailboxes are located in its Redmond location. The Internet mail
gateways performed a series of anti-spam and other filtering checks (for example,
block-list, sender, and recipient filtering), before routing the messages to internal
bridgehead servers for virus scanning. Microsoft IT opted not to co-deploy antivirus
solutions on the 32-bit Internet mail gateway servers in order to be able to apply
anti-spam filtering first and avoid the overhead associated with virus scanning.
Additional Information: The Internet mail gateways in Redmond and Silicon
Valley received up to 13 million daily message submissions from the Internet in
normal situations, blocking 10.5 million of these as not legitimate (March 2006
statistics). During virus outbreaks on the Internet, the load occasionally exceeded
30 million e-mail submissions per day.
Server Roles per Location
Microsoft IT assigned a dedicated role to most servers in the Exchange Server 2003
organization, which allowed a more precise hardware configuration for each server
to reflect high performance and scalability demands. An exception to this rule was
the Mailbox server in Sao Paulo. Due to moderate workload, Microsoft IT was able
to consolidate server roles in this location. The server in Sao Paulo acted as a
Mailbox server, public-folder server, and bridgehead server. Moderate workload also
enabled Microsoft IT to assign multiple roles to the public-folder servers in Dublin
and Singapore. These servers assumed the responsibilities of public-folder servers
and bridgehead servers for the regions.
Table 1 lists the number of servers per role that Microsoft IT deployed in each
routing group as of March 31, 2006.
Table 1. Servers per Role per Routing Group Prior to Exchange Server 2007 Transition
| Routing group | Mailbox servers (clustered) | Public-folder servers* | Bridgehead servers | Front-end servers | Gateway servers | Special purpose** |
| RG_REDMOND-EXCHANGE | 21 | 5 | 8 | 6 | 0 | 3 |
| RG_DUBLIN | 6 | 2 (combined public-folder and bridgehead servers) | (combined) | 2 | 2 | 0 |
| RG_SINGAPORE | 5 | 2 (combined public-folder and bridgehead servers) | (combined) | 2 | 2 | 0 |
| RG_SAO PAULO | 1 (combined Mailbox, public-folder, and bridgehead server) | (combined) | (combined) | 2 | 0 | 0 |
| RG_REDMOND PERIMETER | 0 | 0 | 0 | 0 | 3 | 0 |
| RG_SILICON VALLEY PERIMETER | 0 | 0 | 0 | 0 | 3 | 0 |
* For hosting public-folder data, free-and-busy information, and offline address
books
** To support messaging needs of internal LOB applications, etc.
Note: Because of server and site consolidation during the Exchange Server 2003
time frame, Microsoft IT deployed almost all Exchange Server 2003 Mailbox servers
as clustered systems with a maximum of 4,000 mailboxes per Exchange Virtual Server
(EVS). Through this configuration, Microsoft IT achieved 99.99 percent server availability,
as explained in the Technical Solution Brief "Achieving High Availability with Exchange
Server at Microsoft" (http://technet.microsoft.com/en-us/library/bb687782.aspx).
Planning and Design Process
The Microsoft IT planning and design process is unique in the way that messaging
engineers start their work early in the product development cycle and collaborate
very closely with the Microsoft Exchange Server Product group to clarify how exactly
the new Exchange Server version should address concrete business requirements, system
requirements, operational requirements, and user requirements. Through an assessment
of the Exchange Server 2003 environment and in discussions with partner and
customer IT organizations, Microsoft IT identified general issues, such as high
storage costs and server scalability issues on the 32-bit platform, and communicated
the findings as opportunities for improvements to the developers. In this way, the
planning and design process at Microsoft IT actually influenced the product itself,
addressing not only the requirements of Microsoft but also the needs of partner
and customer IT organizations.
Figure 5 illustrates how Microsoft IT aligned the Exchange Server 2007 design
and deployment processes from assessment and scoping through full production rollout.
The individual activities correspond to the phases and milestones outlined in the
Microsoft Solutions Framework (MSF) Process Model.
Figure 5. Microsoft IT planning, design, and deployment processes
Note: Detailed information about the MSF, including an MSF Resource Kit and
case studies, is available on Microsoft TechNet.
The next sections discuss key activities that helped Microsoft IT to determine an
optimal Exchange Server 2007 architecture and design for the corporate production
environment.
Assessment and Scoping
In extensive planning sessions, product managers, service managers, the Exchange
Systems Management team, Tier 2 Support team, Helpdesk, and Messaging Engineering
team collaborated in virtual project teams to identify business and technical requirements
and translated these requirements into proposals to the Exchange Server Product
group. The Exchange Server Product group reviewed and incorporated these proposals
into its product development plans. The results were commitments and shared goals
between the developers and Microsoft IT to drive deployment actions and investments
intended to improve IT services.
Deployment Planning Exercises
Within Microsoft IT, the Messaging Engineering team is responsible for creating
the architectures and designs of all Exchange-related technologies. At a stage when
the actual product was not available, messaging engineers began their work with
planning exercises based on product development plans. The objective of these exercises
was to decide how to deploy the new Exchange Server version in the future. The messaging
engineers based their design decisions on specific productivity scenarios, the scalability
and availability needs of the company, and other requirements defined during the
assessment and scoping phase. For example, the Exchange Messaging team decided to
use CCR to eliminate single points of failure in the Mailbox server configuration
and DAS to drive down storage costs while at the same time increasing mailbox quotas
by up to a factor of 10. The deployment planning exercises helped to identify required
hardware and storage technologies that Microsoft IT needed to invest in to be able
to achieve the desired improvements.
Engineering Lab
The Messaging Engineering team maintains a lab environment that simulates the corporate
production environment in terms of technologies, topology, product versions, and
interoperability scenarios, but without production users. The Engineering Lab includes
examples of the same hardware and storage platforms that Microsoft IT uses in the
corporate production environment. It provides the analysis and testing ground for
the messaging engineers to validate, benchmark, and optimize designs for specific
scenarios, test components, verify deployment approaches, and estimate operational
readiness. Testing in the Engineering Lab enables the messaging engineers to ensure
that the conceptual and functional designs scale to the requirements and scope of
the deployment project. For example, code instabilities or missing features of beta
products might require Microsoft IT to alter designs and integration plans. Messaging
engineers can verify the capabilities of chosen platforms and work with the product
teams and hardware vendors to make sure the deployed systems function as expected
even when running pre-release versions.
Pre-Release Production Deployments
Microsoft IT maintains a pre-release infrastructure, which is a limited production
environment for running pre-release versions of server products. Pre-release production
deployments begin prior to the alpha phase and continue through the beta and release
candidate (RC) stages until Microsoft releases the product to manufacturing. During
the pre-alpha stage, pre-release production deployments are a developer effort.
Additional employees from within Microsoft join the campaign during the beta stages
as early adopters.
Pre-release production deployments enable the developers to determine the enterprise
readiness of the software, identify issues that might otherwise not be found prior
to RTM, and collect valuable user feedback. For example, Exchange Server 2007
pre-release verification started in February 2005, more than 22 months before the
Exchange Server Product group shipped the product. In comparison, the Microsoft
Exchange Server 2003 pre-release deployment period was only six months.
Technology Adoption Program
The Exchange Server 2007 Technology Adoption Program (TAP) started during the
pre-alpha stage in April 2005. TAP is a special Microsoft initiative, available
by invitation only, to obtain real-world feedback from Microsoft partners and customers.
More than 90 Microsoft partners and customers participated in the Exchange Server 2007
TAP. The Messaging Engineering team was also actively involved by providing early
adopters with presentations outlining the Microsoft IT design process based on the
then-current state of the product.
Microsoft runs several types of TAP programs for partners and customers to obtain
real-world feedback on Microsoft pre-release products. For more information, see
the TAP early-adopter information on MSDN at
http://msdn2.microsoft.com/en-us/isv/bb190413.aspx.
Pilot Projects
An important task of the Messaging Engineering team is to document all designs,
which the messaging engineers pass as specifications to the technical leads in the
Systems Management team for acceptance and implementation. The messaging engineers
also assist the technical leads during pilot projects and server installations and
help develop a set of build documents and checklists to provide operators with detailed
deployment instructions for the full-scale production rollout.
Production Rollout
The server designs that the Messaging Engineering team creates include detailed
hardware specifications for each server type, which the Infrastructure Management
team and the Data Center Operations group at Microsoft IT use to coordinate the
procurement and installation of server hardware in the data centers. The Data Center
Operations group builds the servers, installs the operating systems, and joins the
new servers to the appropriate forest before the Exchange Systems Management team
takes over for the deployment of Exchange Server 2007 and related components.
To achieve a rapid deployment, Microsoft IT automated most of the Exchange Server
deployment steps by means of Exchange Management Shell scripts.
Architecture and Design Decisions
"Microsoft IT is our first and best customer. Almost two years prior to RTM, Microsoft
IT began with pre-release production deployments to help us build an excellent product.
The close relationship with Microsoft IT is so vital to our culture of quality and
customer satisfaction that we do not ship products or service packs until Microsoft
IT signs off on the enterprise readiness. We shipped Exchange Server 2007 on
December 7, 2006, with the confidence and proof in hand that the product delivers
on its potential to help customers build reliable enterprise-class messaging environments
while reducing total cost of ownership."
Terry Myerson
General Manager
Exchange Server Product Group
Microsoft Corporation
One of the most important objectives that Microsoft defined for the Exchange Server 2007
production rollout was to finish the transition no later than the official RTM date
of the product. Microsoft IT committed to performing the rollout at full scale by using
the Beta 2 release to demonstrate the enterprise readiness of Exchange Server 2007
to Microsoft customers. However, the Beta 2 release was not perfect, and Microsoft
IT did not know all of the product's performance parameters yet. For these reasons,
Microsoft IT deliberately created initial designs for the Client Access, Hub Transport,
Edge Transport, Mailbox, and Unified Messaging server roles that exceeded actual
production requirements. To maximize the ROI of server hardware and storage technology,
Microsoft IT began to optimize the designs after the completion of the initial rollout.
The following sections discuss these updated designs.
Administration and Permissions Model
Like any public company in the United States, Microsoft must safeguard the corporate
IT infrastructure to comply with legal and regulatory requirements, such as the
Health Insurance Portability and Accountability Act of 1996 and the Sarbanes-Oxley
Act of 2002. These regulations also apply to the Exchange Server environment. To
prevent fraud, protect assets, and ensure that messaging resources are used as intended,
Microsoft IT implemented a strict administrative design according to the principle
of fewest privileges as well as formal approval processes for granting administrative
rights.
Security Principles and Guidelines
The Messaging Engineering team used the following principles and guidelines in the
administrative design for Exchange Server 2007:
- Always use groups to assign rights. Delegating administrative
permissions through security groups is uncomplicated, easy to understand, and a
best practice in enterprise IT environments. Among other things, users can determine
their effective rights by analyzing their group memberships. Moreover, security
groups eliminate the need to configure individual access control lists (ACLs) for
resources, which helps to reduce complexities caused by possibly conflicting rights
granted through multiple direct or inherited access control entries (ACEs) and helps
to keep the size of ACLs under control. Granting rights through group membership
is also more efficient than granting access permissions through individual assignments,
because a security group's access permissions do not change when members are added
to or removed from the group.
- Use the default permissions model. The default permissions
model of Exchange Server 2007 covers most of the Microsoft IT requirements.
Except where specific needs, such as regulatory requirements, dictate otherwise, Microsoft
IT does not deviate from the default model, which keeps the number of ACLs on
Exchange Server resources to a minimum.
- Grant the least permission necessary. The principle of fewest
privileges is a generally recognized security approach and a Microsoft IT best practice.
Microsoft IT never grants full Domain or Enterprise Admin rights unless there is
a compelling business or technical reason. For example, administrators who need
to modify recipient attributes in Active Directory do not need and do not receive
access permissions to resources in the Exchange Server organization.
- Implement approval processes. Each security group that Microsoft
IT uses for granting rights must be controlled based on approval processes that
include the IT team responsible for maintaining the service or data.
Exclusive Microsoft IT Management
The administrative model of Exchange Server 2007 relies on Active Directory
forests to define security boundaries. Within a single forest, there is no security
isolation, because forest owners and enterprise administrators can always gain access
to all resources in any domain. Accordingly, Microsoft IT grants enterprise administrator
and top-level domain administrator rights
in the Corporate Forest only on a temporary basis and enforces very strict approval
processes.
Very strict approval processes imply that developers from the Exchange Server Product
group and employees who are not responsible for operating the corporate production
environment are usually not granted administrator rights in the Corporate Forest.
If individuals outside Microsoft IT require administrative access to Exchange Server
resources for testing or other purposes, the corresponding resources cannot be part
of the Corporate Forest. To accommodate these situations, Microsoft IT maintains
separate forests, such as an Exchange Development Forest that the Exchange Server
Product group can use to test pre-release versions of Exchange Server.
Only in rare situations does Microsoft IT grant developers temporary administrative
access to production servers, such as when troubleshooting critical server failures.
To accommodate these situations, Microsoft IT implemented an access model based
on Group Policy objects (GPOs) and Restricted Groups policies. User accounts that
are removed from the membership list in a Restricted Groups policy are removed from
the corresponding security group during the next GPO refresh. By default, this refresh
occurs every 90 minutes on a member server and every five minutes on a domain controller.
Security settings are also reapplied every 16 hours, whether or not there are any changes. In this way, Microsoft
IT enforces the removal of temporary administrative access and maintains the principle
of fewest privileges in the corporate production environment.
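The effect of a Restricted Groups refresh on temporary access can be sketched in a few lines of Python. This is only a model of the outcome described above, not Group Policy code: the function names, account names, and group contents are hypothetical, and the refresh intervals are simply the values quoted in the preceding paragraph.

```python
# Hypothetical model of a Restricted Groups "Members" policy refresh.
# Nothing here calls Group Policy; it only mimics the outcome described above.

# Refresh intervals quoted in the text, in minutes.
REFRESH_MINUTES = {"member_server": 90, "domain_controller": 5}


def apply_restricted_group(current_members, policy_members):
    """Return (members after the refresh, accounts the refresh removed)."""
    current = set(current_members)
    allowed = set(policy_members)
    removed = current - allowed
    # After the refresh, the group contains exactly the policy's member list.
    return allowed, removed


def max_wait_minutes(server_type):
    """Longest wait until the next scheduled refresh, ignoring any random offset."""
    return REFRESH_MINUTES[server_type]


# A developer was granted temporary access and then taken off the policy list.
members_now = {"CORP\\ops-admin", "CORP\\dev-temp"}
policy_list = {"CORP\\ops-admin"}

after, removed = apply_restricted_group(members_now, policy_list)
print(sorted(after), sorted(removed), max_wait_minutes("member_server"))
```

In this model, removing an account from the policy's member list is the only action required; the next refresh on each affected server enforces the change.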
Centralized System Administration
Exchange Server 2007 supports centralized and decentralized administration
by means of four administrator roles:
- Exchange Organization Administrators. As the name implies,
the Exchange Organization Administrators role gives administrators organization-wide,
full access to all Exchange properties, objects, and servers. All Microsoft IT operators
within the Exchange Messaging team are Exchange Organization Administrators with
the rights to read and change all Exchange configuration data in the Corporate Forest
worldwide. This strictly centralized model greatly reduces administrative complexity.
- Exchange Server Administrators. These administrators only
have permissions to control the configuration of a particular server or group of
selected servers and cannot perform global, organization-wide administration tasks.
Because Microsoft IT manages all Exchange Server resources centrally from headquarters
in Redmond, it was not necessary for Microsoft IT to delegate the Exchange Server
Administrators role for subsets of Exchange servers to separate user accounts or
security groups.
- Exchange Recipient Administrators. Users assigned the Exchange
Recipient Administrators role take care of recipient-related tasks, such as mailbox-enabling
user accounts, mail-enabling contacts, distribution groups, and other types of recipient
objects in Active Directory, and configuring Client Access and Unified Messaging
mailbox settings. It is important to note that the Exchange Messaging team does
not perform recipient administration. Other teams, such as the HR department, maintain
the user accounts and related Exchange settings.
- Exchange View-Only Administrators. These are users without
operational tasks, such as Helpdesk leads and messaging engineers, who have a business
need to examine the parameters and current state of the messaging environment.
Note: The administration and permissions model of Exchange Server 2007
does not rely on administrative groups. An administrative group called Exchange
Administrative Group (FYDIBOHF23SPDLT) exists only for backward-compatibility reasons.
All computers running Exchange Server 2007 belong to this administrative group.
It is not a supported operation to rename this group or move server objects into
a different administrative group by using low-level Active Directory tools.
Default Permissions Model
The Exchange Server 2007 permissions model is straightforward and flexible
because it promotes the use of universal security groups for permissions management
that closely correspond to the administrative roles listed in the previous section.
The Exchange Setup /PrepareAD process creates these universal security groups in
the root domain with a forest-wide scope. The groups are located in the Microsoft
Exchange Security Groups container. They are globally available for permission assignments
to Exchange Server resources in any domain, and they can contain users and groups
from any domain in the forest.
Microsoft IT based its centralized administration model for Exchange Server 2007
on the following default universal security groups:
- Exchange View-Only Administrators. Have read-only access
to Exchange recipient attributes on Active Directory objects and read-only access
to the Exchange configuration data in Active Directory.
- Exchange Recipient Administrators. Have full control over
Exchange recipient attributes on Active Directory objects. In addition, this group
has read-only access to the Exchange configuration data in Active Directory because
this group is a member of the Exchange View-Only Administrators group.
- Exchange Organization Administrators. Have full access to
the Exchange configuration data in Active Directory. In addition, this group has
full control over Exchange recipient attributes on Active Directory objects because
this group is a member of the Exchange Recipient Administrators group.
One important feature of the default configuration is that there is no Exchange
Server Administrators group, because permissions for the Exchange Server Administrators
role are not applied through group membership. Instead, Microsoft IT applies the permissions
directly on the server object in the configuration partition. Another important fact is
that Exchange Organization Administrators have permissions to modify recipient attributes
through membership in the Exchange Recipient Administrators group. Microsoft IT
considered removing the Exchange Organization Administrators from the Exchange Recipient
Administrators group but found no compelling reason to deviate from the default
model during the initial deployment. The default permissions do not grant Exchange
Organization Administrators rights to create, modify, or delete objects within the
Active Directory domain-naming context, which would require at least Account Operators
privileges. The authoritative source of user account information is a separate LOB
application, maintained by the HR department and synchronized with Active Directory
through MIIS 2003.
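The way rights accumulate through nested membership in the default universal security groups can be illustrated with a short Python sketch. The group names and the nesting come from the list above; the rights labels (for example, "config:full") are illustrative shorthand invented for this example and are not Exchange Server permission names.

```python
# Nesting of the default universal security groups as described in the text.
# Key = group, value = the groups it is a member of.
MEMBER_OF = {
    "Exchange Organization Administrators": ["Exchange Recipient Administrators"],
    "Exchange Recipient Administrators": ["Exchange View-Only Administrators"],
    "Exchange View-Only Administrators": [],
}

# Rights granted directly to each group (labels are illustrative shorthand).
DIRECT_RIGHTS = {
    "Exchange Organization Administrators": {"config:full"},
    "Exchange Recipient Administrators": {"recipient:full"},
    "Exchange View-Only Administrators": {"config:read", "recipient:read"},
}


def effective_rights(group):
    """Collect the rights of a group plus those of every group it belongs to."""
    rights, stack, seen = set(), [group], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        rights |= DIRECT_RIGHTS[current]
        stack.extend(MEMBER_OF[current])
    return rights


for name in MEMBER_OF:
    print(name, sorted(effective_rights(name)))
```

The sketch shows why Exchange Organization Administrators can modify recipient attributes without a direct assignment: the right arrives through membership in the Exchange Recipient Administrators group.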
Formal Approval Processes
The domain topology that Microsoft IT implemented in the Corporate Forest features
an empty root domain to support strict management processes for schema updates and
for granting forest-wide administrative rights and privileges. Only a very limited
number of IT managers have administrative permissions in the root domain. Because
the Exchange Setup /PrepareAD process creates the universal security groups of Exchange
Server 2007 in the root domain, Microsoft IT implicitly gained the ability
to enforce formal approval processes for Exchange Server administration as well.
Restricting access to the universal security groups of Exchange Server 2007
in the root domain does not prevent Microsoft IT from delegating approval processes
to individual teams and groups that are ultimately responsible for the corporate
and Exchange Server environment, such as the Legal and Corporate Affairs (LCA) team
and the Exchange Messaging team. To delegate approval processes, Microsoft IT added
security groups from child domains to the universal security groups in the root
domain. Team managers with permissions to change group membership information in
child domains can use these security groups to delegate administrative rights.
Figure 6 shows the principle of delegating approval processes to individual teams
and groups through security groups. As a best practice, Microsoft IT always uses
security groups (in child domains) as opposed to individual user accounts to assign
administrative permissions. The membership information of security groups in the
root domain rarely changes.
Figure 6. Delegating access control and approval processes
Permissions Review
One interesting aspect of the Exchange Server 2007 default permissions model
is that running Setup with the /PrepareLegacyExchangePermissions option
begins the automatic process of changing the permissions model of the Exchange
Server 2003 environment for coexistence with Exchange Server 2007. Microsoft
IT completed this process by running Setup with the /PrepareAD option,
which finishes the changes to property sets and security groups. While applying
new Exchange Server 2007 permissions, Microsoft IT took the opportunity to
review and clean up permission assignments from the Exchange Server 2003 environment.
For example, Microsoft IT used security groups to grant administrative permissions
to Exchange Server 2003 resources at the organization and administrative group
levels. Yet, some members of these groups no longer need these permissions. By using
new security groups in the child domains, added to the Exchange Organization Administrators
group in the root domain, Microsoft IT effectively devised the following strategy
to reevaluate all existing Exchange Server administrators:
- Current Exchange Server administrators. Individuals who need
administrative rights to manage Exchange Server 2007 resources must request
them from the Systems Management team. Following each individual approval, the Systems
Management team adds the corresponding user account to the new security groups to
grant Exchange Organization Administrators permissions.
- Former Exchange Server administrators. Individuals who no
longer need administrative permissions automatically lose these rights as Microsoft
IT decommissions Exchange Server 2003 resources and corresponding administrator
groups.
Message Routing Topology
Agile companies that heavily rely on e-mail in practically all areas of the business,
such as Microsoft, cannot tolerate messages that take hours to reach their final
destinations. Information must travel fast, reliably, predictably, and securely.
For Microsoft, this means that 99 percent of all messages within the corporate production
environment must reach their final destination within 90 seconds, worldwide. Of course,
this SLA does not apply to messages that leave the Microsoft environment, because
Microsoft IT cannot guarantee message delivery across external systems. Microsoft
established this mail-delivery SLA during the Exchange Server 2003 time frame,
and it was equally important during the design of Exchange Server 2007. Another
important design goal was to increase security in the messaging backbone by means
of access restrictions to messaging connectors, data encryption through TLS, and
messaging antivirus protection based on Forefront Security for Exchange Server.
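The mail-delivery SLA described above translates into a simple check over measured delivery times. The following Python sketch is one possible way to express it; the function name and the sample latencies are hypothetical and do not represent Microsoft IT monitoring data or tooling.

```python
SLA_SECONDS = 90   # maximum delivery time from the SLA
SLA_PERCENT = 99   # share of messages that must meet it


def meets_delivery_sla(latencies_seconds):
    """True if at least 99 percent of messages arrived within 90 seconds."""
    if not latencies_seconds:
        raise ValueError("no measurements supplied")
    on_time = sum(1 for s in latencies_seconds if s <= SLA_SECONDS)
    # Integer arithmetic avoids floating-point rounding at the 99 percent boundary.
    return on_time * 100 >= SLA_PERCENT * len(latencies_seconds)


# Hypothetical samples: 1,000 internal messages each.
good_day = [12] * 990 + [300] * 10   # exactly 99 percent on time
bad_day = [12] * 989 + [300] * 11    # just below 99 percent

print(meets_delivery_sla(good_day), meets_delivery_sla(bad_day))
```

Only messages that stay within the corporate environment belong in the sample, because the SLA does not cover delivery across external systems.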
Microsoft IT had to consider the following new Exchange Server 2007 routing
features in the design of the message routing topology:
- Active Directory sites. Comparable to routing groups in Exchange
Server 2003, Active Directory sites define a boundary for Hub Transport servers
to deliver messages directly to Mailbox servers, distribution group expansion servers,
connector servers, and Edge Transport servers subscribed to the local site. For
destinations in remote Active Directory sites, the Hub Transport server in the local
site must relay the messages to a Hub Transport server in the remote site for further
delivery. At least one Hub Transport server must exist in every Active Directory
site with Mailbox servers.
- IP site links. Comparable to routing group connectors in
Exchange Server 2003, IP site links define logical paths between Active Directory
sites, which Hub Transport servers use to perform message routing calculations.
It is important to note that Active Directory supports IP and SMTP site links, whereas
Exchange Server 2007 ignores SMTP site links in the routing topology. If a
Hub Transport server resides in a site connected only by an SMTP site link, routing
errors will occur. It is necessary to replace the SMTP site link with an IP-based
site link.
- Least-cost routing paths. Each IP site link is associated
with a cost value that Exchange Server 2007 uses to calculate the message transfer
paths across the routing topology. If multiple IP site links exist between two Active
Directory sites with Hub Transport servers, Exchange Server 2007 transfers
the message along the path with the least combined cost to the ultimate destination.
- Next hop selection. If the least-cost routing path to the
ultimate destination includes multiple Active Directory sites and IP site links,
Hub Transport servers can select the ultimate destination site as the next hop
to deliver messages directly, skipping all intermediary sites that exist along the
least-cost routing path. If delivery to that destination fails, Exchange Server 2007
attempts to relay the messages to an Active Directory site that is as close as
possible to the ultimate destination.
- Queue at the point of failure and backoff. If message delivery
to the next hop fails (for example, because there was no Hub Transport server available
in the target site), Exchange Server 2007 attempts to deliver the messages
to an interim site along the least-cost routing path. This backoff mechanism starts
with the Active Directory site that is directly adjacent to the unavailable next
hop, and then backs off, site by site, along the least-cost routing path until a
connection is established or no further site exists. In this way, Exchange Server 2007
queues the messages at the point in the routing path where communication failed,
which facilitates the troubleshooting of transfer issues.
- Delayed fan-out. If a message has multiple recipients in
destinations that share part of or the entire least-cost routing path, Exchange
Server 2007 transfers a single message copy to the point in the routing path
where a fork occurs. At the fork, Exchange Server 2007 splits the message into
separate copies. Again, each message copy might still have multiple recipients in
destinations that share part of or the entire remaining least-cost routing path,
and so forth. The bifurcation at the latest possible point to preserve network bandwidth
is called delayed fan-out.
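The routing concepts in this list can be made concrete with a small Python model. It is a sketch under stated assumptions, not the Exchange Server 2007 routing implementation: the sites A through E, the link costs, and all function names are hypothetical. The sketch computes a least-cost path with Dijkstra's algorithm, lists the sites the backoff mechanism would try in order, and finds the fork site at which delayed fan-out would split a message.

```python
import heapq

# Hypothetical IP site links and costs; not the Microsoft IT topology.
SITE_LINKS = {
    ("A", "B"): 10,
    ("B", "C"): 10,
    ("C", "D"): 10,
    ("C", "E"): 10,
    ("A", "D"): 50,
}


def neighbors(site):
    """Yield (adjacent site, link cost) pairs; site links work in both directions."""
    for (x, y), cost in SITE_LINKS.items():
        if x == site:
            yield y, cost
        elif y == site:
            yield x, cost


def least_cost_path(source, target):
    """Dijkstra over the site links; returns (total cost, list of sites)."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, site, path = heapq.heappop(queue)
        if site == target:
            return cost, path
        if site in visited:
            continue
        visited.add(site)
        for nxt, link_cost in neighbors(site):
            if nxt not in visited:
                heapq.heappush(queue, (cost + link_cost, nxt, path + [nxt]))
    raise ValueError(f"no route from {source} to {target}")


def backoff_candidates(path):
    """Interim sites to try, nearest to the destination first, if it is unreachable."""
    return path[-2:0:-1]


def fork_site(source, targets):
    """Last site shared by the least-cost paths to all targets (delayed fan-out)."""
    paths = [least_cost_path(source, t)[1] for t in targets]
    shared = source
    for hops in zip(*paths):
        if len(set(hops)) != 1:
            break
        shared = hops[0]
    return shared


cost, path = least_cost_path("A", "D")
print(cost, path, backoff_candidates(path), fork_site("A", ["D", "E"]))
```

In this sample topology, the direct link from A to D costs 50, so the least-cost path runs through B and C at a combined cost of 30. A message addressed to recipients in D and E travels as a single copy as far as C, where the paths diverge.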
Network Infrastructure and Site Consolidation
An important aspect for planning message routing in an Exchange Server 2007
environment is the physical network topology. The physical network determines the
IP routing topology, which directly influences the Active Directory site topology,
which in turn determines the message routing topology of Exchange Server 2007.
By using the existing Active Directory site infrastructure for message routing purposes,
Exchange Server 2007 takes advantage of an optimal network configuration with
little need for adjustments.
For example, Microsoft IT did not need to perform any network optimization to accommodate
Exchange Server 2007. The Exchange Server 2003 site and server
consolidation that Microsoft IT conducted at a global scale three years earlier also benefited
the Exchange Server 2007 design. Microsoft IT only had four
main Active Directory sites with Exchange Mailbox servers and reliable WAN links
with sufficient net-available bandwidth to take into consideration for Exchange
Server 2007 topology and routing planning. The Exchange Server 2003 site
consolidation greatly reduced the complexity of the transition exercise for Microsoft
IT.
The fact that Microsoft IT only had to consider four main locations and Active Directory
sites that already mapped to an optimized network infrastructure, as discussed earlier
in this white paper, led to the following benefits:
- Uncomplicated messaging topology. No additional configuration
was necessary to establish a functional Exchange Server 2007 routing topology.
Opportunities to increase the efficiency of message transfer still existed, but
managing only four sites that house Mailbox servers and optimizing message flow
by means of Exchange Server-specific IP site links was a straightforward undertaking
for Microsoft IT.
- Best possible Hub Transport server utilization. Mailbox servers submit messages for routing
and transport to a Hub Transport server in the local Active Directory site. Depending
on the location of the recipients, the Hub Transport server then transfers the messages
to a Mailbox server within the local site, to a Hub Transport server in another
Active Directory site, or through a messaging connector to an external destination.
The result is that message transfer cannot work if there is no Hub Transport server
in the local site of the Mailbox server. In addition, this implies that a small
number of Active Directory sites with a large number of Mailbox servers provide
a better Hub Transport server utilization than a large number of Active Directory
sites with a small number of Mailbox servers. For example, Microsoft IT deployed
three Hub Transport servers for 15 Mailbox servers in ADSITE_DUBLIN to perform load
balancing and provide fault tolerance. If the Mailbox servers in Dublin resided
in two Active Directory sites instead, Microsoft IT would have had to deploy one
additional Hub Transport server, because each site would have required a minimum
of two Hub Transport servers to achieve load balancing and fault tolerance.
Note: The exact server ratios per Active Directory site depend on the performance
characteristics of the individual environment and the server configurations.
- Reduced chance of server communication issues: For purposes
of message routing, client access, and unified messaging, it is important to deploy
Hub Transport servers, Client Access servers, and Unified Messaging servers in each
Active Directory site that contains Mailbox servers. Active Directory sites correspond
to one or more IP subnets that represent areas with reliable, high-speed IP connectivity.
Yet, it is possible to leave gaps or specify overlapping IP subnets in the Active
Directory site topology, which can cause communication issues if Exchange Server 2007
cannot correctly determine its site membership. Keeping the Active Directory site
topology straightforward helps Microsoft IT avoid these types of issues.
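The subnet gaps and overlaps mentioned in the last item can be checked mechanically. The following is a minimal sketch, not a Microsoft IT tool: the subnet assignments are hypothetical values chosen only to produce one overlap and one gap, and the function names are illustrative.

```python
import ipaddress

# Hypothetical subnet-to-site assignments. The site names come from this paper,
# but every address range below is an invented example value.
SITE_SUBNETS = {
    "ADSITE_DUBLIN": ["10.1.0.0/16"],
    "ADSITE_SINGAPORE": ["10.2.0.0/16"],
    "ADSITE_SAO PAULO": ["10.2.128.0/17"],  # deliberately overlaps Singapore
}


def _entries(site_subnets):
    """Flatten the mapping into a list of (site, network) pairs."""
    return [
        (site, ipaddress.ip_network(subnet))
        for site, subnets in site_subnets.items()
        for subnet in subnets
    ]


def find_overlaps(site_subnets):
    """Return (site_a, subnet_a, site_b, subnet_b) for overlapping subnets of different sites."""
    entries = _entries(site_subnets)
    overlaps = []
    for index, (site_a, net_a) in enumerate(entries):
        for site_b, net_b in entries[index + 1:]:
            if site_a != site_b and net_a.overlaps(net_b):
                overlaps.append((site_a, str(net_a), site_b, str(net_b)))
    return overlaps


def sites_for_address(site_subnets, address):
    """Return all sites whose subnets contain the address; an empty list indicates a gap."""
    ip = ipaddress.ip_address(address)
    return sorted({site for site, net in _entries(site_subnets) if ip in net})
```

A server address that maps to no site, or to more than one site, is a candidate for the site membership problems described above. The sketch only flags such cases; it does not model how a server actually resolves its site.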
Dedicated Exchange Sites in the Active Directory Topology
Although dedicated Active Directory sites for Exchange Server may be used for domain
controller/global catalog isolation and dedication of that infrastructure to Exchange
servers for reliability or performance reasons, in the context of Exchange Server 2007
transport, dedicated Exchange sites can complicate the routing topology. They deviate
from the classic definition of an Active Directory site as an area with reliable,
high-speed, low-latency IP connectivity. Because dedicated Exchange sites generally
do not correspond to the physical network layout, they can lead to a message routing
topology that does not use the physical network links as efficiently as possible.
For example, Microsoft IT maintains a dedicated Exchange Active Directory site in
addition to a central hub Active Directory site in Redmond, as indicated in Figure
3 earlier in this white paper. In the Active Directory replication topology, this
dedicated Exchange site is a tail site. ADSITE_REDMOND is the Active Directory hub
site that is used as the Active Directory replication focal point, yet this site
does not contain any Hub Transport servers. Accordingly, at this point Exchange
Server 2007 cannot use ADSITE_REDMOND as a hub site for message routing purposes
and by default interprets the Microsoft IT Exchange organization as a full-mesh
topology where Exchange Server 2007 Hub Transport servers in each region connect to
each other via a single SMTP hop. This does not match the Active Directory replication and
network topology. Exchange Server 2007 uses the full-mesh topology in this
specific scenario, because along the IP site links all message transfer paths appear
to be direct.
Figure 7 illustrates this situation by placing the Active Directory site topology
and default message routing topology on top of each other.
Figure 7. Full-mesh message routing in a hub-and-spoke network topology
The routing topology depicted in Figure 7 works because Exchange Server 2007
can transfer messages directly between the sites with Hub Transport servers (such
as ADSITE_SINGAPORE to ADSITE_DUBLIN), yet all messages travel through the Redmond
location according to the physical network layout. In this topology, all bifurcation
of messages sent to recipients in multiple sites takes place at the source sites
and not at the latest possible point along the physical path, which would be Redmond.
For example, if a user in Sao Paulo sends a single message to recipients in the
sites ADSITE_REDMOND-EXCHANGE, ADSITE_DUBLIN, and ADSITE_SINGAPORE, the source Hub
Transport server in Sao Paulo establishes three separate SMTP connections, one SMTP
connection to Hub Transport servers in each remote site, to transfer three message
copies. Hence, the same message travels three times over the network from Sao Paulo
to Redmond. Microsoft IT could avoid this by eliminating the dedicated Exchange
site ADSITE_REDMOND-EXCHANGE and moving all Exchange servers to ADSITE_REDMOND.
The Hub Transport servers in ADSITE_REDMOND would then be in the transfer path between
ADSITE_SAO PAULO, ADSITE_DUBLIN, and ADSITE_SINGAPORE, which would mean that Exchange
Server 2007 could delay bifurcation until messages to recipients in multiple
sites reach ADSITE_REDMOND. In this situation, the source server would only need
to transfer one message copy to Redmond, the message routing topology would follow
the physical network layout, and Microsoft IT would not have to take any extra configuration
or optimization steps.
Based on such considerations, an ideal Exchange Server 2007 design takes the
implications of dedicated Exchange Active Directory sites into account. On one hand,
it is beneficial to keep the message
routing topology straightforward and the complexities associated with maintaining
and troubleshooting message transfer minimal. On the other hand, Microsoft IT had
to weigh the benefits of eliminating the dedicated Redmond Exchange site ADSITE_REDMOND-EXCHANGE
against the impact of such an undertaking on the overall deployment project in terms
of costs, resources, and timelines. Among other things, Microsoft IT maintains ADSITE_REDMOND-EXCHANGE
to shield Exchange Server 2007 from Windows Server 2008 domain controllers
that exist in the corporate production environment. As mentioned earlier in this
paper, Exchange Server 2007 support for Windows Server 2008 was not available
at the time Microsoft IT transitioned the corporate production environment. Eliminating
ADSITE_REDMOND-EXCHANGE would have required Microsoft IT to remove all Windows Server 2008
domain controllers from ADSITE_REDMOND, which was not an option. Furthermore, Microsoft
IT takes advantage of the dedicated Active Directory site to measure the footprint
of Exchange Server 2007 on domain controller/global catalog servers and provide
this information as feedback to the product teams. Accordingly, Microsoft IT
decided to leave ADSITE_REDMOND-EXCHANGE in place. Instead, the Exchange Messaging
team collaborated with the Active Directory team to adjust the Active Directory
site topology by using alternative methods to optimize message transfer without
affecting the established Active Directory replication architecture and topology.
Optimized Message Transfer Between Hub Transport Servers
Although Exchange Server 2007 generated a functioning message routing topology
without any extra design work, Microsoft IT decided to review the routing topology
based on business and technical requirements to drive further optimizations. Key
factors that influenced the optimization decision included the "90 seconds, 99 percent
of the time" mail-delivery SLA and the desire to save network bandwidth on WAN links
by increasing the efficiency of message transfer.
Important reasons that compelled Microsoft IT to optimize the Exchange Server 2007
message transfer topology include the following:
- Efficient message flow: At Microsoft, 99 percent of the messages
must reach their recipients within 90 seconds. Although optimized message
flow is not a strict requirement and it is possible to meet mail-delivery SLAs in
a full-mesh topology, optimized message flow can help to accelerate message delivery.
- Preserved WAN bandwidth: The corporate production environment
handles more than 6 million internal messages daily. Although most message traffic
stays in the local site or has Redmond headquarters as the destination, optimized
message flow can help to preserve WAN bandwidth for all messages with recipients
in multiple remote Active Directory sites.
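The "90 seconds, 99 percent of the time" target can be expressed as a simple check over measured delivery times. This is a minimal sketch; the function name and the sample latencies in the usage note are illustrative assumptions and do not describe how Microsoft IT measures the SLA.

```python
def meets_delivery_sla(latencies_seconds, threshold_seconds=90.0, required_fraction=0.99):
    """Return (ok, fraction), where fraction is the share of messages
    delivered within threshold_seconds."""
    if not latencies_seconds:
        raise ValueError("at least one delivery measurement is required")
    within = sum(1 for latency in latencies_seconds if latency <= threshold_seconds)
    fraction = within / len(latencies_seconds)
    return fraction >= required_fraction, fraction
```

For example, a hypothetical sample of 1,000 messages in which 995 arrive within 90 seconds passes the check, whereas a sample in which only 980 do so fails it.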
Having made the decision to optimize message routing, Microsoft IT augmented the
Active Directory site link topology in order to take advantage of the Exchange Hub
Transport servers in ADSITE_REDMOND-EXCHANGE. To achieve efficient message flow
and preserve WAN bandwidth, it was necessary to place ADSITE_REDMOND-EXCHANGE in
the routing path between ADSITE_DUBLIN, ADSITE_SAO PAULO, and ADSITE_SINGAPORE by
creating additional Active Directory site IP links. This approach ensured that Exchange
Server 2007 could bifurcate messages traveling between regions closer to their
destination. By configuring the ExchangeCost attribute on Active Directory
site links, which Exchange Server 2007 adds to the Active Directory site link
definition, Microsoft IT was able to perform the message flow optimization without
affecting the Active Directory replication topology. The ExchangeCost attribute
is only relevant for Exchange Server 2007 message routing decisions between
sites, not Active Directory replication.
Microsoft IT performed the following steps to optimize message routing in the corporate
production environment:
- To establish a hub/spoke topology between all sites with Exchange servers, Microsoft
IT created three additional Active Directory IP site links (see Figure 8).
- Microsoft IT specified a Cost value of 999 (highest across the Active Directory
topology) for these new IP site links so that Active Directory does not use these
site links for directory replication.
- Using the Set-AdSiteLink cmdlet, Microsoft IT assigned an ExchangeCost
value of 10 to the new Exchange-specific site links. This value is significantly
lower than the Cost value of all other Active Directory site links, so that
Exchange Server 2007 uses the Exchange-specific site links for message routing
path discovery.
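The effect of these steps can be illustrated with a small least-cost path calculation. The sketch below assumes that message routing uses the ExchangeCost value where one is set and the Cost value otherwise. The Cost of 100 on the existing site links is an assumed placeholder, while 999 and 10 are the values named in the steps above. The function names are illustrative.

```python
import heapq

# (site_a, site_b, Cost, ExchangeCost or None).
# Cost 100 on the four existing links is an assumption; 999 and 10 come from the text.
SITE_LINKS = [
    ("ADSITE_REDMOND", "ADSITE_REDMOND-EXCHANGE", 100, None),
    ("ADSITE_REDMOND", "ADSITE_DUBLIN", 100, None),
    ("ADSITE_REDMOND", "ADSITE_SAO PAULO", 100, None),
    ("ADSITE_REDMOND", "ADSITE_SINGAPORE", 100, None),
    ("ADSITE_REDMOND-EXCHANGE", "ADSITE_DUBLIN", 999, 10),
    ("ADSITE_REDMOND-EXCHANGE", "ADSITE_SAO PAULO", 999, 10),
    ("ADSITE_REDMOND-EXCHANGE", "ADSITE_SINGAPORE", 999, 10),
]


def directory_cost(link):
    """Cost value only, ignoring ExchangeCost."""
    return link[2]


def routing_cost(link):
    """ExchangeCost if one is set on the link, otherwise the Cost value."""
    return link[2] if link[3] is None else link[3]


def least_cost_path(links, source, target, cost_of):
    """Dijkstra search over undirected site links; returns (total_cost, path) or None."""
    graph = {}
    for link in links:
        site_a, site_b = link[0], link[1]
        graph.setdefault(site_a, []).append((site_b, cost_of(link)))
        graph.setdefault(site_b, []).append((site_a, cost_of(link)))
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        total, site, path = heapq.heappop(queue)
        if site == target:
            return total, path
        if site in visited:
            continue
        visited.add(site)
        for neighbor, cost in graph.get(site, []):
            if neighbor not in visited:
                heapq.heappush(queue, (total + cost, neighbor, path + [neighbor]))
    return None
```

Under these assumptions, the least-cost routing path from ADSITE_SAO PAULO to ADSITE_DUBLIN runs through ADSITE_REDMOND-EXCHANGE, whereas a comparison by Cost alone still prefers the path through ADSITE_REDMOND. The Cost-only comparison shows only that the new links are unattractive by Cost; it does not model how Active Directory builds its replication topology.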
Figure 8 illustrates how the Exchange-specific site links change the message routing
topology. The dedicated Exchange site ADSITE_REDMOND-EXCHANGE in North America now
acts as a fork in the routing path to the sites ADSITE_DUBLIN, ADSITE_SAO PAULO,
and ADSITE_SINGAPORE.
Figure 8. Optimized message routing topology in the Corporate Forest
Based on the Exchange-specific Active Directory/IP site link topology, Exchange
Server 2007 routes messages in the Corporate Forest as follows:
- Messages to a single destination: The source Hub Transport
server selects the final destination as the next hop and sends the messages directly
to a Hub Transport server in that site. For example, in the Dublin to Singapore
mail routing scenario, the network connection passes through Redmond, but the Hub
Transport servers in ADSITE_REDMOND-EXCHANGE do not participate in the message transfer.
- Messages to an unavailable destination: If the source Hub
Transport server is unable to establish a direct connection to a destination site,
the Hub Transport server backs off along the least-cost routing path until a connection
to a Hub Transport server in an Active Directory site is established. This would
be a Hub Transport server in ADSITE_REDMOND-EXCHANGE, which queues the messages
for transmission to the final destination upon restoration of network connectivity.
- Messages to recipients in multiple sites: Exchange Server 2007
delays message bifurcation if possible. In the optimized topology, this means that
Hub Transport servers transfer all messages with recipients in multiple sites first
to a Hub Transport server in ADSITE_REDMOND-EXCHANGE. The Hub Transport server in
ADSITE_REDMOND-EXCHANGE then performs the bifurcation and transfers a separate copy
of the message to each destination. Figure 8 illustrates this scenario. Exchange
Server 2007 transfers a single message copy from ADSITE_SAO PAULO to ADSITE_REDMOND-EXCHANGE,
where bifurcation takes place, before transferring individual message copies to
each destination site. Again, Exchange Server 2007 transfers only a single
copy per destination site. Within each site, the receiving Hub Transport servers
may bifurcate the message further as necessary for delivery to individual recipients.
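The delayed bifurcation described in the last item can be sketched as a search for the last site that all routing paths have in common. The paths below restate the Sao Paulo example from this section, and the function names are illustrative rather than part of any Exchange Server interface.

```python
# Least-cost routing paths from ADSITE_SAO PAULO in the optimized topology,
# keyed by destination site (taken from the example in the text).
PATHS = {
    "ADSITE_REDMOND-EXCHANGE": ["ADSITE_SAO PAULO", "ADSITE_REDMOND-EXCHANGE"],
    "ADSITE_DUBLIN": ["ADSITE_SAO PAULO", "ADSITE_REDMOND-EXCHANGE", "ADSITE_DUBLIN"],
    "ADSITE_SINGAPORE": ["ADSITE_SAO PAULO", "ADSITE_REDMOND-EXCHANGE", "ADSITE_SINGAPORE"],
}


def bifurcation_site(paths):
    """Return the last site shared by every path, that is, the latest point
    at which a single message copy can still serve all destinations."""
    common = []
    for hops in zip(*paths.values()):
        if len(set(hops)) != 1:
            break
        common.append(hops[0])
    return common[-1]


def copies_leaving_source(paths, delayed):
    """Count the message copies that leave the source site.

    With delayed bifurcation, one copy is sent per distinct next hop.
    With bifurcation at the source, one copy is sent per destination site.
    """
    if delayed:
        return len({hops[1] for hops in paths.values()})
    return len(paths)
```

For the three destination sites in the example, the sketch identifies ADSITE_REDMOND-EXCHANGE as the bifurcation point, with one copy leaving Sao Paulo instead of three.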
Note: Microsoft IT did not use the Set-AdSite cmdlet to configure the REDMOND-EXCHANGE
site as a hub site in the routing topology. A hub site would have forced all messages
between regions to travel through the REDMOND-EXCHANGE site, requiring an extra
SMTP hop on Hub Transport servers in that site. This would have mirrored the previous
Exchange Server 2003 routing topology, yet Microsoft IT found no compelling
reason to force all message traffic through the North American Hub Transport servers.
Establishing a hub site is useful if tail sites cannot communicate directly with
each other. In the Microsoft IT corporate production environment, this is not an
issue. For more information about hub site configurations, see the topic "Understanding
Active Directory Site-Based Routing" in the online product documentation at
http://technet.microsoft.com/en-us/library/aa998825.aspx.
Connectivity to Remote SMTP Domains
For destinations outside the Corporate Forest, Microsoft IT distinguishes between
external and internal remote locations. For external remote locations, Microsoft
IT relays all messages over Edge Transport servers deployed in perimeter networks,
as explained in the section "Internet Mail Connectivity" later in this
white paper. For internal remote locations, Microsoft IT uses messaging connectors
directly on the Hub Transport servers in ADSITE_REDMOND-EXCHANGE. This design mirrors
the Exchange Server 2003 topology.
Increased Message Routing Security
To ensure compliance with legal and regulatory requirements, Microsoft IT encrypts
most messaging traffic in the corporate production environment. The only exceptions
are internal destinations without user mailboxes, such as lab and test environments.
Microsoft IT uses the following encryption technologies to prevent unauthorized
access to information during message transmission:
- IP security (IPSec): IPSec encrypts data communication and prevents
unauthorized access to resources in the corporate production environment. Because
IPSec works at the IP layer, it can help secure communication between servers without
relying on the application to support the encryption natively. Microsoft IT extensively
used IPSec encryption in its Exchange Server 2003 environment to help secure
internal SMTP transactions and continues its use today in scenarios where SMTP servers
and applications do not support native TLS-based SMTP encryption. Although IPSec
technology offers strong encryption controls, managing custom IPSec policies can
be quite cumbersome. With the transition to Exchange Server 2007, Microsoft
IT was able to accomplish most of the transport encryption by using native Exchange
Server product features.
- Transport Layer Security (TLS): Exchange Server 2007 supports
TLS right out of the box. Hub Transport servers use TLS to encrypt all message traffic
within the Exchange Server 2007 environment and rely on opportunistic TLS encryption
for communication with remote destinations, such as Hub Transport servers in other
Microsoft IT-managed forests. Edge Transport servers also support TLS and domain
security to establish security-enhanced message transfer paths to business partners
over the Internet. Native support for SMTP TLS on Hub Transport and Edge Transport
servers enabled Microsoft IT to eliminate the dependency on complex IPSec policies
for encryption of internal and external messages in transit.
In addition to encrypting messaging traffic internally, Microsoft IT also protects
its internal messaging environment by restricting access to inbound SMTP submission
points. This helps Microsoft IT minimize mail spoofing and ensure that unauthorized
SMTP mail submissions from rogue internal clients and applications do not affect
corporate e-mail communications. To accomplish this goal, Microsoft IT removes all
default Receive connectors on the Hub Transport servers and configures custom Receive
connectors by using the New-ReceiveConnector cmdlet to accept only authenticated
SMTP connections from other Hub Transport servers and Edge Transport servers in
the environment. To meet the needs of internal SMTP applications and clients, Microsoft
IT established a separate SMTP gateway infrastructure based on Exchange Server 2007
Hub Transport servers that enforces mail submission access controls, filtering,
and other security checks.
Furthermore, Microsoft IT deployed Forefront Security for Exchange Server on all
Hub Transport and Edge Transport servers to implement messaging protection against
viruses and other malicious e-mail content at multiple layers in the Exchange Server 2007
infrastructure. Despite the fact that internal messages and messages from the Internet
might pass through multiple Hub Transport and Edge Transport servers, performance-intensive
antivirus scanning is performed only once. Forefront Security adds a security-enhanced
antivirus header to each scanned message, so further Hub Transport or Edge Transport
servers do not need to scan the same message a second time. This avoids processing
overhead while maintaining an effective level of antivirus protection for all inbound,
outbound, and internal e-mail messages.
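The scan-once behavior can be sketched as follows. This is an illustration of the general idea only: the header name, the shared key, and the use of an HMAC to protect the stamp are assumptions made for the example, because this paper does not describe how Forefront Security builds or validates its antivirus header.

```python
import hashlib
import hmac

# Hypothetical header name and key, for illustration only.
AV_STAMP_HEADER = "X-Example-AV-Stamp"
SHARED_KEY = b"example-key-not-a-real-secret"


def _stamp_for(body: bytes) -> str:
    """Compute a keyed digest over the message body (stand-in for the real stamp)."""
    return hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()


def scan_if_needed(message: dict, scanner) -> bool:
    """Scan the message unless it already carries a valid stamp.

    Returns True if a scan ran, False if the existing stamp was accepted.
    """
    stamp = message["headers"].get(AV_STAMP_HEADER)
    if stamp is not None and hmac.compare_digest(stamp, _stamp_for(message["body"])):
        return False
    scanner(message["body"])  # the expensive antivirus scan
    message["headers"][AV_STAMP_HEADER] = _stamp_for(message["body"])
    return True
```

In this sketch, the first transport server that handles a message scans and stamps it, later servers accept the stamp and skip the scan, and a message with a missing or invalid stamp is scanned again.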
Coexistence with Exchange Server 2003
Coexistence with Exchange Server 2003 introduces a special connectivity scenario
that Microsoft IT had to consider during the planning phase. Exchange Server 2007
integrates into the existing routing topology through a special routing group, called
EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR), which has the following limitations and
restrictions for compatibility reasons:
- All computers running Exchange Server 2007 must be members of EXCHANGE ROUTING
GROUP (DWBGZMFD01QNBJR). Within this routing group, Exchange Server 2007
servers use the Active Directory site topology for message routing.
- Computers running previous versions of Exchange Server cannot be members of EXCHANGE
ROUTING GROUP (DWBGZMFD01QNBJR). Exchange Server 2003 is
unaware of the various Exchange Server 2007 server roles and cannot recognize server-specific
communication requirements. Therefore, Exchange Server 2003 cannot coexist
in the same routing group with Exchange Server 2007.
One important implication of these restrictions is that EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR)
differs from the classical Exchange Server routing group in scope and definition.
EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) is global in scope and does not define
a network region with reliable connectivity. Rather, EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR)
defines a parallel hemisphere in which the Active Directory site topology defines
the message routing topology, as illustrated in Figure 9.
Figure 9. Exchange Server 2003 and Exchange Server 2007 routing topologies
in the Corporate Forest
Microsoft IT started the transition of the Corporate Forest messaging environment
from Exchange Server 2003 to Exchange Server 2007 in the Redmond location.
With the introduction of Exchange Server 2007 Mailbox servers, Hub Transport
servers, and other server roles in that location, Microsoft IT needed to establish
message transfer between the legacy Exchange Server 2003 and new Exchange Server 2007
environment and maintain this messaging connectivity for the entire period of coexistence.
For that reason, Microsoft IT decided to connect the Exchange Server 2007-specific
routing group first to the Exchange Server 2003 routing group RG_REDMOND in
North America by using routing group connectors. Later, during the transition phase,
as the Exchange Server 2007 infrastructure expanded into the regions, Microsoft
IT created routing group connectors between the Exchange Server 2007-specific
routing group and the regional Exchange Server 2003 routing groups. Shortly
after this stage, Microsoft IT raised the cost of the legacy routing group connectors
between regional routing groups in the Exchange Server 2003 environment, to
route all messaging traffic between regions through the Exchange Server 2007
messaging backbone and take advantage of native Exchange Server 2007 routing
features.
Microsoft IT performed the following steps to establish messaging connectivity between
Exchange Server 2003 and Exchange Server 2007 in the corporate production
environment:
- During the first Exchange Server 2007 installation, Microsoft IT selected a
bridgehead server from RG_REDMOND to establish the initial routing group connection.
- To grant all Exchange 2003 servers the necessary send and receive permissions
on Hub Transport servers, Microsoft IT made sure that the
ExchangeLegacyInterop universal security group in the root domain included
the Exchange Domain Servers global security groups from the child domains. When
Exchange Server 2007 tools are used to create the routing group connectors,
the specified Exchange Server 2003 servers are automatically added to the
ExchangeLegacyInterop group to grant the legacy servers the required permissions to
send mail to and receive mail from Exchange Server 2007 Hub Transport servers.
Note: It is a Microsoft IT-specific configuration to include the Exchange
Domain Servers global security groups from the child domains in the
ExchangeLegacyInterop universal security group in the root domain. This
configuration enables the connector's remote legacy servers to be updated at any
time and ensures that the ExchangeLegacyInterop group contains the appropriate server.
- For fault tolerance and load balancing, Microsoft IT installed additional Hub Transport
servers in ADSITE_REDMOND-EXCHANGE and adjusted the bridgehead server configuration
of the initial routing group connection, as summarized in Table 2.
- Following the deployment of Exchange Server 2007 in Dublin and Singapore, Microsoft
IT established additional routing group connectors (RGCs) to optimize message flow and
facilitate decommissioning of legacy routing groups. The Sao Paulo location did not
require an additional RGC, because Microsoft IT transitioned all recipients in this
location in a single step.
For more information about the deployment process, see the section "Deployment Planning"
later in this white paper.
Table 2. Bridgehead Server Configuration for Routing Group Connectors
| Routing group connector | Local bridgeheads | Remote bridgeheads |
| From RG_REDMOND to EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) | Any local server can send mail over this connector. This enables all Exchange 2003 servers to transfer messages directly to the Hub Transport servers without involving Exchange 2003 bridgeheads. | All Hub Transport servers located in ADSITE_REDMOND-EXCHANGE |
| From EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) to RG_REDMOND | All Hub Transport servers located in ADSITE_REDMOND-EXCHANGE. | All Hub Transport servers located in RG_REDMOND |
| From RG_DUBLIN to EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) | Any local server can send mail over this connector. | All Hub Transport servers located in ADSITE_DUBLIN |
| From EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) to RG_DUBLIN | All Hub Transport servers located in ADSITE_DUBLIN. | The public-folder servers in RG_DUBLIN, which also function as bridgehead servers |
| From RG_SINGAPORE to EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) | Any local server can send mail over this connector. | All Hub Transport servers located in ADSITE_SINGAPORE |
| From EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) to the Singapore routing group | All Hub Transport servers located in ADSITE_SINGAPORE. | The public-folder servers in RG_SINGAPORE, which also function as bridgehead servers |
Important: Because Microsoft IT used routing group connectors in a straightforward
hub/spoke topology with the default setting of "Any local server can send mail over
this connector," it was not necessary for Microsoft IT to suppress link state
updates on Exchange 2003 servers by specifying the SuppressStateChanges Registry
parameter in preparation for the deployment of additional routing group connectors.
Nevertheless, Microsoft IT recommends that all customers suppress link state updates
regardless of routing group connector configuration. For more information about the
SuppressStateChanges
Registry parameter, see the topic "How to Suppress Link State Updates" in the Exchange
Server 2007 product documentation, available online at
http://technet.microsoft.com/en-us/library/aa996728.aspx.
Server Architectures and Designs
Some important business requirements that Microsoft IT addressed in the server architectures
and designs for Exchange Server 2007 revolved around the goals of eliminating
the performance and scalability issues of the 32-bit platform and establishing a
flexible messaging infrastructure to support growing mailboxes and a larger number
of clients. The product group helped Microsoft IT achieve the first of these goals
by optimizing Exchange Server 2007 for 64-bit server hardware, specifically
x64 processors. By exploiting the advantages of dedicated Exchange 2007 server
roles in combination with load-balanced, fault-tolerant server configurations, the
Exchange Messaging team was also able to maintain SLAs with 99.99 percent availability
of messaging services.
Flexible and Scalable Messaging Infrastructure
Microsoft IT heavily focused on single-role server deployments in almost all regions
of the messaging environment. A server role is a logical unit that groups a selected
set of server features and components together to perform a specific messaging function.
Although the single-role server design increases the hardware footprint in the data
center, it also increases the flexibility and scalability of the messaging environment.
Figure 10 illustrates an Exchange Server 2007 architecture based on single-role
server designs.
Figure 10. Exchange Server 2007 architecture based on single-role servers
Exchange Server 2007 supports the following five separate server roles to perform
the tasks of an enterprise messaging system:
- Client Access servers: Support Post Office Protocol 3
(POP3) and Internet Message Access Protocol 4 (IMAP4) clients, as well as Exchange
ActiveSync, Office Outlook Web Access, Outlook Anywhere, and new Outlook 2007
client functions.
- Edge Transport servers: Handle message traffic to and from
the Internet and run spam filters. Microsoft IT also installs Forefront Security
for Exchange Server on all Edge Transport servers for virus scanning.
- Hub Transport servers: Perform the internal message transfer,
distribution list expansions, and message conversions between Internet mail and
Exchange Server message formats. At Microsoft, all Hub Transport servers also run
Forefront Security for Exchange Server for virus scanning.
- Mailbox servers: Maintain mailbox store databases and provide
Office Outlook clients and Client Access servers with access to the data.
- Unified Messaging servers: Integrate voice and fax with e-mail
messaging and run Outlook Voice Access.
Multiple-Role and Single-Role Server Designs
With the exception of the Edge Transport server role, Exchange Server 2007
supports multiple-role server deployments. The Client Access server role, Hub Transport
server role, Mailbox server role, and Unified Messaging server role can coexist
on the same computer in any combination. Placing several roles on a single computer
is advantageous for small Exchange Server deployments. The multiple-role approach
provides the benefits of a reduced server footprint and can help to minimize the
hardware costs. For example, Microsoft IT deployed a multiple-role server in Sao
Paulo for the Hub Transport server role, Client Access server role, and Unified
Messaging server role to use hardware resources efficiently. Similar to the Exchange
Server 2003 design, Microsoft IT consolidated server roles in this location
due to moderate workload. However, the Mailbox server in Sao Paulo is a single-role
server according to the requirements for CCR.
Microsoft IT based its decisions to combine Exchange server roles on the same hardware
or separate them between dedicated servers on capacity, performance, and availability
demands. Mailbox servers are prime examples of systems with high capacity, performance,
and availability requirements at Microsoft. Accordingly, Microsoft IT deployed the
Mailbox servers in all regions in a single-role design, which enabled Microsoft
IT to eliminate single points of failure in the Mailbox server configuration by
using CCR. In Redmond, Dublin, and Singapore, Microsoft IT also used the single-role
design for the remaining server roles, because these regions include a large number
of users and multiple Mailbox servers.
Single-role server deployments provide Microsoft IT with the following benefits:
Table 3 shows the number of servers and the technologies per server role that Microsoft
IT uses in the corporate production environment to implement load balancing and
fault tolerance.
Table 3. Servers in the Microsoft IT Exchange Server 2007 Environment
| Server role | Redmond | Silicon Valley | Dublin | Singapore | Sao Paulo | Technology |
| Mailbox | 31 | 0 | 15 | 15 | 1 | Microsoft Windows Clustering and CCR. Network interface card (NIC) teaming by using NICs connected to different switches. |
| Edge Transport | 3 | 3 | 2 | 2 | 0 | Domain Name System (DNS) round robin and Mail Exchanger (MX) records with same cost values. Multiple Hub Transport servers as bridgeheads in Send Connector configuration. |
| Hub Transport | 8 | 0 | 3 | 3 | 1 | Automatic load balancing through Mail Submission Service. Edge Subscriptions for Hub/Edge connectivity. |
| Client Access | 16 | 0 | 6 | 4 | | Web Publishing Load Balancing (WPLB) on Microsoft Internet Security and Acceleration (ISA) Server 2006. Microsoft Network Load Balancing (NLB) internally. |
| Unified Messaging | 7 | 0 | 2 | 2 | | Automatic round robin load balancing between Unified Messaging servers. Multiple voice over IP (VoIP) gateways per dial plan. |
Scaling Up Server Designs
Following the successful completion of the initial production rollout, Microsoft
IT began to develop new, scaled-up Mailbox server designs in order to increase the
density of the user mailboxes per server, consolidate resources, and reduce maintenance
overhead. The new designs take advantage of quad-core processors and scale up to
6,000 users with 500-MB mailboxes per Mailbox server. In the initial design, scaling
out with additional Mailbox servers was the only reasonable option for Microsoft
IT to handle the increased demand for mailbox sizes of 500 MB and 2 GB without jeopardizing
SLAs. The new designs enabled Microsoft IT to consolidate all the 2,000-user Mailbox
servers used during the original beta deployment by a factor of three. Because each
Mailbox server corresponds to a two-node CCR cluster, Microsoft IT was able to repurpose
more than 60 recently purchased enterprise servers.
To scale up to 6,000 mailboxes per server, Microsoft IT capitalized on the newest
processor technologies at the time of deployment, such as the quad-core Intel Xeon
Processor X5355. This eliminated the processor bottleneck that existed on the
dual-core CPU systems used for Mailbox servers during the initial deployment and that
had prevented Microsoft IT from achieving that scale. The X5355-compatible server model
that Microsoft IT selected offered eight slots for Fully Buffered Dual Inline Memory
Modules (FB-DIMM). This implies that with a maximum module size of 4 GB, the server
architecture accommodates up to 32 GB of memory, which corresponds to a memory configuration
at the upper end of product recommendations (2 GB + 2 MB to 5 MB/mailbox). However,
at the time Microsoft IT developed this server design, 4 GB DIMMs did not offer
an attractive memory capacity/price ratio. To remain cost-efficient, Microsoft IT
decided to use 2-GB memory modules instead and designed the storage solution according
to the I/O requirements that result from having less memory per user on the server.
In comparison to the initial Mailbox servers, the new design allocated 60 percent
less memory per user. Although the Extensible Storage Engine (ESE) of
Exchange Server 2007 can cache any amount of data, the limited physical memory means
that the ESE can cache only approximately 2 MB of messaging data per user. This has
a direct impact on the number of I/O operations,
because with less memory to cache data, the storage engine must go to disk more
often to reload frequently used data. After consolidation to the new server platform,
the same users will have a higher I/O profile. To meet the increased requirements
in terms of I/O operations per second (IOPS), Microsoft IT optimized the design
of the storage subsystem, as explained in the next section.
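The memory trade-off described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the 2 GB base allocation from the product guidance is subtracted before dividing the remainder across 6,000 mailboxes; the function and constant names are illustrative and do not come from Microsoft IT.

```python
# Memory sizing sketch for the 6,000-user Mailbox server design.
# Figures come from the text; names are illustrative assumptions.

BASE_MEMORY_MB = 2 * 1024   # 2 GB base allocation from the product guidance
DIMM_SLOTS = 8              # FB-DIMM slots in the selected server model
MAILBOXES = 6000


def cache_mb_per_mailbox(dimm_size_gb: int) -> float:
    """Memory left per mailbox after subtracting the 2 GB base allocation."""
    total_mb = DIMM_SLOTS * dimm_size_gb * 1024
    return (total_mb - BASE_MEMORY_MB) / MAILBOXES


for dimm_gb in (4, 2):
    total_gb = DIMM_SLOTS * dimm_gb
    print(f"{dimm_gb}-GB modules: {total_gb} GB total, "
          f"{cache_mb_per_mailbox(dimm_gb):.1f} MB per mailbox")
```

With 4-GB modules the result lands near the 5 MB per mailbox at the upper end of the recommendation; with 2-GB modules it lands a little above 2 MB, which is in line with the approximate figure quoted above.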
Mailbox Storage Design
The mailbox is one of the very few components in an Exchange Server 2007 organization
that cannot be load-balanced across multiple servers. Each individual mailbox is
unique and can only reside in one mailbox database on one active Mailbox server.
It follows that the mailbox store is one of the most critical Exchange Server components
that directly affect the availability of messaging services. With previous versions
of Exchange Server, Microsoft IT relied on SAN solutions to provide the necessary
configuration for its mailbox clusters. SAN provided a higher level of availability
due to the architecture, and enabled Microsoft IT to achieve the number of disks
required for I/O throughput and scalability. Mailbox servers clustered by using
Windows Clustering and SAN-based storage enabled Microsoft IT to achieve 99.99 percent
availability with Exchange Server 2003, yet the shared storage solution was
a single point of failure that was expensive and required specialized skills to
optimize and maintain the configuration. Additionally, the mailbox databases on
disks remained single points of failure.
To break through the old limitations, Microsoft IT defined the following storage
design requirements for Exchange Server 2007:
- Continue to maintain 99.99 percent availability at the service level.
- Increase Mailbox server resilience by removing single points of failure in the storage
subsystem and its components.
- Reduce storage infrastructure costs and increase mailbox quotas from 200 MB to 500 MB
and 2 GB depending on the type of mailboxes.
- Increase deleted items' retention from three days to 14 days to enable users to
recover deleted mail items within a larger time window without the necessity of
restoring from backup.
Eliminating Storage as the Single Point of Failure
Exchange Server 2007 supports two different clustering techniques for Mailbox
server configurations: single copy cluster (SCC) and cluster continuous replication
(CCR). Both types rely on Windows Clustering, yet only CCR provides redundancy at
the storage level by replicating mailbox data from one server to another at the
storage group level through a mechanism commonly known as asynchronous log shipping.
SCC is comparable to an Exchange Server 2003 clustering configuration where
all cluster nodes use a shared storage subsystem for quorum resource, mailbox databases,
and transaction logs. Because only CCR eliminates the mailbox store as a single
point of failure, and because CCR does not impose any specialized
hardware or storage requirements, Microsoft IT decided to use this technology as
the core of its Mailbox server designs to increase Mailbox server resilience against
storage-level failures.
To use server hardware efficiently while providing the required redundancy level
through CCR, Microsoft IT implemented two-node Majority Node Set (MNS) server clusters
with file-share witness. Each cluster node stores a copy of the MNS quorum on the
local system drive and keeps it synchronized with the other node. The file-share
witness feature enables the use of a file share that is external to the cluster
as an additional vote to determine the status of the cluster in a two-node MNS quorum
cluster deployment. This helps to avoid an occurrence of network partition within
the cluster, also known as split-brain syndrome. Split-brain syndrome occurs when
all networks designated to carry internal cluster communications fail, and nodes
cannot receive heartbeat signals from each other. To enable the file-share witness
feature, Microsoft IT specifies a file share on a Hub Transport server in the MNSFileShare
property of the MNS resource configuration.
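The role of the file-share witness as a tie-breaking vote can be illustrated with a simplified majority-vote model. This sketch only demonstrates the voting idea; it is not how the Windows cluster service is implemented, and the function name is hypothetical.

```python
# Simplified majority-vote model of a two-node MNS cluster with a
# file-share witness. Illustration only; not the cluster service logic.

def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition may keep running only with a strict majority of votes."""
    return reachable_votes > total_votes // 2


# Two nodes, no witness: after a network partition each node sees only
# its own vote, so neither side holds a majority.
assert not has_quorum(reachable_votes=1, total_votes=2)

# Two nodes plus witness: the node that can still reach the file share
# holds 2 of 3 votes and stays online; the isolated node holds 1 of 3.
assert has_quorum(reachable_votes=2, total_votes=3)
assert not has_quorum(reachable_votes=1, total_votes=3)
```

Because only one side of a partition can hold two of the three votes, both nodes can never consider themselves active at the same time.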
Figure 11 illustrates the Microsoft IT CCR configuration. Both CCR cluster nodes
and the file-share witness must exist in the same Active Directory site. Microsoft
IT did not deploy geographically dispersed clusters during the initial production
rollout of Exchange Server 2007 to avoid the complexities of building IP subnets
and Active Directory sites that span multiple geographic locations, which would
be a requirement for distributed Windows Server 2003 cluster nodes.
Figure 11. Microsoft IT CCR configuration
As depicted in Figure 11, Microsoft IT uses three network adapters in each cluster
node. The first two adapters connect the cluster node in a NIC teaming configuration
through separate Gigabit Ethernet switches to the public network, which Microsoft
Office Outlook clients and Exchange Server computers use to communicate with the
Mailbox server. The Microsoft Exchange Replication Service on the cluster nodes
also uses the public network connection to replicate the mailbox store databases
from active node to passive node by using transaction log shipping so the mailbox
store data is readily available on the passive node if the active node fails. The
third network adapter establishes the private network connection between the cluster
nodes to communicate cluster heartbeats.
Another important component that is necessary to ensure high availability in the
CCR-based Mailbox server configuration is the transport dumpster feature on Hub
Transport servers. The transport dumpster is a feature that allows messages to be redelivered
to a continuous replication-enabled storage group in the event that the storage
group experiences a lossy failure. It is important to ensure that Hub Transport
servers have the appropriate capacity to handle a transport dumpster. Clustered
Mailbox servers might request redelivery if a failover occurs to a passive node
before CCR has copied the most recent transactions. To configure the transport dumpster,
Microsoft IT uses the Set-TransportConfig cmdlet with the following parameters:
- MaxDumpsterSizePerStorageGroup This parameter specifies
the maximum size of the transport dumpster queue per storage group. Microsoft IT
specifies 15 MB for this parameter.
- MaxDumpsterTime This parameter specifies how long the Hub
Transport server can retain messages in the transport dumpster queue. Microsoft IT
uses a value of 07.00:00:00, which corresponds to seven days.
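To get a feel for the Hub Transport capacity that these settings imply, the following sketch multiplies the per-storage-group limit by the number of storage groups. It assumes that a Hub Transport server may need to hold a dumpster queue for every storage group in its site; the number of clustered Mailbox servers per site is a hypothetical input, and the function name is illustrative.

```python
# Rough transport dumpster sizing. The 15 MB limit and the 42 storage
# groups per Mailbox server come from the text; the number of clustered
# Mailbox servers in the site is a hypothetical input.

MAX_DUMPSTER_MB_PER_STORAGE_GROUP = 15
STORAGE_GROUPS_PER_MAILBOX_SERVER = 42


def dumpster_capacity_mb(mailbox_servers_in_site: int) -> int:
    """Upper bound on dumpster space one Hub Transport server may need."""
    storage_groups = mailbox_servers_in_site * STORAGE_GROUPS_PER_MAILBOX_SERVER
    return storage_groups * MAX_DUMPSTER_MB_PER_STORAGE_GROUP


print(dumpster_capacity_mb(1))    # one 6,000-user Mailbox server
print(dumpster_capacity_mb(10))   # ten such servers (hypothetical site)
```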
Note: Microsoft IT uses the same hardware configuration on active and passive
CCR cluster nodes to maintain the same performance level after a failover to the
passive node.
Reducing Storage Costs and Configuration Complexities
The new Mailbox server design with CCR required Microsoft IT to double the storage
and the storage arrays in order to remove the data and storage points of failure.
Microsoft IT looked into the cost of deploying Mailbox servers with SAN technology
and determined that CCR with SAN-based storage would require redundant SAN environments,
which were cost prohibitive. Instead, Microsoft IT decided to use direct attached
storage, which met the requirements, reduced costs, and reduced operational complexity
for Microsoft IT.
With direct attached storage, Microsoft IT eliminated the need to use SAN, Internet
SCSI (iSCSI), or other shared storage technologies in the cluster configuration.
Every cluster node can use its own direct attached storage subsystem to maintain
a separate copy of mailbox databases, which also helps to improve the failover behavior,
because a server failure due to hardware or software problems that affect a database
on the first node is unlikely to affect the recovery operation on the second node.
By removing single points of failure that existed in the Exchange Server 2003
architecture and by using improved failover behavior in the Mailbox server configuration,
Microsoft IT felt confident in making the switch from SAN to direct attached storage
(DAS) as the basis of its server architectures in the Exchange Server 2007-based
messaging environment.
Optimizing the Storage Design for Reliability and Recoverability
Although CCR is an effective high-availability feature, it is important to recognize
that the technology does not eliminate the need for reliability and recoverability
provisions at the storage and server levels. For example, failing over an entire
cluster with thousands of mailboxes between the active and the passive nodes might
not be an effective measure if only a single disk in a disk array experienced a
failure or if the transaction log volume is running short of disk space.
Microsoft IT uses the following features to ensure reliability and recoverability
at the storage level:
- Redundant array of independent disks (RAID) Microsoft IT
uses stripe sets of mirrored disks according to RAID level 1+0, also called RAID
10, to achieve optimum disk drive performance and fault tolerance at the physical
disk level. In the new server design for 6,000 mailboxes, the RAID 10 drives for
transaction logs use eight disks, which means that the transaction log drives can
tolerate a maximum of four disk failures, and the I/O performance only drops by
25 percent in the event of a single disk failure. The RAID 10 drives for database
files use 14 disks in the Mailbox server design for 6,000 users, which corresponds
to an even higher level of resilience to disk failures. A maximum of seven disks
can fail, and in the event of a single disk failure, the I/O performance only drops
by roughly 15 percent.
Note: Microsoft IT does not use a hot spare in the RAID configuration, because
most Microsoft data centers are staffed 24 hours, seven days per week. In the event
of a disk failure, an IT specialist is readily available to replace the affected
disk with minimum delay.
- Separate transaction logs from database files Placing transaction
log files and database files on separate physical drives ensures that recent transactions
are still available in the transaction logs even if the database RAID group fails.
Separating transaction logs from database files is one of the most basic strategies
to improve the performance and fault tolerance of any version of Exchange Server.
- No circular logging on Mailbox servers In a CCR server cluster,
when circular logging is enabled, the process will not expunge the transaction log
until it has been replicated and replayed into the target database.
- Configure multiple storage groups per Mailbox server To
ensure timely backup and restore operations according to SLA requirements, which
require backup operations to be complete within four hours, Microsoft IT wanted
to keep the size of individual databases per Mailbox server below 200 GB and implement
a schedule with weekly full and daily incremental backups, as explained later in
this paper. Keeping individual database file sizes
below 200 GB also helps Microsoft IT complete online maintenance cycles
regularly, which is critical for keeping the Mailbox server healthy, the system
performance stable, and the sizes of database files under control. Because the CCR
configuration only supports one mailbox database per storage group, Microsoft IT
must configure multiple storage groups per Mailbox server. For example, in the Mailbox
server design for 6,000 users, Microsoft IT
uses 42 storage groups with
approximately 143 mailboxes in each.
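The RAID 10 resilience figures quoted in the list above can be reproduced with a simple model. This sketch assumes, as the percentages in the text imply, that a failed disk degrades one mirror and that each mirror carries an equal share of the stripe set's I/O. The function names are illustrative.

```python
# RAID 10 resilience figures under a simplified model: one failed disk
# degrades one mirror, and every mirror carries an equal share of I/O.

def raid10_max_tolerated_failures(disks: int) -> int:
    """Best case: exactly one failed disk in every mirror pair."""
    return disks // 2


def raid10_single_failure_drop(disks: int) -> float:
    """Fraction of I/O performance lost when a single disk fails."""
    mirrors = disks // 2
    return 1 / mirrors


for label, disks in (("transaction log drive", 8), ("database drive", 14)):
    print(f"{label}: {disks} disks, "
          f"up to {raid10_max_tolerated_failures(disks)} failures, "
          f"{raid10_single_failure_drop(disks):.0%} drop on one failure")
```

Note that the maximum number of tolerated failures is a best case: a second failure in a mirror that has already lost one disk takes the whole stripe set offline.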
Standardizing the Storage Design
Microsoft IT greatly benefits from a standardized storage layout for all Mailbox
servers. Mailbox servers host up to 6,000 mailboxes in 42 separate storage groups
that require identical storage group and database locations on all CCR cluster nodes.
Standardizing the storage design for the CCR server clusters helps Microsoft IT
prevent human errors in disaster-recovery situations, simplify hardware installation
and configuration tasks, accelerate server deployments, and ultimately save operational
costs.
To standardize the storage layout for Exchange Server 2007, ensure reliability,
and provide scalability, Microsoft IT developed the concept of a
universal storage building block (USBB). A USBB is
a self-contained unit of two physical storage enclosures, combined to provide database
and transaction log drives with the necessary level of hardware component redundancy.
The number of disk drives—identified through individual logical unit numbers (LUNs)
at the small computer system interface (SCSI) level—that Microsoft IT can use in
a USBB depends on the capacity of the storage enclosures and the required number
of LUNs per RAID drive to provide the desired data capacity and I/O performance.
The USBB count per server depends on the number of mailboxes and the mailbox quotas
that Microsoft IT wants to maintain on the server.
The standardized storage layout based on USBBs provides Microsoft IT with the benefits
of a flexible and scalable server design. For example, in the Mailbox server design
for 6,000 users, Microsoft IT uses storage enclosures with 25 small form factor
(SFF) disks (146-GB 10,000-RPM serial attached SCSI [SAS] 2.5-inch SFF). Microsoft
IT prefers small form factor disks because of the lower power requirements, lower
cost, and higher performance and reliability in comparison to the large form factor
3.5-inch disks. Microsoft IT configures the hardware RAID controller in such a way
that it mirrors the disks across the two storage enclosures, and then includes these
mirrors in stripe sets to create the desired number of RAID 10 drives. Seen from
an individual node level, one RAID controller introduces a single point of failure
to the configuration. Yet, Microsoft IT did not find it necessary to install redundant
RAID controllers on a single node because the CCR cluster as a whole eliminates
the storage subsystem as a single point of failure.
Figure 12 shows a USBB configuration with 25 disks per storage enclosure, three
RAID 10 LUNs for the databases, and one RAID 10 LUN for all transaction logs combined.
Microsoft IT uses this particular USBB configuration in the Mailbox server design
for 6,000 users by attaching multiple instances of it to the same server to achieve
desired scale.
Figure 12. Universal storage building block design
To determine the required number of disks per database LUN, Microsoft IT considered
the following three factors:
- Backup schedule To facilitate a backup schedule of weekly
full and daily incremental streaming backups, Microsoft IT places seven storage
groups on each database LUN, also known as the "2 LUNs per Backup Set"
model. Because the CCR configuration requires one mailbox database per storage group,
each database LUN must accommodate seven database files. For more information about
the 2 LUNs per Backup Set model, see
http://technet.microsoft.com/en-us/library/bb738147.aspx.
- Mailbox database capacity requirements As explained earlier,
Microsoft IT distributes mailboxes across 42 storage groups to keep the size of
individual database files below 200 GB. In a Mailbox server configuration for 6,000
users, this means that each database file must host 143 mailboxes. Assuming a maximum
mailbox size of 500 MB plus 69 percent overhead (54 percent database overhead, 5
percent content indexing overhead, and 10 percent reserve for unexpected database
growth), an individual database can grow up to approximately 120 GB (500 MB * 143
* 171% = 119.4 GB). It follows that an individual database drive must have a minimum
capacity of 840 GB to store seven mailbox databases. In a RAID 10 configuration
with 146-GB SFF disks, 14 disks are required to provide the necessary storage capacity,
because each individual mirror in the stripe set only has a usable capacity of 137
GB (840 GB / 137 GB = 6.13, which means 7 mirrors are required per stripe set *
2 disks per mirror = 14 disks per stripe set).
Note: Microsoft IT determined the database overhead factor of 54 percent
based on the average message traffic per user in the corporate production environment,
the desired deleted items retention time of 14 days, and internal database overhead.
The average message traffic per user (both sent and received items) is approximately
10 MB per day, which means that the dumpster for the deleted items must be able
to hold 140 MB per user (14 days * 10 MB per day = 140 MB). In addition, Microsoft
IT takes into account 20 percent overhead based on the maximum mailbox and dumpster
size to accommodate internal database structures, table indexes, and so forth. Accordingly,
the total database overhead is approximately 54 percent of the maximum mailbox size
([140 MB + (500 MB + 140 MB) * 20%] / 500 MB = 53.6%).
- Input/output performance To achieve optimal Mailbox server
response times, the storage subsystem must be able to sustain the load that users
generate in terms of IOPS without creating a bottleneck. A single 146-GB SFF 10,000-RPM
SAS disk can perform approximately 160 IOPS with response times of less than 20
milliseconds. The question now is how many disks are necessary to satisfy the I/O
load that 6,000 concurrent users place on a Mailbox server. The answer depends on
the usage patterns of the users and the hardware configuration of the server.
To determine the I/O load, Microsoft IT continuously monitors the Disk Transfers/sec,
Disk Reads/sec, and Disk Writes/sec performance counters on Mailbox servers in the
corporate production environment. Values measured since the initial rollout show
that Microsoft employees are generally heavy users who generate approximately 0.4
IOPS per user (Redmond full-time employees) on Exchange Server 2007 with a read/write
mix of 1/1. However, these values come from Mailbox servers with 5 MB of memory
per user. It is important to remember that the new Mailbox server design has 60
percent less memory (2 MB per user) to cache mailbox data. Due to the smaller data
cache, read operations will increase and the read/write mix will change to 2/1.
Accordingly, Microsoft IT expects a rise in read activities to a range between 0.35
and 0.7 IOPS. This range of IOPS represents a dramatic reduction in I/O requirements
on a Mailbox server running Exchange Server 2007. It provided Microsoft IT
with enough headroom to balance the I/O load against server memory in order to support
greater numbers of users per server.
Supporting approximately 1,000 users per database LUN (143 users per storage group
* 7 storage groups = 1,001 users) with 0.7 IOPS in a 2/1 read/write mix means the
RAID controller receives 468 read and 234 write requests per second. Corresponding
to the RAID 10 configuration, each write request equals two write operations behind
the controller, so the database LUN must be able to perform 468 read and 468 write
requests per second, or 936 IOPS in total. With 160 IOPS per disk, only six disks
are necessary per database drive to reach the required performance level (936 /
160 = 5.8). The USBB design uses 14 disks per database LUN because of capacity requirements
and does not create a performance bottleneck under normal operating conditions.
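The capacity and I/O calculations above can be combined into one sizing check: the disk count per database LUN is the larger of the capacity-driven and the IOPS-driven requirement. The following sketch uses the figures from the text; the function and constant names are illustrative.

```python
import math

# Inputs taken from the text.
DATABASES_PER_LUN = 7
DATABASE_SIZE_GB = 120          # approximate maximum size per database
USABLE_GB_PER_MIRROR = 137      # usable capacity of one 146-GB mirror pair
USERS_PER_LUN = 143 * 7         # 1,001 users
IOPS_PER_USER = 0.7             # upper end of the expected range
READ_SHARE = 2 / 3              # 2/1 read/write mix
IOPS_PER_DISK = 160


def disks_for_capacity() -> int:
    """Disks needed so that one LUN can hold seven databases."""
    required_gb = DATABASES_PER_LUN * DATABASE_SIZE_GB
    mirrors = math.ceil(required_gb / USABLE_GB_PER_MIRROR)
    return mirrors * 2


def disks_for_iops() -> int:
    """Disks needed to sustain the expected user load."""
    front_end = USERS_PER_LUN * IOPS_PER_USER
    reads = front_end * READ_SHARE
    writes = front_end - reads
    back_end = reads + 2 * writes   # RAID 10 performs every write twice
    return math.ceil(back_end / IOPS_PER_DISK)


capacity_disks = disks_for_capacity()
iops_disks = disks_for_iops()
print(capacity_disks, iops_disks, max(capacity_disks, iops_disks))
```

Capacity is the binding constraint in this design, which leaves I/O headroom on each database LUN.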
Having ensured that the standardized USBB design provides the required storage capacities
and I/O performance levels, it is straightforward for Microsoft IT to design the
overall storage subsystem. In the Mailbox server design for 6,000 users, each USBB
stores 3,000 mailboxes (3 database LUNs * 1,000 mailboxes per LUN = 3,000 mailboxes),
so the Mailbox server needs two USBBs per cluster node with corresponding drive
letter assignments, as depicted in Figure 13.
Figure 13. Microsoft IT storage design for Mailbox servers (6,000 users)
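As a consistency check on the USBB layout, the following sketch adds up the disks and mailboxes per building block from the figures given in the text. It assumes that the single transaction log LUN in each USBB is the eight-disk RAID 10 drive described earlier; the names are illustrative.

```python
import math

DISKS_PER_ENCLOSURE = 25
ENCLOSURES_PER_USBB = 2
DATABASE_LUNS_PER_USBB = 3
DISKS_PER_DATABASE_LUN = 14
DISKS_PER_LOG_LUN = 8           # assumption: the eight-disk log drive
STORAGE_GROUPS_PER_LUN = 7
MAILBOXES_PER_STORAGE_GROUP = 143

disks_used = DATABASE_LUNS_PER_USBB * DISKS_PER_DATABASE_LUN + DISKS_PER_LOG_LUN
disks_available = DISKS_PER_ENCLOSURE * ENCLOSURES_PER_USBB
mailboxes_per_usbb = (DATABASE_LUNS_PER_USBB * STORAGE_GROUPS_PER_LUN
                      * MAILBOXES_PER_STORAGE_GROUP)


def usbbs_needed(mailboxes: int) -> int:
    """Number of USBBs per cluster node for a given mailbox count."""
    return math.ceil(mailboxes / mailboxes_per_usbb)


usbbs = usbbs_needed(6000)
storage_groups = usbbs * DATABASE_LUNS_PER_USBB * STORAGE_GROUPS_PER_LUN
print(disks_used, disks_available, mailboxes_per_usbb, usbbs, storage_groups)
```

Under these assumptions the three database LUNs and one log LUN fill both 25-disk enclosures exactly, and two USBBs yield the 42 storage groups of the 6,000-user design.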
Backup and Recovery
In addition to eliminating single points of failure at the Mailbox server and storage
levels, Microsoft IT continues to use backups to protect against loss or damage
of data. Backups provide an additional measure of protection in the event of a single-node
failure. If one node in a two-node CCR cluster fails, only one node remains with
the data until Microsoft IT repairs the databases on the affected node. Especially
in this situation, database backups provide necessary redundancy. If the second
node fails before the first node is restored or in the unlikely event that both
nodes fail simultaneously, backups might provide the last resort to recover the
data.
For the rollout of Exchange Server 2007 in the corporate production environment,
Microsoft IT defined the following backup and recovery requirements:
- Support mailbox capacities of 500 MB and 2 GB, depending on the Mailbox server design.
- Reduce backup costs by eliminating tape as the backup media for Exchange data.
- Complete server backup operations within four hours and database restore operations
within one hour, according to existing SLAs.
- Maintain 14 days' worth of database backups.
Performing VSS-Based Backups on Passive Node
The backup solution that Microsoft IT had to use during the initial production rollout
of Exchange Server 2007 relied on the Windows Backup utility (NTBackup.exe)
to perform streaming online backups controlled through command-line scripts. Microsoft
IT would have preferred to use Volume Shadow Copy Service (VSS) technology instead,
because VSS provides the ability to offload backup processes to the passive node,
yet during the initial deployment time frame, VSS-based solutions for Exchange Server 2007
were not available.
The disadvantage of performing streaming backups is that backup processes must run
on the active node in a CCR server cluster, which limits the backup window. For
example, to minimize the impact of backup operations on user processes and online maintenance, Microsoft IT requires
backup cycles to be complete within four hours, between 8 P.M. and midnight. This
represents a considerable challenge, because Microsoft IT designed the Mailbox servers
to host up to 10 terabytes of messaging data on some servers.
Microsoft IT plans to move to software-based VSS backups as soon as Microsoft System
Center Data Protection Manager (DPM) 2007 becomes available to perform backups
on the passive nodes, as illustrated in Figure 14. This allows backup operations
at much more frequent intervals than streaming backups on active nodes. In fact,
using DPM 2007, Microsoft IT can copy transaction logs as often as every 15 minutes
to the DPM server. Accordingly, Microsoft IT gains the ability to restore Exchange
Server data to any 15-minute point in time to the original server or to a different
server. This is the backup solution Microsoft IT needs to go beyond current Mailbox
server sizes in terms of numbers of users and gigabytes per mailbox in the corporate
production environment.
Figure 14. The future Microsoft IT backup solution for Exchange Server 2007
Eliminating Backups to Tape
Despite the limitations of the streaming backup solution, Exchange Server 2007
enabled Microsoft IT to achieve improvements in the backup design. Specifically,
Microsoft IT eliminated tape backups and their associated operational costs. This
was possible because Microsoft IT is not required to keep data on tape for long-term
archiving. CCR helped Microsoft IT reduce the dependency on backup, because each
Mailbox server already maintains two copies of the mailbox data. With the lack of
strict "backup to tape" requirements, Microsoft IT saw an opportunity to implement
a more reliable and cost-effective solution by performing backups to disk.
To provide the required backup storage, Microsoft IT uses a separate RAID controller
with a set of 500-GB and 750-GB Serial Advanced Technology Attachment (SATA) disks
in a RAID 5 configuration on the active node. This RAID 5 drive provides enough
capacity to store 14 days' worth of online database backups so that Microsoft IT
always has two full backups available. As illustrated in Figure 15, the passive
node does not have this extra storage, because Microsoft IT cannot perform online
backup operations on the passive node until the DPM 2007-based backup solution
is available. During the transition to DPM 2007, Microsoft IT plans to move
the local backup storage and reuse the disks on the DPM host.
Note: The current architecture introduces another
single point of failure if both active and copy databases are lost and the disk
array housing the backups is lost.
Figure 15. Disk-based backup storage for streaming online backups
Moving from backup tapes to disk-based backup storage provided Microsoft IT with
the following advantages:
- Reduced backup costs and complexities Eliminating backup
tapes alone enabled Microsoft IT to reduce costs by approximately $5 million per
year. Microsoft IT achieves further cost savings by removing tape systems from the
data protection infrastructure and reducing maintenance complexities.
- Increased reliability of restore operations Microsoft IT
operational statistics show a 17 percent annual failure rate for tape drives. By
moving to disk-based backup storage, Microsoft IT can reduce the likelihood of restore
failures during recovery operations due to unreadable backup media or corrupted
backup catalogs.
- Increased performance and throughput Tape-based restores
require time for locating and mounting the required backup media and building indexes.
Disk-based backup storage provides much quicker access and faster I/O performance
during recovery operations.
Optimizing Backup Cycles According to SLAs
Prior to Exchange Server 2007, Microsoft IT performed daily full database backups.
This was possible because mailbox quotas were limited to 200 MB. An Exchange 2003
server with 4,000 mailboxes would host approximately 1 terabyte of messaging data.
Using Windows Backup in non-buffered I/O mode to bypass Windows Cache Manager and
backing up four storage groups in parallel enabled Microsoft IT to complete backup
cycles within the SLA-prescribed time frame of four hours (see the Technical Solution
Brief "Achieving High Availability with Exchange Server at Microsoft," available
at http://technet.microsoft.com/en-us/library/bb735154.aspx).
However, with increased mailbox quotas of 500 MB and 2 GB and up to 6,000 mailboxes
per Mailbox server running Exchange Server 2007, data volumes would overtax
the existing backup processes.
Being unable to deploy a VSS-based backup solution during the initial production
rollout, Microsoft IT decided to switch from daily full to daily incremental and
weekly full database backups. This approach enabled Microsoft IT to reduce the daily
backup volume in order to stay within the required four-hour backup window. On the
downside, however, daily incremental and weekly full backups complicate restore
processes. Recovering a database now requires restoring the last full backup and
all incremental backups since the last full backup, which takes more time and requires
multiple restore operations to accommodate the full and incremental backup restorations.
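The effect of the schedule change on the daily backup volume can be estimated as follows. This is a rough sketch for the 6,000-user design with 42 databases of about 120 GB each. The size of an incremental backup is an assumption (one day of message traffic for 143 mailboxes at 10 MB each); actual transaction log volumes will differ. All names are illustrative.

```python
DATABASES = 42
DATABASE_SIZE_GB = 120          # approximate maximum from the text
BACKUP_WINDOW_HOURS = 4
# Assumption: each incremental holds about one day of message traffic,
# 143 mailboxes * 10 MB per day. Real log volume will differ.
INCREMENTAL_GB_PER_DATABASE = 143 * 10 / 1024


def daily_full_volume_gb() -> float:
    """Every database receives a full backup every day."""
    return DATABASES * DATABASE_SIZE_GB


def staggered_volume_gb() -> float:
    """One seventh of the databases get a full backup, the rest incremental."""
    fulls_per_day = DATABASES / 7
    incrementals_per_day = DATABASES - fulls_per_day
    return (fulls_per_day * DATABASE_SIZE_GB
            + incrementals_per_day * INCREMENTAL_GB_PER_DATABASE)


def required_mb_per_second(volume_gb: float) -> float:
    """Sustained throughput needed to finish within the backup window."""
    return volume_gb * 1024 / (BACKUP_WINDOW_HOURS * 3600)


for name, volume in (("daily full", daily_full_volume_gb()),
                     ("weekly full, daily incremental", staggered_volume_gb())):
    print(f"{name}: {volume:.0f} GB, {required_mb_per_second(volume):.0f} MB/s")
```

Under these assumptions the staggered schedule reduces the sustained throughput needed in the four-hour window to less than one sixth of what daily full backups would require.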
To stagger the full and incremental online backups across a seven-day period, Microsoft
IT places seven storage groups on each database LUN on the Mailbox servers. Streaming
technology supports backing up mailbox databases in different storage groups in
parallel. Using a separate backup session also enables Microsoft IT to perform full
backups for each mailbox database on a different weekday, as summarized in Table
4. According to the business requirement to maintain 14 days' worth of database
backups, the backup storage provides sufficient capacity to store two full backups
and 12 incremental backups of the Mailbox server's messaging databases.
Table 4. Legacy Streaming Backup Schedule per Database LUN
| Storage group | Mon  | Tue  | Wed  | Thu  | Fri  | Sat  | Sun  |
| SG 1          | Full | Inc  | Inc  | Inc  | Inc  | Inc  | Inc  |
| SG 2          | Inc  | Full | Inc  | Inc  | Inc  | Inc  | Inc  |
| SG 3          | Inc  | Inc  | Full | Inc  | Inc  | Inc  | Inc  |
| SG 4          | Inc  | Inc  | Inc  | Full | Inc  | Inc  | Inc  |
| SG 5          | Inc  | Inc  | Inc  | Inc  | Full | Inc  | Inc  |
| SG 6          | Inc  | Inc  | Inc  | Inc  | Inc  | Full | Inc  |
| SG 7          | Inc  | Inc  | Inc  | Inc  | Inc  | Inc  | Full |
Note: Switching to disk-based backup storage and keeping mailbox database sizes
below 200 GB were crucial measures to ensure that restores of individual databases
from backups could still be completed within the allowed one-hour time window, even
though CCR technology greatly reduces the need for restore operations from backups.
In case of database corruption or hardware failures on a single node, Microsoft IT
does not need to perform restores from backup. Instead, Microsoft IT simply reseeds
the databases on the restored cluster node from the second cluster node that still
has a healthy database copy.
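The staggered rotation in Table 4 follows a simple rule: storage group n on a database LUN receives its full backup on the n-th weekday and incremental backups on all other days. The following sketch generates that schedule and counts how many incremental backups a restore would have to replay. The function names are illustrative, and the restore count assumes the backup for the day of failure has already completed.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def backup_schedule(storage_groups: int = 7) -> dict[str, list[str]]:
    """Storage group n gets its full backup on the n-th weekday."""
    schedule = {}
    for sg in range(storage_groups):
        schedule[f"SG {sg + 1}"] = [
            "Full" if day == sg % len(DAYS) else "Inc"
            for day in range(len(DAYS))
        ]
    return schedule


def incrementals_to_restore(sg_index: int, failure_day: int) -> int:
    """Incremental backups to replay after the most recent full backup.

    Both arguments are zero-based (SG 1 -> 0, Mon -> 0). Assumes the
    backup for failure_day has already completed.
    """
    return (failure_day - sg_index) % len(DAYS)


for name, row in backup_schedule().items():
    print(name, " ".join(f"{kind:<4}" for kind in row))
```

In the worst case a restore requires the last full backup plus six incremental backups, which is the added restore complexity that the switch from daily full backups introduces.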
Client Access Server Topology
Reliability and performance of Mailbox servers are crucial for the availability
and quality of messaging services in an Exchange Server 2007 organization.
Another important component is the Client Access server, which provides access to
messaging items in mailboxes, availability information, and address book data in
a number of scenarios. For example, users might work with Office Outlook Web Access
in a Web browser session or synchronize mobile devices by using the Exchange ActiveSync
protocol. Users can also work with the full Office Outlook 2003 or Outlook 2007
client over Internet connections, accessing their mailboxes through RPC over HTTPS
(Outlook Anywhere). Office Outlook 2007 clients specifically communicate with
Client Access servers in a number of additional scenarios, such as to retrieve profile
configuration settings by using the Autodiscover service, checking free/busy data
by using the Availability service (which is part of Exchange Web Services), and
downloading Offline Address Book (OAB) from a virtual directory on a Client Access
server. In all these cases, users communicate with a Client Access server, which
in turn communicates with the Mailbox server by using the native Exchange Server
Messaging Application Programming Interface (MAPI).
To provide Microsoft employees with reliable access to their mailboxes from practically
any location with network access, Microsoft IT defined the following requirements
for the Client Access server deployment:
- Establish flexible and scalable client access points for each geographic region
that can accommodate the existing mobile user population at Microsoft and large
spikes in mobile messaging activity.
- Preserve common URL namespaces (such as https://mail.microsoft.com) that were established
during Exchange Server 2003 rollout for all mobile messaging clients within
each individual geographical region.
- Deploy all mobile messaging services on a common standardized Client Access server
platform.
- Maintain security for mobile messaging client access from the Internet by using
the capabilities of ISA Server 2006.
- Provide seamless backward compatibility with mailboxes still on the Exchange Server 2003
platform during transition and provide support for cross-forest access to enable
availability of free/busy information.
Preserving Existing Namespaces for Mobile Access to Messaging Data
Each month, Client Access servers in the corporate production environment support
approximately 60,000 unique Office Outlook Web Access users, 60,000 Outlook Anywhere
connections, and 30,000 Exchange ActiveSync sessions. To distribute this load, Microsoft
IT uses multiple URL namespaces according to geographic regions. Microsoft IT established
this topology during the Exchange 2000 Server time frame, which means that
Microsoft employees became accustomed to these URLs over many years. Preserving
these URLs during the transition to Exchange Server 2007 and providing uninterrupted
mobile messaging services through these URLs was correspondingly an important objective
for the production rollout.
To preserve the existing URL namespaces, Microsoft IT devised the following strategy:
- Deploy the Exchange Server 2007 Client Access servers in the corporate production
environment in each data center/Active Directory site where Exchange Server 2007
Mailbox servers were planned.
- Test access to Exchange Server 2003 and Exchange Server 2007 resources
through Client Access servers in all locations by manually pointing clients to them.
- Switch mobile messaging namespaces for each regional location in DNS to point to
the Client Access servers.
- Move mailboxes from legacy Exchange 2003 servers to new Exchange 2007
servers and enable new mobile messaging features for transitioned users.
As explained in the "Message Routing Topology"
section earlier in this white paper, Exchange Server 2007 extensively uses
the concept of Active Directory sites, for example to define logical boundaries for
message routing and server-to-server communications. For client access scenarios,
this means that each Active Directory site with Mailbox servers must also include
Client Access servers to ensure a fully functional messaging system. Accordingly,
Microsoft IT deployed Client Access servers locally in the data centers of
Dublin, Sao Paulo, Singapore, and Redmond. As illustrated in Figure 16, Microsoft
IT heavily focused the deployment on dedicated servers with varying Mailbox-to-Client
Access server ratios in order to establish flexible and scalable messaging services
that can accommodate large spikes in user activity. In Sao Paulo only, Microsoft
IT deployed a multiple-role server hosting Hub Transport and Unified Messaging server
roles in addition to the Client Access server role due to the moderate number of
users in the South American region.
Figure 16. Global Client Access server deployment
Client Access servers only communicate directly with Exchange Server 2007 Mailbox
servers in their local Active Directory site. For requests to Mailbox servers in
remote Active Directory sites, Client Access servers must proxy or redirect the
request to a Client Access server that is local to the target Mailbox server. Microsoft
IT prefers to redirect Office Outlook Web Access users. Keeping client connections
local within each geographic region minimizes the impact of network latencies between
the Client Access server and Mailbox server on client performance and mitigates
the risk of exhausting thread pools and available connections on central Client
Access servers during large spikes in messaging activity. To redirect Office Outlook
Web Access users to Client Access servers that are local to the user's Mailbox server,
Microsoft IT registers the relevant external URL on each Internet-facing Client
Access server in the ExternalURL property for all Office Outlook Web Access
virtual directories.
Redirection is not available for other services, such as Exchange ActiveSync, Exchange
Web Services, and Outlook Anywhere. To support these clients, remote Client Access
servers act as proxy servers for local Client Access servers (Exchange ActiveSync,
Exchange Web Services) or communicate with Mailbox servers directly by using RPCs
(Outlook Anywhere). However, Microsoft employees know and prefer to use their local
Internet access points.
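The handling of requests for mailboxes in a remote Active Directory site, as described above, can be summarized in a short Python sketch. This is an illustration only: the names Handling, REMOTE_SITE_HANDLING, and plan_request are hypothetical, the service keys are arbitrary labels, and the redirect choice for Office Outlook Web Access reflects the Microsoft IT preference stated in this paper rather than a product constraint.

```python
from enum import Enum


class Handling(Enum):
    LOCAL = "serve from a Mailbox server in the local site"
    REDIRECT = "redirect the client to a Client Access server in the mailbox site"
    PROXY = "proxy to a Client Access server in the mailbox site"
    DIRECT_RPC = "contact the remote Mailbox server directly over RPC"


# How a Client Access server treats a request whose target mailbox
# lives in a different Active Directory site (per the text above).
REMOTE_SITE_HANDLING = {
    "owa": Handling.REDIRECT,             # Microsoft IT preference
    "activesync": Handling.PROXY,
    "ews": Handling.PROXY,
    "outlook_anywhere": Handling.DIRECT_RPC,
}


def plan_request(service: str, cas_site: str, mailbox_site: str) -> Handling:
    """Return how a Client Access server in cas_site handles the request."""
    if service not in REMOTE_SITE_HANDLING:
        raise ValueError(f"unknown service: {service!r}")
    if cas_site == mailbox_site:
        return Handling.LOCAL
    return REMOTE_SITE_HANDLING[service]
```

For example, plan_request("owa", "ADSITE_DUBLIN", "ADSITE_SINGAPORE") yields Handling.REDIRECT, whereas the same call for "activesync" yields Handling.PROXY.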
Note: An exception to the distributed approach to mobile messaging services
at Microsoft is the Internet access point for the Autodiscover service used for
automatic profile configuration of clients such as Outlook 2007. Microsoft
IT provides centralized access to the Autodiscover service because of its reliance
on the primary SMTP address of the users. All Microsoft users in the Corporate Forest
have the same SMTP domain name worldwide (that is, @microsoft.com). Outlook 2007
derives the Autodiscover URL from the user's primary e-mail address and attempts
to access the Autodiscover service at https://autodiscover.<SMTP domain>/autodiscover/autodiscover.xml
or, if this URL does not exist, at https://<SMTP domain>/autodiscover/autodiscover.xml.
Microsoft employees working with Outlook 2007 over the Internet connect to
https://autodiscover.microsoft.com/autodiscover/autodiscover.xml regardless of the
user's geographic location.
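The URL derivation described in this note can be sketched in Python as follows. This is a minimal illustration: the function name autodiscover_candidates is hypothetical, and the lookup order simply follows the description in this paper rather than the actual Outlook 2007 implementation.

```python
from typing import List

AUTODISCOVER_PATH = "/autodiscover/autodiscover.xml"


def autodiscover_candidates(primary_smtp_address: str) -> List[str]:
    """Return the Autodiscover URLs to try, in the order described above."""
    local_part, separator, domain = primary_smtp_address.rpartition("@")
    if not separator or not local_part or not domain:
        raise ValueError(f"not an e-mail address: {primary_smtp_address!r}")
    domain = domain.lower()
    return [
        # First attempt: the dedicated autodiscover host in the SMTP domain.
        f"https://autodiscover.{domain}{AUTODISCOVER_PATH}",
        # Fallback: the SMTP domain itself.
        f"https://{domain}{AUTODISCOVER_PATH}",
    ]
```

Because all users in the Corporate Forest share the @microsoft.com SMTP domain, every address produces the same first candidate, which is why a single centralized access point is sufficient.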
Increasing Security Based on ISA Server 2006
To provide adequate protection for Client Access servers from the Internet, Microsoft
IT continues to use Microsoft Internet Security and Acceleration (ISA) Server 2006.
ISA Server 2006 provides many features specifically designed to publish Exchange
Server resources, including mobile messaging scenarios. For example, ISA Server 2006
includes a New Exchange Publishing Rule Wizard that facilitates the configuration
of publishing rules for Office Outlook Web Access, Exchange ActiveSync, and Outlook
Anywhere.
Among other things, ISA Server 2006 provides stateful inspection and application-layer
filtering for mobile messaging connections coming from the Internet. Stateful inspection
enables ISA server to block any traffic that appears out of context, such as requests
to initiate a connection on an established session. Yet, to perform this function,
ISA Server must analyze the payload in data packets, which requires ISA Server 2006
to decrypt the Secure Sockets Layer (SSL) stream. Accordingly, Microsoft IT terminates
SSL connections from the Internet on the ISA server and then reestablishes a new
SSL connection between the ISA server and the Client Access server. This SSL bridging
process enables ISA Server 2006 to filter invalid data packets before the traffic
reaches the Client Access servers while maintaining the confidentiality of client-to-server
communication as it transits both external and internal networks. For Internet connections,
Microsoft IT uses an externally trusted SSL certificate, installed directly on the
ISA servers. For connections between ISA Server 2006 and Exchange 2007
Client Access server, Microsoft IT could have used internally trusted SSL certificates
but chose to use the same externally trusted ones for consistency reasons.
Providing Load Balancing and Fault Tolerance for External Client Connections
ISA Server 2006 features also enable Microsoft IT to establish a highly scalable
load-balancing infrastructure for external mobile messaging clients and avoid complicated
load distribution and client session affinity issues that frequently arise in
scenarios with network address translation (NAT) and load-balancing components that
rely on the source IP address of the connection. Figure 17 illustrates the load-balancing
architecture that Microsoft IT uses for its Client Access server scenarios. The
same architecture is replicated in each of the regional data centers where Exchange 2007
Mailbox and Client Access server infrastructure exists.
Figure 17. Client access architecture for external clients
Microsoft IT uses the following technologies to provide load balancing and fault
tolerance for Internet-based client access:
- Integrated network load balancing for ISA server farm: To
distribute incoming client connections over multiple computers running ISA Server 2006,
Microsoft IT uses integrated NLB in single-affinity mode. Single affinity helps
to maintain the session state by directing subsequent requests from the same client
or a specific client IP address to the same ISA server. Maintaining the session
state is important when using HTTPS connections, because the client can then reuse
the existing SSL session identifier (ID) for multiple requests. If directed to a
different ISA server, the client would have to negotiate a new SSL session ID. Although
this process is transparent to the user, it requires five times the overhead of
reusing the existing SSL session ID. Correspondingly, single affinity reduces
SSL-related performance overhead on the ISA server farm.
- Web server farm load balancing for Client Access servers: Using
the New Server Farm Wizard in ISA Server 2006, Microsoft IT creates a server
farm object that includes the IP addresses of all Client Access servers in the specified
location. The server farm object enables ISA Server to treat all servers in the
farm as a single load-balanced entity. The server farm object also defines a connectivity
verifier for each farm member to determine the state of each individual Client Access
server and exclude from load balancing those servers that are temporarily unavailable.
It is important to note that Web server farm load balancing does not require NLB
on the Client Access servers, yet session affinity requirements remain because ISA
servers and Client Access servers communicate over HTTPS. For published Office Outlook
Web Access paths (/exchange/*, /owa/*, and /public/*), ISA Server 2006 automatically
uses new cookie-based load-balancing methods to direct all requests issued by the
same Web browser session to the same Client Access server. This is important to
ensure that once authenticated against a particular Client Access server the user
is not redirected to another Client Access server in the farm, which would cause
new authentication prompts. The cookie-based load-balancing method of ISA Server 2006
enables Microsoft IT to perform effective load balancing even in scenarios where
multiple clients are hidden behind a NAT device or a Web proxy on the originating
side, which does not expose their individual IP addresses.
For all remaining paths (/Microsoft-Server-ActiveSync/*, /RPC/*, /Autodiscover/*,
/EWS/*, and /UnifiedMessaging/*) that do not rely on the Web browser, ISA Server
uses IP-based load balancing, which is comparable to a single-affinity configuration
in that requests from the same client IP address are sent to the same Client Access
server.
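The choice between the two affinity methods can be illustrated with a small Python sketch. It is not the ISA Server 2006 algorithm: the function names are hypothetical, and the hash-based server selection is only a stand-in that shows the defining property of IP-based affinity, namely that the same client IP address always maps to the same farm member.

```python
import hashlib
from typing import Sequence, Tuple

# Published paths from the text above, lowercased for comparison.
COOKIE_PATHS: Tuple[str, ...] = ("/exchange/", "/owa/", "/public/")
IP_PATHS: Tuple[str, ...] = (
    "/microsoft-server-activesync/", "/rpc/", "/autodiscover/",
    "/ews/", "/unifiedmessaging/",
)


def affinity_method(url_path: str) -> str:
    """Return 'cookie' or 'source-ip' for a published URL path."""
    normalized = url_path.lower().rstrip("/") + "/"
    if any(normalized.startswith(prefix) for prefix in COOKIE_PATHS):
        return "cookie"
    if any(normalized.startswith(prefix) for prefix in IP_PATHS):
        return "source-ip"
    raise ValueError(f"path is not published: {url_path!r}")


def pick_server_by_ip(client_ip: str, servers: Sequence[str]) -> str:
    """Map a client IP address to one farm member deterministically."""
    if not servers:
        raise ValueError("server farm is empty")
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]
```

The sketch also makes the limitation of IP-based affinity visible: all clients behind one NAT device share a source address and therefore land on the same server, which is why the cookie-based method is preferable for browser sessions.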
Providing Load Balancing and Fault Tolerance for Internal Client Connections
Although mobile messaging connections coming from the Internet through the ISA Server 2006
infrastructure do not require NLB on Client Access servers, internal clients, such
as Outlook Web Access clients and Outlook 2007 clients in the Corporate Forest,
can establish HTTPS connections to Client Access servers directly and must be load-balanced.
To bypass ISA Server 2006 within the corporate production environment, Microsoft
IT registered internal Client Access server IP addresses for the URL namespaces
in the internal DNS.
Note: Microsoft IT uses a split DNS configuration to accommodate the registration
of internal Client Access servers, provide load balancing, and provide fault tolerance
for internal client connections.
Figure 18 illustrates the load-balancing configuration that Microsoft IT established
for internal messaging clients. Similar to the ISA server farm, Microsoft IT uses
NLB with single affinity to direct requests from a particular client IP address
always to the same Client Access server to maintain the session state and reuse
SSL session IDs.
Figure 18. Client access architecture for internal clients
Optimizing Offline Address Book Distribution
Exchange Server 2007 introduces a new mechanism to distribute OAB information
to Outlook 2007 clients. The new mechanism uses Windows Background Intelligent
Transfer Service (BITS) over HTTPS connections instead of downloading the OAB files
from a public folder by using MAPI. Especially with large OAB files, BITS provides
significant advantages over the traditional OAB download method that previous versions
of Office Outlook use, because BITS requires less network bandwidth, downloads the
OAB files asynchronously in the background, and can resume file transfers after
network disconnects and computer restarts. Exchange Server uses HTTP as the default
communication method. Microsoft IT uses a trusted SSL certificate and enables SSL
on the appropriate OAB directory for internal clients.
Within the corporate production environment, Microsoft IT generates four different
OABs, one for each geographic region. This approach of generating and distributing
OABs locally helps Microsoft IT to minimize OAB traffic over WAN connections while
providing users with relevant address information in cached or offline mode. Accordingly,
Microsoft IT configured an Exchange 2007 Mailbox server in each geographic
region as the OAB Generation (OABGen) server and local Client Access servers as
hosts to download the regional OABs. The Exchange File Distribution Service, running
on each Client Access server, downloads the OAB in the form of XML files from the
OABGen server into a virtual directory that serves as the distribution point for
Outlook 2007 clients. Outlook 2007 clients can determine the OAB download
URL by querying the Autodiscover service.
Figure 19 illustrates the OAB distribution architecture. Outlook 2007 clients
on the Internet access the OAB virtual directory through ISA Server 2006 by
using an encrypted HTTPS connection. Within the internal network, Outlook 2007
clients can access the OAB virtual directory on Client Access servers directly without
the need to go through ISA Server 2006. Office Outlook 2003 clients can
still download the OABs from the public folder by using MAPI and RPCs directly or,
in the case of external Outlook 2003 clients, by using MAPI and RPCs through
Outlook Anywhere (formerly known as RPC over HTTPS).
Figure 19. OAB download scenarios
Enabling Cross-Forest Availability Lookups
Another important business requirement that Microsoft IT needed to address in the
Client Access architecture concerned the integration of free/busy and availability
information across multiple forests to facilitate meeting management for all Microsoft
employees. As mentioned earlier in this white paper, Microsoft IT maintains several
forests with Exchange Server organizations. Some of these forests serve the purpose
of compatibility testing with previous product versions, whereas others run pre-release
software. Accordingly, Microsoft IT had to provide for seamless integration and
backward compatibility in the cross-forest availability architecture.
Three components need to work together for backward-compatible, seamless availability
integration. First, it is necessary to synchronize address lists across all forests,
which Microsoft IT accomplishes by using Microsoft Identity Integration Server 2003,
as mentioned earlier in this white paper. Second, if Outlook 2003 or earlier
clients are used, it is also necessary to synchronize free/busy items between messaging
environments by using the Microsoft Exchange Server Inter-Organization Replication
tool so that users with legacy clients can see the full set of availability information
for all Microsoft users. To remain backward-compatible, Microsoft IT maintains free/busy
public folders in all relevant Exchange Server 2007 organizations. Outlook 2007
and Office Outlook Web Access 2007 still use these public folders to publish
free/busy information for down-level clients. The third component that is important
for cross-forest availability integration is the Client Access server, specifically
the Availability API, which Outlook 2007 clients and Outlook Web Access 2007
use to obtain availability information for all Microsoft users.
Note: The Exchange Server 2007 Availability service is a Web service
on Client Access servers to provide Outlook 2007 clients with access to the
Availability API. Office Outlook Web Access uses the Availability API directly and
does not use the Availability service.
Figure 20 shows the cross-forest availability architecture that Microsoft IT established
in the corporate production environment. Although clients using Outlook 2003
or clients with mailboxes on Exchange Server 2003 continue to work with free/busy
items in public folders as usual, Client Access servers communicate differently
to process availability requests from users with mailboxes on Exchange Server 2007
who work with Outlook 2007 or Outlook Web Access. If the target mailbox is
in the local Active Directory site, Client Access servers use MAPI to access the
calendar information directly in the target mailbox. If the target mailbox is in
a remote Active Directory site, the Client Access server proxies the request via
HTTP to a Client Access server in the target mailbox's local site, which in turn
accesses the calendar in the user's mailbox. The same communication mechanism applies
if the target mailbox is in a remote forest with Client Access servers. If the target
mailbox is on Exchange Server 2003, Client Access servers must revert to accessing
free/busy items in public folders via the /public/* virtual directory.
Figure 20. Cross-forest availability architecture
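The lookup paths described for Figure 20 can be expressed as a small decision function. The following Python sketch is illustrative only: the class and function names are hypothetical, and it assumes that the requesting user has a mailbox on Exchange Server 2007 and works with Outlook 2007 or Outlook Web Access.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LOCAL_MAPI = "read the calendar over MAPI in the local site"
    HTTP_PROXY = "proxy over HTTP to a Client Access server near the mailbox"
    PUBLIC_FOLDER = "read free/busy items from public folders"


@dataclass(frozen=True)
class TargetMailbox:
    forest: str
    site: str
    on_exchange_2003: bool = False
    forest_has_client_access: bool = True


def availability_route(cas_forest: str, cas_site: str,
                       target: TargetMailbox) -> Route:
    """Pick the path a Client Access server uses for one target mailbox."""
    # Legacy mailboxes and forests without Client Access servers
    # fall back to free/busy items in public folders.
    if target.on_exchange_2003 or not target.forest_has_client_access:
        return Route.PUBLIC_FOLDER
    if target.forest == cas_forest and target.site == cas_site:
        return Route.LOCAL_MAPI
    # Remote site in the same forest, or a remote forest with
    # Client Access servers: same HTTP proxy mechanism.
    return Route.HTTP_PROXY
```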
To enable cross-forest availability lookups in Exchange Server 2007, Microsoft
IT implemented the following configuration:
- Trusted forests: By using the Add-AvailabilityAddressSpace
cmdlet, Microsoft IT specifies the per-user access method (-AccessMethod PerUserFB)
to create the availability address space with support for the most detailed availability information.
Subsequently, Microsoft IT grants all Client Access server accounts the necessary
ms-Exch-EPI-Token-Serialization extended right through the Add-ADPermission
cmdlet.
- Non-trusted forests and forests without Client Access servers: By
using the Add-AvailabilityAddressSpace cmdlet, Microsoft IT specifies the
public-folder access method (-AccessMethod PublicFolder) to continue using
free/busy items in this availability address space.
Note: Client Access servers can use an organization-wide access method (-AccessMethod
OrgWideFB) for communication with remote Client Access servers in non-trusted
forests. However, Microsoft IT does not use this access method in order to avoid
the need for maintaining special system account credentials in each forest. For
more information about cross-forest availability lookups, see
http://technet.microsoft.com/en-us/library/bb125182.aspx.
Unified Messaging
Microsoft IT has provided users with unified messaging capabilities for many years,
starting with Exchange 2000 Server. The Microsoft voice-mail infrastructure
before Exchange Server 2007 relied on various Private Branch eXchange (PBX)
telephony devices and a third-party unified messaging product that interfaced with
Exchange servers to deliver voice mail to users' inboxes. The third-party product
required a direct physical connection to the PBX device for each location that provided
unified messaging services. This requirement meant Microsoft IT placed third-party
Unified Messaging servers in the same physical location as the PBX. The Exchange
Server 2007 Unified Messaging server role provided Microsoft IT with the opportunity
to redesign the environment and prepare the infrastructure for VoIP telephony across
the entire company.
For the rollout of unified messaging services to multiple locations worldwide, Microsoft
IT defined the following requirements:
- Provide high-quality, next-generation VoIP services with clear voice-mail playback
and voice conversations at all locations.
- Roll out unified messaging services to all feasible locations.
- Increase security through encrypted VoIP communication.
- Replace third-party unified messaging implementation with a native Exchange Server 2007
implementation.
- Provide Active Directory-based management of unified messaging users and devices.
- Increase redundancy and fault tolerance of unified messaging services and devices.
- Ensure smooth migration between third-party unified messaging systems and Exchange
Server 2007-based Unified Messaging.
- Reduce administrative overhead by educating and enabling users for self-service.
Unified Messaging Topology
Microsoft IT faced the decision either to replace the third-party Unified Messaging
servers with Exchange 2007 Unified Messaging servers at each location, or to
consolidate the locations of Unified Messaging servers to the four regional data
centers. To maintain a centralized environment, Microsoft IT chose to deploy Unified
Messaging servers in the four regional data centers that already housed Mailbox
and Hub Transport servers, which provided the following benefits:
- Reduced server footprint and costs: Unified Messaging servers
must reside in the same Active Directory sites as Hub Transport and Mailbox servers.
Deploying Unified Messaging servers in a decentralized way would require a decentralization
of the entire messaging environment, with associated higher server footprint and
operational costs.
- Optimal preparation for future communication needs: With
Unified Messaging servers centralized in four locations, it is straightforward to
deploy advancements in VoIP technology. It is also easier to integrate
and maintain centralized Unified Messaging servers in a global unified communications
infrastructure.
Figure 21 illustrates the Microsoft IT Unified Messaging server deployment in the
corporate production environment.
Figure 21. Unified Messaging server topology
Having chosen to deploy Unified Messaging servers in the four regional data centers,
Microsoft IT needed to ensure high voice quality for all unified messaging
users. To this end, Microsoft IT distributed Unified Messaging
servers according to the number of users each region supports. For example, for
Australia and Asia, Microsoft IT determined that two Unified Messaging servers provide
adequate capacity for all the locations enabled for services. When assigning server
partners for VoIP gateways in a specific location, Microsoft IT picks the site with
the least latency.
The connectivity requirements to PBXs at Microsoft locations vary according to the
call load. Microsoft IT deployed these connections years ago as part of a voice-mail
solution. There are existing PBXs with T1 Primary Rate Interface (PRI) or Basic
Rate Interface (BRI) trunks grouped logically as a digital set emulation group.
The T1 trunks can use channel associated signaling (CAS) where signaling data is
on each channel (24 channels for T1), or Q.SIG where there are 23 channels and a
dedicated channel for signaling.
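The channel arithmetic for the two T1 signaling types can be written down as a short Python sketch. The function name and the signaling labels are hypothetical, and the sketch assumes that every Q.SIG trunk carries its own dedicated signaling channel.

```python
T1_CHANNELS = 24  # channels on one T1 trunk


def voice_channels(t1_trunks: int, signaling: str) -> int:
    """Return the number of channels available for calls."""
    if t1_trunks < 0:
        raise ValueError("trunk count cannot be negative")
    if signaling == "CAS":
        # Signaling data travels on each channel, so all 24 carry calls.
        per_trunk = T1_CHANNELS
    elif signaling == "QSIG":
        # One channel is reserved for signaling, leaving 23 for calls.
        per_trunk = T1_CHANNELS - 1
    else:
        raise ValueError(f"unknown signaling type: {signaling!r}")
    return t1_trunks * per_trunk
```

Under these assumptions, a dual-T1 connection provides 48 call channels with CAS and 46 with Q.SIG.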
The VoIP gateway decisions depend on the type of telephony connection. VoIP gateways
support specific signaling types and trunk sizes. Microsoft IT considers the signaling
types and the size of the trunk, and then ensures that the combination meets the
user load. From monitoring performance, Microsoft IT concluded that the existing
connectivity more than met call load and did not require expansion.
Unified Messaging Redundancy and Load Balancing
With Exchange Server 2007, Microsoft IT sought to increase the redundancy and
load-balancing capabilities of the unified messaging environment. As shown in Figure
22, there were several opportunities to design flexibility and scalability into
each location and for the overall infrastructure. Microsoft IT took advantage of
these opportunities and built redundancy and fault tolerance into the connectivity,
VoIP gateway, and Unified Messaging servers.
Figure 22. Unified messaging redundancy configuration
Microsoft IT decided that the minimum level of scalability and flexibility in the
unified messaging environment required at least two VoIP gateways communicating
with at least two Unified Messaging server partners. Microsoft IT based this decision
on a few considerations. First, using two VoIP gateways and two Unified Messaging
servers ensures that if one telephony link or network link fails at any given time,
users can still receive unified messaging services. Second, if one VoIP gateway
fails, requires configuration changes, or requires updated firmware, Microsoft IT
can temporarily switch all traffic to the other gateway. Third, two or more Unified
Messaging servers ensure that in case one server fails, the other server can take
over. Microsoft IT considered redundancy for PBXs and, based on previous experience,
decided those PBXs provided stable service with built-in redundancy through multiple
telephony interface cards and multiple incoming telephony links to the telephone
company.
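The minimum redundancy rule described above (at least two VoIP gateways communicating with at least two Unified Messaging server partners) lends itself to a simple configuration check. The following Python sketch is hypothetical: the UmLocation class, the redundancy_gaps function, and all device names used with it are illustrative and do not correspond to an Exchange Server API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UmLocation:
    name: str
    voip_gateways: List[str]
    um_server_partners: List[str]


def redundancy_gaps(location: UmLocation, minimum: int = 2) -> List[str]:
    """List the ways a location falls short of the minimum design."""
    gaps = []
    # Count distinct devices so that a duplicated entry does not
    # masquerade as a second gateway or server.
    if len(set(location.voip_gateways)) < minimum:
        gaps.append(f"{location.name}: fewer than {minimum} VoIP gateways")
    if len(set(location.um_server_partners)) < minimum:
        gaps.append(
            f"{location.name}: fewer than {minimum} Unified Messaging servers"
        )
    return gaps
```

An empty result means that the location can lose one gateway or one Unified Messaging server, for example during a firmware update, and still deliver service.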
After deciding to use a minimum of two VoIP gateway devices with two Unified
Messaging servers as communication partners, Microsoft IT considered the type and
capacity of VoIP gateway to use. Microsoft IT followed several technical and business
requirements for making VoIP gateway selections for each location, as follows:
- Connectivity type: For Microsoft IT, the connectivity type
came down to two choices of digital connections: PRI T1 or BRI emulated as a digital
set. Analog connections were eliminated immediately because of cost and scalability
factors. For small sites, Microsoft IT uses BRI (PIMG80PBXDNI); for large sites,
Microsoft IT uses either TIMG300DT or TIMG600DT. Whereas TIMG300DT supports a single
T1 for each device, TIMG600DT supports dual T1s. Microsoft IT varied the number
of T1s depending on usage, employing dual T1s in Redmond and single T1s in the Silicon
Valley location. Other sites used BRI trunks emulated as a digital set, with either
eight or 16 lines per gateway, depending on user load.
- Simplified Message Desk Interface (SMDI)/signaling integration: Intel
gateways provide a standard, supported SMDI integration, which is a decision factor
for Microsoft IT. To accomplish SMDI integration with Intel gateways, Microsoft
IT connected multiple gateways to the same SMDI link by using two primary gateways
and multiple secondary gateways. By doing this, Microsoft IT can switch over from
one primary gateway to another, enabling gateway firmware updates with no service
interruption.
Increased Unified Messaging Security
There are many security concerns associated with a unified messaging environment.
For example, Session Initiation Protocol (SIP) proxy impersonation, network sniffing,
session hijacking, and even unauthorized phone calls can compromise network security.
Microsoft IT can choose from several methods to help secure the unified messaging
environment, especially Unified Messaging servers and traffic between VoIP gateways
and Unified Messaging servers:
- Secure protocols: In the unified messaging environment, all
traffic that uses SIP can use Mutual Transport Layer Security (MTLS). This includes
the traffic between VoIP gateway devices and the Unified Messaging servers.
- Trusted local area networks (LANs): To prevent network sniffing
and reduce overall security risks, Microsoft IT places VoIP gateways on a virtual
LAN (VLAN) separate from the corporate production environment. As a result, only
authorized individuals with physical access to the VoIP gateways can access this traffic.
Moreover, Unified Messaging servers only communicate with gateways explicitly listed
in the dial plan.
- IPSec: The Microsoft corporate network uses IPSec for all
IP communication within the network. However, the VoIP gateway to Unified Messaging
server traffic is already encrypted by using MTLS. To ensure optimal performance,
and according to product recommendations, Microsoft IT created an exception for
VoIP gateway to Unified Messaging server traffic to bypass IPSec while maintaining
security through MTLS.
In addition to these security measures, Microsoft IT enforces general security practices
such as using strong authentication methods and strong passwords.
Unified Messaging Feature Considerations
Exchange Server 2007 Unified Messaging servers include various configuration
options, such as dial plans, VoIP gateway communication partners, hunt groups, mailbox
policies, and so on. Some configuration options represent default configurations
or require inputting the necessary values, such as the IP addresses for VoIP gateways.
For other options, such as dial plans and hunt groups, Microsoft IT considered the
feature set necessary to meet business requirements and configured settings accordingly.
One consideration Microsoft IT faced when configuring settings on Unified Messaging
servers involved dialing rules and mailbox policies. Dialing rules represent logical
groupings of PBXs and specify details about sets of numbers and mailbox extensions,
whereas mailbox policies enable Microsoft IT to apply a common set of policies or
security settings such as personal identification number (PIN) details and dialing
restrictions to a collection of unified-messaging-enabled mailboxes. Microsoft IT
considered which phone calls to allow users to make: local only or local and long-distance
calls. After considering the costs, Microsoft IT enabled calling for users in North
America to anywhere in North America and restricted other sites to local-only calls.
User Education
Microsoft IT also addressed the needs of unified messaging users by providing comprehensive
documentation for the new environment. Before enabling users for unified messaging
services, Microsoft IT ran a pilot where a select group of users tested features
and functionality to verify that the system performed as expected. In order to run
the pilot and prepare the way for enabling all the users for a specific location,
Microsoft IT considered the following:
- Customized e-mail templates: Exchange Server 2007 sends
out messages to users during the rollout process. The first message provides an
initial PIN, and the second message notifies the user of rollout and migration completion.
Exchange Server 2007 stores the e-mail templates in a configuration file. Microsoft
IT customizes these e-mail templates to include intranet links to locations that
provide further help and instructions for common tasks.
- User documentation: Microsoft IT created an information and
support repository for users that consists of help documents located on the corporate
intranet. Microsoft IT maintains these documents to ensure that they provide up-to-date
information for users.
Internet Mail Connectivity
Microsoft IT capitalizes on native Exchange Server 2007 anti-spam and antivirus
capabilities to help protect the company's Internet mail connectivity points against
spammers and attacks at the messaging layer. Specifically, Microsoft IT uses Exchange
Server 2007 Edge Transport servers and Forefront Security for Exchange Server
to help protect the corporate network from outside threats.
Microsoft IT defined the following goals for the Internet mail connectivity design:
- Increase security by using built-in Exchange Server 2007 features in a perimeter
network that is strictly separate from the corporate production environment.
- Adopt flexible spam filtering methods through Edge Transport agents.
- Optimize spam filtering to keep unwanted messages out while delivering legitimate
messages.
- Develop a fault-tolerant system that balances both incoming and outgoing message
traffic.
- Enable inbound and outbound scanning for viruses at multiple levels, including the
perimeter network, to stop viruses at the earliest point possible.
Inbound and Outbound Message Transfer
In the Exchange Server 2003 environment, Microsoft IT used a total of six Internet
mail gateway servers in Redmond and Silicon Valley as the main points of contact
for inbound and outbound Internet message transfer and four additional outbound-only
Internet mail gateway servers in Dublin and Singapore. Concentrating the incoming
Internet message traffic through the six Internet mail gateway servers in Redmond
and Silicon Valley enabled Microsoft IT to limit internal resource exposure, concentrate
spam filtering, and centralize security administration. Maintaining four additional
outbound-only Internet mail gateway servers in Dublin and Singapore eliminated the
need to transfer messages to Internet recipients from these regions across the Microsoft
WAN to an Internet mail gateway server in Redmond or Silicon Valley. To provide
good performance and redundancy at each data center, Microsoft IT decided to use
three Internet mail gateway servers in Redmond, three in Silicon Valley, two in
Dublin, and two in Singapore. Because this design proved reliable and adequate,
Microsoft IT retained this topology in the Exchange Server 2007 design as well.
Microsoft IT replaced the Internet mail gateway servers in a straightforward way
with the same number of Edge Transport servers. Microsoft IT subscribed the Edge
Transport servers in Redmond and Silicon Valley to the Active Directory site ADSITE_REDMOND-EXCHANGE.
This causes the Hub Transport servers in North America to relay outbound messages
through these Edge Transport servers. Similarly, Microsoft IT subscribed the Edge
Transport servers in Dublin and Singapore to the Active Directory sites of ADSITE_DUBLIN
and ADSITE_SINGAPORE so that Exchange Server 2007 routes outbound messages
to the Internet primarily through the Edge Transport servers in each region. Figure
23 shows the Internet mail connectivity and topology with Exchange Server 2007.
Figure 23. Internet mail connectivity topology
Redundancy and Load Balancing
Because Microsoft IT must meet stringent performance and availability SLAs, balancing
the traffic load and providing redundancy are vital considerations for Internet
mail connectivity. Internally, Microsoft IT uses multiple Hub Transport servers
in the regions with Edge Transport servers. Within each region, all Hub Transport
servers can transfer outbound messages to their local Edge Transport servers. In
the opposite direction, Edge Transport servers can also choose any of the Hub Transport
servers in the local region to transfer inbound messages.
Externally, for inbound message transfer from the Internet, Microsoft IT uses DNS
round-robin and MX records with a preference value of 10. This method has proven
to be effective for distributing traffic load; therefore, Microsoft IT continues
to use it with all six Edge Transport servers in North America. Because all records
have the same preference value, Internet clients are free to select any one of these
MX records at random. Each mail exchanger host name resolves to two A records that
point to servers in separate data centers. In this way, even if an SMTP client consistently
uses the same MX record, the client can still choose between two servers. This configuration
of two servers for each record provides an added degree of load balancing.
The following name server lookup (nslookup) response shows the configuration of
the public DNS zone for microsoft.com that Microsoft IT established for the Edge
Transport servers in Redmond and Silicon Valley.
microsoft.com MX preference = 10, mail exchanger = maila.microsoft.com
microsoft.com MX preference = 10, mail exchanger = mailb.microsoft.com
microsoft.com MX preference = 10, mail exchanger = mailc.microsoft.com
maila.microsoft.com internet address = 131.107.115.212
maila.microsoft.com internet address = 205.248.106.64
mailb.microsoft.com internet address = 131.107.115.215
mailb.microsoft.com internet address = 205.248.106.30
mailc.microsoft.com internet address = 131.107.115.214
mailc.microsoft.com internet address = 205.248.106.32
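The selection behavior described above can be illustrated with a minimal Python sketch. It assumes a simplified sending host that picks at random among the MX records with the lowest preference value and then at random among the addresses returned for that host. The function name pick_delivery_address is hypothetical, real SMTP senders and DNS resolvers may order records differently, and the record data is copied from the nslookup response.

```python
import random

# MX and A data copied from the nslookup response above.
MX_RECORDS = [
    (10, "maila.microsoft.com"),
    (10, "mailb.microsoft.com"),
    (10, "mailc.microsoft.com"),
]
A_RECORDS = {
    "maila.microsoft.com": ["131.107.115.212", "205.248.106.64"],
    "mailb.microsoft.com": ["131.107.115.215", "205.248.106.30"],
    "mailc.microsoft.com": ["131.107.115.214", "205.248.106.32"],
}


def pick_delivery_address(rng):
    """Pick one target the way a simplified SMTP sender might (assumption)."""
    # Only the most preferred (lowest) preference value is considered.
    lowest = min(preference for preference, _ in MX_RECORDS)
    candidates = [host for preference, host in MX_RECORDS if preference == lowest]
    # Equal preference values leave the sender free to choose any candidate.
    host = rng.choice(candidates)
    # Each host name resolves to two addresses, one per data center.
    address = rng.choice(A_RECORDS[host])
    return host, address


if __name__ == "__main__":
    rng = random.Random()
    counts = {}
    for _ in range(6000):
        _, address = pick_delivery_address(rng)
        counts[address] = counts.get(address, 0) + 1
    for address, count in sorted(counts.items()):
        print(address, count)
```

Over many deliveries, each of the six addresses receives a roughly equal share of connections in this model, which is the load distribution the equal preference values are intended to produce.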
Note: In addition to DNS host (A) and mail exchanger (MX) records for round-robin
load balancing, Microsoft IT maintains Sender ID (Sender Policy Framework, or SPF)
records. The Sender ID framework relies on SPF records to identify messaging hosts
that are authorized to send messages for a specific SMTP domain, such as microsoft.com.
Internet mail hosts that receive messages from microsoft.com can look up the SPF
records for the domain to determine whether the sending host is authorized to send
mail for Microsoft users. If the sending host is not one of the six Edge Transport
servers in Redmond and Silicon Valley or one of the four outbound Edge Transport
servers in Dublin and Singapore, the receiving Internet host can block the submission
attempt or perform any other action specified by the Internet host's administrator,
such as increasing the messages' spam confidence level.
Increasing Perimeter Network Security
Microsoft IT took advantage of new features available with Edge Transport servers
on the 64-bit platform to increase security in the perimeter network. For example,
Edge Transport servers do not need to be part of the internal Active Directory environment,
which enabled Microsoft IT to tighten access rules on the firewall between the perimeter
network and the corporate production environment. Furthermore, the increased processing
power available on the 64-bit platform enabled Microsoft IT to perform virus scanning
directly on the Edge Transport servers to stop viruses before they enter the corporate
production environment. This is a significant improvement over the previous environment,
in which Microsoft IT, for performance reasons, could perform virus scanning only in
the production environment on dedicated hub servers running Exchange Server 2003
and not on the Internet mail gateway servers.
Figure 24 illustrates the deployment of Edge Transport servers in regional perimeter
networks. Microsoft IT separates the perimeter network through inner and outer firewalls.
Figure 24. Edge Transport server security
Edge Transport servers must communicate with Internet hosts as well as the internal
Exchange organization. To accomplish this, Microsoft IT enabled traffic by opening
the following ports for specific services:
- Inbound SMTP TCP port 25: The only inbound port opened on
both the internal and external firewall was port 25 for SMTP. Microsoft IT required
no other ports, because Edge Transport servers have the dedicated role of accepting
Internet mail, processing it, and forwarding it to a Hub Transport server.
- Outbound SMTP TCP port 25: For outbound SMTP traffic, Microsoft
IT opens outgoing traffic on port 25 on both firewalls. This enables Hub Transport
servers to route mail traffic to Edge Transport servers and enables Edge Transport
servers to relay messages to Internet hosts.
- Antivirus updates TCP port 80: Microsoft IT opens port 80
on the firewall between the perimeter network and Internet hosts to enable antivirus
definition updates.
- DNS TCP/User Datagram Protocol (UDP) port 53: Edge Transport
servers use DNS records for MX host resolution and to check SPF records for Sender
ID. Accordingly, Microsoft IT enables port 53 on the firewall between the perimeter
network and Internet hosts.
- Terminal Services TCP port 3389: To perform maintenance and
operations tasks, Microsoft IT opens port 3389 for Terminal Services on the inner
firewalls.
- EdgeSync TCP port 50636: Although only port 25 is necessary
to support message transfer from Edge Transport servers to Hub Transport servers
via SMTP, Edge Transport servers also need access to Active Directory data through
EdgeSync. Accordingly, Microsoft IT opens port 50636 for the Exchange EdgeSync service.
It is important to note that this data flow follows a push model from the internal
network to the perimeter network. There is no need for Edge
Transport servers to actively connect and pull the directory information from the
corporate network.
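The port list above can be summarized as a small rule table. The following Python sketch is illustrative only: the zone names (internet, perimeter, corporate), the Rule type, and the is_allowed helper are hypothetical, and the traffic directions for antivirus updates, DNS, and Terminal Services are assumptions inferred from the descriptions rather than a copy of the actual firewall configuration.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    firewall: str      # "outer" (Internet side) or "inner" (corporate side)
    source: str        # zone that initiates the connection
    destination: str   # zone that accepts the connection
    protocol: str
    port: int
    purpose: str


# Hypothetical rule table derived from the port list above.
RULES = [
    Rule("outer", "internet", "perimeter", "tcp", 25, "inbound SMTP"),
    Rule("inner", "perimeter", "corporate", "tcp", 25, "inbound SMTP"),
    Rule("inner", "corporate", "perimeter", "tcp", 25, "outbound SMTP"),
    Rule("outer", "perimeter", "internet", "tcp", 25, "outbound SMTP"),
    # Direction assumed: Edge Transport servers fetch updates and query DNS.
    Rule("outer", "perimeter", "internet", "tcp", 80, "antivirus updates"),
    Rule("outer", "perimeter", "internet", "tcp", 53, "DNS"),
    Rule("outer", "perimeter", "internet", "udp", 53, "DNS"),
    # Direction assumed: administrators connect from the corporate network.
    Rule("inner", "corporate", "perimeter", "tcp", 3389, "Terminal Services"),
    # EdgeSync pushes directory data from the internal network outward.
    Rule("inner", "corporate", "perimeter", "tcp", 50636, "EdgeSync"),
]


def is_allowed(firewall, source, destination, protocol, port):
    """Return True if a connection matches one of the rules above."""
    wanted = (firewall, source, destination, protocol, port)
    return any(rule[:5] == wanted for rule in RULES)


if __name__ == "__main__":
    # The push model: the internal network may connect to the perimeter ...
    print(is_allowed("inner", "corporate", "perimeter", "tcp", 50636))
    # ... but Edge Transport servers may not pull from the corporate network.
    print(is_allowed("inner", "perimeter", "corporate", "tcp", 50636))
```

Modeling each rule with an explicit source and destination makes the one-way nature of the EdgeSync rule visible: the same port is rejected when the perimeter network initiates the connection.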
Note: Although it is not necessary to deploy Edge Transport servers in an
Active Directory environment, Microsoft IT chose to deploy all Edge Transport servers
on computers that are members of the Extranet Forest. The Extranet Forest is separate
from the Corporate Forest. This form of deployment helps Microsoft IT maintain a
consistent management framework, apply a common set of policies to all Edge Transport
servers, and support single sign-on for server administrators.
Server Hardening
In addition to securing firewalls on both sides of the perimeter network, Microsoft
IT took into consideration the security configuration for Edge Transport servers.
This process of server hardening took into account the following components:
- Ports: Following the classic concept of a perimeter
network, Edge Transport servers use separate NICs, with one card facing Internet
hosts and the other card facing Hub Transport servers. The Internet-facing card
has all unnecessary protocols and services, such as NetBIOS over TCP/IP and File
and Printer Sharing, disabled and accepts traffic only on port 25.
- Services: Microsoft IT uses the Security Configuration
Wizard to identify services that are not required. Microsoft IT disables all
unnecessary services, such as Internet Information Services (IIS).
- File shares: Microsoft IT removed the Everyone group
from all shared folders. Each share must be secured with security groups that contain
only the users who need access to it. Microsoft IT does not apply open security
groups, such as Authenticated Users, Domain Users, or Everyone, to shares.
- Accounts: Microsoft IT adds the security groups developed during the permissions
and administration model design for the environment as members of the local
Administrators group on the Exchange server and verifies that the computer is in
the correct organizational unit. Microsoft IT also enforces passwords of 15
characters or more to meet strong
password requirements. Microsoft IT requires smart-card authentication to administer
servers in the Extranet Forest through Terminal Services.
- Security updates: Microsoft IT monitors the installation
of security updates and security configurations on server platforms by using Microsoft
Baseline Security Analyzer (MBSA). Edge Transport servers must have all current
security updates to help ensure security.
Optimizing Spam and Virus Scanning
In designing the transport agent configuration for Edge Transport and Hub Transport
servers, Microsoft IT considered which settings required customization to meet performance,
security, and reliability demands. Many agents work with the default settings and
do not require additional configuration. Other agents need varying levels of customization,
such as:
- Connection-filtering configuration: Settings that relate
to the connection-filtering configuration include IP block-list and IP allow-list
providers, and Sender Reputation Level (SRL). For block-list provider settings,
Microsoft IT uses the information provided from third-party block-list vendors.
For the SRL, Microsoft IT uses the default configuration to perform an open proxy
test when determining the sender confidence level, blocks senders with an SRL of
seven, and adds these senders to the IP block list for the duration of 24 hours.
In addition to using block-list providers and locally maintained block lists, Microsoft
IT keeps up to date through automatic IP reputation updates from Microsoft Research.
- Recipient-filtering configuration: For recipient filtering,
Microsoft IT considered the following aspects: blocking messages to invalid recipients,
blocking messages to mailboxes and global distribution lists that are for internal
use only, blocking messages sent to recipients not listed in the recipient list,
and blocking specified recipients in the properties of the Recipient Filtering Agent.
- Content-filtering configuration: Microsoft
IT configured a spam confidence level (SCL) store threshold of five on Mailbox servers
and an SCL reject messages threshold of seven on Edge Transport servers for the
Content Filter Agent. These values came from testing and previous experience running
Exchange servers. Microsoft IT did not configure an SCL threshold value to delete
messages or quarantine messages.
- Attachment-filtering configuration: Microsoft IT uses Forefront
Security for Exchange Server file filtering to remove critical attachments that
correspond to Level 1 file types blocked by default in Outlook 2007. Forefront
Security for Exchange Server filters attachments based on Multipurpose Internet
Mail Extensions (MIME) type, files within a container file, files based on extension,
and critical files with an arbitrary extension.
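The effect of the two SCL thresholds named in the content-filtering item can be sketched as follows. The threshold values of seven and five come from the text. The comparison semantics are assumptions for illustration: the sketch rejects a message whose SCL is at or above the reject threshold and files a message in the Junk E-mail folder when its SCL is above the store threshold. The function name disposition is hypothetical, and the product documentation should be consulted for the exact behavior.

```python
SCL_REJECT_THRESHOLD = 7  # Content Filter Agent on Edge Transport servers
SCL_STORE_THRESHOLD = 5   # store threshold on Mailbox servers


def disposition(scl):
    """Return what happens to a message with the given SCL rating (0 to 9)."""
    if not 0 <= scl <= 9:
        raise ValueError("this sketch covers SCL ratings from 0 to 9 only")
    # Assumption: reject at or above the reject threshold.
    if scl >= SCL_REJECT_THRESHOLD:
        return "reject at Edge Transport"
    # Assumption: file as junk only above the store threshold.
    if scl > SCL_STORE_THRESHOLD:
        return "deliver to Junk E-mail folder"
    return "deliver to Inbox"


if __name__ == "__main__":
    for scl in range(10):
        print(scl, disposition(scl))
```

Because no delete or quarantine threshold is configured, the sketch has only three outcomes: rejection at the perimeter, delivery to the Junk E-mail folder, or delivery to the Inbox.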
Microsoft IT installs other agents enabled with default settings, except for the
inbound and outbound address-rewriting agents. The address-rewriting agents on Edge
Transport servers modify sender and recipient SMTP addresses based on predefined
address alias information. Because this is not a consideration for the Exchange
Server environment at Microsoft, Microsoft IT disables these agents.
Optimizing Outbound Message Transfer
Edge Transport servers communicate with Internet hosts as well as the internal Exchange
Server environment through SMTP send and receive connectors. These connectors offer
built-in protection, including a header firewall, connection tarpitting, and SMTP
backpressure. Microsoft IT uses two receive connectors and four send connectors
on all Edge Transport servers for all mail transfer:
- Inbound mail: For inbound mail from the Internet to the internal
Exchange Server organization, Microsoft IT configured one receive connector that
faces the Internet and accepts SMTP messages, and one send connector that transfers
these inbound messages to Hub Transport servers for further routing.
- Outbound mail: For outbound mail destined for Internet hosts,
Microsoft IT configures one receive connector that faces Hub Transport servers and
three send connectors for relaying outbound messages to Internet hosts. Edge Transport
servers use the three send connectors in different scenarios. The first send connector
is dedicated to encrypted communication with remote domains that support TLS. Microsoft
IT configures the second connector to communicate with destinations that do not
support Extended SMTP. The third send connector is a general Internet connector
that Edge Transport servers use for all destinations that do not match the address
space definitions of the TLS and HELO connectors. For more information about the
SMTP connectors, see the "Microsoft Exchange Server 2007 Edge Transport and
Messaging Protection" white paper at
http://technet.microsoft.com/en-us/library/bb735142.aspx.
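The way the three outbound send connectors divide the work can be sketched as a simple lookup. The domain names below are placeholders, not the address spaces that Microsoft IT configured, and the function select_send_connector is hypothetical. The sketch uses exact domain matching for brevity, whereas Exchange Server selects a connector by comparing address spaces.

```python
# Placeholder address spaces; the real lists are not given in this paper.
TLS_DOMAINS = {"partner.example"}    # remote domains that support TLS
HELO_DOMAINS = {"legacy.example"}    # destinations without Extended SMTP


def select_send_connector(recipient_domain):
    """Choose one of the three outbound send connectors for a domain."""
    domain = recipient_domain.lower().rstrip(".")
    if domain in TLS_DOMAINS:
        return "TLS connector"
    if domain in HELO_DOMAINS:
        return "HELO connector"
    # Everything that matches neither address space uses the general connector.
    return "general Internet connector"


if __name__ == "__main__":
    for name in ("partner.example", "legacy.example", "contoso.example"):
        print(name, "->", select_send_connector(name))
```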
Deployment Planning
Deployment planning is a critical element of the Microsoft IT planning and design
process. It addresses the question of how to implement the new messaging environment
with minimal interference on existing business processes and provides all members
of the Exchange Messaging team with a clear understanding of when to perform the
required deployment steps.
The high-level deployment plan the Messaging Engineering team recommended for the
transition to Exchange Server 2007 in the corporate production environment
included the following phases:
- Introduce Exchange Server 2007 into the corporate production environment.
- Verify the successful integration of Exchange Server 2007.
- Fully deploy Client Access servers in North America.
- Fully deploy Hub Transport servers in North America.
- Deploy Mailbox servers in North America.
- Introduce Edge Transport servers in North America.
- Deploy Forefront Security for Exchange Server 2007.
- Deploy Exchange Server 2007 in regional data centers.
- Switch the messaging backbone to Exchange Server 2007.
- Complete the transition to Exchange Server 2007.
Note: In addition to business and technical requirements, the Messaging Engineering
team had to consider several unique software issues because the designs were based
on beta versions of Exchange Server 2007. Important features, such as OAB generation
on Mailbox servers in CCR configuration, and software products, such as Forefront
Security for Exchange Server, were not yet ready for deployment at the time Microsoft
IT started the production rollout. The Microsoft IT deployment plans reflect these
dependencies, which became obsolete with the release of Exchange Server 2007.
Introducing Exchange Server 2007 into the Corporate Production Environment
Microsoft IT introduced Exchange Server 2007 into the Corporate Forest in May
2006. This work included the preparation of Active Directory, the implementation
of the administrative model, and the installation of the first Hub Transport and
Client Access servers, as required to integrate Exchange Server 2007 with an
existing Exchange Server 2003 environment. Microsoft IT configured the initial
routing group connector between the routing group RG_REDMOND and EXCHANGE ROUTING
GROUP (DWBGZMFD01QNBJR) because Microsoft IT focused the initial deployment activities
on the data center in North America that corresponded to the routing group RG_REDMOND.
Verifying the Successful Integration of Exchange Server 2007
This initial phase also included the installation of a first Mailbox server for
approximately 250 production mailboxes. Moving 250 power users to Exchange Server 2007
enabled Microsoft IT to verify the successful integration of Exchange Server 2007
into the existing messaging environment.
Fully Deploying Client Access Servers in North America
With the availability of Exchange Server 2007 Beta 2, Microsoft IT began the
deployment of Client Access servers at full scale in North America. This included
building and testing the Client Access server farm and switching the redmond.microsoft.com
URL namespace from the Exchange 2003 front-end servers to the Exchange 2007
Client Access servers. Switching to Client Access servers early on was necessary
to preserve the existing namespaces. Exchange 2003 front-end servers do not
support access to Mailbox servers running Exchange Server 2007.
Figure 25 illustrates how Client Access servers support users with mailboxes on
Exchange Server 2003 and Exchange Server 2007. Users with mailboxes on
Exchange Server 2003 simply continue to use the legacy virtual directories
that Client Access servers provide for backward compatibility. The Microsoft Office
Outlook Web Access 2003 proxy component in Exchange Server 2007 proxies
the user requests to the Exchange 2003 back-end server where the mailbox resides.
The back-end server renders the HTML response as usual, which the Exchange 2003
back-end server passes back to the client via the Client Access server. In this
scenario, the Client Access server behaves exactly like an Exchange 2003 front-end
server.
Figure 25. Office Outlook Web Access coexistence
If users with mailboxes on Exchange
Server 2007 access the legacy virtual directories (that is, /public/* and /exchange/*),
Exchange Server 2007 redirects the user to the /owa/* virtual directory, which
provides access to Outlook Web Access. It is important to note, however, that Outlook
Web Access 2007 did not yet support public folders (support for public folders
was added in Exchange Server 2007 SP1). A pre-SP1 Mailbox server could only
redirect public-folder requests to an Exchange 2003 back-end server. For this
reason, Microsoft IT decided to retain public-folder servers running Exchange Server 2003
in the messaging environment until Exchange Server 2007 SP1 was available.
Note: Microsoft IT maintains multiple public-folder servers running Exchange
Server 2003 in all four locations with Mailbox servers for redundancy. In addition
to these public-folder servers supporting Outlook Web Access and legacy clients,
such as Office Outlook 2003, Microsoft IT maintains two public-folder servers
running Exchange Server 2007 in Redmond. These two Exchange 2007 servers
host all non-system public folders.
Fully Deploying Hub Transport Servers in North America
Prior to deploying Mailbox servers in large numbers, Microsoft IT deployed Hub Transport
servers to provide the necessary redundancy and scalability for the message routing
functions within the Exchange Server 2007 environment. All messages sent between
Mailbox servers must pass through a Hub Transport server. The work also included
creating SMTP send and SMTP receive connectors to Hub Transport servers in other
forests and changes to the routing group connector configuration for messaging connectivity
with Exchange Server 2003. Microsoft IT specified all Exchange Server 2003
computers as bridgeheads so that messages from Exchange Server 2003 to Exchange
Server 2007 would not have to pass through the legacy hub servers. This "all
to many" relationship corresponds to the configuration that Microsoft IT also uses
in all other routing group connectors among Exchange Server 2003 routing groups.
Deploying Mailbox Servers in North America
With the Client Access server and Hub Transport server topologies in place, Microsoft
IT began to deploy additional Mailbox servers to move more and more mailboxes to
Exchange Server 2007. The plan was to increase the number of mailboxes in the
Exchange Server 2007 environment to 16,000 mailboxes before approaching other
deployment tasks, such as deploying Edge Transport servers. As Microsoft IT moved
approximately 1,000 mailboxes every day, the load on Hub Transport servers continuously
increased. Because the Hub Transport servers ran beta software, Microsoft IT monitored
server response times and workload very closely during this time frame.
Introducing Edge Transport Servers in North America
With the confidence that Hub Transport servers operated reliably, Microsoft IT prepared
to decommission the legacy hub servers in order to move more and more of the messaging
traffic onto Exchange Server 2007 and simplify the routing topology. This also
included introducing the first Edge Transport servers into the Redmond perimeter
network to handle inbound and outbound Internet traffic. Up to this point, however,
Microsoft IT had not deployed Forefront Security for Exchange Server for virus
scanning on Hub Transport servers, because the software was not yet ready for production
deployment. Accordingly, Microsoft IT routed all incoming and outgoing messages
through the dedicated hub servers running Exchange Server 2003 and Microsoft
Antigen for Exchange. Routing all incoming and outgoing Internet messages through
dedicated hub servers running Exchange Server 2003 required manual changes
regarding the connector configuration between Hub Transport servers and Edge Transport
servers. Nevertheless, deploying Edge Transport servers provided Microsoft IT with
the opportunity to get practical experience with important Edge Transport features,
such as EdgeSync, and test the reliability of this server role for production use.
Deploying Forefront Security for Exchange Server 2007
Having virus-scanning capabilities on Hub Transport servers was crucial for Microsoft
IT before introducing any further changes to the routing topology. Accordingly,
Microsoft IT began to test Forefront Security for Exchange Server as soon as the
software was available in a stable version. The criteria that Microsoft IT defined
for the deployment in the corporate production environment included the following
points:
- The antivirus solution works reliably (that is, without excessive failures) and
with acceptable throughput according to the expected message volumes.
- The software is able to find known viruses in various message encodings, formats,
and attachments.
- If the software fails, message transport must halt so that messages do not pass
the Hub Transport servers without scanning.
Having completed these tests successfully, Microsoft IT was able to deploy Forefront
Security and change the routing configuration. The Edge Transport servers and legacy
hub servers running Exchange Server 2003 in the perimeter network now routed
incoming messages through Hub Transport servers. The dedicated hub servers running
Microsoft Antigen for virus scanning were no longer involved in message transfer.
Microsoft IT decommissioned these servers at this stage.
Deploying Exchange Server 2007 in Regional Data Centers
With the core of the message routing functions and virus scanning now reliably performed
by Hub Transport servers in North America, Microsoft IT was ready to approach the
deployment of Hub Transport servers, Client Access servers, and Mailbox servers
in the regional data centers. In general, the deployment processes followed the
approach Microsoft IT had so successfully used in North America. Microsoft IT established
routing group connectors between the Hub Transport servers in each region and the
local routing groups, switched the regional URL namespaces to Client Access servers,
began to move mailboxes to Exchange Server 2007, and transitioned the outbound
Internet mail gateway servers in the regional data centers to Edge Transport servers.
Switching the Messaging Backbone to Exchange Server 2007
The final switch to Exchange Server 2007 in the messaging backbone did not
require many further changes. One important step was to optimize the message routing
topology by using Exchange Server-specific Active Directory site connectors in order
to implement a hub/spoke topology along the physical network links. For details
regarding message routing optimization, see the section "Message Routing Topology"
under "Architecture and Design Decisions" earlier in this white paper.
Another important step was to install the remaining Edge Transport servers for inbound
and outbound Internet traffic in the Redmond and Silicon Valley perimeter networks.
Furthermore, with the completion of the Forefront Security deployment, Microsoft
IT was able to decommission the Exchange Server 2003-based gateway and hub
servers in the perimeter networks and the corporate production environment.
Completing the Transition to Exchange Server 2007
At this point, the Exchange Server 2007 environment no longer depended on the
legacy topology for message routing. The most important remaining tasks involved
completing the Mailbox server deployment and establishing the UM environment. In keeping
with its commitment to finish the production rollout with the release of Exchange
Server 2007 to manufacturing, Microsoft IT moved between 1,000 and 1,500 mailboxes
to the new environment every day from September through November 2006 and finished
the deployment within the same week the product shipped.
Figure 26 shows the messaging environment at Microsoft after completion of the production
rollout in December 2006.
Figure 26. The Microsoft messaging environment after transitioning to Exchange Server 2007
(December 7, 2006)
Best Practices
While designing a messaging environment with multiple worldwide data centers using
both IP and telephony technologies, Microsoft IT employed design phases during which
engineering teams analyzed the existing environment, considered the possible decisions,
and arrived at a fitting design for the Exchange Server 2007-based messaging
production environment. During the process of considering business needs and choosing
the features to address business requirements, Microsoft IT developed the following
best practices that supported a solid design and ensured a smooth transition to
Exchange Server 2007.
Planning and Design Best Practices
Microsoft IT relied on the following best practices in its planning and design activities:
- Clearly define goals: Exchange Server 2007 includes
roles and configuration options that enable numerous topology and design scenarios.
The mix of server roles, enabled options, and settings depends on the business needs
and messaging goals of the organization.
- Design with production in mind: To meet business requirements,
Microsoft IT checks design considerations against practical real-world constraints
that exist in the production environment. This helps to produce a smooth transition
to the new environment after implementation of the design.
- Design for peak load days: Microsoft IT uses the concept
of peak load days, or snow days, to plan for the event when a large number of people
use the messaging infrastructure from outside the corporate network. The messaging
design takes into account the possibility of some days when the majority of users
work from home or remotely.
- Test in a lab environment: With the many options to meet business
requirements, Microsoft IT validates the chosen designs in a test environment. This
enables Microsoft IT to determine stability and finalize design plans before rolling
out a planned infrastructure in the production environment.
- Identify key risks: Microsoft IT follows sound project
management practices as part of the MSF processes. These practices include identifying
risks present with design decisions as well as overall system risks. By identifying
risks early on, Microsoft IT can develop mitigation strategies to address the risks.
- Develop rollback and mitigation procedures: It is important
for Microsoft IT to have rollback procedures when designing the Exchange Server 2007-based
messaging environment. Often, Microsoft IT accomplishes this through a period of
coexistence where both Exchange Server 2003 and Exchange Server 2007 process
mail. Microsoft IT later decommissions the Exchange Server 2003 server after
verifying functionality of the new environment.
Server Design Best Practices
Microsoft IT relied on the following best practices in its server designs:
- Use multiple-core processors and design storage based on both capacity and I/O
performance: During testing, Microsoft IT determined that multi-core
processors provide substantial performance benefits. However, processing power is
not the only factor that determines Mailbox server performance. It is also important
to design the storage subsystem according to I/O load and capacity requirements
based on the desired number of users per Mailbox server.
- Use VSS-based backup: VSS provides the ability to offload
backup processes to the passive node. This enables backup operations at much more
frequent intervals than streaming backups on active nodes.
- Eliminate single points of failure: When designing an Exchange
environment, it was vital for Microsoft IT to create redundancy at all points possible.
Microsoft IT relied on multiple data centers, multi-homed NICs, redundant Hub Transport
and Edge Transport servers, multiple VoIP gateways for unified messaging, multiple
Client Access servers, and CCR on Mailbox servers.
Deployment Best Practices
Microsoft IT relied on the following best practices during the transition to Exchange
Server 2007:
- Establish flexible and scalable messaging infrastructure: Microsoft
IT focused on single-role server deployments throughout the Exchange Server 2007
organization. Using dedicated server roles enabled Microsoft IT to fine-tune the
server designs and place the necessary number of servers to handle the load at each
location.
- Carefully plan URL namespaces: At Microsoft, Client Access
servers handle approximately 150,000 mobile user sessions per month. To distribute
this load, Microsoft IT uses multiple URL namespaces, where a URL represents access
points for clients in specific geographic regions. Microsoft IT chose to preserve
these namespaces to provide a seamless transition for mobile users.
- Manage permissions through security groups: Deploying Exchange
Server 2007 in an existing organization does not affect any existing rights
granted on Exchange Server 2003 resources through user accounts or security
groups. This provides Microsoft IT with the opportunity to clean up legacy permission
assignments and manage permissions through security groups.
- Use fewest permissions necessary: Microsoft IT grants only necessary
rights to administrators, opting to grant full Domain or Enterprise Admin rights
only when there is a business or technical reason to do so. Even in these cases,
the rights are given temporarily to accomplish a specific task. Doing so helps maintain
network security.
- Use Forefront and multiple layers of protection: Microsoft
IT designed the messaging environment to provide many layers of messaging protection
against viruses, spam, and other unwanted e-mail. Microsoft IT deploys Forefront
Security for Exchange Server on Edge Transport and Hub Transport servers to enable
bidirectional scanning of e-mail messages and enforce protection at multiple organization
levels.
- Place Edge Transport servers in a perimeter network: Microsoft IT helps protect the corporate
production environment by using perimeter networks, removing unnecessary client
receive connectors, and enabling transport encryption. Perimeter networks provide
the capability to restrict access to the Edge Transport servers in terms of open
ports, services, permissions, and users, and also to disable incoming mail flow
easily in case of severe attacks.
- Use ISA Server 2006 to publish Client Access servers: Microsoft
IT capitalizes on the close integration of ISA Server 2006 and Exchange Server 2007
to provide security-enhanced access to the internal messaging environment over the
Internet. Among other things, ISA Server 2006 supports stateful packet inspection
and application-layer filtering. Microsoft IT also benefits from Web Publishing
Load Balancing, which enables Microsoft IT to establish a highly scalable load-balancing
infrastructure for external messaging clients and avoid complicated load distribution
and client session affinity issues.
Conclusion
With the completion of the production rollout at the RTM date of the product, Microsoft
IT demonstrated the enterprise readiness of Exchange Server 2007. The Microsoft
messaging environment hosts 130,000 mailboxes with 500-MB and 2-GB quotas in four
data centers on 62 clustered Mailbox servers with 99.99 percent availability, achieving
SLA targets. The corporate production environment also includes 26 Client Access
servers, 15 Hub Transport servers, 10 Edge Transport servers, and 11 Unified Messaging
servers.
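As a rough illustration of what a 99.99 percent availability target implies, the following Python sketch converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions made for this example; they are not part of the Microsoft IT SLA definition.

```python
# Assumption: a non-leap year of 365 days is used as the measurement period.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the maximum minutes of downtime per year for an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 99.99 percent is the availability figure quoted in this paper.
    print(f"{downtime_budget_minutes(99.99):.1f} minutes per year")
```

Under these assumptions, 99.99 percent availability leaves a budget of roughly 53 minutes of downtime per year.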
With the transition to Exchange Server 2007, Microsoft IT was able to reduce
operational costs, including costs for server hardware, storage, and backup. Among
other things, Microsoft IT replaced third-party solutions (such as UM systems) with
features that are directly available in Exchange Server 2007, replaced expensive
SAN-based storage with a more cost-efficient DASD-based solution, and eliminated
tape backups with associated cost savings of approximately $5 million per year.
Microsoft IT continues to achieve cost savings by scaling up Mailbox servers. Initially,
Microsoft IT limited the scale of Mailbox servers to fewer than 3,600 mailboxes
in order to support increased mailbox capacities while using a streaming backup
solution with limited throughput. In preparation for deploying a new VSS-based solution
that can perform backup operations on the passive node in CCR server clusters, Microsoft
IT began to scale up to 6,000 users per Mailbox server. Based on the new server
design, Microsoft IT was able to lower the overall number of Mailbox servers in
the corporate production environment from 62 to fewer than 30 server clusters.
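The consolidation arithmetic in the preceding paragraph can be sketched as follows. This is a minimal estimate, assuming mailboxes are spread evenly across servers and ignoring headroom, regional placement, and quota mix. The function name is hypothetical; the mailbox count and per-server limits are the figures quoted in this paper.

```python
import math


def min_mailbox_clusters(total_mailboxes: int, max_per_server: int) -> int:
    """Lower bound on the number of clustered Mailbox servers needed.

    Assumes an even distribution of mailboxes, which a real design would
    not achieve exactly.
    """
    if total_mailboxes < 0 or max_per_server <= 0:
        raise ValueError("counts must be non-negative and the limit positive")
    return math.ceil(total_mailboxes / max_per_server)


if __name__ == "__main__":
    total = 130_000  # mailboxes hosted, as stated in the conclusion
    for limit in (3_600, 6_000):
        print(limit, "mailboxes per server ->", min_mailbox_clusters(total, limit), "clusters")
```

For 130,000 mailboxes, the lower bound is 37 clusters at 3,600 mailboxes per server and 22 clusters at 6,000, which is consistent with the stated target of fewer than 30 server clusters. The initial deployment of 62 clusters sat well above the lower bound, so the real design evidently accounts for factors this sketch leaves out.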
By increasing mailbox quotas to 500 MB and 2 GB and deploying new productivity features
that are readily available in Exchange Server 2007, such as UM, Microsoft IT
helps to increase the productivity of Microsoft employees. Users can store all messages
on the server, including e-mail, voice mail, and fax messages, and access these
messages from any suitable stationary or portable client, including standard telephones.
Office Outlook 2007 also helps employees to increase productivity. By using
Office Outlook 2007 as the primary messaging client in the corporate production
environment, employees can benefit from new and advanced information management
features, such as instant search, managed folders, and more.
Increasing user productivity also entails maintaining availability levels for messaging
services according to business requirements and SLAs. To accomplish this goal, Microsoft
IT focuses heavily on single-role server deployments with Exchange Server 2007.
Dedicated Mailbox servers support CCR, which Microsoft IT uses to increase resilience
against storage-level failures. Single-role server deployments also enable Microsoft
IT to establish scalable and flexible middle-tier services that are capable of handling
large spikes in messaging activity, such as during snow days in Redmond when the
vast majority of Microsoft staff at headquarters prefers to work from home over
mobile messaging connections.
Another important aspect is messaging protection and security. To achieve the highest
security and protection levels while maintaining a flexible environment, Microsoft
IT encrypts all server-to-server message traffic by using IPSec or TLS to help prevent
spoofing and help protect message confidentiality, and has deployed Edge Transport
servers in the perimeter networks. Edge Transport servers increase the security
in the perimeter network and reduce the number of legitimate messages incorrectly
identified as spam. For antivirus protection, Microsoft IT deployed Forefront Security
for Exchange Server on all Edge Transport and Hub Transport servers.
Exchange Server 2007 enabled Microsoft IT to capitalize on 64-bit technologies
and cost-efficient storage solutions to increase the level of messaging services
in the corporate environment. The Exchange Messaging team is now in a better position
to respond to new messaging trends and accommodate emerging needs as Microsoft continues
to grow as a company.
For More Information
For more information about Microsoft products or services, call the Microsoft Sales
Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information
Centre at (800) 563-9048. Outside the 50 United States and Canada, please contact
your local Microsoft subsidiary. To access information through the World Wide Web,
go to http://www.microsoft.com.