Exchange Server 2007 Design and Architecture at Microsoft

How the Microsoft Information Technology organization designed the corporate Exchange Server 2007 environment

Technical White Paper

Published: November 9, 2007

Situation

With Exchange Server 2003, Microsoft IT streamlined the messaging environment through server and site consolidation. Server clusters based on Windows Clustering and a highly available, shared storage solution based on storage area network (SAN) technology helped to ensure 99.99 percent availability. However, the costs and general limitations associated with the platforms and technologies used in the Exchange Server 2003 environment prevented Microsoft IT from efficiently meeting emerging messaging and business needs.

Solution

By replacing all Exchange Server 2003 servers with servers running Exchange Server 2007, Microsoft IT created new opportunities to drive down costs and system complexities while at the same time increasing security and deploying new features not available in previous versions of Exchange Server.

Benefits

  • Increased reliability through new high-availability technologies, such as Cluster Continuous Replication.
  • Larger mailbox sizes through Mailbox servers equipped with cost-efficient storage solutions.
  • Reduced total cost of ownership (TCO) through cost-efficient storage solutions and elimination of tape backups.
  • Further reduced TCO by replacing third-party unified messaging systems with Unified Messaging servers in the Exchange organization.
  • Increased protection against attacks, spam, and malicious messages through Edge Transport servers.
  • Reduced topology complexities and improved regulatory compliance through Hub Transport servers.
  • Enhanced remote access and mobility options through Client Access servers.

Products & Technologies

  • Windows Server 2003
  • Active Directory
  • Microsoft Exchange Server 2007
  • Microsoft Office Outlook 2007
  • Enterprise Storage Technologies
Contents
  • Executive Summary
  • Introduction
  • Reasons for Microsoft IT to Upgrade
  • Environment Prior to Exchange Server 2007
  • Planning and Design Process
  • Architecture and Design Decisions
  • Deployment Planning
  • Best Practices
  • Conclusion

Executive Summary

Microsoft Information Technology (Microsoft IT) maintains a complex Microsoft® Exchange Server environment spanning several geographic locations and multiple Active Directory® forests. There are 16 data centers, four of which host Exchange Mailbox servers, to support more than 515 office locations in 102 countries with 121,000 users, including managers, employees, contractors, business partners, and vendors. The site and server consolidation conducted with Microsoft Exchange Server 2003, the new deployment features available in Microsoft Exchange Server 2007, and proven planning, design, and deployment methodologies together enabled Microsoft IT to transition this environment to Exchange Server 2007 in less than eight months. Microsoft IT decommissioned the last Mailbox servers running Exchange Server 2003 in the corporate Active Directory forest shortly after Microsoft released the release to manufacturing (RTM) version of Exchange Server 2007 on December 7, 2006.

This technical white paper discusses the Exchange Server 2007 architectures, designs, and technologies that Microsoft IT chose for the corporate environment and the strategies, procedures, successes, and practical experiences that Microsoft IT gained during the planning and design phase. In addition to common planning and design tasks typical for many Exchange Server deployment projects, such as server design, high-availability implementation, and capacity planning, transitioning a complex messaging environment to run on Exchange Server 2007 also entails specific planning considerations regarding directory integration, routing topology, Internet connectivity, client access technologies, and unified messaging (UM).

The most important benefits Microsoft IT achieved with the production rollout of Exchange Server 2007 included a substantial reduction of hardware, storage, and backup costs while maintaining 99.99 percent availability of messaging services. New features, such as cluster continuous replication (CCR), enabled Microsoft IT to reduce single points of failure in the messaging environment, increase Mailbox server resilience to storage failures, and eliminate tape backups to reduce costs. Exchange Server 2007 also enabled Microsoft IT to overcome the scalability limitations of the 32-bit platform and increase mailbox quotas in the corporate production environment from 200 megabytes (MB) to 500 MB and 2 gigabytes (GB). Microsoft IT also lowered maintenance overhead and associated costs by replacing third-party unified messaging systems with Exchange Server 2007-based Unified Messaging servers. Other important benefits included increased fault tolerance and simplified server maintenance through load-balanced server configurations for middle-tier services such as the Client Access, Hub Transport, and Unified Messaging server roles. Microsoft IT also increased messaging protection through Microsoft Forefront™ Security for Exchange Server installed on all Hub Transport and Edge Transport servers.

This paper contains information for business and technical decision makers who are planning to deploy Exchange Server 2007. This paper assumes the audience is already familiar with the concepts of the Windows Server® 2003 operating system, Active Directory, and previous versions of Exchange Server. A high-level understanding of the new features and technologies included in Exchange Server 2007 is also helpful. Detailed product information is available in the Microsoft Exchange Server 2007 Technical Library at http://technet.microsoft.com/en-us/library/bb124558.aspx.

Note: For security reasons, the sample names of forests, domains, organizations, and other internal resources mentioned in this paper do not represent real resource names used within Microsoft and are for illustration purposes only.

Introduction

From the earliest days, e-mail messaging has been an important communication tool for Microsoft. Microsoft established the first company-wide messaging environment in July 1982 based on Microsoft XENIX (a UNIX version for the Intel 8088 platform). This environment evolved over more than a decade into a large and distributed infrastructure that was increasingly difficult to manage. By migrating to Microsoft Exchange Server version 4.0 in 1996, and subsequently upgrading to Microsoft Exchange Server version 5.0 and Microsoft Exchange Server version 5.5 in 1997, Microsoft IT achieved significant improvements in terms of functionality, maintainability, and reliability. At the beginning of the new millennium, Microsoft IT operated a messaging environment that included approximately 200 Exchange 5.5 servers in more than 100 server locations with approximately 59,000 users.

Many things changed for Microsoft IT with the upgrade to Microsoft Exchange 2000 Server, released in October 2000. Exchange 2000 Server was so tightly integrated with the TCP/IP infrastructure, the Windows® operating system, and Active Directory that the Exchange Messaging team could no longer manage the messaging environment as an isolated infrastructure. A fundamental organizational change was necessary, which manifested itself in a new approach that viewed Microsoft IT as a provider of essential business services. Keeping Exchange servers running was no longer sufficient. The Exchange Messaging team now owned messaging as a service, which included all components upon which Exchange 2000 Server depended, such as Active Directory.

The shift of Microsoft IT toward a service-focused IT organization was also noticeable in the designs and service level agreements (SLAs) that Microsoft IT established with the rollout of Exchange Server 2003, released in October 2003. New technologies, such as support for multi-node server clusters and cached Exchange mode available in Microsoft Office Outlook® 2003, enabled Microsoft IT to concentrate Mailbox servers in four data centers and reduce the number of Exchange servers, including those for special purposes, from approximately 200 to roughly 100. The number of Mailbox servers dropped from 118 to 36. Individual Mailbox servers hosted up to 4,000 mailboxes per active cluster node with quotas of 200 MB per mailbox. Consolidating the corporate messaging environment yielded overall cost savings of $20 million USD in the fiscal year 2003 alone.

Microsoft IT designed the Exchange Server 2003 environment for the scalability and availability requirements of a fast-growing company. The consolidation included upgrades to the network infrastructure and the deployment of large Mailbox servers to support 85,000 mailboxes in total. Business-driven SLAs demanded 99.99 percent availability, including both unplanned outages and planned downtime for maintenance, patching, and so forth. To comply with the SLAs, Microsoft IT deployed almost 100 percent of the Mailbox servers in server clusters by using Windows Clustering and a highly available, shared storage subsystem based on storage area network (SAN) technology.
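
As a rough illustration of what a 99.99 percent target implies, the following Python sketch converts an availability percentage into a downtime budget. The function name and the 365-day measurement period are assumptions made for this example; this paper does not state how Microsoft IT defined the SLA measurement period.

```python
def downtime_budget_minutes(availability_percent: float, days: float = 365.0) -> float:
    """Return the maximum downtime, in minutes, allowed over the given period.

    The 365-day default is an assumption for illustration only.
    """
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    budget = downtime_budget_minutes(99.99)
    print(f"Downtime budget at 99.99 percent: {budget:.1f} minutes per year")
```

Under these assumptions, 99.99 percent leaves about 52.6 minutes per year, and that budget has to cover both unplanned outages and planned maintenance.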

In 2006, prior to Exchange Server 2007, the environment had grown to include 130,000 mailboxes that handled 6 million internal messages and received 1 million legitimate message submissions from the Internet daily. On average, every user sent and received approximately 100 messages daily, amounting to an average e-mail volume per user per day of 25 MB. As the demand for greater mailbox limits increased, new technologies and cost-efficient storage solutions such as direct-attached storage (DAS) were necessary to increase the level of messaging services in the corporate environment.
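
The per-user figures above support a back-of-the-envelope estimate of average message size and total daily volume. The sketch below is illustrative only. It assumes that 1 MB equals 1,024 KB and that each of the 130,000 mailboxes matches the stated per-user average, neither of which this paper states. The function names are hypothetical.

```python
MAILBOXES = 130_000               # mailboxes in 2006, from the text
MESSAGES_PER_USER_PER_DAY = 100   # approximate, from the text
MB_PER_USER_PER_DAY = 25          # approximate, from the text


def average_message_size_kb(mb_per_day: float, messages_per_day: float) -> float:
    """Average message size in KB, assuming 1 MB = 1,024 KB."""
    return mb_per_day * 1024 / messages_per_day


def total_daily_volume_gb(mailboxes: int, mb_per_user_per_day: float) -> float:
    """Total daily volume in GB, assuming every mailbox matches the average."""
    return mailboxes * mb_per_user_per_day / 1024


if __name__ == "__main__":
    size_kb = average_message_size_kb(MB_PER_USER_PER_DAY, MESSAGES_PER_USER_PER_DAY)
    volume_gb = total_daily_volume_gb(MAILBOXES, MB_PER_USER_PER_DAY)
    print(f"Average message size: {size_kb:.0f} KB")
    print(f"Estimated daily volume: {volume_gb:,.0f} GB")
```

With these inputs the estimate comes to 256 KB per message and roughly 3,174 GB of e-mail per day.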

"Our mission is to deliver value by enabling people with innovative and reliable information technology solutions that seamlessly integrate with, and improve how people work."

Jim DuBois

General Manager, MSIT

Microsoft Corporation

Reasons for Microsoft IT to Upgrade

A global survey of 1,400 chief information officers (CIOs) conducted by Gartner Executive Programs in 2006 indicated the focus of IT is increasingly shifting from cost-cutting to improving productivity, performance, and competitiveness. Following a decade of unrestrained growth in IT and approximately five years of consolidation and cutbacks thereafter, IT departments are now in a better position to refocus their strategies on future-oriented business goals.

Microsoft is a good example. After a period of cost-cutting through server and site consolidation, Microsoft IT used the deployment of Exchange Server 2007 as an opportunity to shift gears and focus on implementing solutions that help improve productivity and competitiveness, streamline system administration, and increase messaging protection beyond the levels already possible with Exchange Server 2003 Service Pack 2.

For the deployment of Exchange Server 2007, Microsoft IT defined the following key objectives:

  • Increase employee productivity   This included increasing mailbox quotas as much as tenfold, from 200 MB to 500 MB and 2 GB, and deploying new productivity features available in Exchange Server 2007, such as unified messaging, which enables users to receive all messages in their mailboxes, including e-mail, voice mail, and fax messages. In addition to desktop and portable clients, users can use standard telephones to access these messages. Increasing employee productivity also included deploying Microsoft Office Outlook 2007 as the primary messaging client so users can benefit from new and advanced information management features such as instant search, managed folders, and more.
  • Increase operational efficiency   This included reducing administrative overhead associated with maintaining the messaging environment through features that are directly available in Exchange Server 2007, such as Exchange Management Shell. Based on the Microsoft .NET Framework, Exchange Management Shell enables Microsoft IT to create custom scripts that facilitate mundane deployment tasks, such as applying a consistent set of configuration settings per server role across multiple servers.
  • Decrease security risks   This included deploying Edge Transport servers and Forefront Security for Exchange Server in the perimeter network to increase security and messaging protection and to reduce the number of legitimate messages incorrectly identified as spam (false positives). This also included encrypting all internal server-to-server message traffic using Transport Layer Security (TLS) to help protect confidentiality for messages in transit.
  • Decrease costs   This included redesigning server architectures and backup solutions for high availability to meet challenging SLAs. In redesigning server architectures, Microsoft IT heavily focused on incorporating the features directly available in Exchange Server 2007, replacing expensive SAN storage with a more cost-efficient DAS solution for CCR-based Mailbox server clusters, and eliminating tape backups. All of these considerations resulted in significant cost savings. From backup changes alone, Microsoft IT realized cost savings of approximately $5 million per year.

Note: The Technical Case Study "Enterprise Messaging with Microsoft Exchange Server 2007" (http://technet.microsoft.com/en-us/library/bb687782.aspx) provides detailed information about the business benefits and advantages that Microsoft IT realized with the transition to Exchange Server 2007 in the corporate production environment.

Environment Prior to Exchange Server 2007

The Microsoft IT deployment options and design decisions for the transition to Exchange Server 2007 heavily depended on the characteristics of the existing network, Active Directory, and the messaging environment. Among other things, it was important to perform the transition from Exchange Server 2003 to Exchange Server 2007 without service interruptions or data loss. Like any organization operating a large messaging environment, Microsoft IT also had to take a phase of coexistence into account, because it was not possible to transition the entire environment in one gigantic step. To understand the Microsoft IT design decisions in detail, it is necessary to review the environment in which Microsoft IT performed the transition.

Figure 1 illustrates the locations of the data centers that contain Mailbox servers in the corporate production environment and the overall wide area network (WAN) connections between them. Concerning the WAN backbone, it is important to note Microsoft IT deliberately designed the network links to exceed capacity requirements. Only 10 percent of the theoretically available network bandwidth is dedicated to messaging traffic. The vast majority of the bandwidth is for non-messaging purposes to support the Microsoft product development teams.

Figure 1. The Exchange Server 2003 environment at Microsoft on March 31, 2006

Note: In addition to the data centers shown in Figure 1, there is one more important site with Exchange servers in North America: Silicon Valley. The Exchange servers in Silicon Valley provide a redundant path for sending and receiving Internet mail. This site does not contain Mailbox servers.

Network Infrastructure

Each Microsoft data center is responsible for a different region defined along geographical boundaries. Within each region, network connectivity between offices and the data center varies widely. For example, a high-speed metropolitan area network (MAN) based on Gigabit Ethernet and Synchronous Optical Network (SONET) connects more than 70 buildings to the Redmond data center. These are the office buildings on and off the main campus in the Puget Sound area. In other regions, such as Asia and the South Pacific, Internet-connected offices, or ICOs for short, are more dominant. Broadband connectivity solutions, such as digital subscriber line (DSL) or cable modems, provide significant cost savings over leased lines as long as the decrease in performance and maintainability is acceptable. Microsoft IT uses this type of connectivity primarily for regional sales and marketing offices.

Figure 2 summarizes the typical regional connectivity scenarios at Microsoft. It is important to note there are no Mailbox servers outside the data center, whereas Active Directory domain controllers may exist in high-availability buildings and medium-sized offices for local handling of user authentication, authorization, and application requests.

Figure 2. Regional connectivity scenarios at Microsoft

For regional connectivity, Microsoft IT relies on a mix of Internet-based and privately owned/leased connections, as follows:

  • Regional data centers and main campus   The main campus and regional data centers are connected together in a privately owned WAN based on frame relay, asynchronous transfer mode (ATM), clear channel ATM links, and SONET links.
  • Office buildings with standard or high availability requirements   Office buildings connect to regional data centers over fiber-optic network links with up to eight wavelength division multiplexing (WDM) channels per fiber pair.
  • Regional offices with up to 150 employees   Regional offices use a persistent broadband connection or leased line to a local Internet service provider (ISP) and then access their regional data centers through transparent virtual private network over Internet Protocol security (VPN/IPsec) tunnels.
  • Mobile users   Mobile users use a dial-up or non-persistent broadband connection to a local ISP, and then access their mailboxes through VPN/IPsec tunnels, or by using Microsoft Exchange ActiveSync®, remote procedure call (RPC) over Hypertext Transfer Protocol (RPC over HTTP, also known as Outlook Anywhere), or Microsoft Office Outlook Web Access over secure HTTP (HTTPS) connections.

Directory Infrastructure

Like many IT organizations that must keep business units strictly separated for legal or other reasons, Microsoft IT has implemented an Active Directory environment with multiple forests. Each forest provides a clear separation of resources and establishes strict boundaries for effective security isolation. At Microsoft, some forests exist for legal reasons, others correspond to business divisions within the company, and yet others are dedicated to product development groups for early pre-release deployments and testing without interfering with the rest of the corporate environment. For example, by maintaining separate product development forests, Microsoft IT can prevent uncontrolled Active Directory schema changes in the Corporate Forest.

The most important forests at Microsoft have the following purposes:

  • Corporate   Seventy percent of the resources used at Microsoft reside in this forest. The Corporate Forest includes approximately 140 domain controllers. The Active Directory database is 11 GB in size, with over 1 million directory objects, including users, groups, organizational units, workstations, servers, domain controller accounts, and printer objects.
  • Corporate Staging   Microsoft IT uses this forest to stage software images, gather performance metrics, and create deployment documentation.
  • Exchange Development   Microsoft uses this forest for running pre-release Exchange Server versions in a limited production environment. Users within this forest use beta or pre-beta versions in their daily work to help identify issues prior to the release of the product. Microsoft IT manages and monitors this forest, while the Exchange Server development group hosts the mailboxes in this forest to validate productivity scenarios.
  • Extranet   Microsoft IT has implemented this forest to provide business partners with access to corporate resources. There are approximately 30,000 user accounts in this forest.
  • MSN®   MSN is an online content provider through Internet portals such as msn.com. Microsoft IT manages this forest jointly with the MSN technology team.
  • MSNBC   MSNBC is a news service and a joint venture between Microsoft and NBC Universal News. Legal reasons require Microsoft to maintain a separate forest for MSNBC. Microsoft IT manages this forest jointly with the MSNBC technology team.
  • Test Extranet   This forest enables the Extranet Technology team to test new solutions for partner collaboration without interfering with the Extranet Forest. Microsoft IT manages this forest jointly with the Extranet Technology team.
  • Windows Deployment   Microsoft IT created this forest to launch pilot projects during the Windows Server 2003 deployment phase as a pre-staging environment prior to deployment and feature configuration in the Corporate Forest. It is a limited production forest. Users within this forest use beta or pre-beta software in their daily work to help product development groups identify and eliminate design flaws and other issues.
  • Windows Legacy   This forest is used as a test environment for compatibility testing of previous Windows Server versions with Exchange Server (specifically Microsoft Windows 2000 service pack testing).

Note: Microsoft IT maintains a common global address list (GAL) across all relevant forests that contain Exchange Server organizations by using Active Directory GAL management agents, available in Microsoft Identity Integration Server (MIIS) 2003.

Domains in the Corporate Forest

Microsoft IT implemented nine domains in the Corporate Forest, separated into geographic regions. At the time of the production rollout, all domains in the Corporate Forest operated at the Windows Server 2003 functional level and contained between seven and 30 domain controllers. The domain controllers are 64-bit multi-processor systems with 16 GB of random access memory (RAM). The Microsoft IT Active Directory database in the Corporate Forest is approximately 11 GB in size. With 16 GB of RAM, domain controllers can load the entire 11 GB Active Directory database into memory, which provides good performance for Exchange Server and other applications that extensively perform directory lookups by using Lightweight Directory Access Protocol (LDAP).

Note: Microsoft IT does not use domains to decentralize user account administration. The human resources (HR) department centrally manages the user accounts, including e-mail address information, in a separate line-of-business (LOB) application. The HR system provides advanced business logic not readily available in Active Directory to enforce consistency and compliance. It is the authoritative source of user account information, synchronized with Active Directory through MIIS 2003.

Active Directory Sites

Overall, the corporate production environment (that is, the Corporate Forest) includes 202 Active Directory sites in a hub-and-spoke topology that closely mirrors the network infrastructure. The authoritative source of IP address and subnet information necessary for the Active Directory site definitions is an infrastructure database that the Microsoft IT network team maintains. Using MIIS 2003, Microsoft IT provisions site and subnet objects in Active Directory based on the data from the IP address and subnet infrastructure database, which keeps the Active Directory site topology an accurate mirror of the network layout. The MIIS solution automatically calculates all site links during the import into Active Directory. Based on this information, Knowledge Consistency Checker (KCC) updates the replication topology for the forest. Microsoft IT does not maintain the Active Directory replication topology manually.
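
The idea of deriving a hub-and-spoke set of site links from subnet inventory data can be sketched in a few lines of Python. This is a simplified illustration, not the MIIS 2003 logic that Microsoft IT used. The record layout, the function name, the subnet ranges, and the site name ADSITE_DUBLIN are assumptions made for the example.

```python
from typing import Iterable, List, NamedTuple, Tuple


class SubnetRecord(NamedTuple):
    """One row of a hypothetical subnet inventory (layout is an assumption)."""
    cidr: str
    site: str


def derive_site_links(records: Iterable[SubnetRecord], hub: str) -> List[Tuple[str, str]]:
    """Return one (hub, spoke) link for every site other than the hub."""
    sites = {record.site for record in records}
    if hub not in sites:
        raise ValueError(f"hub site {hub!r} has no subnets in the inventory")
    return sorted((hub, site) for site in sites if site != hub)


if __name__ == "__main__":
    # Subnet ranges below are invented for illustration.
    inventory = [
        SubnetRecord("10.1.0.0/16", "ADSITE_REDMOND"),
        SubnetRecord("10.1.20.0/24", "ADSITE_REDMOND-EXCHANGE"),
        SubnetRecord("10.2.0.0/16", "ADSITE_DUBLIN"),
        SubnetRecord("10.2.8.0/24", "ADSITE_DUBLIN"),
    ]
    for hub_site, spoke_site in derive_site_links(inventory, "ADSITE_REDMOND"):
        print(f"{hub_site} <-> {spoke_site}")
```

Generating links from the inventory rather than maintaining them by hand means that a new subnet record for a new site produces its site link automatically on the next import.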

Site Topology and Exchange Server 2003

A highly granular site and subnet topology provides advantages in terms of Exchange Server communication. Although Exchange Server 2003 relies primarily on routing groups to describe the physical network topology, communication with Active Directory and client referral logic is site-aware. The Directory Service Access (DSAccess) component prefers to communicate with domain controllers and global catalog servers in the local Active Directory site. Through an Active Directory site topology that closely mirrors the physical network infrastructure, Microsoft IT confines DSAccess communication to local network segments.
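
The following Python sketch illustrates why a granular subnet-to-site mapping keeps directory traffic local. It maps a server address to a site by longest-prefix match and then lists the directory servers in that site first. It is a simplified model, not the actual DSAccess selection algorithm. All subnet ranges, server names, and the site name ADSITE_DUBLIN are hypothetical.

```python
import ipaddress
from typing import Dict, List, Optional

# Hypothetical subnet-to-site mapping; the address ranges are invented.
SUBNET_TO_SITE: Dict[str, str] = {
    "10.1.0.0/16": "ADSITE_REDMOND",
    "10.1.20.0/24": "ADSITE_REDMOND-EXCHANGE",
    "10.2.0.0/16": "ADSITE_DUBLIN",
}

# Hypothetical directory servers and the site each one belongs to.
DIRECTORY_SERVERS: Dict[str, str] = {
    "dc-red-01": "ADSITE_REDMOND",
    "gc-exch-01": "ADSITE_REDMOND-EXCHANGE",
    "gc-exch-02": "ADSITE_REDMOND-EXCHANGE",
    "dc-dub-01": "ADSITE_DUBLIN",
}


def site_for_address(address: str) -> Optional[str]:
    """Return the site whose most specific subnet contains the address."""
    ip = ipaddress.ip_address(address)
    best_prefix = -1
    best_site = None
    for cidr, site in SUBNET_TO_SITE.items():
        network = ipaddress.ip_network(cidr)
        if ip in network and network.prefixlen > best_prefix:
            best_prefix, best_site = network.prefixlen, site
    return best_site


def preferred_directory_servers(address: str) -> List[str]:
    """List directory servers with those in the caller's own site first."""
    site = site_for_address(address)
    local = sorted(name for name, s in DIRECTORY_SERVERS.items() if s == site)
    remote = sorted(name for name, s in DIRECTORY_SERVERS.items() if s != site)
    return local + remote


if __name__ == "__main__":
    print(site_for_address("10.1.20.15"))
    print(preferred_directory_servers("10.1.20.15"))
```

In this model, a server at 10.1.20.15 falls into the more specific /24 subnet and is therefore placed in the dedicated Exchange site, so its lookups go to the two global catalog servers in that site before any other domain controller.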

Figure 3 shows the Active Directory sites in the Corporate Forest that are relevant for the Exchange Server organization.

Figure 3. Exchange Server 2003 Active Directory sites and site links at Microsoft

As shown in Figure 3, there are four sites with Mailbox servers and a fifth site with Internet mail gateway servers running Exchange Server 2003. The remaining Active Directory sites, ADSITE_REDMOND and ADSITE_NORTH CAROLINA, contain infrastructure servers and domain controllers to handle authentication requests from client workstations and LOB applications, but no Exchange servers. Although these sites are not very relevant for Exchange Server 2003, they influence the future Exchange Server 2007 design, as explained later in this white paper. With respect to the Exchange Server 2007 design, it is important to note that all site links are bidirectional and IP-based.

Note: The Active Directory site topology at Microsoft mirrors the network layout of the corporate production environment, with ADSITE_REDMOND as the hub site in a hub-and-spoke arrangement of sites and site links. In contrast, the routing topology of Exchange Server 2003, defined through a hub-and-spoke topology of routing groups and routing group connectors, used a routing group containing the servers that physically resided in the ADSITE_REDMOND-EXCHANGE site as its hub. This difference is significant for the design of Exchange Server 2007 message routing, discussed later in this paper.

Dedicated Exchange Site Design

It is also worth mentioning that the Active Directory site named ADSITE_REDMOND-EXCHANGE contains only Exchange servers and domain controllers configured as global catalog servers. Microsoft created this dedicated site design during the Exchange 2000 Server time frame to provide its Exchange 2000 and 2003 servers with exclusive access to highly available Active Directory servers, shielded from client authentication and other application traffic.

Microsoft IT continues to use the dedicated Exchange site in its Exchange Server 2007 environment for the following reasons:

  • Performance assessments   Exclusive Active Directory servers provide an opportunity to gather targeted performance data. Based on this data, Microsoft IT and developers can assess the impact of Exchange Server versions and service packs on domain controllers in a genuine large-scale production environment.
  • Windows Server 2008 domain controllers   Microsoft IT maintained Windows Server 2008 Beta domain controllers in the corporate production environment. Microsoft IT decided to separate the Windows Server 2008 deployment from the messaging environment by using Active Directory sites without Exchange servers, such as ADSITE_REDMOND.

Warning: Implementing dedicated Active Directory sites for Exchange Server increases the complexity of the directory replication topology and the required number of domain controllers in the environment. To maximize the return on investment (ROI), customers should weigh the business and technical needs for dedicated Active Directory sites. Due to the Exchange Server 2007 reliance on Active Directory sites, dedicated sites can increase the complexity of Exchange Server 2007 topologies and design.

Messaging Infrastructure

The design of an Exchange Server 2003 organization relies on two topologies: a flat arrangement of administrative groups and a topology of routing groups and routing group connectors. Figure 4 shows the administrative and routing group topology of Microsoft IT's Exchange Server 2003 organization in the Corporate Forest and connectivity to other production forests and Exchange organizations.

Figure 4. Administrative and routing groups in the Corporate Forest just prior to the start of the transition to Exchange Server 2007 (March 31, 2006)

Administrative Topology

Microsoft IT uses a centralized administration model for Exchange servers. Although there are four administrative groups (North America, Dublin, Singapore, and Sao Paulo) in the legacy Exchange Server topology, Microsoft IT did not use custom permissions based on administrative groups. All Exchange Server administrators are located in Redmond and perform system administration and remote monitoring. The regional data centers are only responsible for hardware maintenance.

Note: Microsoft data centers operate 24 hours a day, seven days a week. In the event of a hardware problem, such as a disk failure, a local IT specialist is readily available to replace the affected hardware with minimum delay. The only exception is Sao Paulo, which observes regular business hours.

Routing Topology

The Exchange Server 2003 routing topology followed a hub/spoke architecture that corresponded to the WAN links depicted in Figure 1 earlier in this white paper. The data centers in Redmond, Dublin, Singapore, and Sao Paulo corresponded to routing groups with Mailbox servers. Two additional routing groups existed with gateway servers dedicated to external communication: RG_REDMOND PERIMETER and RG_SILICON VALLEY PERIMETER. Silicon Valley provided Internet mail redundancy for the messaging environment.

Between the routing groups, Microsoft IT used routing group connectors (RGCs) with the default option "Any local server can send mail over this connector" selected. Accordingly, all Exchange servers were able to transfer messages to adjacent routing groups directly. Using only one physical path to each routing group and all Exchange servers as local bridgeheads eliminated the need to communicate link state changes within and across routing groups. It also enabled Microsoft IT to keep the number of dedicated bridgehead servers to a minimum.

To implement the hub/spoke topology, Microsoft IT deployed four central bridgehead servers in North America, which Microsoft IT specified as remote bridgehead servers in the RGC configuration for the regions. Exchange servers in RG_DUBLIN, RG_SINGAPORE, and RG_SAO PAULO that wanted to transfer messages to other routing groups could do so through one of these bridgehead servers. In this way, the bridgehead servers created a bifurcation point for messages addressed to recipients in multiple locations. By splitting messages into multiple copies (bifurcation) at the latest possible point (the bridgeheads), Microsoft IT preserved network bandwidth on WAN links.
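
The bandwidth effect of late bifurcation can be illustrated with a small Python model that counts how many copies of one message cross the WAN link between a regional routing group and the hub. This is a sketch under simplifying assumptions, not a description of the Exchange Server 2003 routing engine. The function name is hypothetical, and the model assumes every remote destination is reached through the hub.

```python
from typing import Iterable


def copies_on_source_link(source_group: str,
                          recipient_groups: Iterable[str],
                          bifurcate_at_hub: bool = True) -> int:
    """Count message copies sent over the WAN link from the source group to the hub.

    With bifurcation at the hub bridgeheads, a single copy crosses the link
    and is split there. With bifurcation at the source, one copy per remote
    destination routing group crosses the link.
    """
    remote_groups = {group for group in recipient_groups if group != source_group}
    if not remote_groups:
        return 0  # all recipients are local, so nothing crosses the WAN
    return 1 if bifurcate_at_hub else len(remote_groups)


if __name__ == "__main__":
    recipients = ["RG_DUBLIN", "RG_SINGAPORE", "RG_SAO PAULO"]
    late = copies_on_source_link("RG_DUBLIN", recipients, bifurcate_at_hub=True)
    early = copies_on_source_link("RG_DUBLIN", recipients, bifurcate_at_hub=False)
    print(f"Copies on the Dublin WAN link, split at the hub: {late}")
    print(f"Copies on the Dublin WAN link, split at the source: {early}")
```

In this model, a message sent from RG_DUBLIN to recipients in RG_SINGAPORE and RG_SAO PAULO crosses the Dublin WAN link once when the hub bridgeheads perform the split, compared with twice if the sending server created the copies itself.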

The four central bridgehead servers also performed message routing for communication with external entities, such as Internet destinations, partner domains, and Exchange organizations in other forests. For communication with Exchange organizations in other forests, Microsoft IT configured Simple Mail Transfer Protocol (SMTP) connectors directly on the bridgeheads (see Figure 4). Messages addressed to partners or Internet recipients, on the other hand, reached their destinations through further bridgehead servers (for antivirus scanning) and Internet mail gateways.

The Internet mail gateways in RG_DUBLIN and RG_SINGAPORE only handled outbound Internet mail for their local routing groups, whereas the Internet mail gateways in RG_REDMOND PERIMETER and RG_SILICON VALLEY PERIMETER were responsible for outbound and inbound Internet mail transfer. All inbound Internet mail messages reached Microsoft through two redundant locations in North America, which was the most efficient configuration for Microsoft because it has a flat e-mail domain namespace (@microsoft.com) and the majority of mailboxes are located in its Redmond location. The Internet mail gateways performed a series of anti-spam and other filtering checks (for example, block-list, sender, and recipient filtering) before routing the messages to internal bridgehead servers for virus scanning. Microsoft IT opted not to co-deploy antivirus solutions on the 32-bit Internet mail gateway servers so that it could apply anti-spam filtering first and avoid the overhead associated with virus scanning.

Additional Information: The Internet mail gateways in Redmond and Silicon Valley received up to 13 million daily message submissions from the Internet in normal situations, blocking 10.5 million of these as not legitimate (March 2006 statistics). During virus outbreaks on the Internet, the load occasionally exceeded 30 million e-mail submissions per day.

Server Roles per Location

Microsoft IT assigned a dedicated role to most servers in the Exchange Server 2003 organization, which allowed a more precise hardware configuration for each server to reflect high performance and scalability demands. An exception to this rule was the Mailbox server in Sao Paulo. Due to moderate workload, Microsoft IT was able to consolidate server roles in this location. The server in Sao Paulo acted as a Mailbox server, public-folder server, and bridgehead server. Moderate workload also enabled Microsoft IT to assign multiple roles to the public-folder servers in Dublin and Singapore. These servers assumed the responsibilities of public-folder servers and bridgehead servers for the regions.

Table 1 lists the number of servers per role that Microsoft IT deployed in each routing group (March 31, 2006).

Table 1. Servers per Role per Routing Group Prior to Exchange Server 2007 Transition

Routing group                Mailbox servers (clustered)  Public-folder servers*  Bridgehead servers  Front-end servers  Gateway servers  Special purpose**
RG_REDMOND-EXCHANGE          21                           5                       8                   6                  0                3
RG_DUBLIN                    6                            2†                      2†                  2                  2                0
RG_SINGAPORE                 5                            2†                      2†                  2                  2                0
RG_SAO PAULO                 1†                           1†                      1†                  2                  0                0
RG_REDMOND PERIMETER         0                            0                       0                   0                  3                0
RG_SILICON VALLEY PERIMETER  0                            0                       0                   0                  3                0

† Combined roles: the same servers fill every role marked in that row and are counted once (see "Server Roles per Location").

* For hosting public-folder data, free-and-busy information, and offline address books

** To support messaging needs of internal LOB applications, etc.

Note: Because of server and site consolidation during the Exchange Server 2003 time frame, Microsoft IT deployed almost all Exchange Server 2003 Mailbox servers as clustered systems with a maximum of 4,000 mailboxes per Exchange Virtual Server (EVS). Through this configuration, Microsoft IT achieved 99.99 percent server availability, as explained in the Technical Solution Brief "Achieving High Availability with Exchange Server at Microsoft" (http://technet.microsoft.com/en-us/library/bb687782.aspx).

Planning and Design Process

The Microsoft IT planning and design process is distinctive in that messaging engineers start their work early in the product development cycle and collaborate closely with the Microsoft Exchange Server Product group to clarify exactly how the new Exchange Server version should address concrete business, system, operational, and user requirements. Through an assessment of the Exchange Server 2003 environment and in discussions with partner and customer IT organizations, Microsoft IT identified general issues, such as high storage costs and server scalability issues on the 32-bit platform, and communicated the findings to the developers as opportunities for improvement. In this way, the planning and design process at Microsoft IT actually influenced the product itself, addressing not only the requirements of Microsoft but also the needs of partner and customer IT organizations.

Figure 5 illustrates how Microsoft IT aligned the Exchange Server 2007 design and deployment processes from assessment and scoping through full production rollout. The individual activities correspond to the phases and milestones outlined in the Microsoft Solutions Framework (MSF) Process Model.


Figure 5. Microsoft IT planning, design, and deployment processes

Note: Detailed information about the MSF, including an MSF Resource Kit and case studies, is available on Microsoft TechNet.

The next sections discuss key activities that helped Microsoft IT to determine an optimal Exchange Server 2007 architecture and design for the corporate production environment.

Assessment and Scoping

In extensive planning sessions, product managers, service managers, the Exchange Systems Management team, Tier 2 Support team, Helpdesk, and Messaging Engineering team collaborated in virtual project teams to identify business and technical requirements and translated these requirements into proposals to the Exchange Server Product group. The Exchange Server Product group reviewed and incorporated these proposals into its product development plans. The results were commitments and shared goals between the developers and Microsoft IT to drive deployment actions and investments intended to improve IT services.

Deployment Planning Exercises

Within Microsoft IT, the Messaging Engineering team is responsible for creating the architectures and designs of all Exchange-related technologies. At a stage when the actual product was not available, messaging engineers began their work with planning exercises based on product development plans. The objective of these exercises was to decide how to deploy the new Exchange Server version in the future. The messaging engineers based their design decisions on specific productivity scenarios, the scalability and availability needs of the company, and other requirements defined during the assessment and scoping phase. For example, the Exchange Messaging team decided to use Cluster Continuous Replication (CCR) to eliminate single points of failure in the Mailbox server configuration and direct-attached storage (DAS) to drive down storage costs while at the same time increasing mailbox quotas by up to a factor of 10. The deployment planning exercises helped to identify the hardware and storage technologies that Microsoft IT needed to invest in to achieve the desired improvements.

Engineering Lab

The Messaging Engineering team maintains a lab environment that simulates the corporate production environment in terms of technologies, topology, product versions, and interoperability scenarios, but without production users. The Engineering Lab includes examples of the same hardware and storage platforms that Microsoft IT uses in the corporate production environment. It provides the analysis and testing ground for the messaging engineers to validate, benchmark, and optimize designs for specific scenarios, test components, verify deployment approaches, and estimate operational readiness. Testing in the Engineering Lab enables the messaging engineers to ensure that the conceptual and functional designs scale to the requirements and scope of the deployment project. For example, code instabilities or missing features of beta products might require Microsoft IT to alter designs and integration plans. Messaging engineers can verify the capabilities of chosen platforms and work with the product teams and hardware vendors to make sure the deployed systems function as expected even when running pre-release versions.

Pre-Release Production Deployments

Microsoft IT maintains a pre-release infrastructure, which is a limited production environment for running pre-release versions of server products. Pre-release production deployments begin prior to the alpha phase and continue through the beta and release candidate (RC) stages until Microsoft releases the product to manufacturing. During the pre-alpha stage, pre-release production deployments are a developer effort. Additional employees from within Microsoft join the campaign during the beta stages as early adopters.

Pre-release production deployments enable the developers to determine the enterprise readiness of the software, identify issues that might otherwise not be found prior to RTM, and collect valuable user feedback. For example, Exchange Server 2007 pre-release verification started in February 2005, more than 22 months before the Exchange Server Product group shipped the product. In comparison, the Microsoft Exchange Server 2003 pre-release deployment period was only six months.

Technology Adoption Program

The Exchange Server 2007 Technology Adoption Program (TAP) started during the pre-alpha stage in April 2005. TAP is a special Microsoft initiative, available by invitation only, to obtain real-world feedback from Microsoft partners and customers. More than 90 Microsoft partners and customers participated in the Exchange Server 2007 TAP. The Messaging Engineering team was also actively involved by providing early adopters with presentations outlining the Microsoft IT design process based on the then-current state of the product.

Microsoft runs several types of TAP programs for partners and customers to obtain real-world feedback on Microsoft pre-release products. For more information, see the TAP early-adopter information on MSDN at http://msdn2.microsoft.com/en-us/isv/bb190413.aspx.

Pilot Projects

An important task of the Messaging Engineering team is to document all designs, which the messaging engineers pass as specifications to the technical leads in the Systems Management team for acceptance and implementation. The messaging engineers also assist the technical leads during pilot projects and server installations and help develop a set of build documents and checklists to provide operators with detailed deployment instructions for the full-scale production rollout.

Production Rollout

The server designs that the Messaging Engineering team creates include detailed hardware specifications for each server type, which the Infrastructure Management team and the Data Center Operations group at Microsoft IT use to coordinate the procurement and installation of server hardware in the data centers. The Data Center Operations group builds the servers, installs the operating systems, and joins the new servers to the appropriate forest before the Exchange Systems Management team takes over for the deployment of Exchange Server 2007 and related components. To achieve a rapid deployment, Microsoft IT automated most of the Exchange Server deployment steps by means of Exchange Management Shell scripts.

Architecture and Design Decisions

"Microsoft IT is our first and best customer. Almost two years prior to RTM, Microsoft IT began with pre-release production deployments to help us build an excellent product. The close relationship with Microsoft IT is so vital to our culture of quality and customer satisfaction that we do not ship products or service packs until Microsoft IT signs off on the enterprise readiness. We shipped Exchange Server 2007 on December 7, 2006, with the confidence and proof in hand that the product delivers on its potential to help customers build reliable enterprise-class messaging environments while reducing total cost of ownership."

Terry Myerson

General Manager

Exchange Server Product Group

Microsoft Corporation

One of the most important objectives that Microsoft defined for the Exchange Server 2007 production rollout was to finish the transition no later than the official RTM date of the product. Microsoft IT committed to perform the rollout at full scale by using the Beta 2 release to demonstrate the enterprise readiness of Exchange Server 2007 to Microsoft customers. However, the Beta 2 release was not perfect, and Microsoft IT did not know all of the product's performance parameters yet. For these reasons, Microsoft IT deliberately created initial designs for the Client Access, Hub Transport, Edge Transport, Mailbox, and Unified Messaging server roles that exceeded actual production requirements. To maximize the ROI of server hardware and storage technology, Microsoft IT began to optimize the designs after the completion of the initial rollout. The following sections discuss these updated designs.

Administration and Permissions Model

Like any public company in the United States, Microsoft must safeguard the corporate IT infrastructure to comply with legal and regulatory requirements, such as the Health Insurance Portability and Accountability Act of 1996 and the Sarbanes-Oxley Act of 2002. These regulations also apply to the Exchange Server environment. To prevent fraud, protect assets, and ensure that messaging resources are used as intended, Microsoft IT implemented a strict administrative design according to the principle of fewest privileges as well as formal approval processes for granting administrative rights.

Security Principles and Guidelines

The Messaging Engineering team used the following principles and guidelines in the administrative design for Exchange Server 2007:

  • Always use groups to assign rights   Delegating administrative permissions through security groups is uncomplicated, easy to understand, and a best practice in enterprise IT environments. Among other things, users can determine their effective rights by analyzing their group memberships. Moreover, security groups eliminate the need to configure individual access control lists (ACLs) for resources, which helps to reduce complexities caused by possibly conflicting rights granted through multiple direct or inherited access control entries (ACEs) and helps to keep the size of ACLs under control. Granting rights through group membership is also more efficient than granting access permissions through individual assignments, because a security group's access permissions do not change when members are added to or removed from the group.
  • Use the default permissions model   The default permissions model of Exchange Server 2007 covers most of the Microsoft IT requirements. Except where specific needs such as regulatory requirements dictate otherwise, Microsoft IT does not deviate from the default model, which keeps the number of ACLs on Exchange Server resources to the necessary minimum.
  • Grant the least permission necessary   The principle of fewest privileges is a generally recognized security approach and a Microsoft IT best practice. Microsoft IT never grants full Domain or Enterprise Admin rights unless there is a compelling business or technical reason. For example, administrators who need to modify recipient attributes in Active Directory do not need and do not receive access permissions to resources in the Exchange Server organization.
  • Implement approval processes   Each security group that Microsoft IT uses for granting rights must be controlled based on approval processes that include the IT team responsible for maintaining the service or data.

Exclusive Microsoft IT Management

The administrative model of Exchange Server 2007 relies on Active Directory forests to define security boundaries. Within a single forest, there is no security isolation, because forest owners and enterprise administrators can always gain access to all resources in any domain. Accordingly, Microsoft IT grants enterprise administrator and top-level domain administrator rights in the Corporate Forest only on a temporary basis and enforces very strict approval processes.

Very strict approval processes imply that developers from the Exchange Server Product group and employees who are not responsible for operating the corporate production environment are usually not granted administrator rights in the Corporate Forest. If individuals outside Microsoft IT require administrative access to Exchange Server resources for testing or other purposes, the corresponding resources cannot be part of the Corporate Forest. To accommodate these situations, Microsoft IT maintains separate forests, such as an Exchange Development Forest that the Exchange Server Product group can use to test pre-release versions of Exchange Server.

Only in rare situations does Microsoft IT grant developers temporary administrative access to production servers, such as when troubleshooting critical server failures. To accommodate these situations, Microsoft IT implemented an access model based on Group Policy objects (GPOs) and Restricted Groups policies. User accounts that are removed from the membership list in a Restricted Groups policy are removed from the corresponding security group during the next GPO refresh. This refresh occurs every 90 minutes on a member server and every five minutes on a domain controller; in addition, the settings are reapplied every 16 hours, whether or not there are any changes. In this way, Microsoft IT enforces the removal of temporary administrative access and maintains the principle of fewest privileges in the corporate production environment.
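The effect of a Restricted Groups policy at refresh time can be pictured as a reconciliation between the current group membership and the list in the policy. The following Python sketch is only an illustration of that idea under simplifying assumptions: the function name and the account names are hypothetical, and the code does not call any Windows or Group Policy API.

```python
def apply_restricted_group(current_members, policy_members):
    """Model one policy refresh for a single restricted group.

    Returns a tuple (members_after_refresh, removed, added).
    Accounts that are not in the policy list are removed; accounts that
    are in the policy list but missing from the group are added.
    """
    current = set(current_members)
    policy = set(policy_members)
    removed = current - policy
    added = policy - current
    return policy, removed, added


# Hypothetical accounts: a developer was granted temporary access and has
# since been taken off the membership list in the policy.
members, removed, added = apply_restricted_group(
    current_members=["ops-admin", "dev-temp"],
    policy_members=["ops-admin"],
)
print(sorted(members), sorted(removed), sorted(added))
```

In this model, the temporary account disappears from the group at the next refresh without any manual cleanup, which is the property the access model relies on.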

Centralized System Administration

Exchange Server 2007 supports centralized and decentralized administration by means of four administrator roles:

  • Exchange Organization Administrators   As the name implies, the Exchange Organization Administrators role gives administrators organization-wide, full access to all Exchange properties, objects, and servers. All Microsoft IT operators within the Exchange Messaging team are Exchange Organization Administrators with the rights to read and change all Exchange configuration data in the Corporate Forest worldwide. This strictly centralized model greatly reduces administrative complexity.
  • Exchange Server Administrators   These administrators only have permissions to control the configuration of a particular server or group of selected servers and cannot perform global, organization-wide administration tasks. Because Microsoft IT manages all Exchange Server resources centrally from headquarters in Redmond, it was not necessary for Microsoft IT to delegate the Exchange Server Administrators role for subsets of Exchange servers to separate user accounts or security groups.
  • Exchange Recipient Administrators   Users assigned the Exchange Recipient Administrators role take care of recipient-related tasks, such as mailbox-enabling user accounts, mail-enabling contacts, distribution groups, and other types of recipient objects in Active Directory, and configuring Client Access and Unified Messaging mailbox settings. It is important to note that the Exchange Messaging team does not perform recipient administration. Other teams, such as the HR department, maintain the user accounts and related Exchange settings.
  • Exchange View-Only Administrators   These are users without operational tasks, such as Helpdesk leads and messaging engineers, who have a business need to examine the parameters and current state of the messaging environment.

Note: The administration and permissions model of Exchange Server 2007 does not rely on administrative groups. An administrative group called Exchange Administrative Group (FYDIBOHF23SPDLT) exists only for backward-compatibility reasons. All computers running Exchange Server 2007 belong to this administrative group. Renaming this group or moving server objects into a different administrative group by using low-level Active Directory tools is not supported.

Default Permissions Model

The Exchange Server 2007 permissions model is straightforward and flexible because it promotes the use of universal security groups for permissions management that closely correspond to the administrative roles listed in the previous section. The Exchange Setup /PrepareAD process creates these universal security groups in the root domain with a forest-wide scope. The groups are located in the Microsoft Exchange Security Groups container. They are globally available for permission assignments to Exchange Server resources in any domain, and they can contain users and groups from any domain in the forest.

Microsoft IT based its centralized administration model for Exchange Server 2007 on the following default universal security groups:

  • Exchange View-Only Administrators   Have read-only access to Exchange recipient attributes on Active Directory objects and read-only access to the Exchange configuration data in Active Directory.
  • Exchange Recipient Administrators   Have full control over Exchange recipient attributes on Active Directory objects. In addition, this group has read-only access to the Exchange configuration data in Active Directory because this group is a member of the Exchange View-Only Administrators group.
  • Exchange Organization Administrators   Have full access to the Exchange configuration data in Active Directory. In addition, this group has full control over Exchange recipient attributes on Active Directory objects because this group is a member of the Exchange Recipient Administrators group.
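Because each group in the list above is a member of the next less-privileged group, effective rights accumulate along the membership chain. The Python sketch below models that nesting. The group names and the nesting come from the list above; the short right labels and the function name are simplified assumptions for illustration and are not actual Active Directory permission entries.

```python
# Rights granted directly to each group (simplified, illustrative labels).
DIRECT_RIGHTS = {
    "Exchange View-Only Administrators": {
        "read recipient attributes",
        "read Exchange configuration",
    },
    "Exchange Recipient Administrators": {"full control of recipient attributes"},
    "Exchange Organization Administrators": {"full access to Exchange configuration"},
}

# Group nesting as described in the text: group -> groups it is a member of.
MEMBER_OF = {
    "Exchange Recipient Administrators": ["Exchange View-Only Administrators"],
    "Exchange Organization Administrators": ["Exchange Recipient Administrators"],
}


def effective_rights(group):
    """Collect direct rights plus rights inherited through nested membership."""
    rights = set()
    seen = set()
    stack = [group]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against circular nesting
        seen.add(current)
        rights |= DIRECT_RIGHTS.get(current, set())
        stack.extend(MEMBER_OF.get(current, []))
    return rights


for name in DIRECT_RIGHTS:
    print(name, "->", sorted(effective_rights(name)))
```

Walking the chain this way shows why Exchange Organization Administrators can also modify recipient attributes: the right arrives through membership in the Exchange Recipient Administrators group, not through a direct assignment.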

One important feature of the default configuration is that there is no Exchange Server Administrators group because that is not how Microsoft IT applies permissions for the Exchange Server Administrator role. Instead, Microsoft IT applies the permissions directly on the object in the configuration partition. Another important fact is that Exchange Organization Administrators have permissions to modify recipient attributes through membership in the Exchange Recipient Administrators group. Microsoft IT considered removing the Exchange Organization Administrators from the Exchange Recipient Administrators group but found no compelling reason to deviate from the default model during the initial deployment. The default permissions do not grant Exchange Organization Administrators rights to create, modify, or delete objects within the Active Directory domain-naming context, which would require at least Account Operators privileges. The authoritative source of user account information is a separate LOB application, maintained by the HR department and synchronized with Active Directory through MIIS 2003.

Note: For information about implementing a split permissions model, see the topic "Planning and Implementing a Split Permissions Model" in the Exchange Server 2007 product documentation, available online at http://technet.microsoft.com/en-us/library/bb232100.aspx.

Formal Approval Processes

The domain topology that Microsoft IT implemented in the Corporate Forest features an empty root domain to support strict management processes for schema updates and for granting forest-wide administrative rights and privileges. Only a very limited number of IT managers have administrative permissions in the root domain. Because the Exchange Setup /PrepareAD process creates the universal security groups of Exchange Server 2007 in the root domain, Microsoft IT implicitly gained the ability to enforce formal approval processes for Exchange Server administration as well.

Restricting access to the universal security groups of Exchange Server 2007 in the root domain does not prevent Microsoft IT from delegating approval processes to individual teams and groups that are ultimately responsible for the corporate and Exchange Server environment, such as the Legal and Corporate Affairs (LCA) team and the Exchange Messaging team. To delegate approval processes, Microsoft IT added security groups from child domains to the universal security groups in the root domain. Team managers with permissions to change group membership information in child domains can use these security groups to delegate administrative rights.

Figure 6 shows the principle of delegating approval processes to individual teams and groups through security groups. As a best practice, Microsoft IT always uses security groups (in child domains) as opposed to individual user accounts to assign administrative permissions. The membership information of security groups in the root domain rarely changes.


Figure 6. Delegating access control and approval processes

Permissions Review

One interesting aspect of the Exchange Server 2007 default permissions model is that running Setup with the /PrepareLegacyExchangePermissions option begins the automatic process of changing the permissions model of the Exchange Server 2003 environment for coexistence with Exchange Server 2007. Microsoft IT completed this process by running Setup with the /PrepareAD option, which finishes the changes to property sets and security groups. While applying the new Exchange Server 2007 permissions, Microsoft IT took the opportunity to review and clean up permission assignments from the Exchange Server 2003 environment.

For example, Microsoft IT used security groups to grant administrative permissions to Exchange Server 2003 resources at the organization and administrative group levels. Yet, some members of these groups no longer need these permissions. By using new security groups in the child domains, added to the Exchange Organization Administrators group in the root domain, Microsoft IT effectively devised the following strategy to reevaluate all existing Exchange Server administrators:

  • Current Exchange Server administrators   Individuals who need administrative rights to manage Exchange Server 2007 resources must request them from the Systems Management team. Following each individual approval, the Systems Management team adds the corresponding user account to the new security groups to grant Exchange Organization Administrators permissions.
  • Former Exchange Server administrators   Individuals who no longer need administrative permissions automatically lose these rights as Microsoft IT decommissions Exchange Server 2003 resources and corresponding administrator groups.

Message Routing Topology

Agile companies that heavily rely on e-mail in practically all areas of the business, such as Microsoft, cannot tolerate messages that take hours to reach their final destinations. Information must travel fast, reliably, predictably, and securely. For Microsoft, this means 99 percent of all messages within the corporate production environment must reach their final destination within 90 seconds worldwide. Of course, this SLA does not apply to messages that leave the Microsoft environment, because Microsoft IT cannot guarantee message delivery across external systems. Microsoft established this mail-delivery SLA during the Exchange Server 2003 time frame, and it was equally important during the design of Exchange Server 2007. Another important design goal was to increase security in the messaging backbone by means of access restrictions to messaging connectors, data encryption through TLS, and messaging antivirus protection based on Forefront Security for Exchange Server.
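As a rough illustration of how the mail-delivery SLA could be checked against measured delivery times, the following Python sketch computes the share of internal messages delivered within the 90-second threshold and compares it with the 99 percent target. The function name and the sample latencies are hypothetical; the sketch does not describe how Microsoft IT actually measures the SLA.

```python
def meets_delivery_sla(latencies_seconds, threshold_seconds=90.0, required_fraction=0.99):
    """Return (sla_met, fraction_within_threshold) for a set of delivery times."""
    if not latencies_seconds:
        raise ValueError("at least one delivery time is required")
    within = sum(1 for seconds in latencies_seconds if seconds <= threshold_seconds)
    fraction = within / len(latencies_seconds)
    return fraction >= required_fraction, fraction


# Hypothetical sample: 99 messages delivered in 10 seconds, one in 200 seconds.
sample = [10.0] * 99 + [200.0]
sla_met, fraction = meets_delivery_sla(sample)
print(sla_met, fraction)
```

Messages that leave the Microsoft environment would be excluded from the input, because the SLA covers only delivery within the corporate production environment.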

Microsoft IT had to consider the following new Exchange Server 2007 routing features in the design of the message routing topology:

  • Active Directory sites   Comparable to routing groups in Exchange Server 2003, Active Directory sites define a boundary for Hub Transport servers to deliver messages directly to Mailbox servers, distribution group expansion servers, connector servers, and Edge Transport servers subscribed to the local site. For destinations in remote Active Directory sites, the Hub Transport server in the local site must relay the messages to a Hub Transport server in the remote site for further delivery. At least one Hub Transport server must exist in every Active Directory site with mailbox servers.
  • IP site links   Comparable to routing group connectors in Exchange Server 2003, IP site links define logical paths between Active Directory sites, which Hub Transport servers use to perform message routing calculations. It is important to note that Active Directory supports IP and SMTP site links, whereas Exchange Server 2007 ignores SMTP site links in the routing topology. If a Hub Transport server resides in a site connected only by an SMTP site link, routing errors will occur. It is necessary to replace the SMTP site link with an IP-based site link.
  • Least-cost routing paths   Each IP site link is associated with a cost value that Exchange Server 2007 uses to calculate the message transfer paths across the routing topology. If multiple IP site links exist between two Active Directory sites with Hub Transport servers, Exchange Server 2007 transfers the message along the path with the least combined cost to the ultimate destination.
  • Next hop selection   If the least-cost routing path to the ultimate destination includes multiple Active Directory sites and IP site links, Exchange Server 2007 attempts to relay the messages to an Active Directory site that is as close as possible to the ultimate destination when delivery to that destination fails. For example, Hub Transport servers might select the ultimate destination site as the next hop to deliver messages directly, skipping all intermediary sites that exist along the least-cost routing path.
  • Queue at the point of failure and backoff   If message delivery to the next hop fails (for example, because there was no Hub Transport server available in the target site), Exchange Server 2007 attempts to deliver the messages to an interim site along the least-cost routing path. This backoff mechanism starts with the Active Directory site that is directly adjacent to the unavailable next hop, and then backs off, site by site, along the least-cost routing path until a connection is established or no further site exists. In this way, Exchange Server 2007 queues the messages at the point in the routing path where communication failed, which facilitates the troubleshooting of transfer issues.
  • Delayed fan-out   If a message has multiple recipients in destinations that share part of or the entire least-cost routing path, Exchange Server 2007 transfers a single message copy to the point in the routing path where a fork occurs. At the fork, Exchange Server 2007 splits the message into separate copies. Again, each message copy might still have multiple recipients in destinations that share part of or the entire remaining least-cost routing path, and so forth. The bifurcation at the latest possible point to preserve network bandwidth is called delayed fan-out.
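The routing concepts in the list above can be sketched in a few lines of Python. This is a simplified model, not the Exchange Server 2007 routing implementation: the site names, link costs, and function names are hypothetical, and the real product applies further rules that are not modeled here. The sketch covers three ideas: the least-cost path over IP site links (computed with Dijkstra's algorithm), the order in which sites are tried when delivery backs off from the destination, and the fork point at which delayed fan-out splits a message.

```python
import heapq

# Hypothetical IP site links with costs (illustrative values only).
SITE_LINKS = {
    ("REDMOND", "DUBLIN"): 10,
    ("REDMOND", "SINGAPORE"): 10,
    ("REDMOND", "SAO_PAULO"): 10,
    ("DUBLIN", "SINGAPORE"): 50,
}


def build_graph(links):
    """Turn the link table into an undirected adjacency map."""
    graph = {}
    for (site_a, site_b), cost in links.items():
        graph.setdefault(site_a, {})[site_b] = cost
        graph.setdefault(site_b, {})[site_a] = cost
    return graph


def least_cost_path(graph, source, target):
    """Dijkstra's algorithm: return (total_cost, list of sites on the path)."""
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, site, path = heapq.heappop(queue)
        if site == target:
            return cost, path
        if site in visited:
            continue
        visited.add(site)
        for neighbor, link_cost in graph.get(site, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    raise ValueError(f"no route from {source} to {target}")


def backoff_order(path):
    """Sites to try in order: the destination first, then back toward the source."""
    return list(reversed(path[1:]))


def fork_point(paths):
    """Last site shared by all paths, where delayed fan-out splits the message."""
    shared = paths[0][0]
    for hops in zip(*paths):
        if len(set(hops)) != 1:
            break
        shared = hops[0]
    return shared


graph = build_graph(SITE_LINKS)
cost, to_singapore = least_cost_path(graph, "DUBLIN", "SINGAPORE")
_, to_sao_paulo = least_cost_path(graph, "DUBLIN", "SAO_PAULO")
print(cost, to_singapore)
print(backoff_order(to_singapore))
print(fork_point([to_singapore, to_sao_paulo]))
```

With these assumed costs, a message from Dublin to Singapore follows the cheaper path through Redmond rather than the direct but more expensive link. Delivery is attempted to Singapore first and backs off to Redmond if that fails, and a message addressed to both Singapore and Sao Paulo travels as a single copy until it reaches Redmond.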

Network Infrastructure and Site Consolidation

An important aspect for planning message routing in an Exchange Server 2007 environment is the physical network topology. The physical network determines the IP routing topology, which directly influences the Active Directory site topology, which in turn determines the message routing topology of Exchange Server 2007. By using the existing Active Directory site infrastructure for message routing purposes, Exchange Server 2007 takes advantage of an optimal network configuration with little need for adjustments.

For example, Microsoft IT did not need to perform any network optimization to accommodate Exchange Server 2007. The Exchange Server 2003 site and server consolidation that Microsoft IT had conducted at a global scale three years earlier benefited the Exchange Server 2007 design: Microsoft IT had only four main Active Directory sites with Exchange Mailbox servers, connected by reliable WAN links with sufficient net-available bandwidth, to take into consideration for Exchange Server 2007 topology and routing planning. The Exchange Server 2003 site consolidation greatly reduced the complexity of the transition exercise for Microsoft IT.

The fact that Microsoft IT only had to consider four main locations and Active Directory sites that already mapped to an optimized network infrastructure, as discussed earlier in this white paper, led to the following benefits:

  • Uncomplicated messaging topology   No additional configuration was necessary to establish a functional Exchange Server 2007 routing topology—although opportunities to increase the efficiency of message transfer still existed. However, managing only four sites that house Mailbox servers and optimizing message flow by means of Exchange Server-specific IP site links was a straightforward undertaking for Microsoft IT.
  • Best possible Hub Transport server utilization   Mailbox servers submit messages for routing and transport to a Hub Transport server in the local Active Directory site. Depending on the location of the recipients, the Hub Transport server then transfers the messages to a Mailbox server within the local site, to a Hub Transport server in another Active Directory site, or through a messaging connector to another external destination. The result is that message transfer cannot work if there is no Hub Transport server in the local site of the Mailbox server. In addition, this implies that a small number of Active Directory sites with a large number of Mailbox servers provide a better Hub Transport server utilization than a large number of Active Directory sites with a small number of Mailbox servers. For example, Microsoft IT deployed three Hub Transport servers for 15 Mailbox servers in ADSITE_DUBLIN to perform load balancing and provide fault tolerance. If the Mailbox servers in Dublin resided in two Active Directory sites instead, Microsoft IT would have had to deploy one additional Hub Transport server, because each site would have required a minimum of two Hub Transport servers to achieve load balancing and fault tolerance.

Note: The exact server ratios per Active Directory site depend on the performance characteristics of the individual environment and the server configurations.

  • Reduced chance of server communication issues   For purposes of message routing, client access, and unified messaging, it is important to deploy Hub Transport servers, Client Access servers, and Unified Messaging servers in each Active Directory site that contains Mailbox servers. Active Directory sites correspond to one or more IP subnets that represent areas with reliable, high-speed IP connectivity. Yet, it is possible to leave gaps or specify overlapping IP subnets in the Active Directory site topology, which can cause communication issues if Exchange Server 2007 cannot correctly determine its site membership. Keeping the Active Directory site topology straightforward helps Microsoft IT avoid these types of issues.
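The subnet gaps and overlaps mentioned in the last item can be checked for mechanically. The following Python sketch is only an illustration: the subnet-to-site assignments are hypothetical addresses, not Microsoft IT's configuration, and the helper names find_overlaps and sites_for are invented for this example. The sketch flags candidates for review and does not reproduce how Active Directory itself resolves site membership.

```python
import ipaddress

# Hypothetical subnet-to-site assignments; the addresses are illustrative only.
SITE_SUBNETS = {
    "10.1.0.0/16": "ADSITE_DUBLIN",
    "10.2.0.0/16": "ADSITE_SINGAPORE",
    "10.2.128.0/17": "ADSITE_SAO PAULO",  # deliberately overlaps the Singapore subnet
}

def find_overlaps(site_subnets):
    """Return pairs of subnets that overlap but are assigned to different sites."""
    nets = [(ipaddress.ip_network(cidr), site) for cidr, site in site_subnets.items()]
    overlaps = []
    for i, (net_a, site_a) in enumerate(nets):
        for net_b, site_b in nets[i + 1:]:
            if site_a != site_b and net_a.overlaps(net_b):
                overlaps.append((str(net_a), str(net_b)))
    return overlaps

def sites_for(address, site_subnets):
    """Return every site whose subnets contain the address (empty list = gap)."""
    ip = ipaddress.ip_address(address)
    return sorted(
        {site for cidr, site in site_subnets.items() if ip in ipaddress.ip_network(cidr)}
    )

print(find_overlaps(SITE_SUBNETS))            # overlapping subnet pairs
print(sites_for("10.3.0.5", SITE_SUBNETS))    # address in a gap: no site
print(sites_for("10.2.200.1", SITE_SUBNETS))  # address claimed by two sites
```

A server address that maps to no site, or to more than one, is the kind of ambiguity that a straightforward site topology avoids.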

Dedicated Exchange Sites in the Active Directory Topology

Dedicated Active Directory sites for Exchange Server may be used to isolate domain controllers and global catalog servers and to dedicate that infrastructure to Exchange servers for reliability or performance reasons. In the context of Exchange Server 2007 transport, however, dedicated Exchange sites can complicate the routing topology. They deviate from the classic definition of an Active Directory site as an area with reliable, high-speed, low-latency IP connectivity. Because dedicated Exchange sites generally do not correspond to the physical network layout, they can lead to a message routing topology that does not use the physical network links as efficiently as possible.

For example, Microsoft IT maintains a dedicated Exchange Active Directory site in addition to a central hub Active Directory site in Redmond, as indicated in Figure 3 earlier in this white paper. In the Active Directory replication topology, this dedicated Exchange site is a tail site. ADSITE_REDMOND is the Active Directory hub site that is used as the Active Directory replication focal point, yet this site does not contain any Hub Transport servers. Accordingly, Exchange Server 2007 cannot use ADSITE_REDMOND as a hub site for message routing purposes. By default, it interprets the Microsoft IT Exchange organization as a full-mesh topology in which the Hub Transport servers in each region connect to each other via a single SMTP hop. This does not match the Active Directory replication and network topology. Exchange Server 2007 uses the full-mesh topology in this concrete scenario because, along the IP site links, all message transfer paths appear to be direct.

Figure 7 illustrates this situation by placing the Active Directory site topology and default message routing topology on top of each other.

Figure 7. Full-mesh message routing in a hub-and-spoke network topology

The routing topology depicted in Figure 7 works because Exchange Server 2007 can transfer messages directly between the sites with Hub Transport servers (such as ADSITE_SINGAPORE to ADSITE_DUBLIN), yet all messages travel through the Redmond location according to the physical network layout. In this topology, all bifurcation of messages, sent to recipients in multiple sites, takes place at the source sites and not at the latest possible point along the physical path, which would be Redmond. For example, if a user in Sao Paulo sends a single message to recipients in the sites ADSITE_REDMOND-EXCHANGE, ADSITE_DUBLIN, and ADSITE_SINGAPORE, the source Hub Transport server in Sao Paulo establishes three separate SMTP connections, one SMTP connection to Hub Transport servers in each remote site, to transfer three message copies. Hence, the same message travels three times over the network from Sao Paulo to Redmond. Microsoft IT could avoid this by eliminating the dedicated Exchange site ADSITE_REDMOND-EXCHANGE and moving all Exchange servers to ADSITE_REDMOND. The Hub Transport servers in ADSITE_REDMOND would then be in the transfer path between ADSITE_SAO PAULO, ADSITE_DUBLIN, and ADSITE_SINGAPORE, which would mean that Exchange Server 2007 could delay bifurcation until messages to recipients in multiple sites reach ADSITE_REDMOND. In this situation, the source server would only need to transfer one message copy to Redmond, the message routing topology would follow the physical network layout, and Microsoft IT would not have to take any extra configuration or optimization steps.
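The difference between the two cases in the Sao Paulo example can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Exchange Server code: each routing path is written out by hand as a list of sites, and the function names shared_prefix and fan_out are invented for this illustration. The second topology models the hypothetical consolidation into ADSITE_REDMOND described above.

```python
from itertools import takewhile

def shared_prefix(paths):
    """Return the longest run of sites that every routing path starts with."""
    return [
        hops[0]
        for hops in takewhile(lambda hops: len(set(hops)) == 1, zip(*paths))
    ]

def fan_out(paths):
    """Return (fork_site, copies_leaving_source) for one multi-site message."""
    fork_site = shared_prefix(paths)[-1]
    # Each distinct second hop requires its own message copy from the source.
    copies = len({path[1] for path in paths})
    return fork_site, copies

SRC = "ADSITE_SAO PAULO"

# Default full-mesh view: every destination appears to be one direct hop away.
full_mesh = [
    [SRC, "ADSITE_REDMOND-EXCHANGE"],
    [SRC, "ADSITE_DUBLIN"],
    [SRC, "ADSITE_SINGAPORE"],
]

# Hypothetical consolidation: Hub Transport servers sit in ADSITE_REDMOND.
consolidated = [
    [SRC, "ADSITE_REDMOND"],
    [SRC, "ADSITE_REDMOND", "ADSITE_DUBLIN"],
    [SRC, "ADSITE_REDMOND", "ADSITE_SINGAPORE"],
]

print(fan_out(full_mesh))     # fork at the source site, three copies
print(fan_out(consolidated))  # fork at ADSITE_REDMOND, one copy
```

In the full-mesh case the paths share only the source site, so bifurcation happens there. Once a common intermediate site with Hub Transport servers exists, the fork moves to that site and a single copy crosses the link from Sao Paulo.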

Based on such considerations, it is a logical conclusion that the design of an ideal Exchange Server 2007 environment takes the implications of dedicated Exchange Active Directory sites into account. On one hand, it is beneficial to keep the message routing topology straightforward and the complexities associated with maintaining and troubleshooting message transfer minimal. On the other hand, Microsoft IT had to weigh the benefits of eliminating the dedicated Redmond Exchange site ADSITE_REDMOND-EXCHANGE against the impact of such an undertaking on the overall deployment project in terms of costs, resources, and timelines. Among other things, Microsoft IT maintains ADSITE_REDMOND-EXCHANGE to shield Exchange Server 2007 from Windows Server 2008 domain controllers that exist in the corporate production environment. As mentioned earlier in this paper, Exchange Server 2007 support for Windows Server 2008 was not available at the time Microsoft IT transitioned the corporate production environment. Eliminating ADSITE_REDMOND-EXCHANGE would have required Microsoft IT to remove all Windows Server 2008 domain controllers from ADSITE_REDMOND, which was not an option. Furthermore, Microsoft IT takes advantage of the dedicated Active Directory site to measure the footprint of Exchange Server 2007 on domain controller/global catalog servers and provide this information as feedback to the product teams. Correspondingly, Microsoft IT decided to leave ADSITE_REDMOND-EXCHANGE in place. Instead, the Exchange Messaging team collaborated with the Active Directory team to adjust the Active Directory site topology by using alternative methods to optimize message transfer without affecting the established Active Directory replication architecture and topology.

Optimized Message Transfer Between Hub Transport Servers

Although Exchange Server 2007 generated a functioning message routing topology without any extra design work, Microsoft IT decided to review the routing topology based on business and technical requirements to drive further optimizations. Key factors that influenced the optimization decision included the "90 seconds, 99 percent of the time" mail-delivery SLA and the desire to save network bandwidth on WAN links by increasing the efficiency of message transfer.

Important reasons that compelled Microsoft IT to optimize the Exchange Server 2007 message transfer topology include the following:

  • Efficient message flow   At Microsoft, 99 percent of the messages must reach their recipients within 90 seconds. Although optimized message flow is not a strict requirement and it is possible to meet mail-delivery SLAs in a full-mesh topology, optimized message flow can help to accelerate message delivery.
  • Preserved WAN bandwidth   The corporate production environment handles more than 6 million internal messages daily. Although most message traffic stays in the local site or has Redmond headquarters as the destination, optimized message flow can help to preserve WAN bandwidth for all messages with recipients in multiple remote Active Directory sites.

Having made the decision to optimize message routing, Microsoft IT augmented the Active Directory site link topology in order to take advantage of the Exchange Hub Transport servers in ADSITE_REDMOND-EXCHANGE. To achieve efficient message flow and preserve WAN bandwidth, it was necessary to place ADSITE_REDMOND-EXCHANGE in the routing path between ADSITE_DUBLIN, ADSITE_SAO PAULO, and ADSITE_SINGAPORE by creating additional Active Directory site IP links. This approach ensured that Exchange Server 2007 could bifurcate messages traveling between regions closer to their destination. By configuring the ExchangeCost attribute on Active Directory site links, which Exchange Server 2007 adds to the Active Directory site link definition, Microsoft IT was able to perform the message flow optimization without affecting the Active Directory replication topology. The ExchangeCost attribute is only relevant for Exchange Server 2007 message routing decisions between sites, not Active Directory replication.

Microsoft IT performed the following steps to optimize message routing in the corporate production environment:

  1. To establish a hub/spoke topology between all sites with Exchange servers, Microsoft IT created three additional Active Directory IP site links (see Figure 8).
  2. Microsoft IT specified a Cost value of 999 (highest across the Active Directory topology) for these new IP site links so that Active Directory does not use these site links for directory replication.
  3. Using the Set-AdSiteLink cmdlet, Microsoft IT assigned an ExchangeCost value of 10 to the new Exchange-specific site links. This value is significantly lower than the Cost value of all other Active Directory site links, so that Exchange Server 2007 uses the Exchange-specific site links for message routing path discovery.
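The effect of these three steps can be illustrated with a small least-cost path calculation in Python. This is a simplified sketch, not the Exchange Server routing algorithm: the Cost value of 100 for the existing site links is an assumption chosen for illustration (the paper only states that it is significantly higher than 10), the link list is reconstructed from the description above, and the function name least_cost_path is invented. The sketch assumes that message routing uses ExchangeCost where it is set and Cost otherwise, while replication considers Cost only.

```python
import heapq

# (site A, site B, Cost, ExchangeCost or None). Cost 100 is an assumed value.
LINKS = [
    ("ADSITE_REDMOND", "ADSITE_REDMOND-EXCHANGE", 100, None),
    ("ADSITE_REDMOND", "ADSITE_DUBLIN", 100, None),
    ("ADSITE_REDMOND", "ADSITE_SINGAPORE", 100, None),
    ("ADSITE_REDMOND", "ADSITE_SAO PAULO", 100, None),
    # The three Exchange-specific links: Cost 999, ExchangeCost 10.
    ("ADSITE_REDMOND-EXCHANGE", "ADSITE_DUBLIN", 999, 10),
    ("ADSITE_REDMOND-EXCHANGE", "ADSITE_SINGAPORE", 999, 10),
    ("ADSITE_REDMOND-EXCHANGE", "ADSITE_SAO PAULO", 999, 10),
]

def least_cost_path(links, source, target, use_exchange_cost):
    """Dijkstra over site links; returns (total_cost, path) or None."""
    graph = {}
    for site_a, site_b, cost, exchange_cost in links:
        weight = exchange_cost if (use_exchange_cost and exchange_cost is not None) else cost
        graph.setdefault(site_a, []).append((site_b, weight))
        graph.setdefault(site_b, []).append((site_a, weight))
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        total, site, path = heapq.heappop(queue)
        if site == target:
            return total, path
        if site in visited:
            continue
        visited.add(site)
        for neighbor, weight in graph.get(site, []):
            if neighbor not in visited:
                heapq.heappush(queue, (total + weight, neighbor, path + [neighbor]))
    return None

# Message routing view: ExchangeCost overrides Cost where present.
print(least_cost_path(LINKS, "ADSITE_SAO PAULO", "ADSITE_DUBLIN", True))
# Replication view: Cost only, so the 999 links are never preferred.
print(least_cost_path(LINKS, "ADSITE_SAO PAULO", "ADSITE_DUBLIN", False))
```

Under these assumptions the message routing path runs through ADSITE_REDMOND-EXCHANGE at a total cost of 20, while the cost-only view still prefers the path through ADSITE_REDMOND, which is why the replication topology remains unaffected.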

Figure 8 illustrates how the Exchange-specific site links change the message routing topology. The dedicated Exchange site ADSITE_REDMOND-EXCHANGE in North America now acts as a fork in the routing path to the sites ADSITE_DUBLIN, ADSITE_SAO PAULO, and ADSITE_SINGAPORE.

Figure 8. Optimized message routing topology in the Corporate Forest

Based on the Exchange-specific Active Directory/IP site link topology, Exchange Server 2007 routes messages in the Corporate Forest as follows:

  • Messages to a single destination   The source Hub Transport server selects the final destination as the next hop and sends the messages directly to a Hub Transport server in that site. For example, in the Dublin to Singapore mail routing scenario, the network connection passes through Redmond, but the Hub Transport servers in ADSITE_REDMOND-EXCHANGE do not participate in the message transfer.
  • Messages to an unavailable destination   If the source Hub Transport server is unable to establish a direct connection to a destination site, the Hub Transport server backs off along the least-cost routing path until a connection to a Hub Transport server in an Active Directory site is established. This would be a Hub Transport server in ADSITE_REDMOND-EXCHANGE, which queues the messages for transmission to the final destination upon restoration of network connectivity.
  • Messages to recipients in multiple sites   Exchange Server 2007 delays message bifurcation if possible. In the optimized topology, this means that Hub Transport servers transfer all messages with recipients in multiple sites first to a Hub Transport server in ADSITE_REDMOND-EXCHANGE. The Hub Transport server in ADSITE_REDMOND-EXCHANGE then performs the bifurcation and transfers a separate copy of the message to each destination. Figure 8 illustrates this scenario. Exchange Server 2007 transfers a single message copy from ADSITE_SAO PAULO to ADSITE_REDMOND-EXCHANGE, where bifurcation takes place, before transferring individual message copies to each destination site. Again, Exchange Server 2007 transfers only a single copy per destination site. Within each site, the receiving Hub Transport servers may bifurcate the message further as necessary for delivery to individual recipients.
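The first two cases in this list, direct delivery and backing off along the least-cost path, can be expressed as one small selection routine. The following Python sketch is a simplified model and not Exchange Server code: the function name pick_delivery_site and its parameters are invented for this illustration, and reachability is passed in as a plain set rather than detected through SMTP connection attempts.

```python
def pick_delivery_site(path, reachable, has_hub_transport):
    """Choose the site that receives the message next.

    path: least-cost routing path from the source site to the destination site.
    reachable: sites the source can currently connect to.
    has_hub_transport: sites that contain Hub Transport servers.
    """
    hops = path[1:]  # everything after the source site
    # Try the final destination first, then back off toward the source.
    for site in reversed(hops):
        if site in reachable and site in has_hub_transport:
            return site
    return None  # nothing reachable: the message stays queued at the source

PATH = ["ADSITE_DUBLIN", "ADSITE_REDMOND-EXCHANGE", "ADSITE_SINGAPORE"]
HUB_SITES = {"ADSITE_DUBLIN", "ADSITE_REDMOND-EXCHANGE", "ADSITE_SINGAPORE"}

# Normal case: Singapore is reachable, so delivery is direct.
print(pick_delivery_site(PATH, {"ADSITE_REDMOND-EXCHANGE", "ADSITE_SINGAPORE"}, HUB_SITES))
# Singapore is unavailable: back off to the Redmond Exchange site, which queues the message.
print(pick_delivery_site(PATH, {"ADSITE_REDMOND-EXCHANGE"}, HUB_SITES))
```

The intermediate site takes part in the transfer only when the destination cannot be reached, which matches the Dublin to Singapore scenario described above.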

Note: Microsoft IT did not use the Set-AdSite cmdlet to configure the REDMOND-EXCHANGE site as a hub site in the routing topology. Doing so would have forced all messages between regions to travel through the REDMOND-EXCHANGE site, requiring an extra SMTP hop on Hub Transport servers in that site. This would have mirrored the previous Exchange Server 2003 routing topology, yet Microsoft IT found no compelling reason to force all message traffic through the North American Hub Transport servers. Establishing a hub site is useful if tail sites cannot communicate directly with each other. In the Microsoft IT corporate production environment, this is not an issue. For more information about hub site configurations, see the topic "Understanding Active Directory Site-Based Routing" in the online product documentation at http://technet.microsoft.com/en-us/library/aa998825.aspx.

Connectivity to Remote SMTP Domains

For destinations outside the Corporate Forest, Microsoft IT distinguishes between external and internal remote locations. For external remote locations, Microsoft IT relays all messages over Edge Transport servers deployed in perimeter networks, as explained in the section "Internet Mail Connectivity" later in this white paper. For internal remote locations, Microsoft IT uses messaging connectors directly on the Hub Transport servers in ADSITE_REDMOND-EXCHANGE. This design mirrors the Exchange Server 2003 topology.

Increased Message Routing Security

To ensure compliance with legal and regulatory requirements, Microsoft IT encrypts most messaging traffic in the corporate production environment. The only exceptions are internal destinations without user mailboxes, such as lab and test environments.

Microsoft IT uses the following encryption technologies to prevent unauthorized access to information during message transmission:

  • IP security   IPSec encrypts data communication and prevents unauthorized access to resources in the corporate production environment. Because IPSec works at the IP layer, it can help secure communication between servers without relying on the application to support the encryption natively. Microsoft IT extensively used IPSec encryption in its Exchange Server 2003 environment to help secure internal SMTP transactions and continues its use today in scenarios where SMTP servers and applications do not support native TLS-based SMTP encryption. Although IPSec technology offers strong encryption controls, managing custom IPSec policies can be quite cumbersome. With the transition to Exchange Server 2007, Microsoft IT was able to accomplish most of the transport encryption by using native Exchange Server product features.
  • Transport Layer Security   Exchange Server 2007 supports TLS right out of the box. Hub Transport servers use TLS to encrypt all message traffic within the Exchange Server 2007 environment and rely on opportunistic TLS encryption for communication with remote destinations, such as Hub Transport servers in other Microsoft IT-managed forests. Edge Transport servers also support TLS and domain security to establish security-enhanced message transfer paths to business partners over the Internet. Native support for SMTP TLS on Hub Transport and Edge Transport servers enabled Microsoft IT to eliminate the dependency on complex IPSec policies for encryption of internal and external messages in transit.

In addition to encrypting messaging traffic internally, Microsoft IT also protects its internal messaging environment by restricting access to inbound SMTP submission points. This helps Microsoft IT minimize mail spoofing and ensure that unauthorized SMTP mail submissions from rogue internal clients and applications do not affect corporate e-mail communications. To accomplish this goal, Microsoft IT removes all default Receive connectors on the Hub Transport servers and configures custom Receive connectors by using the New-ReceiveConnector cmdlet to accept only authenticated SMTP connections from other Hub Transport servers and Edge Transport servers in the environment. To meet the needs of internal SMTP applications and clients, Microsoft IT established a separate SMTP gateway infrastructure based on Exchange Server 2007 Hub Transport servers that enforces mail submission access controls, filtering, and other security checks.

Furthermore, Microsoft IT deployed Forefront Security for Exchange Server on all Hub Transport and Edge Transport servers to implement messaging protection against viruses and other malicious e-mail content at multiple layers in the Exchange Server 2007 infrastructure. Despite the fact that internal messages and messages from the Internet might pass through multiple Hub Transport and Edge Transport servers, performance-intensive antivirus scanning is performed only once. Forefront Security adds a security-enhanced antivirus header to each scanned message, so further Hub Transport or Edge Transport servers do not need to scan the same message a second time. This avoids processing overhead while maintaining an effective level of antivirus protection for all inbound, outbound, and internal e-mail messages.
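The scan-once behavior can be pictured with a short Python sketch. It is purely illustrative and does not reflect the Forefront Security implementation: the header name X-Example-AV-Stamp is hypothetical, the message is modeled as a plain dictionary, and the sketch does not verify the stamp, whereas the real header is security-enhanced so that it cannot simply be forged.

```python
AV_STAMP_HEADER = "X-Example-AV-Stamp"  # hypothetical name, for illustration only

def process_at_hop(message, scan):
    """Scan the message unless an earlier hop already stamped it.

    Returns True if this hop performed the scan, False if it was skipped.
    """
    if AV_STAMP_HEADER in message["headers"]:
        return False  # an earlier transport server already scanned this message
    scan(message["body"])
    message["headers"][AV_STAMP_HEADER] = "scanned"
    return True

scanned_bodies = []
message = {"headers": {}, "body": "quarterly report"}

# The same message passes through three transport servers in sequence.
results = [process_at_hop(message, scanned_bodies.append) for _ in range(3)]
print(results)              # only the first hop scans
print(len(scanned_bodies))  # the scanner ran once
```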

Coexistence with Exchange Server 2003

Coexistence with Exchange Server 2003 introduces a special connectivity scenario that Microsoft IT had to consider during the planning phase. Exchange Server 2007 integrates into the existing routing topology through a special routing group, called EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR), which has the following limitations and restrictions for compatibility reasons:

  • All computers running Exchange Server 2007 must be members of EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR)   Within this routing group, Exchange 2007 servers use the Active Directory site topology for message routing.
  • Computers running previous versions of Exchange Server cannot be members of EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR)   Exchange Server 2003 is unaware of the various Exchange 2007 server roles and cannot recognize server-specific communication requirements. Therefore, Exchange Server 2003 cannot coexist in the same routing group with Exchange Server 2007.

One important implication of these restrictions is that EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) differs from the classical Exchange Server routing group in scope and definition. EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) is global in scope and does not define a network region with reliable connectivity. Rather, EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) defines a parallel hemisphere in which the Active Directory site topology defines the message routing topology, as illustrated in Figure 9.

Figure 9. Exchange Server 2003 and Exchange Server 2007 routing topologies in the Corporate Forest

Microsoft IT started the transition of the Corporate Forest messaging environment from Exchange Server 2003 to Exchange Server 2007 in the Redmond location. With the introduction of Exchange Server 2007 Mailbox servers, Hub Transport servers, and other server roles in that location, Microsoft IT needed to establish message transfer between the legacy Exchange Server 2003 and new Exchange Server 2007 environment and maintain this messaging connectivity for the entire period of coexistence. For that reason, Microsoft IT decided to connect the Exchange Server 2007-specific routing group first to the Exchange Server 2003 routing group RG_REDMOND in North America by using routing group connectors. Later, during the transition phase, as the Exchange Server 2007 infrastructure expanded into the regions, Microsoft IT created routing group connectors between the Exchange Server 2007-specific routing group and the regional Exchange Server 2003 routing groups. Shortly after this stage, Microsoft IT raised the cost of the legacy routing group connectors between regional routing groups in the Exchange Server 2003 environment, to route all messaging traffic between regions through the Exchange Server 2007 messaging backbone and take advantage of native Exchange Server 2007 routing features.

Microsoft IT performed the following steps to establish messaging connectivity between Exchange Server 2003 and Exchange Server 2007 in the corporate production environment:

  1. During the first Exchange Server 2007 installation, Microsoft IT selected a bridgehead server from RG_REDMOND to establish the initial routing group connection.
  2. To grant all Exchange 2003 servers the necessary send and receive permissions on Hub Transport servers, Microsoft IT made sure that the ExchangeLegacyInterop universal security group in the root domain included the Exchange Domain Servers global security groups from the child domains. When using Exchange Server 2007 tools to create the routing group connectors, the specified Exchange Server 2003 servers are automatically added to the ExchangeLegacyInterop group to grant the legacy servers the required permissions to send mail to and receive mail from Exchange Server 2007 Hub Transport servers.

    Note: It is a Microsoft IT-specific configuration to include the Exchange Domain Servers global security groups from the child domains in the ExchangeLegacyInterop universal security group in the root domain. This configuration enables the connector's remote legacy servers to be updated at any time and ensures that the ExchangeLegacyInterop group contains the appropriate server.

  3. For fault tolerance and load balancing, Microsoft IT installed additional Hub Transport servers in ADSITE_REDMOND-EXCHANGE and adjusted the bridgehead server configuration of the initial routing group connection, as summarized in Table 2.
  4. Following the deployment of Exchange Server 2007 in Dublin and Singapore, Microsoft IT established additional RGCs to optimize message flow and facilitate decommissioning of legacy routing groups. The Sao Paulo location did not require an additional RGC, because Microsoft IT transitioned all recipients in this location in a single step. For more information about the deployment process, see the section "Deployment Planning" later in this white paper.

Table 2. Bridgehead Server Configuration for Routing Group Connectors

| Routing group connector | Local bridgeheads | Remote bridgeheads |
| --- | --- | --- |
| From RG_REDMOND to EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) | Any local server can send mail over this connector. This enables all Exchange 2003 servers to transfer messages directly to the Hub Transport servers without involving Exchange 2003 bridgeheads. | All Hub Transport servers located in ADSITE_REDMOND-EXCHANGE |
| From EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) to RG_REDMOND | All Hub Transport servers located in ADSITE_REDMOND-EXCHANGE | All Hub Transport servers located in RG_REDMOND |
| From RG_DUBLIN to EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) | Any local server can send mail over this connector. | All Hub Transport servers located in ADSITE_DUBLIN |
| From EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) to RG_DUBLIN | All Hub Transport servers located in ADSITE_DUBLIN | The public-folder servers in RG_DUBLIN, which also function as bridgehead servers |
| From RG_SINGAPORE to EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) | Any local server can send mail over this connector. | All Hub Transport servers located in ADSITE_SINGAPORE |
| From EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) to RG_SINGAPORE | All Hub Transport servers located in ADSITE_SINGAPORE | The public-folder servers in RG_SINGAPORE, which also function as bridgehead servers |

Important: Because Microsoft IT used routing group connectors in a straightforward hub/spoke topology with the default setting of Any local server can send mail over this connector, it was not necessary for Microsoft IT to suppress link state updates on Exchange 2003 servers by specifying the SuppressStateChanges registry parameter in preparation for the deployment of additional routing group connectors. Nevertheless, Microsoft IT recommends that all customers suppress link state updates regardless of routing group connector configuration. For more information about the SuppressStateChanges registry parameter, see the topic "How to Suppress Link State Updates" in the Exchange Server 2007 product documentation, available online at http://technet.microsoft.com/en-us/library/aa996728.aspx.

Server Architectures and Designs

Some important business requirements that Microsoft IT addressed in the server architectures and designs for Exchange Server 2007 revolved around the goals of eliminating the performance and scalability issues of the 32-bit platform and establishing a flexible messaging infrastructure to support growing mailboxes and a larger number of clients. The product group helped Microsoft IT achieve the first of these goals by optimizing Exchange Server 2007 for 64-bit server hardware, specifically x64 processors. By exploiting the advantages of dedicated Exchange 2007 server roles in combination with load-balanced, fault-tolerant server configurations, the Exchange Messaging team was also able to maintain SLAs with 99.99 percent availability of messaging services.

Flexible and Scalable Messaging Infrastructure

Microsoft IT heavily focused on single-role server deployments in almost all regions of the messaging environment. A server role is a logical unit that groups a selected set of server features and components together to perform a specific messaging function. Although the single-role server design increases the hardware footprint in the data center, it also increases the flexibility and scalability of the messaging environment. Figure 10 illustrates an Exchange Server 2007 architecture based on single-role server designs.

Figure 10. Exchange Server 2007 architecture based on single-role servers

Exchange Server 2007 supports the following five separate server roles to perform the tasks of an enterprise messaging system:

  • Client Access servers   Support Post Office Protocol 3 (POP3) and Internet Message Access Protocol 4 (IMAP4) clients, as well as Exchange ActiveSync, Office Outlook Web Access, Outlook Anywhere, and new Outlook 2007 client functions.
  • Edge Transport servers   Handle message traffic to and from the Internet and run spam filters. Microsoft IT also installs Forefront Security for Exchange Server on all Edge Transport servers for virus scanning.
  • Hub Transport servers   Perform the internal message transfer, distribution list expansions, and message conversions between Internet mail and Exchange Server message formats. At Microsoft, all Hub Transport servers also run Forefront Security for Exchange Server for virus scanning.
  • Mailbox servers   Maintain mailbox store databases and provide Office Outlook clients and Client Access servers with access to the data.
  • Unified Messaging servers   Integrate voice and fax with e-mail messaging and run Outlook Voice Access.

Multiple-Role and Single-Role Server Designs

With the exception of the Edge Transport server role, Exchange Server 2007 supports multiple-role server deployments. The Client Access server role, Hub Transport server role, Mailbox server role, and Unified Messaging server role can coexist on the same computer in any combination. Placing several roles on a single computer is advantageous for small Exchange Server deployments. The multiple-role approach provides the benefits of a reduced server footprint and can help to minimize the hardware costs. For example, Microsoft IT deployed a multiple-role server in Sao Paulo for the Hub Transport server role, Client Access server role, and Unified Messaging server role to use hardware resources efficiently. Similar to the Exchange Server 2003 design, Microsoft IT consolidated server roles in this location due to moderate workload. However, the Mailbox server in Sao Paulo is a single-role server according to the requirements for CCR.

Microsoft IT based its decisions to combine Exchange server roles on the same hardware or separate them between dedicated servers on capacity, performance, and availability demands. Mailbox servers are prime examples of systems with high capacity, performance, and availability requirements at Microsoft. Accordingly, Microsoft IT deployed the Mailbox servers in all regions in a single-role design, which enabled Microsoft IT to eliminate single points of failure in the Mailbox server configuration by using CCR. In Redmond, Dublin, and Singapore, Microsoft IT also used the single-role design for the remaining server roles, because these regions include a large number of users and multiple Mailbox servers.

Single-role server deployments provide Microsoft IT with the following benefits:

  • Optimized server hardware and software components   Different server roles require different hardware configurations and optimization approaches. For example, a Hub Transport server design for high performance must take sufficient storage capacity and input/output (I/O) performance into consideration to support message queues in addition to message routing functions, while Client Access servers typically do not have the same storage capacity requirements.

    During the initial production deployment in 2006, Microsoft IT used the following hardware per server role:

    • Client Access   Two dual-core AMD Opteron, 2.2 gigahertz (GHz), with 4 GB of memory.
    • Edge Transport   Two dual-core AMD Opteron, 2.2 GHz, with 8 GB of memory.
    • Hub Transport   Two dual-core AMD Opteron, 2.2 GHz, with 8 GB of memory.
    • Mailbox (2,000 users with 500-MB quotas)   Two dual-core Intel Xeon, 2.66 GHz, with 12 GB of memory.
    • Mailbox (2,400 users with 2-GB quotas)   Two dual-core Intel Xeon, 3.0 GHz, with 16 GB of memory.
    • Mailbox (3,600 users with 2-GB quotas)   Four dual-core AMD Opteron, 2.6 GHz, with 24 GB of memory.
    • Unified Messaging   One dual-core AMD Opteron, 2.2 GHz, with 4 GB of memory.
  • Flexible systems scaling approach   The single-role server deployment enables Microsoft IT to design server hardware more accurately according to specific tasks and increase the capacity of the messaging environment selectively according to specific demands and changing trends, as illustrated in Figure 10. For example, as demand for mobile messaging services continues to grow, Microsoft IT can increase the capacity of Client Access servers without affecting other areas in the messaging environment.
  • Structured system administration and maintenance   At Microsoft IT, several groups deploy, manage, and maintain the messaging environment. These groups collaborate closely while individual system engineers, program managers, and service managers specialize in specific areas of expertise, closely related to server roles. For example, different system engineers designed the message routing topology and the Mailbox server configuration.
  • Role-specific load balancing and fault tolerance   Different server roles support different techniques and architectures for load balancing and fault tolerance. For example, if multiple Hub Transport servers exist in the same Active Directory site, Exchange Server 2007 balances the message traffic automatically between these servers. Mailbox servers are not load-balanced in the same way, because a mailbox can be available on only a single Mailbox server. Instead, CCR maintains a redundant copy of the mailbox store to achieve high availability for the Mailbox server.

Table 3 shows the number of servers and the technologies per server role that Microsoft IT uses in the corporate production environment to implement load balancing and fault tolerance.

Table 3. Servers in the Microsoft IT Exchange Server 2007 Environment

| Server role | Redmond | Silicon Valley | Dublin | Singapore | Sao Paulo | Technology |
|---|---|---|---|---|---|---|
| Mailbox | 31 | 0 | 15 | 15 | 1 | Microsoft Windows Clustering and CCR. Network interface card (NIC) teaming by using NICs connected to different switches. |
| Edge Transport | 3 | 3 | 2 | 2 | 0 | Domain Name System (DNS) round robin and Mail Exchanger (MX) records with same cost values. Multiple Hub Transport servers as bridgeheads in Send Connector configuration. |
| Hub Transport | 8 | 0 | 3 | 3 | 1 | Automatic load balancing through Mail Submission Service. Edge Subscriptions for Hub/Edge connectivity. |
| Client Access | 16 | 0 | 6 | 4 | | Web Publishing Load Balancing (WPLB) on Microsoft Internet Security and Acceleration (ISA) Server 2006. Microsoft Network Load Balancing (NLB) internally. |
| Unified Messaging | 7 | 0 | 2 | 2 | | Automatic round robin load balancing between Unified Messaging servers. Multiple voice over IP (VoIP) gateways per dial plan. |

Scaling Up Server Designs

Following the successful completion of the initial production rollout, Microsoft IT began to develop new, scaled-up Mailbox server designs in order to increase the density of the user mailboxes per server, consolidate resources, and reduce maintenance overhead. The new designs take advantage of quad-core processors and scale up to 6,000 users with 500-MB mailboxes per Mailbox server. In the initial design, scaling out with additional Mailbox servers was the only reasonable option for Microsoft IT to handle the increased demand for mailbox sizes of 500 MB and 2 GB without jeopardizing SLAs. The new designs enabled Microsoft IT to consolidate all the 2,000-user Mailbox servers used during the original beta deployment by a factor of three. Because each Mailbox server corresponds to a two-node CCR cluster, Microsoft IT was able to repurpose more than 60 recently purchased enterprise servers.

To scale up to 6,000 mailboxes per server, Microsoft IT capitalized on the newest processor technologies at the time of deployment, such as the quad-core Intel Xeon Processor X5355. This eliminated the processor bottleneck that existed on the dual-core systems used during the initial deployment and that had prevented Mailbox servers from reaching that scale. The X5355-compatible server model that Microsoft IT selected offered eight slots for Fully Buffered Dual Inline Memory Modules (FB-DIMM). With a maximum module size of 4 GB, the server architecture accommodates up to 32 GB of memory, which corresponds to a memory configuration at the upper end of product recommendations (2 GB plus 2 MB to 5 MB per mailbox). However, at the time Microsoft IT developed this server design, 4-GB DIMMs did not offer an attractive capacity-to-price ratio. To remain cost-efficient, Microsoft IT decided to use 2-GB memory modules instead and designed the storage solution according to the I/O requirements that result from having less memory per user on the server.

In comparison to the initial Mailbox servers, the new design allocates 60 percent less memory per user. Although the Extensible Storage Engine (ESE) of Exchange Server 2007 can cache any amount of data, the limited physical memory means that the ESE can cache only approximately 2 MB of messaging data per user. This has a direct impact on the number of I/O operations, because with less memory to cache data, the storage engine must go to disk more often to reload frequently used data. After consolidation to the new server platform, the same users will have a higher I/O profile. To meet the increased requirements in terms of I/O operations per second (IOPS), Microsoft IT optimized the design of the storage subsystem, as explained in the next section.
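
The memory trade-off described above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it assumes that all eight FB-DIMM slots are populated with identical modules and that 2 GB is set aside as the base amount from the product recommendation. The function name is hypothetical.

```python
MB_PER_GB = 1024

def cache_mb_per_mailbox(total_memory_gb, mailboxes, base_memory_gb=2):
    """Memory left per mailbox after reserving the base amount (assumed 2 GB)."""
    return (total_memory_gb - base_memory_gb) * MB_PER_GB / mailboxes

SLOTS = 8          # FB-DIMM slots in the selected server model
MAILBOXES = 6000   # target mailbox count per server

with_4gb_dimms = cache_mb_per_mailbox(SLOTS * 4, MAILBOXES)  # 32 GB total
with_2gb_dimms = cache_mb_per_mailbox(SLOTS * 2, MAILBOXES)  # 16 GB total (assumed)

print(f"4-GB modules: {with_4gb_dimms:.1f} MB per mailbox")
print(f"2-GB modules: {with_2gb_dimms:.1f} MB per mailbox")
```

Under these assumptions, the 4-GB modules land at the upper end of the recommended range (about 5 MB per mailbox) and the 2-GB modules near the lower end (a little over 2 MB per mailbox), which is in line with the rounded figures of 5 MB and 2 MB behind the 60 percent reduction.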

Mailbox Storage Design

The mailbox is one of the very few components in an Exchange Server 2007 organization that cannot be load-balanced across multiple servers. Each individual mailbox is unique and can only reside in one mailbox database on one active Mailbox server. It follows that the mailbox store is one of the most critical Exchange Server components that directly affect the availability of messaging services. With previous versions of Exchange Server, Microsoft IT relied on SAN solutions to provide the necessary configuration for its mailbox clusters. SAN provided a higher level of availability due to the architecture, and enabled Microsoft IT to achieve the number of disks required for I/O throughput and scalability. Mailbox servers clustered by using Windows Clustering and SAN-based storage enabled Microsoft IT to achieve 99.99 percent availability with Exchange Server 2003, yet the shared storage solution was a single point of failure that was expensive and required specialized skills to optimize and maintain the configuration. Additionally, the mailbox databases on disks remained single points of failure.

Note: For more information about mailbox storage design, see http://technet.microsoft.com/en-us/library/bb738147.aspx.

To break through the old limitations, Microsoft IT defined the following storage design requirements for Exchange Server 2007:

  • Continue to maintain 99.99 percent availability at the service level.
  • Increase Mailbox server resilience by removing single points of failure in the storage subsystem and its components.
  • Reduce storage infrastructure costs and increase mailbox quotas from 200 MB to 500 MB and 2 GB depending on the type of mailboxes.
  • Increase deleted items' retention from three days to 14 days to enable users to recover deleted mail items within a larger time window without the necessity of restoring from backup.

Eliminating Storage as the Single Point of Failure

Exchange Server 2007 supports two different clustering techniques for Mailbox server configurations: single copy cluster (SCC) and cluster continuous replication (CCR). Both types rely on Windows Clustering, yet only CCR provides redundancy at the storage level by replicating mailbox data from one server to another at the storage group level through a mechanism commonly known as asynchronous log shipping. SCC is comparable to an Exchange Server 2003 clustering configuration where all cluster nodes use a shared storage subsystem for quorum resource, mailbox databases, and transaction logs. Because only CCR eliminates the mailbox store as a single point of failure, and because CCR does not impose any specialized hardware or storage requirements, Microsoft IT decided to use this technology as the core of its Mailbox server designs to increase Mailbox server resilience against storage-level failures.

To use server hardware efficiently while providing the required redundancy level through CCR, Microsoft IT implemented two-node Majority Node Set (MNS) server clusters with file-share witness. Each cluster node stores a copy of the MNS quorum on the local system drive and keeps it synchronized with the other node. The file-share witness feature enables the use of a file share that is external to the cluster as an additional vote to determine the status of the cluster in a two-node MNS quorum cluster deployment. This helps to avoid an occurrence of network partition within the cluster, also known as split-brain syndrome. Split-brain syndrome occurs when all networks designated to carry internal cluster communications fail, and nodes cannot receive heartbeat signals from each other. To enable the file-share witness feature, Microsoft IT specifies a file share on a Hub Transport server in the MNSFileShare property of the MNS resource configuration.
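
The role of the file-share witness as a tie-breaking vote can be illustrated with a simplified vote-counting model. The Python sketch below is an assumption-laden illustration, not the Windows Clustering implementation: it treats the two nodes and the witness as three equal votes and ignores how the cluster service actually arbitrates access to the share. The function name is hypothetical.

```python
def has_quorum(nodes_visible, witness_reachable, total_votes=3):
    """Return True if the votes that one node can count form a strict majority.

    nodes_visible counts the node itself plus any peer it can still reach.
    """
    votes = nodes_visible + (1 if witness_reachable else 0)
    return 2 * votes > total_votes

# Both nodes healthy and connected: quorum with or without the witness.
print(has_quorum(nodes_visible=2, witness_reachable=False))

# Heartbeats lost between the nodes (network partition):
# the node that reaches the witness keeps quorum, the other node does not.
print(has_quorum(nodes_visible=1, witness_reachable=True))
print(has_quorum(nodes_visible=1, witness_reachable=False))
```

In this model, only one side of a partition can hold a majority, which is the property that prevents both nodes from bringing the clustered Mailbox server online at the same time.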

Figure 11 illustrates the Microsoft IT CCR configuration. Both CCR cluster nodes and the file-share witness must exist in the same Active Directory site. Microsoft IT did not deploy geographically dispersed clusters during the initial production rollout of Exchange Server 2007 to avoid the complexities of building IP subnets and Active Directory sites that span multiple geographic locations, which would be a requirement for distributed Windows Server 2003 cluster nodes.

Figure 11. Microsoft IT CCR configuration

As depicted in Figure 11, Microsoft IT uses three network adapters in each cluster node. The first two adapters connect the cluster node in a NIC teaming configuration through separate Gigabit Ethernet switches to the public network, which Microsoft Office Outlook clients and Exchange Server computers use to communicate with the Mailbox server. The Microsoft Exchange Replication Service on the cluster nodes also uses the public network connection to replicate the mailbox store databases from active node to passive node by using transaction log shipping so the mailbox store data is readily available on the passive node if the active node fails. The third network adapter establishes the private network connection between the cluster nodes to communicate cluster heartbeats.

Another important component that is necessary to ensure high availability in the CCR-based Mailbox server configuration is the transport dumpster feature on Hub Transport servers. The transport dumpster allows messages to be redelivered to a continuous replication-enabled storage group in the event that the storage group experiences a lossy failure. Clustered Mailbox servers might request redelivery if a failover occurs to a passive node before CCR has copied the most recent transactions. It is therefore important to ensure that Hub Transport servers have the appropriate capacity to handle a transport dumpster. To configure the transport dumpster, Microsoft IT uses the Set-TransportConfig cmdlet with the following parameters:

  • MaxDumpsterSizePerStorageGroup   This parameter specifies the maximum size of the transport dumpster queue per storage group. Microsoft IT specifies 15 MB for this parameter.
  • MaxDumpsterTime   This parameter specifies how long the Hub Transport server can retain messages in the transport dumpster queue. Microsoft IT uses a value of 07.00:00:00, which corresponds to seven days.
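
To get a feel for the Hub Transport capacity that these settings imply, the following Python sketch multiplies the per-storage-group limit by the number of storage groups. It rests on an assumption that this paper does not state: that a Hub Transport server may need to hold a full dumpster queue for every storage group it serves. Treat the result as a rough upper bound. The function name is hypothetical.

```python
def dumpster_capacity_mb(storage_groups, max_dumpster_mb_per_sg=15):
    """Upper-bound dumpster space, assuming one full queue per storage group."""
    return storage_groups * max_dumpster_mb_per_sg

# One Mailbox server in the 6,000-user design hosts 42 storage groups.
per_mailbox_server = dumpster_capacity_mb(42)
print(f"{per_mailbox_server} MB per clustered Mailbox server")
```

With 15 MB per storage group, a single 42-storage-group Mailbox server accounts for at most 630 MB of dumpster space under this assumption.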

Note: Microsoft IT uses the same hardware configuration on active and passive CCR cluster nodes to maintain the same performance level after a failover to the passive node.

Reducing Storage Costs and Configuration Complexities

The new Mailbox server design with CCR required Microsoft IT to double the storage and the storage arrays in order to remove the data and storage points of failure. Microsoft IT looked into the cost of deploying Mailbox servers with SAN technology and determined that CCR with SAN-based storage would require redundant SAN environments, which were cost prohibitive. Instead, Microsoft IT decided to use direct attached storage, which met the requirements, reduced costs, and reduced operational complexity for Microsoft IT.

With direct attached storage, Microsoft IT eliminated the need to use SAN, Internet SCSI (iSCSI), or other shared storage technologies in the cluster configuration. Every cluster node uses its own direct attached storage subsystem to maintain a separate copy of the mailbox databases. This also improves failover behavior, because a hardware or software problem that affects a database on the first node is unlikely to affect the recovery operation on the second node. By removing single points of failure that existed in the Exchange Server 2003 architecture and by using improved failover behavior in the Mailbox server configuration, Microsoft IT felt confident in making the switch from SAN to direct attached storage (DAS) as the basis of its server architectures in the Exchange Server 2007-based messaging environment.

Optimizing the Storage Design for Reliability and Recoverability

Although CCR is an effective high-availability feature, it is important to recognize that the technology does not eliminate the need for reliability and recoverability provisions at the storage and server levels. For example, failing over an entire cluster with thousands of mailboxes between the active and the passive nodes might not be an effective measure if only a single disk in a disk array experienced a failure or if the transaction log volume is running short of disk space.

Microsoft IT uses the following features to ensure reliability and recoverability at the storage level:

  • Redundant array of independent disks (RAID)   Microsoft IT uses stripe sets of mirrored disks according to RAID level 1+0, also called RAID 10, to achieve optimum disk drive performance and fault tolerance at the physical disk level. In the new server design for 6,000 mailboxes, the RAID 10 drives for transaction logs use eight disks, which means that the transaction log drives can tolerate a maximum of four disk failures (provided that no two failed disks belong to the same mirror), and the I/O performance only drops by 25 percent in the event of a single disk failure. The RAID 10 drives for database files use 14 disks in the Mailbox server design for 6,000 users, which corresponds to an even higher level of resilience to disk failures. A maximum of seven disks can fail, again one per mirror, and in the event of a single disk failure, the I/O performance only drops by roughly 15 percent.

    Note: Microsoft IT does not use a hot spare in the RAID configuration, because most Microsoft data centers are staffed 24 hours, seven days per week. In the event of a disk failure, an IT specialist is readily available to replace the affected disk with minimum delay.

  • Separate transaction logs from database files   Placing transaction log files and database files on separate physical drives ensures that recent transactions are still available in the transaction logs even if the database RAID group fails. Separating transaction logs from database files is one of the most basic strategies to improve the performance and fault tolerance of any version of Exchange Server.
  • No circular logging on Mailbox servers   In a CCR server cluster, when circular logging is enabled, the process will not expunge the transaction log until it has been replicated and replayed into the target database.
  • Configure multiple storage groups per Mailbox server   To ensure timely backup and restore operations according to SLA requirements, which require backup operations to be complete within four hours, Microsoft IT wanted to keep the size of individual databases per Mailbox server below 200 GB and implement a schedule with weekly full and daily incremental backups, as explained later in this paper. Keeping individual database file sizes below 200 GB also helps Microsoft IT complete online maintenance cycles regularly, which is critical for keeping the Mailbox server healthy, the system performance stable, and the sizes of database files under control. Because the CCR configuration only supports one mailbox database per storage group, Microsoft IT must configure multiple storage groups per Mailbox server. For example, in the Mailbox server design for 6,000 users, Microsoft IT uses 42 storage groups with approximately 143 mailboxes in each.
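
The RAID 10 figures in the list above follow from the number of mirrors in each stripe set. The Python sketch below reproduces them under a simple model that is inferred from the quoted percentages rather than stated in this paper: a single disk failure is assumed to cost the throughput share of one mirror. The function name is hypothetical.

```python
def raid10_profile(disks):
    """Summarize a RAID 10 set built from two-disk mirrors."""
    if disks < 2 or disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks (at least 2)")
    mirrors = disks // 2
    return {
        "mirrors": mirrors,
        # Best case: every failed disk sits in a different mirror.
        "max_tolerated_failures": mirrors,
        # Assumed model: one degraded mirror costs 1/mirrors of the I/O capacity.
        "io_drop_single_failure": 1 / mirrors,
    }

for label, disks in (("transaction logs", 8), ("database files", 14)):
    profile = raid10_profile(disks)
    print(f"{label}: {profile['mirrors']} mirrors, "
          f"up to {profile['max_tolerated_failures']} failures, "
          f"{profile['io_drop_single_failure']:.0%} drop after one failure")
```

For eight disks the model gives a 25 percent drop, and for 14 disks about 14 percent, which matches the "roughly 15 percent" quoted for the database drives.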

Standardizing the Storage Design

Microsoft IT greatly benefits from a standardized storage layout for all Mailbox servers. Mailbox servers host up to 6,000 mailboxes in 42 separate storage groups that require identical storage group and database locations on all CCR cluster nodes. Standardizing the storage design for the CCR server clusters helps Microsoft IT prevent human errors in disaster-recovery situations, simplify hardware installation and configuration tasks, accelerate server deployments, and ultimately save operational costs.

To standardize the storage layout for Exchange Server 2007, ensure reliability, and provide scalability, Microsoft IT developed the concept of a universal storage building block (USBB). A USBB is a self-contained unit of two physical storage enclosures, combined to provide database and transaction log drives with the necessary level of hardware component redundancy. The number of disk drives—identified through individual logical unit numbers (LUNs) at the small computer system interface (SCSI) level—that Microsoft IT can use in a USBB depends on the capacity of the storage enclosures and the required number of LUNs per RAID drive to provide the desired data capacity and I/O performance. The USBB count per server depends on the number of mailboxes and the mailbox quotas that Microsoft IT wants to maintain on the server.

The standardized storage layout based on USBBs provides Microsoft IT with the benefits of a flexible and scalable server design. For example, in the Mailbox server design for 6,000 users, Microsoft IT uses storage enclosures with 25 small form factor (SFF) disks (146-GB 10,000-RPM serial attached SCSI [SAS] 2.5-inch SFF). Microsoft IT prefers small form factor disks because of the lower power requirements, lower cost, and higher performance and reliability in comparison to the large form factor 3.5-inch disks. Microsoft IT configures the hardware RAID controller in such a way that it mirrors the disks across the two storage enclosures, and then includes these mirrors in stripe sets to create the desired number of RAID 10 drives. Seen from an individual node level, one RAID controller introduces a single point of failure to the configuration. Yet, Microsoft IT did not find it necessary to install redundant RAID controllers on a single node because the CCR cluster as a whole eliminates the storage subsystem as a single point of failure.

Figure 12 shows a USBB configuration with 25 disks per storage enclosure, three RAID 10 LUNs for the databases, and one RAID 10 LUN for all transaction logs combined. Microsoft IT uses this particular USBB configuration in the Mailbox server design for 6,000 users by attaching multiple instances of it to the same server to achieve desired scale.

Figure 12. Universal storage building block design

To determine the required number of disks per database LUN, Microsoft IT considered the following three factors:

  • Backup schedule   To facilitate a backup schedule of weekly full and daily incremental streaming backups, Microsoft IT places seven storage groups on each database LUN, also known as the "2 LUNs per Backup Set" model. Because the CCR configuration requires one mailbox database per storage group, each database LUN must accommodate seven database files. For more information about the 2 LUNs per Backup Set model, see http://technet.microsoft.com/en-us/library/bb738147.aspx.
  • Mailbox database capacity requirements   As explained earlier, Microsoft IT distributes mailboxes across 42 storage groups to keep the size of individual database files below 200 GB. In a Mailbox server configuration for 6,000 users, this means that each database file must host 143 mailboxes. Assuming a maximum mailbox size of 500 MB plus 69 percent overhead (54 percent database overhead, 5 percent content indexing overhead, and 10 percent reserve for unexpected database growth), an individual database can grow up to approximately 120 GB (500 MB * 143 * 169% = approximately 118 GB). It follows that an individual database drive must have a minimum capacity of 840 GB to store seven mailbox databases of 120 GB each. In a RAID 10 configuration with 146-GB SFF disks, 14 disks are required to provide the necessary storage capacity, because each individual mirror in the stripe set only has a usable capacity of 137 GB (840 GB / 137 GB = 6.13, which means 7 mirrors are required per stripe set * 2 disks per mirror = 14 disks per stripe set).

    Note: Microsoft IT determined the database overhead factor of 54 percent based on the average message traffic per user in the corporate production environment, the desired deleted items retention time of 14 days, and internal database overhead. The average message traffic per user (both sent and received items) is approximately 10 MB per day, which means that the dumpster for the deleted items must be able to hold 140 MB per user (14 days * 10 MB per day = 140 MB). In addition, Microsoft IT takes into account 20 percent overhead based on the maximum mailbox and dumpster size to accommodate internal database structures, table indexes, and so forth. Accordingly, the total database overhead is approximately 54 percent of the maximum mailbox size ((140 MB + [500 MB + 140 MB] * 20%) / 500 MB = 268 MB / 500 MB = 53.6%).

  • Input/output performance   To achieve optimal Mailbox server response times, the storage subsystem must be able to sustain the load that users generate in terms of IOPS without creating a bottleneck. A single 146-GB SFF 10,000-RPM SAS disk can perform approximately 160 IOPS with response times of less than 20 milliseconds. The question now is how many disks are necessary to satisfy the I/O load that 6,000 concurrent users place on a Mailbox server. The answer depends on the usage patterns of the users and the hardware configuration of the server.

    To determine the I/O load, Microsoft IT continuously monitors the Disk Transfers/sec, Disk Reads/sec, and Disk Writes/sec performance counters on Mailbox servers in the corporate production environment. Values measured since the initial rollout show that Microsoft employees are generally heavy users who generate approximately 0.4 IOPS per user (Redmond full-time employees) on Exchange Server 2007 with a read/write mix of 1/1. However, these values come from Mailbox servers with 5 MB of memory per user. It is important to remember that the new Mailbox server design has 60 percent less memory (2 MB per user) to cache mailbox data. Due to the smaller data cache, read operations will increase and the read/write mix will change to 2/1. Accordingly, Microsoft IT expects the rise in read activity to bring the load to a range between 0.35 and 0.7 IOPS per user. This range of IOPS represents a dramatic reduction in I/O requirements on a Mailbox server running Exchange Server 2007. It provided Microsoft IT with enough headroom to balance the I/O load against server memory in order to support greater numbers of users per server.

    Supporting approximately 1,000 users per database LUN (143 users per storage group * 7 storage groups = 1,001 users) with 0.7 IOPS in a 2/1 read/write mix means the RAID controller receives 468 read and 234 write requests per second. Corresponding to the RAID 10 configuration, each write request equals two write operations behind the controller, so the database LUN must be able to perform 468 read and 468 write requests per second, or 936 IOPS in total. With 160 IOPS per disk, only six disks are necessary per database drive to reach the required performance level (936 / 160 = 5.8). The USBB design uses 14 disks per database LUN because of capacity requirements and does not create a performance bottleneck under normal operating conditions.
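
The capacity and performance reasoning in the list above can be retraced in code. The following Python sketch is a worked example under the stated figures, not a general sizing tool. All function names are hypothetical, and the sketch uses the itemized overhead of 69 percent, which yields about 118 GB per database, in line with the approximately 120 GB planning figure.

```python
import math

MB_PER_GB = 1024

def database_overhead(quota_mb, daily_traffic_mb, retention_days, structure_overhead=0.20):
    """Dumpster plus internal-structure overhead as a fraction of the mailbox quota."""
    dumpster_mb = daily_traffic_mb * retention_days
    return (dumpster_mb + structure_overhead * (quota_mb + dumpster_mb)) / quota_mb

def database_size_gb(mailboxes, quota_mb, total_overhead):
    """Maximum size of one mailbox database, including overhead."""
    return mailboxes * quota_mb * (1 + total_overhead) / MB_PER_GB

def disks_for_capacity(lun_capacity_gb, usable_gb_per_mirror):
    """RAID 10 disks needed for capacity: whole mirrors, two disks each."""
    return math.ceil(lun_capacity_gb / usable_gb_per_mirror) * 2

def disks_for_iops(users, iops_per_user, read_fraction, iops_per_disk):
    """RAID 10 disks needed for I/O: each write costs two back-end operations."""
    reads = users * iops_per_user * read_fraction
    writes = users * iops_per_user * (1 - read_fraction)
    return math.ceil((reads + 2 * writes) / iops_per_disk)

overhead = database_overhead(quota_mb=500, daily_traffic_mb=10, retention_days=14)
db_size = database_size_gb(mailboxes=143, quota_mb=500, total_overhead=0.69)
capacity_disks = disks_for_capacity(lun_capacity_gb=7 * 120, usable_gb_per_mirror=137)
iops_disks = disks_for_iops(users=1001, iops_per_user=0.7, read_fraction=2 / 3, iops_per_disk=160)

print(f"database overhead: {overhead:.1%}")
print(f"database size: {db_size:.0f} GB")
print(f"disks for capacity: {capacity_disks}, disks for IOPS: {iops_disks}")
print(f"disks per database LUN: {max(capacity_disks, iops_disks)}")
```

Because the capacity requirement (14 disks) exceeds the performance requirement (6 disks), capacity determines the disk count per database LUN in this design.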

Having ensured that the standardized USBB design provides the required storage capacities and I/O performance levels, it is straightforward for Microsoft IT to design the overall storage subsystem. In the Mailbox server design for 6,000 users, each USBB stores 3,000 mailboxes (3 database LUNs * 1,000 mailboxes per LUN = 3,000 mailboxes), so the Mailbox server needs two USBBs per cluster node with corresponding drive letter assignments, as depicted in Figure 13.

Figure 13. Microsoft IT storage design for Mailbox servers (6,000 users)
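
As a cross-check of the building-block arithmetic, the short Python sketch below verifies that three 14-disk database LUNs and one 8-disk transaction log LUN fill two 25-disk enclosures, and that 6,000 mailboxes require two USBBs per node. The 8-disk log LUN per USBB is inferred from the RAID description earlier in this paper, and the function names are hypothetical.

```python
import math

def usbb_disk_usage(disks_per_enclosure=25, enclosures=2,
                    db_luns=3, disks_per_db_lun=14, log_lun_disks=8):
    """Return (available, used) disk counts for one USBB."""
    available = disks_per_enclosure * enclosures
    used = db_luns * disks_per_db_lun + log_lun_disks
    return available, used

def usbbs_per_node(mailboxes, mailboxes_per_db_lun=1000, db_luns_per_usbb=3):
    """Number of USBBs that each cluster node needs for the given mailbox count."""
    return math.ceil(mailboxes / (mailboxes_per_db_lun * db_luns_per_usbb))

available, used = usbb_disk_usage()
blocks = usbbs_per_node(6000)
storage_groups = blocks * 3 * 7  # USBBs * database LUNs * storage groups per LUN

print(f"disks per USBB: {used} of {available} used")
print(f"USBBs per node: {blocks}, storage groups per server: {storage_groups}")
```

The 42 storage groups that result agree with the storage group count used throughout the 6,000-user design.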

Backup and Recovery

In addition to eliminating single points of failure at the Mailbox server and storage levels, Microsoft IT continues to use backups to protect against loss or damage of data. Backups provide an additional measure of protection in the event of a single-node failure. If one node in a two-node CCR cluster fails, only one node remains with the data until Microsoft IT repairs the databases on the affected node. Especially in this situation, database backups provide necessary redundancy. If the second node fails before the first node is restored or in the unlikely event that both nodes fail simultaneously, backups might provide the last resort to recover the data.

For the rollout of Exchange Server 2007 in the corporate production environment, Microsoft IT defined the following backup and recovery requirements:

  • Support mailbox capacities of 500 MB and 2 GB, depending on the Mailbox server design.
  • Reduce backup costs by eliminating tape as the backup media for Exchange data.
  • Complete server backup operations within four hours and database restore operations within one hour, according to existing SLAs.
  • Maintain 14 days' worth of database backups.

Performing VSS-Based Backups on Passive Node

The backup solution that Microsoft IT had to use during the initial production rollout of Exchange Server 2007 relied on the Windows Backup utility (NTBackup.exe) to perform streaming online backups controlled through command-line scripts. Microsoft IT would have preferred to use Volume Shadow Copy Service (VSS) technology instead, because VSS provides the ability to offload backup processes to the passive node, yet during the initial deployment time frame, VSS-based solutions for Exchange Server 2007 were not available.

The disadvantage of performing streaming backups is that backup processes must run on the active node in a CCR server cluster, which limits the backup window. For example, to minimize the impact of backup operations on user processes and online maintenance, Microsoft IT requires backup cycles to be complete within four hours, between 8 P.M. and midnight. This represents a considerable challenge, because Microsoft IT designed the Mailbox servers to host up to 10 terabytes of messaging data on some servers.

Microsoft IT plans to move to software-based VSS backups as soon as Microsoft System Center Data Protection Manager (DPM) 2007 becomes available to perform backups on the passive nodes, as illustrated in Figure 14. This allows backup operations in much more frequent intervals than streaming backups on active nodes. In fact, using DPM 2007, Microsoft IT can copy transaction logs as often as every 15 minutes to the DPM server. Accordingly, Microsoft IT gains the ability to restore Exchange Server data to any 15-minute point in time to the original server or to a different server. This is the backup solution Microsoft IT needs to go beyond current Mailbox server sizes in terms of numbers of users and gigabytes per mailbox in the corporate production environment.

Figure 14. The future Microsoft IT backup solution for Exchange Server 2007

Eliminating Backups to Tape

Despite the limitations of the streaming backup solution, Exchange Server 2007 enabled Microsoft IT to achieve improvements in the backup design. Specifically, Microsoft IT eliminated tape backups and their associated operational costs. This was possible because Microsoft IT is not required to keep data on tape for long-term archiving. CCR helped Microsoft IT reduce the dependency on backup, because each Mailbox server already maintains two copies of the mailbox data. With the lack of strict "backup to tape" requirements, Microsoft IT saw an opportunity to implement a more reliable and cost-effective solution by performing backups to disk.

To provide the required backup storage, Microsoft IT uses a separate RAID controller with a set of 500-GB and 750-GB Serial Advanced Technology Attachment (SATA) disks in a RAID 5 configuration on the active node. This RAID 5 drive provides enough capacity to store 14 days' worth of online database backups so that Microsoft IT always has two full backups available. As illustrated in Figure 15, the passive node does not have this extra storage, because Microsoft IT cannot perform online backup operations on the passive node until the DPM 2007-based backup solution is available. During the transition to DPM 2007, Microsoft IT plans to move the local backup storage and reuse the disks on the DPM host.

Note: The current architecture introduces another single point of failure: the disk array that houses the backups. If both the active and the passive database copies are lost, recovery depends entirely on this one array.

Figure 15. Disk-based backup storage for streaming online backups

Moving from backup tapes to disk-based backup storage provided Microsoft IT with the following advantages:

  • Reduced backup costs and complexities   Eliminating backup tapes alone enabled Microsoft IT to reduce costs by approximately $5 million per year. Microsoft IT achieves further cost savings by removing tape systems from the data protection infrastructure and reducing maintenance complexities.
  • Increased reliability of restore operations   Microsoft IT operational statistics show a 17 percent annual failure rate for tape drives. By moving to disk-based backup storage, Microsoft IT can reduce the likelihood of restore failures during recovery operations due to unreadable backup media or corrupted backup catalogs.
  • Increased performance and throughput   Tape-based restores require time for locating and mounting the required backup media and building indexes. Disk-based backup storage provides much quicker access and faster I/O performance during recovery operations.

Optimizing Backup Cycles According to SLAs

Prior to Exchange Server 2007, Microsoft IT performed daily full database backups. This was possible because mailbox quotas were limited to 200 MB. An Exchange 2003 server with 4,000 mailboxes would host approximately 1 terabyte of messaging data. Using Windows Backup in non-buffered I/O mode to bypass Windows Cache Manager and backing up four storage groups in parallel enabled Microsoft IT to complete backup cycles within the SLA-prescribed time frame of four hours. However, with increased mailbox quotas of 500 MB and 2 GB and up to 6,000 mailboxes per Mailbox server running Exchange Server 2007, data volumes would overtax the existing backup processes.
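
A rough throughput estimate shows why larger quotas strain a fixed backup window. The Python sketch below is illustrative only: it assumes that data is spread evenly across parallel backup streams and ignores overhead such as verification, and the function name is hypothetical.

```python
def required_throughput_mb_s(data_gb, window_hours, parallel_streams=1):
    """Sustained MB/s per stream needed to copy data_gb within the window."""
    total_mb_s = data_gb * 1024 / (window_hours * 3600)
    return total_mb_s / parallel_streams

# Exchange Server 2003: about 1 terabyte, four storage groups in parallel.
legacy_total = required_throughput_mb_s(1024, 4)
legacy_per_stream = required_throughput_mb_s(1024, 4, parallel_streams=4)

# Exchange Server 2007: 6,000 mailboxes at the 500-MB quota, before overhead.
new_total = required_throughput_mb_s(6000 * 500 / 1024, 4)

print(f"legacy: {legacy_total:.1f} MB/s total, {legacy_per_stream:.1f} MB/s per stream")
print(f"new design, daily full backup: {new_total:.1f} MB/s total")
```

Even without database overhead, a daily full backup of the new design would need close to three times the sustained throughput of the legacy servers within the same four-hour window.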

Being unable to deploy a VSS-based backup solution during the initial production rollout, Microsoft IT decided to switch from daily full to daily incremental and weekly full database backups. This approach enabled Microsoft IT to reduce the daily backup volume in order to stay within the required four-hour backup window. On the downside, however, daily incremental and weekly full backups complicate restore processes. Recovering a database now requires restoring the last full backup and all incremental backups since the last full backup, which takes more time and requires multiple restore operations to accommodate the full and incremental backup restorations.

To stagger the full and incremental online backups across a seven-day period, Microsoft IT places seven storage groups on each database LUN on the Mailbox servers. Streaming technology supports backing up mailbox databases in different storage groups in parallel. Using a separate backup session also enables Microsoft IT to perform full backups for each mailbox database on a different weekday, as summarized in Table 4. According to the business requirement to maintain 14 days' worth of database backups, the backup storage provides sufficient capacity to store two full backups and 12 incremental backups of the Mailbox server's messaging databases.

Table 4. Legacy Streaming Backup Schedule per Database LUN

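| Storage group | Mon | Tue | Wed | Thu | Fri | Sat | Sun |
|---|---|---|---|---|---|---|---|
| SG 1 | Full | Inc | Inc | Inc | Inc | Inc | Inc |
| SG 2 | Inc | Full | Inc | Inc | Inc | Inc | Inc |
| SG 3 | Inc | Inc | Full | Inc | Inc | Inc | Inc |
| SG 4 | Inc | Inc | Inc | Full | Inc | Inc | Inc |
| SG 5 | Inc | Inc | Inc | Inc | Full | Inc | Inc |
| SG 6 | Inc | Inc | Inc | Inc | Inc | Full | Inc |
| SG 7 | Inc | Inc | Inc | Inc | Inc | Inc | Full |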
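
The staggered schedule in Table 4 and the restore sequence it implies can be expressed compactly. The Python sketch below is a minimal illustration with hypothetical function names: it generates the rotation for seven storage groups and lists the backups that must be restored, in order, to bring a storage group back to the state of a given day's backup.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def backup_schedule(storage_groups=7):
    """Storage group n gets its full backup on the nth weekday, incrementals otherwise."""
    return {
        f"SG {n + 1}": ["Full" if day == n % 7 else "Inc" for day in range(7)]
        for n in range(storage_groups)
    }

def restore_chain(sg_number, last_backup_day):
    """Backups to restore: the most recent full backup plus every later incremental."""
    full_day = (sg_number - 1) % 7
    span = (DAYS.index(last_backup_day) - full_day) % 7
    return [
        (DAYS[(full_day + offset) % 7], "Full" if offset == 0 else "Inc")
        for offset in range(span + 1)
    ]

for name, row in backup_schedule().items():
    print(name, " ".join(row))

# SG 1 has its full backup on Monday; restoring to Thursday's backup
# requires the full backup and three incremental backups.
print(restore_chain(1, "Thu"))
```

In the worst case, a restore needs one full backup and six incremental backups, which is the added restore complexity noted earlier for the weekly full and daily incremental approach.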

Note: Switching to disk-based backup storage and keeping mailbox database sizes below 200 GB were crucial measures to ensure that restores of individual databases from backups could still be completed within the allowed one-hour time window, even though CCR technology largely eliminates the need for restore operations from backups. In case of database corruption or hardware failures on a single node, Microsoft IT does not need to perform restores from backup. Instead, Microsoft IT simply reseeds the databases on the restored cluster node from the second cluster node that still has a healthy database copy.

Client Access Server Topology

Reliability and performance of Mailbox servers are crucial for the availability and quality of messaging services in an Exchange Server 2007 organization. Another important component is the Client Access server, which provides access to messaging items in mailboxes, availability information, and address book data in a number of scenarios. For example, users might work with Office Outlook Web Access in a Web browser session or synchronize mobile devices by using the Exchange ActiveSync protocol. Users can also work with the full Office Outlook 2003 or Outlook 2007 client over Internet connections, accessing their mailboxes through RPC over HTTPS (Outlook Anywhere). Office Outlook 2007 clients specifically communicate with Client Access servers in a number of additional scenarios: retrieving profile configuration settings by using the Autodiscover service, checking free/busy data by using the Availability service (which is part of Exchange Web Services), and downloading the Offline Address Book (OAB) from a virtual directory on a Client Access server. In all these cases, users communicate with a Client Access server, which in turn communicates with the Mailbox server by using the native Exchange Server Messaging Application Programming Interface (MAPI).

To provide Microsoft employees with reliable access to their mailboxes from practically any location with network access, Microsoft IT defined the following requirements for the Client Access server deployment:

  • Establish flexible and scalable client access points for each geographic region that can accommodate the existing mobile user population at Microsoft and large spikes in mobile messaging activity.
  • Preserve the common URL namespaces (such as https://mail.microsoft.com) that were established during the Exchange Server 2003 rollout for all mobile messaging clients within each geographic region.
  • Deploy all mobile messaging services on a common standardized Client Access server platform.
  • Maintain security for mobile messaging client access from the Internet by using the capabilities of ISA Server 2006.
  • Provide seamless backward compatibility with mailboxes still on the Exchange Server 2003 platform during transition and provide support for cross-forest access to enable availability of free/busy information.

Preserving Existing Namespaces for Mobile Access to Messaging Data

Each month, Client Access servers in the corporate production environment support approximately 60,000 unique Office Outlook Web Access users, 60,000 Outlook Anywhere connections, and 30,000 Exchange ActiveSync sessions. To distribute this load, Microsoft IT uses multiple URL namespaces according to geographic regions. Microsoft IT established this topology during the Exchange 2000 Server time frame, which means that Microsoft employees became accustomed to these URLs over many years. Preserving these URLs during the transition to Exchange Server 2007 and providing uninterrupted mobile messaging services through them was therefore an important objective for the production rollout.

To preserve the existing URL namespaces, Microsoft IT devised the following strategy:

  1. Deploy the Exchange Server 2007 Client Access servers in the corporate production environment in each data center/Active Directory site where Exchange Server 2007 Mailbox servers were planned.
  2. Test access to Exchange Server 2003 and Exchange Server 2007 resources through Client Access servers in all locations by manually pointing clients to them.
  3. Switch mobile messaging namespaces for each regional location in DNS to point to the Client Access servers.
  4. Move mailboxes from legacy Exchange 2003 servers to new Exchange 2007 servers and enable new mobile messaging features for transitioned users.

As explained in the "Message Routing Topology" section earlier in this white paper, Exchange Server 2007 makes extensive use of Active Directory sites, for example to define logical boundaries for message routing and server-to-server communications. For client access scenarios, this means that each Active Directory site with Mailbox servers must also include Client Access servers to ensure a fully functional messaging system. Accordingly, Microsoft IT deployed Client Access servers locally in the data centers of Dublin, Sao Paulo, Singapore, and Redmond. As illustrated in Figure 16, Microsoft IT focused the deployment heavily on dedicated servers with varying Mailbox-to-Client Access server ratios in order to establish flexible and scalable messaging services that can accommodate large spikes in user activity. Only in Sao Paulo did Microsoft IT deploy a multiple-role server that hosts the Hub Transport and Unified Messaging server roles in addition to the Client Access server role, because of the moderate number of users in the South American region.

Figure 16. Global Client Access server deployment

Client Access servers only communicate directly with Exchange Server 2007 Mailbox servers in their local Active Directory site. For requests to Mailbox servers in remote Active Directory sites, Client Access servers must proxy or redirect the request to a Client Access server that is local to the target Mailbox server. Microsoft IT prefers to redirect Office Outlook Web Access users. Keeping client connections local within each geographic region minimizes the impact of network latencies between the Client Access server and Mailbox server on client performance and mitigates the risk of exhausting thread pools and available connections on central Client Access servers during large spikes in messaging activity. To redirect Office Outlook Web Access users to Client Access servers that are local to the user's Mailbox server, Microsoft IT registers the relevant external URL on each Internet-facing Client Access server in the ExternalURL property for all Office Outlook Web Access virtual directories.

Redirection is not available for other services, such as Exchange ActiveSync, Exchange Web Services, and Outlook Anywhere. To support these clients, remote Client Access servers act as proxy servers for local Client Access servers (Exchange ActiveSync, Exchange Web Services) or communicate with Mailbox servers directly by using RPCs (Outlook Anywhere). However, Microsoft employees know and prefer to use their local Internet access points.
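The handling of requests for mailboxes in local and remote Active Directory sites, as described in the two preceding paragraphs, can be summarized as a small decision function. This is an illustrative Python sketch only: the function name `route_request`, the service keys, and the returned strings are assumptions and do not correspond to any Exchange Server API.

```python
def route_request(service: str, cas_site: str, mailbox_site: str) -> str:
    """Describe how a Client Access server handles a request for a given mailbox."""
    if service not in ("owa", "activesync", "ews", "outlook_anywhere"):
        raise ValueError(f"unknown service: {service}")
    # Mailbox servers in the local Active Directory site are contacted directly.
    if cas_site == mailbox_site:
        return "serve from local Mailbox server"
    # Outlook Web Access users are redirected to the ExternalURL of a Client
    # Access server that is local to the user's Mailbox server.
    if service == "owa":
        return f"redirect to Client Access server in {mailbox_site}"
    # Exchange ActiveSync and Exchange Web Services requests are proxied.
    if service in ("activesync", "ews"):
        return f"proxy to Client Access server in {mailbox_site}"
    # Outlook Anywhere communicates with the Mailbox server by using RPCs.
    return f"RPC to Mailbox server in {mailbox_site}"


if __name__ == "__main__":
    for svc in ("owa", "activesync", "ews", "outlook_anywhere"):
        print(svc, "->", route_request(svc, "Redmond", "Dublin"))
```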

Note: An exception to the distributed approach to mobile messaging services at Microsoft is the Internet access point for the Autodiscover service used for automatic profile configuration of clients such as Outlook 2007. Microsoft IT provides centralized access to the Autodiscover service because of its reliance on the primary SMTP address of the users. All Microsoft users in the Corporate Forest have the same SMTP domain name worldwide (that is, @microsoft.com). Outlook 2007 derives the Autodiscover URL from the user's primary e-mail address and attempts to access the Autodiscover service at https://autodiscover.<SMTP domain>/autodiscover/autodiscover.xml or, if this URL does not exist, at https://<SMTP domain>/autodiscover/autodiscover.xml. Microsoft employees working with Outlook 2007 over the Internet connect to https://autodiscover.microsoft.com/autodiscover/autodiscover.xml regardless of the user's geographic location.
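The URL derivation described in the note above can be illustrated with a short Python sketch. The function name `autodiscover_candidates` and the sample address are hypothetical; the sketch only builds the two candidate URLs in the order the note describes and does not perform any network lookup.

```python
def autodiscover_candidates(primary_smtp_address: str) -> list[str]:
    """Build the Autodiscover URLs a client would try, in order, for an e-mail address."""
    local_part, separator, domain = primary_smtp_address.rpartition("@")
    if not separator or not local_part or not domain:
        raise ValueError(f"not a valid SMTP address: {primary_smtp_address!r}")
    domain = domain.lower()
    return [
        # First attempt: the dedicated autodiscover host in the SMTP domain.
        f"https://autodiscover.{domain}/autodiscover/autodiscover.xml",
        # Fallback: the SMTP domain itself.
        f"https://{domain}/autodiscover/autodiscover.xml",
    ]


if __name__ == "__main__":
    for url in autodiscover_candidates("someone@microsoft.com"):
        print(url)
```

Because every user in the Corporate Forest shares the same SMTP domain, both candidates are identical for all users, which is why a single centralized access point is sufficient.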

Increasing Security Based on ISA Server 2006

To provide adequate protection for Client Access servers from the Internet, Microsoft IT continues to use Microsoft Internet Security and Acceleration (ISA) Server 2006. ISA Server 2006 provides many features specifically designed to publish Exchange Server resources, including mobile messaging scenarios. For example, ISA Server 2006 includes a New Exchange Publishing Rule Wizard that facilitates the configuration of publishing rules for Office Outlook Web Access, Exchange ActiveSync, and Outlook Anywhere.

Among other things, ISA Server 2006 provides stateful inspection and application-layer filtering for mobile messaging connections coming from the Internet. Stateful inspection enables ISA Server to block any traffic that appears out of context, such as requests to initiate a connection on an established session. To perform these functions, however, ISA Server must analyze the payload in data packets, which requires ISA Server 2006 to decrypt the Secure Sockets Layer (SSL) stream. Accordingly, Microsoft IT terminates SSL connections from the Internet on the ISA server and then establishes a new SSL connection between the ISA server and the Client Access server. This SSL bridging process enables ISA Server 2006 to filter invalid data packets before the traffic reaches the Client Access servers while maintaining the confidentiality of client-to-server communication as it transits both external and internal networks. For Internet connections, Microsoft IT uses an externally trusted SSL certificate, installed directly on the ISA servers. For connections between ISA Server 2006 and the Exchange 2007 Client Access servers, Microsoft IT could have used internally trusted SSL certificates but chose to use the same externally trusted ones for consistency.

Providing Load Balancing and Fault Tolerance for External Client Connections

ISA Server 2006 features also enable Microsoft IT to establish a highly scalable load-balancing infrastructure for external mobile messaging clients. This infrastructure avoids the complicated load distribution and client session affinity issues that frequently arise in scenarios with network address translation (NAT) and load-balancing components that rely on the source IP address of the connection. Figure 17 illustrates the load-balancing architecture that Microsoft IT uses for its Client Access server scenarios. The same architecture is replicated in each of the regional data centers where Exchange 2007 Mailbox and Client Access server infrastructure exists.

Figure 17. Client access architecture for external clients

Microsoft IT uses the following technologies to provide load balancing and fault tolerance for Internet-based client access:

  • Integrated network load balancing for ISA server farm   To distribute incoming client connections over multiple computers running ISA Server 2006, Microsoft IT uses integrated NLB in single-affinity mode. Single affinity helps to maintain the session state by directing subsequent requests from the same client IP address to the same ISA server. Maintaining the session state is important when using HTTPS connections, because the client can then reuse the existing SSL session identifier (ID) for multiple requests. If directed to a different ISA server, the client would have to negotiate a new SSL session ID. Although this process is transparent to the user, it requires five times as much overhead as reusing the existing SSL session ID. Correspondingly, single affinity reduces SSL-related performance overhead on the ISA server farm.
  • Web server farm load balancing for Client Access servers   Using the New Server Farm Wizard in ISA Server 2006, Microsoft IT creates a server farm object that includes the IP addresses of all Client Access servers in the specified location. The server farm object enables ISA Server to treat all servers in the farm as a single load-balanced entity. The server farm object also defines a connectivity verifier for each farm member to determine the state of each individual Client Access server and exclude from load balancing those servers that are temporarily unavailable.

    It is important to note that Web server farm load balancing does not require NLB on the Client Access servers, yet session affinity requirements remain because ISA servers and Client Access servers communicate over HTTPS. For published Office Outlook Web Access paths (/exchange/*, /owa/*, and /public/*), ISA Server 2006 automatically uses new cookie-based load-balancing methods to direct all requests issued by the same Web browser session to the same Client Access server. This is important to ensure that, once authenticated against a particular Client Access server, the user is not redirected to another Client Access server in the farm, which would cause new authentication prompts. The cookie-based load-balancing method of ISA Server 2006 enables Microsoft IT to perform effective load balancing even in scenarios where multiple clients are hidden behind a NAT device or a Web proxy on the originating side that does not expose their unique IP addresses.

    For all remaining paths (/Microsoft-Server-ActiveSync/*, /RPC/*, /Autodiscover/*, /EWS/*, and /UnifiedMessaging/*) that do not rely on the Web browser, ISA Server uses IP-based load balancing, which is comparable to a single-affinity configuration in that requests from the same client IP address are sent to the same Client Access server.

Providing Load Balancing and Fault Tolerance for Internal Client Connections

Although mobile messaging connections coming from the Internet through the ISA Server 2006 infrastructure do not require NLB on Client Access servers, internal clients, such as Outlook Web Access clients and Outlook 2007 clients in the Corporate Forest, can establish HTTPS connections to Client Access servers directly and must be load-balanced. To bypass ISA Server 2006 within the corporate production environment, Microsoft IT registered internal Client Access server IP addresses for the URL namespaces in the internal DNS.

Note: Microsoft IT uses a split DNS configuration to accommodate the registration of internal Client Access servers, provide load balancing, and provide fault tolerance for internal client connections.

Figure 18 illustrates the load-balancing configuration that Microsoft IT established for internal messaging clients. Similar to the ISA server farm, Microsoft IT uses NLB with single affinity to direct requests from a particular client IP address always to the same Client Access server to maintain the session state and reuse SSL session IDs.

Figure 18. Client access architecture for internal clients

Optimizing Offline Address Book Distribution

Exchange Server 2007 introduces a new mechanism to distribute OAB information to Outlook 2007 clients. The new mechanism uses Windows Background Intelligent Transfer Service (BITS) over HTTPS connections instead of downloading the OAB files from a public folder by using MAPI. Especially with large OAB files, BITS provides significant advantages over the traditional OAB download method that previous versions of Office Outlook use, because BITS requires less network bandwidth, downloads the OAB files asynchronously in the background, and can resume file transfers after network disconnects and computer restarts. Exchange Server uses HTTP as the default communication method. Microsoft IT uses a trusted SSL certificate and enables SSL on the appropriate OAB directory for internal clients.

Within the corporate production environment, Microsoft IT generates four different OABs, one for each geographic region. This approach of generating and distributing OABs locally helps Microsoft IT to minimize OAB traffic over WAN connections while providing users with relevant address information in cached or offline mode. Accordingly, Microsoft IT configured an Exchange 2007 Mailbox server in each geographic region as the OAB Generation (OABGen) server and local Client Access servers as hosts from which clients download the regional OABs. The Exchange File Distribution Service, running on each Client Access server, downloads the OAB in the form of XML files from the OABGen server into a virtual directory that serves as the distribution point for Outlook 2007 clients. Outlook 2007 clients can determine the OAB download URL by querying the Autodiscover service.

Figure 19 illustrates the OAB distribution architecture. Outlook 2007 clients on the Internet access the OAB virtual directory through ISA Server 2006 by using an encrypted HTTPS connection. Within the internal network, Outlook 2007 clients can access the OAB virtual directory on Client Access servers directly without the need to go through ISA Server 2006. Office Outlook 2003 clients can still download the OABs from the public folder by using MAPI and RPCs directly or, in the case of external Outlook 2003 clients, by using MAPI and RPCs through Outlook Anywhere (formerly known as RPC over HTTPS).

 

Figure 19. OAB download scenarios

Enabling Cross-Forest Availability Lookups

Another important business requirement that Microsoft IT needed to address in the Client Access architecture concerned the integration of free/busy and availability information across multiple forests to facilitate meeting management for all Microsoft employees. As mentioned earlier in this white paper, Microsoft IT maintains several forests with Exchange Server organizations. Some of these forests serve the purpose of compatibility testing with previous product versions, whereas others run pre-release software. Accordingly, Microsoft IT had to provide for seamless integration and backward compatibility in the cross-forest availability architecture.

Three components need to work together for backward-compatible, seamless availability integration. First, it is necessary to synchronize address lists across all forests, which Microsoft IT accomplishes by using Microsoft Identity Integration Server 2003, as mentioned earlier in this white paper. Second, if Outlook 2003 or earlier clients are used, it is also necessary to synchronize free/busy items between messaging environments by using the Microsoft Exchange Server Inter-Organization Replication tool so that users with legacy clients can see the full set of availability information for all Microsoft users. To remain backward-compatible, Microsoft IT maintains free/busy public folders in all relevant Exchange Server 2007 organizations. Outlook 2007 and Office Outlook Web Access 2007 still use these public folders to publish free/busy information for down-level clients. The third component that is important for cross-forest availability integration is the Client Access server, specifically the Availability API, which Outlook 2007 clients and Outlook Web Access 2007 use to obtain availability information for all Microsoft users.

Note: The Exchange Server 2007 Availability service is a Web service on Client Access servers that provides Outlook 2007 clients with access to the Availability API. Office Outlook Web Access uses the Availability API directly and does not use the Availability service.

Figure 20 shows the cross-forest availability architecture that Microsoft IT established in the corporate production environment. Although clients using Outlook 2003 or clients with mailboxes on Exchange Server 2003 continue to work with free/busy items in public folders as usual, Client Access servers communicate differently to process availability requests from users with mailboxes on Exchange Server 2007 who work with Outlook 2007 or Outlook Web Access. If the target mailbox is in the local Active Directory site, Client Access servers use MAPI to access the calendar information directly in the target mailbox. If the target mailbox is in a remote Active Directory site, the Client Access server proxies the request via HTTP to a Client Access server in the target mailbox's local site, which in turn accesses the calendar in the user's mailbox. The same communication mechanism applies if the target mailbox is in a remote forest with Client Access servers. If the target mailbox is on Exchange Server 2003, Client Access servers must revert to accessing free/busy items in public folders via the /public/* virtual directory.
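The lookup rules in the preceding paragraph amount to a three-way decision, sketched below in Python. The class `TargetMailbox`, the function `availability_path`, and the returned labels are hypothetical names used only to restate those rules; the sketch covers targets reachable through Client Access servers and does not model forests without them.

```python
from dataclasses import dataclass


@dataclass
class TargetMailbox:
    exchange_version: int  # 2003 or 2007
    site: str              # Active Directory site (local or remote forest)


def availability_path(target: TargetMailbox, local_site: str) -> str:
    """Return how a Client Access server obtains availability data for the target mailbox."""
    # Mailboxes on Exchange Server 2003 only expose free/busy items in public folders.
    if target.exchange_version == 2003:
        return "public-folder"
    # Local Exchange Server 2007 mailboxes: read the calendar directly by using MAPI.
    if target.site == local_site:
        return "mapi"
    # Remote site or remote forest: proxy via HTTP to a Client Access server there.
    return "cas-proxy"


if __name__ == "__main__":
    print(availability_path(TargetMailbox(2007, "Redmond"), "Redmond"))
    print(availability_path(TargetMailbox(2007, "Dublin"), "Redmond"))
    print(availability_path(TargetMailbox(2003, "Redmond"), "Redmond"))
```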

Figure 20. Cross-forest availability architecture

To enable cross-forest availability lookups in Exchange Server 2007, Microsoft IT implemented the following configuration:

  • Trusted forests   By using the Add-AvailabilityAddressSpace cmdlet, Microsoft IT specifies the per-user access method (-AccessMethod PerUserFB) to create an Availability Space that supports the most detailed availability information. Subsequently, Microsoft IT grants all Client Access server accounts the necessary ms-Exch-EPI-Token-Serialization extended right through the Add-ADPermission cmdlet.
  • Non-trusted forests and forests without Client Access servers   By using the Add-AvailabilityAddressSpace cmdlet, Microsoft IT specifies the public-folder access method (-AccessMethod PublicFolder) to continue using free/busy items in this Availability Space.

Note: Client Access servers can use an organization-wide access method (-AccessMethod OrgWideFB) for communication with remote Client Access servers in non-trusted forests. However, Microsoft IT does not use this access method in order to avoid the need for maintaining special system account credentials in each forest. For more information about cross-forest availability lookups, see http://technet.microsoft.com/en-us/library/bb125182.aspx.

Unified Messaging

Microsoft IT has provided users with unified messaging capabilities for many years, starting with Exchange 2000 Server. The Microsoft voice-mail infrastructure before Exchange Server 2007 relied on various Private Branch eXchange (PBX) telephony devices and a third-party unified messaging product that interfaced with Exchange servers to deliver voice mail to users' inboxes. The third-party product required a direct physical connection to the PBX device for each location that provided unified messaging services. This requirement meant Microsoft IT placed third-party Unified Messaging servers in the same physical location as the PBX. The Exchange Server 2007 Unified Messaging server role provided Microsoft IT with the opportunity to redesign the environment and prepare the infrastructure for VoIP telephony across the entire company.

For the rollout of unified messaging services to multiple locations worldwide, Microsoft IT defined the following requirements:

  • Provide high-quality, next-generation VoIP services with clear voice-mail playback and voice conversations at all locations.
  • Roll out unified messaging services to all feasible locations.
  • Increase security through encrypted VoIP communication.
  • Replace third-party unified messaging implementation with a native Exchange Server 2007 implementation.
  • Provide Active Directory-based management of unified messaging users and devices.
  • Increase redundancy and fault tolerance of unified messaging services and devices.
  • Ensure smooth migration between third-party unified messaging systems and Exchange Server 2007-based Unified Messaging.
  • Reduce administrative overhead by educating and enabling users for self-service.

Unified Messaging Topology

Microsoft IT faced the decision either to replace the third-party Unified Messaging servers with Exchange 2007 Unified Messaging servers at each location, or to consolidate the locations of Unified Messaging servers to the four regional data centers. To maintain a centralized environment, Microsoft IT chose to deploy Unified Messaging servers in the four regional data centers that already housed Mailbox and Hub Transport servers, which provided the following benefits:

  • Reduced server footprint and costs   Unified Messaging servers must reside in the same Active Directory sites as Hub Transport and Mailbox servers. Deploying Unified Messaging servers in a decentralized way would require a decentralization of the entire messaging environment, with associated higher server footprint and operational costs.
  • Optimal preparation for future communication needs   With Unified Messaging servers centralized in four locations, it is straightforward to deploy advancements in VoIP technology. It is also easier to integrate and maintain centralized Unified Messaging servers in a global unified communications infrastructure.

Figure 21 illustrates the Microsoft IT Unified Messaging server deployment in the corporate production environment.

Figure 21. Unified Messaging server topology

Having chosen to deploy Unified Messaging servers in the four regional data centers, Microsoft IT needed to ensure high voice quality for all unified messaging users. To that end, Microsoft IT distributed Unified Messaging servers according to the number of users each region supports. For example, for Australia and Asia, Microsoft IT determined that two Unified Messaging servers provide adequate capacity for all the locations enabled for services. When assigning server partners for VoIP gateways in a specific location, Microsoft IT picks the site with the least latency.

The connectivity requirements to PBXs at Microsoft locations vary according to the call load. Microsoft IT deployed these connections years ago as part of a voice-mail solution. There are existing PBXs with T1 Primary Rate Interface (PRI) or Basic Rate Interface (BRI) trunks grouped logically as a digital set emulation group. The T1 trunks can use channel associated signaling (CAS) where signaling data is on each channel (24 channels for T1), or Q.SIG where there are 23 channels and a dedicated channel for signaling.
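The difference between the two T1 signaling options affects how many simultaneous calls a trunk can carry. The following Python sketch restates the channel counts given above; the function name `voice_channels` and the signaling labels are assumptions chosen for illustration.

```python
T1_CHANNELS = 24  # total channels on a T1 trunk


def voice_channels(signaling: str, t1_trunks: int = 1) -> int:
    """Return the number of channels available for calls on the given T1 trunks."""
    if signaling == "CAS":
        # Channel associated signaling: signaling data is carried on each channel.
        per_trunk = T1_CHANNELS
    elif signaling == "Q.SIG":
        # Q.SIG: 23 channels for calls plus one dedicated signaling channel.
        per_trunk = T1_CHANNELS - 1
    else:
        raise ValueError(f"unsupported signaling type: {signaling}")
    return per_trunk * t1_trunks


if __name__ == "__main__":
    print(voice_channels("CAS", 2), voice_channels("Q.SIG", 2))
```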

The VoIP gateway decisions depend on the type of telephony connection. VoIP gateways support specific signaling types and trunk sizes. Microsoft IT considers the signaling types and the size of the trunk, and then ensures that the combination meets the user load. From monitoring performance, Microsoft IT concluded that the existing connectivity more than met call load and did not require expansion.

Unified Messaging Redundancy and Load Balancing

With Exchange Server 2007, Microsoft IT sought to increase the redundancy and load-balancing capabilities of the unified messaging environment. As shown in Figure 22, there were several opportunities to design flexibility and scalability into each location and into the overall infrastructure. Microsoft IT took advantage of these opportunities and built redundancy and fault tolerance into the connectivity, VoIP gateways, and Unified Messaging servers.

Figure 22. Unified messaging redundancy configuration

Microsoft IT decided that the minimum level of scalability and flexibility in the unified messaging environment required at least two VoIP gateways communicating with at least two Unified Messaging server partners. Microsoft IT based this decision on a few considerations. First, using two VoIP gateways and two Unified Messaging servers ensures that if one telephony link or network link fails at any given time, users can still receive unified messaging services. Second, if one VoIP gateway fails, requires configuration changes, or requires updated firmware, Microsoft IT can temporarily switch all traffic to the other gateway. Third, two or more Unified Messaging servers ensure that in case one server fails, the other server can take over. Microsoft IT considered redundancy for PBXs and, based on previous experience, decided those PBXs provided stable service with built-in redundancy through multiple telephony interface cards and multiple incoming telephony links to the telephone company.
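The minimum redundancy rule stated above (at least two VoIP gateways, each communicating with at least two Unified Messaging server partners) can be expressed as a simple check. This Python sketch is illustrative only; the function name `meets_minimum_redundancy` and the gateway and server names in the example are hypothetical.

```python
def meets_minimum_redundancy(gateway_partners: dict[str, set[str]]) -> bool:
    """Check a location's design: maps each VoIP gateway to its Unified Messaging server partners."""
    # At least two gateways, so one can fail or be taken down for firmware updates.
    if len(gateway_partners) < 2:
        return False
    # Each gateway needs at least two Unified Messaging server partners for failover.
    return all(len(servers) >= 2 for servers in gateway_partners.values())


if __name__ == "__main__":
    location = {
        "gateway-1": {"um-server-1", "um-server-2"},
        "gateway-2": {"um-server-1", "um-server-2"},
    }
    print(meets_minimum_redundancy(location))
```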

After deciding to use a minimum of two VoIP gateway devices with two Unified Messaging servers as communication partners, Microsoft IT considered the type and capacity of VoIP gateway to use. Microsoft IT applied several technical and business requirements when selecting VoIP gateways for each location, as follows:

  • Connectivity type   For Microsoft IT, the connectivity type came down to two choices of digital connections: PRI T1 or BRI emulated as a digital set. Analog connections were eliminated immediately because of cost and scalability factors. For small sites, Microsoft IT uses BRI (PIMG80PBXDNI); for large sites, Microsoft IT uses either TIMG300DT or TIMG600DT. Whereas TIMG300DT supports a single T1 for each device, TIMG600DT supports dual T1s. Microsoft IT varied the number of T1s depending on usage, employing dual T1s in Redmond and single T1s in the Silicon Valley location. Other sites used BRI trunks emulated as a digital set, with either eight or 16 lines per gateway, depending on user load.
  • Simplified Message Desk Interface (SMDI)/signaling integration   Intel gateways provide a standard, supported SMDI integration, which was a deciding factor for Microsoft IT. To accomplish SMDI integration with Intel gateways, Microsoft IT connected multiple gateways to the same SMDI link by using two primary gateways and multiple secondary gateways. By doing this, Microsoft IT can switch over from one primary gateway to the other, enabling gateway firmware updates with no service interruption.

Increased Unified Messaging Security

There are many security concerns associated with a unified messaging environment. For example, Session Initiation Protocol (SIP) proxy impersonation, network sniffing, session hijacking, and even unauthorized phone calls can compromise network security. Microsoft IT can choose from several methods to help secure the unified messaging environment, especially Unified Messaging servers and traffic between VoIP gateways and Unified Messaging servers:

  • Secure protocols   In the unified messaging environment, all traffic that uses SIP can use Mutual Transport Layer Security (MTLS). This includes the traffic between VoIP gateway devices and the Unified Messaging servers.
  • Trusted local area networks (LANs)   To prevent network sniffing and reduce overall security risks, Microsoft IT places VoIP gateways on a virtual LAN (VLAN) separate from the corporate production environment. As a result, only authorized individuals with physical access to the VoIP gateways can access this traffic. Moreover, Unified Messaging servers only communicate with gateways explicitly listed in the dial plan.
  • IPSec   The Microsoft corporate network uses IPSec for all IP communication within the network. However, the VoIP gateway to Unified Messaging server traffic is already encrypted by using MTLS. To ensure optimal performance, and according to product recommendations, Microsoft IT created an exception for VoIP gateway to Unified Messaging server traffic to bypass IPSec while maintaining security through MTLS.

In addition to these security measures, Microsoft IT enforces general security practices such as using strong authentication methods and strong passwords.

Unified Messaging Feature Considerations

Exchange Server 2007 Unified Messaging servers include various configuration options, such as dial plans, VoIP gateway communication partners, hunt groups, mailbox policies, and so on. Some configuration options represent default configurations or require inputting the necessary values, such as the IP addresses for VoIP gateways. For other options, such as dial plans and hunt groups, Microsoft IT considered the feature set necessary to meet business requirements and configured settings accordingly.

One consideration Microsoft IT faced when configuring settings on Unified Messaging servers involved dialing rules and mailbox policies. Dialing rules represent logical groupings of PBXs and specify details about sets of numbers and mailbox extensions, whereas mailbox policies enable Microsoft IT to apply a common set of policies or security settings such as personal identification number (PIN) details and dialing restrictions to a collection of unified-messaging-enabled mailboxes. Microsoft IT considered which phone calls to allow users to make: local only or local and long-distance calls. After considering the costs, Microsoft IT enabled calling for users in North America to anywhere in North America and restricted other sites to local-only calls.

User Education

Microsoft IT also addressed the needs of unified messaging users by providing comprehensive documentation for the new environment. Before enabling users for unified messaging services, Microsoft IT ran a pilot where a select group of users tested features and functionality to verify that the system performed as expected. In order to run the pilot and prepare the way for enabling all the users for a specific location, Microsoft IT considered the following:

  • Customized e-mail templates   Exchange Server 2007 sends out messages to users during the rollout process. The first message provides an initial PIN, and the second message notifies the user of rollout and migration completion. Exchange Server 2007 stores the e-mail templates in a configuration file. Microsoft IT customizes these e-mail templates to include intranet links to locations that provide further help and instructions for common tasks.
  • User documentation   Microsoft IT created an information and support repository for users that consists of help documents located on the corporate intranet. Microsoft IT maintains these documents to ensure that they provide up-to-date information for users.

Internet Mail Connectivity

Microsoft IT capitalizes on native Exchange Server 2007 anti-spam and antivirus capabilities to help protect the company's Internet mail connectivity points against spammers and attacks at the messaging layer. Specifically, Microsoft IT uses Exchange Server 2007 Edge Transport servers and Forefront Security for Exchange Server to help protect the corporate network from outside threats.

Microsoft IT defined the following goals for the Internet mail connectivity design:

  • Increase security by using built-in Exchange Server 2007 features in a perimeter network that is strictly separate from the corporate production environment.
  • Adopt flexible spam filtering methods through Edge Transport agents.
  • Optimize spam filtering to keep unwanted messages out while delivering legitimate messages.
  • Develop a fault-tolerant system that balances both incoming and outgoing message traffic.
  • Enable inbound and outbound scanning for viruses at multiple levels, including the perimeter network, to stop viruses at the earliest point possible.

Inbound and Outbound Message Transfer

In the Exchange Server 2003 environment, Microsoft IT used a total of six Internet mail gateway servers in Redmond and Silicon Valley as the main points of contact for inbound and outbound Internet message transfer and four additional outbound-only Internet mail gateway servers in Dublin and Singapore. Concentrating the incoming Internet message traffic through the six Internet mail gateway servers in Redmond and Silicon Valley enabled Microsoft IT to limit internal resource exposure, concentrate spam filtering, and centralize security administration. Maintaining four additional outbound-only Internet mail gateway servers in Dublin and Singapore eliminated the need to transfer messages to Internet recipients from these regions across the Microsoft WAN to an Internet mail gateway server in Redmond or Silicon Valley. To provide good performance and redundancy at each data center, Microsoft IT decided to use three Internet mail gateway servers in Redmond, three in Silicon Valley, two in Dublin, and two in Singapore. Because this design proved reliable and adequate, Microsoft IT retained this topology in the Exchange Server 2007 design as well.

Microsoft IT replaced the Internet mail gateway servers in a straightforward way with the same number of Edge Transport servers. Microsoft IT subscribed the Edge Transport servers in Redmond and Silicon Valley to the Active Directory site ADSITE_REDMOND-EXCHANGE. This causes the Hub Transport servers in North America to relay outbound messages through these Edge Transport servers. Similarly, Microsoft IT subscribed the Edge Transport servers in Dublin and Singapore to the Active Directory sites of ADSITE_DUBLIN and ADSITE_SINGAPORE so that Exchange Server 2007 routes outbound messages to the Internet primarily through the Edge Transport servers in each region. Figure 23 shows the Internet mail connectivity and topology with Exchange Server 2007.

Figure 23. Internet mail connectivity topology

Redundancy and Load Balancing

Because Microsoft IT must meet stringent performance and availability SLAs, balancing the traffic load and providing redundancy is a vital consideration for Internet mail connectivity. Internally, Microsoft IT uses multiple Hub Transport servers in the regions with Edge Transport servers. Within each region, all Hub Transport servers can transfer outbound messages to their local Edge Transport servers. In the opposite direction, Edge Transport servers can also choose any of the Hub Transport servers in the local region to transfer inbound messages.

Externally, for inbound message transfer from the Internet, Microsoft IT uses DNS round-robin and MX records with a preference value of 10. This method has proven to be effective for distributing traffic load; therefore, Microsoft IT continues to use it with all six Edge Transport servers in North America. Because all records have the same preference value, Internet hosts are free to select any one of these MX records at random. The host name in each MX record resolves to two A records, which point to servers that reside in separate data centers. In this way, even an SMTP client that constantly uses the same MX record can choose between two servers. This configuration of two servers for each record provides an added degree of load balancing.

The following name server lookup (nslookup) response shows the configuration of the public DNS zone for microsoft.com that Microsoft IT established for the Edge Transport servers in Redmond and Silicon Valley.

microsoft.com MX preference = 10, mail exchanger = maila.microsoft.com

microsoft.com MX preference = 10, mail exchanger = mailb.microsoft.com

microsoft.com MX preference = 10, mail exchanger = mailc.microsoft.com

 

maila.microsoft.com internet address = 131.107.115.212

maila.microsoft.com internet address = 205.248.106.64

mailb.microsoft.com internet address = 131.107.115.215

mailb.microsoft.com internet address = 205.248.106.30

mailc.microsoft.com internet address = 131.107.115.214

mailc.microsoft.com internet address = 205.248.106.32
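
The effect of equal MX preference values combined with two A records per host can be sketched in a few lines of Python. This is only an illustration of how a sending SMTP host might order its delivery attempts; the function name `delivery_candidates` is hypothetical, and the shuffling reflects common SMTP client behavior rather than the logic of any specific mail server.

```python
import random

# Records copied from the nslookup response above.
MX_RECORDS = [
    (10, "maila.microsoft.com"),
    (10, "mailb.microsoft.com"),
    (10, "mailc.microsoft.com"),
]
A_RECORDS = {
    "maila.microsoft.com": ["131.107.115.212", "205.248.106.64"],
    "mailb.microsoft.com": ["131.107.115.215", "205.248.106.30"],
    "mailc.microsoft.com": ["131.107.115.214", "205.248.106.32"],
}


def delivery_candidates(mx_records, a_records, rng=random):
    """Return IP addresses in the order a sending host might try them.

    Hosts with the lowest preference value come first. Hosts that share
    a preference value are shuffled, and so are the addresses of each
    host, which spreads connections across all servers.
    """
    candidates = []
    for preference in sorted({pref for pref, _ in mx_records}):
        hosts = [host for pref, host in mx_records if pref == preference]
        rng.shuffle(hosts)
        for host in hosts:
            addresses = list(a_records[host])
            rng.shuffle(addresses)
            candidates.extend(addresses)
    return candidates


if __name__ == "__main__":
    for address in delivery_candidates(MX_RECORDS, A_RECORDS):
        print(address)
```

Because every MX record carries the same preference value, each of the six addresses has a chance of being tried first, and a client that fails to reach one address still has five alternatives.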

Note: In addition to DNS host (A) and mail exchanger (MX) records for round-robin load balancing, Microsoft IT maintains Sender ID (Sender Policy Framework, or SPF) records. The Sender ID framework relies on SPF records to identify messaging hosts that are authorized to send messages for a specific SMTP domain, such as microsoft.com. Internet mail hosts that receive messages from microsoft.com can look up the SPF records for the domain to determine whether the sending host is authorized to send mail for Microsoft users. If the sending host is not one of the six Edge Transport servers in Redmond and Silicon Valley or one of the four outbound Edge Transport servers in Dublin and Singapore, the receiving Internet host can block the submission attempt or perform any other action specified by the Internet host's administrator, such as increasing the message's spam confidence level.
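
The check that a receiving host performs against an SPF record can be approximated as follows. This is a simplified sketch: it handles only `ip4:` mechanisms and the trailing `all` mechanism, the function name `check_spf_ip4` is hypothetical, and the record in the usage example uses documentation address ranges rather than the actual microsoft.com SPF data.

```python
import ipaddress


def check_spf_ip4(record, sender_ip):
    """Evaluate a simplified SPF record against a sending IP address.

    Only `ip4:` mechanisms and the `all` mechanism are handled. A full
    SPF evaluation also covers `include`, `a`, `mx`, `redirect`, and
    other terms that this sketch ignores.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    address = ipaddress.ip_address(sender_ip)
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in qualifiers:
            qualifier, term = term[0], term[1:]
        if term.lower().startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if address in network:
                return qualifiers[qualifier]
        elif term.lower() == "all":
            return qualifiers[qualifier]
    return "neutral"


if __name__ == "__main__":
    # Hypothetical record for illustration only.
    record = "v=spf1 ip4:192.0.2.0/28 ip4:198.51.100.25 -all"
    print(check_spf_ip4(record, "192.0.2.5"))
    print(check_spf_ip4(record, "203.0.113.9"))
```

A host that is listed in the record evaluates to a pass result, while any other sending host falls through to the `-all` term and fails, which is the point at which the receiving administrator's policy (block, raise the spam confidence level, and so on) applies.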

Increasing Perimeter Network Security

Microsoft IT took advantage of new features available with Edge Transport servers on the 64-bit platform to increase security in the perimeter network. For example, Edge Transport servers do not need to be part of the internal Active Directory environment, which enabled Microsoft IT to tighten access rules on the firewall between the perimeter network and the corporate production environment. Furthermore, the increased processing power available on the 64-bit platform enabled Microsoft IT to perform virus scanning directly on the Edge Transport servers to stop viruses before they enter the corporate production environment. This is a significant improvement over the previous environment, where, for performance reasons, Microsoft IT could perform virus scanning only in the production environment on dedicated hub servers running Exchange Server 2003, not on the Internet mail gateway servers.

Figure 24 illustrates the deployment of Edge Transport servers in regional perimeter networks. Microsoft IT separates the perimeter network through inner and outer firewalls.

Figure 24. Edge Transport server security

Edge Transport servers must communicate with Internet hosts as well as the internal Exchange organization. To accomplish this, Microsoft IT enabled traffic by opening the following ports for specific services:

  • Inbound SMTP TCP port 25   The only inbound port opened on both the internal and external firewall was port 25 for SMTP. Microsoft IT required no other ports, because Edge Transport servers have the dedicated role of accepting Internet mail, processing it, and forwarding it to a Hub Transport server.
  • Outbound SMTP TCP port 25   For outbound SMTP traffic, Microsoft IT opens outgoing traffic on port 25 on both firewalls. This enables Hub Transport servers to route mail traffic to Edge Transport servers and enables Edge Transport servers to relay messages to Internet hosts.
  • Antivirus updates TCP port 80   Microsoft IT opens port 80 on the firewall between the perimeter network and Internet hosts to enable antivirus definition updates.
  • DNS TCP/User Datagram Protocol (UDP) port 53   Edge Transport servers use DNS records for MX host resolution and to check SPF records for Sender ID. Accordingly, Microsoft IT enables port 53 on the firewall between the perimeter network and Internet hosts.
  • Terminal Services TCP port 3389   To perform maintenance and operations tasks, Microsoft IT opens port 3389 for Terminal Services on the inner firewalls.
  • EdgeSync TCP port 50636   Although only port 25 is necessary to support message transfer from Edge Transport servers to Hub Transport servers via SMTP, Edge Transport servers also need access to Active Directory data through EdgeSync. Accordingly, Microsoft IT opens port 50636 for the Exchange EdgeSync service. It is important to note that this data flow follows a push model from the internal network to the perimeter network. Edge Transport servers never need to actively connect to the corporate network and pull the directory information.
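
The port list above can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `OUTER`, `INNER`, `ALLOWED`, and `is_allowed` are hypothetical, traffic direction is not modeled, and placing EdgeSync on the inner firewall is an assumption inferred from the internal-to-perimeter data flow described above.

```python
OUTER = "outer firewall (Internet / perimeter network)"
INNER = "inner firewall (perimeter network / corporate network)"

# (protocol, port) pairs opened on each firewall, per the list above.
ALLOWED = {
    OUTER: {
        ("tcp", 25): "SMTP",
        ("tcp", 80): "Antivirus updates",
        ("tcp", 53): "DNS",
        ("udp", 53): "DNS",
    },
    INNER: {
        ("tcp", 25): "SMTP",
        ("tcp", 3389): "Terminal Services",
        # Assumption: EdgeSync crosses the inner firewall only.
        ("tcp", 50636): "EdgeSync",
    },
}


def is_allowed(firewall, protocol, port):
    """Return True if the protocol/port pair is opened on the firewall."""
    return (protocol.lower(), port) in ALLOWED[firewall]


if __name__ == "__main__":
    print(is_allowed(OUTER, "tcp", 25))
    print(is_allowed(OUTER, "tcp", 3389))
```

Expressing the rules as data makes the asymmetry visible: SMTP is the only service that crosses both firewalls, while management and directory synchronization traffic never reaches the Internet-facing side.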

Note: Although it is not necessary to deploy Edge Transport servers in an Active Directory environment, Microsoft IT chose to deploy all Edge Transport servers on computers that are members of the Extranet Forest. The Extranet Forest is separate from the Corporate Forest. This form of deployment helps Microsoft IT maintain a consistent management framework, apply a common set of policies to all Edge Transport servers, and support single sign-on for server administrators.

Server Hardening

In addition to securing firewalls on both sides of the perimeter network, Microsoft IT took into consideration the security configuration for Edge Transport servers. This process of server hardening took into account the following components:

  • Ports   Following the classic concept of a perimeter network, Edge Transport servers use separate NICs, with one card facing Internet hosts and the other card facing Hub Transport servers. The Internet-facing card has all unnecessary protocols and services, such as NetBIOS over TCP/IP and File and Printer Sharing, disabled and accepts traffic only on port 25.
  • Services   Microsoft IT uses the Security Configuration Wizard to identify services that are not required on Edge Transport servers and disables all of them, such as Internet Information Services (IIS).
  • File shares   Microsoft IT removed the Everyone group from all shared folders. Every share must be secured with security groups that contain only the users who need access to that share. Microsoft IT does not apply open security groups, such as Authenticated Users, Domain Users, or Everyone, to shares.
  • Accounts   After obtaining the proper security groups developed during the permissions and administration model design for the environment, Microsoft IT adds security groups as members of the local administrators group on the Exchange Server and verifies that the computer is in the correct organizational unit. Microsoft IT also enforces passwords of 15 characters or more to meet strong password requirements. Microsoft IT requires smart-card authentication to administer servers in the Extranet Forest through Terminal Services.
  • Security updates   Microsoft IT monitors the installation of security updates and security configurations on server platforms by using Microsoft Baseline Security Analyzer (MBSA). Edge Transport servers must have all current security updates to help ensure security.

Optimizing Spam and Virus Scanning

In terms of design for transport agents on Edge Transport and Hub Transport servers, Microsoft IT considered which settings required customization to meet performance, security, and reliability demands. Many agents work with the default settings and do not require additional configuration. Other agents need varying levels of customization, such as:

  • Connection-filtering configuration   Settings that relate to the connection-filtering configuration include IP block-list and IP allow-list providers, and Sender Reputation Level (SRL). For block-list provider settings, Microsoft IT uses the information provided from third-party block-list vendors. For the SRL, Microsoft IT uses the default configuration to perform an open proxy test when determining the sender confidence level, blocks senders with an SRL of seven, and adds these senders to the IP block list for the duration of 24 hours. In addition to using block-list providers and locally maintained block lists, Microsoft IT keeps up to date through automatic IP reputation updates from Microsoft Research.
  • Recipient-filtering configuration   For recipient filtering, Microsoft IT considered the following aspects: blocking messages to invalid recipients, blocking messages to mailboxes and global distribution lists that are for internal use only, blocking messages sent to recipients not listed in the recipient list, and blocking specified recipients in the properties of the Recipient Filtering Agent.
  • Content-filtering configuration   Microsoft IT configured a spam confidence level (SCL) store threshold of five on Mailbox servers and an SCL reject messages threshold of seven on Edge Transport servers for the Content Filter Agent. These values came from testing and previous experience running Exchange servers. Microsoft IT did not configure an SCL threshold value to delete messages or quarantine messages.
  • Attachment-filtering configuration   Microsoft IT uses Forefront Security for Exchange Server file filtering to remove critical attachments that correspond to Level 1 file types blocked by default in Outlook 2007. Forefront Security for Exchange Server filters attachments based on Multipurpose Internet Mail Extensions (MIME) type, files within a container file, files based on extension, and critical files with an arbitrary extension.
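
The two SCL thresholds from the content-filtering configuration above can be combined into a single decision, sketched here in Python. The function name `disposition` is hypothetical. The sketch assumes that a message is acted on when its SCL meets or exceeds a threshold and that the store threshold results in delivery to the Junk E-mail folder; the exact comparison used by the product may differ.

```python
EDGE_REJECT_THRESHOLD = 7    # SCL reject threshold on Edge Transport servers
MAILBOX_STORE_THRESHOLD = 5  # SCL store threshold on Mailbox servers


def disposition(scl):
    """Map a spam confidence level (-1 through 9) to a handling decision."""
    if not -1 <= scl <= 9:
        raise ValueError("SCL must be between -1 and 9")
    if scl >= EDGE_REJECT_THRESHOLD:
        # Assumed comparison: reject when the SCL meets the threshold.
        return "reject at Edge Transport server"
    if scl >= MAILBOX_STORE_THRESHOLD:
        # Assumed comparison: junk when the SCL meets the threshold.
        return "deliver to Junk E-mail folder"
    return "deliver to Inbox"


if __name__ == "__main__":
    for level in (2, 5, 6, 7, 9):
        print(level, disposition(level))
```

With no delete or quarantine threshold configured, every message ends up in one of these three outcomes, which keeps borderline messages visible to users instead of discarding them silently.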

Microsoft IT installs other agents enabled with default settings, except for the inbound and outbound address-rewriting agents. The address-rewriting agents on Edge Transport servers modify sender and recipient SMTP addresses based on predefined address alias information. Because this is not a consideration for the Exchange Server environment at Microsoft, Microsoft IT disables these agents.

Optimizing Outbound Message Transfer

Edge Transport servers communicate with Internet hosts as well as the internal Exchange Server environment through SMTP send and receive connectors. These connectors offer built-in protection, including a header firewall, connection tarpitting, and SMTP backpressure. Microsoft IT uses two receive connectors and four send connectors on all Edge Transport servers for all mail transfer:

  • Inbound mail   For inbound mail from the Internet to the internal Exchange Server organization, Microsoft IT configured one receive connector that faces the Internet and accepts SMTP messages, and one send connector that transfers these inbound messages to Hub Transport servers for further routing.
  • Outbound mail   For outbound mail destined for Internet hosts, Microsoft IT configures one receive connector that faces Hub Transport servers and three send connectors for relaying outbound messages to Internet hosts. Edge Transport servers use the three send connectors in different scenarios. The first send connector (the TLS connector) is dedicated to encrypted communication with remote domains that support TLS. Microsoft IT configures the second connector (the HELO connector) to communicate with destinations that do not support Extended SMTP. The third send connector is a general Internet connector that Edge Transport servers use for all destinations that do not match the address space definitions of the TLS and HELO connectors. For more information about the SMTP connectors, see the "Microsoft Exchange Server 2007 Edge Transport and Messaging Protection" white paper at http://technet.microsoft.com/en-us/library/bb735142.aspx.
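
The way the three outbound send connectors divide the address space can be illustrated with a short Python sketch. The domain names and the function name `select_send_connector` are hypothetical, and the exact-match lookup is a simplification of how address spaces are matched.

```python
# Hypothetical address spaces for illustration only.
TLS_DOMAINS = {"partner.example"}   # remote domains that support TLS
HELO_DOMAINS = {"legacy.example"}   # destinations without Extended SMTP


def select_send_connector(recipient):
    """Pick the outbound send connector for a recipient address."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain in TLS_DOMAINS:
        return "TLS connector"
    if domain in HELO_DOMAINS:
        return "HELO connector"
    # Everything else uses the general Internet connector.
    return "general Internet connector"


if __name__ == "__main__":
    for address in ("a@partner.example", "b@legacy.example", "c@other.example"):
        print(address, "->", select_send_connector(address))
```

The general connector acts as the catch-all, so only destinations with special requirements need an explicit entry on the TLS or HELO connector.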

Deployment Planning

Deployment planning is a critical element of the Microsoft IT planning and design process. It addresses the question of how to implement the new messaging environment with minimal interference on existing business processes and provides all members of the Exchange Messaging team with a clear understanding of when to perform the required deployment steps.

The high-level deployment plan the Messaging Engineering team recommended for the transition to Exchange Server 2007 in the corporate production environment included the following phases:

  1. Introduce Exchange Server 2007 into the corporate production environment.
  2. Verify the successful integration of Exchange Server 2007.
  3. Fully deploy Client Access servers in North America.
  4. Fully deploy Hub Transport servers in North America.
  5. Deploy Mailbox servers in North America.
  6. Introduce Edge Transport servers in North America.
  7. Deploy Forefront Security for Exchange Server 2007.
  8. Deploy Exchange Server 2007 in regional data centers.
  9. Switch the messaging backbone to Exchange Server 2007.
  10. Complete the transition to Exchange Server 2007.

Note: In addition to business and technical requirements, the Messaging Engineering team had to consider several unique software issues because the designs were based on beta versions of Exchange Server 2007. Important features, such as OAB generation on Mailbox servers in CCR configuration, and software products, such as Forefront Security for Exchange Server, were not yet ready for deployment at the time Microsoft IT started the production rollout. The Microsoft IT deployment plans reflect these dependencies, which became obsolete with the release of Exchange Server 2007.

Introducing Exchange Server 2007 into the Corporate Production Environment

Microsoft IT introduced Exchange Server 2007 into the Corporate Forest in May 2006. This work included the preparation of Active Directory, the implementation of the administrative model, and the installation of the first Hub Transport and Client Access servers, as required to integrate Exchange Server 2007 with an existing Exchange Server 2003 environment. Microsoft IT configured the initial routing group connector between the routing group RG_REDMOND and EXCHANGE ROUTING GROUP (DWBGZMFD01QNBJR) because Microsoft IT focused the initial deployment activities on the data center in North America that corresponded to the routing group RG_REDMOND.

Verifying the Successful Integration of Exchange Server 2007

This initial phase also included the installation of a first Mailbox server for approximately 250 production mailboxes. Moving 250 power users to Exchange Server 2007 enabled Microsoft IT to verify the successful integration of Exchange Server 2007 into the existing messaging environment.

Fully Deploying Client Access Servers in North America

With the availability of Exchange Server 2007 Beta 2, Microsoft IT began the deployment of Client Access servers at full scale in North America. This included building and testing the Client Access server farm and switching the redmond.microsoft.com URL namespace from the Exchange 2003 front-end servers to the Exchange 2007 Client Access servers. Switching to Client Access servers early on was necessary to preserve the existing namespaces, because Exchange 2003 front-end servers do not support access to Mailbox servers running Exchange Server 2007.

Figure 25 illustrates how Client Access servers support users with mailboxes on Exchange Server 2003 and Exchange Server 2007. Users with mailboxes on Exchange Server 2003 simply continue to use the legacy virtual directories that Client Access servers provide for backward compatibility. The Microsoft Office Outlook Web Access 2003 proxy component in Exchange Server 2007 proxies the user requests to the Exchange 2003 back-end server where the mailbox resides. The back-end server renders the HTML response as usual and passes it back to the client via the Client Access server. In this scenario, the Client Access server behaves exactly like an Exchange 2003 front-end server.

Figure 25. Office Outlook Web Access coexistence

If users with mailboxes on Exchange Server 2007 access the legacy virtual directories (that is, /public/* and /exchange/*), Exchange Server 2007 redirects the user to the /owa/* virtual directory, which provides access to Outlook Web Access. It is important to note, however, that Outlook Web Access 2007 did not yet support public folders (support for public folders was added in Exchange Server 2007 SP1). A pre-SP1 Mailbox server could only redirect public-folder requests to an Exchange 2003 back-end server. For this reason, Microsoft IT decided to retain public-folder servers running Exchange Server 2003 in the messaging environment until Exchange Server 2007 SP1 was available.

Note: Microsoft IT maintains multiple public-folder servers running Exchange Server 2007 in all four locations with Mailbox servers for redundancy. In addition to these public-folder servers supporting Outlook Web Access and legacy clients, such as Office Outlook 2003, Microsoft IT maintains two public-folder servers running Exchange Server 2007 in Redmond. These two Exchange 2007 servers host all non-system public folders.

Fully Deploying Hub Transport Servers in North America

Prior to deploying Mailbox servers in large numbers, Microsoft IT deployed Hub Transport servers to provide the necessary redundancy and scalability for the message routing functions within the Exchange Server 2007 environment. All messages sent between Mailbox servers must pass through a Hub Transport server. The work also included creating SMTP send and SMTP receive connectors to Hub Transport servers in other forests and changes to the routing group connector configuration for messaging connectivity with Exchange Server 2003. Microsoft IT specified all Exchange Server 2003 computers as bridgeheads so that messages from Exchange Server 2003 to Exchange Server 2007 would not have to pass through the legacy hub servers. This "all to many" relationship corresponds to the configuration that Microsoft IT also uses in all other routing group connectors among Exchange Server 2003 routing groups.

Deploying Mailbox Servers in North America

With the Client Access server and Hub Transport server topologies in place, Microsoft IT began to deploy additional Mailbox servers to move more and more mailboxes to Exchange Server 2007. The plan was to increase the number of mailboxes in the Exchange Server 2007 environment to 16,000 mailboxes before approaching other deployment tasks, such as deploying Edge Transport servers. As Microsoft IT moved approximately 1,000 mailboxes every day, the load on Hub Transport servers continuously increased. Because the Hub Transport servers ran beta software, Microsoft IT monitored server response times and workload very closely during this time frame.
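
The pace described above implies a rough duration for this phase, which the following Python sketch works out. The function name `days_to_target` is hypothetical, and the assumption that the roughly 250 pilot mailboxes already count toward the 16,000 target is made here only for illustration.

```python
import math


def days_to_target(target, already_moved, per_day):
    """Return the number of whole days needed to reach the target."""
    remaining = max(target - already_moved, 0)
    return math.ceil(remaining / per_day)


if __name__ == "__main__":
    # 16,000 target, about 250 pilot mailboxes, about 1,000 moves per day.
    print(days_to_target(16000, 250, 1000))
```

Under these assumptions, the phase takes 16 days of mailbox moves, which gives a sense of how long the Hub Transport servers ran under steadily increasing load while being monitored.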

Introducing Edge Transport Servers in North America

With the confidence that Hub Transport servers operated reliably, Microsoft IT prepared to decommission the legacy hub servers in order to move more and more of the messaging traffic onto Exchange Server 2007 and simplify the routing topology. This also included introducing the first Edge Transport servers into the Redmond perimeter network to handle inbound and outbound Internet traffic. Up to this point, however, Microsoft IT had not deployed Forefront Security for Exchange Server for virus scanning on Hub Transport servers, because the software was not yet ready for production deployment. Accordingly, Microsoft IT routed all incoming and outgoing messages through the dedicated hub servers running Exchange Server 2003 and Microsoft Antigen for Exchange. This routing required manual changes to the connector configuration between Hub Transport servers and Edge Transport servers. Nevertheless, deploying Edge Transport servers provided Microsoft IT with the opportunity to get practical experience with important Edge Transport features, such as EdgeSync, and to test the reliability of this server role for production use.

Deploying Forefront Security for Exchange Server 2007

Having virus-scanning capabilities on Hub Transport servers was crucial for Microsoft IT before introducing any further changes to the routing topology. Accordingly, Microsoft IT began to test Forefront Security for Exchange Server as soon as the software was available in a stable version. The criteria that Microsoft IT defined for the deployment in the corporate production environment included the following points:

  • The antivirus solution works reliably (that is, without excessive failures) and with acceptable throughput according to the expected message volumes.
  • The software is able to find known viruses in various message encodings, formats, and attachments.
  • If the software fails, message transport must halt so that messages do not pass the Hub Transport servers without scanning.

Having completed these tests successfully, Microsoft IT was able to deploy Forefront Security and change the routing configuration. The Edge Transport servers and legacy hub servers running Exchange Server 2003 in the perimeter network now routed incoming messages through Hub Transport servers. The dedicated hub servers running Microsoft Antigen for virus scanning were no longer involved in message transfer. Microsoft IT decommissioned these servers at this stage.
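
The third deployment criterion above, halting message transport when scanning fails, is a fail-closed design. The following Python sketch illustrates the idea only; the names `ScanFailure` and `process_queue` are hypothetical and do not represent the Forefront Security or Exchange Server transport implementation.

```python
class ScanFailure(Exception):
    """Raised when the (hypothetical) antivirus engine cannot scan."""


def process_queue(messages, scan):
    """Deliver clean messages and stop at the first scanner failure.

    `scan` returns True for an infected message and raises ScanFailure
    if the engine is unavailable. Returns (delivered, held), where
    `held` contains every message that has not been scanned.
    """
    delivered = []
    for index, message in enumerate(messages):
        try:
            infected = scan(message)
        except ScanFailure:
            # Fail closed: nothing passes without a completed scan.
            return delivered, list(messages[index:])
        if not infected:
            delivered.append(message)
    return delivered, []


if __name__ == "__main__":
    def demo_scan(message):
        if message == "broken":
            raise ScanFailure("engine unavailable")
        return message == "virus"

    print(process_queue(["a", "virus", "b", "broken", "c"], demo_scan))
```

The held messages stay queued until scanning works again, which trades temporary delivery delay for the guarantee that no unscanned message reaches a mailbox.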

Deploying Exchange Server 2007 in Regional Data Centers

With the core of the message routing functions and virus scanning now reliably performed by Hub Transport servers in North America, Microsoft IT was ready to approach the deployment of Hub Transport servers, Client Access servers, and Mailbox servers in the regional data centers. In general, the deployment processes followed the approach Microsoft IT had so successfully used in North America. Microsoft IT established routing group connectors between the Hub Transport servers in each region and the local routing groups, switched the regional URL namespaces to Client Access servers, began to move mailboxes to Exchange Server 2007, and transitioned the outbound Internet mail gateway servers in the regional data centers to Edge Transport servers.

Switching the Messaging Backbone to Exchange Server 2007

The final switch to Exchange Server 2007 in the messaging backbone did not require many further changes. One important step was to optimize the message routing topology by using Exchange Server-specific Active Directory site connectors in order to implement a hub/spoke topology along the physical network links. For details regarding message routing optimization, see the section "Message Routing Topology" under "Architecture and Design Decisions" earlier in this white paper.

Another important step was to install the remaining Edge Transport servers for inbound and outbound Internet traffic in the Redmond and Silicon Valley perimeter networks. Furthermore, with the completion of the Forefront Security deployment, Microsoft IT was able to decommission the Exchange Server 2003-based gateway and hub servers in the perimeter networks and the corporate production environment.

Completing the Transition to Exchange Server 2007

At this point, the Exchange Server 2007 environment no longer depended on the legacy topology for message routing. The most important remaining tasks involved completing the Mailbox server deployment and establishing the UM environment. Corresponding to the commitment to finish the production rollout with the release of Exchange Server 2007 to manufacturing, Microsoft IT moved between 1,000 and 1,500 mailboxes to the new environment every day during September through November 2006 and finished the deployment within the same week the product shipped.

Figure 26 shows the messaging environment at Microsoft after completion of the production rollout in December 2006.

Figure 26. The Microsoft messaging environment after transitioning to Exchange Server 2007 (December 7, 2006)

Best Practices

While designing a messaging environment with multiple worldwide data centers using both IP and telephony technologies, Microsoft IT employed design phases during which engineering teams analyzed the existing environment, considered the possible decisions, and arrived at a fitting design for the Exchange Server 2007-based messaging production environment. During the process of considering business needs and choosing the features to address business requirements, Microsoft IT developed the following best practices that supported a solid design and ensured a smooth transition to Exchange Server 2007.

Planning and Design Best Practices

Microsoft IT relied on the following best practices in its planning and design activities:

  • Clearly define goals   Exchange Server 2007 includes roles and configuration options that enable numerous topology and design scenarios. The mix of server roles, enabled options, and settings depends on the business needs and messaging goals of the organization.
  • Design with production in mind   To meet business requirements, Microsoft IT checks design considerations against practical real-world constraints that exist in the production environment. This helps to produce a smooth transition to the new environment after implementation of the design.
  • Design for peak load days   Microsoft IT uses the concept of peak load days, or snow days, to plan for the event when a large number of people use the messaging infrastructure from outside the corporate network. The messaging design takes into account the possibility of some days when the majority of users work from home or remotely.
  • Test in a lab environment   With the many options to meet business requirements, Microsoft IT validates the chosen designs in a test environment. This enables Microsoft IT to determine stability and finalize design plans before rolling out a planned infrastructure in the production environment.
  • Identify key risks   Microsoft IT follows sound project management practices as part of the MSF processes. These practices include identifying the risks associated with design decisions as well as overall system risks. By identifying risks early on, Microsoft IT can develop mitigation strategies to address the risks.
  • Develop rollback and mitigation procedures   It is important for Microsoft IT to have rollback procedures when designing the Exchange Server 2007-based messaging environment. Often, Microsoft IT accomplishes this through a period of coexistence where both Exchange Server 2003 and Exchange Server 2007 process mail. Microsoft IT later decommissions the Exchange Server 2003 server after verifying functionality of the new environment.

Server Design Best Practices

Microsoft IT relied on the following best practices in its server designs:

  • Use multiple-core processors and design storage based on both capacity and I/O performance   During testing, Microsoft IT determined that multi-core processors provide substantial performance benefits. However, processing power is not the only factor that determines Mailbox server performance. It is also important to design the storage subsystem according to I/O load and capacity requirements based on the desired number of users per Mailbox server.
  • Use VSS-based backup   VSS provides the ability to offload backup processes to the passive node. This enables backup operations at much more frequent intervals than streaming backups on active nodes.
  • Eliminate single points of failure   When designing an Exchange environment, it was vital for Microsoft to create redundancy at all points possible. Microsoft IT relied on multiple data centers, multi-homed NICs, redundant Hub Transport and Edge Transport servers, multiple VoIP gateways for unified messaging, multiple Client Access servers, and CCR on Mailbox servers.

Deployment Best Practices

Microsoft IT relied on the following best practices during the transition to Exchange Server 2007:

  • Establish flexible and scalable messaging infrastructure   Microsoft IT focused on single-role server deployments throughout the Exchange Server 2007 organization. Using dedicated server roles enabled Microsoft IT to fine-tune the server designs and place the necessary number of servers to handle the load at each location.
  • Carefully plan URL namespaces   At Microsoft, Client Access servers handle approximately 150,000 mobile user sessions per month. To distribute this load, Microsoft IT uses multiple URL namespaces, where a URL represents access points for clients in specific geographic regions. Microsoft IT chose to preserve these namespaces to provide a seamless transition for mobile users.
  • Manage permissions through security groups   Deploying Exchange Server 2007 in an existing organization does not affect any existing rights granted on Exchange Server 2003 resources through user accounts or security groups. This provides Microsoft IT with the opportunity to clean up legacy permission assignments and manage permissions through security groups.
  • Use fewest permissions necessary   Microsoft IT grants only necessary rights to administrators, opting to grant full Domain or Enterprise Admin rights only when there is a business or technical reason to do so. Even in these cases, the rights are given temporarily to accomplish a specific task. Doing so helps maintain network security.
  • Use Forefront and multiple layers of protection   Microsoft IT designed the messaging environment to provide many layers of messaging protection against viruses, spam, and other unwanted e-mail. Microsoft IT deploys Forefront Security for Exchange Server on Edge Transport and Hub Transport servers to enable bidirectional scanning of e-mail messages and enforce protection at multiple organization levels.
  • Place Edge Transport servers in a perimeter network   Microsoft IT helps protect the corporate production environment by using perimeter networks, removing unnecessary client receive connectors, and enabling transport encryption. Perimeter networks provide the capability to restrict access to the Edge Transport servers in terms of open ports, services, permissions, and users, and also to disable incoming mail flow easily in case of severe attacks.
  • Use ISA Server 2006 to publish Client Access servers   Microsoft IT capitalizes on the close integration of ISA Server 2006 and Exchange Server 2007 to provide security-enhanced access to the internal messaging environment over the Internet. Among other things, ISA Server 2006 supports stateful packet inspection and application-layer filtering. Microsoft IT also benefits from Web Publishing Load Balancing, which enabled Microsoft IT to establish a highly scalable load-balancing infrastructure for external messaging clients and avoid complicated load distribution and client session affinity issues.

Conclusion

With the completion of the production rollout at the RTM date of the product, Microsoft IT demonstrated the enterprise readiness of Exchange Server 2007. The Microsoft messaging environment hosts 130,000 mailboxes with 500-MB and 2-GB quotas in four data centers on 62 clustered Mailbox servers with 99.99 percent availability, achieving SLA targets. The corporate production environment also includes 26 Client Access servers, 15 Hub Transport servers, 10 Edge Transport servers, and 11 Unified Messaging servers.

With the transition to Exchange Server 2007, Microsoft IT was able to reduce operational costs, including costs for server hardware, storage, and backup. Among other things, Microsoft IT replaced third-party solutions (such as UM systems) with features that are directly available in Exchange Server 2007, replaced expensive SAN-based storage with a more cost-efficient DASD-based solution, and eliminated tape backups with associated cost savings of approximately $5 million per year.

Microsoft IT continues to achieve cost savings by scaling up Mailbox servers. Initially, Microsoft IT limited the scale of Mailbox servers to fewer than 3,600 mailboxes in order to support increased mailbox capacities while using a streaming backup solution with limited throughput. In preparation for deploying a new VSS-based solution that can perform backup operations on the passive node in CCR server clusters, Microsoft IT began to scale up to 6,000 users per Mailbox server. Based on the new server design, Microsoft IT was able to lower the overall number of Mailbox servers in the corporate production environment from 62 to fewer than 30 server clusters.
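
As a rough plausibility check on these figures, the following minimal sketch works through the arithmetic. It assumes, purely for illustration, that the 130,000 mailboxes are spread evenly and that each clustered Mailbox server counts as one unit; it ignores the mix of 500-MB and 2-GB quotas, regional placement, and growth headroom, all of which influence the real design. The function and constant names are hypothetical.

```python
import math

# Figures quoted in this paper.
TOTAL_MAILBOXES = 130_000
INITIAL_SERVER_COUNT = 62          # clustered Mailbox servers at rollout
INITIAL_LIMIT_PER_SERVER = 3_600   # upper bound with streaming backup
SCALED_LIMIT_PER_SERVER = 6_000    # target with VSS-based backup


def min_servers(mailboxes: int, per_server: int) -> int:
    """Smallest server count that fits all mailboxes (assumes even spread)."""
    return math.ceil(mailboxes / per_server)


# Average load in the initial deployment (illustrative, assumes even spread).
avg_initial_load = TOTAL_MAILBOXES / INITIAL_SERVER_COUNT

# Theoretical lower bounds on server count under each per-server limit.
floor_initial = min_servers(TOTAL_MAILBOXES, INITIAL_LIMIT_PER_SERVER)
floor_scaled = min_servers(TOTAL_MAILBOXES, SCALED_LIMIT_PER_SERVER)

print(f"Average mailboxes per server at rollout: {avg_initial_load:.0f}")
print(f"Minimum servers at 3,600 per server:     {floor_initial}")
print(f"Minimum servers at 6,000 per server:     {floor_scaled}")
```

Under these assumptions, the scaled-up limit yields a lower bound that sits below the "fewer than 30" figure reported above, which leaves room for uneven distribution across data centers and for growth.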

By increasing mailbox quotas to 500 MB and 2 GB and deploying new productivity features that are readily available in Exchange Server 2007, such as UM, Microsoft IT helps to increase the productivity of Microsoft employees. Users can store all messages on the server, including e-mail, voice mail, and fax messages, and access these messages from any suitable stationary or portable client, including standard telephones. Office Outlook 2007 also helps employees to increase productivity. By using Office Outlook 2007 as the primary messaging client in the corporate production environment, employees can benefit from new and advanced information management features, such as instant search, managed folders, and more.

Increasing user productivity also entails maintaining availability levels for messaging services according to business requirements and SLAs. To accomplish this goal, Microsoft IT heavily focuses on single-role server deployments with Exchange Server 2007. Dedicated Mailbox servers support CCR, which Microsoft IT uses to increase resilience from storage-level failures. Single-role server deployments also enable Microsoft IT to establish scalable and flexible middle-tier services that are capable of handling large spikes in messaging activity, such as during snow days in Redmond when the vast majority of Microsoft staff at headquarters prefers to work from home over mobile messaging connections.

Another important aspect is messaging protection and security. To achieve the highest security and protection levels while maintaining a flexible environment, Microsoft IT encrypts all server-to-server message traffic by using IPSec or TLS to help prevent spoofing and protect message confidentiality, and deploys Edge Transport servers in the perimeter networks. Edge Transport servers increase the security in the perimeter network and reduce the number of legitimate messages incorrectly identified as spam. For antivirus protection, Microsoft IT deployed Forefront Security for Exchange Server on all Edge Transport and Hub Transport servers.

Exchange Server 2007 enabled Microsoft IT to capitalize on 64-bit technologies and cost-efficient storage solutions to increase the level of messaging services in the corporate environment. The Exchange Messaging team is now in a better position to respond to new messaging trends and accommodate emerging needs as Microsoft continues to grow as a company.

For More Information

For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information Centre at (800) 563-9048. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information through the World Wide Web, go to http://www.microsoft.com.

© 2014 Microsoft