Operating a Global Messaging Environment by Using Exchange Server 2010

Technical White Paper

Published: August 2010

The following content may no longer reflect Microsoft’s current position or infrastructure. This content should be viewed as reference documentation only, to inform IT business decisions within your own company or organization.



During the migration from Exchange Server 2007 to Exchange Server 2010, Microsoft IT began leveraging a number of key features such as Database Availability Groups, Client Access server arrays, and Just-a-Bunch-of-Disks (JBOD) storage. With the introduction of these new technologies, the Exchange Messaging team needed to enhance how it operates the environment to maintain a consistently high level of availability.

To support the changes to the architecture, the Exchange Messaging team reorganized the internal support teams and processes. This involved organizational changes, modifications to the change control process, and the addition of new monitoring tools, such as Microsoft System Center Configuration Manager.

Benefits

  • The support teams benefit from improved problem notification through the Microsoft System Center Configuration Manager event correlation engine.
  • End users benefit from the enhanced SLAs that ensure a more available system.
  • The Exchange Messaging team benefits from streamlined processes that allow maintenance and upgrades to occur in a timely fashion.

Products & Technologies

  • Windows Server 2008 R2
  • Active Directory Domain Services
  • Microsoft Exchange Server 2010
  • Microsoft Outlook 2010
  • Microsoft System Center Configuration Manager

Executive Summary

Introduction

Overview of the Exchange Messaging Team

Service SLAs and Targets

Incident Management and Response

Problem Handling

Change Control

Operations Process Improvement

Real-life Operations Scenario

Best Practices

Conclusion

For More Information

Executive Summary

Enterprise IT organizations, including the Microsoft Information Technology (Microsoft IT) group, deal with service level agreements (SLAs) and power users accustomed to high levels of performance, availability, and responsiveness. The 180,000-plus users at Microsoft send over 15 million internal e-mail messages a day from more than 150 offices worldwide, as well as from home and while on the road. At Microsoft, many business-critical communication processes depend on the availability of messaging services provided through Microsoft® Exchange Server 2010.

Managing the complex Microsoft IT infrastructure is a team effort that involves many different groups, such as the Datacenter team, the Network Infrastructure team, the Active Directory® team, and the Exchange Messaging team. Overall, Microsoft IT manages two distinct environments: a pre-release production environment to test new product versions and upgrades prior to their release to manufacturing (RTM) and a corporate production environment to provide IT services to Microsoft users. Within these environments, the Microsoft IT Exchange Messaging team handles all Exchange-related operation, management, administration, and process optimization. In that role, the Exchange Messaging team works with many other peer teams at Microsoft IT, sharing its operations and process optimization expertise to help those teams implement efficient and reliable operations processes.

The Messaging Operations group within the Exchange Messaging team must meet several reliability, availability, and performance targets (such as 99.99 percent availability of Exchange services). To meet these targets, the Messaging Operations group makes use of industry-standard methodologies such as Microsoft Operations Framework (MOF), Microsoft Solutions Framework (MSF), and Information Technology Infrastructure Library (ITIL). For example, the operations model that the Messaging Operations group implemented based on the ITIL framework relies on structured incident management, problem handling, configuration management, and change control processes. These processes enable the Messaging Operations group to capitalize on Exchange Server 2010 administrative features, such as the Exchange Management Console (EMC), to reduce operations costs and improve efficiency.

The key to success in daily operations is the right combination of technology, people, and processes. For example, the Messaging Operations group uses technical tools, such as the built-in product features of Exchange Server 2010 and Microsoft System Center Operations Manager, combined with a clear team structure and work processes that facilitate collaboration. Built-in product features of Exchange Server 2010, such as Database Availability Groups (DAGs), help the Messaging Operations group meet 99.99 percent availability and performance targets. New tools and software features, and optimization opportunities gained through customer feedback, enable the Messaging Operations group to analyze and implement changes when necessary to keep pace with the innovative and agile business landscape at Microsoft.

This white paper is for business decision makers, technical decision makers, and operations managers. It assumes that the reader has a working knowledge of Microsoft Windows Server® 2008, Active Directory, Exchange Server 2010, and Microsoft System Center Operations Manager. Because many of the principles and procedures discussed in this paper are based on standard operations methodologies, a high-level understanding of the MOF, MSF, and ITIL models is also helpful.

Note: For security reasons, the sample names of forests, domains, internal resources, organizations, and internally developed security file names that are used in this paper do not represent real resource names used within Microsoft and are for illustration purposes only.

Introduction


Since the earliest days of Microsoft Exchange Server, the Exchange Messaging team has operated an enterprise messaging environment with an emphasis on using Microsoft technologies wherever possible to keep total cost of ownership (TCO) as low as possible.

In the Exchange Server 2007 time frame, the Exchange Messaging team began moving toward a lower-cost messaging platform by reducing storage and backup costs, moving to an architecture that supported more automation, and redesigning the operations team to better support these goals. With the introduction of Database Availability Groups (DAGs) in Exchange Server 2010, the Exchange Messaging team achieves a backup-less system by leveraging dumpster 2.0 and maintaining three copies of each mailbox database at all times.

Moving beyond the Exchange Server 2007 goals, the Messaging team set out to offer a more cost-effective and robust messaging environment. It built on its earlier progress in reducing storage costs and automating deployments, and it set out to prove the value available out of the box by removing backups from the environment and raising the availability goals.

To operate the Exchange Server 2010 environments at Microsoft, the Exchange Messaging team developed work processes based on common IT operations frameworks such as ITIL, MOF, and MSF. According to these frameworks, messaging operations involve managing people, processes, and technology, and more often than not these elements span the boundaries of individual teams and their respective areas of jurisdiction. For example, an important aspect of an incident and its resolution is that the Exchange Messaging team must collaborate with other teams also involved in handling the incident, such as front-line operators who identify the incident, analysts who resolve or escalate it, and technical leads and managers responsible for change management and process improvement.

To offer a more robust messaging platform, Exchange Server 2010 provides enabling technologies that help the Exchange Messaging team meet its performance and availability SLAs. Exchange Server 2010 represents the next-generation messaging system, designed to simplify and streamline operations tasks. Among other things, it provides an improved storage design, data replication at the per-database level for high availability, client connections that terminate at the Client Access server (CAS) role, role-based access control to streamline access, and a growing set of graphical and command-line management tools.

The new features in Exchange Server 2010 provide the following advantages for the Exchange Messaging team:

  • Less delivery overhead. The integration of Exchange Server 2010 with hardware load balancing enables Microsoft IT to establish a scalable, load-balancing infrastructure for external messaging clients and to avoid complicated load distribution and client session affinity issues.

  • Improved service time. Prior to Exchange Server 2010, the Mailbox server was responsible for hosting Messaging API (MAPI) connections to the mailbox. When a database had to move to another node, either to allow maintenance or because of a system failure, Outlook had to reconnect to the server that now hosted the database, which consumed some of the few seconds of downtime that the SLA allows. Because these connections now terminate at the Client Access server, database failovers no longer cause Outlook to reconnect, which increases service availability and improves the perceived quality of the service.

  • Greater resource specialization. Exchange Server 2010 makes it possible to separate overall messaging services into individual services that each server role provides, and to establish specialists for each server role. Single-role server deployments help to establish reliable, flexible, and scalable middle-tier services in the messaging environment, and enable systems analysts to focus on being experts in one server role or technology area. The operational impact is that issues can be understood and resolved more efficiently than when all-purpose generalists research and solve them.

  • Increased high availability options. Exchange Server 2010 enables Microsoft IT to eliminate all crucial single points of failure in the messaging environment. This was not possible with previous versions of Exchange Server because replication of mailbox data was limited to two-node clusters. With the introduction of the DAG features in Exchange Server 2010, Microsoft IT is able to provide redundancy of both services and data for Mailbox servers across more than two nodes. Exchange Server 2010 provides enhanced high availability options for other server roles as well. Microsoft IT uses hardware load balancers with Client Access servers internally and externally, and multiple Hub Transport servers to provide redundancy and load balancing for message delivery.

    Note: Detailed information about high availability with Exchange Server 2010 is available on Microsoft TechNet at http://technet.microsoft.com/en-us/library/bb123523.aspx.

  • More control over change management tasks. Remote PowerShell provides the Exchange Messaging team with the means to create scripted and automated initial server configurations, configuration changes, and auditing. This is a key feature that the Exchange Messaging team uses to empower front-line operators. By running tested and approved scripts, front-line operators can respond to known issues without needing detailed product knowledge. With the addition of Remote PowerShell, front-line operators no longer need to keep the versions of the administrative tools installed on their computers up to date.

  • Easier unified messaging service expansion. Prior to the deployment of Microsoft Exchange Server 2007 Unified Messaging, Microsoft IT maintained Unified Messaging servers in the main office of each regional location. Exchange Server 2007 enabled Microsoft IT to consolidate these Unified Messaging server locations into the four datacenters that also contain the Mailbox servers, and to integrate the Unified Messaging servers with Microsoft Office Communications Server 2007. This architecture was extended into Exchange Server 2010, where support for additional languages will improve internal customer satisfaction.

Overview of the Exchange Messaging Team

The Microsoft IT messaging environment now consists of 77 Mailbox servers (in a total of 9 Database Availability Groups), 27 Hub Transport servers, 15 Unified Messaging servers with supporting VoIP gateways, and 72 Client Access servers. There are over 180,000 mailboxes in the corporate production environment, and approximately 6,000 in the pre-release environment. Operating these environments requires structured human resources with clearly defined roles.

Operation Service Interdependence

Some teams at Microsoft IT are responsible for a specific service, and some teams are responsible for a specific function. For example, the Exchange Messaging, Collaboration Services, and Communications teams provide end-to-end operations for their respective services, as shown in Figure 1. Each team makes use of its dedicated management resources, yet the teams often work together because of the interdependent nature of the Microsoft IT environment. The services these teams support rely on a common infrastructure that includes the physical TCP/IP network and the Active Directory environment.

Figure 1. Microsoft IT service teams' structure as of June 1, 2010

Microsoft IT includes the following teams:

  • Collaboration Services. The Microsoft environment has many Microsoft Office SharePoint® Server sites, which require significant coordination to operate. The Collaboration Services team manages all aspects related to Office SharePoint Server, including planning for and performing upgrades.

  • Communications. This team handles all aspects of design, deployment, operations, administration, and management dealing with Office Communications Server.

  • Exchange Messaging. This team handles the entire messaging environment at Microsoft. For more details about the functions and structure of the team, see the section below titled “Messaging Service Structure and Functions.”

  • Shared Services. Microsoft IT created the Shared Services team to reduce overlapping responsibilities and cut costs. Before the Shared Services team existed, each service team had its own human resources for managing the tasks that the Shared Services team now assumes. These tasks include common monitoring and other front-line services for all operations teams within the messaging and collaboration-related service teams. The Shared Services team consists of the following groups:

    • Process Engineering. This group looks at the processes of the Shared Services team to ensure that they meet the requirements of all peer teams that the Shared Services team supports.

    • Client Support. This is a tier 2 support group for Microsoft users. The Client Support group focuses on issues related to end-user connectivity and productivity.

    • Monitoring. This group performs all the front-line monitoring for other service groups, such as the Exchange Messaging group. The performance goal for the Shared Services team's Monitoring group is to resolve at least 80 percent of incidents. Therefore, the people in this group must have general server administration and resolution knowledge, and must follow product-specific resolution instructions to resolve incidents. The Messaging Operations group within the Exchange Messaging team creates the necessary Exchange Server–specific knowledge base and resolution instructions, and provides training on general resolution and response processes.

Messaging Service Structure and Functions

Just as multiple teams handle the overall design, deployment, and operations functions of Microsoft IT, the functions within the Exchange Messaging team are similarly distributed. Exchange Messaging team members manage the messaging service from end to end. This entails monitoring messaging-related incidents, coordinating changes, and collaborating with the Exchange Server product group and with other Microsoft IT teams. Within the Exchange Messaging team, people have specialized roles and work together in specialized groups, each of which handles a portion of the overall responsibilities, as shown in Figure 2.

Figure 2. Exchange Messaging team organizational structure as of June 2010

Exchange Server 2010 not only enables IT organizations to capitalize on expert knowledge according to individual server roles; it also provides cost-efficient opportunities to cover basic and general operational aspects via Shared Services teams. As shown in Figure 2, Microsoft IT takes advantage of this possibility with Exchange Server 2010 by using a Shared Services team to perform all front-line management tasks. This frees the Exchange Messaging team to focus on escalated issues and complex tasks such as root cause analysis.

In order to carry out Exchange Server 2010–specific monitoring and incident response, the Shared Services team must have specific resolution steps, which the Exchange Messaging team provides. Specialists in the Messaging Operations group use their expert knowledge to create these detailed resolution steps. If an incident arises that these detailed steps do not cover, then the Shared Services team escalates the incident to the Messaging Operations group. Because the Shared Services team handles the vast majority of incidents without escalation, the Exchange Messaging team can apply expert knowledge in an increasingly targeted way.
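
The escalation path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alert names, the resolution steps, and the `route_incident` and `front_line_resolution_rate` helpers are hypothetical and do not come from the Microsoft IT tooling.

```python
from dataclasses import dataclass

# Hypothetical runbook: alert types mapped to approved resolution steps
# written by the Messaging Operations group. All names are illustrative.
RUNBOOK = {
    "database_copy_unhealthy": ["Check copy status", "Follow approved reseed script"],
    "transport_queue_backlog": ["Inspect queue", "Follow approved retry script"],
}


@dataclass
class Incident:
    alert_type: str
    description: str


def route_incident(incident, runbook=RUNBOOK):
    """Return (owner, steps): front line if documented steps exist, else escalate."""
    steps = runbook.get(incident.alert_type)
    if steps:
        return "Shared Services Monitoring", steps
    # No documented resolution steps: escalate to the specialists.
    return "Messaging Operations", []


def front_line_resolution_rate(incidents, runbook=RUNBOOK):
    """Fraction of incidents the front line can handle without escalation."""
    if not incidents:
        raise ValueError("need at least one incident")
    handled = sum(
        1 for incident in incidents
        if route_incident(incident, runbook)[0] == "Shared Services Monitoring"
    )
    return handled / len(incidents)
```

In this sketch, every incident that escalates is a candidate for a new runbook entry, which is how documented steps keep the front-line resolution rate high.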

Messaging Engineering Team Functions

The Messaging Engineering team within the Exchange Messaging team designs the messaging systems in the corporate production environment. This broad goal includes many complementary tasks, such as interacting with the developers in the Exchange Server product group, analyzing the performance and scalability of server designs, evaluating technology, and performing research to accomplish these tasks. To design the messaging environment, the Messaging Engineering team verifies the recommended system parameters and configuration options set by the Exchange Server product group as well as the initial performance and configuration recommendations from the pre-release production environment. As part of designing the corporate production environment, the Messaging Engineering team also creates and maintains documentation that details overall environment design aspects, messaging topology, server specifications, and Exchange Server 2010 configuration settings.

Note: The Messaging Engineering team does not design the pre-release production environment. That design evolves from recommendations from the Exchange Server product group. However, it is the task of the Leads team in the Messaging Operations group within the Exchange Messaging team to deploy and operate the pre-release production environment. In this way, the Messaging Operations group verifies performance and functionality based on the default settings before making any customizations and changes for settings in the production environment.

The Messaging Engineering team designs the corporate production environment. Its design is based on the results of capacity planning; enterprise design and architecture practices; the results of lab evaluations and testing; and proven and verified results from the pre-release production environment. The latter entails collaborating with the product group to transfer knowledge such as the configuration and hardware settings, deployment steps, and best practices. Because of this close collaboration, members of the Messaging Operations group participate in many engineering projects and even get the chance to own some aspects of the corporate production environment design. By gaining real-world experience in collaboration with the Messaging Engineering team, members of the Messaging Operations group can move on with their careers as messaging engineers.

Feature Program Manager Functions

The Feature Program Manager (Feature PM) role within the Exchange Messaging team owns each feature end to end, from ensuring a smooth interface with the product team to ensuring the Shared Services team within Microsoft IT provides adequate networking and The Program Manager's primary responsibility is to ensure that the features they are responsible for are implemented optimally and function smoothly. This requires working with the Messaging Engineering team on the design, ensuring that the Service Leads team will support the new design, ensuring that the Build and Change teams implement the changes as prescribed, and ensuring that the SLAs associated with each feature are met. Additionally, Feature PMs work with Service Management to ensure that all client-facing features are communicated to end users and that the Helpdesk is prepared to support them. This role is the strategic arm of the Messaging Operations group.

Messaging Operations Group

The Messaging Operations group is responsible for the dual goals of product improvement and providing highly reliable and available messaging services. The Messaging Operations group works very closely with the Exchange Server product group, and runs a pre-release production environment dedicated to trying new builds, verifying functionality, and discovering improvement opportunities before the Exchange Server product is released to manufacturing. The in-depth knowledge gained from this close collaboration enables the Messaging Operations group to create thorough incident response documentation for front-line Monitoring group operators, discover problem root causes, and oversee changes to the environment and product releases.

The Messaging Operations group consists of four teams that perform specialized, yet sometimes overlapping tasks. The broad Exchange Server product knowledge that the members of each team possess enables team members to help each other when there is an unexpected workload, such as during an unscheduled outage. The teams perform the following specific functions:

  • Incident Management team. This team is composed of Exchange Server specialists who focus on technologies and monitoring, rather than a specific server role, to handle any unresolved issues passed on to them by the Shared Services group. Team members also possess knowledge of IT operations and server administration. Their responsibilities include fine-tuning the Microsoft System Center Operations Manager Exchange Management Packs, operational training for the team, and incident handling and escalation. The team works closely with the Leads team in its specific areas of expertise in order to gain new knowledge, for career growth, and to provide as-needed resources.

    The Incident Management team performs additional operations work that is not 100 percent related to the Exchange Server product. For example, team members act as configuration management advisors for System Center Configuration Manager (SCCM), serve as change approvers for each service, and interact with customers by presenting in the Computer Information Technology/Foundation (CITF) program.

  • Service Leads team. This team consists of people who specialize in a specific server role and work on operational projects such as automation and scripting. This strategic team in messaging operations approves the operational feasibility of features for the messaging service, and defines deployment guidelines (including feature validation and new product functionality verification). The Leads team ensures operational efficiency, manages service deployment readiness and documentation for operational staff, and works closely with the product group during deployment and on new design change requests (DCRs). Additionally, the team works closely with the Messaging Engineering team to transfer knowledge learned during product and feature verification.

  • Build team. This team primarily follows pre-established documentation to build environments. The Messaging Engineering team puts together this documentation and the related procedures, which are intended for the mass rollouts that the Build team handles. This leaves the Build team free to focus on perfecting implementations and quality assurance, as well as providing support to the rest of the larger Messaging Operations group.

  • Change team. This team is responsible for implementing all changes after the product has been implemented in production. This can include changes to items such as transport rules, security access for application relays, or hotfixes, along with a variety of other items that fall into the category of modification to the existing infrastructure.

Note: It is the task of the Messaging Operations group to resolve all issues escalated by the Monitoring group within Shared Services. The Messaging Operations group involves the

Service SLAs and Targets

Microsoft IT pursues an end-to-end approach to SLAs, which means being responsible for meeting the SLAs and for all the services and components that contribute to them. For example, if a network connectivity issue prevents users from accessing their mailboxes, Microsoft IT considers the messaging service unavailable even if all Exchange Server 2010 servers are up and running, because the end-user experience is an unavailable messaging service. Because Exchange Server 2010 integrates tightly with Active Directory and depends on the TCP/IP network infrastructure and other technologies to deliver its features, the Exchange Messaging team must be aware of incidents beyond those specific to Exchange Server 2010. In fact, many teams at Microsoft IT, such as the Windows Server team and the Active Directory team, proactively report incidents to the Exchange Messaging team before anyone else does.

End-to-end operations provide Microsoft IT with many advantages over the previous server-centric approach. With end-to-end operations, incidents are resolved faster because teams own their incident tickets until an incident is resolved; processes are more flexible; costs are reduced through selective usage of specialists; and the overall performance of the Microsoft IT organization becomes the shared responsibility of all teams. The Exchange Messaging team is ultimately a heavy user of underlying infrastructure such as the physical network, Active Directory, DNS, and firewalls, and it can meet its SLAs only if the other teams also meet theirs. At the core of no-excuses SLAs are individual teams that are responsible for specific areas but share overall accountability.

The no-excuses SLA policy came about when Microsoft IT management examined its organizational hierarchy and realized that users only see an outage, and not its causes. From a user’s point of view, if a service is unavailable the user is witnessing a service outage. The cause of the outage may be an issue with the TCP/IP network, telecommunications provider, or underlying Active Directory infrastructure. None of these causes is Exchange-specific, yet each causes a messaging service outage, which counts against the availability SLA. With the no-excuses SLA, the source of the issue does not matter; the Exchange Messaging team owns the incident and its resolution, and has the responsibility to introduce changes to prevent issue recurrence.
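
The difference between server-centric and no-excuses accounting can be illustrated with a small Python sketch. The `Outage` type, the cause labels, and the outage durations are hypothetical and chosen only for illustration; they are not measurements from the Microsoft IT environment.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    cause: str    # hypothetical label, e.g. "network" or "exchange"
    seconds: int  # time during which users could not use the service


def availability_percent(outages, period_seconds, counted_causes=None):
    """Availability over the period.

    counted_causes=None counts every outage (no-excuses accounting);
    otherwise only outages with a listed cause count against the SLA.
    """
    down = sum(
        o.seconds for o in outages
        if counted_causes is None or o.cause in counted_causes
    )
    return 100.0 * (period_seconds - down) / period_seconds


# Illustrative 30-day period with one network outage and one Exchange outage.
PERIOD = 30 * 24 * 60 * 60
OUTAGES = [Outage("network", 300), Outage("exchange", 120)]

server_centric = availability_percent(OUTAGES, PERIOD, counted_causes={"exchange"})
no_excuses = availability_percent(OUTAGES, PERIOD)
```

With these sample numbers, the server-centric figure stays above 99.99 percent while the no-excuses figure falls below it, which mirrors the point that users only see the outage and not its cause.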

Business Drivers for SLAs

The ambitious SLAs that the Exchange Messaging team sets, as discussed in the next section, “Implementation Goals,” are not strictly necessary to meet business needs and user expectations at Microsoft. Historically, Microsoft IT did not always have the 99.99 percent availability SLA, and the business still functioned profitably even at the performance level of 99.9 percent availability.

The Exchange Messaging team moved to more aggressive SLAs as a way to push the envelope and prove what is possible with Exchange Server for even the most demanding customers. For many years previously, the team had gathered performance statistics and reviewed them weekly as scorecards that listed the SLAs and performance. The scorecards had a green indicator for met SLAs and a red indicator for unmet SLAs. Often the scorecards showed green even though an underlying service issue was affecting business operations. Moving to end-to-end operations and adopting the no-excuses SLA changed this for operations at Microsoft. Now when a category shows green with the 99.99 percent target reached, it means that end users experienced four nines of service, with uptime measured to the second.
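
To make the step from 99.9 percent to 99.99 percent concrete, the following Python sketch converts an availability target into a downtime budget. The function name and the 30-day reporting period are assumptions made for illustration; the paper does not state the reporting period that Microsoft IT uses.

```python
from decimal import Decimal

SECONDS_PER_DAY = 24 * 60 * 60


def downtime_budget_seconds(availability_percent, period_days):
    """Seconds of downtime allowed in the period at the given availability."""
    # Decimal keeps the arithmetic exact for targets such as 99.99.
    target = Decimal(str(availability_percent))
    if not Decimal(0) <= target <= Decimal(100):
        raise ValueError("availability must be between 0 and 100 percent")
    period_seconds = Decimal(period_days * SECONDS_PER_DAY)
    return period_seconds * (Decimal(100) - target) / Decimal(100)


if __name__ == "__main__":
    for target in ("99.9", "99.99"):
        budget = downtime_budget_seconds(target, period_days=30)
        print(f"{target} percent over 30 days allows {budget} seconds of downtime")
```

Over a 30-day period, 99.9 percent availability allows 2,592 seconds (43.2 minutes) of downtime, whereas 99.99 percent allows about 259 seconds (roughly 4.3 minutes). This is why uptime has to be measured to the second.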

Note: In addition to operating a pre-release production environment, Microsoft runs beta testing programs, and programs with partners where pre-release code is deployed in partner IT environments. This focus on partners means that Microsoft IT does not just look to its internal needs, but is always mindful of the needs of external customers.

Implementation Goals

Microsoft has an e-mail-centric culture. In a typical month, there are over 92,000 OWA connections and 63,000 ActiveSync connections, up from 70,000 and 48,000 at the start of the Exchange Server 2010 rollout. Additionally, the environment supports trusted partner connections, multiple forests, and the global presence of Microsoft. Therefore, performance optimization and improvement is a vital task directly related to SLAs.

With such an e-mail-centric culture, and aggressive demands on the underlying e-mail platform, Microsoft added a set of goals to its operational deployment of Exchange Server 2010. The purpose of these additional goals is to ensure that the experience the internal Messaging Operations team had when it implemented Exchange Server 2010 is the same experience customers will have when they implement the product. These goals include:

  • Drive lightweight monitoring capabilities that are not resource intensive.

  • Ensure that Database Availability Group (DAG) monitoring is robust enough to support removing backups from the environment.

  • Ensure that a deployed System Center Operations Manager Management Pack provides effective monitoring out of the box, with no customizations required.

Beyond these customer-focused goals, the Messaging team also has an internal company to support. For the internal customer, Microsoft IT set overall organization-wide SLAs that are managed and reported to the company. In addition, the Messaging Operations group analyzed the various dependencies and developed targeted, team-specific SLAs. These team-specific SLAs yield a finer granularity in controlling and gathering statistics about outages for reports, which enables more accurate self-assessment and trend analysis. Although the team-specific SLAs are not as rigorous, they enable a closer inspection of the environment and help ensure achievement of the organization-wide SLAs.

Organization-wide SLAs

Organization-wide SLAs represent broad performance goals in the Microsoft IT messaging environment. These SLAs represent a commitment to users, customers, and the Exchange Server product group that Exchange Server 2010 can deliver mission-critical results. The SLAs cover the important messaging aspects, such as delivery times and availability. More specifically, Microsoft IT defined the following SLAs:

  • Delivery of 99.99 percent of all internal messages to their final destination must take 90 seconds or less.

  • Availability: Overall availability of messaging services must be 99.99 percent or greater. This figure covers all aspects of the service experienced by end users, such as database uptime, Client Access server (CAS) uptime, the ability of Outlook to connect, and internal SMTP mail sending and receiving. It is derived from out-of-the-box System Center Operations Manager (SCOM) rules.

  • Business continuance: After a failure within a server, messaging service must resume in one minute or less.

  • Supportability: The Helpdesk has an SLA of 90 percent resolution with a time to resolution (TTR) of one hour. Once an issue is escalated to Tier 2 Client Support, the TTR is 72 hours, with a time to update (TTU) of 24 hours.
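
To make these targets concrete, the following minimal sketch converts an availability percentage into a downtime budget and checks a batch of delivery times against the 90-second target. The function names and the sample data are hypothetical and are not taken from Microsoft IT tooling; only the 99.99 percent and 90-second figures come from the SLAs above.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct, days):
    """Return the downtime budget, in minutes, for a reporting period of `days` days."""
    return (1 - availability_pct / 100) * days * MINUTES_PER_DAY


def delivery_sla_met(latencies_s, threshold_s=90.0, target_per_10k=9999):
    """Return True if at least target_per_10k out of every 10,000 messages
    were delivered within threshold_s seconds.

    The target is expressed per 10,000 messages (9999 = 99.99 percent) so the
    comparison uses integer arithmetic and avoids floating-point rounding.
    """
    on_time = sum(1 for t in latencies_s if t <= threshold_s)
    return on_time * 10_000 >= target_per_10k * len(latencies_s)


if __name__ == "__main__":
    # A 30-day month at 99.99 percent leaves a budget of roughly 4.3 minutes.
    print(round(allowed_downtime_minutes(99.99, 30), 2))
    # Hypothetical sample: 9,999 fast messages and one slow message.
    print(delivery_sla_met([10.0] * 9999 + [120.0]))
```

Seen this way, a 99.99 percent availability target leaves only a few minutes of total downtime per month, which is why the business continuance target is measured in a single minute.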

Team-Specific Service Level Agreements and Key Performance Indicators

Team-specific Service Level Agreements (SLAs) and Key Performance Indicators (KPIs) focus on specific server roles, technologies, physical locations, or similar criteria to provide a convenient means for reporting and analysis. These SLAs and KPIs address not only the technical aspects, but also “soft” factors such as user satisfaction. The Messaging Operations group tracks the following SLAs and KPIs:

  • Core e-mail: Overall mailbox availability (SLA)

  • Core e-mail: Outlook Availability (SLA)

  • Client Access server (CAS) per-protocol availability (KPIs):

    • ActiveSync (Mobile)

    • Outlook Web App (OWA)

    • Web Services (EWS)

  • Transport: Mail Flow Statistics (KPI)

  • Client availability (SLA)

  • Client performance (SLA)

  • Unified Messaging: overall availability

Incident Management and Response

The Microsoft IT environment is global and has regional IT teams that are responsible for managing the site-specific hardware. Overall, 50 percent of the IT experts who work at Microsoft IT are vendors. Running this enterprise IT organization requires established workflow processes, communication paths, and coordination to respond to incidents and resolve them in a timely manner.

The Messaging Operations group is a leader within Microsoft IT in overseeing cost-cutting and process-improvement measures for incident management. The group accomplishes this by decreasing the workload on specialists and transferring that knowledge to front-line Monitoring group operators that respond to incidents. For Microsoft IT, it means the Messaging Operations group can focus specialist resources on more involved processes.

Incident management provides the following advantages:

  • Increased specialization opportunities: Part of the method of increasing efficiency and lowering costs is to create specialists within a specific body of messaging knowledge. However, using specialists to solve all operational issues can be expensive. To maintain cost-efficiency, there also must be people who can take over some of the systematic aspects of operations, such as responding to an incident and following prescribed and documented resolution steps. With the Shared Services team acting as front-line monitoring operators for multiple services, each service group can develop service specialists.
  • Fast response by front-line Monitoring group operators: Front-line monitoring operators work in a 24-hour, 7-days-per-week datacenter where an operator watches the monitoring screen at all times. The Messaging Operations group takes this very seriously: if an operator wants a break, there must be another person monitoring the console. During incident reviews, one aspect of the review involves verifying that an operator was indeed present and watching the monitors when the incident occurred.
  • Uniform and standardized handling of incidents: With scripted and prescribed resolution steps that are tested and verified, front-line monitoring operators can follow identical resolution paths no matter the level of personal experience or expertise.
  • Decreased support requirements for product experts: If specialists transfer their knowledge of how to resolve incidents to front-line monitoring operators who do not have deep, Exchange-specific product knowledge, then the specialists can focus their energies on other tasks. To accomplish this, the specialists in the Messaging Operations group must perform the knowledge transfer and documentation tasks so that front-line monitoring operators have clear instructions for how to resolve Exchange-specific incidents.
  • Measurable results: By separating the overall operational processes according to individual components, the Messaging Operations group can measure each component to gather performance statistics. Having accurate statistics is important because it gives management an accurate picture of the environment, so that they can spot trends or process inefficiencies. Additionally, assigning primary incident response work to the front-line operators is more cost-efficient than having senior-level specialists resolve incidents.
  • Focus on product validation: Well-defined processes and roles help the Messaging Operations group focus on the dual goals of maintaining high standards for the messaging environment and providing product validation to the Exchange Server product group and Microsoft customers. By freeing up specialist resources to work in the pre-release production environment, the Messaging Operations group can devote more time to checking features and functionality of beta builds before the product reaches the marketplace. This results in the Messaging Operations group identifying over 90 percent of product issues in beta code before anyone else does.

Incident Life Cycle

The Messaging Operations group follows a structured framework for dealing with and resolving incidents. The people involved in responding to and resolving incidents follow a scripted series of processes. As discussed below, the life cycle of an incident involves both front-line monitoring operators from the Shared Services team and members of the Messaging Operations group, with defined roles, processes, and tools used from the initial response to final resolution. The teams go through the following incident life cycle:

  • Awareness/notification: Front-line monitoring operators become aware of incidents via several sources. Most incidents originate with Microsoft System Center Operations Manager, which acts as the monitoring and detection system. Microsoft System Center Operations Manager includes rules to check the status of thousands of individual factors, such as queue length and service status, in the Exchange organization at various levels of depth. It accomplishes this via an Exchange Management Pack, which includes thousands of rules specifically for monitoring Exchange Server 2010. These rules are then correlated through the System Center Operations Manager correlation engine to ensure that front-line support receives only root cause alerts. For example, instead of triggering three different alerts during an outage in which a hard drive fails, the database goes offline, and the DAG fails over, the correlation engine triggers only one alert, for the hard drive failure, because that is the problem that must be fixed for the correlated events to be resolved. The Messaging Operations group customizes the Exchange Management Pack by modifying rule settings and alert triggers, as the examples in Table 1 show. Another way the Monitoring group becomes aware of incidents is via users who report to the Helpdesk. While the Helpdesk resolves most user issues, some incidents require escalation to the front-line monitoring operators. If the operators cannot resolve an incident, then it is escalated to the Messaging Operations group.
  • Note: Because Microsoft System Center Operations Manager delivers alerts in real time, and because alerts are proactive, front-line monitoring operators can resolve most incidents before users ever report them to Helpdesk. The Messaging Operations group resolves most of the incidents related to Exchange Server 2010 before users ever become aware of them.

  • Response: Microsoft IT uses an incident tracking system that integrates with Microsoft System Center Operations Manager. Alerts generated by rules in the Exchange Management Pack also automatically generate a ticket in the tracking database. The alerts include two knowledge bases about how to resolve the alerts: the default knowledge base that comes with the Exchange Management Pack, and a messaging-specific knowledge base. The Exchange Messaging team created the messaging-specific knowledge base to gather very detailed information about incidents, in order to help with product improvement. A goal of the Messaging Operations group is to create scripted resolution guidance detailed enough for any front-line monitoring operator to follow the procedure and resolve the incident. To clarify resolution procedures, members of the Messaging Operations group routinely update the knowledge base with the latest guidance, based on their experiences.
  • Management: As already mentioned, Microsoft System Center Operations Manager and the incident tracking system provide a way for operations personnel to view details about incidents, such as incident type, occurrence, existing knowledge for resolution guidance, and status. The tracking system enables front-line operators to escalate incidents for resolution if the knowledge base instructions do not resolve an incident. In addition to these tools, the Exchange Messaging team uses OpsWeb, an internal line-of-business (LOB) application that is available to the Helpdesk for viewing tickets, grouping them in selected views, appending Helpdesk tickets to a master ticket, and checking for existing issues in order to avoid repeat escalations.
  • Resolution: After resolving an incident, the members of the Monitoring group mark the ticket as complete in the ticket tracking system and archive tickets older than three months. Only the most difficult incidents, or those flagged for further investigation and detailed root cause analysis, reach the Leads team. If the Messaging Operations group does not resolve an incident, then the incident further escalates to the Exchange Server product group, which has additional resources such as developers who can perform a live debug. During debugging, developers examine memory dumps to check for causes. Incidents that require additional research, product updates, or a major change generate another ticket in the development database as a change request. In this way, Microsoft IT helps the Exchange Server product group by providing developers with a real-world environment for deep debugging and detecting product issues that require code changes.
  • Review: As part of incident management within the incident life cycle, the Messaging Operations group performs daily reviews of the incidents. The purpose of the review is to identify major problems that have not been resolved by front-line staff. The major problems are then assigned to a problem manager who champions the problem with the specialized resources available within the Messaging team and the Product team. After identifying an incident for closer inspection, members of the Messaging Operations group analyze it to determine whether the incident is indicative of a larger underlying problem. At this point, the Messaging Operations group may decide to investigate it further and determine the root cause by using problem-handling processes, as discussed in the next section, “Problem Handling.”
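
The root-cause filtering described in the awareness/notification step can be illustrated with a minimal sketch. This is not how the System Center Operations Manager correlation engine is implemented; the alert names and the `CAUSED_BY` dependency map are hypothetical and exist only to show the idea of suppressing symptom alerts when a known cause is also active.

```python
# Hypothetical dependency map: symptom alert -> alerts that can cause it.
CAUSED_BY = {
    "database_offline": {"disk_failure"},
    "dag_failover": {"database_offline", "disk_failure"},
}


def root_cause_alerts(active_alerts):
    """Return only the alerts that are not explained by another active alert."""
    active = set(active_alerts)
    return sorted(
        alert
        for alert in active
        if not (CAUSED_BY.get(alert, set()) & active)
    )


if __name__ == "__main__":
    # The three-alert outage from the text collapses to the disk failure.
    print(root_cause_alerts(["database_offline", "dag_failover", "disk_failure"]))
```

In this sketch, an alert is surfaced to the operator only when none of its possible causes is currently firing, so fixing the surfaced alert is expected to clear the suppressed ones.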

Table 1. Exchange Management Pack Customizations



| Rule | Customization |
|---|---|
| Exchange store database: RPC Averaged Latency | Changed to alert if latency is sustained above 70 milliseconds for 5 minutes. |
| Unreachable Queue Length | Changed to alert if queue size is greater than 1 for 60+ minutes. |
| Retry Remote Delivery Queue Length | Changed to alert if queue length is greater than 1 for 30+ minutes. |
| The database volume space | Changed to alert if database volume free space is lower than 25%. |
| UMWorkerProcess Rejected Calls Percentage | Changed to alert if the percentage of calls rejected in an hour is greater than 10. |
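
Several of the customizations in Table 1 share one pattern: alert only when a value stays above a threshold for a sustained period. The following is a minimal sketch of that pattern, not the Exchange Management Pack's actual rule logic. The function name, the sample format, and the one-minute sampling interval are assumptions made for illustration.

```python
def sustained_breach(samples, threshold, min_duration_s):
    """Return True if the value stayed strictly above `threshold` for at
    least `min_duration_s` seconds without interruption.

    `samples` is a list of (timestamp_in_seconds, value) pairs sorted by time.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # first sample of a new breach
            if timestamp - breach_start >= min_duration_s:
                return True
        else:
            breach_start = None  # value recovered, so reset the window
    return False


if __name__ == "__main__":
    # Hypothetical RPC latency samples (milliseconds), one per minute.
    steady = [(t, 80) for t in range(0, 301, 60)]
    print(sustained_breach(steady, threshold=70, min_duration_s=300))
```

Requiring a sustained breach rather than a single sample is what keeps brief spikes from generating tickets for the front-line monitoring operators.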

Problem Handling

Whereas Microsoft IT incident management deals with restoring service as quickly as possible, problem handling deals with minimizing the impact of an incident and preventing recurring incidents by seeking to discover incident root causes. The problem-handling discipline focuses on the resolution of the underlying causes of the problem, rather than the speed of the resolution.

For the Messaging Operations group, problem handling involves helping the Exchange Server product group ship the best software possible. This means not only solving issues as they arise, but also finding the root cause of an issue, documenting it, and making sure that there is either a published workaround or a permanent change in the product that addresses the issue. Problem handling fosters change because it takes into account the people, processes, and technology involved in a particular incident. Evaluating how an incident arose, how it was resolved, and digging down to the root cause means considering what contributed to the problem: people, processes, technology, or a mix of these factors.

The Messaging Operations group uses two environments in its problem-handling processes:

  • Pre-release: This environment provides the Messaging Operations group with great flexibility in determining incident root causes because it is set up for the express purpose of product improvement and validation. Therefore, meeting rigorous SLAs and resolving incidents as quickly as possible is secondary to determining an incident's root cause, working with developers to replicate and understand product behaviors, and trying workarounds or product updates to rectify an incident. In this environment, it is acceptable to take longer to analyze and resolve an incident, because the analysis should determine the root cause of the incident.
  • Corporate production environment: In the corporate production environment, problem handling complements incident management by finding, if possible, the root cause of an incident. The Messaging Operations group allows extended downtime only if an incident is not reproducible in another environment, and does so to ensure that the developers implement necessary fixes in the product code before Microsoft releases Exchange Server 2010 to customers. Because of the rigorous SLAs, and because the Messaging Operations group must demonstrate product readiness, the group documents the settings and configurations that led to an incident, in order to use that information when working to discover the root cause of the incident.

Selection Process

The Messaging Operations group selects which incidents to investigate based on two factors: a list of incident types that require mandatory investigation, and as-needed inquiries for the remaining incidents. The list of events that require the creation of a problem ticket includes serious incidents, such as a queue size of more than 10,000 messages, a UM service disruption, a server outage, and so on. The Messaging Operations group creates a problem ticket for any incident that severely affects any service they support.

To select other tickets for investigation, the Messaging Operations group conducts weekly service review meetings with representatives from the product team to discuss all outstanding problem tickets and review incidents from the previous week to determine whether any require the opening of a new trouble ticket. The most important criteria used to select as-needed incidents for further investigation are incident frequency and trends. When the same type of incident occurs repeatedly, it often signals an underlying problem that is not isolated to just a few servers. Trend analysis helps to evaluate frequency over a longer period to help determine whether to further investigate some incidents.

After narrowing down the list of incidents to investigate, the Messaging Operations group performs a sanity check on the incident summary to ensure that all the necessary components are present to open a problem ticket. For example, the incident report must include a full set of notes that document the incident through every step, from initial alert to final resolution. Without these components, the Messaging Operations group cannot select what to investigate because there is not enough data. After opening a ticket, the Messaging Operations group uses its tools to assign the ticket to a team member, who is responsible for following up at least once a week until the issue is resolved.
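
The selection criteria above can be summarized as a small decision function. This is a sketch only: in practice the decision is made by people in the weekly service review, and the incident type names, the dictionary layout, and the recurrence limit of three are assumptions introduced for illustration. Only the 10,000-message queue size comes from the text.

```python
from collections import Counter

# Incident types that always require a problem ticket (names are hypothetical).
MANDATORY_TYPES = {"um_service_disruption", "server_outage"}
QUEUE_SIZE_LIMIT = 10_000  # from the text: more than 10,000 queued messages
RECURRENCE_LIMIT = 3       # assumption: what counts as "repeatedly"


def needs_problem_ticket(incident, recent_types):
    """Decide whether an incident is a candidate for a problem ticket.

    `incident` is a dict with a "type" key and optional "queue_size" and
    "notes_complete" keys. `recent_types` lists the types of recent incidents.
    """
    # Sanity check: without complete notes there is not enough data.
    if not incident.get("notes_complete", False):
        return False
    if incident["type"] in MANDATORY_TYPES:
        return True
    if incident.get("queue_size", 0) > QUEUE_SIZE_LIMIT:
        return True
    # As-needed selection based on how often this incident type recurs.
    return Counter(recent_types)[incident["type"]] >= RECURRENCE_LIMIT


if __name__ == "__main__":
    incident = {"type": "queue_backlog", "queue_size": 12_500, "notes_complete": True}
    print(needs_problem_ticket(incident, recent_types=[]))
```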

The Messaging Operations group maintains its own database tool to create and manage problem tickets. In addition, the team uses a custom Office SharePoint Server site for problem ticket review. These tools track progress, help assign resources to a problem, and facilitate managing problem status.

Problem Review

Problem management also includes problem review and metrics. After finding the root cause of an issue, the team member responsible for handling the issue must create a corresponding entry in the knowledge base or similar documentation within seven days of resolution of the issue. Through experience, the Messaging Operations group discovered that if a problem occurs once on a specific server or group of servers, it often recurs with other servers. Therefore, it is most efficient to provide comprehensive resolution steps for front-line monitoring operators as soon as possible. The Messaging Operations group reviews problems at weekly meetings to provide status updates and to open new problem tickets.

At monthly meetings, the Messaging Operations group tracks metrics and trends, which presents an opportunity to view a scorecard of statistics and trends for the month. The scorecard consists of various MPR data viewed through pivot tables, including the number of tickets opened and the number of tickets resolved, with the root cause determined.

The most serious problems are escalated to the Joint Availability Task Force. This task force comprises leaders from the groups within Microsoft IT and the Exchange Server product group. Its purpose is to ensure that all teams involved in the making and implementation of the product have visibility into the largest issues. This ensures that the right resources are focused on the issues and that the issues are prioritized appropriately.

During an incident that has not been resolved but requires this level of escalation, the task force is engaged via e-mail and kept up to date throughout the issue.

Problem Notification

Notifying end users and helpdesk staff is a critical part of ensuring that users of the system are satisfied with the service they are using. Notification can take several forms, from updating an internal website that displays the status of the service, to ensuring that the helpdesk can properly describe to end users the problems that are occurring in the environment, to directly notifying end users.

Microsoft's internal web site, OpsWeb, is used for general notification of issues with the system. Details about which components of the system are working and not working, such as e-mail flow to and from the Internet, are displayed on the site using green, yellow, and red indicators. This tool is available to all users of the system for viewing and is also the place where the helpdesk staff interact with the ticketing system on a day-to-day basis.

Notification directly to end users of a system outage or problem is extremely rare. Most problems are resolved before the majority of end users are aware that there is a problem. For the subset of users who are affected by these shorter outages, it makes more sense to ensure they are updated with resolution ETAs through the OpsWeb site or the helpdesk team. In the event of a prolonged service outage, the Service Management team sends an e-mail communication to inform users of the root cause and resolution, which helps maintain a high client satisfaction level.

Change Control

For most IT organizations, introducing changes to a production environment typically involves a source for inputting changes (such as user feedback) and a way to design and verify the changes in a sandbox environment. Then they roll out changes in a software update or similar mechanism to the production environment. Additionally, change control incorporates review processes and management tools to ensure that teams working on changes track and complete change requests.

For the Messaging Operations group specifically, change control encompasses the traditional processes of accepting change requests from multiple sources and then designing, verifying, and rolling out changes. The underlying goal is to increase the prescribed handling of the change processes. The Messaging Operations group, in working through its change control processes, attempts to reduce the workload on specialists and distribute work to others by thoroughly documenting steps and procedures for implementing changes. This enables those who are not Exchange Server 2010 specialists to apply changes uniformly to all computers in the messaging environment.

The Messaging Operations team accomplishes this goal through a Change Advisory Board (CAB). The CAB meets weekly to review the proposed changes to the system. These changes may be proposed as part of a problem resolution step, a service enhancement, or a preventative maintenance measure. Several things happen during the CAB meeting:

  • Request for change creation: Change packages are presented to the board for review. These packages contain the details of what is being requested to be changed, what will be impacted, step-by-step instructions for how the change will be implemented, and justification for the change.
  • Selection: The Messaging Operations group accepts change packages if they meet approval criteria such as completeness of information. Information completeness includes detailed rollback and rollout instructions, severity, and expected turnaround time frames. In dealing with changes, the Messaging Operations group maintains a staged set of instructions that a non-specialist can follow to implement the change.
  • Severity and impact analysis: The board categorizes changes based on SLA impact, capacity, security, and disaster recovery readiness, and assigns minor, major, or automatic severity status. The automatic status is used for small changes that can be implemented without further investigation because they are either critical to performance and availability or pose no significant risk.
  • Prioritization: The change urgency complements its severity status. Some changes represent emergency solutions and require implementation in 24 hours or less, whereas others may be moderately urgent and can be scheduled for completion over one or more weeks. This prioritization feeds the negotiation of when the change will actually occur. Some change windows are assigned during the day for changes that will not impact the system, other changes require after-hours implementation, and still others may require notification to end users.
  • Implementation: As part of the implementation process, the Messaging Operations group assigns changes to the Build/Change team, which executes the change package based on all of the previously negotiated parameters. Once the Build/Change team executes the change, the Test team is engaged to ensure that the change was successful and that the back-out instructions do not need to be used.
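
The completeness criteria that the board applies during selection lend themselves to a simple automated pre-check. The following sketch assumes a hypothetical `ChangePackage` record; the field names are illustrative and do not reflect the schema of any Microsoft IT tool. The three severity values come from the text.

```python
from dataclasses import dataclass, fields

VALID_SEVERITIES = {"minor", "major", "automatic"}


@dataclass
class ChangePackage:
    """Hypothetical record of what a change package must contain."""
    description: str            # what is being changed
    impact: str                 # what will be impacted
    rollout_steps: list         # step-by-step implementation instructions
    rollback_steps: list        # back-out instructions
    justification: str
    severity: str               # "minor", "major", or "automatic"
    expected_turnaround_hours: float


def missing_items(package):
    """Return the names of empty fields, plus a note for an unknown severity."""
    problems = [f.name for f in fields(package) if not getattr(package, f.name)]
    if package.severity and package.severity not in VALID_SEVERITIES:
        problems.append("severity (unknown value)")
    return problems


if __name__ == "__main__":
    package = ChangePackage(
        description="Apply update rollup",
        impact="Mailbox servers in one DAG",
        rollout_steps=["Install on servers hosting passive copies first"],
        rollback_steps=[],  # incomplete: no back-out instructions
        justification="Addresses a known issue",
        severity="minor",
        expected_turnaround_hours=4.0,
    )
    print(missing_items(package))
```

A check like this cannot replace the board's judgment on severity and impact, but it can keep incomplete packages from reaching the weekly meeting.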

Team Involvement

In Microsoft IT, many people contribute during change control tasks, but the ultimate responsibility for resolving issues rests with the Messaging Operations group, which oversees the performance of Exchange Server 2010 across all Microsoft IT environments. Especially in the pre-release environment, the Messaging Operations group has many opportunities to verify specific product functionality of builds.

The Messaging Operations group is the first in the line of contributors that submit change requests to the Exchange Server product group. After identifying an issue that requires changes to the Exchange Server product code, the Messaging Operations group works with developers to create a design change request in the developer database, which automatically becomes the responsibility of the Exchange Server product group and developers. Although others may create product updates and other changes, the Messaging Operations group is responsible for requesting, verifying, and approving builds and updates for rollout to production environments. The Messaging Operations group provides a disciplined process for introducing required changes into a complex IT environment with minimal disruption to ongoing operations. The Messaging Operations group remains closely aligned with the release management process, and manages the release and deployment of changes into the production environment.

Code and Product Improvement

Another aspect of change control processes involves product validation, which the Messaging Operations group performs in collaboration with the Exchange Server product group. During beta testing and pre-release partner deployments, the Exchange Server product group may decide to implement changes to the code based on tester and partner feedback. Although the Exchange Server product group controls the code, the Messaging Operations group is responsible for validating the functionality of changed features and proving enterprise readiness by using it in the pre-release production environment after the Exchange Server product group completes typical quality assurance tasks.

Operations Process Improvement

Within the incident management, problem handling, and change control processes that the Messaging Operations group performs, there is a constant effort to improve processes and thereby realize new levels of efficiency, scalability, repeatability, and cost savings.

User Feedback

A key source of process improvement comes from end users. Although the Helpdesk at Microsoft deals with first-tier support issues related to messaging, the Messaging Operations group participates in a satisfied-user initiative, which results in gathered feedback from users regarding functionality and performance in the messaging environment. The Messaging Operations group uses surveys to request feedback from users on satisfaction in a particular messaging service area, such as response times and availability, as well as to check users' general satisfaction. Some of the internal SLAs cover user satisfaction; meeting those SLAs and analyzing sources of dissatisfaction leads to an analysis of the people, processes and technology that are used to deliver messaging services. When this analysis results in the discovery of better processes or different combinations of people, processes, and technology, the Messaging Operations group makes appropriate changes to enact these improvements.

Interaction with Microsoft Customers

The Messaging Operations group actively shares its knowledge with the messaging community and uses this interaction as a method to gather feedback and use that knowledge to improve operations processes. There are many ways the Messaging Operations group shares knowledge. For example, members participate in industry conferences, conduct seminars and presentations, and share operational knowledge with MCS, which then uses it for specific customers. They also participate in partner programs to perform product validation during alpha and beta releases of Exchange Server.

Another way the Messaging Operations group engages with customers is through the IT Fellowship series. Customers can talk with Microsoft IT about IT operations and specific services, and discuss general best practices during this two-week program.

Real-life Operations Scenario

As previously mentioned, the Messaging Operations group interacts with many Microsoft IT service teams as well as the Exchange Server product group to accomplish its dual goals of meeting SLAs and providing product validation to customers. Because of the volume of activities and work, the group follows a structured model of operations based on MOF and ITIL. These frameworks provide guidance for messaging operations, yet operations architects must ultimately make decisions based on what works in real-world IT environments. As Figure 5 shows, the Messaging Operations group follows an orderly operations workflow with straightforward escalation paths and clear task assignments:

Figure 5. Messaging Operations group workflow

In its workflow, the Messaging Operations group defined the day-to-day tasks of operating the messaging environment, including responding to incidents, resolving incidents, determining the root cause of incidents, changing and improving the environment, and enforcing consistent hardware and software configurations across all servers. The following example demonstrates how all these processes fit together. It shows how the Messaging Operations group resolved a specific performance issue in the messaging environment.

Incident Response

An example of what these steps look like in a real scenario is outlined below. This anecdote is taken from a real scenario that the Operations team experienced and includes the initial steps taken.

The situation arose when a front-line monitoring operator from the Shared Services team noticed alerts that Microsoft System Center Operations Manager issued to the monitoring console. The alerts indicated that an Exchange server was experiencing several problems: a database had failed over, a disk had died, and there was a replication error for one of the databases on a mailbox server.

The front-line monitoring operator followed the steps in the troubleshooting guide that was attached to the alert. Because the failover had occurred, there was no service outage, and the operator immediately opened a ticket with the site operations team. Had there not been a database failover event, the troubleshooting guide included additional step-by-step instructions to log on to the server and run predefined scripts to attempt a remount of the database.

In this case, the database did fail over, and a ticket was opened with the site operations team directly. The site operations team is responsible for all datacenter activity and in this case responded to the ticket within one hour to replace the disk in the JBOD array. On replacement, the site operations team marked the ticket as resolved, and the front-line monitoring operator was able to move the incident to the next steps.

The next step in this case was to open a ticket with the Tier 2 Exchange Monitoring Team. This team is responsible for final resolution of disk failures such as this one. They received the ticket stating that a disk had failed, the database had automatically failed over, and the disk had subsequently been replaced. Their tasks now included rebuilding the mount point for the new disk and reseeding the database. The database reseed process can take up to eight hours to complete because of the large database sizes and limited JBOD I/O. To ensure that this long reseed time does not pose a further risk, it is monitored through completion.

This event resulted in a fast resolution and did not require further escalation. Had the database not failed over automatically, because of a problem such as the last log available being more than seven logs old, the issue would have needed to be escalated to Tier 2 immediately. Under certain scenarios, such as when only one copy of any single database is available, the issue would be escalated as a priority one, severity A issue.
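
The routing decisions in this scenario can be condensed into a short sketch. It is a simplification of the troubleshooting guide described above, not a copy of it; the function name, parameter names, and returned strings are hypothetical.

```python
def triage_disk_failure(failed_over, available_copies):
    """Return the next action for a disk failure on a mailbox server.

    failed_over: True if the database failed over automatically.
    available_copies: number of copies of the database still available.
    """
    if available_copies <= 1:
        # Only one copy left: no redundancy remains for this database.
        return "escalate to Tier 2 as priority one, severity A"
    if not failed_over:
        return "escalate to Tier 2 immediately"
    # Service is unaffected, so the disk can be replaced first.
    return "open site operations ticket, then Tier 2 ticket for reseed"


if __name__ == "__main__":
    # The scenario from the text: automatic failover with other copies healthy.
    print(triage_disk_failure(failed_over=True, available_copies=3))
```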

Escalation and Major Problem Resolution

With priority one, severity A issues, Tier 2 owns the problem through resolution. They are tasked with opening a bridge line and assembling a task force of resources to see the problem through to resolution. If the Tier 2 resources cannot resolve the problem themselves in the first 30 minutes, additional resources are brought on to assist them. The first place Tier 2 looks for assistance is CSS, through a support case. These are the same resources that customers escalate to, and CSS assists in the same way: CSS engineers join the bridge line with the Tier 2 support staff and work toward resolution.

Continued escalation occurs every 30 minutes, with status notifications going out to senior management every 60 minutes. Tier 2 has a defined SLA of one to four hours to resolve open issues and brings on the necessary resources through escalations in CSS or through internal escalations to the leads team or directly to the product team.
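The escalation cadence can be laid out as a simple timeline. The following Python sketch is a minimal illustration under stated assumptions: the function name and event labels are hypothetical, the first status notification is assumed to go out at the 60-minute mark, and the default window of 240 minutes corresponds to the upper bound of the one-to-four-hour SLA.

```python
def escalation_schedule(duration_minutes=240, escalate_every=30, status_every=60):
    """Return a sorted list of (minute, [events]) for an open incident.

    duration_minutes: length of the window to plan (assumed: 4-hour SLA limit).
    escalate_every:   interval between escalation checkpoints, in minutes.
    status_every:     interval between senior-management updates, in minutes.
    """
    schedule = {}
    # Escalation checkpoints: bring on additional resources if unresolved.
    for minute in range(escalate_every, duration_minutes + 1, escalate_every):
        schedule.setdefault(minute, []).append("escalate")
    # Status notifications to senior management.
    for minute in range(status_every, duration_minutes + 1, status_every):
        schedule.setdefault(minute, []).append("notify senior management")
    return sorted(schedule.items())


if __name__ == "__main__":
    for minute, events in escalation_schedule():
        print(f"T+{minute:>3} min: {', '.join(events)}")
```

Over a four-hour window, this produces eight escalation checkpoints, four of which coincide with a senior-management status notification.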

When issues prove unresolvable and are brought directly to the product team as bugs, the product team works with both the Messaging Team and CSS to open product bugs, in the same way that customers escalate product bugs to the product team. This ensures that the internal messaging team experiences the product in the same way that all other customers do, and that the Exchange product team gets feedback from both external customers and its internal customer about the features of Exchange and the methods for escalating bugs to the product group.

Best Practices

By adopting an operations framework based on industry standards such as MOF and ITIL, the Messaging Operations group was able to identify best practices that cover the daily tasks of operations and provide guidance to IT professionals for designing and operating an enterprise messaging environment based on Exchange Server 2010. Some of these best practices apply to all operations, such as adopting scalable and flexible processes, and some apply only to specific disciplines, such as incident management or change management.

The Messaging Operations group relies on the following best practices:

  • Use tools for tracking and management. The Exchange Messaging team uses many tools as part of its Exchange Server operations. For example, Microsoft System Center Operations Manager combined with a ticket-tracking database provides the capability to monitor the environment in real time, including configuration data, and track incidents from inception to resolution. The Messaging Operations group uses specialized tools, such as custom LOB applications for problem review and change implementation, in addition to custom scorecards for metrics and diagnostic and troubleshooting tools.
  • Implement review processes for each discipline. The Messaging Operations group specifically includes review processes for incident review, problem handling, configuration management, and change control. This enables the group to optimize its processes on an ongoing basis and foster a culture that embraces change and emphasizes improvements.
  • Perform monitoring centrally. Microsoft IT relies on a Shared Services team for all monitoring in order to gain the benefit of centralized monitoring without the cost of using a distinct team for each service. With Microsoft System Center Operations Manager, centralized monitoring provides event correlation, at-a-glance summaries of system status, and detailed reports and alerts.
  • Conduct regular reviews. The Messaging Operations group reviews incidents daily, change packages and problem-handling tickets weekly, and trending issues monthly.
  • Systematize resolution steps and transfer knowledge to front-line operators. With each new incident, the Messaging Operations group has an opportunity to improve the resolution guidance for front-line operators. The Messaging Operations group both reviews existing steps to improve guidance and documents resolution steps for new incidents from data gathered during problem handling and root-cause analysis processes.
  • Measure statistics. The Messaging Operations group measures not only overall SLAs, but also specific internal SLAs, which enables easier trend spotting and targeted performance improvement.


From the earliest days of operating the corporate messaging environment with Exchange Server to the present, Microsoft IT has continually increased its performance and availability targets and the scope of its goals for the messaging environment. By providing messaging services with a no-exception, end-to-end policy from a user's point of view and by achieving 99.99 percent availability, Microsoft IT consistently demonstrates to even the most demanding customers that Exchange Server technology can meet rigorous availability and performance requirements.
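To put the 99.99 percent figure in perspective, an availability percentage can be converted into a downtime budget. The following Python sketch shows the arithmetic only; the function name is hypothetical, and the measurement periods used (a 365-day year and a 30-day month) are assumptions, because the paper does not state the window over which availability is measured.

```python
def allowed_downtime_minutes(availability_percent, period_days):
    """Downtime budget, in minutes, for a given availability over a period."""
    total_minutes = period_days * 24 * 60
    unavailable_fraction = 1 - availability_percent / 100
    return total_minutes * unavailable_fraction


if __name__ == "__main__":
    # Assumed measurement windows: a 365-day year and a 30-day month.
    yearly = allowed_downtime_minutes(99.99, 365)
    monthly = allowed_downtime_minutes(99.99, 30)
    print(f"Per year:  {yearly:.2f} minutes")   # 52.56 minutes
    print(f"Per month: {monthly:.2f} minutes")  # 4.32 minutes
```

Under these assumptions, 99.99 percent availability leaves a budget of less than one hour of downtime per year.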

For the Messaging Operations group, delivering consistent results requires using the right mix of technology, people, and processes. Exchange Server 2010, Microsoft System Center Operations Manager, and other Microsoft server products provide a sound technological foundation upon which people and processes can rely. The modular and flexible product design of Exchange Server 2010, based on server roles, promotes technical specialization within the Exchange Messaging team. The MOF and ITIL frameworks are the bases for implementing clear communication paths, team hierarchies, and escalation procedures.

Among other things, Exchange Server 2010 helps the Messaging Operations group to decrease operational costs through improved management and administration tools, while offering new technologies such as Database Availability Groups (DAGs) and Client Access server arrays for increased performance and availability. These benefits are not specific to Microsoft IT; they are repeatable in other environments that follow proven operational processes based on industry-standard frameworks.

The Messaging Operations group continues to drive process improvement and knowledge sharing with other service teams as well as Microsoft customers. This includes all levels of IT operations: incident handling and response, problem handling, change management, configuration management, and even showing other IT organizations how to improve processes through reviews and improvement initiatives. The Exchange Messaging team is the originator of many change requests submitted to the Exchange Server product group for implementation in product updates, service packs, and future versions of Exchange Server. Internal experiences and customer feedback are the main sources of these requests. The close collaboration between the Exchange Messaging team and the Exchange Server product group ensures that Exchange Server technology continues to meet the present and future needs of real-world customers.

For More Information

For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:



The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.


Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2010 Microsoft Corporation. All rights reserved.

Microsoft, Windows Server 2008 R2, Active Directory Domain Services, Microsoft Exchange Server 2010, Microsoft System Center Configuration Manager, and Microsoft Outlook 2010 are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

All other trademarks are property of their respective owners.
