Maintaining High Availability for Microsoft.com

Published: March 2010

The following content may no longer reflect Microsoft’s current position or infrastructure. This content should be viewed as reference documentation only, to inform IT business decisions within your own company or organization.

The Microsoft corporate Web site, Microsoft.com, is one of the largest and most heavily visited sites on the Internet, yet it maintains consistently high availability ratings. The team that operates the site meets these demands through a combination of carefully planned infrastructure; collaboration with other teams; and use of technology for maintenance, monitoring, and change management.

Intended Audience

IT managers, Web site managers, IT pros, business decision makers responsible for Web strategy, and chief information officers.

Products & Technologies

  • Microsoft Windows Server 2008 R2 Enterprise
  • Internet Information Services 7.5
  • Web Capacity Analysis Tool
  • TinyGet
  • Microsoft SQL Server 2008
  • Log shipping
  • Database mirroring
  • Microsoft System Center Configuration Manager R2
  • Microsoft System Center Operations Manager 2007
  • Clustered servers

Introduction

During the past eight years, Microsoft.com has achieved one of the highest rankings on the Internet in terms of site availability as measured by Keynote Systems Inc., an independent third party. According to the Keynote reports, Microsoft.com has been available more than 99.8 percent of the time for the past five consecutive years, and more than 99.9 percent of the time for the past two years. The site generates more than 1.2 billion hits per day from more than 57 million unique Internet Protocol (IP) addresses. This traffic generates 200 million daily page views, averages 30,000 Hypertext Transfer Protocol (HTTP) requests per second, and results in an average of 750,000 concurrent client connections.
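To put these availability percentages in perspective, the following minimal sketch converts an availability figure into the downtime it permits over a year. The function name and the 365-day year are assumptions made for illustration; they are not part of the Keynote measurement methodology.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year of 8,760 hours


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the hours per year a site may be unavailable at the given availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return HOURS_PER_YEAR * unavailable_fraction


for pct in (99.8, 99.9):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% availability allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99.8 percent corresponds to about 17.5 hours of downtime per year, and 99.9 percent to about 8.8 hours.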

The Microsoft.com Operations (MSCOM Ops) team within Microsoft Information Technology (Microsoft IT) operates more than 300 production servers that host approximately 900 Web applications. Based on Internet Information Services (IIS) and Microsoft® SQL Server® database software, the infrastructure design takes advantage of newly released tools and features in support of the team's goal to be an early adopter of Microsoft technologies.

This article describes how MSCOM Ops identifies and mitigates potential points of failure to deliver continuous availability for Microsoft.com—even while adopting new Microsoft technologies in the production environment. This article includes best practices developed over years of operating a highly available, large-scale, and continuously performing Web infrastructure. Any organization that seeks to host a highly available Web presence on Microsoft technologies may be able to use these best practices. The best practices address:

  • How to identify and address availability issues through building in redundancy and evaluating the need to design solutions to geographic-segmentation challenges.
  • Process guidance, including suggestions about when, during the software development life cycle (SDLC), operations and applications engineers can work together to support delivering high availability.
  • General guidance on planning for, building, and using proper monitoring based on understanding site traffic.

Infrastructure Architecture

MSCOM Ops supports Microsoft.com servers in four Internet data centers, though the Microsoft.com Operations Engineers do not directly manage the physical data center environment. The team's responsibility for administrative control and for the server infrastructure begins as soon as the servers are physically installed in the data center.

The Microsoft.com infrastructure supports hundreds of databases and Web applications, all of which must remain highly available for customers, partners, and content providers from around the company. Each type of user for the Microsoft corporate Web site needs access to these properties from around the world 24 hours a day.

Note: A property, for the purposes of discussion in this article, includes the front end, middle tier, back end, and network that host all the infrastructure and code that make up a Web entity.

These applications run on IIS 7.5 in 12 application pools on 128 Web servers. These applications rely on 10 servers predominantly running Microsoft SQL Server 2008; a few are still being upgraded from Microsoft SQL Server 2005.

Note: For more information about SQL Server 2008, go to http://www.microsoft.com/sqlserver2008.

Design Considerations

The key to providing high availability is to build redundancy into the architecture with the specific goal of minimizing the potential for a single point of failure. MSCOM Ops built in redundancy not only within each data center but also across data centers. In line with standard industry approaches, MSCOM Ops has taken the following measures to cope with network or regional power outages, data center accidents, and other system infrastructure issues:

  • The strategic use of access control lists (ACLs) for access control of router hardware
  • Global load balancing and health checking obtained from content delivery network (CDN) partners who maintain parts of the infrastructure in distributed geographic locations
  • Both hardware-based and software-based local load balancing

Figure 1 shows a high-level, conceptual overview of the MSCOM Ops infrastructure design. Because the diagram is conceptual, it depicts only two data centers to illustrate key points of redundancy built into the network and system.

Figure 1. Overview of MSCOM Ops infrastructure design

The following key design considerations enable MSCOM Ops to deliver a high level of availability.

Scalable Units

A scalable unit consists of a grouping of a defined number of Web servers and of SQL Server–based servers designed to meet a specified load and performance profile. The process of identifying a scalable unit includes conducting performance and load tests with the goal of determining the optimum capacity of the unit. After completing this testing process, MSCOM Ops uses the scalable unit with the associated benchmark metrics as an integral component to planning, building, and maintaining a flexible and scalable operating environment for MSCOM Ops hosted Web properties.
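The capacity-planning use of a scalable unit can be sketched as follows. This is an illustration only: the unit size, the benchmarked capacity, and the headroom value are hypothetical, and the 30,000 requests per second is the average cited earlier in this article, used here as a stand-in for a planning target.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalableUnit:
    """A tested group of servers with a measured capacity (all values hypothetical)."""
    web_servers: int
    sql_servers: int
    max_requests_per_second: float  # benchmark result from load testing


def units_required(unit: ScalableUnit, target_rps: float, headroom: float = 0.25) -> int:
    """Number of units needed to serve target_rps while keeping spare headroom.

    headroom=0.25 means each unit is planned to run at no more than
    75 percent of its benchmarked capacity.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = unit.max_requests_per_second * (1 - headroom)
    return math.ceil(target_rps / usable)


unit = ScalableUnit(web_servers=8, sql_servers=2, max_requests_per_second=4000)
count = units_required(unit, target_rps=30000)
print(count, "units, or", count * unit.web_servers, "Web servers")
```

Because the unit has already been benchmarked, adding capacity becomes a matter of deploying more identical units rather than re-testing each new server arrangement.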

Configuration Standards for Servers and Clusters

After MSCOM Ops identifies scalable units, it applies this construct within the context of the segmented infrastructure design to develop configuration standards for servers and clusters. The team keeps these standards as identical as possible, considering the business and technical requirements of the Web applications being hosted. Enforcing use of these standard configurations results in the following benefits:

  • MSCOM Ops can more readily identify and resolve issues that occur at the infrastructure level. Because Web applications are designed to work with standard server configurations, determining whether a particular incident is an application or infrastructure issue requires less investigation than if highly specialized server configurations were the norm.
  • MSCOM Ops can more easily repurpose clusters or servers if additional capacity is needed, even in the case of supporting a different property than the server or cluster was originally deployed for.

Segmentation of the Infrastructure to Isolate and Help Protect Data Storage

In the MSCOM Ops infrastructure, database servers do not face the internet directly. By keeping them on the back end of the architecture, MSCOM Ops reduces the risk of malicious attack and thereby increases the overall availability of the SQL Server–based servers that act as databases for the hundreds of applications hosted on Microsoft.com.

Redundancy Within a Data Center

The MSCOM Ops infrastructure design includes redundant hardware for areas identified as having a risk of failure due to occasional spikes in demand or because they have a higher risk of malicious attack. The infrastructure includes multiple routers, local load balancers, and front-end servers that host Web applications. Each data center contains a read/write and a read-only SQL Server–based server cluster to allow for failover if the primary cluster must be taken offline. In addition, MSCOM Ops uses mirroring as appropriate based on the performance and availability requirements of a particular application database.

Note: To learn about the types of replication available and for guidance about using this technology to establish and maintain high availability in an enterprise infrastructure, see "SQL Server 2008 Replication" at http://msdn.microsoft.com/en-us/library/ms151198.aspx.

Hardware for Load Balancing of Dynamic Content and Application Code

MSCOM Ops currently uses hardware load balancers to provide local load balancing for Web servers in data centers. These devices facilitate high availability in three ways:

  • The hardware load balancers provide additional monitors that aid in determining whether the root cause of an issue is occurring at the application level or the operating system level. For example, in a shared hosting environment like that used for Microsoft.com, if one Web application has an issue, taking the server or cluster of servers that host it offline until the issue is resolved will disrupt the availability of all the other applications that are hosted on the same shared infrastructure. Knowing which alerts relate to the health of the servers—as opposed to those that relate to the health of the applications being hosted on the servers—facilitates the ability to more quickly remove unhealthy servers from rotation (stop them from taking live traffic) without negatively affecting overall customer experience.
  • MSCOM Ops can remove individual servers from rotation for updates, maintenance, or upgrades, conduct preliminary operational tests on them, and then put them back in rotation without taking an entire cluster offline.
  • When engaging in the adoption of new technology, MSCOM Ops can remove a single server from rotation to use it in testing. For example, the team can deploy a beta build of the next operating system to one server without affecting an entire cluster. The team can send a very small amount of load to that server initially, and then incrementally increase the load to normal and beyond to test the new operating system or components.
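The practice of removing one server from rotation and then ramping its load can be modeled with per-server weights. The sketch below is a toy model, not the configuration interface of any particular load balancer; the class, the server names, and the weight values are all hypothetical.

```python
import random


class LocalPool:
    """Toy model of a load-balanced pool with per-server weights."""

    def __init__(self, servers):
        # Every server starts in rotation with a normal weight of 100.
        self.weights = {name: 100 for name in servers}

    def remove_from_rotation(self, name):
        self.weights[name] = 0  # weight 0 means the server takes no live traffic

    def set_weight(self, name, weight):
        if weight < 0:
            raise ValueError("weight must be non-negative")
        self.weights[name] = weight

    def traffic_share(self, name):
        """Fraction of requests the named server is expected to receive."""
        total = sum(self.weights.values())
        return self.weights[name] / total if total else 0.0

    def pick(self, rng=random):
        """Choose a server for one request, proportionally to its weight."""
        names = [n for n, w in self.weights.items() if w > 0]
        return rng.choices(names, weights=[self.weights[n] for n in names])[0]


pool = LocalPool(["web01", "web02", "web03", "web04"])
pool.remove_from_rotation("web04")    # deploy the beta build while out of rotation
for weight in (5, 25, 50, 100, 150):  # ramp from a trickle to above-normal load
    pool.set_weight("web04", weight)
    share = pool.traffic_share("web04")
    print(f"weight {weight:>3}: web04 receives {share:.1%} of requests")
```

The other servers keep serving traffic throughout, which is why the cluster as a whole never has to go offline.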

Redundancy Across Data Centers

Various tactics support providing redundancy across data centers to help prevent a single point of infrastructure failure. To prepare for the possibility that a data center might become unavailable and to enable a speedier recovery if such an incident occurs, MSCOM Ops has optimized the architecture design to facilitate:

  • Content and application code replication.
  • SQL log shipping from one data center to a backup cluster in another.

Note: To learn how to send transaction logs from one SQL Server–based server to another, see "Log Shipping" at http://msdn.microsoft.com/en-us/library/bb895393.aspx.

Anomalous Traffic

The architecture enables MSCOM Ops to apply multiple tactics to help protect the hosting infrastructure from the effects of anomalous traffic:

  • Devices that detect anomalous traffic sit at the top of the data center infrastructure and discard suspicious network traffic before it reaches the data center. These devices also provide data used in analyzing traffic patterns and improving filtering. Eliminating suspicious traffic before it affects Web servers is a key element to providing high availability.
  • The hardware load balancers use synchronize (SYN) cookies to manage a high volume of Transmission Control Protocol (TCP) connections with low resource impact on the hardware. This implementation means that the validation process for TCP connections requires fewer infrastructure resources.
  • The hardware load balancers also enable application-layer filtering with the ability to drop or redirect traffic based on numerous characteristics within the request.
  • On the Web servers, MSCOM Ops uses URL Rewrite with pattern matching to isolate anomalous traffic. For more information about URL Rewrite, see "IIS URL Rewrite Module" at http://technet.microsoft.com/en-us/library/ee215194(WS.10).aspx.
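The idea of isolating anomalous requests by pattern matching can be illustrated in a few lines. The sketch below is not URL Rewrite configuration; it only shows the matching concept, and every pattern and URL in it is a hypothetical example rather than a rule used on Microsoft.com.

```python
import re

# Hypothetical patterns; real rules would come from analysis of observed traffic.
BLOCK_PATTERNS = [
    re.compile(r"\.\./"),                 # path traversal attempts
    re.compile(r"(?i)<script\b"),         # script injection in the URL
    re.compile(r"^/search\?q=.{512,}"),   # abnormally long query value
]


def classify(url: str) -> str:
    """Return 'isolate' if the URL matches a known-bad pattern, else 'serve'."""
    for pattern in BLOCK_PATTERNS:
        if pattern.search(url):
            return "isolate"
    return "serve"


for url in ("/en-us/default.aspx", "/downloads/../../web.config", "/search?q=" + "A" * 600):
    print(classify(url), url[:40])
```

Keeping the patterns in a single list makes it straightforward to add a rule when traffic analysis identifies a new anomalous signature.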

Access Layer Networking

Access to Web sites is limited through router ACLs that leave only TCP ports 80 and 443 open. If a particular property or application requires access to other services, such as Simple Mail Transfer Protocol (SMTP), MSCOM Ops requests the needed port access from the networking group, which opens the specific ports for those services on separate, isolated virtual local area networks (VLANs). Segmenting services this way, a practice closely related to access layer networking, mitigates the potential harm from distributed denial of service attacks and therefore supports a high level of availability.

Content Delivery Network Services

MSCOM Ops identified, as a component of its strategy for delivering high availability and performance, the need to address certain geographically based challenges with its hosting infrastructure. Overcoming some of these challenges would have required either building additional data centers or placing servers in certain strategic locations. The team determined that both solutions would be too expensive to implement in the long term.

As a result, the team brokered agreements with Content Delivery Network (CDN) partners to use caching services in strategic locations and to perform certain ongoing technical services. These types of services may not be necessary or appropriate for all large-scale Web sites. However, using services available through CDNs enables MSCOM Ops to optimize the architecture to perform well under variable loads in disparate geographical areas.

The caching servers that CDN partners own and host help provide high availability and performance through application of the following practices:

  • Edge caching. The CDNs cache static content, such as image and style-sheet files, on globally dispersed caching servers. This technique speeds delivery of content to the user by delivering the content from a location that is generally in close proximity to the user. Caching static content results in better page load times for the end user.
  • Global load balancing. This practice, in conjunction with global health checking described below, produces better Web page delivery for users by delivering requested content from a server with closer geographic proximity to the requestor based on the location of the requesting IP address. It also allows clusters to be removed from rotation to be patched without disrupting availability of the site to the end user. MSCOM Ops uses this service to shape traffic to particular clusters by sending more or less traffic based on hardware capabilities, capacity testing, and testing for new technology adoption, as required.
  • Global health checking. To determine whether a particular cluster of servers should receive end-user requests, the CDNs request a specific test file on front-end Web servers. If failures are detected (based on receipt of an HTTP status code other than 200), the errant cluster is removed from rotation until the error has been investigated and resolved. At that point, the testing will succeed and the cluster will automatically go back into rotation.
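The health-checking rule described above (in rotation only while the test file returns HTTP 200) can be sketched as follows. The probe is passed in as a function so that the example runs without network access; the cluster names, URLs, and simulated status codes are hypothetical, and the actual CDN implementation is not described in this article.

```python
from typing import Callable, Dict


def evaluate_rotation(
    probe_urls: Dict[str, str],
    fetch_status: Callable[[str], int],
) -> Dict[str, bool]:
    """Map each cluster to True (in rotation) or False (removed).

    A cluster is in rotation only while its test file returns HTTP 200.
    Because the result is recomputed on every pass, a repaired cluster
    returns to rotation automatically.
    """
    return {cluster: fetch_status(url) == 200 for cluster, url in probe_urls.items()}


# Simulated probe results standing in for real HTTP requests.
statuses = {"http://dc1.example/health.htm": 200, "http://dc2.example/health.htm": 503}
probes = {"dc1": "http://dc1.example/health.htm", "dc2": "http://dc2.example/health.htm"}

print(evaluate_rotation(probes, statuses.get))   # dc2 is out of rotation
statuses["http://dc2.example/health.htm"] = 200  # the error has been resolved
print(evaluate_rotation(probes, statuses.get))   # dc2 is back in rotation
```

A production checker would typically also apply timeouts and require several consecutive failures before removing a cluster; those refinements are omitted here.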

End-to-End Monitoring

MSCOM Ops currently uses one management group across four domains in Microsoft System Center Operations Manager 2007 to collect event and performance data for proactive monitoring and alerting. Microsoft.com is one of several properties monitored by this one management group, which contains eight management servers that function across six gateways. These management servers have more than 2,500 agents and collect approximately 11,000 alerts per month and as many as 189 million performance counter samples per day. The alerts also include those collected from 45 agents that Keynote™ hosts. The automation currently available in Microsoft System Center Configuration Manager 2007 and Operations Manager enables the team to see when a new computer has been added to the operating environment and to detect which operating system and platform software are on it. For more information about System Center solutions, go to the following product Web pages:

The MSCOM Ops monitoring system also collects SQL Server storage performance data and alerts based on SQL Server 2008 agents and on application-level alerts. Alerts are collected as they occur, and performance data is collected every 90 seconds.
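As a rough sense of scale, the figures above can be combined into a per-cycle estimate. The sketch assumes that all performance samples arrive on the 90-second schedule and are spread evenly across 2,500 agents; both are simplifying assumptions, and the inputs are the upper-bound figures quoted in this section.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_cycle(samples_per_day: int, interval_seconds: int) -> float:
    """Average number of samples gathered in one collection cycle."""
    cycles_per_day = SECONDS_PER_DAY / interval_seconds
    return samples_per_day / cycles_per_day


per_cycle = samples_per_cycle(189_000_000, 90)
per_agent = per_cycle / 2500  # assumption: load spread evenly over 2,500 agents
print(f"{per_cycle:,.0f} samples per cycle, about {per_agent:.1f} per agent")
```

Estimates like this are useful when sizing management servers and the databases that store the collected counters.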

Collaboration with Application Development Teams

MSCOM Ops collaborates with application development teams to identify critical application functions and methods that need monitoring. This collaboration enables developers to build actionable alert triggers into applications that MSCOM Ops can then monitor. Combining this application development best practice with a mature monitoring system helps identify clearer support processes and enables support engineers to quickly address incidents—in many cases, before customers become aware that an incident occurred.

In addition to working with application developers to provide clarification and guidance on building actionable event alerts into applications, the architects and engineers in MSCOM Ops communicate with developers about the implications of designing an application for the Microsoft.com hosting infrastructure. In particular, designs must take into account the synchronization of application data and code across geographically distributed clusters in multiple data centers. Operations architects interact with application engineers at key phases of the software development life cycle (SDLC), as shown in the following table.

Table 1. How MSCOM Ops Collaborates with Application Development Teams

SDLC phase: Requirements and design
Collaboration tasks:
  • Operational feasibility analysis, including reviewing performance, monitoring, and availability requirements.
  • Architectural reviews, including capacity planning and hardware evaluation.
  • Operational input on initial test plans.

SDLC phase: Implementation and verification
Collaboration tasks:
  • Participation in application and database code reviews, including providing input on monitoring requirements. Reviews in this phase include considering operations architecture and performance results.

SDLC phase: Release and production
Collaboration tasks:
  • Collaboration in final application testing, deployment, and verification of system health in production.
  • Creation of support documentation and procedures for escalating issues to application and operations engineers.

During these interactions, communication flows both ways as MSCOM Ops architects and engineers learn about the operational needs of a particular application. Unearthing these requirements early in the design process allows developers and operations engineers to more effectively work together to build in appropriate monitoring, plan for and test performance and availability requirements, and establish appropriate support processes for when the application is released.

Note: For planning and deployment guidance, go to the "Windows Web Platform Hosting Guidance" page at http://www.microsoft.com/hosting/solutions/windowsserverhostingguidance.mspx.

Adoption and Availability of New Technology

Early-adoption efforts allow MSCOM Ops to provide valuable feedback to product groups while sharing best practices with Microsoft customers. Throughout these efforts, MSCOM Ops benefits from the scalability intentionally included in the architecture of its hosting infrastructure. The flexibility inherent in this design enables the team to deliver to the same high availability and performance requirements while evaluating beta and pre-release Microsoft technologies. This evaluation serves the dual purpose of demonstrating the benefits of new tools and features while identifying any remaining real-world issues through use in the production environment.

Many of the design considerations discussed previously lay the foundation for maintaining high availability even during deployment and testing of pre-release versions of Microsoft technologies. Building redundancy into the infrastructure from the physical layer up through the presentation and application layers, including use of database mirroring and log shipping, allows MSCOM Ops to make changes without ever having to take all servers offline at once. MSCOM Ops applies the same basic process to making operating systems and server software updates as it does to patching those systems. These processes take advantage of local and global load balancing to facilitate load testing while managing operational risks. The team uses scripts as much as possible to mitigate the risk of manually introducing errors to migrations or updates.

Using Tools

MSCOM Ops uses tools, such as the Web Capacity Analysis Tool (WCAT), for analyzing performance and for monitoring server infrastructure changes before upgrading production servers to new versions of the Windows® operating system, IIS, or SQL Server. Using these tools enables the team to adapt its already mature operations processes to help ensure that availability, scalability, and security requirements continue to be met throughout MSCOM Ops hosting environments.

For example, when updates to front-end Web servers are necessary, MSCOM Ops takes the following steps:

  1. Removes a server from rotation in a local cluster.
  2. Applies the software update to the server.
  3. Tests the server by using WCAT and TinyGet to replicate production loads by replaying IIS logs of live Web traffic while the server is still out of rotation.
  4. Further investigates and resolves issues if it detects significant errors in the HTTP status codes.
  5. Puts the server back into rotation in the production environment and closely monitors the server for a specified period while it takes real traffic.
  6. Reviews any issues with the server's performance and confirms that the server is stable.
  7. Removes the entire cluster from rotation at the global level while it updates the remaining servers.
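The steps above can be sketched as a small orchestration routine. This is an illustrative outline rather than the team's actual scripts: the update, test, and monitoring actions are passed in as hypothetical callables, and returning the cluster to global rotation at the end is an assumption, because the steps do not state it explicitly.

```python
from typing import Callable, List


def rolling_update(
    servers: List[str],
    apply_update: Callable[[str], None],
    replay_test_passes: Callable[[str], bool],
    stable_in_production: Callable[[str], bool],
) -> List[str]:
    """Update one server first, then the rest; return a log of the actions taken."""
    if not servers:
        raise ValueError("the cluster must contain at least one server")
    log = []
    first, rest = servers[0], servers[1:]

    log.append(f"remove {first} from local rotation")        # step 1
    apply_update(first)                                      # step 2
    log.append(f"update {first}")
    if not replay_test_passes(first):                        # steps 3 and 4
        log.append(f"hold {first} out of rotation for investigation")
        return log
    log.append(f"return {first} to rotation and monitor")    # step 5
    if not stable_in_production(first):                      # step 6
        log.append(f"remove {first} again for investigation")
        return log
    log.append("remove cluster from global rotation")        # step 7
    for server in rest:
        apply_update(server)
        log.append(f"update {server}")
    log.append("return cluster to global rotation")          # assumed final step
    return log


updated = []
for entry in rolling_update(["web01", "web02", "web03"], updated.append,
                            lambda s: True, lambda s: True):
    print(entry)
```

The early returns mean that a failed test on the first server stops the rollout before any other server is touched.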

Note: For more information about using these tools, see "Validating Content on Early Releases" available at http://technet.microsoft.com/en-us/library/cc656685.aspx.
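Step 3 above relies on replaying IIS logs of live traffic. As a minimal sketch of the preparation involved, the following code extracts the request method and URL from log lines in the W3C extended format that IIS writes. It does not invoke WCAT or TinyGet, whose input formats are not covered here, and the sample log lines are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def replay_targets(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (method, url) pairs from W3C extended log lines."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The #Fields directive names the columns of the records that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip other directives, blank lines, and headerless data
        record = dict(zip(fields, line.split()))
        stem = record.get("cs-uri-stem")
        if stem is None:
            continue
        query = record.get("cs-uri-query", "-")  # "-" marks an empty field
        url = stem if query == "-" else f"{stem}?{query}"
        yield record.get("cs-method", "GET"), url


sample = [
    "#Software: Microsoft Internet Information Services 7.5",
    "#Fields: date time cs-method cs-uri-stem cs-uri-query sc-status",
    "2010-03-01 00:00:01 GET /en-us/default.aspx - 200",
    "2010-03-01 00:00:02 GET /search.aspx q=iis 200",
]
print(list(replay_targets(sample)))
```

Reading the column names from the `#Fields` directive, rather than assuming fixed positions, keeps the parser working when servers log different field sets.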

MSCOM Ops applies updates to database servers at the cluster level. The offline mirror database is the first to receive such software changes. After it has been unit tested and signed off by the application owner's test team, it is added back to the cluster and resynchronized. What was originally the offline mirror then becomes the principal server running SQL Server, which is closely monitored for a specified period. If issues arise, the team fails over to the not-yet-patched mirror to investigate and resolve the incidents. If no issues occur while the updated database cluster is taking production traffic, the server that is now the offline mirror (the original principal) goes through the same update process before being added back to the cluster.

These processes enable MSCOM Ops to incrementally increase traffic to changed servers and clusters. Stressing the system with live production traffic exposes application and infrastructure issues that are difficult to identify in lab environments. The ability to quickly identify and resolve such issues before applying an infrastructure change to all servers and clusters enables the team to continue to deliver high availability.

Validating Availability During Platform Changes

The most recent example of new technology adoption occurred in January 2009, when MSCOM Ops rolled out a beta version of the Windows Server® 2008 R2 operating system and IIS 7.5 into production. This platform change was more than 10 months before the official October 22, 2009, launch date of these Microsoft server technologies. During this early-adoption period, the team logged more than 25 bugs that helped the Windows Server 2008 R2 product team improve its final product. On August 24, 2009, MSCOM Ops had all production Internet-facing Web servers that host Microsoft.com running the release to manufacturing (RTM) build of Windows Server 2008 R2 (Build 7600).

Note: For more information about Windows Server 2008 R2, go to http://www.microsoft.com/windowsserver2008. For more information about IIS 7.5 in Windows Server 2008 R2, go to http://www.microsoft.com/windowsserver2008/en/us/iis-r2.aspx.

Because the goal behind the operations of the MSCOM Ops team is availability, downtime is simply not an option. In fact, the team does not schedule downtime at all, even for the process of adopting new technologies in the production environment. Availability reporting and tracking are important elements of meeting these high-availability requirements.

Figure 2 shows how MSCOM Ops has sustained—and even improved—this availability while adopting new versions of Windows in the production environment.

Figure 2. MSCOM Ops product adoption and site availability, 2002–2010

Note: Keynote™, an independent third party, measured site availability.

Best Practices

The MSCOM Ops team has identified the following best practices for maintaining availability of a complex, high-traffic Web site and its associated properties.

Include Redundancy for Hardware and Software in the Infrastructure

Planning for high availability for a high-traffic site that contains hundreds or thousands of Web properties, like Microsoft.com, requires infrastructure architects to think about what can and will go wrong. Approaching this problem from the perspective of planning for redundancy both in and across the data centers that contain the infrastructure enables a team to foster greater resiliency. The process of determining how to configure server clusters and which types of clustering to use must take into account uptime and performance requirements.

Evaluate and Address Geographic-Segmentation Issues

Having a clear definition of what site availability means and an understanding of where visitors come from is crucial to deciding whether to use techniques like edge caching, global load balancing, or global health checking to provide high availability for a Web property. Establishing a baseline understanding of whom the site serves and how to track where users are is critical to making appropriate investments in hosting infrastructure.

Collaborate as Needed with Application Development Teams

Having a highly available Web hosting infrastructure requires application developers to think about developing applications to certain system requirements. For example, application developers will likely need to think about the application running at multiple data centers and the impact that this has on the application's architecture. Consultation with operations engineers at key points in the SDLC can help application developers identify and validate application performance benchmarks. Including infrastructure engineers in reviewing application designs in the planning and pre-deployment phases of building an application helps ensure adherence to security, availability, and performance standards.

Create and Use Scripts for Building and Patching Servers

The ability to automate routine maintenance on servers stems from using a standard approach to the architecture design. When an operations team can ensure that server configurations are as identical as possible throughout the various environments, system engineers can perform work more efficiently. Avoiding inconsistencies in server settings also mitigates the risk that manual changes will introduce new issues. Using standard operating system images to build baseline servers makes it easier to take advantage of the automation available through Windows Server Update Services and System Center Configuration Manager.

Because the deployment of a new operating system build or a significant update to a platform system typically requires a restart, operations teams must consider whether downtime is tolerable. If high availability is the goal, following a process that uses both scripts and automation reduces the amount of time that a server or a cluster must be offline. Scripts are best used to assist in the process of removing servers from rotation in a production environment and in verifying that the newly deployed software does not introduce new performance or availability issues.

Implement and Maintain Proper Monitoring

Proper monitoring should inform the team about application or system errors in as close to real time as possible. Collaboration between operations engineers and application development teams is important because it helps ensure that actionable alerts are built into applications. Actionable alerts, in turn, help operations engineers to determine the root cause of the issue and to resolve it. Monitoring all layers of the hosting infrastructure, including network, operating system, and platform (for example, IIS and SQL Server), for event and performance issues is as important as monitoring the hosted applications. An organization should designate team members who are dedicated to overseeing monitoring systems and who know how to properly route trouble tickets if they cannot resolve issues themselves.

Understand Anomalous Traffic

The first step toward understanding abnormalities is having a good understanding of normal, peak, and low traffic for a specified period. Collecting baseline data is crucial to succeeding in this area. MSCOM Ops uses various sources for collecting information about traffic levels, including system, network, and application performance counters; IIS and HTTP.SYS log data; and event log messages. Each has a different place in the analysis of current traffic, and each is used in concert with the others. With a baseline for routine traffic from each of these sources in place, analysis and the resulting recommendations for configuration changes or adjustments to application code, whichever is needed, will follow more rapidly.
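One simple way to use a baseline is to flag measurements that fall far outside it. The sketch below applies a standard-deviation threshold to a single counter; the technique, the threshold of three standard deviations, and the sample values are illustrative assumptions, not the analysis method that MSCOM Ops uses.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(
    baseline: Sequence[float],
    current: Sequence[float],
    threshold: float = 3.0,
) -> List[int]:
    """Return indexes in `current` that deviate from the baseline mean by more
    than `threshold` standard deviations. The baseline needs at least two samples.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as anomalous.
        return [i for i, value in enumerate(current) if value != mu]
    return [i for i, value in enumerate(current) if abs(value - mu) / sigma > threshold]


# Hypothetical requests-per-second samples.
baseline_rps = [29500, 30200, 30100, 29800, 30400, 30000, 29900, 30100]
current_rps = [30050, 29700, 41000, 30200]
print(find_anomalies(baseline_rps, current_rps))
```

In practice, separate baselines for normal, peak, and low periods, and for each data source, would give fewer false alarms than a single mean.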

Note: For more details about establishing a baseline and analyzing attack traffic, see "Inside Microsoft.com: Analyzing Denial of Service Attacks" at http://technet.microsoft.com/en-us/magazine/2006.03.insidemscom.aspx.

Be Methodical in Rolling Out Routine Changes

Even before an application is released, the opportunity exists to prepare for performing maintenance activities. Having application developers and operations engineers collaborate during the SDLC can result in the creation of acceptance testing scenarios that can be applied at the first release and during future routine maintenance cycles. When it is time to perform maintenance, an organization can then take a particular server cluster offline, apply updates or application rollouts, test the servers, and then expose them to traffic. After making sure that everything is functional and that the servers have passed acceptance tests, the organization can put the cluster back into rotation and move on to the next cluster.

Apply Rigorous Process When Engaged in Adoption of New Technology

Every IT operations team that provides the hosting infrastructure for Web properties for a particular business entity faces similar challenges over time. Sooner or later, infrastructure and application changes must occur. The needs of business units, product teams, or key partners largely determine the type and frequency of a change. Assembling the key stakeholders early when an organization is considering an infrastructure change for any reason is crucial to meeting uptime and performance requirements throughout the change process. Throughout the process, maintaining communication and involving those stakeholders in the validation of functionality and stability of the existing or new properties on the updated infrastructure provides a greater chance of success when changes are ultimately applied in the production hosting environment.

Restrict or Prohibit Maintenance Activities During Major Events

A big event, such as a product launch, is no time to lose access to a high-traffic Web site. Performing routine maintenance during major online events, such as a marketing update to more than one Web property in support of a new release, presents an unnecessary risk for downtime. Fostering better cross-team communication allows information about maintenance windows to be clearly disseminated when major changes are scheduled. Having a launch manager within the operations team who is responsible for coordinating communication about major Web releases directly supports the ability to maintain high availability because business units and monitoring teams alike are aware of when changes are happening.
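
One way a launch manager's event calendar could be used to enforce this practice is to check each proposed maintenance window against it. The following Python sketch is an assumption-laden illustration, not a description of MSCOM Ops tooling: the event names, dates, and function names are hypothetical.

```python
from datetime import datetime

def overlaps(start_a, end_a, start_b, end_b):
    """Return True when two half-open time ranges [start, end) intersect."""
    return start_a < end_b and start_b < end_a

def blocking_events(window_start, window_end, events):
    """List the names of major events that overlap a proposed maintenance window."""
    return [
        name
        for name, start, end in events
        if overlaps(window_start, window_end, start, end)
    ]

# Hypothetical calendar of major events maintained by the launch manager.
events = [
    ("Product launch", datetime(2010, 3, 10, 8), datetime(2010, 3, 12, 20)),
    ("Marketing update", datetime(2010, 3, 20, 0), datetime(2010, 3, 21, 0)),
]

# A proposed window that collides with the product launch.
print(blocking_events(datetime(2010, 3, 11, 2), datetime(2010, 3, 11, 4), events))
# A proposed window that is clear of all events.
print(blocking_events(datetime(2010, 3, 15, 2), datetime(2010, 3, 15, 4), events))
```

A non-empty result would signal that the maintenance should be rescheduled or explicitly approved before it proceeds.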

Conclusion

MSCOM Ops operates and manages the infrastructure for Microsoft.com through strategic analysis of key architectural considerations and rigorous application of best practices for Web hosting. Other organizations may be able to use these best practices while adopting new technology to improve the availability of their hosting environments.

Attempting to eliminate any single point of failure in the Microsoft.com hosting infrastructure has led the MSCOM Ops team to build redundancy into the architecture design in multiple ways. By imagining what happens if a cluster or an entire data center becomes unexpectedly unavailable, the team has identified what partnerships and hardware to use in addition to continuing to evaluate and adopt new clustering configurations for front-end and back-end servers.

Careful consideration of performance expectations has been crucial when the team defines scalable units for the Microsoft.com hosting infrastructure. Applying nearly identical server and cluster configurations facilitates faster resolution of production issues. These same standards make it easier to apply changes through scripts. Reducing the chance of manual errors in the routine maintenance process mitigates the risk of unplanned downtime.

MSCOM Ops maintains strong partnerships with product development teams and application engineers throughout Microsoft. Collaboration with these resources at key phases of the SDLC enables operations engineers to offer guidance on building applications for the Microsoft.com hosting infrastructure. Understanding the performance and availability requirements early in this process enables the team to provide better input about inclusion of actionable alerts. Collaborating with application engineers also results in the creation of criteria for testing, releasing, and supporting new applications.

These practices have enabled MSCOM Ops to sustain an availability rating of more than 99 percent while evaluating and adopting the latest Microsoft server technologies. Continuing to adopt new Microsoft server technologies has led to incremental and sustained improvements to those availability ratings, resulting in better than 99.9 percent availability for the past two years.

For More Information

To learn about how failover clusters work and how to set one up, see the "Failover Cluster Design Guide" at http://technet.microsoft.com/en-us/library/dd197569(WS.10).aspx.

For more information about purchasing Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:

© 2010 Microsoft Corporation. All rights reserved.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. Microsoft, SQL Server, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
