Availability

Applies To: Windows Server 2003, Windows Server 2003 R2, Windows Server 2003 with SP1, Windows Server 2003 with SP2

Q. Can Server clusters provide zero downtime for applications?

A. No. Server clusters can dramatically reduce planned and unplanned downtime. However, even with Server clusters, a server could still experience downtime from the following events:

  • Failover time: If a Server cluster recovers from a server or application failure, or if it is used to move applications from one server to another, the application(s) will be unavailable for a non-zero period of time (typically under a minute).

  • Failures from which Server clusters cannot recover: There are types of failure that Server clusters do not protect against, such as loss of a disk not protected by RAID, loss of power when a UPS is not used, or loss of a site when there is no fast-recovery disaster recovery plan. Most of these can be survived with minimal downtime if precautions are taken in advance.

  • Server maintenance that requires downtime: Server clusters can keep applications and data online through many types of server maintenance, but not all (for example, installing a new version of an application whose new on-disk data format requires reformatting preexisting data).

Microsoft recommends that clusters be used as one element in customers' overall programs to provide high integrity and high availability for their mission-critical server-based data and applications.
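To put the failover pause described above into perspective, the following Python sketch estimates yearly availability when failover time is the only source of downtime. The figures used (12 failovers per year, 60 seconds each) are hypothetical assumptions for illustration, not measurements, and the function name availability is not part of any Microsoft tool.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a non-leap year


def availability(failovers_per_year: int, seconds_per_failover: float) -> float:
    """Fraction of the year the application is reachable.

    Counts only failover pauses; failures the cluster cannot recover from
    and maintenance that requires downtime are deliberately left out.
    """
    downtime = failovers_per_year * seconds_per_failover
    return 1.0 - downtime / SECONDS_PER_YEAR


# Hypothetical workload: 12 failovers a year (planned moves plus recoveries),
# each taking a full 60 seconds.
print(f"{availability(12, 60):.4%}")
```

Under these assumptions the total pause is 720 seconds per year, which is small but still not zero. The other two kinds of event in the list above would reduce the figure further.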

Q. Which types of applications and services benefit from Server clustering?

A. There are three types of server applications that will benefit from Server clusters:

  1. "In the box" services provided by the Windows platform: for example, file shares, print queues, Microsoft Message Queue Server (MSMQ) services, and Component Services (formerly known as Transaction Server) services.

  2. Generic applications: Server clusters include a point-and-click wizard for setting up any well-behaved server application for basic error detection, automatic recovery, and operator-initiated management (e.g., move from one server to the other). A "well-behaved" server application is one that keeps recoverable state on cluster disk(s) and whose clients can gracefully handle a pause in service while the application is automatically restarted.

  3. Cluster-aware applications: Software vendors test and support their application products on Server clusters.
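The two properties of a "well-behaved" generic application described in item 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not how Server clusters or any particular application implement it: save_state, load_state, and call_with_retry are hypothetical helper names, the state file stands in for data kept on a cluster disk, and ConnectionError stands in for whatever error a real client sees while the application is being restarted.

```python
import json
import os
import tempfile
import time


def save_state(path, state):
    """Server side: write state so a crash mid-write never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())    # push the data to disk before swapping files
    os.replace(tmp_path, path)  # swap the new file in over the old copy


def load_state(path, default):
    """Server side: recover the last saved state after a restart."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default          # first start, nothing saved yet


def call_with_retry(operation, attempts=7, first_delay=1.0, sleep=time.sleep):
    """Client side: retry with doubling delays to ride out a pause in service."""
    delay = first_delay
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise           # out of attempts, report the failure
            sleep(delay)
            delay *= 2
```

With the assumed defaults, the client waits 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds in total before giving up, which is chosen here to cover a failover that typically completes in under a minute.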