Failover Clustering (DW)---a Technical Reference Guide for Designing Mission-Critical DW Solutions
Want more guides like this one? Go to Technical Reference Guides for Designing Mission-Critical Solutions.
Failover clustering is a popular solution to meet the high availability (HA) needs of mission-critical applications. It has been available for several releases of Windows Server and Microsoft SQL Server, and has been significantly improved in Windows Server 2008 and SQL Server 2008. Failover clustering uses a shared storage model and requires a storage area network (SAN). SQL Server failover clustering works well with other technologies such as database mirroring and log shipping. It is typically used to provide fault tolerance at the server level within a data center, but can also be extended to a remote data center—a solution referred to as stretch cluster, extended cluster, or geocluster.
The following resources provide examples of customer scenarios for implementing failover clustering, in addition to general failover clustering reference material. (Note that the full URLs for the hyperlinked text are provided in the Appendix at the end of this document.)
The white paper SQL Server 2008 Failover Clustering1 covers failover cluster architecture and concepts for Windows Server 2003 and Windows Server 2008, in addition to SQL Server 2008. The white paper also covers installation of a SQL Server 2008 failover cluster, upgrades and updates to SQL Server 2008 failover clustering, and maintenance and administration of SQL Server 2008.
The white paper Six Failover Clustering Benefits Realized from Migrating to SQL Server 20082 contains quick tips and benefits of failover clustering in SQL Server 2008. This paper lists the significant and immediate benefits of using SQL Server 2008 failover clustering.
Unlike database mirroring, failover clustering supports distributed transactions. For more information, see the section "Microsoft Distributed Transaction Coordinator" in the white paper SQL Server 2008 Failover Clustering.3
Failover clustering has been improved in Windows Server 2008 and later versions, as described in the following articles. Note in particular though, that while Windows Server 2008 R2 clusters can be created across multiple subnets, SQL Server 2008 R2 requires cluster members to be part of the same subnet. When creating geoclusters, it is common to use virtual subnets to work around this limitation.
By using volume mount points, you can mount a target partition onto a folder on another physical disk, as shown in the article How to configure volume mount points on a server cluster in Windows Server 2008.6 You can mitigate the limitation on the number of drive letters provided by the English alphabet through the use of volume mount points.
Case Studies and References
SQL Server has successfully been deployed to achieve 4 and 5 nines by several customers. Example architectures are below:
The case study High Availability and Disaster Recovery at ServiceU: A SQL Server 2008 Technical Case Study7 provides a detailed description of the end-to-end HA and disaster recovery (DR) solution at ServiceU. ServiceU deploys failover clustering for local HA and database mirroring for DR.
The case study SQL Server High Availability and Disaster Recovery for SAP Deployment at QR: A Technical Case Study8 provides a detailed description of a geocluster solution at Queensland Rail for a large scale SAP deployment.
Questions and Considerations
This section provides questions and issues to consider when working with your customers.
Understand prioritized HA/DR requirements for the application.
Are your customers comfortable with a shared storage solution?
Failover clustering requires a shared storage solution, such as a SAN, and this means working closely with storage administrators.
With shared storage, there is only one copy of data. The shared storage system needs to include redundancy to avoid being a single point of failure.
- Failover clustering requires a shared storage solution, such as a SAN, and this means working closely with storage administrators.
What is the RPO (Recovery Point Objective)? A zero data loss solution may require augmenting failover clustering with either synchronous database mirroring or synchronous SAN replication.
What is the RTO (Recovery Time Objective)? A database mirroring HA solution may provide a better failover time than a failover clustering solution.
Consider a geocluster (or stretch cluster) as a combined HA/DR solution. This solution requires software to enable the cluster and storage-level replication and from the storage vendor.
Failover Clustering is often deployed along with database mirroring—clustering for local HA, and database mirroring for DR.
Following are the full URLs for the hyperlinked text.
1 SQL Server 2008 Failover Clustering http://sqlcat.com/whitepapers/archive/2009/07/08/sql-server-2008-failover-clustering.aspx
2 Six Failover Clustering Benefits Realized from Migrating to SQL Server 2008 http://sqlcat.com/top10lists/archive/2008/11/20/six-failover-clustering-benefits-realized-from-migrating-to-sql-server-2008.aspx
3 SQL Server 2008 Failover Clustering http://sqlcat.com/whitepapers/archive/2009/07/08/sql-server-2008-failover-clustering.aspx
4 What's New in Failover Clusters in Windows Server 2008 http://technet.microsoft.com/en-us/library/cc770625(WS.10).aspx
5 What's New in Failover Clusters http://technet.microsoft.com/en-us/library/dd443539(WS.10).aspx
6 How to configure volume mount points on a server cluster in Windows Server 2008 http://support.microsoft.com/kb/947021
7 High Availability and Disaster Recovery at ServiceU: A SQL Server 2008 Technical Case Study http://sqlcat.com/whitepapers/archive/2009/08/04/high-availability-and-disaster-recovery-at-serviceu-a-sql-server-2008-technical-case-study.aspx
8 SQL Server High Availability and Disaster Recovery for SAP Deployment at QR: A Technical Case Study http://sqlcat.com/whitepapers/archive/2010/09/20/sql-server-high-availability-and-disaster-recovery-for-sap-deployment-at-qr-a-technical-case-study.aspx