
SharePoint Server 2013 Disaster Recovery in Microsoft Azure

 

Applies to: Windows Server 2012, SQL Server 2012, SharePoint Server 2013 Enterprise, Office Professional 2013, Microsoft Azure, data and storage, AD DS

Topic Last Modified: 2014-06-05

Summary: Using Microsoft Azure, you can create a disaster-recovery environment for your on-premises SharePoint farm. This article describes how to design and implement this solution.

Use this article with the following solution model: SharePoint Disaster Recovery in Microsoft Azure.

Figure 1: SharePoint Disaster Recovery Solution in Azure

SharePoint disaster-recovery process to Azure

Download: PDF file | Visio file | Zoom.it (a high-resolution, zoomable image)


Many organizations do not have a disaster recovery environment for SharePoint, which can be expensive to build and maintain on-premises. Azure Infrastructure Services provides compelling options for disaster recovery environments that are more flexible and less expensive than the on-premises alternatives.

Advantages of using Azure Infrastructure Services include:

  • Hosted secondary datacenter   Use Azure Infrastructure Services instead of investing in a secondary datacenter in a different region.

  • Lower-cost disaster-recovery environments   Maintain and pay for fewer resources than an on-premises disaster recovery environment. The number of resources depends on which disaster-recovery environment you choose: cold standby, warm standby, or hot standby.

  • Azure Infrastructure Services is elastic   In the event of a disaster, easily scale out your recovery SharePoint farm to meet load requirements. Scale in when you no longer need the resources.

There are less-complex options for organizations just getting started with disaster recovery and advanced options for organizations with high-resilience requirements. The definitions of cold, warm, and hot standby environments differ slightly when the environment is hosted on a cloud platform. The following table describes these environments for a SharePoint recovery farm in Azure.

Table: Recovery environments

Type of recovery environment Description

Hot

A full-sized farm is provisioned, updated, and running on standby.

Warm

The farm is built and virtual machines are running and updated.

Recovery includes attaching content databases, provisioning service applications, and crawling content.

The farm can be a smaller version of the production farm and then scaled out to serve the full user base.

Cold

The farm is fully built, but the virtual machines are stopped.

Maintaining the environment includes starting the virtual machines from time to time, patching, updating, and verifying the environment.

Start the full environment in the event of a disaster.

It’s important to evaluate your organization’s Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO). These requirements determine which environment is the most appropriate investment for your organization.

The guidance in this article describes how to implement a warm standby environment. You can also adapt it to a cold standby environment, although you need to follow additional procedures to support that kind of environment. This article does not describe how to implement a hot standby environment.

For more information about disaster recovery solutions, see High availability and disaster recovery concepts in SharePoint 2013 and Choose a disaster recovery strategy for SharePoint 2013.

The warm standby disaster-recovery solution requires the following environment:

  • On-premises SharePoint production farm

  • Recovery SharePoint farm in Azure

  • Site-to-site VPN connection between the two environments

The following figure illustrates these three elements.

Figure 2: Elements of a warm standby solution in Azure

Elements of a warm standby solution in Azure

SQL Server log shipping with Distributed File System Replication (DFSR) is used to copy database backups and transaction logs to the recovery farm in Azure.

  • DFSR transfers logs from the production environment to the recovery environment. In a WAN scenario, DFSR is more efficient than shipping the logs directly to the secondary server in Azure.

  • Logs are replayed to the SQL Server in the recovery environment in Azure.

  • You don’t attach the log-shipped databases until a recovery exercise is performed.

Perform the following steps to recover the farm:

  1. Stop log shipping.

  2. Stop accepting traffic to the primary farm.

  3. Replay the final transaction logs.

  4. Attach the content databases to the farm.

  5. Restore service applications from the replicated services databases.

  6. Update Domain Name System (DNS) records to point to the recovery farm.

  7. Start a full crawl.

After a recovery is performed, this solution provides the items listed in the following table.

Table: Solution recovery objectives

Item Description

Sites and content

Sites and content are available in the recovery environment.

A new instance of search

In this warm standby solution, search is not restored from search databases. Search components in the recovery farm are configured as similarly as possible to the production farm. After the sites and content are restored, a full crawl is started to rebuild the search index. You do not need to wait for the crawl to complete to make the sites and content available.

Services

Services that store data in databases are restored from the log-shipped databases. Services that do not store data in databases are simply started.

Not all services with databases need to be restored. The following services do not need to be restored from databases and can simply be started after failover:

  • Usage and Health Data Collection

  • State service

  • Word automation

  • Any service without a database

You can work with Microsoft Consulting Services (MCS) or a partner to address more complex recovery objectives. These are summarized in the following table.

Table: Other items that can be addressed by MCS or a partner

Item Description

Synchronizing custom farm solutions

Ideally, the recovery farm configuration is identical to the production farm. You can work with a consultant or partner to evaluate whether custom farm solutions are replicated and whether a process is in place for keeping the two environments synchronized.

Connections to data sources on-premises

It might not be practical to replicate connections to back-end data systems, such as Business Data Connectivity (BDC) connections and search content sources.

Search restore scenarios

Because enterprise search deployments tend to be unique and complex, restoring search from databases requires a greater investment. You can work with a consultant or partner to identify and implement the search restore scenarios that your organization might require.

The guidance provided in this article assumes that the on-premises farm is already designed and deployed.

Ideally, the recovery farm configuration in Azure is identical to the production farm on-premises, including the following:

  • Same representation of server roles

  • Same configuration of customizations

  • Same configuration of search components

The environment in Azure can be a smaller version of the production farm. If you plan to scale out the recovery farm after failover, it’s important that each type of server role is initially represented.

Some configurations might not be practical to replicate in the failover environment. Be sure to test the failover procedures and environment to ensure that the failover farm provides the expected service level.

This solution doesn't prescribe a specific topology for a SharePoint farm. The focus of this solution is to use Azure for the failover farm and implement log shipping and DFSR between the two environments.

In a warm standby environment, all virtual machines in the Azure environment are running. The environment is ready for a failover exercise or event.

The following figure illustrates a disaster recovery solution from an on-premises SharePoint farm to an Azure-based SharePoint farm that is configured as a warm standby environment.

Figure 3: Topology and key elements of a production farm and a warm standby recovery farm

Shows topology and key elements of a production farm and a warm standby recovery farm.

In this illustration:

  • Two environments are illustrated side-by-side: the on-premises SharePoint farm and the warm standby farm in Azure.

  • Each environment includes a file share.

  • Each farm includes four tiers. To achieve high availability, each tier includes two servers or virtual machines that are configured identically for a specific role, such as front-end services, distributed cache, backend services, and databases. It isn't important in this illustration to call out specific components. The two farms are configured identically.

  • The fourth tier is the database tier. Log shipping is used to copy logs from the secondary database server in the on-premises environment to the file share in the same environment.

  • DFSR copies files from the file share in the on-premises environment to the file share in the Azure environment.

  • Log shipping replays the logs from the file share in the Azure environment to the primary replica in the SQL Server AlwaysOn availability group in the recovery environment.

In a cold standby environment, most of the SharePoint farm virtual machines can be shut down. (We recommend occasionally starting the virtual machines, such as every two weeks or once a month, so that each virtual machine can sync with the domain.) The following virtual machines in the Azure recovery environment must remain running to ensure continuous operations of log shipping and DFSR:

  • The file share

  • The primary database server

  • At least one virtual machine running Windows Server Active Directory and DNS

The following figure shows an Azure failover environment in which the file share virtual machine and the primary SharePoint database virtual machine are running. All other SharePoint virtual machines are stopped. The virtual machine that is running Windows Server Active Directory and DNS is not shown.

Figure 4: Cold standby recovery farm with running virtual machines

Elements of a cold standby solution in Azure

After failover to a cold standby environment, all virtual machines are started, and the method to achieve high availability of the database servers must be configured, such as SQL Server AlwaysOn availability groups.

If multiple storage groups are implemented (that is, databases are spread across more than one set of highly available SQL Server instances), the primary database server for each storage group must be running to accept the logs associated with its storage group.
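As an illustration, the following Windows PowerShell sketch uses the classic Azure Service Management cmdlets available when this article was written to start every stopped virtual machine in the recovery farm after a failover from a cold standby environment. The cloud service names come from the recovery farm tables later in this article.

# Start every virtual machine in the recovery farm's cloud services.
$cloudServices = "sp-webservers", "sp-applicationservers", "sp-databaseservers"
foreach ($service in $cloudServices)
{
    Get-AzureVM -ServiceName $service |
        Where-Object { $_.Status -ne "ReadyRole" } |
        Start-AzureVM
}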

Multiple technologies are used in this disaster recovery solution. To make sure that these technologies interact as expected, each component in the on-premises and Azure environments must be installed and configured correctly. We recommend that the person or team who sets up this solution have a strong working knowledge of, and hands-on skills with, the following technologies:

  • Windows Server 2012, including Active Directory Domain Services (AD DS) and DNS

  • Azure infrastructure services, including virtual networks and virtual machines

  • DFSR

  • SQL Server 2012, including log shipping and AlwaysOn availability groups

  • SharePoint Server 2013

Finally, we recommend scripting skills that you can use to automate tasks associated with these technologies. It’s possible to use the available user interfaces to complete all the tasks described in this solution. However, a manual approach is time-consuming, error prone, and delivers inconsistent results.

In addition to Windows PowerShell, there are also Windows PowerShell libraries for SQL Server, SharePoint Server, and Azure. Don’t forget T-SQL, which can also help reduce the time to configure and maintain your disaster-recovery environment.
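For example, a script that spans these technologies might begin by loading all three libraries. This is a minimal sketch; the snap-in and module names match the versions available when this article was written, and the server name is from this solution's test environment.

# Load the administration libraries used throughout this solution.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
Import-Module SQLPS -DisableNameChecking   # SQL Server cmdlets, including Invoke-Sqlcmd
Import-Module Azure                        # Azure Service Management cmdlets

# Invoke-Sqlcmd lets the same script run the T-SQL used later in this article.
Invoke-Sqlcmd -ServerInstance "SP-SQL-HA1" -Query "SELECT name FROM sys.databases"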

Visual representation of the disaster-recovery roadmap.

This roadmap assumes that you already have a SharePoint Server 2013 farm deployed in production.

Table: Roadmap for disaster recovery

Phase Description

Phase 1

Design the disaster recovery environment.

Phase 2

Create the Azure virtual network and VPN connection.

Phase 3

Deploy Windows Server Active Directory and Domain Name System (DNS) services to the Azure virtual network.

Phase 4

Deploy the SharePoint recovery farm in Azure.

Phase 5

Set up DFSR between the farms.

Phase 6

Set up log shipping to the recovery farm.

Phase 7

Validate failover and recovery solutions. This includes the following procedures and technologies:

  • Stop log shipping

  • Restore the backups

  • Crawl content

  • Recover services

  • Manage DNS records

Use the guidance in Microsoft Azure Architectures for SharePoint 2013 to design the disaster-recovery environment, including the SharePoint recovery farm. You can use the graphics in the SharePoint Disaster Recovery Solution in Azure Visio file to start the design process. We recommend you design the entire environment before beginning any work in the Azure environment.

In addition to the guidance provided in Microsoft Azure Architectures for SharePoint 2013 for designing the virtual network, VPN connection, Active Directory, and SharePoint farm, be sure to add the file share role to the Azure environment.

To support log shipping in a disaster recovery solution, a file share virtual machine is added to the cloud service where the database roles reside. The file share also provides the third vote in the quorum (Node and File Share Majority) for the cluster that hosts the SQL Server AlwaysOn availability group. This is the recommended configuration for a standard SharePoint farm that uses SQL Server AlwaysOn availability groups.

Figure 6: Placement of a file server used for a disaster recovery solution

Shows a file share VM added to the same cloud service that contains the database server roles.

In this diagram, a file share virtual machine is added to the same cloud service in Azure that contains the database server roles. Do not add the file share virtual machine to an availability set with other server roles, such as the SQL Server roles.
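The quorum configuration itself can be scripted. The following is a minimal sketch; the cluster name and share path are hypothetical placeholders, while AZ-SP-FS is the file server from this solution's recovery farm.

# Run on one of the database servers in the Azure environment.
Import-Module FailoverClusters
# Use a share on the file server VM (AZ-SP-FS) as the witness for a
# Node and File Share Majority quorum.
Set-ClusterQuorum -Cluster "AZ-SQL-CLUSTER" -NodeAndFileShareMajority "\\AZ-SP-FS\Quorum"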

If you are concerned about high availability of the logs, consider a different approach: SQL Server Backup and Restore with the Azure Blob Storage Service, a SQL Server feature that writes backups directly to a Blob storage URL. This solution does not include guidance for using that feature.

When you design the recovery farm, keep in mind that a successful disaster recovery environment accurately reflects the production farm that you want to recover. The size of the recovery farm is not the most important thing in the recovery farm’s design, deployment, and testing. Farm scale varies from organization to organization based on business requirements. It might be possible to use a scaled down farm for a short outage, or until performance and capacity demands require you to scale the farm.

Configure the recovery farm as identically as possible to the production farm so that it meets your service level agreement (SLA) requirements and provides the functionality that you need to support your business. When you design the disaster recovery environment, also look at your change management process for your production environment. We recommend that you extend the change management process to the recovery environment by updating the recovery environment at the same interval as the production environment. As part of the change management process, we recommend maintaining a detailed inventory of your farm configuration, applications, and users.

Connect an on-premises network to a Microsoft Azure virtual network shows you how to plan and deploy the virtual network in Azure and to create the VPN connection. Follow the guidance in the topic to complete the following steps:

  • Plan the private IP address space of the Azure virtual network.

  • Plan the routing infrastructure changes for the Azure virtual network.

  • Plan firewall rules for traffic to and from the on-premises VPN device.

  • Create the cross-premises virtual network in Azure.

  • Configure routing between your on-premises network and the Azure virtual network.

This phase includes deploying both Active Directory and DNS to the Azure virtual network in a hybrid scenario as described in Microsoft Azure Architectures for SharePoint 2013 and illustrated below.

Figure 7: Hybrid Active Directory domain configuration

Shows the Active Directory and Domain Name Services VMs added to the cloud service in Azure environment.

In the illustration, two virtual machines are deployed to a dedicated cloud service. Each virtual machine hosts two roles: Active Directory and DNS.

Before deploying Active Directory in Azure, read Guidelines for Deploying Windows Server Active Directory on Azure Virtual Machines. These guidelines help you determine if you need a different architecture or different configuration settings for your solution.

For detailed guidance on setting up a domain controller in Azure, see Install a Replica Active Directory Domain Controller in Microsoft Azure Virtual Networks.

Before this phase, you haven't deployed any virtual machines to the Azure virtual network. The virtual machines that host Active Directory and DNS are likely not the largest ones in the solution. Before you deploy them, first create the largest virtual machine that you plan to use in your Azure virtual network. This ensures that your solution lands on a "stamp" in Azure that supports the largest size you need. You do not need to configure this virtual machine at this time; simply create it and set it aside. If you skip this step, you might not be able to create the larger virtual machines later. This size limitation might not be a problem at some point in the future, but it was an issue at the time we built the proof-of-concept environment for this solution.
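The following sketch shows the idea with the classic Azure PowerShell cmdlets of that era. The image filter, instance size, and all names are placeholders; substitute the largest size from your own design.

# Create a placeholder VM at the largest size the solution will need, so the
# cloud service is placed on hardware that supports that size.
$image = (Get-AzureVMImage |
    Where-Object { $_.Label -like "Windows Server 2012*" } |
    Select-Object -First 1).ImageName

New-AzureVMConfig -Name "SizeHolder" -InstanceSize "A7" -ImageName $image |
    Add-AzureProvisioningConfig -Windows -AdminUsername "spadmin" -Password "<password>" |
    New-AzureVM -ServiceName "sp-databaseservers" -VNetName "SP-DR-VNet" -Location "West US"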

Deploy the SharePoint farm in your Azure virtual network according to your design plans. It might be helpful to review Planning for SharePoint 2013 on Azure Infrastructure Services before you deploy SharePoint roles in Azure.

Consider the following practices, which we learned by building our proof-of-concept environment.

  • Create virtual machines using the Gallery, not Quick Create. This gives you more control over configuration, such as specifying cloud services.

  • Make use of Azure PowerShell. There are many good examples published. Get Started with Windows Azure Cmdlets describes how to use Azure PowerShell.

  • Azure does not support Hyper-V dynamic memory. Be sure this is factored into your performance and capacity plans.

  • Restart virtual machines through the Azure interface, not from the virtual machine logon itself. Using the Azure interface works better and is more predictable.

  • If you want to shut down a virtual machine to save costs, use the Azure interface. If you shut down from the virtual machine logon, charges continue to accrue (see the sketch after this list).

  • Use a naming convention for the virtual machines.

  • Pay attention to which datacenter location the virtual machines are being deployed to.

  • The auto-scale feature in Azure is not supported for SharePoint roles.

  • Do not configure items in the farm that will be restored, such as site collections.
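For the shutdown guidance above, the difference is a single switch on Stop-AzureVM. This sketch uses a service and VM name from this solution's recovery farm tables.

# Deallocates the VM and stops compute charges. (-Force is needed when this is
# the last running VM in the cloud service deployment.)
Stop-AzureVM -ServiceName "sp-applicationservers" -Name "AZ-APP3" -Force

# Keeps the VM provisioned (and billed), preserving its internal IP address.
Stop-AzureVM -ServiceName "sp-applicationservers" -Name "AZ-APP3" -StayProvisioned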

To set up file replication by using DFSR, use the DFS Management snap-in. However, before you set up DFSR, log on to your on-premises file server and your Azure file server and enable the service in Windows.

From the Server Manager Dashboard, complete the following steps (a scripted equivalent follows these steps):

  • Configure the local server.

  • Start the Add Roles and Features Wizard.

  • Open the File and Storage Services node.

  • Select DFS Namespaces and DFS Replication.

  • Click Next to finish the wizard steps.
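The same setup can be scripted. In the following sketch, the role services install with Install-WindowsFeature; the replication group commands require the DFSR PowerShell module that ships with Windows Server 2012 R2, and the group, folder, and path names are placeholders.

# Install the DFS role services and management tools on each file server
# (run on FS1 on-premises and on AZ-SP-FS in Azure).
Install-WindowsFeature FS-DFS-Namespace, FS-DFS-Replication, RSAT-DFS-Mgmt-Con

# Create a replication group for the transaction log folder and add both
# file servers as members.
New-DfsReplicationGroup -GroupName "SPLogShipping" |
    New-DfsReplicatedFolder -FolderName "TLogs" |
    Add-DfsrMember -ComputerName "FS1", "AZ-SP-FS"
Add-DfsrConnection -GroupName "SPLogShipping" `
    -SourceComputerName "FS1" -DestinationComputerName "AZ-SP-FS"
Set-DfsrMembership -GroupName "SPLogShipping" -FolderName "TLogs" `
    -ComputerName "FS1" -ContentPath "E:\TLogs" -PrimaryMember $true -Force
Set-DfsrMembership -GroupName "SPLogShipping" -FolderName "TLogs" `
    -ComputerName "AZ-SP-FS" -ContentPath "F:\TLogs" -Force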

Figure 14: DFS Replication Health Report.

Screenshot showing a DFSR health report.

The preceding screenshot shows the detailed reporting that DFS Management provides. These reports include configuration results and replication health.

The following table provides links to DFSR reference articles and blog posts.

Table: Reference articles for DFSR

Title Description

Replication

DFS Management TechNet topic with links for Replication

DFS Replication: Survival Guide

Wiki with links to DFS information

DFS Replication: Frequently Asked Questions

DFS Replication TechNet topic

Jose Barreto’s Blog

Principal Program Manager on File Server team at Microsoft

The Storage Team at Microsoft – File Cabinet Blog

About files services and storage features in Windows Server

Log shipping is the critical component for setting up disaster recovery in this environment. You can use log shipping to automatically send transaction log files for databases from a primary database server instance to a secondary database server instance. To set up log shipping, see Configure log shipping in SharePoint 2013.

Note:
Log shipping support in SharePoint Server is limited to certain databases. For more information, see Supported high availability and disaster recovery options for SharePoint databases (SharePoint 2013).
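To illustrate what the log-shipping backup jobs do, the following sketch takes a single transaction log backup to the replicated folder. The real schedule and retention are configured through the log shipping wizard or T-SQL; the share path here is a placeholder on the FS1 file server.

Import-Module SQLPS -DisableNameChecking

# Back up the transaction log for a content database to the DFSR folder,
# which replicates the file to the Azure file server.
Backup-SqlDatabase -ServerInstance "SP-SQL-HA1" -Database "WSS_Content" `
    -BackupAction Log `
    -BackupFile "\\FS1\TLogs\WSS_Content_$(Get-Date -Format yyyyMMddHHmmss).trn"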

The goal of this final phase is to verify that the disaster recovery solution works as planned. To do this, create a failover event that shuts down the production farm and starts up the recovery farm as a replacement. You can start a failover scenario manually or by using scripts.

The first step is to stop incoming user requests for farm services or content. You can do this by disabling DNS or by shutting down the front-end web servers. After the farm is "down," you can fail over to the recovery farm.

You must stop log shipping before farm recovery. Stop log shipping on the secondary server in Azure first, and then stop it on the primary server on-premises. You can use the following script.

-- Removes log shipping from the server.
-- Run these commands on the secondary server FIRST, and then on the primary server.
-- Each EXEC below returns, as a result set, the sp_delete_log_shipping_* commands
-- to run for the listed databases; review and execute that output.

SET NOCOUNT ON
DECLARE @PriDB nvarchar(max)
       ,@SecDB nvarchar(250)
       ,@PriSrv nvarchar(250)
       ,@SecSrv nvarchar(250)

-- Enter a comma-separated list of the log-shipped primary databases.
SET @PriDB = ''
SET @PriDB = UPPER(@PriDB)
SET @PriDB = REPLACE(@PriDB, ' ', '')
SET @PriDB = '''' + REPLACE(@PriDB, ',', ''', ''') + ''''

SET @SecDB = @PriDB

-- Remove each secondary database.
EXEC ( 'SELECT ''exec master..sp_delete_log_shipping_secondary_database '' + '''''''' + prm.primary_database + ''''''''
FROM msdb.dbo.log_shipping_monitor_primary prm INNER JOIN msdb.dbo.log_shipping_primary_secondaries sec ON prm.primary_database = sec.secondary_database
WHERE prm.primary_database IN ( ' + @PriDB + ' )')

-- Remove each primary-to-secondary pairing.
EXEC ( 'SELECT ''exec master..sp_delete_log_shipping_primary_secondary '' + '''''''' + prm.Primary_Database + '''''', '''''' + sec.Secondary_Server + '''''', '''''' + sec.Secondary_database + ''''''''
FROM msdb.dbo.log_shipping_monitor_primary prm INNER JOIN msdb.dbo.log_shipping_primary_secondaries sec ON prm.primary_database = sec.secondary_database
WHERE prm.primary_database IN ( ' + @PriDB + ' )')

-- Remove each primary database.
EXEC ( 'SELECT ''exec master..sp_delete_log_shipping_primary_database '' + '''''''' + prm.primary_database + ''''''''
FROM msdb.dbo.log_shipping_monitor_primary prm INNER JOIN msdb.dbo.log_shipping_primary_secondaries sec ON prm.primary_database = sec.secondary_database
WHERE prm.primary_database IN ( ' + @PriDB + ' )')

-- Remove the secondary-to-primary monitor record.
EXEC ( 'SELECT ''exec master..sp_delete_log_shipping_secondary_primary '' + '''''''' + prm.primary_server + '''''', '''''' + prm.primary_database + ''''''''
FROM msdb.dbo.log_shipping_monitor_primary prm INNER JOIN msdb.dbo.log_shipping_primary_secondaries sec ON prm.primary_database = sec.secondary_database
WHERE prm.primary_database IN ( ' + @PriDB + ' )')

Backups must be restored in the order in which they were created. Before you can restore a particular transaction log backup, you must first restore the following previous backups without rolling back uncommitted transactions (that is, WITH NORECOVERY):

  • The full database backup and the most recent differential backup – Restore these backups, if any exist, that were taken before the particular transaction log backup. The database must have been using the full or bulk-logged recovery model when the full or differential backup was created.

  • All transaction log backups – Restore any transaction log backups taken after the full database backup or the differential backup (if you restore one) and before the particular transaction log backup. Log backups must be applied in the sequence in which they were created, without any gaps in the log chain.

To recover the content database on the secondary server so that sites render, remove all database connections before recovery, and then run the following Transact-SQL statement to restore the database:

RESTORE DATABASE WSS_Content WITH RECOVERY

Note:
When you use T-SQL explicitly, specify either WITH NORECOVERY or WITH RECOVERY in every RESTORE statement to eliminate ambiguity; this is especially important when writing scripts. After the full and differential backups are restored, the remaining transaction logs can be restored in SQL Server Management Studio. Also, because log shipping is already stopped, the content database is in a standby state, so you must change the state to Full Access.

In SQL Server Management Studio, right-click the WSS_Content database, point to Tasks > Restore, and then click Transaction Log (this option is unavailable until the full backup is restored). For more information, see Restore a Transaction Log Backup (SQL Server).
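The same sequence can be scripted. The following sketch restores the full backup and the transaction logs WITH NORECOVERY and then brings the database online; the backup file locations are placeholders.

Import-Module SQLPS -DisableNameChecking
$instance = "AZ-SQL-HA1"

# Restore the full backup first, without recovering the database.
Restore-SqlDatabase -ServerInstance $instance -Database "WSS_Content" `
    -BackupFile "F:\TLogs\WSS_Content_full.bak" -NoRecovery -ReplaceDatabase

# Replay the transaction log backups in the order they were created.
Get-ChildItem "F:\TLogs\WSS_Content_*.trn" | Sort-Object Name | ForEach-Object {
    Restore-SqlDatabase -ServerInstance $instance -Database "WSS_Content" `
        -BackupFile $_.FullName -RestoreAction Log -NoRecovery
}

# Bring the database online (Full Access).
Invoke-Sqlcmd -ServerInstance $instance -Query "RESTORE DATABASE WSS_Content WITH RECOVERY"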

You must start a full crawl for each content source to restore the Search service. Note that you lose some analytics information from the on-premises farm, such as search recommendations. Before you start the full crawls, run the Windows PowerShell cmdlet Restore-SPEnterpriseSearchServiceApplication and specify the log-shipped and replicated Search Administration database, Search_Service__DB_<GUID>. This cmdlet restores the search configuration, schema, managed properties, rules, and sources, and creates a default set of the other search components.
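A sketch of the cmdlet call follows. The service application, application pool, and server names are placeholders; replace the database name with the actual name from your farm, including the GUID suffix.

$applicationPool = Get-SPServiceApplicationPool "SharePoint Service Applications"
$searchInstance  = Get-SPEnterpriseSearchServiceInstance -Local

# Re-create the Search service application from the log-shipped
# Search Administration database.
Restore-SPEnterpriseSearchServiceApplication -Name "Search Service Application" `
    -ApplicationPool $applicationPool `
    -DatabaseName "Search_Service__DB_<GUID>" `
    -DatabaseServer "AZ-SQL-HA1" `
    -AdminSearchServiceInstance $searchInstance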

To start a full crawl, complete the following steps (a PowerShell equivalent follows):

  1. In SharePoint 2013 Central Administration, go to Application Management > Service Applications > Manage service applications, and then click the Search service application that you want to crawl.

  2. On the Search Administration page, click Content Sources, point to the content source that you want, click the arrow, and then click Start Full Crawl.
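If you prefer to script this step, the following sketch starts a full crawl of every content source in the Search service application.

# Start a full crawl of each content source.
$ssa = Get-SPEnterpriseSearchServiceApplication
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa |
    ForEach-Object { $_.StartFullCrawl() }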

The following table shows how to handle service applications during recovery: the services to restore from log-shipped databases, the services that have databases but that we recommend starting without restoring their databases, and the services that do not have databases.

Table: Service application database reference

Restore these services from log-shipped databases:

  • Machine Translation Service

  • Managed Metadata Service

  • Secure Store Service

  • User Profile (Only the Profile and Social Tagging databases are supported. The Synchronization database is not supported.)

These services have databases, but we recommend that you start them without restoring their databases:

  • Microsoft SharePoint Foundation Subscription Settings Service

  • Usage and Health Data Collection

  • State service

  • Word automation

These services do not store data in databases; start them after failover:

  • Excel Services

  • PerformancePoint Services

  • PowerPoint Conversion

  • Visio Graphics Service

  • Work Management

The following example shows how to restore the Managed Metadata service from a database:

This restore uses the existing Managed_Metadata_DB database. The database is log shipped, but there is no active service application on the secondary farm, so the database must be connected after the service application is in place.

First, run New-SPManagedMetadataServiceApplication and specify the DatabaseName parameter with the name of the restored database.

Next, configure the new Managed Metadata Service Application on the secondary server as follows (see the sketch after this list):

  • Name: Managed Metadata Service  

  • Database server: Use the database name from the shipped transaction log  

  • Database name: Managed_Metadata_DB  

  • Application Pool: SharePoint Service Applications 
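Putting these steps together, a minimal sketch looks like the following. The database server name is a placeholder (typically the availability group listener or primary replica), and the proxy name is hypothetical.

$applicationPool = Get-SPServiceApplicationPool "SharePoint Service Applications"

# Reconnect the service application to the log-shipped database.
$mms = New-SPManagedMetadataServiceApplication -Name "Managed Metadata Service" `
    -ApplicationPool $applicationPool `
    -DatabaseName "Managed_Metadata_DB" `
    -DatabaseServer "AZ-SQL-HA1"

# The farm consumes the service through a proxy.
New-SPManagedMetadataServiceApplicationProxy -Name "Managed Metadata Service Proxy" `
    -ServiceApplication $mms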

You must manually create DNS records to point to your SharePoint farm.

In most cases where you have multiple front-end web servers, it makes sense to use the Network Load Balancing feature in Windows Server 2012, or a hardware load balancer, to distribute requests among the front-end web servers in your farm. Network load balancing also reduces risk by distributing requests to the remaining servers if one of your front-end web servers fails.

Typically, when you set up network load balancing, your cluster is assigned a single IP address. You then create a DNS host record in the DNS provider for your network that points to the cluster. (For this project, we put a DNS server in Azure for resiliency in case of an on-premises datacenter failure.) For instance, in DNS Manager in Active Directory, you can create a host record named sharepoint.contoso.com that points to the IP address of your load-balanced cluster.

For external access to your SharePoint farm, you can create a host record on an external DNS server with the same URL that clients use on your intranet (for example, http://sharepoint.contoso.com) that points to an external IP address in your firewall. (A best practice, using this example, is to set up split DNS so that the internal DNS server is authoritative for contoso.com and routes requests directly to the SharePoint farm cluster, rather than routing DNS requests to your external DNS server.) You can then map the external IP address to the internal IP address of your on-premises cluster so that clients find the resources they are looking for.

From here, you might run into a couple different disaster-recovery scenarios:

Example scenario: The on-premises SharePoint farm is unavailable because of a hardware failure in the on-premises SharePoint farm. In this case, after you complete the steps for failover to the Azure SharePoint farm, you can configure network load balancing on the recovery SharePoint farm's front-end web servers, the same way you did in the on-premises farm. You can then redirect the host record in your internal DNS provider to point to the recovery farm's cluster IP address, as shown in the sketch after these scenarios. Note that it can take some time before cached DNS records on clients are refreshed and point to the recovery farm.

Example scenario: The on-premises datacenter is lost completely. This scenario might occur due to a natural disaster, such as a fire or flood. In this case, for an enterprise, you would likely have a secondary datacenter hosted in another region, as well as your Azure subnet that has its own directory services and DNS. As in the previous disaster scenario, you can redirect your internal and external DNS records to point to the Azure SharePoint farm. Again, take note that DNS-record propagation can take some time.
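In either scenario, if your internal DNS runs on Windows Server, repointing the host record can be scripted with the DnsServer module. The zone, record name, and address below are placeholders.

# Replace the A record for sharepoint.contoso.com with the recovery farm's
# load-balanced cluster IP. A short TTL speeds up client failover.
Remove-DnsServerResourceRecord -ZoneName "contoso.com" -RRType "A" -Name "sharepoint" -Force
Add-DnsServerResourceRecordA -ZoneName "contoso.com" -Name "sharepoint" `
    -IPv4Address "10.1.0.100" -TimeToLive 00:05:00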

If you are using host-named site collections, as recommended in Host-named site collection architecture and deployment (SharePoint 2013), you might have several site collections hosted by the same web application in your SharePoint farm, with unique DNS names (for example, http://sales.contoso.com and http://marketing.contoso.com). In this case, you can create DNS records for each site collection that point to your cluster IP address. Once a request reaches your SharePoint web-front-end servers, they handle routing each request to the appropriate site collection.

We designed and tested a proof-of-concept environment for this solution. The design goal for our test environment was to deploy and recover a SharePoint farm that we might find in a customer environment. We made several assumptions, but we knew that the farm needed to provide all of the out-of-the-box functionality without any customizations. The topology was designed for high availability by using best practice guidance from the field and product group.

The following table describes the Hyper-V virtual machines that we created and configured for the on-premises test environment.

Table: Virtual machines for on-premises test

Server name Role Configuration

DC1

Domain controller with Active Directory.

2 processors

512 MB–4 GB RAM

1 x 127 GB hard disk

RRAS

Server configured with the Routing and Remote Access Service (RRAS) role.

2 processors

2–8 GB RAM

1 x 127 GB hard disk

FS1

File server with shares for backups and endpoint for DFSR.

4 processors

2–12 GB RAM

1 x 127 GB

1 x 1 TB (SAN)

1 x 750 GB

SP-WFE1, SP-WFE2

Front End Web servers.

4 processors

16 GB RAM

SP-APP1, SP-APP2, SP-APP3

Application servers.

4 processors

2–16 GB RAM

SP-SQL-HA1, SP-SQL-HA2

Database servers, configured with SQL Server 2012 AlwaysOn availability groups to provide high availability. This configuration uses SP-SQL-HA1 and SP-SQL-HA2 as the primary and secondary replicas.

4 processors

2–16 GB RAM

The following table describes drive configurations for the Hyper-V virtual machines that we created and configured for the Front End Web and Application servers for the on-premises test environment.

Table: Virtual machine drive requirements for Front End Web and Application servers for on-premises test

Drive Letter Size Directory Name Path

C

80 GB

System drive

<DriveLetter>:\Program Files\Microsoft SQL Server\

E

80 GB

Log drive (40 GB)

<DriveLetter>:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA

F

80 GB

Page file (36 GB)

<DriveLetter>:\Program Files\Microsoft SQL Server\MSSQL\DATA

The following table describes drive configurations for the Hyper-V virtual machines created and configured to serve as the on-premises database servers. On the Database Engine Configuration page of SQL Server Setup, use the Data Directories tab to set and confirm the settings shown in the following table.

Table: Virtual machine drive requirements for database server for on-premises test

Drive Letter Size Directory Name Path

C

80 GB

Data root directory

<DriveLetter>:\Program Files\Microsoft SQL Server\

E

500 GB

User database directory

<DriveLetter>:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA

F

500 GB

User database log directory

<DriveLetter>:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA

G

500 GB

Temp DB directory

<DriveLetter>:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA

H

500 GB

Temp DB log directory

<DriveLetter>:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA

During the different deployment phases, the test team typically worked on the on-premises architecture first and then on the corresponding Azure environment. This reflects the general real-world case, where in-house production farms are already running. More important, you should know the current production workload, capacity, and typical performance. In addition to building a disaster recovery model that can meet business requirements, you should size the recovery farm servers to deliver a minimum level of service. In a cold or warm standby environment, a recovery farm is typically smaller than a production farm. After the recovery farm is stable and in production, the farm can be scaled up and out to meet workload requirements.

We deployed our test environment in the following three phases:

  • Set up the hybrid infrastructure

  • Provision the servers

  • Deploy the SharePoint farms

This phase involved setting up a domain environment for the on-premises farm and for the recovery farm in Azure. In addition to the normal tasks associated with configuring Active Directory, the test team implemented a routing solution and a VPN connection between the two environments.

In addition to the farm servers, it was necessary to provision servers for the domain controllers and configure a server to handle RRAS as well as the site-to-site VPN. Two file servers were provisioned for the DFSR service and several client computers were provisioned for testers.

The SharePoint farms were deployed in two stages in order to simplify environment stabilization and troubleshooting, if required. During the first stage, each farm was deployed on the minimum number of servers for each tier of the topology and to support the required functionality.

We created the database servers with SQL Server installed before creating the SharePoint 2013 servers. Because this was a new deployment, we created the availability groups before deploying SharePoint. We created three groups based on MCS best practice guidance.

We created the farm and joined additional servers in the following order:

  • Provisioned SP-SQL-HA1 and SP-SQL-HA2.

  • Configured AlwaysOn and created the three availability groups for the farm.

  • Provisioned SP-APP1 to host Central Administration.

  • Provisioned SP-WFE1 and SP-WFE2 to host the distributed cache.

We used the skipRegisterAsDistributedCacheHost parameter when we ran psconfig.exe at the command line. For more information, see Plan for feeds and the Distributed Cache service in SharePoint Server 2013.
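The equivalent switch is also available on the Connect-SPConfigurationDatabase cmdlet if you join servers with PowerShell instead of psconfig.exe. The server, database, and passphrase values below are placeholders.

# Join a server to the farm without registering it as a Distributed Cache
# host, so the cache can be placed deliberately.
Connect-SPConfigurationDatabase -DatabaseServer "SP-SQL-HA1" `
    -DatabaseName "SharePoint_Config" `
    -Passphrase (ConvertTo-SecureString "<farm passphrase>" -AsPlainText -Force) `
    -SkipRegisterAsDistributedCacheHost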

We repeated the following steps in the recovery environment:

  • Provisioned AZ-SQL-HA1 and AZ-SQL-HA2.

  • Configured AlwaysOn and created the three availability groups for the farm.

  • Provisioned AZ-APP1 to host Central Administration.

  • Provisioned AZ-WFE1 and AZ-WFE2 to host the distributed cache.

After we configured the distributed cache and added test users and test content, we started stage two of the deployment. This required scaling out the tiers and configuring the farm servers to support the high-availability topology described in the farm architecture.

The following table describes the virtual machines, cloud services, and availability sets we set up for our recovery farm.

Table: Recovery farm infrastructure

Server name Role Configuration Availability set

spDRAD

Domain controller with Active Directory

2 processors

512 MB–4 GB RAM

1 x 127 GB hard disk

spDRAD

AZ-SP-FS

File server with shares for backups and endpoint for DFSR

A5 configuration:

  • 2 processors

  • 14 GB RAM

1 x 127 GB

1 x 135 GB

1 x 127 GB

1 x 150 GB

sp-databaseservers

DATA_SET

AZ-WFE1, AZ-WFE2

Front End Web servers

A5 configuration:

  • 2 processors

  • 14 GB RAM

1 x 127 GB hard disk

sp-webservers

WFE_SET

AZ-APP1, AZ-APP2, AZ-APP3

Application servers

A5 configuration:

  • 2 processors

  • 14 GB RAM

1 x 127 GB hard disk

sp-applicationservers

APP_SET

AZ-SQL-HA1, AZ-SQL-HA2

Database servers and primary and secondary replicas for AlwaysOn availability groups

A5 configuration:

  • 2 processors

  • 14 GB RAM

sp-databaseservers

DATA_SET

After the test team stabilized the farm environments and completed functional testing, they started the following operations tasks required to configure the disaster recovery solution:

  • Configure full and differential backups.

  • Configure DFSR on the file servers that transfer transaction logs between the on-premises environment and the Azure environment.

  • Configure log shipping on the primary database server.

  • Stabilize, validate, and troubleshoot log shipping, as required. This included identifying and documenting any behavior that might cause issues, such as network latency, which would cause log shipping or DFSR file synchronization failures.
