Disaster recovery in Microsoft Dynamics 365 (online)
Updated: November 29, 2016
Applies To: Dynamics 365 (online), Dynamics 365 (on-premises), Dynamics CRM 2016, Dynamics CRM Online
Disaster recovery is a feature of Microsoft Dynamics 365 (online) for recovering from a planned or unplanned service interruption. An example of a planned service interruption is regular, periodic datacenter system maintenance. An example of an unplanned service interruption is a failure of a key computer system or network component in a datacenter. In either case, you temporarily lose access to your organization's data and the Microsoft Dynamics 365 (online) services.
Planned service interruptions are preceded by a public notice in the web application or Dynamics 365 for Outlook identifying the date and time of the service maintenance so that businesses can plan for the interruption in accessing their organization's data. Unplanned service interruptions result in a notice that the organization is currently undergoing unplanned maintenance.
When a failure or a disaster occurs, the administrators of the Microsoft Dynamics 365 (online) datacenter apply well-defined processes to recover from the service interruption. The processes and software used to recover from these service interruptions are known as disaster recovery failover. Your Microsoft Dynamics 365 (online) datacenter maintains a duplicate, synchronized (alternate) copy of your organization's data on a different server. If a disaster in the datacenter leaves you without access to your data, the administrators monitoring the datacenter can switch access from your primary organization to this alternate organization, thereby minimizing the service interruption. When the failure has been corrected, service access to your primary organization can be restored.
This recovery happens in the datacenter and is handled transparently to you and your .NET managed applications. However, there is one issue that application developers must deal with: data loss. When the Microsoft Dynamics 365 (online) services encounter a failure, data change operations that your application performs using web service calls may not complete successfully. This can result in data loss. The following sections in this topic describe how you can write your applications to deal with data loss issues.
Developers can write their applications to account for datacenter failure and recovery by implementing code to check for and handle a failover event gracefully. An application that uses a service proxy can subscribe to the EndpointSwitched and EndpointSwitchRequired notification events of the ServiceProxy<TService> class. These events are also available in derived classes like OrganizationServiceProxy. For more information about these events, see the ServiceProxy<TService> class documentation.
Your application can check the EndpointAutoSwitchEnabled property to determine whether automatic failover behavior is enabled for an organization. This property is set to true for organizations where a failover alternate endpoint is available. Beyond optionally subscribing to the notification events when EndpointAutoSwitchEnabled is true, no special code is required in your application.
Typical application logic flow for a disaster event and failover
1. A disaster event occurs in the Microsoft Dynamics 365 (online) datacenter.
2. The service proxy class object receives an exception after attempting the service call.
3. If the target organization of the call is not enabled for failover, go to step 9.
4. The EndpointSwitchRequired event is raised.
5. The EndpointSwitched event is raised.
6. The service proxy class object automatically tries the call again.
7. If the second call was successful, the application continues normally.
8. If the call was not successful, an exception is returned to the application: EndpointNotFoundException, TimeoutException, or FaultException<OrganizationServiceFault> where fault.Detail.ErrorCode == -2147176347.
9. You may want to implement code that checks for potential data loss after endpoint switch events are received and handles it appropriately.
After the disaster affecting the primary organization endpoint has been corrected in the datacenter, a fail back from the alternate endpoint URL to the primary endpoint URL for the organization occurs as part of planned organization maintenance.
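The flow above can be illustrated with a small language-neutral model. The following Python sketch is not the SDK's ServiceProxy<TService> implementation. The class FailoverProxyModel, its handler lists, and the mapping of the .NET exceptions to Python's ConnectionError and TimeoutError are all assumptions made for illustration. It shows only the ordering: a failed call, the two notifications, one automatic retry, and an exception if the retry also fails.

```python
from typing import Callable, List, Optional


class FailoverProxyModel:
    """Toy model of the failover flow; every name here is hypothetical."""

    def __init__(self, primary: str, alternate: Optional[str]) -> None:
        self.endpoint = primary
        self.alternate = alternate
        # Handlers play the role of event subscriptions.
        self.endpoint_switch_required: List[Callable[[str], None]] = []
        self.endpoint_switched: List[Callable[[str], None]] = []

    @property
    def endpoint_auto_switch_enabled(self) -> bool:
        # Modeled as "an alternate endpoint is available".
        return self.alternate is not None

    def execute(self, call: Callable[[str], object]) -> object:
        try:
            return call(self.endpoint)
        except (ConnectionError, TimeoutError):
            # Organization not enabled for failover: surface the error.
            if not self.endpoint_auto_switch_enabled:
                raise
            # Notify subscribers that a switch is needed, then switch.
            for handler in self.endpoint_switch_required:
                handler(self.endpoint)
            self.endpoint = self.alternate
            for handler in self.endpoint_switched:
                handler(self.endpoint)
            # One automatic retry; a second failure propagates to the caller.
            return call(self.endpoint)
```

A handler appended to endpoint_switched is a natural place to start a data-loss check, because it runs before the retried call returns.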
Applications that do not link to the Microsoft Dynamics 365 SDK assemblies, for example Java applications that access the web services by using SOAP or OData, can try accessing the failover URL for the target organization. The URL for a failover alternate organization is the same as the URL for the primary organization with “--S” appended to the organization name. For example, an organization named Contoso would have the primary and alternate URLs shown in the following table.
Primary Organization URL
Alternate Organization URL
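As a minimal sketch of the naming rule, the following Python function derives an alternate URL from a primary one. It assumes that the organization name is the first label of the host name. The function name alternate_org_url and the host contoso.crm.example.com are placeholders, not values taken from this topic. The suffix is written in lowercase because host names are not case-sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def alternate_org_url(primary_url: str, suffix: str = "--s") -> str:
    """Append the failover suffix to the organization label of the host."""
    parts = urlsplit(primary_url)
    host = parts.hostname or ""  # urlsplit lowercases the host name
    org, dot, rest = host.partition(".")
    if not org or not dot:
        raise ValueError("expected a host of the form <organization>.<domain>")
    if org.endswith(suffix):
        return primary_url  # already an alternate URL; leave unchanged
    new_host = f"{org}{suffix}.{rest}"
    # Keep an explicit port if one was given (any user info is dropped).
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

Calling alternate_org_url("https://contoso.crm.example.com") under these assumptions yields https://contoso--s.crm.example.com.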
For non-.NET applications, there is no notification event to which your application can subscribe to receive notice of a service interruption and failover. Your application will begin to receive a variety of fault exceptions, as listed previously, during the service interruption. At that point, the application can attempt to connect to the failover alternate URL for the target organization. After the disaster has been corrected, a fail back to the primary URL for the organization occurs as part of planned organization maintenance.
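One way a client without notification events might react is sketched below in Python. The ServiceFault class, the is_interruption check, and the send callable are hypothetical. Treating ConnectionError and TimeoutError as the counterparts of EndpointNotFoundException and TimeoutException is also an assumption. The only value taken from this topic is the error code -2147176347.

```python
from typing import Callable, Tuple

# Error code quoted in this topic for the organization service fault.
FAILOVER_ERROR_CODE = -2147176347


class ServiceFault(Exception):
    """Hypothetical wrapper for a SOAP or OData fault with a numeric code."""

    def __init__(self, error_code: int, message: str = "") -> None:
        super().__init__(message)
        self.error_code = error_code


def is_interruption(exc: BaseException) -> bool:
    """Return True for errors that suggest a service interruption."""
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True
    return isinstance(exc, ServiceFault) and exc.error_code == FAILOVER_ERROR_CODE


def send_with_fallback(
    send: Callable[[str], object], primary_url: str, alternate_url: str
) -> Tuple[str, object]:
    """Call send(primary_url); on an interruption, try alternate_url once.

    Returns the URL that answered together with the result.
    """
    try:
        return primary_url, send(primary_url)
    except Exception as exc:
        if not is_interruption(exc):
            raise  # unrelated errors are not a reason to switch endpoints
    return alternate_url, send(alternate_url)
```

Returning the URL that answered lets the caller remember the active endpoint for later requests. Because the fail back happens during planned maintenance, the caller should also be prepared to return to the primary URL.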
The following list describes some best practices you can implement in your applications to make them more robust when they deal with service interruptions and failure recovery.
Write application code to check whether the EndpointAutoSwitchEnabled property is set to true. If it is, consider subscribing to the EndpointSwitched and EndpointSwitchRequired notification events.
If your application works with critical data where any data loss is disastrous, write event handler code or catch the indicated exceptions to handle the disaster event and failover as appropriate for business needs.
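What "handling" data loss means depends on the application. One possible approach, sketched here in Python, is to keep a journal of writes that were sent but never confirmed, and to reconcile that journal after a failover. The WriteJournal class, the reconcile function, and the exists and resend callables are hypothetical. The sketch assumes that each write carries an identifier chosen by the client, so that the application can ask the service whether the write arrived.

```python
from typing import Callable, Dict, List


class WriteJournal:
    """Tracks writes that were sent but not yet confirmed by the service."""

    def __init__(self) -> None:
        self._pending: Dict[str, dict] = {}

    def begin(self, op_id: str, payload: dict) -> None:
        self._pending[op_id] = payload  # record before sending

    def confirm(self, op_id: str) -> None:
        self._pending.pop(op_id, None)  # service acknowledged the write

    def unconfirmed(self) -> Dict[str, dict]:
        return dict(self._pending)  # copy, safe to iterate while confirming


def reconcile(
    journal: WriteJournal,
    exists: Callable[[str], bool],
    resend: Callable[[str, dict], None],
) -> List[str]:
    """After an endpoint switch, resend unconfirmed writes that are missing."""
    resent = []
    for op_id, payload in journal.unconfirmed().items():
        if not exists(op_id):
            resend(op_id, payload)  # if this raises, the entry stays pending
            resent.append(op_id)
        journal.confirm(op_id)
    return resent
```

An application could call reconcile from its endpoint-switch handler, or from the code path that catches the exceptions listed earlier in this topic.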
© 2016 Microsoft. All rights reserved.