Risk and mitigation for Office 365 for SSO with Azure Virtual Machines

 

Applies to: Office 365

Summary: Describes the risks of running Office 365 directory integration components on Azure Virtual Machines.

The risks of running all or part of the Office 365 directory integration components on Azure Virtual Machines are similar to the risks of running these components on-premises, and the same mitigation principles apply. Most risks stem from some or all of the servers going down or becoming unavailable.

Risk and mitigation strategies for virtual machines

We recommend that, at a minimum, you use the same mitigation techniques to address issues that you would use if you were running directory integration components on-premises. These include the following:

  • Deploy two or more Active Directory Federation Services (AD FS) servers for each role that is deployed

  • Prepare a plan to reinstall the Azure Active Directory Sync tool in the event of a failure

The additional risks and possible mitigations that must be considered when hosting Office 365 directory integration components on virtual machines are listed in the following table. The indicated service degradation level applies to the scenario where Office 365 integration components are primarily active in Azure.

Risk: Temporary virtual private network (VPN) outage
Severity: Low
Service degradation: Replication traffic is temporarily affected. Users can continue to log on.
Mitigation: Domain controllers must be deployed to Azure. Federation Services must be configured to use external AD FS endpoints. Monitor connectivity between on-premises and Azure.

Risk: Single virtual machine outage (AD FS, AD FS proxies)
Severity: Low
Service degradation: No impact. Steps must be taken to restore the virtual machine that failed.
Mitigation: If a single virtual machine is unavailable, it can be mitigated by deploying multiple instances of the server. Use availability sets and fault domains to avoid redundant instances being affected at the same time.

Risk: Single virtual machine outage (directory synchronization)
Severity: Medium
Service degradation: Directory synchronization is interrupted.
Mitigation: Monitoring of the directory synchronization services by default sends email to the tenant admin if updates aren’t detected during the default replication window. The customer can implement additional monitoring.

Risk: Multiple virtual machines (AD FS, AD FS proxies) (entire fault domain or availability set)
Severity: Critical
Service degradation: Users are no longer able to sign in to Office 365.
Mitigation: Each server from a service set should be deployed to a unique fault domain to isolate the effect of a single-rack failure.

Risk: Azure datacenter service outage
Severity: Critical
Service degradation: Users are no longer able to sign in to Office 365.
Mitigation: Azure takes several steps to ensure availability of the service. For more information, see Manage the Availability of Virtual Machines. Additionally, Azure allows deployments to span datacenters to mitigate these types of outages.
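The recommendation to place each server from a service set in a unique fault domain can be checked against an inventory of deployed servers. The following Python sketch is illustrative only: the inventory format, the server and role names, and the function name find_fault_domain_collisions are assumptions made for this example and are not part of any Azure tool or API.

```python
from collections import defaultdict


def find_fault_domain_collisions(servers):
    """Return {role: {fault_domain: [names]}} for fault domains shared by
    two or more servers of the same role.

    `servers` is a hypothetical inventory: an iterable of
    (name, role, fault_domain) tuples.
    """
    by_role = defaultdict(lambda: defaultdict(list))
    for name, role, fault_domain in servers:
        by_role[role][fault_domain].append(name)

    collisions = {}
    for role, domains in by_role.items():
        # Keep only fault domains that host more than one server of this role.
        shared = {fd: names for fd, names in domains.items() if len(names) > 1}
        if shared:
            collisions[role] = shared
    return collisions


# Hypothetical inventory: both AD FS proxies sit in fault domain 0.
fd_inventory = [
    ("adfs-1", "AD FS", 0),
    ("adfs-2", "AD FS", 1),
    ("proxy-1", "AD FS proxy", 0),
    ("proxy-2", "AD FS proxy", 0),
]
print(find_fault_domain_collisions(fd_inventory))
```

In this sample inventory, the two AD FS servers are spread across fault domains, while the two AD FS proxies are reported because a single-rack failure would take both offline.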

VPN network availability

We strongly recommend that you ensure your VPN connection is always active (24 hours a day, seven days a week). Monitor this connection in the same way you would monitor any other site-to-site network connection. If the connection is unavailable, directory replication to the domain controllers in Azure stops functioning and directory changes don’t synchronize. As a result, your users cannot log on.
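One simple way to monitor the connection is to test, from the on-premises side, whether a service on a domain controller in Azure accepts TCP connections through the tunnel. The following Python sketch shows the idea under stated assumptions: the function names are invented for this example, and the hosts and ports to probe must come from your own environment. It checks reachability only and does not replace a full network monitoring solution.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_targets(targets, timeout=3.0):
    """Map each (host, port) target to its reachability."""
    return {target: is_reachable(target[0], target[1], timeout) for target in targets}
```

For example, check_targets([("10.0.0.4", 389)]) would probe the LDAP port of a domain controller, where 10.0.0.4 is a placeholder address rather than a value from this guide. Running such a check on a schedule and alerting on a False result gives an early signal that replication across the VPN may be interrupted.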

Domain controller and AD FS security

By default, only a Remote Desktop Protocol (RDP) endpoint is accessible from the Internet, allowing Remote Desktop access to the domain controller. After your VPN is set up, you should disable the access to the RDP endpoint on the domain controllers and manage them exclusively through VPN. This blocks all outside access to the domain controllers that are running in Azure.
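After the RDP endpoints are removed, it can be useful to verify that none remain. The following Python sketch audits a hypothetical inventory of virtual machine endpoints. The inventory format and the function name find_public_rdp are assumptions made for illustration, and the sketch does not call any Azure API. It checks the private (virtual machine side) port, because the public port of an endpoint can differ from the port the service listens on.

```python
RDP_PORT = 3389  # default TCP port for Remote Desktop Protocol


def find_public_rdp(vms):
    """Return the names of VMs that still have an Internet-facing endpoint
    mapped to the RDP port.

    `vms` is a hypothetical inventory: a list of dicts with a "name" key and
    an "endpoints" key holding dicts with "public_port" and "private_port".
    """
    exposed = []
    for vm in vms:
        for endpoint in vm.get("endpoints", []):
            if endpoint["private_port"] == RDP_PORT:
                exposed.append(vm["name"])
                break  # one exposed endpoint is enough to flag the VM
    return exposed


# Hypothetical inventory: dc-1 still exposes RDP on a non-default public port.
vm_inventory = [
    {"name": "dc-1", "endpoints": [{"public_port": 50123, "private_port": 3389}]},
    {"name": "dc-2", "endpoints": []},
    {"name": "proxy-1", "endpoints": [{"public_port": 443, "private_port": 443}]},
]
print(find_public_rdp(vm_inventory))
```

In this sample, only dc-1 is reported, which indicates that its RDP endpoint should be removed so that the domain controller is managed exclusively through the VPN.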

Managing a domain controller on virtual machines is similar to managing a domain controller in the on-premises perimeter network. Here are some basic recommendations to get you started:

  • Don’t expose any endpoints to the Internet if they’re not needed. Remove the default Remote Desktop endpoint from the configuration. Allow connectivity to the domain controllers only through the VPN tunnel.

  • Monitor the security event log for suspicious logon patterns. For more information about how to monitor and detect a potential attack, see Security Auditing Overview.

  • Use a strong password policy, not only for domain accounts but also for the Active Directory service recovery account.
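As one example of a suspicious logon pattern, a burst of failed logons against a single account can indicate password guessing. The following Python sketch flags such bursts in security events that have already been exported and parsed. The input format, the threshold, the time window, and the function name are assumptions made for this example. The only value taken from Windows itself is event ID 4625, which the Security log records when an account fails to log on.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FAILED_LOGON_EVENT_ID = 4625  # Windows Security log: "An account failed to log on"


def accounts_with_failed_logon_bursts(events, threshold=5, window=timedelta(minutes=10)):
    """Return a sorted list of accounts with at least `threshold` failed
    logons inside any time span of length `window`.

    `events` is a hypothetical, already-parsed export of the Security log:
    an iterable of dicts with "event_id", "account", and "time" (datetime).
    """
    failures = defaultdict(list)
    for event in events:
        if event["event_id"] == FAILED_LOGON_EVENT_ID:
            failures[event["account"]].append(event["time"])

    flagged = set()
    for account, times in failures.items():
        times.sort()
        start = 0
        for end, current in enumerate(times):
            # Advance the left edge until the span fits inside `window`.
            while current - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(account)
                break
    return sorted(flagged)
```

The default threshold and window are arbitrary starting points; tune them to the normal logon behavior in your environment before alerting on the results.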

Important

For security reasons, never expose an AD FS server directly to the Internet. Federation Services must be published through AD FS proxies.

We recommend using the Windows Server security product baselines that are released for the Windows Server operating systems. These baselines, used with the Microsoft Security Compliance Manager (SCM) tool, enable you to define custom baselines for Windows Server. For more information, see Microsoft Security Compliance Manager.

For more information about securing Windows Server, see Secure Windows Server 2012.