Hyper-V Network Virtualization Gateway Architectural Guide
Published: August 13, 2012
Updated: September 26, 2012
Applies To: Windows Server 2012
Windows Server 2012 introduces a scalable, secure multi-tenant solution called Hyper-V Network Virtualization. This technology can be used to build cloud datacenters and makes it easier for customers to incrementally move their network infrastructure to private, hybrid, and public clouds.
For an overview of Hyper-V Network Virtualization, see Hyper-V Network Virtualization Overview, Hyper-V Network Virtualization Technical Details, and Windows Server® 2012 Hyper-V Survival Guide at the Microsoft TechNet web site.
Hyper-V Network Virtualization is deployable with existing server and switch hardware. However, gateway functionality, in addition to a datacenter management solution, is required to complete key scenarios including private cloud, cross-premise, and hybrid cloud deployments.
This article provides requirements for the different types of Hyper-V Network Virtualization gateways and gateway functionality. Note that gateway functionality is specifically called out because existing network appliances may incorporate Hyper-V Network Virtualization gateway functionality. Additional technical details regarding the packet encapsulation format used by Hyper-V Network Virtualization may be found in the NVGRE draft RFC.
Most customer deployments require communication from the network virtualized environment to the non-network virtualized environment. Therefore Hyper-V Network Virtualization gateways are required to bridge the two environments.
Gateways can come in different form factors. They can be built upon Windows Server 2012, incorporated into a Top of Rack (TOR) switch, put into an existing network appliance, or can be a stand-alone network appliance. The routing functionality is required for the private cloud scenario described in the Private Cloud section. In addition to routing, VPN functionality is required for the hybrid cloud scenario (described in the Hybrid Cloud section). A third scenario incorporates the gateway functionality into a load balancer (described in the Load Balancer section).
Large enterprises may be hesitant or, for compliance reasons, unable to move some of their services and data to a public cloud hoster. However, enterprises still want to obtain the benefits of the cloud and network virtualization by consolidating their datacenter resources into a private cloud. In a private cloud deployment, overlapping IP addresses may not be needed because corporations have sufficient internal non-routable address space (for example, 10.x.x.x or 192.168.x.x). Consider the example shown in Figure 1.
Figure 1: Private cloud deployment
Notice in this example that the customer addresses in the virtual subnets are 10.229.x IP addresses while the IP addresses in the non-network virtualized part of the network (CorpNet) are 10.229.1 addresses. In this case the PA addresses for the virtual subnets in the datacenter are 10.60.x IP addresses. This deployment allows the enterprise to take advantage of Hyper-V Network Virtualization’s ability to offer flexibility in both virtual machine placement and cross-subnet live migration in the datacenter fabric. This increases datacenter efficiency thereby reducing both Operational Expenses (OpEx) and Capital Expenses (CapEx).
The Private Cloud scenario shown in Figure 1 requires a private cloud gateway appliance or gateway functionality incorporated into an existing network appliance. The purpose of the private cloud gateway is to provide routing and transition between physical and virtual addresses.
A key advantage of Hyper-V Network Virtualization is that it can seamlessly extend an on-premise datacenter to a Windows Server 2012 based cloud datacenter. This is called a Hybrid Cloud model as shown in Figure 2.
Figure 2: Hybrid cloud deployment
In this scenario an internal server, such as a web server, is moved from the enterprise network into a cloud hoster’s datacenter. Taking advantage of Bring Your Own IP Address offered by the hoster, the enterprise does not need to change the network configuration of the web server virtual machine or any other network endpoint that references that web server. The hoster provides a secure link via a VPN gateway appliance. The enterprise administrators need only configure their on-premise VPN with the appropriate IP address. The web server virtual machine is unaware that it has been moved to the cloud. It remains domain-joined with Active Directory (AD) and uses the enterprise’s DNS server. The web server virtual machine also continues to interact with other servers in the enterprise such as a SQL Server. In this example, all three of these services (AD, DNS, SQL) remain on-premise. Network diagnostic tools such as tracert, which counts the number of network hops between a source and destination, show that the network traffic between the web server virtual machine and SQL is no longer local but routed across the Internet.
The Hybrid Cloud scenario shown in Figure 2 requires a multi-tenant VPN appliance such as a network virtualization gateway appliance. The VPN appliance gateway must interact with the datacenter orchestrator (for example, System Center Virtual Machine Manager) to obtain the appropriate network virtualization policies.
A load balancer provides the illusion of a single IP address (Virtual IP address or VIP) to clients requesting a service. The service is implemented on multiple different IP addresses (Direct IP address or DIP). A traditional load balancer maps network traffic from VIP to DIP via Network Address Translation (NAT). The load balancer re-writes the destination IP address (VIP) of the incoming packet with one of the IP addresses (DIP) behind the load balancer that performs the service. In the case where the load balancer is acting as a bi-directional NAT, the DIP sends a return packet to the load balancer and the load balancer rewrites the source IP (DIP) with the VIP. In this case the client only knows about the VIP and is unaware of the DIP.
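The address rewriting described above can be illustrated with a minimal Python sketch. It is not taken from any product: the class and method names are hypothetical, round-robin DIP selection is an assumption made for the sketch, and the IP addresses are documentation-range placeholders.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    payload: str = ""


class NatLoadBalancer:
    """Toy bi-directional NAT load balancer (hypothetical, for illustration only)."""

    def __init__(self, vip, dips):
        self.vip = vip
        self._next_dip = cycle(dips)  # round-robin selection is an assumption of this sketch
        self._flows = {}              # client IP -> DIP chosen for that client

    def inbound(self, pkt):
        """Client -> VIP: rewrite the destination address from the VIP to a DIP."""
        if pkt.dst_ip != self.vip:
            raise ValueError("packet is not addressed to the VIP")
        dip = self._flows.get(pkt.src_ip)
        if dip is None:
            dip = next(self._next_dip)
            self._flows[pkt.src_ip] = dip
        return replace(pkt, dst_ip=dip)

    def outbound(self, pkt):
        """DIP -> client: rewrite the source address from the DIP back to the VIP."""
        if self._flows.get(pkt.dst_ip) != pkt.src_ip:
            raise ValueError("no matching flow for this return packet")
        return replace(pkt, src_ip=self.vip)


lb = NatLoadBalancer(vip="203.0.113.10", dips=["10.0.0.1", "10.0.0.2"])
request = lb.inbound(Packet("198.51.100.7", "203.0.113.10", "GET /"))
reply = lb.outbound(Packet(request.dst_ip, "198.51.100.7", "200 OK"))
```

After the round trip, `reply.src_ip` is the VIP again, so the client never sees the DIP that served the request.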
In the context of Hyper-V Network Virtualization the load balancer emits NVGRE packets to get the packets to their destination virtual machines.
Consider the following scenario where the Blue Company and the Red Company have their virtual machines in the same multi-tenant datacenter that provides load balancing for the tenants. Blue Company has four virtual machines offering a service behind a load balancer. The Red Company has two virtual machines offering a different service that they want load balanced.
Figure 3: Multi-tenant Load Balancer Example
Figure 3 shows a virtual topology where publicly routable IP addresses are load balanced across multiple destination virtual machines. For example, clients from the Internet send traffic to the Blue public IP (VIP). The load balancer, based on Network Virtualization policy, emits an NVGRE packet to get the packet to the appropriate Hyper-V host. The Hyper-V host will decapsulate the packet and then deliver it to the destination virtual machine.
Figure 4: Multi-tenant Load Balancer deployment
Figure 4 shows a possible deployment of Figure 3. In this case the PA space of the load balanced virtual machines is 192.168.6.x. Blue Customer has four virtual machines that are running on three physical hosts. Red Customer has two virtual machines running on two different hosts. Both Blue and Red are hosting web services. The outside world accesses these services via the appropriate publicly routable VIP (Blue VIP and Red VIP). The load balancer, using network virtualization policy, generates the appropriate NVGRE packet.
Consider the case where a connection initiated from ClientIP and sent to the Blue VIP is load balanced to Blue VM2. The load balancer needs to send a packet to Blue VM2 which has a VSID of BlueVSID1, CA of 10.1.6.21 and a PA of 192.168.6.4. The load balancer’s PA is 192.168.6.2. Therefore the load balancer generates the NVGRE packet header:
MACLB --> MACH1
192.168.6.2 --> 192.168.6.4
MACext --> MACVM2
ClientIP --> 10.1.6.21
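The header listed above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the gateway's actual implementation: the GRE field layout (Key Present flag, protocol type 0x6558, a 24-bit VSID followed by an 8-bit FlowID) follows the NVGRE specification as later published in RFC 7637, the numeric value chosen for BlueVSID1 is hypothetical, and the MAC addresses and ClientIP are kept as the symbolic names used in the article.

```python
import struct
from dataclasses import dataclass

GRE_FLAGS_KEY_PRESENT = 0x2000  # GRE flags/version word with only the Key Present bit set
GRE_PROTO_TEB = 0x6558          # Transparent Ethernet Bridging: payload is an Ethernet frame


def nvgre_gre_header(vsid: int, flow_id: int = 0) -> bytes:
    """Pack the 8-byte GRE header: flags, protocol type, 24-bit VSID, 8-bit FlowID."""
    if not 0 <= vsid < 2**24:
        raise ValueError("VSID must fit in 24 bits")
    if not 0 <= flow_id < 2**8:
        raise ValueError("FlowID must fit in 8 bits")
    return struct.pack("!HHI", GRE_FLAGS_KEY_PRESENT, GRE_PROTO_TEB, (vsid << 8) | flow_id)


@dataclass(frozen=True)
class NvgrePacket:
    outer_src_mac: str   # load balancer MAC
    outer_dst_mac: str   # destination Hyper-V host MAC
    outer_src_ip: str    # load balancer PA
    outer_dst_ip: str    # PA of the host running the destination virtual machine
    gre_header: bytes    # carries the VSID
    inner_src_mac: str
    inner_dst_mac: str   # destination virtual machine MAC
    inner_src_ip: str    # original client address
    inner_dst_ip: str    # destination virtual machine CA


BLUE_VSID1 = 5001  # hypothetical number; the article only names the VSID symbolically

packet = NvgrePacket(
    outer_src_mac="MACLB", outer_dst_mac="MACH1",
    outer_src_ip="192.168.6.2", outer_dst_ip="192.168.6.4",
    gre_header=nvgre_gre_header(BLUE_VSID1),
    inner_src_mac="MACext", inner_dst_mac="MACVM2",
    inner_src_ip="ClientIP", inner_dst_ip="10.1.6.21",
)
print(packet.gre_header.hex())
```

The outer addresses are all PA-space values, while the CA of Blue VM2 appears only in the inner header, which is what lets the physical network forward the packet without knowing anything about tenant addressing.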
Now consider the scenario shown in Figure 5 where Blue and Red customers have multi-tier applications that require load balancing. From the perspective of the virtual machines they communicate to the next tier via VIP1. For example virtual machines in Tier B1 communicate to virtual machines in Tier B2 via Blue VIP1. Blue VIP1 is a Blue CA IP address which will also have a corresponding PA address on the load balancer. A possible deployment topology is shown in Figure 6.
Figure 5: Load balancing between tiers (logical perspective)
Figure 6: Load Balancer placed internally in the datacenter
The Blue CA VIP and the Red CA VIP on the load balancer both have the PA of 192.168.6.2. The load balancer needs to look at the VSID in the NVGRE packet and then determine the appropriate VIP to DIP mapping. For example Blue virtual machine 11 will send a packet to Blue CA VIP. The NVGRE packet header will have Source IP: 192.168.6.7, Destination IP: 192.168.6.2, GRE Key Field Blue VSID1, Inner Source IP: 10.1.5.20, Inner Destination IP: Blue CA VIP, and the rest of the original packet. The load balancer’s policy must match on both the VSID and the VIP because with overlapping CA IP addresses the VIP by itself may not be unique.
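The need to match on both values can be shown with a small lookup table. The sketch below is illustrative only: the VSID numbers, the shared CA VIP value, and the DIP lists are hypothetical, chosen so that the two tenants deliberately overlap.

```python
BLUE_VSID1 = 5001  # hypothetical VSID numbers
RED_VSID1 = 6001

SHARED_CA_VIP = "10.1.5.100"  # hypothetical: both tenants picked the same CA for their VIP

# Policy is keyed on (VSID, VIP); keying on the VIP alone would be ambiguous here.
vip_policy = {
    (BLUE_VSID1, SHARED_CA_VIP): ["10.1.6.21", "10.1.6.22"],
    (RED_VSID1, SHARED_CA_VIP): ["10.1.6.21"],
}


def lookup_dips(vsid, inner_dst_ip):
    """Return the DIPs for the VIP as seen inside the given virtual subnet."""
    try:
        return vip_policy[(vsid, inner_dst_ip)]
    except KeyError:
        raise LookupError(f"no VIP policy for VSID {vsid}, VIP {inner_dst_ip}") from None


blue_dips = lookup_dips(BLUE_VSID1, SHARED_CA_VIP)
red_dips = lookup_dips(RED_VSID1, SHARED_CA_VIP)
```

The same inner destination address resolves to different DIP sets depending on the VSID carried in the GRE key field.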
Gateways must contain Hyper-V Network Virtualization policy and perform actions similar to a Windows Server 2012 host. In the typical deployment the Windows Server 2012 host is aware of all the policy information for every routing domain for all virtual machines located on that host. For example, if a host has a virtual machine with routing domain X and another virtual machine from routing domain Y the host has all the policy information for all virtual subnets contained in both routing domain X and routing domain Y. Even in the case of a single routing domain there could be a substantial amount of policy that needs to be on every Hyper-V host.
Consider the case of a private cloud shown in Figure 1. In this scenario a corporation has an internal cloud with a single routing domain. In this case every Hyper-V host must have policy for all network virtualized virtual machines. This can place a burden on the datacenter orchestrator to maintain this policy on all hosts. Note that it is rare for a virtual machine to actually communicate with every other virtual subnet in the corporation so much of this policy on a given host is in practice not needed.
To reduce the amount of policy needed on each host the Windows Server 2012 host can be put into a special mode where it routes traffic within a given Virtual Subnet based on its own host policy. Any cross virtual-subnet traffic is sent to an external router for a network virtualization policy check and potentially forwarding the packet along the path to its eventual destination. In such a deployment the host only needs to be aware of policy for the virtual subnets of virtual machines running on that host and not for every virtual subnet of the routing domains of the virtual machines running on the host. In this scenario the Hyper-V Network Virtualization Gateway makes all cross-subnet routing decisions. The Hyper-V Network Virtualization Gateway must have all policy information for all virtual machines in the virtual subnets to which it routes packets.
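The split of responsibilities in this mode can be sketched as a forwarding decision on the host. This is a simplified model, not the WNV filter's logic: the class, its methods, the VSID number, the subnet prefix, and the gateway PA are all hypothetical.

```python
import ipaddress


class HostPolicy:
    """Toy model of a host that only holds policy for its own virtual subnets."""

    def __init__(self, gateway_pa):
        self.gateway_pa = gateway_pa  # PA of the external router (gateway)
        self.subnets = {}             # VSID -> CA prefix of that virtual subnet
        self.ca_to_pa = {}            # (VSID, CA) -> PA, local virtual subnets only

    def add_subnet(self, vsid, prefix):
        self.subnets[vsid] = ipaddress.ip_network(prefix)

    def add_vm(self, vsid, ca, pa):
        self.ca_to_pa[(vsid, ca)] = pa

    def next_hop_pa(self, src_vsid, dst_ca):
        """Same virtual subnet: use host policy. Any other subnet: send to the gateway."""
        if ipaddress.ip_address(dst_ca) in self.subnets[src_vsid]:
            try:
                return self.ca_to_pa[(src_vsid, dst_ca)]
            except KeyError:
                raise LookupError(f"no policy for CA {dst_ca} in VSID {src_vsid}") from None
        return self.gateway_pa


host = HostPolicy(gateway_pa="192.168.6.254")     # hypothetical gateway PA
host.add_subnet(5001, "10.1.6.0/24")              # hypothetical VSID and prefix
host.add_vm(5001, "10.1.6.21", "192.168.6.4")

same_subnet_hop = host.next_hop_pa(5001, "10.1.6.21")
cross_subnet_hop = host.next_hop_pa(5001, "10.1.5.20")
```

In this model the host table never grows with the number of virtual subnets in the routing domain; only the gateway needs the full set.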
For any Hyper-V Network Virtualization deployment a datacenter orchestrator is required. One such datacenter orchestrator is System Center Virtual Machine Manager (VMM). The datacenter orchestrator is responsible for providing a mechanism to get Hyper-V Network Virtualization policies to the appropriate gateway appliance. Note that PA IP routing is not managed by the datacenter orchestrator.
The Hyper-V Network Virtualization Gateway partner should provide a management console for basic configuration of the gateway appliance or appliance functionality. This management console should handle operations such as PA assignment, high availability, monitoring, and authentication. The management console will be responsible for configuring the aspects of the gateway not managed by a datacenter orchestrator such as VMM.
High Availability by using multiple physical Hyper-V Network Virtualization gateways is outside the scope of this article. The datacenter orchestrator (for example VMM) will need to configure and manage the gateway. Scenarios in which the datacenter deploys multiple gateways include scalability and resiliency. For scalability a single physical gateway may not be able to handle the load required for a datacenter. Additionally a single gateway introduces a single point of failure. The partner’s management console should handle deployments consisting of multiple gateways.
In the VMM model the Hyper-V Network Virtualization Gateway is managed via a PowerShell plug-in module. Partners building Hyper-V Network Virtualization gateways need to create a PowerShell plug-in module which physically runs on the VMM server. This plug-in module will communicate policy to the gateway. Figure 7 shows a block diagram of VMM managing a Hyper-V Network Virtualization deployment. Note that a partner plug-in runs inside the VMM server. This plug-in communicates to the gateway appliances. The protocol used for this communication is not specified here. The partner may determine the appropriate protocol. Note that VMM uses the Microsoft implementation of WS-Management Protocol called Windows Remote Management (WinRM) and Windows Management Instrumentation (WMI) to manage the Windows Server 2012 hosts and update network virtualization policies.
Figure 7: Using System Center to manage Hyper-V Network Virtualization
Windows Server 2012 can be used as the base platform for a Hyper-V Network Virtualization Gateway appliance.
Figure 8 shows the architecture for the Private Cloud Router using Windows Server 2012 as the base platform. In this scenario all virtual machines are in the same routing domain. The red, blue, and purple virtual machines have a CA IP prefix of 157.4, 157.3, 157.2, respectively. The services on CorpNet have IP addresses with a prefix of 157.1.
Figure 8: Private Cloud Gateway using Windows Server 2012
The WNV filter in the gateway receives traffic from the red, blue, and purple virtual subnets and sends it to CorpNet via the parent (management) operating system stack. The virtual switch has a parent (host) VNIC for communication to the host operating system stack from the network virtualized world. The other network interface is bound to the physical network interface (pnic2) connected to CorpNet. Note that there is no requirement that there be two physical network interfaces in the Hyper-V Network Virtualization Gateway shown in Figure 8. A single physical network interface could be used along with two parent virtual network interfaces. However, datacenter administrators may prefer a physical separation of traffic. Two physical network interfaces in the gateway allow the consolidated datacenter and CorpNet traffic to interact with the gateway via different physical switches. Two physical network interfaces, one for the network virtualized world and one for the non-network virtualized world, provide more flexibility in customer deployments.
The Hybrid Cloud scenario enables an enterprise to seamlessly expand their on-premises datacenter into the cloud. This requires a site to site VPN tunnel. This can be accomplished with Windows Server 2012 as the host platform and a per tenant Windows Server 2012 guest virtual machine running a Site To Site (S2S) VPN tunnel connecting the cloud datacenter with various on-premise datacenters. Windows Server 2012 S2S VPN supports IKEv2 and configuration of remote policy can be accomplished via PowerShell/WMI. In addition Windows Server 2012 guest virtual machines support new network interface offload capabilities that enhance the performance and scalability of the gateway appliance. These offload capabilities are discussed below in the Hardware Considerations section.
Figure 9: Hybrid Cloud with Windows Server 2012 based Gateway
Figure 9 shows a scenario where Red Corp and Blue Corp are customers of Hoster Cloud. Hoster Cloud has deployed Windows Server 2012 based per tenant virtual machine gateways, allowing Red Corp and Blue Corp to seamlessly extend their on-premise datacenters into Hoster Cloud. In Figure 10 there is no requirement that Red Corp or Blue Corp run Windows Server 2012 S2S VPN, only that the customer’s on-premise S2S VPN support IKEv2 to interact with the corresponding Windows Server 2012 S2S virtual machines running on HostGW.
Figure 9 shows the internal architecture for HostGW. Each Routing Domain requires its own virtual machine. The technical reason for this is that a vmnic can only be associated with a single Virtual Subnet (VSID) and a VSID can only be part of a single routing domain. The VSID switch port ACL does not support trunking of VSIDs. Therefore the simplest way to provide isolation is with a per tenant (Routing Domain) gateway virtual machine.
Each of the virtual machines is dual homed which means they have two virtual network interfaces. One of the virtual network interfaces has the appropriate VSID associated with it. The other virtual network interface has a VSID of 0 which means traffic is not modified by the WNV filter. The Windows Server 2012 virtual machine is running RRAS and using IKEv2 to create a secure tunnel between Hoster Cloud and the customer’s on premise gateway.
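The one-gateway-virtual-machine-per-routing-domain rule and the dual-homed layout can be summarized in a short planning sketch. Everything here is hypothetical: the function, the VSID numbers, and in particular the choice of attaching the tenant-side interface to the lowest VSID in each routing domain, which is an arbitrary simplification for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayVm:
    routing_domain: str
    tenant_nic_vsid: int        # a vmnic can be associated with only one VSID
    external_nic_vsid: int = 0  # VSID 0: traffic is not modified by the WNV filter


def plan_gateway_vms(vsid_to_routing_domain):
    """Plan one dual-homed gateway VM per routing domain."""
    chosen = {}
    for vsid, routing_domain in sorted(vsid_to_routing_domain.items()):
        # Keep the first (lowest) VSID seen for each routing domain; arbitrary choice.
        chosen.setdefault(routing_domain, vsid)
    return [GatewayVm(rd, vsid) for rd, vsid in chosen.items()]


# Hypothetical tenants: Blue has two virtual subnets, Red has one.
plan = plan_gateway_vms({5001: "Blue", 5002: "Blue", 6001: "Red"})
```

Three virtual subnets across two routing domains yield two gateway virtual machines, each with one tenant-facing interface and one interface on VSID 0.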
Figure 10: Hybrid Cloud with Windows Server 2012 based Per Tenant VM Gateways
Figure 10 shows the architecture where VMM is managing a Hyper-V Network Virtualization deployment. The partner has a plug-in that runs in the VMM server. When using Windows Server 2012 as a Hyper-V Network Virtualization gateway appliance a local management process running in Windows is required as the end point for this communication from the plug-in running in the VMM server. This is how the plug-in is able to communicate network virtualization policy to the WNV filter running on HostGW.
The scalability of the Windows Server 2012 based gateway appliance will be determined by the server hardware including:
number of CPU cores
amount of RAM
offload capabilities and performance of the network interfaces
In general, a dual socket server class platform provided by major OEMs should be a sufficient base platform. The number of virtual machines that can effectively run on the server appliance is typically determined by the amount of RAM in the server.
The choice of network interfaces can also have a great impact on the performance and scalability of the gateway appliance. Figure 10 shows a two network interface configuration. Note that ALL traffic sent on the virtual subnet path will be NVGRE traffic. For encapsulated packets traditional offloads such as Large Send Offload (LSO) on the send path and Virtual Machine Queues (VMQ) on the receive path will not provide the expected benefit because they operate on the outer packet header. The outer header for NVGRE traffic makes it appear as though all traffic is generated using the same IP address and destined for a single IP address. Network interface offloads provide substantial benefit to the Hyper-V switch performance. A new feature in Windows Server 2012 is GRE Task Offload where the network interface will operate on the inner packet header for standard offloads such that LSO and VMQ provide their expected performance and scalability benefits. Therefore the network interface on the path for Virtual Subnets in Figure 10 should support GRE Task Offload. Such network interfaces are expected to be commercially available by the end of 2012.
Another new feature in Windows Server 2012 is IPsec Task Offload for guest virtual machines. IPsec is used for securing site to site VPN tunnels. Both sides of the S2S tunnel must support IKEv2. If the Cross-Premise gateway is NOT behind a NAT then IPsec Task Offload (IPsecTO) can provide better performance by having the External Network network interface in Figure 9 perform the per packet encryption operations on behalf of the per Tenant RRAS virtual machine. Both sides of the S2S tunnel would need to use AES-GCM128 to take advantage of IPsecTO. Therefore, the following conditions must be met for the RRAS virtual machine to take advantage of IPsecTO:
Both sides of the S2S tunnel must use AES-GCM128
The gateway appliance must not be behind a NAT
Because IPsec operations can consume a large amount of CPU, IPsecTO can provide a significant CPU reduction and also an increase in network throughput.
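The two conditions listed above amount to a simple eligibility check, sketched here in Python. The function name and the cipher string are hypothetical labels for illustration, not identifiers from RRAS or any Windows API.

```python
REQUIRED_CIPHER = "AES-GCM128"  # label used in this article; not an API constant


def can_use_ipsec_task_offload(local_cipher, remote_cipher, behind_nat):
    """True only if both tunnel ends use AES-GCM128 and the gateway is not behind a NAT."""
    both_ends_gcm = local_cipher == REQUIRED_CIPHER and remote_cipher == REQUIRED_CIPHER
    return both_ends_gcm and not behind_nat


eligible = can_use_ipsec_task_offload("AES-GCM128", "AES-GCM128", behind_nat=False)
```

If either condition fails, the per packet encryption work stays on the CPU of the RRAS virtual machine.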
SR-IOV is a new Windows Server 2012 feature that directly exposes a physical network interface to the virtual machine. SR-IOV bypasses the virtual switch. Unfortunately, current IPsecTO network interfaces on the market do not support IPsecTO on the SR-IOV path. The Windows Server 2012 Hyper-V Switch introduces new policy-based features such as ACLs and QoS. If any of these Hyper-V switch policies are set on a virtual network interface, the virtual network interface traffic will not take the SR-IOV path and instead will need to go through the Hyper-V Switch path for policy enforcement. When traffic does take the SR-IOV path instead of the Hyper-V Switch path there will be a significant reduction in CPU utilization.
For the Private Cloud gateway similar considerations should be evaluated regarding hardware requirements.
Additional information regarding Hyper-V Network Virtualization and System Center Virtual Machine Manager can be found in the following articles (articles without links are still under development):
Windows Server 2012 Hyper-V Network Virtualization Survival Guide
Hyper-V Network Virtualization Overview
Hyper-V Network Virtualization Technical Details
Connect hosting provider and tenant networks for hybrid cloud services
//BUILD talk on Hyper-V Network Virtualization
//BUILD Keynote Demo (starts 13.5 minutes into video)
NVGRE Draft RFC