Taking a proactive stance toward server maintenance with dynamic hardware partitioning can help you balance your virtual server workload.
Server virtualization is one of the go-to technologies in today’s enterprise datacenter. Server virtualization lets you create multiple virtual machines (VMs) that share the same physical hardware. Each VM runs a separate instance of an OS using hardware resources assigned by the hypervisor (VM manager).
Ideally, you use VMs to consolidate low-utilization servers. Instead of having numerous under-utilized servers, you have fewer servers, each with multiple VMs. Consolidating servers this way not only saves your organization money by reducing equipment costs and power consumption, but can also reduce management overhead and simplify server maintenance.
Although server virtualization is all the rage right now, it’s not always the best choice for high-utilization scenarios. As server workloads scale up dramatically, you need a server solution that scales dramatically as well. This is where hardware partitioning comes into the picture.
Hardware partitioning creates multiple isolated hardware partitions on a single server. Each hardware partition runs a separate instance of an OS and has processor, memory and I/O host bridge resources assigned to it by a service processor.
A partition manager communicates with the service processor to help you manage hardware partition configurations. Because hardware partitions are isolated from each other, a hardware error on a partitioned server affects only the partition that contains the malfunctioning hardware. This improves the overall reliability and availability of the server.
That said, hardware partitions and VMs aren’t mutually exclusive. You can use the two technologies in concert by installing a VM manager in a hardware partition and creating VMs within that hardware partition. This combination helps you dramatically scale up and out using enterprise-class server hardware. You scale up to meet high-utilization needs and scale out to meet low-utilization needs. This ensures that you get the greatest impact out of both hardware management approaches.
Hardware partitioning can take a static or dynamic approach. In a static hardware partitioning environment, resource allocations are fixed while the system is running. You have to power down and restart an OS instance to change the configuration.
In a dynamic hardware partitioning environment, resource allocations are adjustable while the system is running. This means you can add or replace resources without restarting the OS running on the hardware partition. This greatly improves availability and serviceability.
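The difference between the two approaches can be sketched as a small model. This is an illustration only: the Partition class, its fields and the unit names are hypothetical and don't correspond to any real partition manager interface.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical model of a hardware partition (illustrative only)."""
    name: str
    dynamic: bool                 # True if allocations may change while running
    running: bool = False
    units: list = field(default_factory=list)

    def add_unit(self, unit: str) -> None:
        # A static partition has to be powered down before its
        # configuration can change; a dynamic one does not.
        if self.running and not self.dynamic:
            raise RuntimeError(
                f"{self.name}: restart required to change a static partition"
            )
        self.units.append(unit)


static = Partition("P1", dynamic=False, running=True)
dynamic = Partition("P2", dynamic=True, running=True)

dynamic.add_unit("memory-4")      # allowed while the OS keeps running

try:
    static.add_unit("memory-4")   # rejected: the OS instance is still running
    static_changed = True
except RuntimeError:
    static_changed = False
```

In this sketch the running static partition rejects the change, while the dynamic partition accepts the new unit without a restart.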
Dynamic hardware partitioning requires OS support for optimal results. Windows Server 2008 R2 supports the dynamic hardware partitioning features shown in Figure 1, but does not currently support hot-remove. The released to manufacturing (RTM) version of Windows Server 2008 has the same level of support, except that the Datacenter Edition for x86-based systems supports only hot-add memory and hot-add I/O host bridge. Native OS support for PCI Express lets you hot-plug PCIe devices, such as network adapters and host bus adapters.
Figure 1 Dynamic hardware partitioning support in Windows Server 2008 Edition
To support dynamic allocation, Windows Server 2008 models I/O host bridges, processors and memory as plug-and-play devices. This lets you add or replace these resources. It also lets device drivers and running applications register for related notifications so they can allocate or transition resources. Each resource (memory, a processor or an I/O host bridge) is handled as a discrete unit, referred to as a partition unit.
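The registration pattern described above can be sketched as a simple callback registry. This is a conceptual illustration, not the Windows notification API: the NotificationRegistry class, its methods and the unit identifiers are all hypothetical names made up for the example.

```python
from collections import defaultdict
from typing import Callable

# Resource kinds named in the article; each added item is one partition unit.
RESOURCE_KINDS = {"processor", "memory", "io_host_bridge"}


class NotificationRegistry:
    """Hypothetical stand-in for the OS notification mechanism."""

    def __init__(self) -> None:
        self._listeners = defaultdict(list)

    def _check(self, kind: str) -> None:
        if kind not in RESOURCE_KINDS:
            raise ValueError(f"unknown resource kind: {kind}")

    def register(self, kind: str, callback: Callable[[str], None]) -> None:
        """A driver or application asks to hear about one kind of unit."""
        self._check(kind)
        self._listeners[kind].append(callback)

    def unit_added(self, kind: str, unit_id: str) -> int:
        """Notify every listener for this kind; return how many were called."""
        self._check(kind)
        for callback in self._listeners[kind]:
            callback(unit_id)
        return len(self._listeners[kind])


registry = NotificationRegistry()
seen = []
registry.register("processor", lambda unit_id: seen.append(("scheduler", unit_id)))
registry.register("memory", lambda unit_id: seen.append(("cache", unit_id)))

# Only the processor listener reacts when a processor unit arrives.
notified = registry.unit_added("processor", "cpu-8")
```

Keying listeners by resource kind means a component that cares only about memory is never called when a processor is added, which is the point of registering for specific notifications.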
If a hardware component begins to fail, you’ll be alerted by hardware errors recorded in event logs. If a server is being over-utilized, you’ll be notified by performance counters or similar resource monitoring metrics. In either of those instances, you can proactively maintain your server by dynamically adding resources or replacing a problematic resource. You can do this in any of several ways:
Whether the request is made manually or triggered automatically, the service processor handles the “add or replace” request as a single atomic action. This means a replace is not the same as removing a resource (using hot-remove) and then adding a new resource of the same type (using hot-add). The service processor will handle a dynamic add operation by:
When Windows Server 2008 receives notification of the dynamic addition, it will take the following actions:
Hot-replace is only available for memory and processors (and then only when the replacement resource is identical to the original resource). The service processor handles a replace operation by:
Hot-replace is designed to be transparent to applications running on the partition’s OS. The pseudo S4 sleep state is the same as the regular S4 sleep state, except the OS doesn’t save a hibernation file or turn off. While in the sleep state, the OS ceases all processing and I/O operations, and devices in the partition are placed in a low-power state. If the OS is highly utilized, network connections to the OS may time out during the hot-replace and have to be reconnected.
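The eligibility rule for hot-replace (memory and processors only, and only with an identical replacement) can be expressed as a short check. This is a minimal sketch: the PartitionUnit fields are hypothetical, and treating "identical" as "same kind, model and size" is an assumption made for the example.

```python
from dataclasses import dataclass

# Per the article, only these kinds can be hot-replaced;
# I/O host bridges are excluded.
HOT_REPLACEABLE = {"memory", "processor"}


@dataclass(frozen=True)
class PartitionUnit:
    kind: str    # "memory", "processor" or "io_host_bridge"
    model: str   # hypothetical identifier used to compare units
    size: int    # hypothetical capacity figure (GB or core count)


def can_hot_replace(original: PartitionUnit, spare: PartitionUnit) -> bool:
    """Return True only if the spare may stand in for the original."""
    if original.kind not in HOT_REPLACEABLE:
        return False
    # Dataclass equality compares every field, so any difference
    # in kind, model or size rules the spare out.
    return original == spare


failing = PartitionUnit("memory", "DIMM-A", 16)
same = PartitionUnit("memory", "DIMM-A", 16)
bigger = PartitionUnit("memory", "DIMM-A", 32)
bridge = PartitionUnit("io_host_bridge", "IOH-1", 1)
```

Here can_hot_replace(failing, same) is True, while a larger spare or an I/O host bridge is refused.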
That’s how dynamic hardware partitioning works and how you can use its features in your datacenter to perform proactive maintenance. Remember, you wouldn’t use software RAID on a high-utilization server when hardware RAID is available, and you likely wouldn’t use VMs on a high-utilization server when dynamic hardware partitioning is available.
Nothing is ever set in stone, though. There are times when you might want to combine techniques to get the benefits of being able to scale up and scale out quickly.