Virtualization: Physical vs. Virtual Clusters

Deploying multiple virtual machines into clusters requires some special management and configuration techniques.

Kai Hwang, Jack Dongarra and Geoffrey Fox

Adapted from “Distributed and Cloud Computing: From Parallel Processing to the Internet of Things” (Syngress, an imprint of Elsevier, 2011)

Clustering is an effective technique for ensuring high availability. It’s even more effective, flexible and cost-efficient when combined with virtualization technology. Virtual clusters are built with virtual machines (VMs) installed at distributed servers from one or more physical clusters. The VMs in a virtual cluster are interconnected logically by a virtual network across several physical networks.

Virtual clusters are formed with physical machines or VMs hosted by multiple physical clusters. Provisioning VMs to a virtual cluster is done dynamically, giving it the following properties (a brief sketch in code follows the list):

  • The virtual cluster nodes can be either physical machines or VMs. You can deploy multiple VMs running different OSes on the same physical node.
  • A VM runs with a guest OS (often different from the host OS) that manages resources in the physical machine where the VM is implemented.
  • The purpose of using VMs is to consolidate multiple functionalities on the same server. This will greatly enhance server utilization and application flexibility.
  • You can replicate VMs on multiple servers to promote distributed parallelism, fault tolerance and disaster recovery.
  • The size (number of nodes) of a virtual cluster can grow or shrink dynamically, similar to how an overlay network varies in size within a peer-to-peer network.
  • If any physical node fails, it might disable some of the VMs installed on the failed node. However, a VM failure will not pull down the host system.
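
To make these properties concrete, here's a minimal sketch in Python of a virtual cluster as a logical grouping of VMs over physical hosts. All the names (VM, VirtualCluster, on_host_failure) are illustrative, not part of any particular system:

  from dataclasses import dataclass, field

  @dataclass
  class VM:
      name: str
      guest_os: str   # may differ from the host OS
      host: str       # physical node currently running this VM

  @dataclass
  class VirtualCluster:
      """A logical grouping of VMs that may span several physical clusters."""
      name: str
      vms: dict[str, VM] = field(default_factory=dict)

      def add(self, vm: VM) -> None:        # the cluster grows dynamically
          self.vms[vm.name] = vm

      def remove(self, name: str) -> None:  # the cluster shrinks dynamically
          self.vms.pop(name, None)

      def on_host_failure(self, host: str) -> list[str]:
          """A failed physical node disables only the VMs placed on it."""
          lost = [n for n, vm in self.vms.items() if vm.host == host]
          for n in lost:
              self.remove(n)
          return lost

  # Two VMs with different guest OSes can share one physical node.
  vc = VirtualCluster("analytics")
  vc.add(VM("vm1", guest_os="Linux", host="nodeA"))
  vc.add(VM("vm2", guest_os="Windows", host="nodeA"))
  print(vc.on_host_failure("nodeA"))   # ['vm1', 'vm2']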

You have to effectively manage the VMs running on masses of physical computing nodes organized into virtual clusters in a high-performance virtualized computing environment. This involves virtual cluster deployment, monitoring and management over large-scale clusters. You'll also have to apply resource scheduling, load balancing, server consolidation, fault tolerance and other techniques. In a virtual cluster system, it's also important to store the large number of VM images efficiently.

There are common installations for most users or applications, such as OS- or user-level programming libraries. You can preinstall these software packages as templates (called template VMs). With these templates, users can build their own software stacks and copy new OS instances from the template VM. You can have user-specific components such as programming libraries and applications installed on those instances in advance.
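
As a simple illustration, cloning a user instance from a template VM might look like the following sketch. The directory layout, file naming and clone_from_template helper are assumptions; a real system would install the user-specific packages into the image itself:

  import shutil
  from pathlib import Path

  # Hypothetical layout: template images under /srv/templates,
  # per-user instances under /srv/instances.
  TEMPLATES = Path("/srv/templates")
  INSTANCES = Path("/srv/instances")

  def clone_from_template(template: str, user: str,
                          extra_packages: list[str]) -> Path:
      """Copy a template image and record user-specific additions."""
      src = TEMPLATES / f"{template}.img"
      dst = INSTANCES / f"{user}-{template}.img"
      shutil.copyfile(src, dst)   # a new OS instance from the template
      # Record the packages to preinstall; a real system would install
      # them into the image rather than just noting them.
      dst.with_suffix(".pkgs").write_text("\n".join(extra_packages))
      return dst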

The physical machines (host systems) and VMs (guest systems) may run with different OSes. You can have each VM installed on a remote server or replicated on multiple servers belonging to the same or different physical clusters. The boundary of a virtual cluster can change as you add, remove or dynamically migrate VM nodes over time.

Fast Deployment and Effective Scheduling

The virtual environment you design should be capable of fast deployment. Here, deployment means two things: constructing and distributing software stacks (OSes, libraries and applications) to physical nodes within clusters as fast as possible, and quickly switching runtime environments from one user's virtual cluster to another's. When a user finishes using the system, the corresponding virtual cluster should shut down or quickly suspend operation, freeing resources to run other users' VMs.
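
A minimal sketch of that switch-over, assuming hypothetical suspend_vm and resume_vm hooks that stand in for whatever save/restore operations the hypervisor actually provides:

  # Placeholders for hypervisor calls; a real system would save and
  # restore VM state here.
  def suspend_vm(vm: str) -> None:
      print(f"suspending {vm}")

  def resume_vm(vm: str) -> None:
      print(f"resuming {vm}")

  def switch_clusters(outgoing: list[str], incoming: list[str]) -> None:
      """Free the resources held by one user's virtual cluster, then
      bring another user's cluster online on the reclaimed capacity."""
      for vm in outgoing:
          suspend_vm(vm)
      for vm in incoming:
          resume_vm(vm)

  switch_clusters(["alice-vm1", "alice-vm2"], ["bob-vm1"])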

The concept of “green computing” has recently attracted a lot of attention. However, previous approaches have focused on energy cost savings at the single workstation level. They lacked a broader vision. Consequently, they would not necessarily reduce the entire cluster’s power consumption.

You can only apply cluster-wide energy-efficient techniques to homogeneous workstations and specific applications. Live migration of VMs lets you transfer a workload from one node to another. However, it doesn't guarantee that VMs can migrate freely among any nodes.

You can’t ignore the potential overhead caused by VM live migrations. That overhead might have serious negative effects on cluster utilization, throughput and quality of service issues. Therefore, the challenge is to determine how to design migration strategies to implement green computing without influencing cluster performance.
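
For illustration, here's a sketch of one possible migration heuristic along those lines: drain lightly loaded hosts onto busier ones (so the idle hosts can be powered down), but skip any move that would push a destination past a high-water mark. The thresholds and the additive load model are assumptions, not a prescribed policy:

  def plan_migrations(host_load, vms, low=0.2, high=0.8):
      """host_load maps host -> utilization in [0, 1]; vms maps
      vm -> (current host, load share). Returns (vm, src, dst) moves
      that drain under-utilized hosts without overloading any target."""
      donors = {h for h, u in host_load.items() if u < low}
      moves = []
      for vm, (src, share) in vms.items():
          if src not in donors:
              continue
          # Candidate destinations: non-donor hosts with headroom left.
          fits = [h for h, u in host_load.items()
                  if h not in donors and u + share <= high]
          if not fits:
              continue   # no safe destination; skip this migration
          dst = max(fits, key=host_load.get)   # pack onto the busiest host
          moves.append((vm, src, dst))
          host_load[src] -= share
          host_load[dst] += share
      return moves

  hosts = {"n1": 0.10, "n2": 0.60, "n3": 0.15}
  vms = {"vmA": ("n1", 0.10), "vmB": ("n3", 0.15)}
  print(plan_migrations(hosts, vms))   # [('vmA', 'n1', 'n2')]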

Another advantage virtualization brings to clustering is load balancing of applications within a virtual cluster. You can use the load index and frequency of user logins to achieve a load-balanced status. You can implement the automatic scale-up and scale-down mechanism of a virtual cluster based on this model.

Consequently, you can increase node resource utilization and shorten system response time. Mapping VMs onto the most appropriate physical nodes should improve performance. Dynamically adjusting loads among nodes by live migrating VMs is helpful when cluster node workloads become unbalanced.
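
Here's one way the load index and scaling decision might be computed. The weighted mix of CPU utilization and login frequency, and the thresholds, are assumptions made for the sketch:

  def load_index(cpu_util: float, logins_per_min: float,
                 w_cpu: float = 0.7, w_login: float = 0.3) -> float:
      """Blend CPU utilization with (normalized) user login frequency."""
      return w_cpu * cpu_util + w_login * min(logins_per_min / 10.0, 1.0)

  def scaling_decision(index: float, up: float = 0.75,
                       down: float = 0.25) -> str:
      if index > up:
          return "scale-up"     # add VM nodes to the virtual cluster
      if index < down:
          return "scale-down"   # release VM nodes back to the pool
      return "steady"

  print(scaling_decision(load_index(cpu_util=0.9, logins_per_min=8)))  # scale-up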

High-Performance Virtual Storage

You can distribute the template VM to several physical hosts in the cluster to customize VMs. You can also use existing software packages to reduce the time for customization. It’s important to efficiently manage the disk space occupied by your template software packages. You can carefully design the storage architecture to reduce duplicated blocks in a distributed file system of virtual clusters, and use hash values to compare the contents of data blocks.

Your users will have their own profiles that store data block identification for corresponding VMs in a user-specific virtual cluster. When users modify the corresponding data, new data blocks are created. Newly created blocks are identified within the user profiles.
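
Here's a compact sketch of both ideas: a deduplicated block store keyed by content hashes, plus a per-user profile that is simply the ordered list of block identifiers making up a VM image. SHA-256 and the 4 KB block size are illustrative choices:

  import hashlib

  BLOCK_SIZE = 4096   # illustrative block size

  class BlockStore:
      """Deduplicated store: identical blocks are kept once, keyed by
      the hash of their contents."""
      def __init__(self):
          self.blocks: dict[str, bytes] = {}

      def put(self, data: bytes) -> str:
          digest = hashlib.sha256(data).hexdigest()
          self.blocks.setdefault(digest, data)   # duplicates cost nothing
          return digest

  def store_image(store: BlockStore, image: bytes) -> list[str]:
      """Split an image into blocks; the returned list of block IDs is
      the user's profile for this VM."""
      return [store.put(image[i:i + BLOCK_SIZE])
              for i in range(0, len(image), BLOCK_SIZE)]

  store = BlockStore()
  template = bytes(3 * BLOCK_SIZE)            # a shared template image
  profile_a = store_image(store, template)
  profile_b = store_image(store, template)    # same blocks: no new storage
  print(len(store.blocks))                    # 1 unique block so far
  # When a user modifies data, only the new blocks enter the store.
  modified = template[:BLOCK_SIZE] + b"x" * BLOCK_SIZE + template[2 * BLOCK_SIZE:]
  profile_a = store_image(store, modified)
  print(len(store.blocks))                    # now 2 unique blocks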

Basically, there are four steps to deploy a group of VMs onto a target cluster (a sketch follows the list):

  1. Prepare the disk image.
  2. Configure the VMs.
  3. Choose destination nodes.
  4. Execute the VM deployment command on every host.
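
Here's a minimal end-to-end sketch of the four steps. The helper names are hypothetical, and each function is a stub for the real work:

  def prepare_disk_image(template: str, vm: str) -> str:
      """Step 1: derive the VM's disk image from a template."""
      return f"/srv/instances/{vm}-{template}.img"

  def configure_vm(vm: str, cpus: int, mem_mb: int) -> dict:
      """Step 2: record the VM's configuration."""
      return {"name": vm, "cpus": cpus, "mem_mb": mem_mb}

  def choose_destination(hosts: dict[str, float]) -> str:
      """Step 3: pick the least-loaded host."""
      return min(hosts, key=hosts.get)

  def deploy(config: dict, image: str, host: str) -> None:
      """Step 4: issue the deployment command on the chosen host."""
      print(f"deploying {config['name']} ({image}) on {host}")

  hosts = {"n1": 0.7, "n2": 0.3}
  for vm in ["vm1", "vm2"]:
      image = prepare_disk_image("ubuntu-base", vm)
      config = configure_vm(vm, cpus=2, mem_mb=2048)
      deploy(config, image, choose_destination(hosts))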

Many systems use templates to simplify the disk image preparation process. A template is a disk image that includes a preinstalled OS with or without certain application software. Users choose a proper template according to their requirements and make a duplicate as their own disk image.

Templates can use the Copy-on-Write (COW) format. A new COW file records only the differences from its backing template, so it's very small and easy to create and transfer. This definitely reduces disk space consumption. It also shortens VM deployment time, making it much more efficient than copying the whole raw image file.
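
As one concrete illustration, QEMU's qcow2 format works this way: the template serves as a read-only backing file, and each VM gets a small overlay that records only its own writes. A sketch (the paths are illustrative):

  import subprocess

  template = "/srv/templates/ubuntu-base.qcow2"   # shared, read-only
  overlay = "/srv/instances/vm1.qcow2"            # small, per-VM

  # Reads of unmodified blocks fall through to the backing template;
  # only the VM's own writes land in the overlay.
  subprocess.run(
      ["qemu-img", "create",
       "-f", "qcow2",    # format of the new overlay
       "-b", template,   # backing (template) file
       "-F", "qcow2",    # format of the backing file
       overlay],
      check=True,
  )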

Every VM is configured with a name, disk image, network setting, allocated CPU and memory. You need to record each VM configuration into a file. However, this is inefficient when managing a large number of VMs. VMs with the same configurations could use pre-edited profiles to simplify the process. The system would configure the VMs according to the chosen profile.

Most configuration items use the same settings. Some of them—such as UUID, VM name and IP address—are automatically assigned with calculated values. Normally, users don’t care which host is running their VM.
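
A sketch of profile-based configuration: the shared settings come from a pre-edited profile, while the per-VM items (UUID, name, IP address) are computed automatically. The profile contents and the address scheme are assumptions:

  import uuid
  from itertools import count

  PROFILES = {
      "small": {"cpus": 1, "mem_mb": 1024, "disk_gb": 10},
      "large": {"cpus": 4, "mem_mb": 8192, "disk_gb": 100},
  }
  _seq = count(1)

  def make_config(profile: str) -> dict:
      n = next(_seq)
      return {
          **PROFILES[profile],           # shared settings from the profile
          "uuid": str(uuid.uuid4()),     # auto-assigned
          "name": f"vm-{n:04d}",         # auto-assigned
          "ip": f"10.0.0.{n}",           # auto-assigned (toy scheme)
      }

  print(make_config("small"))
  print(make_config("small"))   # same profile, distinct UUID/name/IP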

When considering your strategy for choosing your VM destination host, keep in mind your general deployment principle, which is to fulfill your needs for VM capacity, but also to balance workloads across the host network. That way, you’ll arrive at an efficient balance between your available resources and your workload.
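
One simple policy that implements this principle: among the hosts with enough free capacity for the VM, pick the one with the most headroom. The capacity model below is an illustrative assumption:

  def choose_host(hosts: dict[str, dict], need_cpus: int,
                  need_mem: int) -> str | None:
      """Return a host that fits the VM, preferring the least loaded."""
      fits = {name: h for name, h in hosts.items()
              if h["free_cpus"] >= need_cpus and h["free_mem"] >= need_mem}
      if not fits:
          return None   # no host can take the VM right now
      return max(fits, key=lambda n: fits[n]["free_mem"])

  hosts = {
      "n1": {"free_cpus": 2, "free_mem": 2048},
      "n2": {"free_cpus": 8, "free_mem": 16384},
  }
  print(choose_host(hosts, need_cpus=2, need_mem=4096))   # n2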

Kai Hwang

Kai Hwang is a professor of computer engineering at the University of Southern California and a visiting chair professor for Tsinghua University, China. He earned a Ph.D. in EECS from the University of California, Berkeley. He has published extensively in computer architecture, digital arithmetic, parallel processing, distributed systems, Internet security and cloud computing.

Jack Dongarra

Jack Dongarra is a university distinguished professor of electrical engineering and computer science at the University of Tennessee, a distinguished research staff member at Oak Ridge National Laboratory and a Turing Fellow at the University of Manchester. Dongarra pioneered the areas of supercomputer benchmarks, numerical analysis, linear algebra solvers and high-performance computing, and has published extensively in these areas.

Geoffrey Fox

Geoffrey Fox is a distinguished professor of informatics, computing and physics and associate dean of Graduate Studies and Research at the School of Informatics and Computing at Indiana University. He received his Ph.D. from Cambridge University, U.K. Fox is well-known for his comprehensive work and extensive publications in parallel architecture, distributed programming, grid computing, Web services and Internet applications.

©2011 Elsevier Inc. All rights reserved. Printed with permission from Syngress, an imprint of Elsevier. Copyright 2011. “Distributed and Cloud Computing: From Parallel Processing to the Internet of Things” by Kai Hwang, Jack Dongarra, Geoffrey Fox. For more information on this title and other similar books, please visit elsevierdirect.com.