The Desktop Files: Deploying Windows in a Virtual World

Wes Miller

Virtual computing is everywhere. If you aren't taking advantage of it, you should be. Virtualization reduces hardware dependence by essentially creating a hardware abstraction layer of its own and letting you move one or more guest systems, such as Windows Server or Windows client operating systems, among host systems.

Virtualization is, of course, different from emulation in that it doesn't imitate the processor of the guest. It simply represents the resources of the host system in a way that guest systems can access them. As a result, host systems are generic to guests. You can generally move a virtual guest from a system built by one OEM to one built by another OEM—the hardware of the host usually doesn't matter. But there are caveats. For example, if you move a guest from hardware with a processor from one CPU vendor, such as AMD, to another, such as Intel, you could have problems (depending on the virtualization technology you're using). That's because virtual computing technology only passes information from the host to the guest and back; it doesn't emulate a specific CPU for the guest (as, for example, Microsoft® Virtual PC running on a legacy PowerPC-based Macintosh does).

However, virtualization does emulate key hardware components to the guest. Most often this is limited to networking, video (generally a very constrained device without an advanced emulated GPU), and mass storage. These components all function by presenting one or more types of software-emulated devices to the guest. Now, if you've been reading my column for a while, you'll notice that this is the same list of devices that Windows® PE cares about. In virtualization, those are the same types of devices you need to have in order for Windows to really do any work. Additionally, all virtualization technologies must emulate a BIOS. While they could also emulate the Extensible Firmware Interface (EFI), the limited selection of EFI-based operating systems today renders that of limited use. All of this emulation allows virtual guests to boot. The BIOS and each of the devices emulate an actual device in software and present that device to guests. This means they require the same drivers (not always Windows-provided drivers) that the actual device would require. This is an important concept to bear in mind.

While some virtualization technologies also allow for USB (or USB 2.0) devices to interface with them, I won't dive into the particulars of these technologies here. Aside from those USB devices that require either drivers (printers, USB wireless NICs, and so forth) or certain DirectX® support (not usually present in most virtualization technologies), there isn't much you should have to do to get them to work. Keep in mind that support for USB or other non-emulated devices is, of course, dependent on the virtualization technology you are using. Be sure to know the limitations (sharp edges, as I like to say) of the virtualization product you're using before trying to get new devices to work with it.

There are two primary vendors of virtualization technology today on Windows: Microsoft and VMware (vmware.com). There are additional up-and-coming vendors such as Parallels (parallels.com) as well.

Now that you have an idea of what virtualization is all about, I'm going to spend the rest of this column explaining how to set it up, how to avoid the most common pitfalls, and how to deploy it across a number of machines in your environment.

Virtual Deployment

Deployment of virtual systems doesn't have to be different from deployment of physical systems. But as you'll see, there are some good reasons to make it different.

In the early days of Windows NT®, you had to deploy via setup. You could script it, but you had to run through the whole process. Once setup completed, copying that image to multiple systems, while quite a handy concept, simply wasn't supported.

Eventually, though, Microsoft decided it made sense to support disk-duplicated or "cloned" Windows NT systems. So today, every method available when deploying physical systems is available for virtual system deployment as well. You can use: Winnt32 (or setup.exe in the case of Windows Vista® and Windows Server® 2008); Windows PE (1.x or 2.x, depending on the client you are deploying as explained in my earlier columns); Remote Installation Services (RIS) or Windows Deployment Services (WDS); or Sysprep (the System Preparation tool for deployment introduced with Windows NT 4.0) and your favorite disk duplication technology (ImageX, for example).

But of course you only have to do that the first time you deploy a specific OS. After that, you may want to just copy it. But there's a problem with disk-based duplication methods like those I just mentioned.

Using Sysprep

The original Microsoft decision not to support disk-based duplication was prompted primarily by the Windows NT Security Identifier (SID). Fortunately, Sysprep provides a solution. But first let's look at the problem it solves. As discussed at support.microsoft.com/kb/314828, the SID is a unique value, generated when Windows is installed, that identifies an individual Windows-based computer. This ID is then used as the root portion of the identifier for all local accounts. Each local account also has its own Relative Identifier (RID), an account number appended to the end of the machine SID. The combination of the two becomes the identifier for the local account.

Let's see why this is a problem by using the Administrator SID S-1-5-21-191058668-193157475-1542849698-500. Here, the leading S marks the string as a SID (the S is omnipresent in the textual representation of a SID), and 1 and 5 represent the SID revision number and the identifier authority value (here the Windows NT authority), respectively. The rest is the actual SID: the values that identify this machine, followed by 500, the RID that identifies this as a well-known account, the Windows Administrator account. The Administrator account created by default (and unable to be deleted) on all Windows installations has a SID that ends in 500. Local user accounts added to Windows after installation receive RIDs numbered from 1000 upward.
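To make the anatomy of that string concrete, here is a minimal Python sketch that splits a textual SID into its parts. The names `ParsedSid`, `parse_sid`, and `machine_sid` are my own, made up for illustration; this is not how Windows or PSGetSID handles SIDs internally, just a way to see the pieces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedSid:
    """Hypothetical container for the pieces of a textual SID."""
    revision: int
    authority: int
    subauthorities: tuple  # values between the authority and the final RID
    rid: int               # last value, e.g. 500 for the built-in Administrator


def parse_sid(text: str) -> ParsedSid:
    """Split a SID string such as S-1-5-21-...-500 into its components."""
    parts = text.split("-")
    if len(parts) < 4 or parts[0] != "S":
        raise ValueError(f"not a SID string: {text!r}")
    revision, authority, *rest = (int(p) for p in parts[1:])
    return ParsedSid(revision, authority, tuple(rest[:-1]), rest[-1])


def machine_sid(text: str) -> str:
    """Return the account SID with its trailing RID removed."""
    return text.rsplit("-", 1)[0]


admin = parse_sid("S-1-5-21-191058668-193157475-1542849698-500")
print(admin.revision, admin.authority, admin.rid)  # prints: 1 5 500
```

Stripping the RID with `machine_sid` leaves the portion that every local account on the machine shares, which is exactly the part that duplication copies.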

PSGetSID, available from Windows Sysinternals (mentioned in my column on the PSTools at technetmagazine.com/issues/2007/03/DesktopFiles), allows you to enumerate a SID for a given user on a system or the system's SID. See Figure 1 for the output of PSGetSID for my virtual system's SID and the SID for my user account, 1003.

Figure 1 Output of PSGetSID for a virtual system's SID and the SID of user account 1003


Since local account RIDs are based on this SID, the problem that occurs when you disk-duplicate a system or just copy a virtual machine image should be relatively apparent. By not changing the SID (Sysprep's primary but not sole task), you end up with a copy of the key component that makes a Windows system unique. If both System A and System B had the same Administrator SID, users on each of the systems would legitimately identify themselves as the same user. The same would be true of all local accounts from System B when authenticating to System A, and vice-versa. Worse, these systems will have the same SID when presented to Active Directory®. So if you allow System A to authenticate to a domain resource, but you don't allow System B, you will wind up with a collision. If you set B to deny, then A will actually be denied as well.
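The collision is easy to see in a small Python sketch. Everything here is illustrative: the helper names are my own, and the "regenerated" SID is a made-up value standing in for whatever Sysprep would actually produce.

```python
ADMIN_RID = 500  # RID of the built-in Administrator account


def account_sid(machine_sid: str, rid: int) -> str:
    """Build a local account SID by appending the RID to the machine SID."""
    return f"{machine_sid}-{rid}"


def accounts_collide(machine_x: str, machine_y: str, rid: int = ADMIN_RID) -> bool:
    """True when the same RID yields the same account SID on both machines."""
    return account_sid(machine_x, rid) == account_sid(machine_y, rid)


system_a = "S-1-5-21-191058668-193157475-1542849698"
system_b_copied = system_a  # image copied without running Sysprep
# Hypothetical value standing in for a SID regenerated by Sysprep.
system_b_prepared = "S-1-5-21-1000000001-1000000002-1000000003"

print(accounts_collide(system_a, system_b_copied))    # prints: True
print(accounts_collide(system_a, system_b_prepared))  # prints: False
```

The same comparison holds for every RID, not just 500, which is why every local account on a copied system is indistinguishable from its twin on the original.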

Thus, it is critical that you regenerate SIDs on systems using Sysprep, especially in virtual system scenarios, because system images can propagate all too easily. You shouldn't use a third-party SID-changing tool either; use only Sysprep. Sysprep is designed, tested, and supported by Microsoft for preparing systems for duplication (even virtual systems). See Figure 2 for an example of what Sysprep looks like before changing the SID on a system. Make sure the "Don't regenerate security identifiers" option is always unchecked if duplication is the system's next step.

Figure 2 "Don't regenerate security identifiers" should be unchecked when preparing for duplication


In addition to updating the SID to a new ID, Sysprep also modifies any private data stores that it is aware of to reflect the SID and machine name or to change their encryption to work with the new SID. Some examples are the Scheduled Tasks data store and values in the IIS Metabase (if IIS is installed).

Sysprep also forcibly deletes every NIC on the system and takes with it the network configuration data for the NIC. Because the network configuration "hangs off of" the NIC in the registry, and the relationship of that NIC is based on the Hardware ID of the NIC (which outside of a virtual-to-virtual system move is frequently likely to break), Sysprep cleans up this traditionally abandoned data.

Sysprep also cleans up all Active Directory membership information from a system. As a result, it must forcibly remove a system from the domain as a part of its work. This ensures that the systems that have just received new SIDs can be joined to the domain securely. Some SID-changing utilities let you change a machine's SID without removing it from the domain, but this is neither reliable nor secure. If you absolutely must run Sysprep on a machine that is a domain member, either remove it from the domain prior to running Sysprep or run Sysprep and let it handle that task for you.

On a related note, if you're virtualizing any of your domain controllers (DCs), duplicate only standalone servers that have not been promoted to DCs and are not joined to the domain. With the exception of Windows Server 2003 Small Business Server Edition, you cannot safely disk-duplicate a DC. To create new DCs safely, you should create a disk image of a server that is ready to be joined to the domain and promoted to a DC. Sysprep (except in the very specialized SBS instance, which is single forest/single server) is not aware of how to safely change SIDs on a DC.

Finally, in addition to changing the SID and removing the machine from the domain, Sysprep also changes the name of the machine.

It may seem heavy-handed to say that you need to perform all of the above tasks when imaging (or even just copying) virtual systems. But it is critical—especially if you are using these systems on a network with other physical or virtual systems, in a domain, or with any other copy of themselves on the network.

If you don't use Sysprep when duplicating virtual systems, you will almost certainly run into a number of obvious issues (Active Directory or other networking collisions) and quite a few you may not expect. For example, your virtual images become far more susceptible to attack, since compromising one can grant access to the others.

Drivers and Hardware Abstraction Layers

I mentioned that the virtual devices included in a virtual image may not have "in-box" drivers for Windows. Ensure that you have drivers handy for your devices when deploying (or when deploying disk images using Sysprep), for the dreaded 0x0000007B driver error can come up just as easily when moving a virtual image from one storage-bus driver to another as it can when working with physical hardware. The same is true of NICs. While most virtualization products have sought to provide a virtual device that is rather universal, you may still need an additional driver for it.

You can't ignore that pesky hardware abstraction layer (HAL) either. Ideally, you want to create your virtual machines with an Advanced Configuration and Power Interface (ACPI) multiprocessor HAL (see intel.com/technology/iapc/acpi), if that is what your virtualization technology supports. Converting among HALs isn't generally supported (see support.microsoft.com/kb/309283 for more information on specifics). However, some virtualization technologies (and, more importantly, many migration technologies) promise that they can safely move a Windows installation that is non-ACPI to an ACPI installation, or vice versa. This is not true, and, to boot, the resulting Windows installation is not supported by Microsoft when and if you do run into problems. The same limitations discussed on the support Web page I just mentioned hold true for virtualized systems. See Figure 3 for an example of what the HAL looks like in Device Manager on one of my virtual machines, in this case running the ACPI uniprocessor HAL. The uniprocessor HAL, not to be confused with the single-processor HAL, is interchangeable with its multiprocessor kin.

Figure 3 HAL in Device Manager on a virtual machine


Miscellaneous Changes

Change What You Can and Accept What You Can't

When considering the migration of physical to virtual or vice versa, you need to remember the things that can and cannot change. You can change the following aspects of a Windows installation:

  • The HAL (but only from uniprocessor to multiprocessor or vice versa, as long as they are based on the same power configuration).
  • Mass-storage controllers (this is not easy—but most physical-to-virtual migration solutions attempt to do this already on their own). Note that most vendors provide an IDE and a SCSI storage solution. Choose wisely when deploying, as moving from one to the other isn't terribly easy. Generally, choosing SCSI results in a more reliable device (this is the case with most vendor SCSI device emulation implementations).
  • Network controller (though in a virtual-to-virtual migration scenario, this will generally be the same within one vendor's technologies).

You cannot change the following aspects of a Windows installation:

  • The HAL (except in the case mentioned earlier, when the same power configuration is in use). You should not assume that a migration solution that does this will result in a Windows installation that is stable or reliable (and more importantly, it will not be supported by Microsoft).
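The HAL rule in these two lists boils down to a single comparison, sketched below in Python. This is a deliberately simplified model of my own: the `Hal` class and its fields are hypothetical, and it ignores finer distinctions (such as the single-processor HAL) that the support article covers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hal:
    """Simplified, hypothetical description of a HAL type."""
    name: str
    acpi: bool            # power configuration: ACPI or non-ACPI
    multiprocessor: bool  # uniprocessor vs. multiprocessor


def hal_change_allowed(source: Hal, target: Hal) -> bool:
    """Allow uniprocessor <-> multiprocessor only within the same power configuration."""
    return source.acpi == target.acpi


ACPI_UNI = Hal("ACPI uniprocessor", acpi=True, multiprocessor=False)
ACPI_MULTI = Hal("ACPI multiprocessor", acpi=True, multiprocessor=True)
NON_ACPI_UNI = Hal("non-ACPI uniprocessor", acpi=False, multiprocessor=False)

print(hal_change_allowed(ACPI_UNI, ACPI_MULTI))      # prints: True
print(hal_change_allowed(NON_ACPI_UNI, ACPI_MULTI))  # prints: False
```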

In addition to changing the SID and the machine name, you also need to change certain values that may be specific to the virtual computing technology you're using. In particular, you need to change the MAC address (the unique ID for networking devices). Plus, many virtualization applications also have their own unique identifier. Most store these in their own machine configuration files, so you'll want to know how to manipulate those entries (and maintain their validity). Note that many virtualization products that support Pre-Boot Execution Environment (PXE) derive the SMBIOS UUID from their own unique ID. That makes it all the more important to change this ID (or let the virtualization software change it for you, if supported) if you're joining the system to a domain; otherwise, managing WDS or RIS client systems can become impossible if GUIDs conflict. Most of the virtualization solutions I've worked with can have severe networking problems in the case of duplicate MAC addresses, so if you are not just moving a virtual machine, it's very important that you change the MAC address if the virtualization software does not do it for you.
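If you track your virtual machines in an inventory of some kind, a duplicate-MAC check is simple to script. The Python sketch below is a rough illustration under my own assumptions: the inventory is a plain dictionary of VM name to MAC address, and the generator produces a random locally administered unicast address. Many virtualization products expect addresses from their own reserved ranges, so check what yours accepts before writing a generated value into a configuration file.

```python
import random


def random_local_mac(rng=None) -> str:
    """Generate a random locally administered, unicast MAC address."""
    if rng is None:
        rng = random.Random()
    # Set the locally-administered bit (0x02) and clear the multicast bit (0x01).
    first = (rng.randrange(256) | 0x02) & 0xFE
    rest = [rng.randrange(256) for _ in range(5)]
    return ":".join(f"{octet:02x}" for octet in [first, *rest])


def duplicate_macs(inventory: dict) -> dict:
    """Map each MAC used by more than one VM to the sorted names of those VMs."""
    by_mac = {}
    for vm_name, mac in inventory.items():
        normalized = mac.lower().replace("-", ":")
        by_mac.setdefault(normalized, []).append(vm_name)
    return {mac: sorted(vms) for mac, vms in by_mac.items() if len(vms) > 1}


# Hypothetical inventory: vm-b was copied from vm-a without changing its MAC.
inventory = {
    "vm-a": "00-11-22-33-44-55",
    "vm-b": "00:11:22:33:44:55",
    "vm-c": "00:11:22:33:44:66",
}
print(duplicate_macs(inventory))  # prints: {'00:11:22:33:44:55': ['vm-a', 'vm-b']}
```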

Other components you need to bear in mind when preparing a virtual system for deployment are any linked disks or snapshots. Depending on your virtualization solution, these may also be referred to as differential disks or may have another name. But if you run Sysprep to prepare a system and you have snapshots (or other revertible state) that go with the virtual system, these must be destroyed for the image to remain safe, reliable, and secure when it is duplicated. In the case of a snapshot or other "undo disk changes" technology approach, reverting a snapshot could return a system to a state in which it conflicts with another system sourced from the same original virtual machine (or even with the original source system, if it is brought back up). So any snapshots or differential disk relationships should already have had Sysprep executed on them.

Optimization

Most virtualization technologies include virtual machine additions or tools that will help improve the performance and experience of interacting with the guest from the host. This usually includes work optimizing the mouse and keyboard input, among other performance enhancements, and often includes improved copy and paste (or other host-to-guest) interaction. Install the most recent version of these tools in your virtual systems before deployment.

You also need to ensure that the client memory is configured optimally for the guest OS, but also in context with the hosts it will be deployed to. The last thing you want to do is deploy a Windows XP image set to use 1GB of RAM to host systems that don't have that much RAM to begin with.
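A quick sanity check before deployment can catch that mismatch. The following Python sketch is only an illustration: the function name, the host inventory, and the 512MB I assume the host needs for itself are all made up, so adjust them to your own environment.

```python
def hosts_too_small(guest_mb: int, host_ram_mb: dict, host_reserve_mb: int = 512) -> list:
    """Return the hosts that cannot fit the guest after reserving RAM for the host itself.

    host_reserve_mb is an assumed figure, not a vendor recommendation.
    """
    return sorted(
        name
        for name, ram_mb in host_ram_mb.items()
        if ram_mb - host_reserve_mb < guest_mb
    )


# Hypothetical hosts and their physical RAM in MB.
hosts = {"host-a": 2048, "host-b": 1024, "host-c": 1536}

# A Windows XP image configured for 1GB of RAM.
print(hosts_too_small(1024, hosts))  # prints: ['host-b']
```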

Also remember the limitations that most virtual technologies have today. It's important to explain to your users how to interact with any peripherals attached to the virtual system, as well as what applications will and will not function on the guest OS (most do not support DirectX 9 or 10, for example, or support older versions in a limited manner). Users may not know what that means or how it manifests itself (some applications don't handle such failure well).

Host Concerns

Note that in general there is not much you need to concern yourself with regarding the host PC running the virtualization technology, regardless of whether what is running underneath is a full operating system or a Type 1 hypervisor that runs directly on top of the hardware. Most virtualization technologies are designed to ensure that the guest OS does not need to know anything (or needs to know little) about the host. Ensure you know what hosts are in use, however, in case a guest that's moved from one host to another has problems. Also ensure you know the limitations of your virtualization vendor's product on certain platforms. While you may be able to move from one to the other, you may lose or gain certain features of the host OS's Type 2 hypervisor (the virtualization application) in the process.

Other Deployment Mechanisms

Using Sysprep and disk duplication (or simply running Sysprep and copying the virtual machine) is the obvious choice for deploying virtual systems, but there are others. In fact, whether you are using imaging or setup, using Windows PE can be easier with virtualization than with physical machines since you're working with ISO files instead of a physical CD. Note that some virtualization technologies can't deal well with DVD media, so be sure to check with your virtualization vendor about support. You can use winnt32 or setup.exe (in the case of Windows Vista or Windows Server 2008), but there aren't any specific benefits. If you use other Microsoft deployment technologies such as Automated Deployment Services, make sure your virtualization technology supports PXE so that you can kick off a network-based deployment, just as if you were utilizing RIS or WDS.

Migration

Finally, aside from actually migrating an entire PC, don't forget about the User State Migration Tool (USMT). USMT allows you to move a user's settings from a physical client to a new virtual system easily. So if your users want to migrate their old data and settings to a new virtual machine, you could, for example, easily get the files and settings from Windows XP, store them to a UNC path, and push them onto a new virtual machine.

Wes Miller lives in Austin, Texas. Previously, he worked at Winternals Software in Austin and at Microsoft as a Program Manager and Product Manager for Windows. Wes can be reached at technet@getwired.com.

© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.