Chapter 9 - The Art of Performance Monitoring

Archived content. No warranty is made as to technical accuracy. Content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

Detecting the source of a performance problem isn't always a straightforward task. Sometimes it requires that you try different tools, running each in several ways, examining computer performance, and repeating the tests in a rigorous, scientific manner.

Problems can appear intermittently or be camouflaged by some greater or lesser matter. The following graph is an example of what you might see.


This is a Performance Monitor graph of processor and disk use over a 61-second interval. The white line represents disk activity; the black line represents processor activity. If you viewed just the first half of the interval, you would conclude that you have a disk bottleneck; the second half might lead you to believe you have a processor bottleneck. When the data is logged over time, you find that the processor is actually the problem—but you'd never know it from a one-minute glance.

This part of the Windows NT 4.0 Workstation Resource Guide is designed to help you tune and optimize Windows NT 4.0 Workstation. The remainder of the chapter includes some history and important information about Windows NT Workstation that affect how you monitor it.

The Resource Guide and Other Resources 

The following materials might be of interest as well:

  • Chapter 5 of this book, "Windows NT 4.0 Workstation Architecture," explains the organization of the Windows NT Workstation operating system. It's good background for understanding scheduling of processes, thread priorities, and changes in the architecture of the graphics subsystem that you'll be monitoring. 

  • Chapter 17 of this book, "Disk and File System Basics," provides the hardware background for the chapter on detecting disk bottlenecks. 

  • The Win32 Software Development Kit describes how to monitor and optimize your own applications and how to create extensible counters for Performance Monitor. 

  • The Windows NT Server Networking Guide describes the architecture of the Windows NT 4.0 Server and network and includes some information about network bottlenecks and capacity planning. 

  • The Windows NT Server Concepts and Planning Guide includes a clear and concise overview of performance monitoring with lots of useful tips and examples. 

Optimizing Windows NT 4.0 Workstation

An original design goal of Windows NT was to eliminate the many parameters that characterized earlier systems. Adaptive algorithms were incorporated in Windows NT so that correct values are determined by the system as it runs. The 32-bit address space removed many limitations on memory and the need for users to manually adjust parameters to partition memory.

Windows NT has fundamentally changed how computers will be managed in the future. The task of optimizing Windows NT is not the art of manually adjusting many conflicting parameters. Optimizing Windows NT is a process of determining what hardware resource is experiencing the greatest demand and then adjusting the operation to relieve that demand.

Windows NT did not achieve the goal of automatic tuning in every case. A few parameters remain, mainly because it is not possible to know precisely how every computer is used. Default values for all parameters are set for a broad range of normal system use, and they rarely need to be altered. But special circumstances sometimes call for changes. In this book we will be sure to mention the few tuning parameters that remain in Windows NT and indicate when it is appropriate to change them from their default values.

Defining Bottlenecks

A bottleneck is a condition in which the limitations in one component prevent the whole system from operating faster. The device with the lowest maximum throughput is the most likely to become a bottleneck if it is in demand. Making any other device faster can never yield more throughput; it can only result in lower utilization of the faster device.

Even if all other components are infinitely fast, a bottleneck holds the system at a stall until it is cleared.

Although a foolproof bottleneck alarm and a direct bottleneck counter aren't available, you can combine several different indicators to look for bottlenecks. The primary indicator is an extended high rate of use on one hardware resource and resulting low rates of use on related components. It is accompanied by sustained queues for one or more services, and slow response time.
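As a rough illustration of combining indicators, the following Python sketch flags a resource only when both its utilization and its queue length stay high for several consecutive samples. The function names, thresholds, and sample values are hypothetical and are not part of Performance Monitor; treat it as a sketch of the reasoning, not as a bottleneck alarm.

```python
def sustained(values, threshold, min_run):
    """Return True if values exceed threshold for min_run consecutive samples."""
    run = 0
    for value in values:
        run = run + 1 if value > threshold else 0
        if run >= min_run:
            return True
    return False


def bottleneck_suspect(utilization, queue_length,
                       util_threshold=90.0, queue_threshold=2.0, min_run=5):
    """Flag a resource whose utilization AND queue both stay high.

    All thresholds are illustrative assumptions, not recommended values.
    """
    return (sustained(utilization, util_threshold, min_run)
            and sustained(queue_length, queue_threshold, min_run))


# Hypothetical samples taken at a fixed interval.
busy_with_queue = bottleneck_suspect([95, 97, 99, 98, 96, 97], [3, 4, 5, 4, 3, 4])
busy_no_queue = bottleneck_suspect([95, 97, 99, 98, 96, 97], [0, 1, 0, 0, 1, 0])
brief_spike = bottleneck_suspect([20, 99, 98, 25, 30, 22], [0, 4, 5, 0, 0, 0])
print(busy_with_queue, busy_no_queue, brief_spike)
```

Only the first case is flagged: a busy resource with no queue, or a brief spike, does not meet both conditions.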

Bottlenecks, Utilization, and Queues

The best bottleneck alarm is system response time, as perceived by the user. Users' perceptions are affected by their expectations and the kind of work they do. An accurate bottleneck alarm would be designed to reflect these same expectations and requirements. You needn't demand the same throughput on a system supporting word processing as you do on one madly calculating routes to Jupiter. Even if your processors, disks, and memory are running at near capacity, if they are not developing the queues that degrade their response time, you don't have a problem (although you might want to plan more capacity for the future).

Although 100% utilization of a resource is a clear warning, it is neither a necessary nor sufficient condition for a bottleneck. You can have bottlenecks on devices with utilization well below 100% and you can, at least in theory, have a device perking along at nearly 100% utilization with no signs that it is a bottleneck. That is, the device is not preventing any other resource from getting its work done, nothing is waiting for it, and even if it were infinitely fast, things wouldn't happen any sooner.

A bottleneck is determined by the number of requests for service, the arrival pattern of the requests, and the amount of time requested. If these factors are perfectly synchronized, no queues develop. But if they are random or unpredictable, queues develop at much lower utilization rates.

For example, suppose a process had ten threads, each of which used exactly 0.999 seconds of processor time once every ten seconds. If each request arrived exactly one second after the previous one in perfect sequence, the processor would be 99.9% busy, but there would be no queue, no interference between the threads and, technically, no bottleneck.
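The arithmetic can be checked with a small simulation. This Python sketch models one processor serving requests first-come, first-served. The function name, the 100-second window, and the half-second delay in the second run are assumptions made for illustration.

```python
def simulate_fifo(arrivals, service_time, window):
    """Serve requests one at a time, in arrival order, on a single processor.

    Returns (requested time / window, longest wait, number of delayed requests).
    """
    clock = 0.0          # time at which the processor next becomes free
    busy = 0.0           # total processor time requested
    longest_wait = 0.0
    delayed = 0
    for arrival in sorted(arrivals):
        start = max(clock, arrival)
        wait = start - arrival
        if wait > 0:
            delayed += 1
            longest_wait = max(longest_wait, wait)
        clock = start + service_time
        busy += service_time
    return busy / window, longest_wait, delayed


# Ten threads, each requesting 0.999 s once every 10 s, staggered by 1 s:
# one request arrives at every whole second.
perfect = [float(t) for t in range(100)]
util, longest_wait, delayed = simulate_fifo(perfect, 0.999, window=100.0)
print(f"perfect:   {util:.1%} busy, {delayed} requests delayed")

# Same load, but the request due at t = 5 arrives half a second late.
disrupted = [5.5 if t == 5.0 else t for t in perfect]
util2, longest_wait2, delayed2 = simulate_fifo(disrupted, 0.999, window=100.0)
print(f"disrupted: {delayed2} requests delayed, longest wait {longest_wait2:.3f} s")
```

In the perfectly spaced run nothing waits. In the second run, a single late request delays every request that follows it in the window, because the processor has only 0.001 seconds of slack per request with which to catch up.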

Admittedly, this is a highly idealized situation, but it's easy to see how any disruption in the pattern would quickly create a large queue. According to queuing theory, if the arrival pattern of requests and the duration of requested services are random or unpredictable, a device that is 66% utilized will produce a queue of two items. Even worse, if, instead of being random, requests for service are either very short or very long, queues can form at even lower utilization. That is, fewer requests for service produce even longer queues.
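The 66% figure matches the simplest textbook queuing model: a single server with random (Poisson) arrivals and exponentially distributed service times. The text does not name the model, so that choice is an assumption here. In that model, the average number of requests at the device (waiting plus in service) is ρ/(1 − ρ) for utilization ρ. The Pollaczek-Khinchine formula generalizes this to other service-time distributions and shows the effect of mixing very short and very long requests. A minimal Python sketch, with illustrative function names:

```python
def mean_requests_mm1(utilization):
    """Average number of requests at the device (waiting + in service), M/M/1."""
    return utilization / (1.0 - utilization)


def mean_requests_mg1(utilization, service_cv_squared):
    """Pollaczek-Khinchine mean for a single server with Poisson arrivals.

    service_cv_squared is the squared coefficient of variation of the service
    time: 0 for constant service, 1 for exponential, and larger values for a
    mix of very short and very long requests.
    """
    rho = utilization
    return rho + (rho * rho * (1.0 + service_cv_squared)) / (2.0 * (1.0 - rho))


for rho in (0.50, 0.66, 0.90):
    print(f"{rho:.0%} utilized: about {mean_requests_mm1(rho):.2f} requests")

# Same 66% utilization, increasingly uneven request lengths.
for cv2 in (0.0, 1.0, 4.0):
    print(f"variability {cv2}: about {mean_requests_mg1(0.66, cv2):.2f} requests")
```

At 66% utilization the first function returns roughly two requests. The second shows that, at the same utilization, more uneven request lengths produce a longer average queue.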

Monitoring Basics

These chapters introduce several tools to help you monitor hardware and software performance. Many tasks require switching between or combining tools. But no matter which tools you choose, some basic concepts are common to all of them. This section describes those commonalities and explains how to monitor:

  • System objects

  • System processes

  • 16-bit Windows applications

  • MS-DOS applications

These topics are an introduction to the larger topics of using Performance Monitor, Task Manager, and the other Windows NT Resource Kit 4.0 CD tools to optimize Windows NT.

System Objects

Windows NT sees the active components running on the system as objects with characteristic properties. Some, such as processes and threads, are familiar; others, such as mutexes and semaphores, are less well known. For more information on Windows NT 4.0 objects, see "Microkernel Objects" in Chapter 5, "Windows NT 4.0 Workstation Architecture."

System object counts are important because each object takes up space in the operating system's nonpaged memory. Some just perform quick housekeeping and bookkeeping functions at background priority and rarely become a bottleneck. However, too many threads and processes can degrade performance on all functions, resulting in a bottleneck in processor or memory use.

Several performance monitoring tools let you keep track of the number of objects in your system:

  • In Process Explode, the Objects box at the top of the first column displays counts for all system objects including events, sections, mutexes, and semaphores, as well as processes and threads.

  • In Task Manager, select the Performance tab. The total numbers of active handles, threads, and processes for the system appear in the Totals box.

  • In Performance Monitor, select the Object performance object, then select the counter for the type of object you want to track. These include counters for operating system objects.

System Processes

Processes, which include both user applications and Windows NT services, can become bottlenecks. While investigating processor, disk, or memory use, chart use by process, and then start and stop the processes to see how your system responds.

Performance Monitor and Task Manager both show counts of running processes, including user programs and Windows NT services:

  • Task Manager is useful for short term monitoring. It lets you stop and restart applications and system services.

  • Performance Monitor includes more detailed counters and lets you log process data over time in chart or report format, set alerts on each process's use of resources, and monitor processes on remote computers.

Many of the tools on the Windows NT Resource Kit 4.0 CD also monitor processes in detail, including Process Viewer (PViewer.exe) and Process Monitor (PMon.exe). For more information, see Chapter 11, "Performance Monitoring Tools," and Rktools.hlp.

Note The Services Control Panel also displays Windows NT services and lets you start and stop them. The Services Control Panel shows all Windows NT services, regardless of the process in which they run. However, it lists services by service name whereas Performance Monitor and Task Manager display the names of executable files.

For a list of the default services and a description of each, see Windows NT Help in the Services Control Panel. Click Start, click Help, and type Default Services.

Task Manager

In Task Manager, select the Processes tab. It displays a table of active processes. From the View menu, click Select Columns to add columns for processor time, memory use, process priority, handle and thread counts, and the process ID.


Performance Monitor

In Performance Monitor, select the Process object from the Add To dialog box. All active applications and services appear in the Instances box.


The following table lists processes commonly running on Windows NT 4.0 Servers and Workstations without a network connection. It shows them as they appear in Performance Monitor and in Task Manager.

Note Process Explode (Pview.exe), Process Viewer (Pviewer.exe), and Process Monitor (Pmon.exe) all display important counts of system processes. Although the information from these tools is instantaneous and cannot be logged or collected, the tools require almost no setup, so they are very valuable for a quick look.

Process name: Description

_Total: The sum of active processes, including Idle. (Performance Monitor only.)

csrss: Client Server Runtime Subsystem, which provides text window support, shutdown, and hard-error handling to the Windows NT environment subsystems. Note: The Client Server Runtime Subsystem changed substantially with Windows NT 4.0. For more information, see "What's Changed for Windows NT 4.0" in Chapter 5, "Windows NT 4.0 Workstation Architecture."

Explorer: Windows NT Explorer, a segment of the user interface that lets users open documents and applications from a hierarchical display.

Idle (System Idle Process): A process that runs to occupy the processors when they are not executing other threads. Idle has one thread per processor. For more information, see "The Idle Process" in Chapter 13, "Detecting Processor Bottlenecks."

Llssrv: License Logging Service, the service that logs the licensing data for License Manager in Windows NT Server and the Licensing option in Control Panel on both Windows NT Server and Workstation.

Lsass: Local Security Administration Subsystem, the process running the Local Security Authority component of the Windows NT Security Subsystem. This process handles aspects of security administration on the local computer, including access and permissions. The Net Logon service shares this process.

Nddeagnt: Network DDE Agent, which handles requests for network DDE services.

Ntvdm: NT Virtual DOS Machine, which simulates a 16-bit environment for MS-DOS and 16-bit Windows applications.

Perfmon: Performance Monitor executable.

RpcSs: Remote Procedure Call (RPC) subsystem, which includes the RPC service and RPC locator.

Services: This process is shared by the Windows NT Services Control Manager, which starts all services, and a group of Windows NT 32-bit services, including Alerter, Clipbook Server, Computer Browser, Event Viewer, Messenger, Server and Workstation, and Plug and Play.

Smss: Session Manager Subsystem.

Spoolss: Spooler Subsystem, which controls despooling of printer data from disk to printer.

System: Contains system threads that handle lazy writing by the file system cache, virtual memory modified page writing, working set trimming, and similar system functions.

Taskmgr: Task Manager executable.

winlogon: Logon process executable. It manages logon and logoff of users and remote Performance Monitor data requests.

No matter what tool you choose, the processes that appear depend upon whether the computer is a server or workstation, and upon the services installed on the computer, including network services. User applications, including the executables for Performance Monitor and Task Manager, appear only when they are running.

Also, a process instance might not be visible for every active service. Performance Monitor and Task Manager display an instance for each executable process running on the system. Many services share a process to conserve system resources, so these appear together as one instance.

For example, many Windows NT 32-bit services, including Alerter, Clipbook Server, and Event Viewer, share the Services.exe process with the Windows NT Services Control Manager, a general process that starts all system services. Net Logon shares the Lsass.exe process with other security services.

It's difficult to monitor these services separately, although you can experiment in associating a service with threads in the process. The SC utility, in the Computer Configuration subdirectory on the Resource Kit CD, displays useful service configuration information, including the name of the process in which the service runs. For more information on SC, see Rktools.hlp.

Optimizing 16-bit Windows Applications

In Windows NT 4.0 Workstation and Server, by default, all active 16-bit Windows applications run as separate threads in a single multithreaded process called NT Virtual DOS Machine (NTVDM). The NTVDM process simulates a 16-bit Windows environment complete with all of the DLLs called by 16-bit Windows applications.

This configuration poses two challenges for running 16-bit applications:

  • It prevents 16-bit applications from running simultaneously, which might impede their performance.

  • It makes monitoring a bit trickier.

As a result, Windows NT 4.0 includes an option to run a 16-bit application in its own separate NTVDM process with its own address space.

You can monitor 16-bit Windows applications by identifying them by their Thread ID while they are running, or by running each application in a separate address space.

In addition to the 16-bit applications, each NTVDM process includes a heartbeat thread that interrupts every 55 milliseconds to simulate a processor timer tick, and the Wowexec.exe thread, which helps to create 16-bit tasks and to handle the delivery of the 16-bit interrupt. You will see the heartbeat and Wowexec threads when monitoring 16-bit applications.

Win16 Application Performance

The NTVDM process is multitasking: A thread in the process (in this case, a 16-bit Windows application) can run at the same time as threads of other processes if the computer has more than one processor. It is also preemptible: Threads can be interrupted and resumed to allow virtual multitasking on a single-processor computer.

However, only one 16-bit Windows application thread in an NTVDM can run at one time and, if an application thread is preempted, the NTVDM always resumes with the same thread. This limits the performance of multiple 16-bit applications running in the same NTVDM process, although this limitation becomes an issue only when the processor is very busy.

Monitoring Win16 Applications

Almost all performance monitoring tools can monitor 16-bit applications on Windows NT 4.0 Server and Workstation. However, because they run in the same process, the trick to monitoring more than one 16-bit application is to distinguish among the threads of the NTVDM process.

To monitor one 16-bit application, simply select the NTVDM process in Performance Monitor, Task Manager, Process Explode, Process Viewer, Process Monitor, or another tool. If you have multiple 16-bit processes running in NTVDM, you can distinguish them by their thread IDs in all tools except Process Monitor. You might have to start and stop the 16-bit process to determine which thread ID is associated with which 16-bit process.


This figure is a Performance Monitor report on a single NTVDM process (Process ID 105) with three threads. One of the threads is the heartbeat thread (Thread #0, Thread ID 118), one is the Wowexec thread (Thread #1, Thread ID 140), and one is a 16-bit application, Write.exe (Thread #2, Thread ID 46).

Performance Monitor identifies threads by the process name and a thread number. The thread numbers are ordinal numbers (beginning with 0) that represent the order in which the threads started. The thread number of a running thread changes when a thread with a lower number stops; all threads with higher numbers move up to close the gap. For example, if thread 1 stops, thread 2 becomes thread 1. Therefore, thread numbers are not reliable indicators of thread identity.

Performance Monitor can monitor the Process ID and Thread ID of a thread. The Process ID is the ID of the process in which the thread runs. Thread ID is the ID of the thread. Unlike thread number, it is assigned when the thread starts and remains with it until the thread stops.

The Process and Thread IDs are just ordinal numbers that are associated with the process or thread only for a single run. On subsequent runs, the process or thread is just as likely to be assigned a different ID. However, you can use the ID to track it during execution.
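The difference between thread numbers and thread IDs can be sketched in a few lines of Python. The thread IDs are the ones from the report described above. The list-and-index model only illustrates the numbering rule and is not how Performance Monitor is implemented; the stopping of the Wowexec thread is hypothetical.

```python
# Thread IDs in the order the threads started (values from the example report).
thread_ids = [118, 140, 46]   # heartbeat, Wowexec, Write.exe


def thread_numbers(ids):
    """Map each thread ID to its ordinal thread number (position in start order)."""
    return {thread_id: number for number, thread_id in enumerate(ids)}


before = thread_numbers(thread_ids)
print(before[46])   # thread number of Write.exe while all three threads run

# Hypothetically, the thread that was number 1 stops.
thread_ids.remove(140)
after = thread_numbers(thread_ids)
print(after[46])    # Write.exe has a new thread number, but its ID is still 46
```

Looking up a thread by ID gives a stable answer for the life of the thread, whereas its thread number depends on which other threads are still running.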


This figure shows Process Explode monitoring a 16-bit Windows application running in a single process (Ntvdm.exe). The three threads displayed in the Thread ID box (midway down the first column) represent the heartbeat thread, the Wowexec thread, and the thread of the 16-bit Windows application.

To see information about the thread in Process Explode, click on the Thread ID of the thread in the Thread ID box.


Task Manager makes it easy to identify 16-bit applications, because it displays the names of the executable files indented below the NTVDM process name. To monitor 16-bit processes in Task Manager, click the Processes tab, and from the Options menu click Show 16-bit Tasks.

In this example, you can see the Wowexec and Write threads. The heartbeat thread is not an executable and does not appear in Task Manager. However, the Thread Count column on the far right shows that all three threads are running in the NTVDM process.

Running Win16 Applications in a Separate Process

Windows NT 4.0 lets you run a 16-bit Windows application in a separate, unshared NTVDM process with its own memory space. This eliminates competition between NTVDM threads in a single process, making the 16-bit application thread fully multitasking and preemptible. It also simplifies monitoring.

To run a 16-bit application in its own address space, you can do any of the following:
  1. Click Start, then click Run. When you enter the name of the 16-bit process, the Run in a Separate Memory Space option is enabled. Click the option and click OK.


  2. From the command line, type 

    start /separate processname

    You can also type: 

    start /shared processname

    to run in the shared NTVDM process.

  3. Create a shortcut to the process: Click the right mouse button on the shortcut, then click Properties. Click the Shortcut tab, then click the Run in Separate Memory Space option.


    Tip Create two shortcuts to each of your 16-bit processes: One to run it in a separate memory space and one to run it in the shared memory space.

In Task Manager and Performance Monitor, two instances of the NTVDM process appear in the Process object Instances box. You can use their process IDs to distinguish between them.


This example shows Task Manager monitoring two copies of 16-bit Write, each in its own NTVDM process.

When a 16-bit process runs in its own memory space, Performance Monitor shows two instances of the NTVDM process. You need to use process IDs to distinguish between them. (You might have to stop and start the processes to make the distinction.)


Monitoring MS-DOS Applications

In Windows NT 4.0, each MS-DOS application runs in its own NTVDM process, eliminating some of the problems encountered in Win16 applications. Unfortunately, all of the NTVDM processes are called Ntvdm.exe by default, but you can change that.

To create a new process name for an NTVDM
  1. Copy Ntvdm.exe to a file with a different name.

  2. Edit the Registry by using a Registry editor. Regedt32.exe and Regedit.exe are installed when you install Windows NT.

  3. Double-click the cmdline value entry to change ntvdm.exe to the name of your copy of Ntvdm.exe. When you start an MS-DOS application, it will run in a process with that name.


Tip You don't have to restart the computer for the registry change to take effect. Thus, you can change the registry between starting different DOS applications and have each start in a uniquely named process. It is also prudent to set it back to Ntvdm.exe when you are finished.

Unfortunately, this doesn't work with 16-bit Windows applications, so you need to distinguish those by thread or by process ID.

The Cost of Performance Monitoring

Performance monitoring tools are quite sophisticated, but they are plagued by the problem common to all investigative tools: Using them changes their results. Performance tools are just applications and, as such, they occupy the processor, use memory and disk space, and tax the graphics subsystem of the Windows NT Executive. Make sure to measure the effects of these tools, and subtract them from your data.

Note Performance Monitor for Windows NT 4.0 has lower overhead than previous versions, due almost entirely to changes in the Windows NT 4.0 architecture. Most of Performance Monitor overhead is consumed by its graphic displays, which are now more efficient, not by data collection.

Response Probe, a monitoring tool included on the CD, has no apparent overhead. It monitors its own toll on the system and subtracts it before displaying its results.

  • To monitor Performance Monitor, include the Perfmon.exe process and its threads in your logs and charts and subtract them from your data.

  • To determine how much disk space Performance Monitor logging consumes, perform a series of manual updates and watch the change in the log file size on the status bar in Log view.

  • To measure the cost of monitoring particular objects, record the change in file size while adding and deleting those objects from a chart.

  • If the cost of monitoring is too high, lengthen the Performance Monitor Update Interval to at least 15 seconds. Change the Task Manager Update Speed to Low.

  • After you have collected baseline logging data, use Alerts to warn you of discrepancies. Alerts have the least overhead of any Performance Monitoring method. You can also log data over the network if you're studying disk performance.
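To turn these measurements into a planning number, you can estimate log growth from the bytes added per update and the update interval. The following Python sketch uses made-up figures (2,048 bytes per update is an assumption, not a measured value); substitute the change in file size you observe in Log view.

```python
def estimate_log_growth(bytes_per_update, interval_seconds, hours):
    """Estimate bytes written to a log over `hours` at a fixed update interval."""
    updates = int(hours * 3600 // interval_seconds)
    return updates * bytes_per_update


# Hypothetical: each update adds 2,048 bytes to the log file.
for interval in (1, 5, 15, 60):
    size = estimate_log_growth(2048, interval, hours=24)
    print(f"{interval:>2}-second interval: about {size / (1024 * 1024):.1f} MB per day")
```

Lengthening the update interval reduces the log size proportionally, which is one reason a 15-second or longer interval lowers the cost of monitoring.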