Understanding Troubleshooting

Microsoft Windows XP Professional provides a comprehensive set of troubleshooting tools for diagnosing and resolving hardware and software problems. Using these tools effectively requires an understanding of basic troubleshooting concepts and strategies.

For information on how to obtain the Windows XP Professional Resource Kit in its entirety, please see https://www.microsoft.com/mspress/books/6795.asp.

Related Information

  • For more information about enabling, disabling, and managing devices, see Chapter 9, “Managing Devices.”

  • For more information about troubleshooting tools use and syntax, see Appendix C, “Tools for Troubleshooting.”

  • For more information about troubleshooting Stop messages, see “Common Stop Messages for Troubleshooting” on the companion CD.

  • For more information about system and performance monitoring, see “Overview of Performance Monitoring” in the Operations Guide of the Microsoft Windows 2000 Server Resource Kit.

Troubleshooting Overview

Whether an issue stems from a hardware or software problem, you need a reliable troubleshooting plan. Guesswork and random solutions are unreliable and often unsuccessful. An effective troubleshooting plan starts with gathering information, observing symptoms, and doing research.

Figure 27-1 illustrates a six-step troubleshooting model used by Microsoft Product Support Services engineers, who call it the “detect method.”

Figure 27-1 Troubleshooting model

Based on research in problem solving, the six steps of this troubleshooting model are as follows:

  • Discover the problem.

    Identify and document problem symptoms, and search technical information resources to determine whether the problem is a known condition. For more information, see “Identify Problem Symptoms” and “Check Technical Information Resources” later in this chapter.

  • Evaluate system configuration.

    Review your system’s history to determine what configuration changes occurred since the computer last worked correctly. Did you install new hardware or software? Did you verify that the hardware or software is fully compatible with Windows XP Professional? For more information, see “Review Your System’s History,” “Check Firmware Versions,” and “Avoid Common Pitfalls” later in this chapter.

  • Track possible solutions.

    Instead of using the trial-and-error approach, review Microsoft Knowledge Base articles. You can simplify troubleshooting by temporarily removing hardware and software that is not needed for starting Windows XP Professional. Consider enabling Windows XP Professional logging options to better evaluate your troubleshooting efforts. For more information, see “Troubleshooting Strategies” later in this chapter.

  • Execute a plan.

    Test potential solutions and have a contingency plan if these solutions do not work or have a negative impact on the computer. Be sure to back up critical system or application files. For more information, see “Avoid Common Pitfalls” later in this chapter.

  • Check results.

    Determine whether your plan was successful. Have another plan in place to address unresolved issues.

  • Take a proactive approach.

    Document changes that you make along the way while troubleshooting the problem. After resolving the problem, organize your notes and evaluate your experience. Think about ways to avoid or reduce the impact of the problem in the future. For more information, see “Document and Evaluate the Results” and “Take Proactive Measures” later in this chapter.

For more information about the preceding steps, see “Troubleshooting Concepts” and “Troubleshooting Strategies” later in this chapter.
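
As a rough illustration, the six steps can be tracked as an ordered checklist. This is a minimal sketch, not part of the Microsoft model itself; the names DETECT_STEPS and next_step are hypothetical.

```python
# The six steps of the troubleshooting model, in order.
DETECT_STEPS = [
    "Discover the problem",
    "Evaluate system configuration",
    "Track possible solutions",
    "Execute a plan",
    "Check results",
    "Take a proactive approach",
]


def next_step(completed):
    """Return the first step that has not been completed, or None if all are done."""
    for step in DETECT_STEPS:
        if step not in completed:
            return step
    return None


# The first letters of the steps spell the name of the method.
acronym = "".join(step[0] for step in DETECT_STEPS)
print(acronym)
print(next_step({"Discover the problem"}))
```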

Troubleshooting Concepts

The immediate goal of any troubleshooting session is to restore service as quickly as possible. However, the larger goal is to determine the cause of the problem. Root-cause analysis is the practice of searching for the source of problems to prevent them from recurring.

Problems represent deviations from known or expected behavior, and the most effective way to solve a problem is to gather information before acting and then isolate and eliminate variables.

Identify Problem Symptoms

Start by observing and identifying symptoms of the problem. You need to learn more about the circumstances in which problems occur and become familiar with system behavior when issues arise. Here are some questions that you can use to help identify symptoms:

Do error messages appear?

If error messages appear, record the error numbers, the exact message text, and a brief description of the activity. This information is useful when researching the cause of the problem or when consulting with technical support. In your description, include events that precede or follow a problem and the time and date of the error. For complex or lengthy messages, you can use a program such as Microsoft Paint (Mspaint.exe) to record the error message as a bitmap.

To capture an on-screen error message
  1. Click the window or dialog box that contains the error message.

  2. To capture the contents of the entire desktop, press Print Screen (or PrtScn).

    – or –

    To capture an image of the active (foreground) desktop window only, press Alt+Print Screen (or Alt+PrtScn).

  3. In the Run dialog box, in the Open box, type:

    mspaint

  4. On the Edit menu, click Paste.

  5. If the prompt The image in the clipboard is larger... appears, click Yes.

  6. On the File menu, click Save As, type a file name for the image, and then click Save.

Error messages might appear before Windows XP Professional starts. For example, motherboard or storage adapter firmware might display an error message if self-tests detect a hardware problem. If you are unable to record the message quickly enough, you can pause the text display by pressing PAUSE BREAK. To continue, press Ctrl+Pause Break.
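
In addition to a screen capture, it can help to keep the recorded details in a consistent structure. The following is a minimal sketch of such a record; the ErrorRecord class, its field names, and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ErrorRecord:
    """One observed error, with the details worth writing down."""
    error_number: str      # error number exactly as displayed
    message_text: str      # exact message text
    activity: str          # brief description of the activity at the time
    observed_at: datetime  # time and date of the error
    preceding_events: list = field(default_factory=list)

    def summary(self):
        stamp = self.observed_at.strftime("%Y-%m-%d %H:%M")
        return f"[{stamp}] {self.error_number}: {self.message_text} (during {self.activity})"


# Illustrative values only.
record = ErrorRecord(
    error_number="0x0000007B",
    message_text="INACCESSIBLE_BOOT_DEVICE",
    activity="restart after a storage driver update",
    observed_at=datetime(2005, 3, 14, 9, 30),
    preceding_events=["storage adapter driver updated"],
)
print(record.summary())
```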

Did you check Event Viewer logs?

Entries in Event Viewer’s application, security, and system logs might contain information helpful for determining the cause of the problem. Look for symptoms or signs of problems that occur at frequent or regular intervals. For more information about Event Viewer, see Windows XP Professional Help and Support Center and Appendix C, “Tools for Troubleshooting.”
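
One way to spot problems that recur at regular intervals is to compare the times of matching log entries. The sketch below assumes you have already copied the event times out of Event Viewer by hand; it does not read the event logs itself, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def intervals_between(timestamps):
    """Return the gaps between consecutive event times, oldest first."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical times at which the same warning was logged.
event_times = [
    datetime(2005, 3, 14, 9, 0),
    datetime(2005, 3, 14, 11, 0),
    datetime(2005, 3, 14, 10, 0),
]
gaps = intervals_between(event_times)

# The events are "regular" here if every gap has the same length.
is_regular = len(gaps) > 1 and all(gap == gaps[0] for gap in gaps)
print(gaps, is_regular)
```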

Did you check log files on your computer?

Error messages sometimes direct you to view a log file on your computer. The operating system or an application typically saves log files in text format. By using Notepad or an equivalent text editor, you can view the contents of a text log file to determine whether it contains information useful for troubleshooting your problem.
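
For long log files, a short script can narrow down the lines worth reading. This is a minimal sketch that assumes a plain-text log and a hypothetical set of keywords; real log formats vary, so adjust the keywords to match the file you are examining.

```python
def find_problem_lines(log_text, keywords=("error", "fail", "warning")):
    """Return (line_number, line) pairs whose text contains any keyword."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append((number, line.strip()))
    return hits


# Made-up log contents for illustration.
sample_log = """\
Setup started
Copying files
Error: could not open driver file
Setup finished with 1 warning
"""
hits = find_problem_lines(sample_log)
for number, line in hits:
    print(f"{number}: {line}")
```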

Does the problem coincide with an application or activity?

If the problem occurs when an application is running or during activities such as network printing or Internet browsing, you can reproduce the error to observe details and gather information for troubleshooting purposes. Be sure to record what applications and features are being used when the problem occurs.

Do previous records exist?

Check to see whether there are records that describe changes, such as the software installed or hardware that has been upgraded. If records are not available, you might query users or other support technicians. Pay special attention to recent changes such as Service Packs applied, device drivers installed, and motherboard or peripheral firmware versions. This information can help you determine whether the problem is new or a condition that has worsened.

Is baseline information available?

Baseline information is system configuration and performance data taken at various times to mark hardware and software changes. If possible, compare current baselines with previous ones to determine the effects of recent changes on system performance. If previous baselines are not available, you can generate a baseline to evaluate recent efforts to troubleshoot your current system configuration. For more information about generating configuration baselines of your systems’ performance and hardware, see Windows XP Professional Help and Support Center and “Overview of Performance Monitoring” in the Operations Guide of the Microsoft Windows 2000 Server Resource Kit.
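
Comparing two baselines amounts to finding what was added, removed, or changed. The following sketch assumes each baseline has been reduced to a simple dictionary of setting names and values; the keys and values shown are hypothetical.

```python
def compare_baselines(previous, current):
    """Return settings that were added, removed, or changed between two baselines."""
    added = {k: current[k] for k in current.keys() - previous.keys()}
    removed = {k: previous[k] for k in previous.keys() - current.keys()}
    changed = {
        k: (previous[k], current[k])
        for k in previous.keys() & current.keys()
        if previous[k] != current[k]
    }
    return added, removed, changed


# Hypothetical baseline data.
previous = {"bios_version": "A05", "video_driver": "6.14.10.1", "memory_mb": 512}
current = {
    "bios_version": "A07",
    "video_driver": "6.14.10.1",
    "memory_mb": 512,
    "scanner_driver": "1.0",
}
added, removed, changed = compare_baselines(previous, current)
print(added, removed, changed)
```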

Do other users who log on to the same computer have similar problems? Are all users who do not experience problems using Administrator accounts, or do they share other common attributes? For example, check whether the problem occurs when using a newly created user account.

Determine whether the same error occurs on more than one computer on a network. See whether the error happens when you log on locally or use a domain account. For a network-related error that occurred during startup, try disconnecting the network cable and restarting the computer. For more information about troubleshooting startup problems, see Chapter 29, “Troubleshooting the Startup Process.”

Is incompatible or untested software installed?

Are you using unsigned or beta drivers? Installing software not fully tested for compatibility with Windows XP Professional or using unsigned drivers can cause erratic behavior or instability.

Do you have backups to examine?

If you can establish a time frame for the problem, try to locate earlier system backups. Examining the differences between current and previous configurations can help you identify system components or settings that have changed. In addition to examining backups, you can use the System Restore tool to save or restore system states. By comparing the current state to past states, you might be able to determine when the changes occurred and identify the components or settings affected. For more information about System Restore, see Windows XP Professional Help and Support Center and Appendix C, “Tools for Troubleshooting.”

Check Technical Information Resources

After you gather information about key symptoms, check internal and external technical information sources for ideas, solutions, and similar or related symptoms reported by others.

Information resources such as Windows XP Professional–related newsgroups and the Microsoft Knowledge Base can save you time and effort. The ideal situation is that your problem is a known issue, complete with solutions or suggestions that point you in the right direction. See sources of information shown in Table 27-1.

Table 27-1 Help and Information Sources

Source

Description

Help and Support Center

Provides access to troubleshooting tools, wizards, information, and links that cover a wide range of Windows XP Professional–related topics, including:

  • Hardware devices, such as modems and network adapters

  • Networking and the Internet

  • Multimedia applications and devices

  • E-mail, printing, and faxing

  • Working remotely

  • Remote assistance and troubleshooting

  • System information and diagnostics

  • Troubleshooting tools and diagnostic programs provided by Windows XP Professional

To do a search using this feature, on the Start menu, click Help and Support.

Help Desk, Problem Management Department

Technicians who have access to a wide range of information and history, including common problems and solutions.

Information Technology Infrastructure Library (ITIL) and Microsoft Operations Framework (MOF) Web sites

Sites that provide information for developing, troubleshooting, planning, organizing, and managing information technology (IT) services. The ITIL Web site provides an online glossary of industry terms commonly used in IT-related documents. For more information, see the ITIL link and the Microsoft Operations Framework (MOF) link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Internet newsgroups

Technical newsgroups such as those available at news://msnews.microsoft.com offer peer support for common computer problems. You can exchange messages in an appropriate forum to request or provide solutions and workarounds. Newsgroup discussions cover a wide range of topics and provide valuable information that might help you track down the source of your problem. Viewing newsgroup messages generally requires newsreader software, such as Outlook Express. An alternative approach is to use a Web-based newsgroup reader such as the one available on the Microsoft Technical Communities Web site at https://www.microsoft.com/communities.

Manufacturers’ Web sites

Web sites offered by manufacturers of computers, peripherals, and applications to provide Web support for their products.

Microsoft Knowledge Base

An extensive list of known problems and solutions that you can search. If you are unfamiliar with searching the Microsoft Knowledge Base, see article 242450, “How to Query the Microsoft Knowledge Base Using Keywords.” To find this article and for more information about the Microsoft Knowledge Base, see the Microsoft Knowledge Base link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Microsoft Product Support Services

A Web site that contains technical information, useful links, downloads, and answers to frequently asked questions (FAQs). To access the support options available from Product Support Services, see the Microsoft Product Support Services link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Microsoft TechNet

A subscription-based service for IT professionals that enables you to search technical content and topics about Microsoft products. For more information about TechNet, see the Microsoft TechNet link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Other online information Web sites

Many Web sites maintained by individuals and organizations provide troubleshooting information for Microsoft Windows 98, Microsoft Windows Me, Microsoft Windows NT version 4.0, Microsoft Windows 2000, Microsoft Windows Server™ 2003, and Windows XP Professional. Some of these Web sites specialize in hardware issues; others, in software.

Readme files

Files that contain the latest information about the software or driver installation media. Typical file names are “Readme.txt” or “Readme1st.txt.”

Reference books

Reference books such as the Microsoft Windows 2000 Server Resource Kit, the Microsoft Windows Server 2003 Deployment Kit, and the Microsoft Windows Server 2003 Resource Kit provide helpful information for diagnosing problems.

Technical support

Technical support can help you solve a complex problem that might otherwise require substantial research time.

Training

Instructor-led or self-paced training can increase your troubleshooting efficiency.

Windows Update Web site

A site that contains downloadable content, including current information about improving system compatibility and stability. For more information about Windows Update, see the Windows Update link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Before you apply a solution or workaround, or test an upgraded or updated application, use Backup to back up your system. Backups allow you to restore the computer to the previous state if you are not satisfied with the results. For information about backing up your system, see Chapter 14, “Backing Up and Restoring Data.”

If your organization has test labs to use, consider testing workarounds and updates in a lab environment before applying them to multiple systems. For more information about software compatibility testing, see “Avoid Common Pitfalls” later in this chapter.

Review Your System’s History

Review the history of your computer to know about recent changes, including all hardware and software installed. If baseline or change records exist, look for information about new devices, new applications, updated drivers, and change dates—as well as descriptions of the work done. If records are not available, you can learn much about your computer by querying users and internal support personnel or by using tools such as Device Manager and System Information. For more information about Device Manager and System Information, see Windows XP Professional Help and Support Center and Appendix C, “Tools for Troubleshooting.” Also, see Chapter 9, “Managing Devices.”

Here are a few points to consider when reviewing the history of your computer:

  • Did problems occur shortly after the installation of a particular application?

  • Was a software update applied?

    Microsoft technical support might provide a software update for an urgent or critical issue. Software updates address a specific issue and might not be fully tested for compatibility. For example, a software update that works for one computer might cause unwanted results in another. Carefully read and follow the instructions before applying a software update.

  • Did the problem occur soon after new hardware was installed?

  • Why were hardware or software updates made?

    Are the motherboard and peripheral firmware current? Can you establish a relationship between the problem and the recent change?

  • Are any non–Plug and Play devices installed?

    If so, you can check for proper configuration by using hardware diagnostic programs and Device Manager. Try replacing non–Plug and Play devices with hardware that is compatible with Windows XP Professional. For more information about selecting hardware, see “Avoid Common Pitfalls” later in this chapter.

  • Was a new user recently assigned to the computer?

    If so, review system history to determine whether the user has installed incompatible hardware or software.

  • When was the last virus check performed?

    Does the virus scanning software incorporate the latest virus signature updates? For more information about virus signature versions and updates, see the documentation provided with your antivirus software.

  • If a Service Pack is installed, is it the latest version?

To determine the version of a Service Pack
  • In the Run dialog box, in the Open box, type:

    winver

Compare System Settings and Configurations

If similar computers in your organization are problem free when you are troubleshooting a problem, you can use those problem-free computers as a reference for your root-cause analysis. The properly functioning system can provide valuable baseline data. By comparing the following elements, you can speed up the process of identifying contributing causes.

Installed services and applications

Generate a list of applications and services installed on the baseline computer to compare with applications and services on the problem system. To gather a list of applications installed on your system, use Add or Remove Programs in Control Panel. To gather a list of services enabled on your system, use Services (Services.msc) or System Information. For more information about using Add or Remove Programs, Services, and System Information, see Windows XP Professional Help and Support Center. Also, see Appendix C, “Tools for Troubleshooting,” and Chapter 29, “Troubleshooting the Startup Process.”

Tip: Service Pack 2 for Windows XP adds a Show Updates check box to Add or Remove Programs that lets you toggle between displaying or hiding installed updates such as security updates downloaded from the Windows Update Web site.
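
Once you have a list of services or applications from each computer, the comparison is a set difference. The sketch below assumes the lists were gathered by hand from Services or Add or Remove Programs; the function name is hypothetical, and "Contoso Backup Agent" is a made-up entry.

```python
def compare_inventories(reference, problem):
    """Return items found only on the reference system and only on the problem system."""
    only_on_reference = sorted(set(reference) - set(problem))
    only_on_problem = sorted(set(problem) - set(reference))
    return only_on_reference, only_on_problem


# Example lists; "Contoso Backup Agent" is a hypothetical third-party service.
reference_services = ["Print Spooler", "Task Scheduler", "Windows Time"]
problem_services = ["Print Spooler", "Windows Time", "Contoso Backup Agent"]

only_ref, only_prob = compare_inventories(reference_services, problem_services)
print("Only on problem-free system:", only_ref)
print("Only on problem system:", only_prob)
```

Items that appear on only one of the two systems are reasonable first candidates to investigate.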

Software revisions

Check the application and driver revisions to see whether differences exist between the two systems. Update the problem system’s software to match the versions used on the problem-free system. For applications, you can usually find version information by clicking Help and then clicking About application name. For drivers, you can use Device Manager or System Information to find version information. For more information about determining device driver versions, see Chapter 9, “Managing Devices.”
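
Version strings are easier to compare reliably when they are split into numbers rather than compared as text. This is a minimal sketch that assumes dotted numeric versions; the component names and version numbers are illustrative.

```python
def parse_version(text):
    """Convert a dotted version string such as '5.1.2600.0' to a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def find_version_mismatches(reference, problem):
    """Return {name: (problem_version, reference_version)} for versions that differ."""
    mismatches = {}
    for name, ref_version in reference.items():
        prob_version = problem.get(name)
        if prob_version is None:
            continue  # not installed on the problem system
        if parse_version(prob_version) != parse_version(ref_version):
            mismatches[name] = (prob_version, ref_version)
    return mismatches


# Illustrative version data for two systems.
reference = {"video driver": "6.14.10.6462", "network driver": "5.1.2600.0"}
problem = {"video driver": "6.14.10.6001", "network driver": "5.1.2600.0"}
mismatches = find_version_mismatches(reference, problem)
print(mismatches)
```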

System logs

Compare Event Viewer logs for problem indications such as signs of hardware stress. For example, unexpected system shutdowns are logged with a “1076” event identification number in the System event log. The associated descriptive text can provide essential information to diagnose the problem. Baseline and problem systems might have similar problems, but the symptoms are more noticeable on one computer because it performs a unique or very demanding role. For example, a server that provides multimedia content typically consumes more system resources than a server that stores infrequently used Microsoft Word documents. Problems with disk, audio, video, or network devices and drivers typically appear earlier on computers that are stressed. Additionally, most Windows XP Professional components provide logging options, which can help you troubleshoot features such as authentication, security, and remote access.

Hardware revisions

A minor hardware component upgrade might not be significant enough to cause a manufacturer to change a product model number. Consider the following hypothetical scenario:

A computer company uses a revision 1.0 motherboard when assembling a Model ZZXZ1234 computer. When reordering components, the company receives notice from the original equipment manufacturer (OEM) that it plans to correct certain problems by substituting updated revision 1.1 motherboards. The computer company then incorporates the updated components into all Model ZZXZ1234 computers. These minor changes might require you to exercise more care when updating drivers or firmware in your Model ZZXZ1234 computers. For example, a support Web page for Model ZZXZ1234 computers might post specific firmware versions, such as V3.0 for revision 1.0 motherboards and V4.0 for revision 1.1 and higher motherboards. Using firmware version V4.0 for computers that use revision 1.0 motherboards might cause problems.
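
The firmware rule in this hypothetical scenario can be written as a small lookup. The sketch below encodes only the example above (V3.0 for revision 1.0, V4.0 for revision 1.1 and higher); the function name is an assumption, and real products publish their own rules.

```python
def firmware_for_revision(board_revision):
    """Return the firmware version for a motherboard revision given as (major, minor)."""
    if board_revision >= (1, 1):
        return "V4.0"  # revision 1.1 and higher
    if board_revision == (1, 0):
        return "V3.0"  # original revision 1.0 boards
    raise ValueError(f"Unknown motherboard revision: {board_revision}")


print(firmware_for_revision((1, 0)))
print(firmware_for_revision((1, 1)))
```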

Check Firmware Versions

When you turn on or cycle power to a computer, the central processing unit (CPU) begins to carry out programming instructions, or code, contained in the motherboard system firmware. Firmware—known as basic input/output system (BIOS) on x86-based and x64-based computers and internal adapters—contains operating system independent code necessary for the operating system to perform low-level functions such as startup self-tests and the initialization of devices required to start Windows XP Professional. If instability or setup problems affect only a few Windows XP Professional–based computers in your organization, check the motherboard and peripheral firmware.

Motherboard firmware revisions

Compare the BIOS version on the problem and problem-free systems. If the versions differ, check the computer manufacturer’s Web site for the latest firmware revisions. For example, if your firmware revision A was stable, but upgrading to firmware revision B causes problems, you might find firmware revision C on the Web site. If no revision C exists, temporarily downgrade to revision A until an update becomes available.
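
The decision in this example can be sketched as follows, for the case where the installed revision is causing problems. It assumes revisions are single letters that sort in release order, which is an illustrative simplification; the function name is hypothetical.

```python
def choose_firmware(last_stable, installed, available):
    """Pick a revision to move to when the installed revision causes problems.

    Prefer the newest revision released after the installed one;
    otherwise fall back to the last revision known to be stable.
    """
    newer = [revision for revision in available if revision > installed]
    if newer:
        return max(newer)
    return last_stable


# Revision C exists, so try it.
print(choose_firmware("A", "B", ["A", "B", "C"]))
# No revision C yet, so temporarily downgrade to A.
print(choose_firmware("A", "B", ["A", "B"]))
```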

Peripheral firmware revisions

It might be necessary to check peripheral firmware revisions and upgrade firmware for individual peripherals, such as Small Computer System Interface (SCSI) adapters, CD and DVD-ROM drives, hard disks, video cards, and audio devices. Peripheral firmware contains device-specific instructions, but it is independent from the operating system. Peripheral firmware enables a device to perform specific functions. Upgrading firmware can enhance performance, add new features, or correct compatibility problems. In most cases, you can upgrade device firmware by using software the manufacturer provides. Outdated motherboard system firmware can cause problems, especially for Advanced Configuration and Power Interface (ACPI) systems.

OEMs periodically incorporate updated firmware into existing products to address customers’ issues or to add new features. Sometimes similar computers using the same hardware components have different motherboard and peripheral firmware versions. Upgrading firmware on older devices might require you to replace components (such as electronic chips) or exchange the part for a newer version. To avoid firmware problems, be sure to check the firmware revision your computer uses.

To check the firmware version on your computer
  1. In the Run dialog box, in the Open box, type:

    msinfo32

  2. In the Item column, locate BIOS Version.

Compare the firmware version listed for your system against the most recent revision available on the OEM’s Web site. Figure 27-2 shows an example of this.

Figure 27-2 Motherboard firmware revision in System Summary

Note: The Windows NT 4.0 Windows Diagnostics tool (Winmsd.exe) is not available in Windows XP Professional. Typing winmsd at the command prompt now starts System Information, which contains similar information.

To check whether your firmware is ACPI compliant
  1. In the Run dialog box, in the Open box, type:

    devmgmt.msc

  2. In Device Manager, expand Computer and note the computer type that is listed.

Figure 27-3 shows a Device Manager display for a computer that is not using ACPI features.

Figure 27-3 Non-ACPI computer in Device Manager

As shown in Figure 27-4, if the text ACPI appears under Computer, Windows XP Professional is using ACPI functionality.

Figure 27-4 ACPI-compliant computer in Device Manager

For additional information about the ACPI specification, see the ACPI link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Warning: Failure to follow the manufacturer’s instructions for updating firmware might cause permanent damage to your computer. If you are unfamiliar with this process, request assistance from trained personnel. Back up important data before you attempt to upgrade your firmware in case you are unable to start your computer.

During installation, Windows XP Professional checks system firmware to determine whether your computer is ACPI compliant. This check helps prevent system instability, whose symptoms can range from hardware problems to data loss. If your system firmware does not pass all tests, Setup does not install the ACPI hardware abstraction layer (HAL). If you are certain that your computer is equipped with ACPI-compliant system firmware, but Windows XP Professional does not use ACPI features (the computer is listed as a non-ACPI Standard PC, for example), contact the computer manufacturer for updated motherboard firmware. After upgrading from non-ACPI to ACPI firmware, you must reinstall Windows XP Professional to take advantage of ACPI features.

Caution: If you attempt to override the default ACPI settings selected by Windows XP Professional, setup problems occur. The remedy is reinstallation of the operating system. For more information about how Windows XP Professional determines ACPI compatibility, see articles 216573, “How Windows 2000 Determines ACPI Compatibility,” and 197055, “Disabling ACPI Support in BIOS Results in Error Message,” in the Microsoft Knowledge Base. To find these articles, see the Microsoft Knowledge Base link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

For more information about System Information and Device Manager, see Windows XP Professional Help and Support Center, as well as Appendix C, “Tools for Troubleshooting” and Chapter 9, “Managing Devices.”

Troubleshooting Strategies

After you observe symptoms, check technical information sources, and review your system’s history, you might be ready to test a possible solution based on the information that you have gathered. If you are unable to locate information that applies to your problem or find more than one solution that applies, try to further isolate your problem by grouping observations into different categories such as software-related symptoms (as a result of a service or application), hardware-related symptoms (by hardware types), and error messages. Prioritize your list by frequency of occurrence, and eliminate symptoms that you can attribute to user error. This enables you to methodically plan the diagnostic steps to take or to select the next solution to try.
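
The grouping and prioritizing described above can be sketched with a simple tally. This example assumes each observation has been noted as a (category, symptom, is_user_error) tuple; that layout and the sample data are hypothetical.

```python
from collections import Counter


def prioritize(observations):
    """Count symptoms by category, skip user errors, and sort by frequency."""
    counts = Counter(
        (category, symptom)
        for category, symptom, is_user_error in observations
        if not is_user_error
    )
    return counts.most_common()


# Hypothetical troubleshooting notes.
observations = [
    ("hardware", "display flickers", False),
    ("software", "application stops responding", False),
    ("hardware", "display flickers", False),
    ("error message", "printer offline", True),  # attributed to user error
    ("hardware", "display flickers", False),
    ("software", "application stops responding", False),
]
priorities = prioritize(observations)
for (category, symptom), count in priorities:
    print(f"{count}x {category}: {symptom}")
```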

Isolate and Resolve Hardware Problems

When troubleshooting hardware, work toward the simplest configuration possible by disabling or removing devices. Then incrementally add or remove devices until you isolate the problem device. In safe mode, Windows XP Professional starts with only essential drivers, which makes it useful for diagnosing problems. For more information about safe-mode troubleshooting, see Windows XP Professional Help and Support Center and Appendix C, “Tools for Troubleshooting.”

Check your hardware

If your diagnostic efforts point to a hardware problem, you can run diagnostic software available from the manufacturer. These programs run self-tests that confirm whether a piece of hardware has malfunctioned or failed and needs replacing. You can also install the device on different computers to verify that the problem is not because of system-specific configuration issues. Replacing defective hardware and diagnosing problems on a spare or test computer minimizes the impact on the user as a result of the system being unavailable. If diagnostic software shows that the hardware is working, consider upgrading or rolling back device drivers.

Reverse driver changes

If a hardware problem causes a Stop error that prevents Windows XP Professional from starting in normal mode, you can use the Last Known Good Configuration startup option. The Last Known Good Configuration enables you to recover from problems by reverting driver and registry settings to those used during the last user session. If you are able to start Windows XP Professional in normal mode after using the Last Known Good Configuration, disable the problem driver or device. Restart the computer to verify that the Stop message does not recur. If the problem persists, repeat this procedure until you isolate the hardware that is causing the problem.

Another method to recover from problems that occur after updating a device driver is to use Device Driver Roll Back in safe or normal mode. If you updated a driver since installing Windows XP Professional, you can roll back the driver to determine whether the older driver restores stability. If another driver is not available, disable the device by using Device Manager until you are able to locate an updated driver.

Using Device Manager to disable devices is always preferable to physically removing a part because using Device Manager does not risk damage to internal components. If you cannot disable a device by using Device Manager, uninstall the device driver, turn off the system, remove the part, and restart the computer. If this improves system stability, the part might be causing or contributing to the problem and you need to reconfigure it.

For more information about the Last Known Good Configuration startup option and Device Driver Roll Back, see Windows XP Professional Help and Support Center. Also, see Chapter 29, “Troubleshooting the Startup Process” and Appendix C, “Tools for Troubleshooting.” For more information about disabling devices and drivers, see Chapter 9, “Managing Devices.”

Isolate and Resolve Software Issues

If you suspect that a software problem or a recent change to system settings is preventing applications or services from functioning properly, use safe mode to help diagnose the problem. You can also use the Last Known Good Configuration startup option or System Restore to undo changes made by a recently installed application, driver, or service. You can isolate issues by using the following methods.

Closing applications and processes

Close applications one at a time, and then observe the results. A problem might occur only when a specific application is running. You can use Task Manager to end applications that have stopped responding. For more information about ending applications and processes using Task Manager, see Windows XP Professional Help and Support Center.

Temporarily disabling services

By using the Services snap-in (Services.msc) or the System Configuration Utility (Msconfig.exe), you can stop and start most system services. For some services, you might need to restart the computer for changes to take effect. For more information about using the Services snap-in and the System Configuration Utility to disable services, see Windows XP Professional Help and Support Center and Chapter 29, “Troubleshooting the Startup Process.”

To isolate a service-related problem, you can choose to do the following:

  • Disable services one at a time until the problem disappears.

    You can then enable all other services to verify that you found the cause of the problem.

  • Disable all non–safe mode services and then re-enable them one at a time until the problem appears.

    Use the System Configuration Utility and boot logging to determine the services and drivers initialized in normal and safe mode. You can then disable all non–safe mode drivers and re-enable them one at a time until the problem returns.
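The comparison of normal-mode and safe-mode boot logs described above can be sketched in a few lines of Python. This is a minimal illustration, not a supported tool: it assumes that each boot log (Ntbtlog.txt) has been saved as text and that loaded drivers appear on lines beginning with "Loaded driver", which you should confirm against your own logs. The function names are hypothetical.

```python
import re

# Assumed line format of a boot log entry; verify against a real Ntbtlog.txt.
_LOADED = re.compile(r"^Loaded driver\s+(.+)$", re.IGNORECASE)


def loaded_drivers(boot_log_text):
    """Return the set of driver file names reported as loaded in a boot log."""
    drivers = set()
    for line in boot_log_text.splitlines():
        match = _LOADED.match(line.strip())
        if match:
            # Keep only the file name so that differing path prefixes do not matter.
            path = match.group(1).strip().replace("/", "\\")
            drivers.add(path.split("\\")[-1].lower())
    return drivers


def non_safe_mode_drivers(normal_log_text, safe_log_text):
    """Return drivers loaded in normal mode but not in safe mode, sorted by name."""
    return sorted(loaded_drivers(normal_log_text) - loaded_drivers(safe_log_text))
```

The resulting list is only a set of candidates. You still disable and re-enable the items one at a time, restarting as needed, to confirm which one causes the problem.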

For more information about System Restore, System Configuration Utility, and boot logging, see Windows XP Professional Help and Support Center and Appendix C, “Tools for Troubleshooting.” For more information about disabling applications and services while troubleshooting startup problems, see Chapter 29, “Troubleshooting the Startup Process.”

Avoid Common Pitfalls

You can complicate a problem or troubleshooting process unnecessarily by acting too quickly. Avoid the following common pitfalls that can hinder your efforts:

  • Not adequately identifying the problem before taking action

  • Not observing the effects of diagnostic changes

  • Not documenting changes while troubleshooting

  • Not restoring previous settings

  • Troubleshooting several problems at one time

  • Using incompatible or untested hardware

  • Using incompatible software

Not Identifying the Problem Adequately

If you fail to make essential observations before responding, you can miss important information in the critical moments when symptoms first appear. Here are some typical scenarios.

Failing to record information before acting

An error occurs and you start your research without recording important information such as the complete error message text and the applications running. During your research, you check technical information resources but find that you are unable to narrow the scope of your search because of insufficient information.

For more information about the types of information to record during troubleshooting, see “Identify Problem Symptoms” earlier in this chapter.

Restarting the computer too soon

In response to frequent random errors users experience with a certain application, you restart the affected computers without observing and recording the symptoms. Although users can resume work for the day, a call to technical support later that day is less effective because you cannot reproduce the problem. You must wait for the problem to recur before you can gather critical information needed to determine the root cause. For example, symptoms can be caused by power surges, faulty power supplies, excessive dust, or inadequate ventilation. Restarting the computer might be a temporary solution that does not prevent recurrence.

Failing to check for scheduled maintenance events or known service outages

A user comes to work early and finds that network resources or applications are not responding. You spend time troubleshooting the problem without success only to discover that both you and the user failed to read e-mail announcing that scheduled maintenance would cause temporary early morning outages.

Assuming that past solutions always work

Prior experience can shorten the time to solve a recurring problem because you already know the remedy. However, the same solution might not always solve a problem that looks familiar. Always verify the symptoms before acting. If your initial assumptions are incorrect and you misdiagnose the problem, your actions might make the situation worse. Keep an open mind when troubleshooting. When in doubt, verify your information by searching technical information sources (including technical support) and obtain advice from experienced colleagues. Do not ignore new information, and question past procedures that seem inappropriate.

Neglecting to check the basics

A user cannot print to a new local inkjet printer. You verify cable and power connections, check the ink cartridge, and run the printer’s built-in diagnostics, but you find nothing wrong. Windows XP Professional cannot detect the printer, so you manually install the most recent drivers without success. Reinstalling Windows XP Professional does not solve the problem, and you later realize that you neglected to find out whether printing to any local printer from this computer has ever been successful. You find that the user has never tried this, and a firmware check reveals that the parallel port is disabled. Enabling the parallel port resolves all printing problems.

Not Observing the Effects of Diagnostic Changes

System setting changes do not always take effect immediately. For example, when troubleshooting replication issues, you must wait to observe changes. If you do not allow adequate time to pass, you might prematurely conclude that the change was not effective. To avoid this situation, familiarize yourself with the feature that you are troubleshooting and thoroughly read the information provided by technical support before judging the effectiveness of a workaround or update.

Not Documenting Changes while Troubleshooting

Documenting the steps that you take while troubleshooting allows you to review your actions after you have resolved the problem. This is useful for very complex problems that require lengthy procedures to resolve. Documenting your steps allows you to verify that you are not duplicating or skipping steps, and it enables others to assist you with the problem. It also allows you to identify the exact steps to take if the problem recurs and enables you to evaluate the effectiveness of your efforts.

Not Restoring Previous Settings

If disabling a feature or changing a setting does not produce the results you want, restore the feature or setting before trying something else. For example, record firmware settings before changing them to diagnose problems. Not restoring settings can make it difficult to determine which of your actions resolved the problem. When verifying solutions that require you to make extensive changes or restart the computer multiple times, perform backups before troubleshooting so that you can restore the system if your actions are ineffective or cause startup problems.
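One way to picture the "record, change, restore" discipline is a small change journal. The sketch below is purely illustrative: the ChangeJournal class and the setting names in the usage note are hypothetical, and real firmware or system settings are changed through their own tools rather than through a Python dictionary.

```python
class ChangeJournal:
    """Records each diagnostic change so that it can be undone in reverse order."""

    def __init__(self, settings):
        self.settings = settings  # dict of setting name -> current value
        self._history = []        # (name, previous value, existed before) tuples

    def change(self, name, new_value):
        """Apply a change after recording the value it replaces."""
        existed = name in self.settings
        self._history.append((name, self.settings.get(name), existed))
        self.settings[name] = new_value

    def restore_all(self):
        """Undo every recorded change, most recent first."""
        while self._history:
            name, previous, existed = self._history.pop()
            if existed:
                self.settings[name] = previous
            else:
                # The setting did not exist before the change, so remove it.
                self.settings.pop(name, None)
```

For example, after calling change("parallel_port", "enabled") on a journal whose settings started as {"parallel_port": "disabled"}, a call to restore_all() returns the dictionary to its original state.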

Review backup procedures

Backups are essential for all computers, from personal systems to high-availability servers. If you suspect that your troubleshooting efforts might worsen the problem or risk important data, perform a backup. This enables you to restore your system if you experience data loss, Stop errors, or other startup problems. Backups allow you to partially or completely restore the system and continue where you left off. When you evaluate or create backup procedures, consider the following:

  • Use the verification option of your backup software to check that your data is correctly written to backup media.

  • Routinely check the age and condition of backup media, and follow the manufacturer’s recommendations for using backup media.

  • Follow the hardware manufacturer’s recommendations for maintaining the backup device.
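To illustrate what a verification pass checks, the following Python sketch compares a source folder with a disk-based copy by using SHA-256 digests. It is an assumption-laden example, not a description of how Windows Backup or any other backup product verifies media; the function names are hypothetical, and it applies only to backups stored as ordinary files.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_dir, backup_dir):
    """Return relative paths that are missing from the backup or differ from the source."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for source_file in source_dir.rglob("*"):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file() or file_digest(source_file) != file_digest(backup_file):
            problems.append(relative.as_posix())
    return sorted(problems)
```

An empty result means that every source file has an identical counterpart in the backup folder. It does not check for extra files in the backup or confirm that the backup media will remain readable.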

For more information about using Backup for troubleshooting, see Appendix C, “Tools for Troubleshooting.” For more information about performing and planning backups, see Chapter 14, “Backing Up and Restoring Data.”

Windows XP Professional also provides other ways to restore system settings such as System Restore and the Last Known Good Configuration startup option. For more information, see Windows XP Professional Help and Support Center and Appendix C, “Tools for Troubleshooting.”

Troubleshooting Several Problems at One Time

If multiple problems affect your system, avoid troubleshooting them as a group. Instead, identify shared symptoms, and then isolate and treat each separately. For example, faulty video memory can cause Stop messages, corrupted screen images, and system instability. While diagnosing the symptoms, you might find that errors occur only with multimedia applications that use advanced three-dimensional rendering. When you attempt to rule out the possibility of failed video hardware by replacing the VGA adapter, you might find that this action also resolves the other issues.

Using Incompatible or Untested Hardware

For many organizations, standards for selecting hardware and purchasing new systems and replacement parts do not exist, are not fully defined, or are simply ignored. Standards that are well defined, refined, maintained, and followed can reduce hardware variability and optimize troubleshooting efforts.

If you need to replace hardware, record your troubleshooting actions as thoroughly as possible. Before installing a new device or replacement part, verify that it is in the Windows Catalog at https://www.microsoft.com/windows/catalog, that the firmware versions for the system motherboard and devices are current, and that any replacement part is pretested or “burned-in” before deployment.

Checking the Windows Catalog

Hardware problems can occur if you use devices that are not compatible with Windows XP Professional. The Windows Catalog is a Web-based searchable database of hardware and software that have been certified under the Designed for Windows XP Logo Program. The Windows Catalog outlines the hardware components that have been tested for use with Windows XP Professional and is continuously updated as additional hardware is tested and approved.

Tip: While the Windows Catalog replaces the Hardware Compatibility List (HCL) used for previous Microsoft Windows platforms, you can still access text-only versions of the HCL for different Windows versions from Windows Hardware and Driver Central at https://www.microsoft.com/sql/prodinfo/previousversions/winxpsp2faq.mspx.

If several variations of a device are available from one manufacturer, it is best to select only models listed in the Windows Catalog.

Table 27-2 explains the differences between Windows XP logo designations. For more information, see “Logo Program Options for Application Software” at https://www.microsoft.com/winlogo/software/SWprograms.mspx.

Table 27-2 Designed and Compatible Designations in the Windows Catalog

| Designation | Description |
| --- | --- |
| Designed for Windows XP | Indicates that this product has been specifically designed to take advantage of the new features in Windows XP. |
| Compatible with Windows XP | Indicates that this product has not met all the Designed for Windows XP Logo Program requirements but has nevertheless been deemed compatible with Windows XP by Microsoft or the manufacturer. |

When you upgrade to Windows XP Professional, device hardware resource settings are not migrated. Instead, all devices are redetected and enumerated during installation. Typically, you upgrade to Windows XP Professional from Windows 98, Microsoft Windows Millennium Edition (Me), Microsoft Windows NT 4.0 Workstation, or Windows 2000 Professional.

After installation, you might find that devices that functioned before the upgrade behave differently or do not work. This problem can occur for the following reasons:

  • A driver for the device is not on the Windows XP Professional operating system CD, and Device Manager lists it as unknown or conflicting hardware.

  • Windows XP Professional Setup installed a generic driver that might be compatible with your device, but it does not fully support enhanced features. Many hardware manufacturers also provide tools that add value to their products, but they are not available in Windows XP Professional. Windows XP Professional Setup installs the basic feature set needed to enable your product to function. For additional software that enhances functionality or adds additional features, download the latest Windows XP Professional compatible drivers and tools from the manufacturer’s Web site.

Do not attempt to reinstall older drivers because doing so might cause system instability, Stop errors, or other startup problems. For more information about troubleshooting Stop errors and startup problems, see Chapter 29, “Troubleshooting the Startup Process.”

For best results, always use Designed for Windows XP certified devices. It is especially important to refer to the Windows Catalog before purchasing modems, tape backup units, and SCSI adapters. If you must use noncertified hardware, check the manufacturer’s Web site for the latest updated device driver.

Note: If your system has noncertified hardware installed, uninstall drivers for these devices before installing Windows XP Professional. If you cannot complete setup, remove the hardware from your system temporarily and rerun Setup.

Testing new and replacement parts

If you must replace or upgrade older parts with newer ones, first purchase a small number of new parts and conduct performance, compatibility, and configuration tests before doing a general deployment. The evaluation is especially important when a large number of systems are involved, and it might lead you to consider similar products from other manufacturers.

When replacing devices, use pretested or burned-in parts whenever possible. A burn-in involves installing an electronic component and observing it for several days for signs of abnormal behavior. Typically, computer components fail early or not at all, and a burn-in period reveals manufacturing defects that lead to premature failure. You can choose to do additional testing by simulating worst-case conditions. For example, you might test a new hard disk by manually copying files or creating a batch file that repeatedly copies files, filling the disk to nearly full capacity.
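The repeated-copy test can be scripted in many ways. The following Python sketch is one possible form, offered as an illustration rather than a recommended burn-in tool: the function name, file naming scheme, and parameters are assumptions. It copies a sample file into a target folder until a copy limit is reached or free space would fall below a reserve, and it checks each copy against the SHA-256 digest of the original.

```python
import hashlib
import os
import shutil


def _sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_stress_test(source_file, target_dir, max_copies, min_free_bytes):
    """Copy source_file repeatedly into target_dir and verify each copy.

    Stops after max_copies copies, or earlier if the next copy would leave
    less than min_free_bytes free on the target volume.
    Returns (copies_made, names_of_corrupt_copies).
    """
    expected = _sha256(source_file)
    size = os.path.getsize(source_file)
    os.makedirs(target_dir, exist_ok=True)
    copies, corrupt = 0, []
    while copies < max_copies:
        if shutil.disk_usage(target_dir).free - size < min_free_bytes:
            break  # keep the reserve free so the disk is filled only to nearly full
        name = "copy_%06d.bin" % copies
        destination = os.path.join(target_dir, name)
        shutil.copyfile(source_file, destination)
        if _sha256(destination) != expected:
            corrupt.append(name)
        copies += 1
    return copies, corrupt
```

Because a copy that is read back immediately might be served from the operating system's file cache, a second verification pass after a restart gives a more meaningful result. Treat any entry in the corrupt list, or any I/O error raised during the run, as a reason to investigate the disk further.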

Using Incompatible Software

Before installing software on multiple computers, test it for compatibility with existing applications in a realistic test environment. Observe how the software interacts with other programs and drivers in memory. If only the test application and the operating system are active, testing does not provide a realistic or valid indication of compatibility or performance. Testing is necessary even if a manufacturer guarantees full Windows XP Professional compatibility, because older programs might affect new software in unpredictable ways.

For large organizations, consider limited predeployment test rollouts to beta users who can provide real-world feedback. Select testers who have above-average computer skills to get technically accurate descriptions of problems they observe.

Setup and stability criteria are equally important in evaluating software and hardware for purchase. Testing is critical for upgrading systems from earlier versions of Windows such as Windows 98 or Windows NT 4.0. Software and drivers that were installable and stable on earlier versions of Windows might exhibit problems or not function in the Windows XP Professional environment. Video, sound, and related multimedia drivers and tools (such as audio, CD-ROM mastering, and DVD playback software) are especially sensitive to operating system upgrades.

For more information about application testing guidelines, see Chapter 1, “Planning Deployments,” and the Windows Application Compatibility link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources. Also see “Testing Applications for Compatibility with Windows 2000” in the Deployment Planning Guide of the Microsoft Windows 2000 Server Resource Kit and article 244632, “How to Test Programs for Compatibility with Windows 2000,” in the Microsoft Knowledge Base. To find this article, see the Microsoft Knowledge Base link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources.

Document and Evaluate the Results

You can increase the value of information collected during troubleshooting by keeping accurate and thorough records of all work done. You can use your records to reduce redundant effort and to avoid future problems by taking preventive action.

Create a configuration management database to record the history of changes, such as installed software and hardware, updated drivers, replaced hardware, and altered system settings. Periodically verify, update, and back up this data to prevent permanent loss. To maximize use of your database, note details such as:

  • Changes made

  • Times and dates of changes

  • Reasons for the changes

  • Users who made the changes

  • Positive and negative effects the changes had on system stability or performance

  • Information provided by technical support

When planning this database, keep in mind the need to balance scope and detail when deciding which items or attributes to track. For more information, see the Information Technology Infrastructure Library (ITIL) and Microsoft Operations Framework (MOF) Web site links provided in “Check Technical Information Resources” earlier in this chapter.
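As a starting point, the details listed above map naturally onto a single table. The following Python sketch uses the standard sqlite3 module. The table name, column names, and the added computer column are assumptions made for illustration; they are not a schema prescribed by ITIL, MOF, or Windows XP Professional.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS changes (
    id           INTEGER PRIMARY KEY,
    computer     TEXT NOT NULL,   -- assumed column: which system was changed
    changed_at   TEXT NOT NULL,   -- time and date of the change (ISO 8601 text)
    changed_by   TEXT NOT NULL,   -- user who made the change
    description  TEXT NOT NULL,   -- change made
    reason       TEXT NOT NULL,   -- reason for the change
    effects      TEXT,            -- positive and negative effects observed
    support_info TEXT             -- information provided by technical support
)
"""


def open_database(path=":memory:"):
    """Open (or create) the change database and make sure the table exists."""
    connection = sqlite3.connect(path)
    connection.execute(SCHEMA)
    return connection


def record_change(connection, computer, changed_at, changed_by,
                  description, reason, effects=None, support_info=None):
    """Insert one change record and commit it."""
    with connection:
        connection.execute(
            "INSERT INTO changes (computer, changed_at, changed_by, description,"
            " reason, effects, support_info) VALUES (?, ?, ?, ?, ?, ?, ?)",
            (computer, changed_at, changed_by, description, reason,
             effects, support_info),
        )


def history(connection, computer):
    """Return the changes for one computer, oldest first."""
    rows = connection.execute(
        "SELECT changed_at, changed_by, description, reason FROM changes"
        " WHERE computer = ? ORDER BY changed_at, id",
        (computer,),
    )
    return rows.fetchall()
```

Storing timestamps as ISO 8601 text keeps them sortable as plain strings. Pass a file path to open_database instead of the in-memory default so that the records persist and can be backed up.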

Update baseline information after installing new hardware or software to compare past and current behavior or performance levels. If previous baseline information is not available, use System Information, Device Manager, the Performance tool, or industry standard benchmarks to generate data.

Baselines combined with records kept over time enable you to organize experience gained, evaluate maintenance efforts, and judge troubleshooting effectiveness. Analysis of this data can form the basis of a troubleshooting manual or lead to changes in control policy for your organization.
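Comparing current measurements with a baseline can be as simple as flagging counters that have moved by more than a chosen tolerance. The sketch below assumes that you have already collected both sets of values, for example from the Performance tool, into dictionaries keyed by counter name. The function name, the counter names in the usage note, and the 10 percent default tolerance are illustrative assumptions.

```python
def compare_to_baseline(baseline, current, tolerance=0.10):
    """Return counters whose relative change from the baseline exceeds tolerance.

    The result maps counter name -> relative change (0.25 means 25 percent higher).
    Counters missing from current, or with a zero baseline value, are skipped.
    """
    changed = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue
        relative = (current[name] - base_value) / base_value
        if abs(relative) > tolerance:
            changed[name] = relative
    return changed
```

For instance, with a baseline of {"disk_queue": 2.0} and a current value of {"disk_queue": 3.0}, the function reports a relative change of 0.5 for disk_queue. Whether a given change is a problem still depends on the counter and on the workload at the time of measurement.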

A post-troubleshooting review, or post-mortem, can help you pinpoint troubleshooting areas that need improvement. Some questions you might consider during this self-evaluation period include:

  • What changes improved the situation?

  • What changes made the problem worse?

  • Was system performance restored to expected levels?

  • What work was redundant or unnecessary?

  • How effectively were technical support resources used?

  • What other tools or information not used might have helped?

  • What unresolved issues require further root-cause analysis?

Write an Action Plan

An action plan is a set of relevant troubleshooting objectives and strategies that fits within your organization’s configuration and management strategies. After you identify the problem and find a potential solution or workaround that you have tested on one or more computers, you might need an action plan if the solution is to be deployed across your organization, possibly involving hundreds or thousands of computers. Coordinate your plan with supervisors and staff members in the affected areas to keep them informed well in advance and to verify that the schedule does not conflict with important activity. Include provisions for troubleshooting during nonpeak work hours or dividing work into stages over a period of several days. Evaluate your plan, and as you uncover weaknesses, update it to increase its effectiveness and efficiency.

As the number of users grows, the potential loss of productivity as a result of disruption increases. Your plan must account for dependencies and allow last-minute changes. Factor in contingency plans for unforeseen circumstances.

For more information about creating a configuration management database, see the ITIL and MOF links listed in Table 27-1.

Take Proactive Measures

You can combine information gathered while troubleshooting major and chronic problems to create a proactive plan to prevent or minimize problems for the long term. When planning a maintenance or upgrade process for your organization, consider the following goals:

  • Improving the computing environment

  • Monitoring system and application logs

  • Documenting hardware and software changes

  • Anticipating hardware and software updates

Improve the Computing Environment

External factors can have a major impact on the operation and lifespan of a computer. Some basic precautions include labeling connecting cables, periodically testing uninterruptible power supply (UPS) batteries, and placing computers far from high-traffic areas where they might be bumped or damaged. It is important to check environmental factors such as room temperature, humidity, and air circulation to prevent failures from excessive heat. Dust can clog cooling equipment such as computer fans and cause them to fail. Install surge suppressors, dedicated power sources, and backup power devices to protect equipment from electrical current fluctuations, surges, and spikes that can cause data loss or damage equipment. Other precautions include:

  • Performing regular file and system state backups to prevent data loss. For more information about Windows Backup, see Windows XP Professional Help and Support Center and Chapter 14, “Backing Up and Restoring Data.”

  • Using Windows XP Professional–compatible virus-scanning software and regularly downloading the latest virus signature updates. A virus signature data file contains information that enables virus-scanning software to identify infected files.

Monitor System and Application Logs

Monitor your system to detect problems early so that a software or hardware failure is not your first or only warning of a problem. When using a monitoring tool such as Performance (Perfmon.msc) to evaluate changes that might affect performance, compare baseline information to current performance. The resulting data helps you isolate bottlenecks and determine whether actions such as upgrading hardware, updating applications, and installing new drivers are effective. You can also use the data to justify expenditures, such as additional CPUs, more RAM, and increased storage space. Checking the event logs in Event Viewer regularly helps you identify chronic problems and detect potential failures, which allows you to take corrective action before a problem worsens. For more information about monitoring your system, see “Overview of Performance Monitoring” in the Operations Guide of the Microsoft Windows 2000 Server Resource Kit.
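Chronic problems often show up as the same warning or error recurring in the event logs. The following Python sketch counts recurring entries. It assumes that the log has already been exported and parsed into dictionaries with "source", "event_id", and "type" keys; that record layout, the function name, and the threshold of three occurrences are assumptions for illustration, not an Event Viewer export format.

```python
from collections import Counter


def chronic_events(events, min_count=3):
    """Return (source, event_id, count) for warnings and errors seen at least min_count times.

    Results are ordered from most to least frequent.
    """
    counts = Counter(
        (event["source"], event["event_id"])
        for event in events
        if event["type"] in ("Warning", "Error")
    )
    return [
        (source, event_id, count)
        for (source, event_id), count in counts.most_common()
        if count >= min_count
    ]
```

Informational entries are ignored so that routine activity does not hide the warnings and errors that repeat. Adjust the threshold to suit the period covered by the exported log.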

Document Changes to Hardware and Software

In addition to recording computer-specific changes, do not neglect to record other factors that directly affect computer operation, such as Group Policy and network infrastructure changes. For more information about developing and implementing a standard process for recording configuration changes, see “Document and Evaluate the Results” earlier in this chapter.

Plan for Hardware and Software Upgrades

Regardless of how advanced your system hardware or software is at the time of purchase, computer technologies have a limited lifespan. Your maintenance plan must account for the following factors that can make updates and upgrades necessary.

Increased demand for computing resources

When computing needs grow beyond the capability of your hardware, it makes sense to upgrade hardware components or entire systems. Performance degradation might be the result of system bottlenecks caused by hardware that has reached maximum capacity. Optimizing drivers and updating applications can help in the short term, but user demand for computing resources eventually makes it necessary to upgrade to more powerful hardware.

Discontinued support for a device or software

Operating system or manufacturer support for a device or software might be discontinued, causing compatibility issues that can block upgrades to new operating systems or prevent full use of certain features in Windows XP Professional. To minimize effort when upgrading hardware and software for many computers, purchase similar computers and follow replacement standards for your organization. Failure to standardize applications and hardware can make upgrading more difficult and expensive, especially if technicians and users need retraining.

Added capabilities

Having a process for upgrading operating systems or installing application patches, software updates, and operating system Service Packs helps to maintain the stability, performance, and reliability of your equipment. Schedule time to stay current with new developments and product updates.

Establishing a Troubleshooting Checklist

A guaranteed “system” for troubleshooting all computer-related problems does not exist. Effective troubleshooting requires technical research and experience, careful observation, resourceful use of information, and patience. During the troubleshooting process, you can consult the checklist in Table 27-3.

Table 27-3 Troubleshooting Checklist

Task

Action

Identify problem symptoms.

Observe the symptoms:

  • Under what conditions does the problem occur?

  • Which aspects of the operating system control these conditions?

  • What applications or subsystems does the problem seem related to?

  • Record all error information for future reference, including the exact message text and error numbers.

Do not forget to check the basics:

  • Verify that the power cables are properly connected and are not damaged or worn.

  • Check firmware settings to verify that devices are enabled.

Check technical information resources.

Research the problem:

  • What actions were tried for this or similar problems in the past? What were the results?

  • Is this a known issue for which a solution or workaround exists?

  • What information is available from product documentation, internal support sources, or outside resources, such as a manufacturer’s Web site or newsgroups?

  • What information can you obtain from support staff, such as Help Desk, or other users who might have experienced similar problems?

Review your system’s history.

Analyze the events that led up to the problem:

  • What happened just before the problem occurred?

  • What hardware was recently installed? Are driver and firmware revisions current?

  • What software or system file updates were made? Are the software revisions current?

  • Does the software and hardware configuration match the documented configuration? If not, try to determine the differences.

  • Did you examine the event logs for clues to the problem?

Gather baseline information or compare to a reference system:

  • Did this application or hardware work correctly in the past? What has changed since then?

  • Does the application or hardware work correctly on another computer? If so, what is different on that computer?

  • Generate performance data by using the Performance tool or benchmark programs. If previous baselines exist, compare current and past performance.

Document and evaluate the results.

Record the results:

  • Use a common report format such as a database to record information.

  • Make a detailed record of all the work done to correct the problem for future reference.

  • Record who, what, when, and why—and identify positive and negative cause and effect.

Evaluate the results:

  • Was the work done efficiently?

  • Was the solution effective? What remains unresolved?

  • When a solution was implemented, was system performance restored to expected levels?

  • What processes can be changed or implemented to prevent the problem from recurring?

  • Are systems being adequately monitored? Can this problem be caught early if it happens again?

  • What additional information, tools, or tests are needed?

Additional Resources

These resources contain additional information related to this chapter.

  • Chapter 9, “Managing Devices”

  • Appendix C, “Tools for Troubleshooting”

  • Chapter 14, “Backing Up and Restoring Data”

  • Chapter 13, “Working with File Systems”

  • Chapter 28, “Troubleshooting Disks and File Systems”

  • “Overview of Performance Monitoring” in the Operations Guide of the Microsoft Windows 2000 Server Resource Kit, for more information about monitoring performance

  • The ACPI link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources

  • The Extensible Firmware Interface link on the Web Resources page at https://www.microsoft.com/windows/reskits/webresources