Preparations for assessing NTLM usage

Updated: November 21, 2012

Applies To: Windows 7, Windows 8, Windows Server 2008 R2, Windows Server 2012

This topic describes design and planning considerations you need to address when reducing NTLM usage in your environment using features introduced in Windows Server 2008 R2 and Windows 7.

These design and planning activities help you understand your existing environment, select a solution design and deployment strategy, verify that the solution provides the expected level of security, set a deployment schedule, and validate the solution through testing.

The preparations you make should result in a strategy that addresses the following:

  • Computer naming structure.

  • Server locations within trusting forests.

  • Audit collection mechanism and coverage for targeted servers.

  • Root cause analysis of each NTLM authentication detected.

  • Applications that cannot be modified to avoid the use of NTLM.

  • NTLM restriction policies for client computers, member servers, and domain controllers.

  • Ongoing audit and monitoring processes for detecting and removing NTLM authentication.

In this topic

  • Designing computer naming conventions and the forest search orders

  • Designing the audit

  • Designing the NTLM usage root cause analysis

  • Designing the NTLM reduction policies

  • Determine the ongoing audit and monitoring strategy

Designing computer naming conventions and the forest search orders

This part of the design will enable the domain controllers, member servers, and client computers within the forest to search through the different trusted forests for Kerberos service tickets.

Computer naming conventions

  1. Enforce naming conventions that guarantee uniqueness of server names by using Domain Name System (DNS).

    DNS automatically enforces the use of unique host names across an organization. When a client computer or application attempts to authenticate to another computer by host name, verify that the host the client is attempting to connect to is the same host that DNS resolves.

  2. Configure the DNS suffix search order on clients to get the fully qualified domain name (FQDN).

    Clients use the gethostbyname function (or an equivalent API) to resolve a host name to an IP address. The resolution process commonly uses DNS, but a local Hosts file can also be used. When a client attempts to resolve a short name (not a fully qualified domain name), the DNS suffix search order is used to resolve the name to an IP address. The DNS suffix search order is normally configured on clients by using a Group Policy Object, although it can also be set in other ways, such as scripts or registry settings.

  3. Investigate how your applications handle short names or IP addresses for authentication.

    To use the Kerberos protocol to authenticate to a remote server, a client computer or application needs to obtain a service ticket for that computer. Service tickets are obtained for SPNs and can be based either on the FQDN or the host name of the computer. If a client has obtained the IP address of a computer using DNS, then it should use the FQDN returned by the DNS query as the basis of the SPN. However, some applications use the short name instead, which could cause NTLM authentication to be used.

  4. Set the forest search order so that the SPN can be searched in other forests.

    Prior to Windows 7 and Windows Server 2008 R2, when a client attempted to authenticate to a server using an SPN based on host name, the global catalog server would search the current forest for the SPN. If the SPN could not be found in the current forest, then the client would use NTLM authentication.
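The relationship between short names, the DNS suffix search order, and the SPN described in steps 2 and 3 can be sketched as follows. This is a simplified model, not the Windows resolver: the function names, the suffix list, and the server name are hypothetical, and the sketch assumes that any name containing a dot is already fully qualified.

```python
def candidate_fqdns(name, suffix_search_order):
    """Return the names a resolver would try for `name`, in order.

    Simplifying assumption: a name that already contains a dot is
    treated as fully qualified and is tried as-is.
    """
    name = name.rstrip(".")
    if "." in name:
        return [name.lower()]
    # Short name: append each suffix in the configured search order.
    return [f"{name}.{suffix}".lower() for suffix in suffix_search_order]


def spn_for(service_class, fqdn):
    """Build an SPN of the form service/host from an FQDN."""
    return f"{service_class}/{fqdn.lower()}"


# Hypothetical suffix search order and server name.
suffixes = ["corp.contoso.com", "emea.contoso.com"]
names = candidate_fqdns("FILESRV01", suffixes)
print(names)
print(spn_for("cifs", names[0]))
```

An application that builds its SPN from the FQDN returned by name resolution, rather than from the short name the user typed, is more likely to obtain a Kerberos service ticket.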

Forest and DNS search order

You can add one or more additional forests to the list of forests that are searched when trying to obtain a service ticket for an SPN with a short name. Beginning with Windows Server 2008 R2 and Windows 7, settings are available for controlling searches on either the domain controller or the client computer.

Note

If this setting is configured on both the domain controllers and the client computers, then multiple searches of each forest might be performed for the same SPN. If client computers are upgraded first, then it might be necessary for both client and the Key Distribution Center (KDC) forest search lists to coexist for a period of time during the domain controller upgrade from an operating system version earlier than Windows Server 2008 R2.

The following diagram shows the process of deciding whether to set the client search order list or the KDC search list first.

  1. Configure the KDC forest search order.

    If the forest contains a mix of domain controllers running Windows Server 2012, Windows Server 2008 R2, and earlier versions, then setting the KDC forest search order might cause NTLM to be used. If NTLM is also disabled in these pre-Windows Server 2008 R2 domains, then authentication attempts will sometimes fail. Therefore, it is best to configure the KDC forest search order only after all domain controllers have been upgraded to Windows Server 2012 or Windows Server 2008 R2.

  2. Coordinate the forest search order and the DNS suffix search.

    The KDC forest search order and the DNS suffix search list should also be coordinated so that searches take place in each forest in turn. The DNS search list should group together domains that are in the same forest, and these forests should be used in the same order for the forest search list. Failing to do so might result in failed authentication attempts to computers with duplicate names.

  3. Design the forest search order to reduce the number of lookups.

    Forests with the most accounts that client computers or applications need to authenticate against should be listed first in the list. This will reduce the total number of searches made.
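The coordination rule in step 2 can be checked mechanically before the settings are deployed. The following is a minimal sketch, assuming you maintain your own mapping from DNS suffix to forest; the function names, the mapping, and the domain names are all hypothetical. It verifies that the suffixes for each forest are grouped together and that the forests appear in the same relative order as in the forest search order.

```python
def forests_in_suffix_order(suffixes, suffix_to_forest):
    """Return the forest order implied by the DNS suffix list.

    Raises ValueError if the suffixes of one forest are not adjacent.
    """
    order = []
    for suffix in suffixes:
        forest = suffix_to_forest[suffix]
        if order and order[-1] == forest:
            continue  # still inside the same forest's group
        if forest in order:
            raise ValueError(f"suffixes for forest {forest!r} are not grouped together")
        order.append(forest)
    return order


def is_coordinated(suffixes, suffix_to_forest, forest_search_order):
    """True if forests in the search order appear in the same order in DNS."""
    implied = forests_in_suffix_order(suffixes, suffix_to_forest)
    listed = set(forest_search_order)
    # Compare only forests named in the forest search order; the local
    # forest usually has suffixes but is not part of that list.
    return [f for f in implied if f in listed] == list(forest_search_order)


# Hypothetical environment: one local forest and two trusted forests.
suffix_to_forest = {
    "corp.contoso.com": "contoso.com",
    "emea.contoso.com": "contoso.com",
    "fabrikam.com": "fabrikam.com",
    "eu.fabrikam.com": "fabrikam.com",
    "tailspin.com": "tailspin.com",
}
dns_suffixes = ["corp.contoso.com", "emea.contoso.com",
                "fabrikam.com", "eu.fabrikam.com", "tailspin.com"]
print(is_coordinated(dns_suffixes, suffix_to_forest, ["fabrikam.com", "tailspin.com"]))
```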

Use the following decision tree to coordinate the DNS suffix and forest searches.

Important

Duplicate computer names across forests should be avoided because this might add confusion when analyzing audit data and could lead to possible authentication problems.

Designing the audit

Your audit plan should include coverage of all member servers, domain controllers, and some client computers. The more authentication traffic data you collect, the more complete your analysis can be. You can begin the process by implementing the audit policies on at least one server running Windows Server 2012 or Windows Server 2008 R2 of each server role across all Windows domains and forests.

  1. Determine how to collect and manage the authentication events.

    Restrict NTLM audit events are recorded in the Operational log located under Applications and Services Logs/Microsoft/Windows/NTLM. You will need to determine how to collect and manage these authentication events from each computer, whether by using an existing collection system or by designing one specifically for restricting NTLM usage. The size of the audit log will typically be proportional to the number of users in the environment and the length of time for which audit information is retained.

  2. Configure event logging specifically for the NTLM reduction effort.

    Computers will need to be configured to allow the logging of useful audit information to provide root cause analysis of NTLM authentications. This includes logon information (on both domain controllers and client computers), as well as Kerberos debugging on domain controllers.

  3. Strategically enable event collection across the organization.

    Because NTLM authentication is based on a client/server architecture, configure servers and their associated client computers in tandem across the enterprise rather than attempting to correlate security events from randomly selected computers. Logging on client computers might also include process tracking, so that processes can be correlated to specific NTLM events on the servers and individual applications can be identified more easily.
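The correlation described in step 3 can be sketched as a time-window join between process-start events from client computers and NTLM logon events from servers. This is an illustrative sketch only: the record layout (dictionaries with `client`, `server`, `time`, and `image` keys), the function name, and the 30-second window are assumptions, not part of any Windows tool or event schema.

```python
from datetime import datetime, timedelta


def correlate(process_events, ntlm_logons, window_seconds=30):
    """For each NTLM logon, list processes started on the same client
    within `window_seconds` before the logon."""
    window = timedelta(seconds=window_seconds)
    matches = []
    for logon in ntlm_logons:
        candidates = [
            p["image"]
            for p in process_events
            if p["client"].lower() == logon["client"].lower()
            and timedelta(0) <= logon["time"] - p["time"] <= window
        ]
        matches.append((logon["server"], logon["client"], candidates))
    return matches


# Hypothetical records already extracted from the collected event logs.
process_events = [
    {"client": "CLIENT07", "time": datetime(2012, 11, 21, 9, 0, 0), "image": "legacyapp.exe"},
    {"client": "CLIENT07", "time": datetime(2012, 11, 21, 8, 0, 0), "image": "notepad.exe"},
]
ntlm_logons = [
    {"server": "APP01", "client": "client07", "time": datetime(2012, 11, 21, 9, 0, 10)},
]
for server, client, images in correlate(process_events, ntlm_logons):
    print(server, client, images)
```

A time-window match identifies candidates only. Confirm each candidate application with a tool such as Application Verifier before treating it as the cause of the NTLM logon.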

Audit reporting

The following table lists the base set of events that you need to use for your analysis. Design your event gathering to support your analysis process and to accommodate your audit reporting and event collection system.

| Purpose | Event ID | Event source | Description |
|---|---|---|---|
| Discover NTLM logons | Success Event ID 4624/540 – Logon Events | Member servers | This event is logged on computers each time a user logs on to the server. The Authentication Package string determines whether an NTLM logon has taken place. |
| Discover the SPN from the failed Kerberos ticket request | Failure Event ID 4769 and 673 – Kerberos Ticket Events | Domain controllers | This event is logged when a Kerberos ticket request fails. The event usually includes the SPN that was requested. No events are generated for either successful or failed delegated SPN requests, so this event is not useful in determining delegation issues. |
| Discover NTLM logon attempts from user processes | Success Event 592 – Process tracking | Client computers | This event is generated whenever a process starts on the client computer. This event can sometimes be correlated to an NTLM logon event to determine the cause of the NTLM logon. |
| Analyze user authentication | All events for specific user | All computers | All security events for a specific user can be collected for detailed analysis. |

Note

Security events in Windows Server 2008 changed significantly from Windows Server 2003 and earlier versions. Most events have two different forms that are based on these version differences. The older event IDs have lower numbers (for example, 673) while the newer events are numbered higher. The format of the event output has also changed considerably, which complicates the analysis of audit data in a mixed environment.
For a complete listing and description of security audit events in Windows Server 2008 and Windows Vista, see Security audit events for Windows Server 2008 and Windows Vista in the Microsoft Download Center.
For a complete listing and description of security audit events in Windows Server 2008 R2 and Windows 7, see Security Audit Events for Windows 7 and Windows Server 2008 R2 in the Microsoft Download Center.
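As an illustration of the first row of the table, the following sketch reads one logon event exported as XML and reports whether the Authentication Package is NTLM. It handles only the newer event format (Event ID 4624); the sample event is hand-written and trimmed for this example, and the function name and returned keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Windows event records.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Hand-written, trimmed sample; a real event carries many more fields.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System><EventID>4624</EventID><Computer>app01.corp.contoso.com</Computer></System>
  <EventData>
    <Data Name="TargetUserName">alice</Data>
    <Data Name="AuthenticationPackageName">NTLM</Data>
    <Data Name="LmPackageName">NTLM V2</Data>
    <Data Name="WorkstationName">CLIENT07</Data>
  </EventData>
</Event>"""


def logon_summary(event_xml):
    """Extract the fields needed to decide whether a logon used NTLM."""
    root = ET.fromstring(event_xml)
    event_id = int(root.findtext("e:System/e:EventID", namespaces=NS))
    data = {d.get("Name"): (d.text or "") for d in root.findall("e:EventData/e:Data", NS)}
    package = data.get("AuthenticationPackageName", "")
    return {
        "server": root.findtext("e:System/e:Computer", namespaces=NS),
        "user": data.get("TargetUserName", ""),
        "client": data.get("WorkstationName", ""),
        "package": package,
        "is_ntlm": event_id == 4624 and package.upper() == "NTLM",
    }


print(logon_summary(SAMPLE))
```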

Designing the NTLM usage root cause analysis

Even though Kerberos authentication is the preferred Windows authentication protocol, and the security support providers are designed to promote it, NTLM is available for use so that compatibility hurdles do not stop authentication attempts. Your effort to reduce NTLM usage in your environment will be determined by security policies and compliance issues and by productivity factors within your organization. To balance these factors, you will need to perform a root cause analysis, which, although time consuming, will lead to a successful promotion of Kerberos authentication.

NTLM authentication might be selected as the authentication protocol instead of Kerberos for a variety of reasons, but for the purpose of your initial root cause analysis, your investigation should focus on the following reasons.

  1. Determine whether any duplicate SPN failures occur when Kerberos is used.

    The SetSPN command has an option (-X) that detects duplicate SPNs and an option (-Q) that queries for a specific SPN. If SetSPN (-Q) returns more than one account, then the SPN is duplicated.

    For more guidance about running the SetSPN command, see Service Principal Names (SPNs) SetSPN Syntax (Setspn.exe).

Warning

The SetSPN (-X) command consumes a large amount of computer memory. If you run it on a domain controller, do so when the domain controller is not busy.

  2. Locate Kerberos target principal unknown failures.

    Detect the target principal unknown error by inspecting the domain controller event logs for Kerberos errors (Event IDs 4769 and 673). These events are recorded in the Security log and display information about the request for a Kerberos service ticket. The service name indicates the resource to which access was requested.

  3. Identify applications that use the Negotiate SSP (Spnego.dll) with a NULL target name or with an IP address as the target.

    Using Application Verifier on the application on the client computer can help determine which application does not provide the required parameters for Kerberos authentication to be used. Determining which application is causing NTLM to be used can be performed by correlating the client process tracking audit events or Process Monitor logs with the domain controller logs.
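If you export an inventory of accounts and their SPNs, the duplicate check from step 1 can also be run offline, away from the domain controller. The following is a minimal sketch, not a replacement for SetSPN: the function name, the input layout (account name mapped to a list of SPNs), and the sample accounts are assumptions.

```python
from collections import defaultdict


def find_duplicate_spns(account_spns):
    """Return {spn: [accounts]} for every SPN registered on more than
    one account. SPNs are compared case-insensitively."""
    owners = defaultdict(set)
    for account, spns in account_spns.items():
        for spn in spns:
            owners[spn.lower()].add(account)
    return {spn: sorted(accts) for spn, accts in owners.items() if len(accts) > 1}


# Hypothetical inventory exported from the directory.
inventory = {
    "CONTOSO\\svc-sql": ["MSSQLSvc/sql01.corp.contoso.com:1433"],
    "CONTOSO\\SQL01$": ["HOST/sql01.corp.contoso.com",
                        "mssqlsvc/SQL01.corp.contoso.com:1433"],
}
print(find_duplicate_spns(inventory))
```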

Application Verifier analysis

The NTLM Application Verifier plug-in monitors individual process calls to the authentication APIs AcquireCredentialsHandle and InitializeSecurityContext to detect uses of the NTLM protocol for each application analyzed. The events generated by those calls that can be detected by the plug-in are listed in the following table.

| Event name | Description | API |
|---|---|---|
| NTLMCaller - UNCLASSIFIED_ERROR | An unclassified error is detected. | AcquireCredentialsHandle |
| NTLMCaller - INTERNAL_ERROR | Internal error. Please report this to the provider owner. | AcquireCredentialsHandle |
| NTLMCaller - ACH_EXPLICIT_NTLM_PACKAGE | Explicit usage of NTLM package is detected in AcquireCredentialsHandle. This manifests a straight NTLM call. Negotiate package should be used to remove this verifier stop. | AcquireCredentialsHandle |
| NTLMCaller - ACH_IMPLICITLY_USE_NTLM | Only NTLM can be negotiated given supplied package list. See Param1 for the package list. | AcquireCredentialsHandle |
| NTLMCaller - ACH_BAD_NTLM_EXCLUSION | Wrong exclusion syntax -NTLM is detected. !NTLM should be used instead. See Param1 for the package list. | AcquireCredentialsHandle |
| NTLMCaller - ISC_NO_TARGET | NTLM is to be used because no target is supplied into InitializeSecurityContext. | InitializeSecurityContext |
| NTLMCaller - ISC_WRONG_TARGET | NTLM is to be used because wrongly formatted target is supplied into InitializeSecurityContext because of wrongly formatted pszTargetName. | InitializeSecurityContext |
| NTLMDowngrade - UNCLASSIFIED_ERROR | An unclassified error is detected. | InitializeSecurityContext |
| NTLMDowngrade - INTERNAL_ERROR | Internal error. Please report this to the provider owner. | InitializeSecurityContext |
| NTLMDowngrade - FALLBACK_TO_NTLM | An authentication has been detected being downgraded to NTLM. This should be investigated if the downgrade is not expected. | InitializeSecurityContext |
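When you review application configuration or connection strings, a rough pre-check of the target names can flag likely candidates for the ISC_NO_TARGET and ISC_WRONG_TARGET stops before you run Application Verifier. The sketch below is not the plug-in's logic. The function name and the category labels are hypothetical, and the sketch assumes that a usable target has the form service/host, optionally followed by a port.

```python
import ipaddress


def classify_target(target_name):
    """Roughly classify a target name passed to an authentication call."""
    if not target_name:
        return "no target"
    service, sep, host = target_name.partition("/")
    if not sep or not service or not host:
        return "not in service/host form"
    host = host.split("/", 1)[0]      # drop an optional service-name part
    if host.count(":") == 1:          # drop an optional port, keep IPv6 intact
        host = host.split(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return "IP address host"
    except ValueError:
        pass                          # not an IP address, so treat it as a name
    return "FQDN host" if "." in host else "short name host"


for target in (None, "HTTP/10.0.0.5", "HTTP/web01", "HTTP/web01.corp.contoso.com:8080"):
    print(repr(target), "->", classify_target(target))
```

Targets classified as anything other than "FQDN host" are worth a closer look, because they are the cases the text above associates with NTLM being selected.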

Your analysis should result in a list of applications that successfully use the Kerberos protocol and a list of those that do not or cannot. For those that currently use NTLM, you must decide whether to modify or upgrade the application so that it calls the Negotiate security support provider (Spnego.dll) with the correct parameters to use the Kerberos protocol. Your investigation should result in decisions about upgrading existing applications or hardware, whether Microsoft or non-Microsoft. Although NTLM is available in Windows operating systems, it is the application or device that chooses to use a particular authentication protocol. If you have concluded not to modify or upgrade your applications and hardware, then you can create an exception for those applications to use NTLM by identifying the servers responsible for handling the application authentication requests.

Designing the NTLM exceptions

When an application cannot be modified to avoid the use of NTLM authentication, you will need to decide your strategy based on the following options.

  1. Discontinue the use of the application until it is modified to use the Negotiate SSP.

    This option is dependent on the requirements of your business organization, but if it is an internal application, modifying it to use the Negotiate SSP might be possible. The business organization should make the decision to discontinue its use or make the selection for its replacement.

  2. Add the name of the server hosting the application to the exception list.

    Two security policies let you name the servers that are allowed to use NTLM authentication after you set the Restrict NTLM security policies. Exceptions can be applied on the clients, the member servers, or the domain controllers.

  3. Abandon the NTLM reduction project.

    At the end of the analysis, you might conclude that the various costs of the project are too high because the list of exceptions has become too numerous, complicated, or cumbersome for your project goals. This option is dependent upon the requirements of your business organization and should be a balance of security risk, cost, and user productivity. You have to determine the cost of managing exceptions. Managing more than 100 exceptions might cause frequent operational errors leading to associated user authentication failures and to increased support calls. The exceptions list can include a single wild card character that, depending upon computer naming conventions, might allow a single exception to cover a wide range of computers that all run the same application. If there are many complex server and computer naming conventions (that would require multiple wild card characters to be used), then it is likely that the project will have to be stopped as the exception list will be too long to be maintainable.
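To estimate how many exception entries a naming convention would require, you can model the wildcard matching against your server inventory. This is a minimal sketch under stated assumptions: each entry may contain at most one asterisk, matching is case-insensitive, and the function names and server names are hypothetical. It does not reproduce how Windows evaluates the exception list.

```python
import re


def compile_exception(pattern):
    """Compile one exception entry that contains at most one '*'."""
    if pattern.count("*") > 1:
        raise ValueError(f"more than one wildcard in {pattern!r}")
    # Escape the literal parts and let '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(regex, re.IGNORECASE)


def is_excepted(server, patterns):
    """True if `server` matches any entry in the exception list."""
    return any(compile_exception(p).fullmatch(server) for p in patterns)


# Hypothetical exception list: one wildcard entry covers a whole server role.
exceptions = ["sql-app-*.corp.contoso.com", "legacy01.corp.contoso.com"]
for name in ("SQL-APP-07.corp.contoso.com", "web01.corp.contoso.com"):
    print(name, is_excepted(name, exceptions))
```

If covering the servers that run one application requires many separate entries, that is an early sign that the exception list will be hard to maintain.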

The following illustration shows the decision process for root cause analysis and NTLM reduction.

Designing the NTLM reduction policies

Once your analysis is complete and you know which applications or computers use NTLM, you can determine your reduction goals and corresponding policies for each of the following operating systems and roles in your environment:

  • Client computers

  • Member servers

  • Domain controllers

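The following security policies to reduce NTLM and promote Kerberos usage were introduced in Windows Server 2008 R2 and Windows 7:

  • Network security: Restrict NTLM: Incoming NTLM traffic

  • Network security: Restrict NTLM: NTLM authentication in this domain

  • Network security: Restrict NTLM: Outgoing NTLM traffic to remote servers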

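In addition, client computers and servers starting with the Windows Server 2008 R2 and Windows 7 operating system versions offer the following security policies to permit NTLM authentication while you promote Kerberos usage:

  • Network security: Restrict NTLM: Add remote server exceptions for NTLM authentication

  • Network security: Restrict NTLM: Add server exceptions in this domain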

Determine the ongoing audit and monitoring strategy

To monitor NTLM usage in your environment, you might want to keep the audit collection system in place for a period of time. If so, you need to design the system both for the investigation and analysis segment of the project as well as the monitor and maintenance segment. Monitoring usage can be used to confirm that NTLM authentications are no longer occurring outside of the expected exceptions list. As applications and servers are retired and replaced, the exception list can be updated.
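During the monitoring phase, a periodic comparison of observed NTLM targets against the exception list shows both unexpected NTLM usage and exceptions that are no longer needed. The following is a minimal sketch: the function name and server names are hypothetical, and for simplicity it assumes the exception list contains exact server names with no wildcard entries.

```python
def review_ntlm_usage(observed_servers, exception_list):
    """Compare servers seen in NTLM audit events with the exception list."""
    observed = {s.lower() for s in observed_servers}
    allowed = {s.lower() for s in exception_list}
    return {
        # NTLM was used against a server that has no exception.
        "unexpected": sorted(observed - allowed),
        # No NTLM was seen for this exception; it may be retired.
        "unused_exceptions": sorted(allowed - observed),
    }


# Hypothetical data for one review period.
observed = ["LEGACY01.corp.contoso.com", "web01.corp.contoso.com"]
exceptions = ["legacy01.corp.contoso.com", "oldprint.corp.contoso.com"]
print(review_ntlm_usage(observed, exceptions))
```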

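The following security policies are available to help you analyze and monitor NTLM usage:

  • Network security: Restrict NTLM: Audit Incoming NTLM Traffic

  • Network security: Restrict NTLM: Audit NTLM authentication in this domain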

See Also

Concepts

About NTLM usage in your environment