Configure Kerberos Forest Search Order (KFSO)

07/09/2014

Updated: July 8, 2014

Applies To: Windows 7, Windows 8, Windows 8.1, Windows Server 2008 R2, Windows Server 2012, Windows Server 2012 R2

Kerberos Forest Search Order (KFSO) is a new feature in Windows Server 2008 R2 which you can enable for the Key Distribution Center (KDC) and the Kerberos client.

In Group Policy Editor, the “Use forest search order” setting appears as follows:

KDC: Computer Configuration\Administrative Templates\System\KDC

Kerberos: Computer Configuration\Administrative Templates\System\Kerberos

How do the KDC and Kerberos handle the forest search?

The KDC/Kerberos Forest Search is only triggered for two-part names.

In Kerberos, a service principal name (SPN) has three parts:

1111/222222@33333

For example: Cifs/server12.contoso.com@contoso.com

1111: Service type (Cifs)
222222: Server network name (server12.contoso.com)
33333: Realm name (contoso.com)

If the requesting client specifies the realm name, the Kerberos Forest Search is not allowed. This explains why SPNs with server FQDNs are redirected by Kerberos Forest Search: as far as Kerberos is concerned, the target realm is unknown.
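As a rough illustration of the rule above, the following sketch splits an SPN string into the three parts and reports whether a forest search could apply. The names `Spn`, `parse_spn`, and `forest_search_eligible` are hypothetical, and the parsing is deliberately simplified: it ignores ports and additional service-name components that real SPNs may carry.

```python
from typing import NamedTuple, Optional


class Spn(NamedTuple):
    service_type: str       # for example "cifs"
    host: str               # server network name, short name or FQDN
    realm: Optional[str]    # None when the client did not specify a realm


def parse_spn(text: str) -> Spn:
    """Split 'service/host[@realm]' into its parts (simplified)."""
    name, _, realm = text.partition("@")
    service_type, slash, host = name.partition("/")
    if not slash or not service_type or not host:
        raise ValueError(f"not a service/host SPN: {text!r}")
    return Spn(service_type, host, realm or None)


def forest_search_eligible(spn: Spn) -> bool:
    """A forest search is only considered when no realm was specified."""
    return spn.realm is None


print(forest_search_eligible(parse_spn("cifs/server12")))                          # True
print(forest_search_eligible(parse_spn("cifs/server12.contoso.com")))              # True
print(forest_search_eligible(parse_spn("Cifs/server12.contoso.com@contoso.com")))  # False
```

Note that a server FQDN alone does not make the SPN ineligible; only an explicit realm does.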

When the SPN is not found in the local domain and the GC, the KDC on a Domain Controller engages a Kerberos Forest Search, if configured. The same function is used by the Client-based Kerberos Forest Search done when the Kerberos Client receives error 7 (KDC_ERR_S_PRINCIPAL_UNKNOWN) from the KDC.

The component has a “forest reference cache”, which means it keeps the GC session information for a forest it talked to in the past. It also tracks recent failures to talk to a forest to avoid getting stuck retrying continuously. It does NOT cache the SPNs that clients asked for recently or the results it got for those SPNs. Aggressive clients asking for invalid names will therefore keep DCs busy with Forest Search requests, no matter whether the search is KDC-based or Kerberos-based.

The account used for the bind and DsCrackNames steps in the algorithm below is the computer account executing KFSO. That is the Domain Controller account for KDC-based forest search and the client computer account for client-based forest search. The search can therefore fail to find the SPN because the computer account is not allowed to read the SPN or the whole Active Directory object in the queried forest, or because the account fails to log on (selective authentication, one-way trusts).

The rough algorithm is:

Walk the list of forests in the policy setting.

Note

Try to keep the list of forests as short as possible. If the DCs are busy with requests for invalid SPNs, they will also send these requests to the forests in the KFSO list. The cache remembers only failed attempts to reach searched forests, not recently failed SPNs that would make no sense to retry. A distributed denial of service (DDoS) attack is possible as many random SPNs can bloat the negative cache.

For each forest:
1. Locate a GC in the forest. This requires DNS name resolution and UDP/389 to the target GC.
2. Bind to the GC in the remote forest (involves cross-forest authentication as the DC identity, prefers Kerberos). This requires the DS RPC port; see KB 224196.
3. Call DsCrackNames with name format DS_SERVICE_PRINCIPAL_NAME for the SPN passed by the client.
4. If the account was mapped successfully, note the result and leave the loop.
The function returns to the KDC or the Kerberos client, and then:
1. KDC-based search: the KDC builds a referral ticket that brings the client to the root domain of the target forest.
2. Client-based search: Kerberos requests a ticket from its own DC that brings it to the root domain of the target forest.
3. From here, normal authentication proceeds as if the client had supplied the FQDN.
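The loop above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `crack_name` is a hypothetical callable standing in for "locate a GC, bind, call DsCrackNames", the `retry_after` back-off value is an assumption, and the GC session part of the forest reference cache is left out. Consistent with the description above, only forest failures are remembered; results for individual SPNs are not cached.

```python
import time
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for "locate a GC, bind, call DsCrackNames".
# Returns the owning domain, None for "name not found", and raises
# ConnectionError when the forest cannot be reached.
CrackName = Callable[[str, str], Optional[str]]


class ForestSearch:
    def __init__(self, forests: List[str], crack_name: CrackName,
                 retry_after: float = 60.0) -> None:
        self.forests = forests                  # order from the policy setting
        self.crack_name = crack_name
        self.retry_after = retry_after          # assumed back-off in seconds
        self.failed_at: Dict[str, float] = {}   # forest -> time of last failure

    def search(self, spn: str) -> Optional[str]:
        """Return the forest that holds the SPN, or None if no forest does."""
        for forest in self.forests:
            last_failure = self.failed_at.get(forest)
            if (last_failure is not None
                    and time.monotonic() - last_failure < self.retry_after):
                continue                        # forest failed recently, skip it
            try:
                owner = self.crack_name(forest, spn)
            except ConnectionError:
                self.failed_at[forest] = time.monotonic()
                continue
            if owner is not None:
                return forest                   # mapped: note result, leave loop
        return None                             # whole list walked without a hit


# Illustrative directory content, not real data.
directory = {"tailspintoys.com": {"cifs/server12": "sales.tailspintoys.com"}}


def fake_crack_name(forest: str, spn: str) -> Optional[str]:
    return directory.get(forest, {}).get(spn)


search = ForestSearch(["northwindtraders.com", "tailspintoys.com"], fake_crack_name)
print(search.search("cifs/server12"))   # tailspintoys.com
print(search.search("cifs/nosuch"))     # None
```

The second call shows why invalid names are expensive: every forest in the list is probed before the search gives up, and the same work is repeated on the next request for the same name.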

Warning

At high volume, Kerberos Forest Searches for nonexistent SPNs (including SPNs that contain IP addresses) make Kerberos or the KDC expend maximum effort for a negative result.

The following diagrams help show the difference in the process for Kerberos client versus KDC.

The important point is that many simultaneous clients can tie up many ATQ worker threads at the domain controller. Eventually, you could starve out other domain controller activities like LDAP queries and routine KDC work. This is especially true for KDC-based search when many clients send many requests. Requests for invalid names are the most expensive, but a high volume of successful requests can also hit the ATQ thread limit sooner or later. This can also be a problem with Kerberos-based search, especially for the initial forests in the list.

The process for the KDC search is:

1. The client sends a TGS request for cifs/server12 (a server in forest tailspintoys.com) to a KDC in contoso.com.
2. The KDC does not find server12 in its own forest.
3. The KDC has the list of forests from policy, tries DsCrackNames against a GC in northwindtraders.com, and receives the failure “name not found.”

Note

The DsCrackNames call requires authentication and thus a separate TGS acquisition.

4. The KDC tries DsCrackNames against a GC in tailspintoys.com and finds one entry.
5. The KDC responds with a referral ticket to krbtgt/tailspintoys.com.
6. The client sends a TGS request to the tailspintoys.com forest root DC.
7. The KDC there finds the SPN in a child domain and issues another referral ticket to krbtgt/sales.tailspintoys.com.
8. The KDC in the child domain responds with the ticket.

The process for the Kerberos search is:

1. The client sends a TGS request for cifs/server12 (a server in forest tailspintoys.com) to a KDC in contoso.com.
2. The KDC does not find server12 in its own forest and responds with KDC_ERR_S_PRINCIPAL_UNKNOWN.
3. Kerberos has the list of forests from policy, tries DsCrackNames against a GC in northwindtraders.com, and receives the failure “name not found.”

Note

The DsCrackNames call requires authentication and thus a separate TGS acquisition.

4. Kerberos tries DsCrackNames against a GC in tailspintoys.com and finds one entry.
5. Kerberos sends a TGS request for cifs/server12 to the tailspintoys.com forest root DC.
6. The KDC there finds the SPN in a child domain and issues another referral ticket to krbtgt/sales.tailspintoys.com.
7. The KDC in the child domain responds with the ticket.

Is it necessary to monitor the server end for overload, and how do I go about it?

The KDC handles the forest search in an ATQ worker thread. These threads are also called LDAP worker threads in the “NTDS” or “Directory Services” performance object, because the thread pool services both LDAP and KDC workloads.

Tip

Monitor the ATQ thread usage to spot shortages and subsequent delays for KDC and LDAP requests. The queue also services LDAP UDP pings, which since Windows Server 2008 R2 can also become stuck in the DNS name resolution they perform to support IPv4/IPv6 dual stack.
You might need to increase the “MaxPoolThreads” LDAP policy to make more worker threads available. See KB 315071, “How to view and set LDAP policy in Active Directory by using Ntdsutil.exe”.
It is also useful to monitor the use of DsCrackNames to spot frequently used SPNs. The “NTDS” or “Directory Services” counter “DS Client Name Translations/sec” tracks this on the server end.

The only logging available is NTDS diagnostics logging for “23 DS RPC Server” at level 5. This level will also log verbose events for Active Directory replication RPC activity, and it will not log the SPN the client asked for.

Beyond that, there is only some rough debug logging to a file, which is suited only for troubleshooting, not for monitoring.

Operations staff can run the “Active Directory” data collector set, which also lists failed Kerberos requests and DsCrackNames calls, including those issued by the Kerberos Forest Search.

How does Kerberos Forest Search detect looping and avoid continuing to loop?

The component checks whether the SPN exists in the target forest before forwarding, so the target forest should find the SPN when it receives the Kerberos request and not refer it back.

Probing the GC using DsCrackNames safeguards against the ping-pong effect.

The only scenario where KFSO would ping-pong is:

Two forests:
1. Contoso.com
2. Northwindtraders.com
The requested SPN is cifs/server1.contoso.com. It was registered on an account in Northwindtraders.com, and that account is now a lingering object.
The request is sent to a contoso.com DC and hits error 7, so the DC probes a GC in Northwindtraders.com. The SPN is found there because the GC still holds the lingering object.
The request is forwarded to Northwindtraders.com. The DC there does not find the SPN in the GC it uses and fails the request.
The best referral the Northwindtraders.com DC has is to send the request back to contoso.com. In this case it would potentially refer the client back to the originating domain.

BUT there are multiple configuration errors in this scenario:

Accounts in Northwindtraders.com must not carry SPNs with a suffix owned by another trusted Active Directory domain such as contoso.com.
The GC has a lingering object. This needs to be cleaned up.
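The first configuration error can be checked for with a simple scan of registered SPNs. The sketch below assumes you have already exported the SPNs per account from the forest by some other means; the function names and the account name `SRV1$` are hypothetical, and the host extraction is simplified.

```python
from typing import Dict, Iterable, List, Tuple


def spn_host(spn: str) -> str:
    """Return the host part of 'service/host[:port][/name]' in lower case."""
    parts = spn.split("/")
    if len(parts) < 2:
        return ""
    return parts[1].split(":")[0].lower()


def foreign_suffix_spns(spns_by_account: Dict[str, Iterable[str]],
                        foreign_suffixes: Iterable[str]) -> List[Tuple[str, str]]:
    """List (account, SPN) pairs whose host falls under a foreign DNS suffix."""
    suffixes = [s.lower().lstrip(".") for s in foreign_suffixes]
    findings = []
    for account, spns in spns_by_account.items():
        for spn in spns:
            host = spn_host(spn)
            if any(host == s or host.endswith("." + s) for s in suffixes):
                findings.append((account, spn))
    return findings


# Illustrative export from Northwindtraders.com, not real data.
accounts = {"SRV1$": ["cifs/server1.contoso.com", "cifs/srv1.northwindtraders.com"]}
print(foreign_suffix_spns(accounts, ["contoso.com"]))
# [('SRV1$', 'cifs/server1.contoso.com')]
```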

Does the Kerberos Forest Search Order require a specific forest functional level?

It is a client and domain controller feature and depends on the operating system version.

In a domain with a mix of DCs running different versions of Windows Server, it works when the request is sent to a Windows Server 2008 R2 domain controller.

It also works in a fully client-driven configuration, where the KFSO is done by the client instead of the domain controller.

The feature requires a domain controller that runs Windows Server 2008 R2 or later, or a Kerberos client that runs Windows Server 2008 R2, Windows 7, or later. For the sake of consistency, we suggest using the feature only when all clients or all KDCs support forest search.

What are the scalability limits of forest search orders?

Currently there is no scalability experience with this feature.

The client-driven approach should scale better, since no server-side thread is kept busy with the request.

There are multiple factors to consider for the server-side approach:

Invalid SPNs cause the domain controller to walk the whole list of Kerberos forests to search for the SPN. Therefore, keep the list of KFSO forests as short as possible.
In a domain controller based scenario, the “normal” Kerberos scalability of the domain controller is the limit. Every SPN search request occupies a single thread. By default, the maximum number of threads on a domain controller is “CPU cores x 4” (LDAP policy MaxPoolThreads, a per-processor value).
The requests may scale up better when a client driven approach is used.
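The thread limit lends itself to a back-of-envelope estimate. The sketch below applies the “CPU cores x 4” rule and assumes, as a simplification, that an invalid SPN holds one thread for the time it takes to probe every forest in the list. The function names and the input values (8 cores, 4 forests, 0.25 seconds per probe) are hypothetical, not measurements.

```python
def max_pool_threads(cpu_cores: int, threads_per_core: int = 4) -> int:
    """Thread ceiling from the text: CPU cores x 4."""
    return cpu_cores * threads_per_core


def saturating_request_rate(cpu_cores: int, forests_in_list: int,
                            seconds_per_forest_probe: float) -> float:
    """Invalid-SPN requests per second that would keep every thread busy.

    An invalid SPN walks the whole list, so one request holds a thread for
    forests_in_list * seconds_per_forest_probe seconds.
    """
    seconds_per_request = forests_in_list * seconds_per_forest_probe
    return max_pool_threads(cpu_cores) / seconds_per_request


# Hypothetical inputs: 8 cores, 4 forests in the list, 0.25 s per probe.
print(max_pool_threads(8))                    # 32
print(saturating_request_rate(8, 4, 0.25))    # 32.0
print(saturating_request_rate(8, 2, 0.25))    # 64.0
```

Under these assumptions, halving the list doubles the rate of invalid requests the domain controller can absorb before other LDAP and KDC work is starved.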

As a recommendation:

Keep the list of forward destinations as short as possible.

For every failed request the system will forward the query to the next entry in the KFSO list.
Prefer a client driven approach for the best scalability. Prefer a domain controller driven approach for the easiest implementation.

Client based approach:

Advantage: better scaling, because the load is distributed across the client systems instead of the domain controllers.
Disadvantage: all client systems must be Windows 7 or Windows Server 2008 R2 or higher to support the feature.

Domain controller based approach:

Advantage: Only the domain controllers have to run Windows Server 2008 R2 or later, and the KFSO feature will work with all Kerberos clients.
Disadvantage: Scalability, because the complete forwarding is handled by the domain controllers. The load is created on the domain controllers and will consume further domain controller resources.

What are good counters to monitor the performance impact of KFSO on the domain controllers?

There are no specific counters available for this feature. As always, domain controller operations should maintain a baseline of domain controller performance and monitor for unexpected behavior or trends.

The KFSO is handled in the worker threads of the “NTDS” or “Directory Services” performance object. These counters should be monitored in the baseline monitoring.

KFSO also makes frequent use of the DsCrackNames function. Its use can and should be monitored with the “NTDS” or “Directory Services” counter “DS Client Name Translations/sec”.
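One simple way to act on a baseline is to flag a counter whose current average rises well above it. The sketch below is an assumption about how such a check could look, not a documented procedure: it takes counter samples (for example, values of “DS Client Name Translations/sec” exported from a performance log) as plain numbers, and the three-sigma threshold and sample values are arbitrary illustrations.

```python
from statistics import mean, pstdev
from typing import Sequence


def exceeds_baseline(baseline: Sequence[float], current: Sequence[float],
                     sigmas: float = 3.0) -> bool:
    """True when the current average is above baseline mean + sigmas * stdev."""
    if not baseline or not current:
        raise ValueError("both sample sets must be non-empty")
    limit = mean(baseline) + sigmas * pstdev(baseline)
    return mean(current) > limit


# Made-up sample values for illustration only.
baseline_samples = [10.0, 12.0, 11.0, 9.0, 10.0]
print(exceeds_baseline(baseline_samples, [11.0, 12.0]))   # False
print(exceeds_baseline(baseline_samples, [40.0, 45.0]))   # True
```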

Does application name resolution drive the Kerberos SPN search?

When the short name is resolved via NetBIOS name resolution, the short name only is used as SPN.

When the short name is resolved by DNS plus DNS search list, the returned FQDN name may be used as SPN.

Is this correct or is this application-specific?

You might see either behavior. The correct approach for an application is to stick to the original requested name (user input or configuration setting) in the SPN.

For example, Internet Explorer had this problem fixed in KB 911149.

Are there any known issues or limitations for KFSO?

It ignores the forest Suffix mapping: Incorrect SPNs are always mirrored in the other forest.

We suggest monitoring for Kerberos Error 7 (SNAME unknown) using Data Collector Sets.

What happens when KFSO is configured in both forests and the target is the other forest?

Example scenario:

Both forests have domain controllers that run Windows Server 2008 R2.
KFSO is configured:
- Forest A points to forest B
- Forest B points to forest A

Is this configuration functional and supported?

The client requests the initial ticket from a domain controller in its own domain and only searches when the SPN is not found there. This configuration works because KFSO verifies that the SPN exists in the other forest before a request is forwarded there.

The Kerberos Forest Search Order is a client and domain controller based feature. Does it make sense to activate the feature on both systems (domain controller and client)?

No, it makes little sense to activate the feature on both systems. For “unknown SPNs” the domain controller would forward the requests to all entries in the KFSO list and perhaps end up with a SNAME unknown (Kerberos Error 7). The result is presented to the client and then the client starts to use its KFSO configuration to resolve the name again. As a result all unknown SPNs would be forwarded two times.
