The Great Debates: Pass Phrases vs. Passwords. Part 1 of 3

05/20/2008

Security Management

Jesper M. Johansson, Ph.D., ISSAP, CISSP
Security Program Manager, Microsoft Corporation

Information security fosters some interesting debates. The issues range in importance, but they all demonstrate that the field is still growing and exciting. I would like to summarize some of these debates, and offer my own partial entries. For the first set of these articles, I will enter the passwords fray and address the issue of pass phrases versus passwords.

OK, maybe “pass phrases versus passwords” is really “the other great debate” or the “kind of boring and few people care” debate. In any case, which is more secure, pass phrases or passwords? The answer is not as clear-cut as it may seem.

In order to properly analyze the issue of pass phrases versus passwords, I divided the article into three installments. In this first installment, I will cover the fundamentals of pass phrases and passwords, describe how they are stored, and so on. Next month, I will discuss the relative strength of each and apply some mathematical approaches to determine which is stronger. In the last installment, I will conclude the series and offer guidance on how to choose passwords and configure a password policy.

Some Fundamentals

First, it is vital to understand the difference between passwords and pass phrases. When most people pick a password, they pick a word, such as “Password”, or a string of random symbols, such as “X2!aZ@<dF:”, or some combination of the two, such as “P@s$w0rd”. A pass phrase is a phrase, such as “This is a really complex pass phrase.” A pass phrase is typically much longer than a password and contains spaces. Some people construct a password by stringing together the first letter of each word in a phrase. Such a password is not a pass phrase; it is a password constructed using an interesting bit of logic. A pass phrase, by contrast, is a different form of token-based password where the tokens are words instead of symbols from a character set. A pass phrase does not need to be a proper sentence, spelled correctly, or devoid of symbol substitutions such as those used in the last password example.

The key differences between pass phrases and passwords are:

(1) A pass phrase usually has spaces; a password usually does not.

(2) A pass phrase is much longer than the vast majority of words, and, more important, longer than any random string of letters that an ordinary person could remember.

Although a pass phrase could simply be considered a very long password, typically it is constructed of a sequence of words, or something similar to words. In addition, the pass phrases depicted here are valid on Windows 2000 and later. Other products use pass phrases with different characteristics, or may not support them at all.

Second, you need to understand the difference between password guessing and password cracking. Password guessing is when someone sits at the console or at a remote machine trying passwords. Guessing is not relevant to this article, because if an account has a relatively complex password, guessing will not succeed anyway. If guessing succeeds, the cause is either incredible luck on the part of the attacker, or a weak password.

Let us look at an example. First, consider that passwords allow four categories of symbols: upper-case letters, lower-case letters, numbers, and non-alphanumeric symbols. The non-alphanumeric symbols include all the symbols on the keyboard, as well as anything that does not show up on the keyboard, such as Unicode characters. Some people consider Unicode characters and the symbols on the keyboard two different categories, but for the remainder of these articles we will lump them into one category. Furthermore, for our purposes, the term “character” will refer to all four categories collectively. Now assume the passwords are non-dictionary words of 8 characters, using at least three of the four character types, and that they expire in 70 days. For an attacker with no prior knowledge of any of those passwords to guess one of them before it expires, the attacking computer would need the network bandwidth of 53,000 T-3 lines (44.736 Mbps each). That bandwidth is needed just to send the authentication traffic required to try half of all the possible passwords (assuming each is equally likely).

If we restrict the character set used for guessing, and assume the password is selected at random, or looks that way to an attacker, from the 76 most common symbols, there are 1.11 x 10^15 possible 8-character passwords. If an attacker guesses 300 of those per second, which is very unlikely even with the most optimized programs, it would take 58,783 years on average (that is, trying half of all the possibilities) to guess the password. If the attacker simply scripts up the “net use” statement, he or she is likely to get only two or three tries per second. Even at three tries per second, it would take 5,878,324 years on average to guess a password.
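The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: the function names are ours, a year is assumed to be 365.25 days, and "guessing the password" is taken to mean trying half of the keyspace, which is the expected cost when every password is equally likely. Under those assumptions the results line up with the numbers quoted above, with the larger figure corresponding to three tries per second.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: 365.25-day year


def keyspace(symbols: int, length: int) -> int:
    """Number of distinct passwords of exactly `length` symbols."""
    return symbols ** length


def average_years(space: int, guesses_per_second: float) -> float:
    """Years needed to try half of the keyspace at a fixed guessing rate."""
    return (space / 2) / guesses_per_second / SECONDS_PER_YEAR


space = keyspace(76, 8)
print(f"{space:.2e} possible 8-character passwords")
for rate in (300, 3):
    years = average_years(space, rate)
    print(f"{rate:>3} guesses per second: about {years:,.0f} years on average")
```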

Cracking, on the other hand, is performed after the attacker has obtained the raw hashes. (A hash is a mathematical representation often used to store passwords. We will discuss it in more detail below.) The attacker generates test passwords, hashes them, and then compares the result to the stored hash. Cracking is far faster than guessing. Even on moderate hardware, an attacker can generate and test 3,000,000 passwords a second. At that rate, a cracking attack against all possible 8-character passwords using the 76-character set will take about 11.8 years. Of course, many of the passwords will be found in much less time, and any given password will statistically be found in half that time, or about 5.9 years. If the passwords are only 7 characters long, cracking the full set will take only about 56 days, or about 28 days on average for any given password.
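The generate, hash, and compare loop is simple enough to sketch. The example below is illustrative only: it uses SHA-256 as a stand-in one-way function (Windows uses the LM and NT hashes, not SHA-256), a three-letter alphabet so that it finishes instantly, and function names of our own invention. The last part estimates exhaustive-search times at the rate of 3,000,000 tests per second quoted above; half of each full-search time is the expected time for any one password.

```python
import hashlib
from itertools import product


def stand_in_hash(password: str) -> str:
    # Stand-in one-way function for illustration; NOT the LM or NT hash.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def crack(target_hash: str, alphabet: str, max_length: int):
    """Hash every candidate up to max_length and compare it to the target."""
    attempts = 0
    for length in range(1, max_length + 1):
        for symbols in product(alphabet, repeat=length):
            attempts += 1
            candidate = "".join(symbols)
            if stand_in_hash(candidate) == target_hash:
                return candidate, attempts
    return None, attempts


def full_search_days(symbols: int, length: int, tests_per_second: float) -> float:
    """Days needed to test every password of exactly `length` symbols."""
    return symbols ** length / tests_per_second / 86_400


stolen = stand_in_hash("cab")  # pretend this came from a stolen hash database
found, attempts = crack(stolen, "abc", 3)
print(f"recovered {found!r} after {attempts} attempts")

for length in (8, 7):
    days = full_search_days(76, length, 3_000_000)
    print(f"{length} characters: {days:,.1f} days in full, {days / 2:,.1f} on average")
```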

How the attacker obtains the raw hashes is an open issue, as is whether cracking is a primary concern. Consider that in Windows the attacker needs system-level access to the domain controller to crack domain passwords. If an attacker has already compromised a domain controller, cracking passwords is only a secondary problem. Yet, most attackers do crack passwords. Why? Mostly, the attacker hopes someone has an account on a different system in another domain with the same username and password. That is known as an administrative dependency, a topic to be addressed in a future article. Furthermore, since the only secret used in a challenge-response protocol is the password hash itself, cracking passwords is strictly superfluous because the hashes are all the attacker needs to access an account. Nevertheless, attackers typically do crack passwords and their success rate is a serious problem.

Keep in mind that modern operating systems do not usually store plaintext passwords or pass phrases. Normally, the stored value is the result of a one-way function, such as a hash. On Windows NT-based operating systems (including Windows 2000, XP, and Server 2003) the password is stored several different ways. The primary representations are the LM hash and the NT hash. For the purposes of this article you do not need to know exactly how they work. You only need to know three things:

The LM hash is case-insensitive, while the NT hash is case-sensitive.
The LM hash has a limited character set of only 142 characters, while the NT hash supports almost the entire Unicode character set of 65,536 characters.
The NT hash calculates the hash based on the entire password the user entered. The LM hash splits the password into two 7-character chunks, padding as necessary.
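To make the first and third points concrete, here is a rough sketch of the preparation an LM-style hash applies to a password before hashing: upper-casing, padding to 14 characters, and splitting into two halves. It deliberately stops short of the hashing step itself. The function name is ours, ASCII encoding is a simplification (the real algorithm works in an OEM code page), and padding with NUL bytes is our understanding of the algorithm rather than something stated above.

```python
from typing import Tuple


def lm_halves(password: str) -> Tuple[bytes, bytes]:
    """Return the two 7-byte chunks that an LM-style hash processes separately."""
    # Case-insensitivity: the password is upper-cased before anything else.
    raw = password.upper().encode("ascii")  # simplification: ASCII only
    if len(raw) > 14:
        raise ValueError("this sketch only handles passwords of up to 14 characters")
    # Pad to exactly 14 bytes (assumption: NUL padding), then split in two.
    padded = raw.ljust(14, b"\x00")
    return padded[:7], padded[7:]


for candidate in ("Password1", "pAsSwOrD1"):
    print(candidate, lm_halves(candidate))
```

Both spellings produce the same pair of chunks, which is why a cracker that works on the LM hash never needs to consider case, and why each 7-character half can be attacked on its own.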

Both types of hashes generate a 128-bit stored value. Most password crackers today crack the LM hash first, then crack the NT hash by simply trying all upper- and lower-case combinations of the case-insensitive password recovered from the LM hash.
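The second stage of that attack is cheap, because a password with n letters has at most 2^n case variants. A minimal sketch follows, again with SHA-256 standing in for the case-sensitive hash (it is not the NT hash) and with function names that are ours.

```python
import hashlib
from itertools import product


def stand_in_hash(password: str) -> str:
    # Stand-in for a case-sensitive hash; NOT the real NT hash.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def case_variants(upper_password: str):
    """Yield every upper/lower-case spelling of a case-insensitive password."""
    # dict.fromkeys drops the duplicate for characters that have no case.
    choices = [tuple(dict.fromkeys((ch.lower(), ch.upper()))) for ch in upper_password]
    for combo in product(*choices):
        yield "".join(combo)


def recover_case(upper_password: str, target_hash: str):
    """Return the spelling whose hash matches target_hash, or None."""
    for candidate in case_variants(upper_password):
        if stand_in_hash(candidate) == target_hash:
            return candidate
    return None


target = stand_in_hash("PassWord1")  # pretend this is the stored case-sensitive hash
print(recover_case("PASSWORD1", target))
print(sum(1 for _ in case_variants("PASSWORD1")), "variants tried at most")
```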

The LM hash is a very weak one-way function used for storing passwords. Originally invented for the LAN Manager operating system, the LM hash was included in Windows NT for backward compatibility. It is still included for backward compatibility. Because of the way the LM hash is calculated, no password with an LM hash is stronger than a 7-character password selected from a 142-character character set.
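A quick calculation shows why the split matters so much. Because the two halves can be attacked independently, the worst case is two searches of 7 characters each rather than one search of 14 characters. The sketch below uses the 142-character set size quoted above; the variable names are ours.

```python
LM_SYMBOLS = 142  # character-set size quoted in the article

one_half = LM_SYMBOLS ** 7          # candidates for a single 7-character chunk
both_halves = 2 * one_half          # worst case when each half is cracked on its own
if_hashed_whole = LM_SYMBOLS ** 14  # what a true 14-character search would cost

print(f"one half:         {one_half:.2e}")
print(f"both halves:      {both_halves:.2e}")
print(f"hashed as a whole: {if_hashed_whole:.2e}")
print(f"the split saves a factor of about {if_hashed_whole / both_halves:.2e}")
```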

Removing the LM Hashes

There are several ways to ensure the LM hash is not stored; one of them is to use passwords or pass phrases longer than 14 characters. You can also use the NoLMHash switch, exposed in Group Policy on Windows Server 2003 and Windows XP as “Network security: Do not store LAN Manager hash value on next password change.” Using that switch globally turns off storage of LM hashes for all accounts. The change takes effect for each account the next time its password is changed. Existing LM hashes for the current and any past passwords are not removed simply by throwing that switch. In addition, the fact that the switch does not work right away means that you will not immediately notice any potential interoperability problems caused by not storing LM hashes. See Knowledge Base article KB 299656 for more information. The Knowledge Base article also has information about using the NoLMHash switch with Windows 2000.
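If you want to check from a script whether the switch is set on a given machine, something along the following lines may work. Treat it as a sketch under assumptions: it assumes the policy is stored as a DWORD value named NoLMHash under HKLM\SYSTEM\CurrentControlSet\Control\Lsa, which is our reading of the Knowledge Base article for Windows XP and Windows Server 2003, and the function name is ours. On anything other than Windows it simply returns None.

```python
import sys

# Assumption: registry location of the NoLMHash value on Windows XP / Server 2003.
LSA_KEY = r"SYSTEM\CurrentControlSet\Control\Lsa"


def read_no_lm_hash():
    """Return the NoLMHash value, or None if it is absent or this is not Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # standard library, available on Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, LSA_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "NoLMHash")
            return value
    except OSError:
        # Key or value not present, or access denied.
        return None


setting = read_no_lm_hash()
if setting is None:
    print("NoLMHash value not found")
else:
    print("NoLMHash =", setting)
```

Remember that a value of 1 only affects passwords set after the change; it says nothing about LM hashes that are already stored.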

You can also remove the LM hash by using certain characters in your password. It is widely held that using “ALT characters” in your password prevents the LM hash from being generated. Actually, only certain Unicode characters cause the LM hash to disappear. For instance, Unicode characters between 0128 and 0159 cause the LM hash not to be generated. Some Unicode characters are converted into other characters before being hashed.

There is a concern with removing LM hashes: doing so will break things! One reason LM hashes are left on by default is that removing them breaks any application that uses UDP-based authentication for RPC. That includes Windows Cluster Services, Real Time Communications Server, and probably others. These problems are solved by turning on the NtlmMinClientSec setting, exposed as “Network security: Minimum session security for NTLM SSP based (including secure RPC) clients” in Group Policy on Windows Server 2003. NtlmMinClientSec needs to be set to at least “Require message integrity” and “Require NTLMv2 session security” (0x80010). With that setting, RPC uses NTLMv2 authentication, which uses the NT hash. (See KB article 828861 for more information on cluster problems when you do not have an LM hash.) Other applications will also break in the absence of an LM hash. For instance, Outlook 2001 for the Macintosh requires that all accounts it uses have one. Windows 3.x will definitely break without an LM hash, and Windows 95 and 98 will break in certain scenarios. In addition, some third-party products, such as network attached storage devices, may require LM hashes.
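The 0x80010 value for NtlmMinClientSec mentioned above is a bit mask combining two options. The sketch below decodes such a value. The flag table is an assumption based on our recollection of the Microsoft documentation for this setting, not something given in this article; only the combined value 0x80010 and the two option names come from the text above.

```python
# Assumed bit assignments for the NTLM minimum session security settings.
NTLM_MIN_SEC_FLAGS = {
    0x00000010: "Require message integrity",
    0x00000020: "Require message confidentiality",
    0x00080000: "Require NTLMv2 session security",
    0x20000000: "Require 128-bit encryption",
}


def decode_ntlm_min_sec(value: int) -> list:
    """List the options whose bits are set in an NtlmMinClientSec-style value."""
    return [name for bit, name in NTLM_MIN_SEC_FLAGS.items() if value & bit]


for option in decode_ntlm_min_sec(0x80010):
    print(option)
```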

Final Thoughts

The first installment of the passwords article series has talked about the basics of passwords. In the next installment we will try to analyze whether pass phrases have an inherent advantage over passwords. However, given how few people currently use them, we have very little real-world data on pass phrases. In order to understand more about them, we would like to ask you a favor. If you would like to help us, think of a pass phrase you might use (preferably not the one you are currently using!) and e-mail it to passstud@microsoft.com. We are hoping to get enough results to be able to perform some analysis on pass phrases and understand how they are actually formed.

As always, this column is for you. Let us know if there is something you want to discuss, or if there is a better way we can help you secure your systems. Just click the “Comments” button below, and send us a note.
