Passwords, part 2

Stage 1: Create

Who creates the password?

Here are the top five examples of weak passwords chosen by users in 2012:

  1. password
  2. 123456
  3. 12345678
  4. abc123
  5. qwerty

Those are consistent with older password hacks. For example, in 2010, Gawker Media (the parent of several big blog sites) was hacked. Of 250,000 disclosed passwords, about 1% were "123456" and another 1% were "password".

All this raises the question: how can we characterize "strong" passwords? They need to be passwords that are hard for attackers to guess. It turns out we already have such a characterization from our study of cryptography. Recall that the security level of an algorithm is the exponent of the maximum number of guesses required to break the algorithm by brute-force attack: if up to 2^k guesses are needed, the security level is k bits. When we talked about encryption schemes, the guesses were to find the key, and we implicitly assumed that keys were chosen uniformly at random from the space of all keys. For example, a 128-bit key is from a space that requires 2^128 guesses to search exhaustively.

Using entropy to measure password strength. We can use the idea of the number of guesses required for brute force search for passwords. But passwords aren't bit strings; they're character strings. That makes the math a little more complicated. Suppose there are N characters to choose from, and the password is of length L. Then there are N^L possible passwords. We want to find the security level H of that space. That is, we want an H such that 2^H is equal to the number of possible passwords. (Why use the letter H? Because the concept we're describing is known in the field of Information Theory as Shannon entropy, for which the letter H is traditionally used. And from now on, we'll write "entropy" instead of "security level" when we're talking about passwords.) Let's solve for that H:

      N^L = 2^H
  log N^L = log 2^H
  L log N = H log 2
        H = L (log N / log 2)
        H = L log_2 N
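
As a quick check of the formula, here is a minimal Python sketch. The function name entropy_bits is ours (it is not a library function), and it assumes every character is chosen independently and uniformly at random from the N available characters.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy H = L * log2(N) of a password of `length` characters,
    each drawn uniformly at random from `alphabet_size` characters."""
    if alphabet_size < 1 or length < 0:
        raise ValueError("need alphabet_size >= 1 and length >= 0")
    return length * math.log2(alphabet_size)

# Sanity check: a 128-bit key is a 128-character string over {0, 1}.
print(entropy_bits(2, 128))   # 128.0

# Ten characters drawn from the 62 letters and digits.
print(round(entropy_bits(62, 10), 1))
```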

So if passwords are chosen uniformly at random from the lower-case Latin alphabet of 26 characters, the entropy of an 8-character password is 8 lg 26 ≈ 37.6 bits (writing lg for log_2). That's very low compared to the minimum security level for keys! Is it enough? According to a 2006 NIST report, the minimum level is 14 bits, and 30 is comfortable. But that material assumes an online attack model, in which attackers interactively guess passwords. In an offline attack, in which attackers have direct access to the password database, a higher level of security is necessary.
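
To see why the attack model matters, the following sketch converts an entropy into a worst-case search time. Both guess rates are assumptions chosen only for illustration, not figures from the NIST report: 10 guesses per second for a rate-limited online attacker, and 10^9 guesses per second for an offline attacker. The offline rate in particular depends heavily on how the stored passwords are protected.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600

def worst_case_seconds(entropy_bits: float, guesses_per_second: float) -> float:
    """Time needed to try all 2^H candidates at a fixed guess rate."""
    return 2.0 ** entropy_bits / guesses_per_second

# 8 lower-case letters chosen uniformly at random, about 37.6 bits.
h = 8 * math.log2(26)

# Hypothetical guess rates (assumptions, see the text above).
online = worst_case_seconds(h, 10)      # attacker must go through the login interface
offline = worst_case_seconds(h, 1e9)    # attacker holds the password database

print(f"online:  about {online / SECONDS_PER_YEAR:.0f} years")
print(f"offline: about {offline / 60:.1f} minutes")
```

Under these assumed rates, the same 37.6-bit password withstands centuries of online guessing but only minutes of offline search.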

The entropy calculation above assumed that passwords are chosen uniformly at random from the space of all passwords; for example, that the password is just as likely to be "iZ8#j" as "12345". But humans just don't choose randomly. So the entropy of human-chosen passwords is effectively much less than it would be if the passwords were chosen by a machine. Suppose, e.g., that the average high-school graduate has a vocabulary of around 50,000 words [Nagy and Anderson; Pinker "The Language Instinct"]. What if this person chooses an English word as a password? There will be lg 50,000 ≈ 15.6 bits of entropy. That's low! And it assumes that users choose uniformly at random over their entire vocabulary, which isn't likely either.
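
The effect of non-uniform choice can be made concrete with the general Shannon entropy formula, H = -sum of p lg p over all outcomes, which reduces to lg N when all N outcomes are equally likely. The sketch below is illustrative only: the skewed distribution (half of all choices falling on 100 favorite words) is a made-up assumption, not measured data.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over nonzero p."""
    if abs(math.fsum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -math.fsum(p * math.log2(p) for p in probs if p > 0)

VOCAB = 50_000      # vocabulary size used in the text
FAVORITES = 100     # hypothetical number of "favorite" words

# Every word equally likely.
uniform = [1 / VOCAB] * VOCAB

# Hypothetical skew: half the probability mass on the favorite words,
# the other half spread evenly over the rest of the vocabulary.
rest = VOCAB - FAVORITES
skewed = [0.5 / FAVORITES] * FAVORITES + [0.5 / rest] * rest

h_uniform = shannon_entropy(uniform)
h_skewed = shannon_entropy(skewed)
print(round(h_uniform, 1))   # lg 50,000, about 15.6 bits
print(round(h_skewed, 1))    # noticeably lower
```

Even this mild skew costs several bits, and real users are likely to be far more predictable than the assumed distribution.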

The aforementioned NIST report uses the following heuristic for the entropy of user-selected passwords drawn from the full keyboard:

Other heuristics have been proposed, summarized in Schneider and in Bishop. "Simple transformations" above could include deleting vowels, capitalizing some letters, adding suffixes or prefixes, replacing letters with look-alike numbers, leet speak, and more.

Beyond entropy. Weir et al. (2010) show experimentally that the NIST entropy estimates don't do a good job of predicting how long it will take attackers to crack passwords. Kelley et al. (2012) show that, despite the Weir et al. result, passwords chosen according to the most comprehensive NIST requirements (mixtures of character kinds, no dictionary words, sufficiently long, etc.) are indeed the passwords that are hardest to crack; call these comprehensive passwords. So the NIST recommendations reach the right conclusion, even if the metric they use isn't valid. But comprehensive passwords are hard to remember and hated by users, which leads users to reuse passwords or to modify them in predictable ways. Could we do better? Here are three options that have been explored:

Beyond passwords

Could we replace passwords with a different authentication mechanism? Bonneau et al. (2012) develop criteria against which to judge proposed new mechanisms:

Evaluating many proposed schemes for replacing passwords, Bonneau et al. conclude that though they generally offer better security, they tend to offer worse deployability, and usability is sometimes better and sometimes worse. It seems that passwords are here to stay, at least for now. Bonneau et al. observe that most of the schemes that compare favorably to passwords involve single sign on.

Single sign on

With single sign on (SSO), a user enrolls with many service providers (SPs) and shares authentication secrets, e.g., a password, with each SP, but authenticates only once, to the SSO service. Thereafter, the SSO manages authentication to the SPs. Note that the SSO can trivially impersonate the user: the SSO has to be trusted.

Variants of SSO include true SSO, in which the SSO does authentication and the SPs simply trust the SSO when it asserts the identity of a user, and pseudo SSO, in which the SSO impersonates the user to the SP through the SP's own native authentication mechanism. Either way, the SSO could be local to the user's machine or could be running as a remote or proxy service.

Password managers are an example of a typically local pseudo SSO offering a limited degree of automation. Browsers that remember passwords and sync them across machines are an example of something approaching a proxy pseudo SSO. Examples of proxy true SSOs include Kerberos and third-party authentication by Google/Facebook credentials. Local true SSOs are harder to exemplify, as they necessitate the remote SP trusting the user's machine not to lie about the user's identity; a trusted cryptographic co-processor might be needed here to ensure that the user cannot subvert the local SSO.
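
The difference between pseudo SSO and true SSO can be sketched as a toy model. Everything here is hypothetical and simplified for illustration: the class names are ours, the SP stores passwords in the clear, and the true-SSO "assertion" is just an HMAC over the user name under a key shared by the SSO and the SP. Real systems such as Kerberos add timestamps, nonces, and much more; this is not a description of how any of them work.

```python
import hashlib
import hmac

class PasswordSP:
    """SP with its own native password check (the target of pseudo SSO)."""
    def __init__(self, accounts):
        self._accounts = dict(accounts)   # toy only: plaintext passwords

    def login(self, user, password):
        stored = self._accounts.get(user)
        if stored is None:
            return False
        return hmac.compare_digest(stored.encode(), password.encode())

class PseudoSSO:
    """Stores per-SP secrets and replays them, as a password manager does."""
    def __init__(self):
        self._vault = {}

    def remember(self, sp, user, password):
        self._vault[sp] = (user, password)

    def login(self, sp):
        user, password = self._vault[sp]
        return sp.login(user, password)   # impersonates the user to the SP

class TrueSSO:
    """Authenticates the user itself, then issues identity assertions."""
    def __init__(self, key):
        self._key = key

    def assert_identity(self, user):
        return hmac.new(self._key, user.encode(), hashlib.sha256).digest()

class TrustingSP:
    """SP with no password of its own; it trusts the SSO's assertions."""
    def __init__(self, sso_key):
        self._sso_key = sso_key

    def accept(self, user, tag):
        expected = hmac.new(self._sso_key, user.encode(), hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)

# Pseudo SSO: the SP cannot tell the manager from the user.
sp1 = PasswordSP({"alice": "correcthorsebatterystaple"})
manager = PseudoSSO()
manager.remember(sp1, "alice", "correcthorsebatterystaple")
print(manager.login(sp1))                                   # True

# True SSO: the SP checks an assertion instead of a password.
key = b"demo key shared by SSO and SP"
sso, sp2 = TrueSSO(key), TrustingSP(key)
print(sp2.accept("alice", sso.assert_identity("alice")))    # True
print(sp2.accept("bob", sso.assert_identity("alice")))      # False
```

In both variants the SSO could log in as the user without the user's involvement (PseudoSSO holds the password, and TrueSSO can assert any identity), which is why the SSO has to be trusted.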

Exercises

  1. A user is required to choose a 4-digit PIN. The allowed digits are 0..9. Assume the user chooses the PIN randomly. What is the entropy of such a PIN?

  2. Continuing the previous exercise, the user is now required to enter their 4-digit PIN on an unusual keypad with five buttons, each of which is labeled with two digits:

    +---+---+---+---+---+
    |1*2|3*4|5*6|7*8|9*0|
    +---+---+---+---+---+

    To enter either a "1" or a "2" on this keypad, the user presses the "1*2" button only one time. Hence, the system cannot distinguish between "1" and "2". The same holds for the other pairs of digits.

    What is the entropy of a randomly-selected 4-digit PIN as it would be entered on this keypad?

  3. Let X be such that an X-digit PIN chosen randomly from digits 0..9 has entropy equal to that of a 4-digit PIN chosen randomly from the keypad in the previous exercise. Determine what X is to the nearest integer.

  4. According to the NIST SP 800-63 (2008) heuristics, what is the entropy of a 10-character password chosen (non-randomly) by a user from a standard US keyboard? Assume the user isn't forced to use any upper-case or non-alphabetic characters, and that no dictionary checking is done.  

  5. Which of the following policies will produce the highest-strength passwords? Which policy do you think will produce passwords that are easiest to remember? Use the NIST heuristics to evaluate policy 2.

    • Policy 1: Users are assigned randomly-generated 6-character passwords, where each character is a lower-case Latin letter (i.e., a-z).

    • Policy 2: Users choose their own passwords, which must be at least 12 characters long, where each character may be any character from the full keyboard. Users are not required to use any upper-case or non-alphabetic characters, and no dictionary checking need be done.

    • Policy 3: Users are assigned randomly-generated passphrases, where each passphrase is the concatenation of four words randomly chosen from a system dictionary of 2,000 very common words (e.g., "correcthorsebatterystaple").

  6. Consider this claim: "Policies that require user-chosen passwords to include upper-case and non-alphabetic characters are not useful, because they do not make passwords harder to guess: once the attacker learns the policy, she can adjust her guessing strategy accordingly." Evaluate that claim.

  7. Choose any three of the potential replacements for passwords discussed in Bonneau et al. (2012). Analyze each replacement against the criteria of security, usability, and deployability. Do you agree with the assessment made by Bonneau et al. in their Table I?