So, I thought maybe I’d spend a little time discussing password authentication. Skip to the end if you just want to see good and bad ways to come up with passwords.
An early bit of computer security reading that made an impact on me, while I was learning the ropes as Ye Company Computer Fellow at The Adams Group, was Foiling the Cracker: A Survey of, and Improvements to, Password Security, by Daniel V. Klein. The paper is based on research from 1989, and the limits of computing power had already increased dramatically by the time I got my hands on it; yet even now, nearly two decades later, its cautions and advice have aged remarkably well.
In conducting his research for this paper, Mr Klein collected roughly 15,000 encrypted password hashes (from actual user accounts), and attempted to recover the original passwords via “brute force”.
An “encrypted password hash” is an effectively unique mathematical value that is generated from a user’s password, and stored for the purpose of later authenticating the user by verifying that they know their password. When the user enters the password, the very same mathematical transformation is performed, and the result is compared to the stored value. If they match, the password is the same (well, to be more precise, the password has only a one in millions-times-millions-times-millions-times… chance of being different).
The advantage of doing it this way, instead of just saving the passwords themselves, is this: if someone were to recover a file which contained the passwords themselves, they would suddenly have access to every account represented in the file; whereas if only the encrypted hash is stored, all they have is a bunch of useless mathematical values, represented as strings of garbage text. There is no way to take the hash and transform it back into the original password (for this reason, they are often called “one-way hashes”). The only thing you can do with a hash is to compare it to other hashes you can generate (from guessing what the password might be), to see if you’ve found the user’s password. (This tends to be faster and safer, though, than just trying the passwords directly on the system with which you’re trying to authenticate, as many systems have built-in time delays, or don’t let you try more than a few passwords in a given amount of time, and log every attempt for later forensic analysis.)
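To make the store-then-compare idea concrete, here is a minimal sketch in Python. The function names are mine, and PBKDF2 from the standard library is only a stand-in for whatever one-way function a real system uses. The random salt is also my addition (it is not something discussed above); it just keeps two users with the same password from ending up with the same stored value.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    # One-way transformation: easy to compute, impractical to reverse.
    # PBKDF2-HMAC-SHA256 is an assumed stand-in, not the scheme from the paper.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_record(password: str) -> tuple[bytes, bytes]:
    # What gets stored: the salt and the hash, never the password itself.
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def verify(password: str, record: tuple[bytes, bytes]) -> bool:
    # Repeat the same transformation on the attempt and compare the results.
    salt, stored = record
    return hmac.compare_digest(hash_password(password, salt), stored)


record = make_record("correct horse")
print(verify("correct horse", record))  # the same password matches
print(verify("wrong horse", record))    # a different one does not
```

Note that `verify` never needs the original password to be stored anywhere; it only needs to be able to redo the computation.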
And that’s called a “brute force” password attack: you take a few tens of thousands of your favorite password candidates, run them through the hash algorithm, and see if any of them match the hashes you have. If any do, you note down the passwords they came from and which accounts they belong to: you’ve just hacked them!
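The guessing loop itself is tiny. Here is a rough sketch, assuming a plain unsalted SHA-256 as a toy stand-in for the real hash algorithm; `toy_hash`, `crack`, and the sample accounts are all made up for illustration.

```python
import hashlib


def toy_hash(password: str) -> str:
    # Assumed stand-in for the system's real password hash.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def crack(stolen: dict[str, str], candidates: list[str]) -> dict[str, str]:
    # Index the stolen hashes so each guess is hashed only once and then
    # checked against every account (this shortcut relies on unsalted hashes).
    accounts_by_hash: dict[str, list[str]] = {}
    for account, hashed in stolen.items():
        accounts_by_hash.setdefault(hashed, []).append(account)

    recovered = {}
    for guess in candidates:
        for account in accounts_by_hash.get(toy_hash(guess), []):
            recovered[account] = guess
    return recovered


stolen = {
    "alice": toy_hash("mrdude1"),
    "bob": toy_hash("x9$Lq!v2"),
    "carol": toy_hash("mrdude1"),
}
print(crack(stolen, ["password", "mrdude", "mrdude1"]))
```

Only the accounts whose passwords appear in the candidate list fall; everything hinges on how good that list is.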
So Mr Klein got a large number of passwords, and ran a computer (or possibly more than one, I’m not sure) to just chug along, trying out candidates from a large dictionary he’d created of some couple-million passwords (about 60,000 base passwords; the rest are various permutations and transformations of those). In a week’s time, he’d recovered more than 1 of every 5 passwords (some 3,000 passwords). He recovered 368 passwords in just the first 15 minutes!
The very first thing tried against a password was a set of 130 variations on the account name itself. A user named “Micah J. Cowan”, with a username of mrdude, would get password attempts like mrdude, mrdude0, mrdude1, mrdude123, mjc, mjcmjc, mcowan, MCowan, hacim, micahc, MjccjM, MICAH-COWAN, (mrdude), CowanM, etc. This is actually the technique that fetched him the 368 passwords in his first 15 minutes of processing. Ouch!
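Generating that kind of list takes very little code. The sketch below reproduces a handful of the examples above; it is my own guess at how such a generator might look, not Klein's actual rule set (which had 130 variations), and `account_candidates` is a hypothetical name.

```python
def account_candidates(username: str, first: str, middle: str, last: str) -> list[str]:
    initials = (first[0] + middle[0] + last[0]).lower()
    seeds = [
        username,                               # mrdude
        initials,                               # mjc
        initials * 2,                           # mjcmjc
        (first[0] + last).lower(),              # mcowan
        first[0].upper() + last.capitalize(),   # MCowan
        first.lower()[::-1],                    # hacim
        last.capitalize() + first[0].upper(),   # CowanM
        f"{first}-{last}".upper(),              # MICAH-COWAN
        f"({username})",                        # (mrdude)
    ]
    suffixed = [username + suffix for suffix in ("0", "1", "123")]
    # Drop duplicates while keeping the order in which guesses are tried.
    return list(dict.fromkeys(seeds + suffixed))


print(account_candidates("mrdude", "Micah", "J", "Cowan"))
```

All of the inputs come from information that is usually public, which is why these guesses are tried first.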
Other things that would be tried were dictionary words. And not just Merriam-Webster: a relatively exhaustive dictionary of a large number of words, including people names (real and fictional), place names, foreign-language words, words from the King James Bible, offensive words and phrases, etc, etc. Variations on all these words would also be checked, such as replacing letters with similar-looking digits (o -> 0, “ell” -> 1, z -> 2, etc); various capitalizations (“mIchael”, “miChael”, “MichAel”, etc); spelling them backwards, etc.
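Here is a small sketch of how those word transformations might be generated. It covers only a subset of what is described above (the look-alike digit swaps, reversal, and single-letter capitalizations), and the function name is hypothetical.

```python
def variations(word: str) -> set[str]:
    # Look-alike substitutions mentioned above: o -> 0, l -> 1, z -> 2.
    lookalikes = str.maketrans({"o": "0", "l": "1", "z": "2"})
    out = set()
    for w in (word, word[::-1]):  # the word itself and the word spelled backwards
        out.update({w, w.lower(), w.upper(), w.capitalize(), w.translate(lookalikes)})
        lower = w.lower()
        for i in range(len(lower)):
            # Capitalize exactly one letter: "Michael", "mIchael", "miChael", ...
            out.add(lower[:i] + lower[i].upper() + lower[i + 1:])
    return out


print(sorted(variations("michael")))
```

Each base word turns into a couple dozen candidates this way, which is how a dictionary of some 60,000 words can balloon into millions of guesses.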
Thought you were clever with your password of “fylgjas” (guardian creatures from Norse mythology)? Or the Chinese word for “hen-pecked husband”? Think again—he caught ’em.
In addition to the techniques Klein describes in his paper, modern, readily-available brute-force password-crackers will also support things like exhaustive searches of all combinations of letters and numbers up through around six characters. Exhaustive searches of all combinations of all possible characters are also possible, but take a lot more time.
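To get a feel for why exhaustive search stops being practical as passwords get longer, it helps to count the candidates. The sketch below is just arithmetic; the guess rate is a made-up parameter, not a measurement of any real cracker.

```python
def keyspace(alphabet_size: int, max_length: int) -> int:
    # Number of strings of length 1..max_length over the given alphabet.
    return sum(alphabet_size ** k for k in range(1, max_length + 1))


def days_to_exhaust(alphabet_size: int, max_length: int, guesses_per_second: float) -> float:
    # Worst-case time to try every candidate at an assumed guess rate.
    return keyspace(alphabet_size, max_length) / guesses_per_second / 86_400


# 26 lowercase + 26 uppercase letters + 10 digits = 62 symbols.
for length in (6, 7, 8):
    total = keyspace(62, length)
    print(length, total, round(days_to_exhaust(62, length, 1_000_000), 1))
```

Every extra character multiplies the work by roughly the size of the alphabet, so going from six characters to eight makes the search a few thousand times longer.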
On the other hand, what with the power of large computer clusters, and cracker “bot-nets”, given a little time, attackers can readily search exhaustively for passwords several characters longer than was previously practical. In fact, computer security expert Bruce Schneier has a more up-to-date description of password cracking software designed to run on computer networks, and advice on what passwords are easily cracked, and how to choose safe ones. These days, good cracking software typically recovers over half of the passwords given to it, rather than just the ~25% that Klein managed after a year’s worth of CPU time.
So, to close up, passwords that everyone should be avoiding, for any system they care about, are:
- Any password shorter than eight characters. Passwords of arbitrary strings of letters and numbers up to six or seven characters can be exhaustively searched given enough time and resources (32 CPU years were adequate in the days of Klein’s article: that sounds like a lot until you run into someone with a 128-CPU cluster and a few months to spare). Throwing in some punctuation marks will help for shorter strings, but really you’re best off going for at least eight. And, don’t forget, if seven characters is just within or just out of reach today, where will it be in a few years, given the exponential growth of computer power?
- Single words or names, no matter what language they’re from, or how you modify them. Write ’em backwards, add some numbers at the end, use funky capitalization: it doesn’t matter. If they can exist in a list somewhere, a password cracker can guess them.
- My God, man, don’t ever pick a password based on your name, your account information, your girlfriend’s name, etc. You’re better off avoiding your birthday or anniversary, too: these things can be exhaustively searched faster than you can blink.
- Never use the same password for more than one site.
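A few of the rules above are mechanical enough to check in code. This is a rough sketch with a hypothetical `weaknesses` function and a caller-supplied word list; an empty result only means that none of these particular checks fired, not that the password is strong.

```python
def weaknesses(password: str, username: str, wordlist: set[str]) -> list[str]:
    problems = []
    if len(password) < 8:
        problems.append("shorter than eight characters")
    # Undo the most common cosmetic changes: case, and digits at either end.
    stripped = password.lower().strip("0123456789")
    if stripped in wordlist or stripped[::-1] in wordlist:
        problems.append("a single word with cosmetic changes")
    if username.lower() in password.lower():
        problems.append("based on the account name")
    return problems


words = {"michael", "hooky", "preheroic"}
print(weaknesses("Michael1", "mrdude", words))
print(weaknesses("mrdude", "mrdude", words))
print(weaknesses("hooky$preheroic", "mrdude", words))
```

Password reuse across sites can't be checked this way, since no single system sees your other passwords.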
Practices that are recommended for choosing secure passwords include:
- Building it from the initial letter of each word in a phrase: To be or not to be, that is the question becomes Tbontbtitq. This would be improved by using numbers in some spots, perhaps capitalizing an extra letter or two, and leaving in or adding in additional punctuation: 2Br!2b,tit? (note the substitution of the letter r for or, ! for not, and ? for question). This technique can easily be used to produce random-looking passwords which are very hard to brute-force or guess. However, be careful not to choose easily-guessed phrases as the basis for your password; the above phrase, for example, was intended only as an illustration. It is far too widely recognized to make a good basis for a password; I wouldn’t be at all surprised to discover there were password dictionaries out there that already have both Tbontbtitq and 2Br!2b,tit? in them, along with other variations. John 3:16 makes another example of an atrociously poor choice for password derivation. The best would be to choose a phrase or sentence from a random spot in a relatively obscure book. For instance, flipping open my copy of Advanced Programming in the UNIX Environment, I find “Every process has six or more IDs associated with it.” That could be made into a decent password (though not any more, obviously, now that I’ve mentioned doing so).
- Another good technique is to use two or three regular words together, especially if you use punctuation marks to separate the words; e.g., hooky$preheroic. This can make for easily-memorized, but hard-to-guess/bruteforce passwords. As already mentioned, single words, even with a large number of variations, make for easily-cracked passwords; but multiple-word passwords exponentially increase the difficulty of brute-forcing them. That’s assuming that you pick fairly random words, particularly, words that are random with respect to one another, and to yourself. For instance, tootie and frootie, or guitar and music, make horrible words to pair. And, if you know that I play piano and love Coca-Cola, even the three-word password coke-fiend-pianist may not be too much of a stretch for you. 😉
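Both techniques are easy to sketch in Python. The function names and the tiny word list here are invented for illustration; in practice you would want a list of thousands of words, and the random picks should come from a proper source of randomness such as the standard `secrets` module rather than from your own head.

```python
import secrets


def initials(phrase: str) -> str:
    # First character of each whitespace-separated word in the phrase.
    return "".join(word[0] for word in phrase.split())


def word_password(words: list[str], count: int = 2, separators: str = "$-!%") -> str:
    # Pick words independently at random, joined by random punctuation marks.
    picks = [secrets.choice(words) for _ in range(count)]
    password = picks[0]
    for word in picks[1:]:
        password += secrets.choice(separators) + word
    return password


print(initials("To be or not to be, that is the question"))  # Tbontbtitq
print(word_password(["hooky", "preheroic", "lantern", "quarry"], count=2))
```

With a list of N words and k picks, an attacker who knows your method and your list still faces N to the power k word combinations (times the separator choices), which is where the exponential increase in difficulty comes from. Picking the words at random also takes care of the warning about words that are related to each other or to you.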