Everybody's got passwords for everything, and almost every password people use is total shit. The best passwords are long sequences randomly generated from a large alphabet, but those are difficult to remember. So maybe you come up with one really good random one and spend some time committing it to memory. But the biggest mistake you can make is using the same password for everything, especially websites.
I've been a huge culprit of doing exactly that. For years, I used the same password for everything. It was a good strong password, but by spreading it all over the place, I completely compromised the security of it. First, when you submit a password in a web form, if that form isn't sent over a secure connection (HTTPS), then your password is going to be nabbed by somebody. Good sites like gmail use secure connections, but it's truly shocking to see how many don't. Second, once it is submitted to the site, who knows what they're doing with it? Do you trust the owner of the site to hold on to the password you use for everything? Even if you do, accidents and break-ins happen. Ideally, they'll only be storing a salted strong hash of your password, but a number of recent incidents in which websites' user databases have been leaked show that this is very often not the case. There are also some sites and services that do silly things like email your password to you. Email is not a secure medium.
So using the same password for everything is foolish because inevitably, one of those things is going to leak your password. You have to use different passwords for everything. Fortunately, that doesn't mean you need to commit a ton of different strong passwords to memory. A common recommendation, which is extremely poor but which we'll expand on, is to come up with a single strong password that you commit to memory, and then a simple rule for customizing the password to each site/service. For instance, your core password might be "password" (but hopefully, it's a lot stronger than that!), and then you can derive site-specific passwords like "gpasswordmail" for gmail, "lpasswordhacker" for lifehacker, etc.
It's the start of a good idea, but any rule that is easy to remember and use would be instantly apparent to whoever it is that eventually gets one of your passwords. If you got your hands on my password for amazon.com, found it was "am78.fALzd-4noza", what would you guess as my password for ebay? Probably something like "e78.fALzd-4yab", and you'd probably be right.
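To make that weakness concrete, here is a minimal sketch using the simpler "gpasswordmail" rule from earlier (a site prefix, the core password, a site suffix). The function names are made up for illustration, and the attacker is assumed to know which site the leaked password came from.

```python
def derive(core, prefix, suffix):
    """Apply the simple rule: site prefix + core password + site suffix."""
    return prefix + core + suffix


def recover_core(leaked, prefix, suffix):
    """Undo the rule, given one leaked password and the site parts it was built from."""
    if not (leaked.startswith(prefix) and leaked.endswith(suffix)):
        raise ValueError("leaked password does not match the assumed rule")
    return leaked[len(prefix):len(leaked) - len(suffix)]


leaked = derive("password", "g", "mail")      # what the gmail leak reveals
core = recover_core(leaked, "g", "mail")      # attacker strips the site parts
guess = derive(core, "l", "hacker")           # and rebuilds the lifehacker password
print(leaked, core, guess)
```

One leak plus an obvious rule is enough to rebuild every other password, which is why the rule itself has to be hidden.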
So we need to take this one step further. If you know anything about cryptography, you're probably already thinking about a cryptographically secure hash, like MD5 or one of the SHA hashes. For instance, if I feed "gpasswordmail" into MD5, the output is "48a9b2c2273351b9b4c9df9172a7dd10". Then I try "lpasswordhacker" and get "07a490202128b63bd3bc2267d585d0db". Can you see any connection between them? If either of these gets revealed as my password on that site, it would be exceedingly difficult for someone to reverse the hash (that's what makes them cryptographically secure) to see the plaintext I put in. Without the plaintext, they can't figure out my rule for generating passwords.
To make things a little more manageable I'd recommend encoding your hash in base-64. Not the ASCII string "07a490202128b63bd3bc2267d585d0db", but the actual hex value 0x07a490202128b63bd3bc2267d585d0db. That doesn't add any extra security: your password is exactly the same number of bits as it was before and encoding has nothing to do with encrypting. All it does is condense the number of characters. For instance, this same password in base64 encoding is "B6SQICEotjvTvCJn1YXQ2w==". Not actually any easier to remember, but maybe the form limits the number of characters or something.
This is still not an excuse for using shitty passwords. It wouldn't be hard for an attacker to see "07a490202128b63bd3bc2267d585d0db" and guess that you're using a hash function, or to see "B6SQICEotjvTvCJn1YXQ2w==" and guess that you're using base-64. Once they decode the base-64, they'll be back to "07a490202128b63bd3bc2267d585d0db", which they will guess is a hash. Reversing the hash directly may be difficult, but if they can guess the input to the hash because you're using a stupid core password like "password" or "asd" or "foobar", then they will have your core-password and rule to break into all the rest of your accounts.
Now obviously, you're not going to commit strings like "07a490202128b63bd3bc2267d585d0db" or "B6SQICEotjvTvCJn1YXQ2w==" to heart, at least not very many of them, so you'll need to be able to generate them whenever you need to authenticate (log in). Obviously, this will only be a feasible scheme if you'll always be able to do so when you need to.
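As a rough sketch of this step, Python's standard hashlib module can hash the two derived passphrases. The helper name is hypothetical, and MD5 is used only because it is the hash in the example; hashlib.sha256 would slot into the same place.

```python
import hashlib


def hashed_password(passphrase):
    """Return the MD5 digest of the passphrase as a 32-character hex string."""
    return hashlib.md5(passphrase.encode("utf-8")).hexdigest()


gmail = hashed_password("gpasswordmail")
lifehacker = hashed_password("lpasswordhacker")

# The inputs differ only in the site parts, but the digests share no visible pattern.
print(gmail)
print(lifehacker)
```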
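The difference between encoding the hex value and encoding the ASCII string is easy to get wrong, so here is a small sketch of both using the standard base64 module. The variable names are only for illustration.

```python
import base64

hex_digest = "07a490202128b63bd3bc2267d585d0db"

# Right: turn the 32 hex characters into the 16 raw bytes they represent, then encode.
raw = bytes.fromhex(hex_digest)
encoded = base64.b64encode(raw).decode("ascii")

# Wrong: encoding the ASCII text of the hex string makes the result longer, not shorter.
wrong = base64.b64encode(hex_digest.encode("ascii")).decode("ascii")

print(encoded, len(encoded))
print(wrong, len(wrong))
```

The first form gives 24 characters for the same 128 bits; the second gives 44.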
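Here is a minimal sketch of that guessing attack. It assumes the attacker has already worked out the site rule and that the hash is MD5; the tiny wordlist is a stand-in for a real dictionary, and the function names are made up.

```python
import hashlib


def md5_hex(text):
    """MD5 digest of text as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def guess_core(leaked_hash, prefix, suffix, wordlist):
    """Try each candidate core password under the assumed rule; return the match or None."""
    for candidate in wordlist:
        if md5_hex(prefix + candidate + suffix) == leaked_hash:
            return candidate
    return None


# Hypothetical leak: the gmail password of someone whose core password is "password".
leaked = md5_hex("gpasswordmail")
wordlist = ["asd", "foobar", "letmein", "password"]
print(guess_core(leaked, "g", "mail", wordlist))
```

The hash never has to be reversed; the attacker only has to hash guesses until one matches, which is cheap when the core password is in a dictionary.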
On Unix-Like systems
On Unix-like systems, the following bash commands will ask you for your passphrase (like "gpasswordmail", but stronger) and print the hashed and base64-encoded password to stdout:
read -sp "Password: " PASSPHRASE && echo "" \
&& (echo -n "0:" ; echo -n $PASSPHRASE |md5sum - | cut -d" " -f 1 ) | xxd -r | uuencode --base64 - | sed -n "2p" ; \
unset PASSPHRASE
I can break that down a little for you:
read -sp "Password: " PASSPHRASE
Reads from stdin and stores the input in a variable called PASSPHRASE. The -p option lets you specify a prompt that will be printed out before reading starts, and the -s tells it to read silently, meaning the typed characters are not echoed on the terminal.
&& echo "" \
The && just means only execute the next command if the last one succeeded. Not sure why it wouldn't, but just in case. Anyway, the next command is just to echo an empty string, which really means a linebreak will be printed. This is just so the output shows up on a line by itself, instead of following the "Password: " prompt. The \ at the end of the line is a line-continuation, meaning the command continues on the next line.
&& (
Once again, && means only execute this if the last command succeeded. There's no point in continuing if reading the password failed. The ( opens a subshell. Subshells do a lot of things, but what we're interested in now is gathering output. Basically, once the subshell is closed, it will look to the rest of the world like a single command. So everything that gets written to stdout inside the subshell will get concatenated together into one string, which becomes the "output" of the subshell. In other words, it lets us pipe the output of several commands into the next command after the subshell as a single stream.
echo -n "0:"
When we execute md5sum, it outputs the hash digest in ASCII hex. For base64 encoding, we need it to be binary, so we're going to use xxd to "undump" the hex back into binary. xxd requires the byte offset at the start of the line, so that's what we're printing here. The -n option tells echo not to print the customary linebreak at the end of the text.
; echo -n $PASSPHRASE
The ; is just a command separator that says the last command is over, and we're starting a new one. The next command is another echo without linebreak, and we use it to echo the value of the PASSPHRASE variable, which we set earlier to the value we read from the user. We need to suppress the linebreak just to be consistent, because we're feeding this into the md5sum utility. If we don't suppress the linebreak, then it will get included in the hash. There's nothing wrong with that, per se, but you need to make sure it always gets included, which is tricky if you're going between systems with different line-endings. Note that $PASSPHRASE is unquoted here, so the shell splits it on whitespace and expands wildcard characters before echo sees it; if your passphrase contains spaces or characters like *, write "$PASSPHRASE" instead.
|md5sum -
The | is a pipe, meaning the output from the previous command is sent as input to the next one. The next command is md5sum, and we tell it to read from stdin with the - arg. So this is going to read the PASSPHRASE and generate a hash for it. It isn't reading the "0:" from the first echo, because that was a previous command. The output from md5sum looks something like "3858f62230ac3c915f300c664312c63f *-". The bit at the end tells the type and name of the file the hash belongs to (in our case, stdin), which we don't care about. The next command will take care of it:
| cut -d" " -f 1
Once again we use the pipe character so the output from md5sum will be fed directly to the stdin stream for our cut command. This command breaks the input into fields, each field separated by the delimiter specified with the -d option. For our purposes, the delimiter is a space character, which separates the hash itself from the type-and-name specifier. The -f option tells cut which fields we actually want. We want the first field (the hash), so we use -f 1.
)
This closes the subshell. So our subshell consisted of two echo commands, the second of which was piped through md5sum, and then cut. Altogether, the output from the subshell looks like: "0:3858f62230ac3c915f300c664312c63f". Even though it was generated by multiple commands, the subshell gathered it up into a single string that can be piped to the next command.
| xxd -r
And that's exactly what we do, pipe it into xxd. This program is used for making hexdumps of binary files, but it can also be used to "undump" ASCII hex back into binary. Essentially, it's going to take our hex hash, "3858f62230ac3c915f300c664312c63f", and interpret it as the hexadecimal value 0x3858f62230ac3c915f300c664312c63f, then generate the binary representation of that value, and write it to stdout. Note this is raw binary, not ASCII representation of binary.
| uuencode --base64 -
The next command will encode this binary value in base-64, to make sure it consists entirely of printable ASCII characters. Before encoding, we have raw binary octets anywhere in the range [0, 255]. It might look something like "8Xö"0¬<`_0♀fC↕Æ?". After encoding, it will look more like "OFj2IjCsPJFfMAxmQxLGPw==". However, uuencode dumps some header and footer lines around the encoded text, which we want to get rid of. There is one header line and one footer line, and we happen to know in this case that our encoded text fits on one line, which means it's all on the second line. We can get that easily with sed:
| sed -n "2p"
Which simply tells it to keep line 2, and discard the rest. This is our output, our hashed-encoded password, printed to stdout.
unset PASSPHRASE
The final command just clears the original passphrase from the variable, so your password isn't just left hanging around.
On that topic, note that these commands dump your actual password, the thing you use to actually authenticate, to stdout. It's not as dangerous as dumping the plain-text passphrase, but it's still dangerous if anyone happens to be looking over your shoulder or otherwise has access to your system. You can use your favorite pipe-to-clipboard command, something like xclip -selection clipboard, to put the password someplace useful (your clipboard) but not in plain sight. You'll want to make sure you remove it from your clipboard after you paste it into the form of course (for instance, copy something else to the clipboard).
On a similar note, I can't speak to the security of any of these commands. It's entirely possible that the piped values will be left lingering somewhere on your system, in swap, etc. If you're that worried, you'll need to write something in C and use secure memory and stuff to do everything the right way.
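Going back to the linebreak point in the breakdown above: a quick sketch with Python's hashlib shows why the trailing linebreak has to be handled consistently. The passphrase is just the example one from earlier.

```python
import hashlib

# Same passphrase, three ways a tool might hand it to the hash function.
without_newline = hashlib.md5(b"gpasswordmail").hexdigest()
with_newline = hashlib.md5(b"gpasswordmail\n").hexdigest()
with_crlf = hashlib.md5(b"gpasswordmail\r\n").hexdigest()

print(without_newline)
print(with_newline)
print(with_crlf)
```

All three digests are different, so a password generated with a stray linebreak on one system won't match the one generated without it on another.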
With Python
I wrote the following simple Python script to do the same thing. Launched without arguments, it's a CLI program that prompts and reads (silently) from the command line, and writes the hashed-encoded password to stdout. Launched with -g, it launches a little Tk window where you can type in your plaintext password, and it will dump the encoded hash to a little text field that you can copy it out of.
I'm not including a copy of the license, just go to http://www.gnu.org/licenses/
r"""
File: passhash.py

License:
    Copyright 2010 Brian Mearns (bmearns@ieee.org)

    This file is part of passhash.py.

    passhash.py is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    passhash.py is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with passhash.py. If not, see <http://www.gnu.org/licenses/>.

======================

The generated hash takes a binary md5sum of your input text, and then
encodes it with base-64. It's equivalent to the following bash command:

    $ read -sp "Password: " PASSPHRASE && echo "" \
        && (echo -n "0:" ; echo -n $PASSPHRASE |md5sum - | cut -d" " -f 1 ) | xxd -r | uuencode --base64 - | sed -n "2p" ; \
        unset PASSPHRASE
    $
"""

# Written for Python 3 (the Tk module is named "tkinter" there).
import base64
import getpass
import hashlib
import os
import sys
from errno import EINVAL
from tkinter import *


def generateHash(password):
    """Return the base-64 encoding of the binary MD5 digest of password."""
    hasher = hashlib.md5()
    # hashlib works on bytes, so encode the text first.
    hasher.update(password.encode("utf-8"))
    hash = hasher.digest()
    return base64.b64encode(hash).decode("ascii")


class PasswordWidget:
    def __init__(self):
        self.__password = None
        self.__root = Tk()
        self.__hashframe = Frame(self.__root)
        self.__hashlabel = Label(self.__hashframe, text="NONE")

        # Top row: masked entry field for the plaintext passphrase.
        self.__frame = Frame(self.__root)
        self.__frame.pack(side=TOP)
        self.__label = Label(self.__frame, text="Password:")
        self.__label.pack(side=LEFT)
        self.__field = Entry(self.__frame, exportselection=False, show="*", width=40)
        self.__field.pack(side=RIGHT)

        # Middle row: entry that shows the generated hash so it can be copied.
        self.__hashFrame = Frame(self.__root)
        self.__hashFrame.pack()
        self.__hashEntry = Entry(self.__hashFrame, width=40)
        self.__hashEntry.insert(0, "None")
        self.__hashEntry.configure(state=DISABLED)
        self.__hashEntry.pack(side=RIGHT)
        #self.__hashText = Text(self.__hashFrame, width=40, height=1)
        #self.__hashText.insert("0.0", "None")
        #self.__hashText.configure(state=DISABLED)
        #self.__hashText.pack(side=RIGHT)

        # Bottom row: OK and Cancel buttons.
        self.__bframe = Frame(self.__root)
        self.__bframe.pack(side=BOTTOM)
        self.__okButton = Button(self.__bframe, text="OK", command=self.__ok)
        self.__okButton.pack(side=LEFT)
        self.__cancelButton = Button(self.__bframe, text="Cancel", command=self.__cancel)
        self.__cancelButton.pack(side=LEFT)

        self.__root.bind("<Key>", self.__keyPressed)
        self.__field.focus()
        self.__root.mainloop()

    def __keyPressed(self, event):
        # Key symbols are the same on every platform; raw keycodes are not.
        if event.keysym == "Return":
            self.__okButton.invoke()
        elif event.keysym == "Escape":
            self.__cancelButton.invoke()

    def __cancel(self):
        self.__password = None
        self.__root.withdraw()
        self.__root.quit()

    def __ok(self):
        self.__password = self.__field.get()
        hash = generateHash(self.__password)
        self.__hashEntry.configure(state=NORMAL)
        self.__hashEntry.delete(0, END)
        self.__hashEntry.insert(0, hash)
        self.__hashEntry.selection_range(0, END)
        self.__hashEntry.focus()
        #self.__hashText.configure(state=NORMAL)
        #self.__hashText.delete("0.0", END)
        #self.__hashText.insert("0.0", hash)
        #self.__hashText.configure(state=DISABLED)

    def getPassword(self):
        return self.__password


if __name__ == "__main__":
    if len(sys.argv) < 2:
        # CLI mode: prompt silently and print the encoded hash.
        password = getpass.getpass()
        hash = generateHash(password)
        print(hash)
    elif sys.argv[1] == "-g":
        # GUI mode: open the Tk window.
        pw = PasswordWidget()
    else:
        sys.stderr.write("Bad arguments.%sUsage: %s%s   or: %s -g%s" % (os.linesep, sys.argv[0], os.linesep, sys.argv[0], os.linesep))
        sys.exit(EINVAL)
Pencil and Paper method
- Be careful, I haven't thought this through very hard.
Here's another idea I'm working on. This is supposed to be something that you can potentially do by pencil and paper when necessary. It's not necessarily practical to do all the time by pencil and paper; a program would automate the process. But it should definitely be reasonable to do by hand once in a while.
It's fairly simple, you have a large pseudo-random grid of characters, a la http://passwordcard.org/. It should be large enough that a number of different (overlapping) 16x16 grids can be formed. For instance, if it's 17x17, then 4 different 16x16 grids can be created. If it's 18x18, nine different grids, etc. 20x20 is probably good (giving 25 different grids), or even just like 20x18 or something.
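The grid counts above follow from sliding a 16x16 window over the master grid one cell at a time. A small sketch of the arithmetic (the function name is made up):

```python
def count_subgrids(width, height, size=16):
    """Number of distinct size x size windows that fit in a width x height master grid."""
    if width < size or height < size:
        return 0
    return (width - size + 1) * (height - size + 1)


for dims in [(17, 17), (18, 18), (20, 20), (20, 18)]:
    print(dims, count_subgrids(*dims))
```

So the 20x18 option gives 15 different grids, compared with 25 for 20x20.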
So all you do is choose one of these 16x16 grids. Cut the grid into 4 non-overlapping squares of 8x8 each. The top-right and bottom-left squares are the keys for a 4-square cipher. The top-left and bottom-right are ignored and could be overlaid with ordinary alphabet squares for convenience, or you can just write out the cipher grid manually. Now you encode your core password and use that. The site-specificness comes from the selection of the grids. For instance, we might put heading letters for each row and column, then assign a two-letter id to each site/service. Like "gm" for gmail, "lh" for lifehacker, "sd" for slashdot, etc. However, that means you need a unique 2-character id for each one or they'll have the same password. A good option (in general) is to do multiple encryptions, selecting the grids for each encryption from the next two letters in the id. E.g., for gmail, you would encrypt with grid "gm" first, then with grid "ai", then with grid "lx". However, multiple encryptions also require that you write things down, which is ungood.
Instead, use the first two letters in the id to choose a preliminary grid. With this grid, encode the next two letters in the id, and use the result to pick a second grid. Repeat as many times as is appropriate. However, you're still limited by the total number of grids on the master. To make sure each site has a unique password, you can append the site-id to your core password and encrypt, but only if you encrypt it again afterwards. It could be a simpler encryption, like transposition, but otherwise you've given an attacker a crib against a fairly weak cipher, so they could potentially recover some of your master-grid from that.
An 8x8 grid gives 64 characters, so your core-password can contain upper and lower case letters (52), digits (+10), and two special chars. Or, you can have any custom alphabet you like.
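The four-square step described earlier can be sketched roughly as follows. Several things here are assumptions for illustration: the two special characters ("." and "-"), the randomly generated stand-in for a printed master grid, picking the 16x16 window by numeric (top, left) offsets rather than heading letters, and padding odd-length input with a fixed character.

```python
import random
import string

# Assumed 64-character alphabet: letters, digits, and two specials chosen for illustration.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + ".-"
# Ordinary 8x8 alphabet square, as used for the top-left and bottom-right positions.
PLAIN = [ALPHABET[r * 8:(r + 1) * 8] for r in range(8)]


def make_master(width=20, height=20, seed=None):
    """Random stand-in for the master grid; in practice this is your printed card."""
    rng = random.Random(seed)
    return [[rng.choice(ALPHABET) for _ in range(width)] for _ in range(height)]


def key_squares(master, top, left):
    """Cut the 16x16 window at (top, left); return its top-right and bottom-left 8x8 squares."""
    top_right = [row[left + 8:left + 16] for row in master[top:top + 8]]
    bottom_left = [row[left:left + 8] for row in master[top + 8:top + 16]]
    return top_right, bottom_left


def four_square(text, top_right, bottom_left, pad="."):
    """Encrypt text two characters at a time with the four-square rule."""
    if len(text) % 2:
        text += pad  # assumption: pad odd-length input with a fixed character
    out = []
    for a, b in zip(text[0::2], text[1::2]):
        ra, ca = divmod(ALPHABET.index(a), 8)  # a's position in the top-left plain square
        rb, cb = divmod(ALPHABET.index(b), 8)  # b's position in the bottom-right plain square
        out.append(top_right[ra][cb])          # a's row, b's column
        out.append(bottom_left[rb][ca])        # b's row, a's column
    return "".join(out)


master = make_master(seed=2010)  # fixed seed only so the demo is repeatable
tr, bl = key_squares(master, top=2, left=3)
print(four_square("password", tr, bl))
```

For a real card the grid should come from a proper source of randomness rather than a seeded generator; the seed is there only so the sketch prints the same thing each run.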
Now, since the master grid is random, it contains duplicates, so the encryption is not likely to be exactly reversible (because it is not one-to-one). That's fine, and actually an added benefit. Even if an attacker gets your key and knows which grid you chose, they hopefully won't be able to decrypt it exactly. However, this is only a small benefit, because they will probably be able to narrow it down to just a few possibilities, so you should guard your master grid carefully.
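To get a feel for how many duplicates to expect, here is a rough estimate. It assumes every cell of an 8x8 key square is drawn uniformly and independently from the 64-character alphabet, which may not match how a particular card is generated.

```python
import random

ALPHABET_SIZE = 64  # characters available
CELLS = 64          # cells in one 8x8 key square

# Expected number of distinct characters in one key square under the assumption above.
expected_distinct = ALPHABET_SIZE * (1 - (1 - 1 / ALPHABET_SIZE) ** CELLS)


def simulate(trials=2000, seed=1):
    """Average number of distinct characters over many random key squares."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += len({rng.randrange(ALPHABET_SIZE) for _ in range(CELLS)})
    return total / trials


print(round(expected_distinct, 1), round(simulate(), 1))
```

Under that assumption a key square contains only about 40.6 distinct characters on average, so roughly a third of the alphabet is missing from any one square and many ciphertext characters have more than one possible source.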
Multiple encryptions, especially with different grids, will help.
