Last night I read a Guardian article, Online passwords: keep it complicated. It’s a surprisingly good summary, given that it’s aimed at the general public. The author concludes by telling us he decided to adopt LastPass, and also mentions 1Password. The comments section on the page gives similar solutions, like KeePass. The security-conscious people I know have arrived at the same conclusion. There are plenty of articles on the web that summarize various similar products, e.g., Which Password Manager Is The Most Secure? at LifeHacker. A single encrypted blob offers good security and works well in practice. It also allows the storage of information like name, address, credit cards, etc., that can be used to auto-fill web forms.
But… I’ve never liked the idea of a single encrypted file with all my passwords in it. What if the storage is lost or corrupted? Could the file someday be decrypted by someone else? If my encrypted blob is online, what happens when I am offline? If the blob is to be stored locally, I need to think about where to put it, how to back it up, etc. If a company holds it for me, what happens if they go out of business or get hacked? What if they use proprietary encryption, a closed-source access app, or a proprietary underlying data format? Not all the above solutions have all these issues, but they all have some of them.
The crucial thing they all have in common is that they use a master password to encrypt all your passwords into a single blob, and the blob has to then be reliably stored and accessible forever.
An approach that requires no storage
I realized there’s a solution that doesn’t require any storage. It’s not perfect, but it has some attractive properties. I’ve already started using it for some sites.
[Edit: it has been pointed out in the comments that the following solution has been thought of before. See SuperGenPass.]
Here’s a simple first version of Python code to print my password for a given service:
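import base64, getpass, hashlib, sys
# The service name, e.g. "facebook", is the only command-line argument.
service = sys.argv[1]
secret = getpass.getpass('Enter your master password: ')
# Hash master password + service name, Base64-encode, keep 32 characters.
digest = hashlib.sha512((secret + service).encode('utf-8')).digest()
password = base64.b64encode(digest).decode('ascii')[:32]
print('Your password on %s is %s' % (service, password))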
The service name is given on the command line. The code prints a 32-character password for use on that service. Here’s some sample output:
$ mypass.py facebook
Enter your master password:
Your password on facebook is Wza2l5Tqy0omgWP+5DDsXjQLO/Mc07N8
$ mypass.py twitter
Enter your master password:
Your password on twitter is eVhhpjJrmtSa8XnNMu6vLSDhPeO5nFOT
This has some nice advantages. It also places some small requirements on the user. Unfortunately, however, it is not generally applicable – at least not today. These are discussed below.
Advantages
The obvious advantage is that there is no external storage. Your passwords are not stored anywhere. There’s no blob to store, protect, access, back up, worry about, etc. The algorithm used to generate your password is dead simple, it’s open and available in dozens of languages, and it’s built on a secure, standard hash function (SHA-512).
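To make the "nothing is stored" point concrete, here is a minimal sketch of the same derivation wrapped in a function, written for Python 3. The function name `derive_password` and the master password string are placeholders invented for this illustration.

```python
import base64
import hashlib


def derive_password(secret, service, length=32):
    """Derive a password from a master secret and a service name.

    Same recipe as the script above: SHA-512 of secret + service,
    Base64-encoded, truncated to `length` characters. Nothing is read
    from or written to disk.
    """
    digest = hashlib.sha512((secret + service).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")[:length]


# Placeholder master password, for illustration only.
first = derive_password("example master password", "facebook")
second = derive_password("example master password", "facebook")
other = derive_password("example master password", "twitter")

print(first == second)  # True: same inputs always give the same password
print(first != other)   # True: a different service name gives a different one
```

Because the password is recomputed from scratch every time, losing the machine it was generated on loses nothing.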
You’re free to use more than one master password if you like. You can invent your own service names (more on which below).
Requirements / User burden
As with all one-key-to-unlock-them-all approaches, the user obviously needs to remember their master password.
With this approach though, the user also has to remember the name they used for the service they’re trying to access. If you create a password for a service called “gmail” you’ll need to use that exact service name in the future. For me that’s not much of a burden, but I guess it would be for others.
There’s no reason why the list of services you have passwords for couldn’t be stored locally. If the password generator were in a browser extension, it could possibly suggest a service name, like “facebook.com”, based on the domain of the page you were on.
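As a rough sketch of how such a suggestion might work, the following Python 3 function derives a candidate service name from a page URL. The function name `suggest_service_name` is hypothetical, and simply stripping a leading "www." is an assumption; a real extension would probably want the registrable domain, which needs a public-suffix list.

```python
from urllib.parse import urlsplit


def suggest_service_name(url):
    """Suggest a service name such as "facebook.com" from a page URL.

    Assumption: dropping a leading "www." from the host is good enough.
    """
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host


print(suggest_service_name("https://www.facebook.com/login"))  # facebook.com
print(suggest_service_name("https://twitter.com/"))            # twitter.com
```

The suggestion is only a default: the user would still be free to type a different name, as long as they use the same one next time.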
With this approach, it’s even more important that one’s master password be hard to guess. Unlike the single-encrypted-blob approach, anyone who can guess your master password (and the names you use for services) can immediately obtain your passwords. They don’t also need access to the blob – it doesn’t exist.
Additional security can be easily had by, for example, using a convention of adding a constant character to your service names. So, e.g., you could use “facebook*” and “twitter*” as service names, and not tell anyone how you form service names.
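A minimal Python 3 sketch of that convention might look like this. The names `PRIVATE_SUFFIX`, `derive_password` and `derive_with_convention` are made up for the example, and the master password is a placeholder.

```python
import base64
import hashlib

# Hypothetical private convention: a constant suffix on every service name.
PRIVATE_SUFFIX = "*"


def derive_password(secret, service, length=32):
    """SHA-512 of secret + service, Base64-encoded and truncated."""
    digest = hashlib.sha512((secret + service).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")[:length]


def derive_with_convention(secret, site):
    """Apply the private suffix before deriving, so "facebook" becomes "facebook*"."""
    return derive_password(secret, site + PRIVATE_SUFFIX)


plain = derive_password("example master password", "facebook")
private = derive_with_convention("example master password", "facebook")
print(plain != private)  # True: the suffix changes the whole password
```

Someone who guesses the master password and tries the obvious service name "facebook" would get `plain`, not the password actually in use.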
General applicability
Unfortunately, there is a major problem with this approach: different sites have different requirements on passwords. Some of the difficulties can be avoided quite easily, but there’s an additional problem, caused when services change their password policy.
The above code generates a Base64 password string. So, to give some examples, if the service you want a password for doesn’t allow a plus sign in your password, the above code might make something unacceptable to the service. Same thing if they insist that passwords must be at most 12 characters long.
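To illustrate, here is a small Python 3 sketch that checks a generated password against a site policy. The policy shown (`EXAMPLE_POLICY`) is hypothetical and not taken from any real site, and the helper name `violations` is invented for this example.

```python
import string

# Hypothetical policy for illustration: at most 12 characters,
# letters and digits only.
EXAMPLE_POLICY = {
    "max_length": 12,
    "allowed": string.ascii_letters + string.digits,
}


def violations(password, policy):
    """Return a list of reasons why the password breaks the policy."""
    problems = []
    if len(password) > policy["max_length"]:
        problems.append("longer than %d characters" % policy["max_length"])
    bad = sorted(set(password) - set(policy["allowed"]))
    if bad:
        problems.append("disallowed characters: " + "".join(bad))
    return problems


# The sample facebook password shown earlier in this post.
print(violations("Wza2l5Tqy0omgWP+5DDsXjQLO/Mc07N8", EXAMPLE_POLICY))
# ['longer than 12 characters', 'disallowed characters: +/']
```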
Ironically, these services are insisting on policies that prevent the use of truly secure passwords. Such policies are usually in place to ensure that short passwords are chosen from a bigger space. It would be better, though more work, to impose restrictions only on short passwords.
In a perfect world, all sites could immediately switch to allowing Base64 passwords of length ≥ 16 (say). Then the above approach would work everywhere and we’d be done.
Varying password length
A general approach to adjusting the generated password is to take some of the Base64 information produced and use it to modify the password. For example, you might not be comfortable with all your passwords being the same length, so we can compute a length like this:
import base64, getpass, hashlib, string, sys
service = sys.argv[1]
b64letters = string.ascii_letters + '0123456789+/'
secret = getpass.getpass('Enter your master password: ')
digest = hashlib.sha512((secret + service).encode('utf-8')).digest()
password = base64.b64encode(digest).decode('ascii')
# Use one character near the end of the digest to pick an extra length of 0-15.
lenAdjust = b64letters.find(password[-5]) % 16
print('Your password on %s is %s' % (service, password[0:16 + lenAdjust]))
This generates passwords that are between 16 and 31 characters in length:
$ ./length-varying.py facebook
Enter your master password:
Your password on facebook is 1nTlVGPhuWZf0l9Sk27
$ ./length-varying.py twitter
Enter your master password:
Your password on twitter is WE1DVZHAFBx2c3g63tR+Oi3Jxs4xMV
Satisfying site requirements
A possible approach to dealing with per-site password requirements is to have the code look up known services and adjust the initial password it generates to be acceptable. This can easily be done in a secure, random, repeatable way. For example:
- If a site doesn’t allow upper case, lowercase the password.
- If a site doesn’t allow digits, replace them with random letters.
- If a site requires punctuation, you can replace some initial letters in the password with randomly chosen punctuation and then randomly permute the result using the Knuth Shuffle.
Some of these transformations use random numbers. These are easy to obtain: take an unused part of the Base64 string and use it to seed a RNG. For each transformation, you would need to call the RNG a fixed number of times, i.e., independent of the number of random numbers actually used to perform the transformation. That’s necessary in order to keep the RNG in a known state for subsequent transformations (if any).
For example, the following replaces digits 0-9 with a letter from A-J whose case is chosen randomly:
import base64, getpass, hashlib, random, string, sys
def getSeed(chars):
    # Pack the characters into one integer, 8 bits per character.
    seed = ord(chars[0])
    for letter in chars[1:]:
        seed = (seed << 8) | ord(letter)
    return seed
service = sys.argv[1]
b64letters = string.ascii_letters + '0123456789+/'
secret = getpass.getpass('Enter your master password: ')
digest = base64.b64encode(
    hashlib.sha512((secret + service).encode('utf-8')).digest()).decode('ascii')
lenAdjust = b64letters.find(digest[-5]) % 16
passwordWithDigits = digest[0:16 + lenAdjust]
password = ''
# Seed the RNG from a part of the digest not used for the password itself.
random.seed(getSeed(digest[32:36]))
# Draw one random bit per character, digit or not, so the number of RNG
# calls does not depend on how many digits the password contains.
randoms = [random.randint(0, 1) for _ in passwordWithDigits]
for index, letter in enumerate(passwordWithDigits):
    if letter in '0123456789':
        # Map 0-9 to a-j or A-J, with the case chosen by the random bit.
        base = ord('a' if randoms[index] else 'A')
        replacement = chr(base + ord(letter) - ord('0'))
        password += replacement
    else:
        password += letter
print('Your password on %s is %s' % (service, password))
Here's the output for the above two services (using the same master password):
$ ./no-digits.py facebook
Enter your master password:
Your password on facebook is bnTlVGPhuWZfAljSkch
$ ./no-digits.py twitter
Enter your master password:
Your password on twitter is WEBDVZHAFBxccdgGdtR+OidJxsExMV
As you can see, the digits are replaced with letters (in a biased way, since only the letters A-J and a-j are ever used as replacements, but we can ignore that in an informal blog post). The RNG is in a known state because the number of times it has been called is independent of the number of digits in the pre-transformation text.
This approach can be used to transform the initial random sequence into one that satisfies a service's password restrictions.
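As one more illustration, here is a hedged Python 3 sketch of the punctuation transformation mentioned in the list above: replace some initial characters with punctuation, then permute the result with a Knuth shuffle. The punctuation set, the function name `add_punctuation`, the choice of `digest[36:40]` as the seed slice, and the master password are all assumptions made for this example.

```python
import base64
import hashlib
import random

PUNCTUATION = "!#$%&*-?@_"  # hypothetical set; a real site may allow others


def add_punctuation(password, seed_chars, count=2):
    """Replace the first `count` characters with punctuation, then shuffle.

    The RNG is seeded from `seed_chars` (a slice of the Base64 digest), so
    the result is repeatable for a given master password and service name.
    """
    rng = random.Random(int.from_bytes(seed_chars.encode("ascii"), "big"))

    def pick(n):
        # Build indices from random() alone, since that is the method whose
        # sequence the random module documents as reproducible for a seed.
        return int(rng.random() * n)

    chars = list(password)
    for i in range(count):
        chars[i] = PUNCTUATION[pick(len(PUNCTUATION))]
    # Knuth (Fisher-Yates) shuffle, written out explicitly.
    for i in range(len(chars) - 1, 0, -1):
        j = pick(i + 1)
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


# Placeholder master password and service name, for illustration only.
digest = base64.b64encode(
    hashlib.sha512("example master passwordfacebook".encode("utf-8")).digest()
).decode("ascii")
result = add_punctuation(digest[:16], digest[36:40])
print(result)
```

The number of RNG calls here depends only on `count` and the password length, never on the password's contents, which keeps the RNG in a known state for any transformation that follows.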
It is difficult to reliably associate service names with site policies. To do so might require keeping a file mapping the name a user used for a service to the policy of the site. Although this doesn't defeat the purpose of this approach (since that file would not need to be stored securely), it is an additional and unwanted pain for the user. Part of the point was to try to entirely avoid additional storage, even if it doesn't have to be encrypted.
The major problem with per-site requirements
The major problem however is that sites may change their password policy. Even if our program knew the rules for all sites, it would have a real problem if a site changed its policy. The code would need to be updated to generate passwords according to the new site policy. Existing users, supposing they upgraded, would then be shown incorrect passwords and would need to do password resets, which is obviously inconvenient.
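The effect can be sketched in a few lines of Python 3. Everything here is hypothetical: the service name "example-site", the two policy dictionaries, the placeholder master password, and this policy-aware variant of `derive_password`.

```python
import base64
import hashlib


def derive_password(secret, service, policy):
    """Derive a password, then truncate it to fit a (hypothetical) site policy."""
    digest = hashlib.sha512((secret + service).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")[:policy["length"]]


# Made-up policies for one made-up service, before and after a change.
old_policy = {"length": 32}
new_policy = {"length": 12}

before = derive_password("example master password", "example-site", old_policy)
after = derive_password("example master password", "example-site", new_policy)

# The site still has `before` on record, but an updated program that only
# knows the new policy now prints `after`.
print(before == after)  # False
```

Nothing in the master password or the service name records which policy was in force when the account was created, so the program has no way to reproduce `before` on its own.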
Conclusion
I like the above approach a lot, but don't see a way to solve the issue with changing site policies. I wouldn't mind building in some rules for known popular sites, but any step in that direction has its problems - at least as far as I can see.
For now, I'm going to start using the above approach on sites that allow a long random password with characters from the Base64 set. That covers the majority of sites I use. Importantly, that includes Google, so if I ever need password resets I can have them sent there, knowing that I can always log in to recover them.