Add to Technorati Favorites

Secure per-site passwords with no encrypted blob

Last night I read a Guardian article, Online passwords: keep it complicated. It’s a surprisingly good summary, given that it’s aimed at the general public. The author concludes by telling us he decided to adopt LastPass, and also mentions 1Password. The comments section on the page gives similar solutions, like KeePass. The security-conscious people I know have arrived at the same conclusion. There are plenty of articles on the web that summarize various similar products, e.g., Which Password Manager Is The Most Secure? at LifeHacker. A single encrypted blob offers good security and works well in practice. It also allows the storage of information like name, address, credit cards, etc., that can be used to auto-fill web forms.

But… I’ve never liked the idea of a single encrypted file with all my passwords in it. What if the storage is lost or corrupted? Could the file someday be decrypted by someone else? If my encrypted blob is online, what happens when I am offline? If the blob is to be stored locally, I need to think about where to put it, how to back it up, etc. If a company holds it for me, what happens if they go out of business or get hacked? What if they use proprietary encryption, a closed-source access app, or a proprietary underlying data format? Not all the above solutions have all these issues, but they all have some of them.

The crucial thing they all have in common is that they use a master password to encrypt all your passwords into a single blob, and the blob has to then be reliably stored and accessible forever.

An approach that requires no storage

I realized there’s a solution that doesn’t require any storage. It’s not perfect, but it has some attractive properties. I’ve already started using it for some sites.

[Edit: it has been pointed out in the comments that the following solution has been thought of before. See SuperGenPass.]

Here’s a simple first version of Python code to print my password for a given service:

import base64, getpass, hashlib, sys

service = sys.argv[1]
secret = getpass.getpass('Enter your master password: ')
password = base64.b64encode(hashlib.sha512(secret + service).digest())[:32]

print 'Your password on %s is %s' % (service, password)

The service name is given on the command line. The code prints a 32-character password for use on that service. Here’s some sample output:

        $ mypass.py facebook
        Enter your master password: 
        Your password on facebook is Wza2l5Tqy0omgWP+5DDsXjQLO/Mc07N8

        $ mypass.py twitter
        Enter your master password: 
        Your password on twitter is eVhhpjJrmtSa8XnNMu6vLSDhPeO5nFOT

This has some nice advantages. It also places some small requirements on the user. Unfortunately, however, it is not generally applicable – at least not today. These are discussed below.

Advantages

The obvious advantage is that there is no external storage. Your passwords are not stored anywhere. There’s no blob to store, protect, access, backup, worry about, etc. The algorithm used to generate your password is dead-simple, it’s open and available in dozens of languages, and it’s secure.

You’re free to use more than one master password if you like. You can invent your own services names (more on which below).

Requirements / User burden

As with all one-key-to-unlock-them-all approaches, the user obviously needs to remember their master password.

With this approach though, the user also has to remember the name they used for the service they’re trying to access. If you create a password for a service called “gmail” you’ll need to use that exact service name in the future. For me that’s not much of a burden, but I guess it would be for others.

There’s no reason why the list of services you have passwords for couldn’t be stored locally. If the password generator were in a browser extension, it could possibly suggest a service name, like “facebook.com”, based on the domain of the page you were on.

With this approach, it’s even more important that one’s master password be hard to guess. Unlike the single-encrypted-blob approach, anyone who can guess your master password (and the names you use for services) can immediately obtain your passwords. They don’t also need access to the blob – it doesn’t exist.

Additional security can be easily had by, for example, using a convention of adding a constant character to your service names. So, e.g., you could use “facebook*” and “twitter*” as service names, and not tell anyone how you form service names.

General applicability

Unfortunately, there is a major problem with this approach. That’s because different sites have different requirements on passwords. Some of the difficulties can be avoided quite easily, but there’s an additional problem, caused when services change their password policy.

The above code generates a Base64 password string. So, to give some examples, if the service you want a password for doesn’t allow a plus sign in your password, the above code might make something unacceptable to the service. Same thing if they insist that passwords must be at most 12 characters long.

Ironically, these services are insisting on policies that prevent the use of truly secure passwords. They’re usually in place to ensure that short passwords are chosen from a bigger space. It would be better, though more work, to impose restrictions only on short passwords.

In a perfect world, all sites could immediately switch to allowing Base64 passwords of length ≥ 16 (say). Then the above approach would work everywhere and we’d be done.

Varying password length

A general approach to adjusting the generated password is to take some of the Base64 information produced and use it to modify the password. For example, you might not comfortable with all your passwords being the same length, so we can compute a length like this:

import base64, getpass, hashlib, string, sys

b64letters = string.ascii_letters + '0123456789+/'
secret = getpass.getpass('Enter your master password: ')
password = base64.b64encode(hashlib.sha512(secret + service).digest())
lenAdjust = b64letters.find(password[-5]) % 16
print 'Your password on %s is %s' % (service, password[0:16 + lenAdjust])

This generates passwords that are between 16 and 31 characters in length:

        $ ./length-varying.py facebook
        Enter your master password: 
        Your password on facebook is 1nTlVGPhuWZf0l9Sk27
        $ ./length-varying.py twitter
        Enter your master password: 
        Your password on twitter is WE1DVZHAFBx2c3g63tR+Oi3Jxs4xMV

Satisfying site requirements

A possible approach to dealing with per-site password requirements is to have the code look up known services and adjust the initial password it generates to be acceptable. This can easily be done in a secure, random, repeatable way. For example:

  • If a site doesn’t allow upper case, lowercase the password.
  • If a site doesn’t allow digits, replace them with random letters.
  • If a site requires punctuation, you can replace some initial letters in the password with randomly chosen punctuation and then randomly permute the result using the Knuth Shuffle.

Some of these transformations use random numbers. These are easy to obtain: take an unused part of the Base64 string and use it to seed a RNG. For each transformation, you would need to call the RNG a fixed number of times, i.e., independent of the number of random numbers actually used to perform the transformation. That’s necessary in order to keep the RNG in a known state for subsequent transformations (if any).

For example, the following replaces digits 0-9 with a letter from A-J whose case is chosen randomly:

import base64, getpass, hashlib, random, string, sys

def getSeed(chars):
    seed = ord(chars[0])
    for letter in chars[1:]:
        seed = (seed << 8) & ord(letter)
    return seed

service = sys.argv[1]
b64letters = string.ascii_letters + '0123456789+/'
secret = getpass.getpass('Enter your master password: ')
digest = base64.b64encode(hashlib.sha512(secret + service).digest())
lenAdjust = b64letters.find(digest[-5]) % 16

passwordWithDigits = digest[0:16 + lenAdjust]
password = ''

random.seed(getSeed(digest[32:36]))
randoms = [random.randint(0, 1) for _ in passwordWithDigits]

for index, letter in enumerate(passwordWithDigits):
    if letter in '0123456789':
        base = ord('a' if randoms[index] else 'A')
        replacement = chr(base + ord(letter) - ord('0'))
        password += replacement
    else:
        password += letter

print 'Your password on %s is %s' % (service, password)

Here's the output for the above two services (using the same master password):

        $ ./no-digits.py facebook
        Enter your master password: 
        Your password on facebook is bnTlVGPhuWZfAljSkch
        $ ./no-digits.py twitter
        Enter your master password: 
        Your password on twitter is WEBDVZHAFBxccdgGdtR+OidJxsExMV

As you can see, the digits are replaced with letters (in a biased way, that we can ignore in an informal blog post). The RNG is in a known state because the number of times it has been called is independent of the number of digits in the pre-transformation text.

This approach can be used to transform the initial random sequence into one that satisfies a service's password restrictions.

It is difficult to reliably associate service names with site policies. To do so might require keeping a file mapping the name a user used for a service to the policy of the site. Although this doesn't defeat the purpose of this approach (since that file would not need to be stored securely), it is an additional and unwanted pain for the user. Part of the point was to try to entirely avoid additional storage, even if it doesn't have to be encrypted.

The major problem with per-site requirements

The major problem however is that sites may change their password policy. Even if our program knew the rules for all sites, it would have a real problem if a site changed its policy. The code would need to be updated to generate passwords according to the new site policy. Existing users, supposing they upgraded, would then be shown incorrect passwords and would need to do password resets, which is obviously inconvenient.

Conclusion

I like the above approach a lot, but don't see a way to solve the issue with changing site policies. I wouldn't mind building in some rules for known popular sites, but any step in that direction has its problems - at least as far as I can see.

For now, I'm going to start using the above approach on sites that allow a long random password with characters from the Base64 set. That covers the majority of sites I use. Importantly, that includes Google, so if I ever need password resets I can have them sent there, knowing that I can always log in to recover them.


You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

16 Responses to “Secure per-site passwords with no encrypted blob”

  1. Great idea! Have to think about the problem solution. At least i’ll do it for myself .
    Thanks :)

  2. Petter Häggholm Says:

    Have you looked into SuperGenPass? [http://supergenpass.com/]

  3. Hi Petter. Yes, that’s the same thing, pretty much, thanks for the pointer! I’m going to edit the text above to point people to SGP.

  4. Richard Moore Says:

    With your scheme if you ever sign into a local service say for the sake of example ‘www’ then you will be generating a password that can be used to workout all passwords for sites that start with www. The hash output can be used as input for a length extensions attack (due to the way these hashes work). The risk is mitigated to an extent by the fact that the password is a truncated form of the hash as you only take the first 32 bytes of the digest, but that means that the remaining search space for a brute force attack is tiny. If you want to look into doing something like this, take a look at things like HMAC which are designed to prevent this kind of extension attack.

  5. You might also consider running `pip install oplop` and/or visiting https://oplop.appspot.com/.

  6. Richard Moore Says:

    Looking at the ‘how it works’ page for this tool it looks like this is broken too.

  7. Hi Richard. Thanks. I originally wrote the code using hmac but then thought I didn’t need to worry about length extensions. I guess you’re right to be concerned though, and also I later thought that I might want to use service names like “gmail” and “gmail-work” or whatever. So, I should probably revert :-)

  8. Richard Moore Says:

    Yeah, that would be a lot safer.

  9. Christian Heimes Says:

    Never ever do something like hashfunc(secret + message)! This is insecure and open to several attack vectors like length extension attacks. At least you want to use a

    message authentication code algorithm like HMAC. If you want to reach a minimum level of security, than add a salt from a proper CPRNG and use a key stretching or key derivation algorithm like PBKDF2. The master passwords has most likely not enough entropy. A key stretching algorithms compensates it a bit.

  10. Thanks Christian. I’d not heard of key stretching. I’ll go look…

  11. Brandon Rhodes Says:

    Note that http://passwordmaker.org/ implements this idea with several competing implementations for many different platforms, and with more secure cryptographic options than the simple hash(secret + message) that you do here. PasswordMaker Pro, the Chrome extension, is the particular implementation of its standard that I recommend that people use.

  12. Good point. Unfortunately, that means you now have to save the salt between invocations, if I’m not mistaken. You’ll also need the same salt value on every computer you want to generate passwords on (instead of only the same algorithm), and when you lose it, you lose access to all your passwords. I wonder if there’s a better way?

  13. David Avraamides Says:

    What if your password expires on one site? How do you change one password without changing the “secret” and therefore changing all passwords?

  14. Hi David. That’s a good question :-) I guess the only answer is that you’d need to choose a different service name for that site (because, as you say, changing your master password would be bad). But that further highlights the need for some kind of management of the service names you’ve used for sites. I don’t like that at all, as I guess is clear – that’s why I say this is a good (hard) question.

    Another thing which I find very impractical about this approach is that once you start using a version of this code, it’s very hard to upgrade to another version. Again, you’d want metadata stored elsewhere to keep some idea of state.

    Thanks for commenting!

  15. Thanks Brandon, I’ll have a look.

  16. Thanks Richard, I’ll go look.