PGP public key autodiscovery based on email address

Published: April 20, 2009
Tags: email phishing security cryptography

Quite a while ago now (toward the end of 2007, when I was freshly inspired to do more cryptography advocacy by the death of Itojun, "the IPv6 samurai"), I wrote a rant/essay called Anti-phishing Cryptography (it's actually in a pretty bad state right now; it needs some structuring and completion). In it I lamented the fact that people in 2007 were being fooled by fraudulent emails into giving up the credentials for important services like PayPal or internet banking, despite the fact that digital signature technology, which makes verifying the identity of an email sender largely foolproof, has existed for literally decades. I tried to generate enthusiasm for a system where important services like PayPal or banks would digitally sign all official communications with their customers, and would make the keys easily available as metadata on their websites, with a well-recognised and user-friendly icon system like that currently in place for RSS. Randomly returning to this idea a few years later, I'm struck by how much this overcomplicated the problem and how much better (seemingly) a new idea of mine is.

The idea is simple: when an email client like Thunderbird or Outlook receives an email from (purportedly) the address user@domain.com, it makes an HTTP request to a hostname derived from domain.com, perhaps something like keydiscovery.domain.com. The request can be on a port other than 80 (which means that keydiscovery.domain.com remains free for use as a website URL) and the URI can be some agreed-upon standard with the username user included as a query parameter, e.g. /getkey?id=user. An HTTP server on the other end responds with user@domain.com's public key, in ASCII format, and then closes the connection. If the user doesn't exist it returns a 404 as usual. Once the email client has the key (which it caches against the email address for future use), it can attempt to validate any signature present.
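To make the lookup concrete, here is a rough Python sketch of what the client side might do. Nothing here is a standard: the port number 8473, the keydiscovery. hostname prefix and the function names discovery_url and fetch_key are all assumptions made up for illustration, and the /getkey?id= URI is just the example from the paragraph above.

```python
from typing import Dict, Optional
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

KEY_PORT = 8473        # hypothetical port; the post only says "other than 80"
KEY_PATH = "/getkey"   # example URI from the post, not an agreed standard

_key_cache: Dict[str, str] = {}  # sender address -> ASCII-armoured key


def discovery_url(sender: str, port: int = KEY_PORT) -> str:
    """Derive the key discovery URL from a sender address."""
    user, sep, domain = sender.strip().rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"not an email address: {sender!r}")
    query = urlencode({"id": user})  # percent-encodes awkward usernames
    return f"http://keydiscovery.{domain.lower()}:{port}{KEY_PATH}?{query}"


def fetch_key(sender: str, timeout: float = 5.0) -> Optional[str]:
    """Return the sender's public key, or None if the server answers 404."""
    if sender in _key_cache:
        return _key_cache[sender]
    try:
        with urlopen(discovery_url(sender), timeout=timeout) as resp:
            key = resp.read().decode("ascii")
    except HTTPError as err:
        if err.code == 404:  # user unknown to the key server
            return None
        raise
    _key_cache[sender] = key  # cache against the address for future mail
    return key
```

Calling discovery_url("user@domain.com") gives the URL described above on the assumed port. A real client would also need to decide what to do when the key server is unreachable, and how long a cached key stays valid; this sketch ignores both.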

This can take place in a few seconds, completely invisibly to the email client user. Depending upon the result (key successfully found, signature is valid; key not found or signature not present; key successfully found, signature is invalid) the client can display a green, orange or red light respectively beside the email's listing. Thus, without understanding anything that has actually gone on, even a clueless user who doesn't know what a public key is can quickly grasp some significant security information about that particular email. If there's a green light he knows the email is genuine. If there's a red light he knows it is certainly a scam. Places like PayPal and major banks would play their part by signing all of their emails and making the public key available via HTTP at the appropriate URI and on the appropriate port. (Their outgoing SMTP server could be configured to sign automatically, which reduces the cost of retraining and the risk of keys being stolen or leaked from low-level employees with insecure desk PCs.) If they did, people running phishing scams could not get away with spoofing a sender address which ends in @paypal.com unless they were able to forge a digital signature, which at the moment is essentially impossible in practice. Any attempt to send such impersonating mail would raise an obvious red light at the user's end.
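The three-way decision above is small enough to write down. The following Python sketch assumes the key lookup has already happened and that some verify function (hypothetical here; in practice it would wrap a real OpenPGP implementation) checks a signature against a key. The names Light and classify are mine, invented for illustration.

```python
from enum import Enum
from typing import Callable, Optional


class Light(Enum):
    GREEN = "green"    # key found, signature valid
    ORANGE = "orange"  # key not found, or no signature on the email
    RED = "red"        # key found, signature invalid


def classify(
    key: Optional[str],
    signature: Optional[bytes],
    body: bytes,
    verify: Callable[[str, bytes, bytes], bool],
) -> Light:
    """Map the outcome of key discovery and verification to a light."""
    if key is None or signature is None:
        return Light.ORANGE
    # verify(key, signature, body) is a stand-in for a real PGP check.
    return Light.GREEN if verify(key, signature, body) else Light.RED
```

Note that orange covers two quite different situations (the domain runs no key server, or the sender simply didn't sign), which a client might eventually want to distinguish.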

Unfortunately, this idea does not kill phishing completely. Scammers could simply register domains which are superficially similar to those that they want to pretend to send email from, like paypaal.com (in fact, they often do this now in order to have somewhere convincing-looking to host web applications which steal credentials), set up a key discovery server for this domain and send signed emails that would get green lighted. It would be up to the user to realise that the email was from PayPaal and not PayPal, and that's something that anybody could miss. So further measures would be required to help guard against this. I have ideas on how this might be done, but I don't want to get this entry too off track by talking about them now. Besides, even if this system doesn't solve the phishing problem completely, it is still cool and useful and a bold step in the direction of making public key cryptography easier for the public to use.

I've done a bit of googling and nobody appears to have written about this sort of idea before, although this may be because nobody has talked about it using the term "autodiscovery" (which I picked up from the RSS world, btw). The nearest I have found are some places (like here and here) talking about putting a tag in your blog's HTML code to advertise your public key, so that people can do things like leave signed or encrypted comments on your blog without too much trouble. This is a similar-in-spirit but not-quite-the-same idea to the one I've proposed here. The fact that this is the closest idea I've found means either I'm the first person to have a great new idea (which I just can't believe, given how simple it is), or (much more likely) there's something fatally wrong with this idea and so everybody else who has had it has dropped it after some thought. Surely I must be missing something? Let's think this through...

Suppose there's a mail service out there somewhere that has not yet set up a key autodiscovery server. Could an attacker set up a fraudulent key server before them to impersonate their users? Well, no, since the hostname used to do the key lookup is derived from the domain part of the email address. An attacker can only set up a fake key server if they control the relevant domain name (excepting some sort of DNS attack, of course, which causes some users to mistakenly ask the attacker's server for a key instead of the real one). This feature of the system limits impersonation attacks to people inside the same domain as the impersonated person. A disgruntled Microsoft sysadmin could establish a fake keydiscovery.microsoft.com and impersonate bill.gates@microsoft.com but he couldn't impersonate scott.mcnealy@sun.com (and neither could Scott or one of his employees impersonate Bill). This is a weakness, but not a terrible one. Inside attackers are always going to be at an advantage over outsiders.

So fake servers don't seem to be a problem. What about attacking genuine servers? Could an attacker exploit a buggy key server and overwrite somebody's stored public key with his own? If the keys are served over a port other than 80, then even though they come via HTTP there is no need for them to be served by a "proper" webserver like Apache or Lighttpd (of course, in this case there's no reason for them to use HTTP at all instead of some other invented-for-the-purpose protocol, but I'm a big fan of reusing HTTP wherever possible because it makes life easier for developers). They could be served by extremely small and lean specialist server programs which implement an extremely restricted subset of HTTP, close to HTTP/0.9 in simplicity. They would ignore any requests other than GET, and return a 404 for any URI other than the designated standard /getkey (or whatever the internet agreed upon). That's it. They would have read-only access, enforced by the OS, to the keys themselves (some other program, less accessible to the public, could handle key management). The code for a server like this could be made so delightfully short and simple that you could realistically get all the dangerous bugs out of it.
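As a rough illustration of how little such a server needs to do, here is a Python sketch. It leans on the standard library's http.server rather than a hand-rolled HTTP subset, so it is bigger underneath than the lean program imagined above, and non-GET requests get that module's default 501 response rather than being silently ignored. The key directory /var/keys, the .asc file naming, the port 8473 and the names lookup, KeyHandler and serve are all assumptions for the sake of the example.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from typing import Optional
from urllib.parse import parse_qs, urlsplit

KEY_DIR = Path("/var/keys")  # hypothetical; should be read-only to this process
SAFE_ID = re.compile(r"[A-Za-z0-9._-]{1,64}")  # no slashes, so no traversal


def lookup(target: str, key_dir: Path = KEY_DIR) -> Optional[bytes]:
    """Return the key for a request target like /getkey?id=user, else None."""
    parts = urlsplit(target)
    if parts.path != "/getkey":
        return None
    ids = parse_qs(parts.query).get("id", [])
    if len(ids) != 1 or not SAFE_ID.fullmatch(ids[0]):
        return None
    try:
        return (key_dir / (ids[0] + ".asc")).read_bytes()
    except OSError:  # no such user, or file unreadable
        return None


class KeyHandler(BaseHTTPRequestHandler):
    # Only do_GET is defined, so the server never writes anything.
    def do_GET(self) -> None:
        key = lookup(self.path)
        if key is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=us-ascii")
        self.send_header("Content-Length", str(len(key)))
        self.end_headers()
        self.wfile.write(key)


def serve(port: int = 8473) -> None:
    """Run the key server until interrupted (port is an assumed value)."""
    HTTPServer(("", port), KeyHandler).serve_forever()
```

Calling serve() starts it. The read-only guarantee still has to come from the OS (file permissions, a dedicated user account); the code merely never attempts a write.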

I'm going to have to think about possible attacks against this system for a while. If there really are none then I think it's a really exciting and important idea. I might go about trying to write a plugin for Thunderbird to implement a proof of concept. I have no idea how this will go: I don't even know what language Thunderbird plugins have to be written in, or whether or not they would have access to an HTTP implementation. If anyone can see any weaknesses in this idea, please leave a comment. If the comment feature doesn't work (which I'm starting to suspect is the case sometimes), then please email me.
