USENIX Security 2016 - Best Paper Award

Written by Petr Svenda, 2016-09-01.

What started as an attempt to investigate how cryptographic smart cards generate RSA key pairs ended with a surprising result: the library or device that generated a given key can be identified from the public key alone. Simply put, if you have an RSA public key, we can say which library generated it. And we received the Best Paper Award for it at the 25th USENIX Security Symposium, 2016. The paper is titled The Million-Key Question—Investigating the Origins of RSA Public Keys.

The paper's abstract captures the main points:

Can bits of an RSA public key leak information about design and implementation choices such as the prime generation algorithm? We analysed over 60 million freshly generated key pairs from 22 open- and closed-source libraries and from 16 different smartcards, revealing significant leakage. The bias introduced by different choices is sufficiently large to classify a probable library or smartcard with high accuracy based only on the values of public keys. Such a classification can be used to decrease the anonymity set of users of anonymous mailers or operators of linked Tor hidden services, to quickly detect keys from the same vulnerable library or to verify a claim of use of secure hardware by a remote party. The classification of the key origins of more than 10 million RSA-based IPv4 TLS keys and 1.4 million PGP keys also provides an independent estimation of the libraries that are most commonly used to generate the keys found on the Internet. Our broad inspection provides a sanity check and deep insight regarding which of the recommendations for RSA key pair generation are followed in practice, including closed-source libraries and smartcards.

We released all gathered key pairs (more than 60 million) for subsequent research. Get them at http://crcs.cz/papers/usenix2016. We also operate an online tool where you can test your keys, a classify-as-a-service approach. Try it at http://crcs.cz/rsapp. Subsequent posts will focus on selected aspects of this research, such as the current distribution of cryptographic libraries behind TLS keys (as estimated by the classification), biased random number generators on smart cards, and much more. Stay tuned 🙂.


Q&A section

Q: So what did you do?
A: We figured out that an RSA public key leaks information about the library that created it. Hence we can tell which library you used to generate your key, based on the public key only.
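To make the idea of a key "leaking" its origin more concrete, here is a minimal Python sketch that reads a few simple properties off a modulus. The function name and the particular features (top byte, residues modulo 3 and 4) are illustrative assumptions, not necessarily the features used in the paper. The point is only that such values can be computed from the public key alone, and that different prime generation strategies can skew how they are distributed.

```python
def modulus_features(n: int) -> dict:
    """Extract a few simple, illustrative features from an RSA modulus n.

    Hypothetical helper: the feature choice is an assumption made for
    illustration, not the paper's exact feature set.
    """
    bits = n.bit_length()
    return {
        "bit_length": bits,
        # Most significant 8 bits of the modulus.
        "top_byte": n >> max(bits - 8, 0),
        # Small residues of the modulus.
        "mod_3": n % 3,
        "mod_4": n % 4,
    }


# Toy modulus (61 * 53); real RSA moduli are 1024 bits or more.
print(modulus_features(61 * 53))
```

Collected over many keys from one source, the histogram of such features forms a kind of fingerprint that can be compared against the histograms of known libraries.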

Q: Is a single key enough to identify the source library?
A: Sometimes yes, but mostly no. If you have 5 keys from the same source, the classification will be quite accurate. Try our automatic tool at http://crcs.cz/rsapp/
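Why more keys help can be sketched with a simple naive Bayes style combination. This is an illustration under stated assumptions, not a description of how the online tool works internally: the function name, the group labels and the likelihood values are all hypothetical, the keys are assumed to be independent samples from the same source, all groups are assumed equally likely beforehand, and every likelihood is assumed to be strictly positive.

```python
import math


def combine_keys(per_key_likelihoods):
    """Combine per-key evidence into one probability per library group.

    per_key_likelihoods: list of dicts mapping a group label to
    P(observed features of this key | group). All values must be > 0.
    """
    groups = per_key_likelihoods[0].keys()
    # Sum log-likelihoods so that many small numbers do not underflow.
    log_scores = {
        g: sum(math.log(key[g]) for key in per_key_likelihoods) for g in groups
    }
    # Subtract the maximum before exponentiating, then normalise to sum to 1.
    peak = max(log_scores.values())
    weights = {g: math.exp(s - peak) for g, s in log_scores.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


# Hypothetical evidence: each key slightly favours group "A" over "B".
one_key = [{"A": 0.6, "B": 0.4}]
five_keys = one_key * 5
print(combine_keys(one_key))
print(combine_keys(five_keys))
```

A weak preference from a single key compounds when several keys from the same source point the same way, which is why a handful of keys classifies far better than one.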

Q: Can all libraries be distinguished from one another?
A: Not always. Source libraries that introduce exactly the same bias into the generated public moduli are indistinguishable. At the moment, we have 13 different groups of libraries.

Q: Can I also identify the version of the library used?
A: Sometimes. A new version of a library that did not change the source code of the key generation method is not distinguishable from the older one. E.g., OpenSSL 1.0.2f is not distinguishable from OpenSSL 1.0.2g, but OpenSSL 1.0.2g is distinguishable from OpenSSL 2.0.12 FIPS.

Q: Have you tested all the libraries in the world?
A: No. We tested a lot of them, but not all. We also did not test all possible versions of each library. We are also missing hardware sources such as SSL accelerators (please contact us if you have one and would like to contribute).

Q: How quickly will the information leakage you found be fixed?
A: Probably not soon. The fix would require changing the code of the key generation method in most libraries, and developers don't like to touch that part of the crypto too often. Even if it is fixed in new versions, a lot of old legacy libraries will remain in use for a long time.

Q: So how can I protect my key(s)?
A: If you need just one key, it is easy: generate 5 keys instead of one, let our tool (http://crcs.cz/rsapp/) classify all of them, and then keep the one that is classified with the least accuracy. If you need to keep more keys, it is slightly more tricky, but it can still be done (with more keys generated and discarded).
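The selection step in this answer can be sketched in a few lines of Python. This is only a sketch under assumptions: `classify` stands in for whatever classifier you use (for example, results copied from the online tool) and is assumed to return a mapping from library group to probability, and "classified with the least accuracy" is interpreted here as "has the lowest top probability". The key labels and scores below are made up for the demonstration.

```python
def least_identifiable(candidates, classify):
    """Return the candidate whose most likely source group is least certain.

    candidates: iterable of keys (any hashable representation).
    classify:   hypothetical callable, key -> {group label: probability}.
    """
    return min(candidates, key=lambda key: max(classify(key).values()))


# Made-up classification results for five freshly generated keys.
fake_results = {
    "key1": {"A": 0.70, "B": 0.30},
    "key2": {"A": 0.55, "B": 0.45},
    "key3": {"A": 0.90, "B": 0.10},
    "key4": {"A": 0.40, "B": 0.60},
    "key5": {"A": 0.80, "B": 0.20},
}
print(least_identifiable(fake_results, fake_results.__getitem__))
```

Here "key2" is kept, because its best guess (0.55) is the weakest of the five; the other four keys would be discarded.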

Q: Is the data you gathered and used publicly available?
A: Definitely! Download everything in the datasets section and try your own analysis. Please don't forget to cite our USENIX paper if you use it.

Q: I want to know more details!
A: Great, then read the original paper, and the technical report for even more details.