The State of OpenSSL for pyca/cryptography

215 points by SGran 6 days ago | 59 comments

formerly_proven 6 days ago |
> Finally, taking an OpenSSL public API and attempting to trace the implementation to see how it is implemented has become an exercise in self-flagellation. Being able to read the source to understand how something works is important both as part of self-improvement in software engineering, but also because as sophisticated consumers there are inevitably things about how an implementation works that aren’t documented, and reading the source gives you ground truth. The number of indirect calls, optional paths, #ifdef, and other obstacles to comprehension is astounding. We cannot overstate the extent to which just reading the OpenSSL source code has become miserable — in a way that both wasn’t true previously, and isn’t true in LibreSSL, BoringSSL, or AWS-LC.
OpenSSL code was not pleasant or easy to read even in v1 though and figuring out what calls into where under which circumstances when e.g. many optimized implementations exist (or will exist, once the many huge perl scripts have generated them) was always a headache with only the code itself. I haven't done this since 3.0 but if it regressed so hard on this as well then it has to be really quite bad.
ak217 6 days ago |
I have a hacky piece of code that I used with OpenSSL 1.x to inspect the state of digest objects. This was removed from the public API in 3.0 but in the process of finding that out I took a deep dive in the digests API and I can confirm it's incomprehensible. I imagined there must be some deep reason for the indirection but it's good to know the Cryptography maintainers don't think so.
Speaking of which, as a library developer relying on both long established and new Cryptography APIs (like x.509 path validation), I want to say Alex Gaynor and team have done an absolutely terrific job building and maintaining Cryptography. I trust the API design and test methodology of Cryptography and use it as a model to emulate, and I know their work has prevented many vulnerabilities, upleveled the Python ecosystem, and enabled applications that would otherwise be impossible. That's why, when they express an opinion as strong as this one, I'm inclined to trust their judgment.
Xraider72 6 days ago |
The whole 3.0 rewrite is a massive regression in all ways possible - they deprecated the old engines and replaced them with providers, and they are not that much easier to work with as a developer (I hope that providers are at least easier for the maintainers to handle) and the library is a lot more runtime dynamic (for some reason). This has resulted in mutex explosion and massive performance regressions in every facet. haproxy has an amusing article on the topic.
https://www.haproxy.com/blog/state-of-ssl-stacks
People who need cryptography but on the openssl API should be using aws-lc and seek a TLS stack elsewhere.
commandlinefan 5 days ago |
> OpenSSL code was not pleasant or easy to read
I published a book about SSL a while back - my original plan for the book was going to be to work through the OpenSSL source code and relate each piece back to the relevant specifications, step by step. I found that the OpenSSL source code was so complex that I would have spent a lot more time discussing the intricacies of the C code itself than the cryptographic algorithms they were implementing - so much so that it made more sense to just write my own SSL implementation and walk through that.
In fairness to OpenSSL, though, I can see how and why it got so complex: they're trying to be all things to all people in a backwards compatible way. At the time, OpenSSL still had support for SSLv2, an albatross that LibreSSL doesn't have around its neck.
woodruffw 6 days ago |
I think this part is really worth engaging with:
> Later, moving public key parsing to our own Rust code made end-to-end X.509 path validation 60% faster — just improving key loading led to a 60% end-to-end improvement, that’s how extreme the overhead of key parsing in OpenSSL was.
> The fact that we are able to achieve better performance doing our own parsing makes clear that doing better is practical. And indeed, our performance is not a result of clever SIMD micro-optimizations, it’s the result of doing simple things that work: we avoid copies, allocations, hash tables, indirect calls, and locks — none of which should be required for parsing basic DER structures.
I was involved in the design/implementation of the X.509 path validation library that PyCA cryptography now uses, and it was nuts to see how much performance was left on the ground by OpenSSL. We went into the design prioritizing ergonomics and safety, and left with a path validation implementation that's both faster and more conformant[1] than what PyCA would have gotten had it bound to OpenSSL's APIs instead.
[1]: https://x509-limbo.com
some_furry 6 days ago |
Now I wonder how much performance is being left on the table elsewhere in the OpenSSL codebase...
Avamander 6 days ago |
Given the massive regression with 3.x alone, you'll probably be happier if you don't know :/
Xraider72 6 days ago |
haproxy has an article on the subject
https://www.haproxy.com/blog/state-of-ssl-stacks
TLDR - on the TLS parts, quite a lot, up to 2x slower on certain paths. Amusingly, openssl 1.1 was much faster.
libcrypto tends to be quite solid though, though over the years, other libraries have collected weird SIMD optimizations that enable them to beat openssl by healthy margins.
tialaramex 6 days ago |
It is extremely common that a correct implementation also has excellent performance.
Also, even if somebody else can go faster by not being correct, what use is the wrong answer? https://nitter.net/magdraws/status/1551612747569299458
jmspring 6 days ago |
I’d say correct common path. OpenSSL due to hand waving deals with a lot of edge cases the correct path doesn’t handle. Even libraries like libnss suffers from this.
nine_k 5 days ago |
Are these edge cases correct to the spec, or not?
pseudohadamard 5 days ago |
Yes.
The spec is often such a confused mess that even the people who wrote it are surprised by what it requires. One example was when someone on the PKIX list spent some time explaining to X.509 standards people what it was that their own standard required, which they had been unaware of until then.
rout39574 5 days ago |
Got any links to that conversation? Sounds fun.
pseudohadamard 5 days ago |
Technically yes because I saved the messages, which I saw as a fine illustration of the state of the PKI standards mess. However I'd have to figure out which search term to use to locate them again ("X.509" probably won't cut it). I'll see what I can do.
pseudohadamard 4 days ago |
I'm drawing a blank on it sorry, it's somewhere in an archive of messages but I can't find the appropriate search term to locate it. However it did turn up a reference to something else, namely this, https://www.cs.auckland.ac.nz/~pgut001/pubs/x509guide.txt. It hasn't been updated for a long time but it does document some of the crazy that's in those standards. The various Lovecraft references I think are quite appropriate.
woodruffw 5 days ago |
There are multiple overlapping specifications for things like X.509. There are the RFCs (3280 and 5280 are the "main" ones) which OpenSSL generally targets, while the Web PKI generally tries to conform to the CABF BRs (which are almost a perfect superset of RFC 5280).
RFC 5280 isn't huge, but it isn't small either. The CABF BRs are massive, and contain a lot of "policy" requirements that CAs can be dinged for violating at issuance time, but that validators (e.g. browsers) don't typically validate. So there's a lot of flexibility around what a validator should or shouldn't do.
woodruffw 6 days ago |
> It is extremely common that a correct implementation also has excellent performance.
I think that's true in general, but in the case of X.509 path validation it's not a given: the path construction algorithm is non-trivial, and requires quadratic searches (e.g. of name constraints against subjects/SANs). An incorrect implementation could be faster by just not doing those things, which is often fine (for example, nothing really explodes if an EE doesn't have a SAN[1]). I think one of the things that's interesting in the PyCA case is that it commits to doing a lot of cross-checking/policy work that is "extra" on paper but stills comes out on top of OpenSSL.
[1]: https://x509-limbo.com/testcases/webpki/#webpkisanno-san
RiverCrochet 6 days ago |
Remember LibreSSL? That was borne of Heartbleed IIRC, and I remember presentation slides saying there was stuff in OpenSSL to support things like VAX, Amiga(?) and other ancient architectures. So I wonder if some of the things are there because of that.
tialaramex 6 days ago |
The Amigans really like their system. So they kept using them long after mainstream users didn't care. By now there probably aren't many left, but certainly when LibreSSL began there are still enough Amigans, actually using an Amiga to do stuff like browse web pages at least sometimes, that OpenSSL for Amiga kinda makes sense.
I mean, it still doesn't make sense, the Amigans should sort out their own thing, but if you're as into stamp collecting as OpenSSL is I can see why you'd be attracted to Amiga support.
Twenty years ago, there are Amigans with this weird "AmigaOne" PowerPC board that they've been told will some day hook to their legitimate 20th century Commodore A1200 Amiga. Obviously a few hundred megahertz of PowerPC is enough to attempt modern TLS 1.0 (TLS 1.1 won't be out for a while yet) and in this era although some web sites won't work without some fancy PC web browser many look fine on the various rather elderly options for Amigans and OpenSSL means that includes many login pages, banking, etc.
By ten years ago which is about peak LibreSSL, the Amigans are buying the (by their standards) cheaper AmigaOne 500, and the (even by their standards) expensive AmigaOne X5000. I'd guess there are maybe a thousand of them? So not loads, but that's an actual audience. The X5000 has decent perf by the standards of the day, although of course that's not actually available to an Amiga user, you've bought a dual-core 64-bit CPU but you can only use 32 bit addressing and one core because that's Amiga.
fanf2 6 days ago |
Most of the performance regressions are due to lots of dynamic reconfigurability at runtime, which isn’t needed for portability to ancient systems. (Although OpenSSL is written in C it has a severe case of dynamic language envy, so it’s ironic that the pyca team want a less dynamic crypto library.)
Avamander 6 days ago |
I'm glad that they're considering getting rid of OpenSSL as a hard dependency. I've built parts of pyca/cryptography with OpenSSL replaced or stripped out for better debugging. OpenSSL's errors just suck tremendously. It shouldn't be tremendously difficult for them to do it for the entire package.
Though I'd also love to see parts of pyca/cryptography being usable outside of the context of Python, like the X.509 path validation mentioned in other comments here.
Retr0id 6 days ago |
By the way, pyca/cryptography is a really excellent cryptography library, and I have confidence that they're making the right decisions here. The python-level APIs are well thought-out and well documented. I've made a few minor contributions myself and it was a pleasant experience.
And my personal "new OpenSSL APIs suck" anecdote: https://github.com/openssl/openssl/issues/19612 (not my gh issue but I ran into the exact same thing myself)
> I set out to remove deprecated calls to SHA256_xxx to replace them with the EVP_Digestxxx equivalent in my code. However it seems the EVP code is slow. So I did a quick test (test case B vs C below), and it is indeed about 5x slower.
amluto 5 days ago |
This is amazing.
Once upon a time, OpenSSL was the place to go for crypto primitives that were hardware specific and well optimized, and you would pay the price of using a nasty API. Now it’s an even nastier API and it’s not even fast anymore?
SHA256 is almost the prototype of a pure function. There should not be concepts like “EVP”. Output sizes should be static. Failure should be entirely impossible unless I opt in to using an async interface for an async accelerator. The only complexity should be the hidden part that selects the best synchronous implementation.
Retr0id 5 days ago |
I can see the value of the EVP API for certain use cases, I just wish they didn't deprecate the not-EVP API.
AlotOfReading 5 days ago |
Not to take away from the point that it's overcomplicated, but a lot of the complications come from trying to expose a generic interface. For example, digests can have variable output lengths like BLAKE2 and BLAKE3 do.
The default BLAKE2 provider doesn't actually support that functionality, but my provider for BLAKE3 does. I don't think anyone uses it though. I haven't updated that provider since they changed the internal API for setting it and no one's complained yet.
oconnor663 5 days ago |
Another important parameter with BLAKE3 is "Do you want to use threads?" There's no one-size-fits-all answer to that question, but it's also a parameter that ~no other hash function needs.
Fwiw, I think the RustCrypto effort also tends to suffer a bit from over-abstraction. Once every year or two I find myself wanting to get a digest from something random, let's say SHAKE128. So I pull up the docs: https://docs.rs/sha3/latest/sha3/type.Shake128.html. How do you instantiate one of those? I genuinely have no idea. When I point Claude at it, it tells me to use `default` instead of `new` and also to import three different traits. It feels like these APIs were designed only for fitting into high-level frameworks that are generic over hash functions, and not really for a person to use.
There are a lot of old assumptions like "hash functions are padding + repeated applications of block compression" that don't work as well as they used to. XOFs are more common now, like you said. There's also a big API difference between an XOF where you set the length up front (like BLAKE2b/s), and one where you can extract as many bytes as you want (like BLAKE3, or one mode of BLAKE2X).
Maybe the real lesson we should be thinking about is that "algorithm agility" isn't as desirable as it once was. It used to be that a hash function was only good for a decade or two (MD5 was cracked in ~13 years, but it was arguably looking bad after just 6), so protocols needed to be able to add support for new ones with minimal friction. But aside from the PQC question (which is unlikely to fit in a generic framework with classic crypto anyway?), it seems like 21st century primitives have been much more robust. Protocols like WireGuard have done well by making reasonable choices and hardcoding them.
dfajgljsldkjag 6 days ago |
It is honestly surprising that OpenSSL has been the standard for so long given how difficult it is to work with. I think moving the backend to Rust is probably the right move for long term stability.
ameliaquining 6 days ago |
Note that all cryptographic primitives are still going to be in C via an OpenSSL-like API for the next while; the current proposal is to migrate from OpenSSL to one of its forks. Various bits of backend logic that aren't cryptographic primitives (e.g., parsing) have been rewritten in Rust; additionally, https://github.com/ctz/graviola is mentioned near the end as a possible implementation of cryptographic primitives in a combination of Rust and assembly (without any C), but it's not especially mature yet.
ForHackernews 5 days ago |
Whatever happened to libsodium that was designed to provide vetted cryptographic primitives with a modern API? https://doc.libsodium.org/doc
SAI_Peregrinus 5 days ago |
Still great, but the primitives it provides aren't necessarily the ones you need.
amluto 5 days ago |
libsodium is great if you only need to interoperate with other software using libraries that sound like “sodium”. This is generally not helpful for the kinds of things that people use OpenSSL for.
tialaramex 6 days ago |
A couple of things probably made this more likely for OpenSSL than for other libraries, though I think this phenomenon (sticking with a famous library which just isn't very good) is overall just much more common than most people appreciate
1. OpenSSL is cryptography. We did explicitly tell people not to roll their own. So the first instinct of a programmer who finds X annoying ("Let's just write my own X") is ruled out by this as likely unwise or attracts backlash from their users, "What do you mean you rolled your own TLS implementation?"
2. Even the bits which aren't cryptography are niches likely entirely unrelated to the true interest of the author using OpenSSL. The C++ programmer who needs to do an HTTPS POST but mostly is doing 3D graphics could spend a month learning about the Web PKI, AES, the X.500 directory system and the Distinguished Encoding, or they could just call OpenSSL and not care.
PunchyHamster 6 days ago |
but it isn't "rolling your own" but changing the lib you use.
> The C++ programmer who needs to do an HTTPS POST but mostly is doing 3D graphics could spend a month learning about the Web PKI, AES, the X.500 directory system and the Distinguished Encoding, or they could just call OpenSSL and not care.
they gonna call libcurl, not openssl directly. Tho they might use it still for parsing certs but that's easier to replace
Macha 5 days ago |
Pre all the recent OpenSSL forks the only other options were:
- use the platform sdks which have completely distinct APIs (and so probably aren't supported by everything between you and the TLS connection)
- Use GnuTLS which is GPL and so wasn't suitable for a lot of commercial uses (less important in the age of SaaS to be fair)
tialaramex 5 days ago |
Also, the platform SDKs invariably assume platform preferred semantics which might not be what you wanted if you write cross platform software.
In particular this means you get another source of platform difference, no only does your Windows App work with different peripherals from the Mac App (because of OS drivers), but now some certificates which work with the Mac App don't work in Windows or vice versa. OpenSSL lets you bundle your CA policies with the app and thus avoid that issue (though now it's your choice what is or isn't accepted and you're probably not ready for that labour)
dochtman 5 days ago |
There’s also a compounding effect: I’ve heard from a hardware vendor that they spend a lot of time optimizing OpenSSL to get the most out of their silicon, so for their customers they suggest using OpenSSL to get the most out of the hardware.
toast0 6 days ago |
Around the time of Heartbleed, pretty much nobody else wanted to do the work, and when they did, it was worse (GNU TLS).
The crypto primitives in OpenSSL tend to be pretty good, but the protocol stuff isn't great. x.509 is terrible, so something someone else wrote to deal with it is mighty tempting. TLS protocol isn't as bad, but seeing how many bytes are spent on length can drive someone crazy.
OpenSSL has historically been crap with respect to development compatability[1], but I think the terrible performance in the 3.x series pushed a lot of people over the edge. Do the protocol work, including x.509 in a memory safe language, manage locking yourself and call out to (a fork of) openssl for the crypto.
[1] Heartbleed would have been a lot worse if people weren't slowrolling upgrading to vulnerable versions because upgrading would be a pain
owenthejumper 6 days ago |
The article highlights Haproxy's blog with essentially the same name (from 2025): https://www.haproxy.com/blog/state-of-ssl-stacks
Since that Haproxy has effectively abandoned OpenSSL in favor or AWS-LC. Packages Re still built with both, but AWS-LC is clearly the path forward for them.
captain_coffee 6 days ago |
I had no idea that OpenSSL is in such a bad state.
PunchyHamster 6 days ago |
I was surprised it is still in such bad state even after "rewrite" for 3.0.
timschmidt 5 days ago |
I am not surprised at all, because instead of throwing their support behind the LibreSSL folks who audited the OpenSSL codebase after Heartbleed and found deep design and implementation issues, Linux Foundation and member orgs including most of Silicon Valley decided that OpenSSL just needed more funding.
Felt like good money after bad on day 1.
tptacek 5 days ago |
The problem with the OpenSSL 3 codebase isn't security; many organizations, including the OpenSSL team itself, have been responsible for pulling out of the security rut OpenSSL was in when Heartbleed happened. The OpenSSL 3 problem is something else.
PunchyHamster 3 days ago |
so, incompetence of people writing it
tptacek 3 days ago |
If you don't have anything meaningful to say, you can just not comment.
groby_b 5 days ago |
And once you realize that Management + Finance + Marketing outnumber engineering at OpenSSL [1], you know the money is put to good use, too.
[1]: https://openssl-corporation.org/about/leadership/
timschmidt 5 days ago |
If I were cynical, I'd think that the inscrutable code and resultant security issues were a feature desired by those management and finance types, not a bug. The purpose of a system being what it does, and all.
whizzter 5 days ago |
Seems plenty of the people occur multiple times, so there's more engineers... if only barely :|
groby_b 5 days ago |
If you value somebody so much you show them multiple times, I'm going to assume they're outsized weight in terms of influence and cost, too.
some_furry 5 days ago |
I took Rich Salz resigning from the project as a condemnation of its future, tbh
https://mta.openssl.org/pipermail/openssl-users/2020-July/01...
rurban 5 days ago |
He resigned over the non-removal of the "offensive" word master, not technical issues.
ivanr 5 days ago |
I wrote about OpenSSL's performance regressions in the December issue of Feisty Duck's cryptography newsletter [1]. In addition to Alex's and Paul's talk on Python cryptography, at the recent OpenSSL conference there have been several other talks worth watching:
- William Bellingrath, from Juniper Networks, benchmarked versions from 1.1.1 to 3.4.x https://www.youtube.com/watch?v=b01y5FDx-ao
- Tomáš Mráz wrote about how to get better performance, which, in turn, explains why it's bad by default: https://www.youtube.com/watch?v=Cv-43gJJFIs
- Martin Schmatz from IBM presented about their _very detailed_ study of post-quantum cipher suite performance https://www.youtube.com/watch?v=69gUVhOEaVM
Note: be careful with these recorded talks as they have a piercing violin sound at the beginning that's much louder than the rest. I've had to resort to muting the first couple of seconds of every talk.
[1] https://www.feistyduck.com/newsletter/issue_132_openssl_perf...
teunispeters 5 days ago |
I look forward to crypto libraries not openssl that can provide support for ED25519 and ED448, as well as a wide range of EC keys.
These are requirements for my current work, and OpenSSL 3+ was the only crypto library that delivered.
some_furry 5 days ago |
What do you need Ed448 for? I've not seen much real world deployments of this algorithm, so I'm very curious about this.
adrian_b 5 days ago |
Presumably one would want to use Ed448 in order to achieve for session key establishment or for digital signing a level of security comparable to using for encryption AES with a 256-bit key.
ED25519 has a level of security only comparable with AES with an 128-bit key.
Nowadays many prefer to use for encryption AES or similar ciphers with a 256-bit key, to guard against possible future advances, like the development of quantum computers. In such cases, ED25519 remains the component with the lowest resistance against brute force, but it is less common to use something better than it because of the increase in computational cost for session establishment.
some_furry 4 days ago |
> Presumably one would want to use Ed448 in order to achieve for session key establishment or for digital signing a level of security comparable to using for encryption AES with a 256-bit key.
Ed448 is an instantiation of EdDSA (the Edwards curve digital signature algorithm) over the Edwards448 curve (a Goldilocks curve), as defined in RFC 7748 and RFC 8032.
Key establishment would use X448 (formerly "Curve448") for Diffie-Hellman, although ECDH over Edwards448 is also (strictly speaking) possible.
Using Ed448 for key exchange is a TypeError.
But that's neither here nor there. I was asking about real world applications that need Ed448 specifically, not a vague question of how cryptography works.
Check my blog if you need a temperature check for my familiarity with the subject: https://soatok.blog/tag/security-guidance/
> ED25519 has a level of security only comparable with AES with an 128-bit key.
No. The whole notion of "security levels" is a military meme that doesn't actually meaningfully matter the way people talk about it.
There are about 2^252 possible Ed25519 public keys. Recovering a secret key from Pollard's rho takes about 2^126 or so computations (where each computation requires a scalar multiplication), and that's why people pair it with an equivalent "security level" as AES-128, but the only meaningful difference between the algorithms (besides their performance footprint) is security against multi-user attacks.
With a 256-bit AES key, you can have 2^40 users each choose 2^50 keys and still have a probability of key reuse below 2^-32.
With 128-bit AES keys, you don't have that guarantee. 2^90 keys is well beyond the birthday bound of a 128-bit function, which means the probability of two users choosing the same key is higher than 2^32. (It's actually higher than 50% at 2^90 out of 2^128.)
See also: https://soatok.blog/2024/07/01/blowing-out-the-candles-on-th...
However, despite the "security level" claims, Ed25519 has 2^252 keys. The multi-user security of Ed25519 (and X25519) is meaningfully on par with AES-256.
As things stand today, the 128-bit symmetric cryptography "security level" is unbreakable. You would need to run the entire Bitcoin mining network for on the order of a billion years to brute force an AES-128 key.
> Nowadays many prefer to use for encryption AES or similar ciphers with a 256-bit key, to guard against possible future advances, like the development of quantum computers.
This is a common misunderstanding. So common that I once made the same mistake.
128 bits are enough. https://words.filippo.io/post-quantum-age/#128-bits-are-enou...
Grover's attack requires a quantum circuit size of 2^106.
> In such cases, ED25519 remains the component with the lowest resistance against brute force, but it is less common to use something better than it because of the increase in computational cost for session establishment.
I do not understand what this sentence is trying to say.
fsmv 4 days ago |
The go standard library has an implementation of ed25519 although I did not find ed448 it also has some NIST curves. There are a few libraries that do ed448 like one from cloudflare.
tob_scott_a 4 days ago |
To test a Claude Skill for analyzing cryptographic implementations of cryptographic side-channels ([1] see constant-time-analysis), I had Claude vibe-code an Ed448 implementation.
This includes:
1. The Ed448 signature algorithm
2. The Edwards448 elliptic curve group (which could conceivably be used for ECDH)
3. The Decaf448 prime-order group (a much better target for doing non-EdDSA things with)
I've been putting off reviewing it and making the implementation public (as it was an exercise in "is this skill a sufficient guard-rail against implementation error" more than anything), but if there's any interest in this from the Go community, I'll try to prioritize it later this year.
(I'm not publishing it without approval from the rest of the cryptography team, which requires an internal review.)
But if you're curious about the efficacy of the Skill, it did discover https://github.com/RustCrypto/signatures/security/advisories...
[1] https://github.com/trailofbits/skills
tuetuopay 5 days ago |
> OpenSSL allowed replacing any algorithm at any point in program execution
Just this is completely nuts. What in the world is the usecase for this? Fanatics of hotpatching systems for zero-downtime-ever? No wonder the performance is crippled by locks left and right, it's pure recipe for disaster.
ethin 5 days ago |
Honestly, I rarely ever use openssl these days unless I must. Now, I go for Botan, or cryptography, or monocypher, any number of cryptographic library alternatives that are designed well and are really fast.