And IMHO the coherence and orphan rules have contributed majorly to the quality of the ecosystem.
Without them you can get many additional forms of breakage. Worse, you can get "new" breakage between two third-party crates without either of them changing: some impl in a common ancestor (e.g. std) changes, this affects a wildcard implementation in each crate, and the two now overlap.
When you have an overlap there are two options:
- fail compilation. But as mentioned, this could be caused by a non-breaking change in std that hits two (in theory unrelated) third-party dependencies
- try to choose one of the implementations. But that gets very messy at multiple points: a) which impl to choose when; b) the user knowing which one was chosen; c) interactions with things like double dispatch, thread-local variables, and side effects in general. The issues here are similar to those of specialization (and part of why that is stuck in limbo), but an order of magnitude more complex, as specialization is only (meant) for optimizations, while this can mean deeply different behavior. For example, `foo.bar()` with the same `use Bar as _;` might return a `u32` in one context and a `String` in another
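To make the overlap case concrete, here is a minimal single-crate sketch. The `Describe` trait is made up for illustration and stands in for a trait from a shared dependency. One wildcard impl is fine; a second one over a different bound is rejected, because nothing stops a type from satisfying both bounds (and `i32` does).

```rust
use std::fmt::Display;

// Hypothetical trait, standing in for one defined by a shared dependency.
trait Describe {
    fn describe(&self) -> String;
}

// First wildcard (blanket) impl: covers every sized type that implements Display.
impl<T: Display> Describe for T {
    fn describe(&self) -> String {
        format!("display: {}", self)
    }
}

// A second wildcard impl like the one below is rejected with E0119
// (conflicting implementations), since a type such as i32 is both
// Display and Debug and the compiler could not pick one:
//
// impl<T: std::fmt::Debug> Describe for T {
//     fn describe(&self) -> String {
//         format!("debug: {:?}", self)
//     }
// }

fn main() {
    println!("{}", 42_i32.describe());
    println!("{}", "hello".describe());
}
```

With coherence this conflict shows up where the second impl is written, not later in some downstream crate that happens to pull both in.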
In many other ecosystems it's not uncommon to run into libraries that can't be used together at all. In Rust that is close to not a thing (no_mangle collisions and C dependencies are the only exceptions I can think of).
Similarly, in my experience the likelihood of running into unintended breaking changes is lower in the Rust ecosystem than in e.g. Python or JS, which is partially due to the coherence rules forcing a cleaner design.
Also, people are forced to keep a somewhat clean dependency tree between crates in ways not all languages require. This can help with incremental builds and compile time, an area where Rust needs any help it can get. (As a side note, clean dependency structures in your modules can (sometimes) help Rust parallelize codegen better, too.)
So overall I think it's good.
Though it can be very annoying. And there is potential for improvement in many ways.
---
EDIT: sorry some keyboard fat-fingering somehow submitted a half written response without me pressing enter...
EDIT 2: Fix spelling and sentence structure.
The same problem exists in Rust, but from the other side.
If I use serde for serialization I am effectively locked in to using crates that implement serde traits (or do newtype hacks to define them myself).
If I want to use something more niche than serde, I essentially lose access to all the popular crates as they only implement serde traits.
one you solve when initially writing code (so you can properly account for it and control it)
instead of a problem which can blow up when you update a package for a very pressing security fix
in the end it's a question of what is more important: stability, or the option to monkey patch functionality into your dependencies without changing them
and given that you can always patch crates the non-monkey way (Rust makes vendoring dependencies relatively easy in case upstream doesn't fix things), I prefer the stability aspect (though if you do patch crates you re-introduce many of the issues in a different place, with the main difference being that there is a chance to upstream your changes)
It's so easy to forget about the problems we don't have because of the (good) choices people have made in the past.
Let's say you have one library with:
pub struct TypeWithSomeSerialization { /* public fields here */ }
And you want to define a custom serialization. In this case, you can write:
pub struct TypeWithDifferentSerialization(TypeWithSomeSerialization);
Then you just implement Serialize and Deserialize for TypeWithDifferentSerialization. This covers most of the occasional cases where you need to work around the orphan rule. And semantically, it's pretty reasonable: if a type behaves differently, then it really isn't the same type.
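A std-only sketch of the same newtype move, so it runs without serde: `fmt::Display` plays the role of the foreign trait and `Vec<u8>` the role of the foreign type. `HexBytes` is a made-up name.

```rust
use std::fmt;

// Vec<u8> and fmt::Display both come from std, so writing
// `impl fmt::Display for Vec<u8>` here would be an orphan impl.
// The newtype is local to this crate, so the impl below is allowed.
struct HexBytes(Vec<u8>);

impl fmt::Display for HexBytes {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Write every byte as two lowercase hex digits.
        for byte in &self.0 {
            write!(f, "{:02x}", byte)?;
        }
        Ok(())
    }
}

fn main() {
    let bytes = HexBytes(vec![0xde, 0xad, 0xbe, 0xef]);
    println!("{}", bytes);
}
```

The cost is the usual one: anything that wants the plain `Vec<u8>` needs `.0` or a conversion.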
The alternative is to have a situation where you have library A define a data type, library B define an interface, and library C implement the interface from B for the type from A. Very few languages actually allow this, because you run into the problem where library D tries to do the same thing library C did, but does it differently. There are workarounds, but they add complexity and confusion, which may not be worth it.
pub struct TypeWithDifferentSerialization<'a>(&'a TypeWithSomeSerialization);
You can implement `Serialize` for a wrapper type and still serialize `SomeOtherTypeWithSomeSerialization` (which might be used by the type being wrapped, directly or indirectly) differently. It might not be derivable, of course, but "I don't want the default" sort of makes that a given.
Rust: if you spent 3 weeks understanding the syntax and borrow-checker, here are all of the other problems, and the list keeps growing.
Man this cracks me up.
In most other languages, it is simply not possible to “add” an interface to a class you don’t own. Rust lets you do that if you own either the type or the interface. That’s strictly more permissive than the competition.
The reasons those other languages have for not letting you add your interface to foreign types, or extend them with new members, are exactly the same reasons that Rust has the orphan rule.
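For anyone who hasn't seen both directions, a small sketch. `Shout` and `Celsius` are names made up for this example; `str`, `f64` and `From` come from std.

```rust
// Direction 1: we own the trait, the type (str) is foreign.
trait Shout {
    fn shout(&self) -> String;
}

impl Shout for str {
    fn shout(&self) -> String {
        format!("{}!", self.to_uppercase())
    }
}

// Direction 2: we own the type, the trait (From) is foreign.
struct Celsius(f64);

impl From<Celsius> for f64 {
    fn from(c: Celsius) -> f64 {
        c.0
    }
}

fn main() {
    println!("{}", "hello".shout());
    println!("{}", f64::from(Celsius(21.5)));
}
```

What is not allowed is the third combination, where both the trait and the type live in other crates.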
Rust pays for coherence up front with wrapper types and boilerplate, which is ugly, but the alternative is the kind of ambient monkeypatching that makes APIs hard to reason about once a codebase gets large. A narrow escape hatch might be worth trying, but a global 'disable it' switch sounds like repo poison.
> An interesting outcome of removing coherence and having trait bound parameters is that there becomes a meaningful difference between having a trait bound on an impl or on a struct:
This seems unfortunate to me.
You depend on crates A and B. A impls Foo for Bar. You pass an instance of Bar to a function that accepts `impl Foo`. You are happy. Later crate B adds an impl of Foo for Bar. Clearly _at least_ one of these must be an orphan impl, but both could be. Suddenly it's ambiguous which implementation of Foo you're talking about, so you break because B added an impl.
There are many potential problems of this flavor with letting any `impl Trait for Type` be an orphan impl and then referenced by path. What happens, for example, if an impl that was an orphan impl in one version of A becomes a coherent impl in a later version of A?
I think there has to be special syntax for named/path-referenced/symbolic impls, even if the impl does not have an identifier name, so that the compiler can know "this impl only resolves if you tell me _specifically this impl_" and the impl provider has a way to create a solid consumer contract about how to use that impl in particular.
Also, not having an identifier name would mean you can't have different impls of Foo for Bar in the same module. That's probably not a limitation anyone would care about, but it's there.
I also don't see an issue with having multiple impls of the same trait, as long as they don't provide duplicate items inside a module. I often do multiple impl blocks to break up larger logic and organize docs, though this is generally not for trait impls, but I don't see why it couldn't be.
Let me be clear though, I'm not saying this is necessarily the best path forward on the coherence/orphan situation, just offering a minor critique of the blog post's position. This is a famously tricky issue, and I suspect there is no silver bullet here. Though I have always wanted some way to add flexibility to the orphan rule.
I think that's fine. Same as what happens if B adds a new function with the same name as a function in A that you were using unqualified.
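Current Rust can't express competing orphan impls, but it does already have a close cousin of this situation: two traits in scope that both provide a method with the same name. A sketch, with `FromA`, `FromB` and `Widget` as made-up names standing in for items from crates A and B:

```rust
// Two hypothetical traits, imagined as coming from crates A and B.
trait FromA {
    fn name(&self) -> String;
}

trait FromB {
    fn name(&self) -> String;
}

struct Widget;

impl FromA for Widget {
    fn name(&self) -> String {
        "A's view of Widget".to_string()
    }
}

impl FromB for Widget {
    fn name(&self) -> String {
        "B's view of Widget".to_string()
    }
}

fn main() {
    let w = Widget;
    // `w.name()` does not compile while both traits are in scope
    // (E0034, multiple applicable items). Naming the trait resolves it.
    println!("{}", FromA::name(&w));
    println!("{}", <Widget as FromB>::name(&w));
}
```

The difference is that here the ambiguity is visible at the call site and has a fix the caller controls; whether the same would hold for path-referenced impls depends on how they get resolved.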
> What happens, for example, if an impl that was an orphan impl in one version of A becomes a coherent impl in a later version of A?
Nothing much?
When something near the bottom needs work, should there be a process for fixing it, which is a people problem? Or should there be a mechanism for bypassing it, which is a technical solution to a people problem? This is one of the curses of open source. The first approach means that there will be confrontations which must be resolved. The second means a proliferation of very similar packages.
This is part of the life cycle of an open source language. Early on, you don't have enough packages to get anything done, and are grateful that someone took the time to code something. Then it becomes clear that the early packages lacked something, and additional packages appear. Over time, you're drowning in cruft. In a previous posting, I mentioned ten years of getting a single standard ISO 8601 date parser adopted, instead of six packages with different bugs. Someone else went through the same exercise with Javascript.
Go tends to take the first approach, while Python takes the second. One of Go's strengths is that most of the core packages are maintained and used internally by Google. So you know they've been well-exercised.
Between Github and AI, it's all too easy to create minor variants of packages. Plus we now have package supply chain attacks. Curation has thus become more important. At this point in history, it's probably good to push towards the first approach.
When you're the one building something, I find the orphan rule frustrating because you can end up stuck, unable to help yourself without forking half of the crates in the ecosystem.
Looking for improvements upstream, even with the absolute best solutions for option 1, has the fundamental downside that you can't unstick yourself.
With AI this pace difference is even more noticeable.
I do think that the way that Scala approaches this by using imports historically was quite interesting. Using a use statement to bring a trait definition into scope isn't discussed in any of these proposals I think?
The article author does talk about naming trait impls and how to use them at call sites, but never seems to consider the idea that you could import a trait impl and use it everywhere within that scope, without extra onerous syntax.
Does this still solve the "HashMap" problem though? I guess it depends on when the named impl "binds". E.g. the named Hash impl would have to bind to the HashMap itself at creation, not at calls to `insert()` or `get()`. Which... seems like a reasonable thing?
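As a point of comparison only (this is not the named-impl feature being discussed): std's `HashMap` already binds its hashing algorithm at creation, through the `S: BuildHasher` type parameter, so `insert()` and `get()` can never disagree about it. The sketch below assumes a hand-written FNV-1a hasher; `Fnv1a`, `FnvMap` and `build_map` are made-up names.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Minimal FNV-1a hasher, written out here only for illustration.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // 64-bit FNV offset basis
    }
}

impl Hasher for Fnv1a {
    fn finish(&self) -> u64 {
        self.0
    }

    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // 64-bit FNV prime
        }
    }
}

// The hashing strategy is part of the map's type, fixed when the map is made.
type FnvMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

fn build_map() -> FnvMap<&'static str, u32> {
    let mut map = FnvMap::default();
    map.insert("one", 1);
    map.insert("two", 2);
    map
}

fn main() {
    let map = build_map();
    println!("{:?}", map.get("two"));
}
```

Note that this only pins the hasher, not the key's `Hash` impl, which is still the single coherent one. A named `Hash` impl would presumably need to become part of the map type in a similar way.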
So once you've identified this, now you might consider the universe of possible solutions to the problem. One of those solutions might be removing existentials from your language; think about how Scala would work if implicits were removed (I haven't used Scala 3, maybe this happened?). Another solution might be to decouple the whole concept of "existential implementations of typed extension points" from libraries (or crates, or however you compile and distribute code), and require bringing instances into scope via imports or similar.
Two things are true for sure, though: libraries already depend on the current behavior, whether that makes sense or not; and forcing users to understand coherence (which instance is used by which code) is almost always a giant impediment to getting users to like your language. Hence, "orphan rules", and why everyone hates Scala 2 implicits.
That said, I would love to see a solution in my favorite class of solution: where library authors can use and benefit from this, but the average user doesn't have to notice.
I tend to think that the non-existential Scala system was _so close_, and that if you _slightly_ tweaked the scoping rules around it, you could have something great.
For example, if - as a user - I could use `.serialize(...)` from some library and it used _their_ scoped traits by default, but if I _explicitly_ (named) imported some trait(s) on my side, I could substitute my own, that'd work great.
You'd likely want to pair it with some way of e.g. allowing a per-crate prelude of explicit imports that you can ::* import within the crate to override many things at once, but... I think that with the right tweaks, you could say 'this library uses serde by default, but I can provide my own Serializer trait instead... and perhaps, if I turn off the serde Cargo feature, even their default scoped trait disappears'.
I don't think it's a people problem in the way we usually talk about the folly of creating technical solutions to people problems.
If something like serde is foundational, you simply can't radically change it without causing problems for lots and lots of people. That's a technical problem, not a people problem, even if serde needs radical change in order to evolve in the ways it needs to.
But sure, ok, let's imagine that wasn't the case. Let's say some new group of people decide that serde is lacking in some serious way, and they want to implement their changes. They can even do so without breaking compatibility with existing users of the crate. But the serde maintainers don't see the same problems; in fact, they believe that what this new group wants to do will actively cause more problems.
Neither group of people even needs to be right or wrong. Maybe both ways have pluses and minuses, and choosing just depends on what trade offs you value more. Neither group is wrong about wanting to either keep the status quo or make changes.
This is actually a technical problem: we need to find a way to allow both approaches to coexist, without causing a ton of work for everyone else.
And even if we do run into situations where things need fixing, and things not getting fixed is a people problem, I'd argue for this particular sort of thing it's not only appropriate but essential that we have technical solutions to bypass the people problems. I mean, c'mon. People are people. People are going to be stubborn and not want change. Ossification is a real thing, and I think it's a rare project/organization that's able to avoid it. Sure, we could refuse to use technical workarounds when it's people we need to change, but in so many cases, that's just running up against a brick wall, over and over. Why do that to ourselves? Life is too short.
Having said that, I totally agree that there are situations where technical workarounds to people problems can be incredibly counter-productive, and cause more problems than they solve (like, "instead of expecting people to actually parent their kids, force everyone to give up their privacy for mandatory age verification; think of the children!"). But I don't think this is one of them.
In many languages, if you want to integrate package A with package B, you can make and share a package AB, which people can reuse. That scales, and facilitates reuse, and avoids either package having to support everything.
In Rust, if the integration involves traits, integration between package A and package B must happen either in A or in B. That creates a scaling problem, and a social problem.
AFAIK, it’s not really very common to be able to extend foreign types with new interfaces, especially not if you own neither.
C++ can technically do it using partial specialization, but it’s not exactly nice, and results in UB via ODR violation when it goes wrong (say you have two implementations of a `std::hash` specialization, etc.). And it only works for interfaces that are specifically designed to be specialized this way - not for vanilla dynamic dispatch, say.
Most integration libraries on NuGet (aka C#'s cargo) are AB-type libraries.
E.g.
- DI container: Autofac
- Messaging library: MediatR
- Integration: MediatR.Extensions.Autofac.DependencyInjection
There are many examples of popular libraries like this in that world.
There are only like 3 significant languages with trait-based generics, and both the other ones have some way of providing orphan instances (Haskell by requiring a flag, Scala by not having a coherence requirement at all and relying on you getting it right, which turns out to work out pretty well in practice).
More generally it's an extremely common problem to have in a mature language; if you don't have a proper fix for it then you tend to end up with awful hacks instead. Consider e.g. https://www.joda.org/joda-time-hibernate/ and https://github.com/FasterXML/jackson-datatype-joda , and note how they have to be essentially first party modules, and they have to use reflection-based runtime registries with all the associated problems. And I think that these issues significantly increased the pressure to import joda-time into the JVM system library, which ultimately came with significant downsides and costs, and in a "systems" language that aims to have a lean runtime this would be even worse.
When I used to write Scala, I accepted the fact that I don't have a background in type/set/etc. theory, and that there were some facets of the language that I'd probably never understand, and some code that others had written that I'd probably never understand.
With a language like Rust, I feel like we're getting there. Certain GAT syntaxes sometimes take some time for me to wrap my head around when I encounter them. Rust feels like it shouldn't be a language where you need to have some serious credentials to be able to understand all its features and syntax.
On the other end we have Go, which was explicitly designed to be easy to learn (and, unrelatedly, I don't like for quite a few reasons). But I was hoping that we could have a middle ground here, and that Rust could be a fully-graspable systems-level language.
Then again, for more comparison, I haven't used C++ since before they added lambdas. I wonder if C++ has some hairy concepts and syntax today on par with Rust's more difficult parts.
https://tartanllama.xyz/posts/cpp-initialization-is-bonkers/
But out here on this miserable old Earth I happen to think that Rust’s errors are pretty great. They’re usually catching things I didn’t actually intend to do, rather than preventing me from doing those things.
As it happens, you are replying to the person who made Rust's errors great! (it wasn't just them of course, but they did a lot of it)
Rust had a better start, not the least because it wasn’t designed on top of an existing language like C++ was, but who knows what it will look like in 30 years.
… … … … Unqualified name lookup has been challenging in C++ since even before C++11. Overload resolution rules are so painful that it took me weeks to review a patch simply because I had to back out of trying to make sense of the rules in the standard. There are several slightly different definitions of initialization. If you really want to get in the weeds, start playing around with std::launder and std::byte and strict aliasing rules and lifetime rules, and you'll yearn for the simplicity of Rust.
C++ is the absolute most complex of any of the languages whose specifications I have read, and that's before we get into the categories of things that the standard just gives up on.
Annotations like std::launder, lifetime manipulation, etc solve a class of problems that exist in every systems language. They inform the compiler of properties that cannot be known by analyzing the code. Rust isn't special in this regard, it has the same issues.
Without these features, we either relied on unofficial compiler-specific behavior or used unnecessarily conservative code that was safe but slower.
This is both fundamentally true and misleading. Rust has to solve the same issues but isn't obliged to make all the same bad choices to do that and so the results are much better.
For example, C++ dare not perform compile-time transmutations, so it just forbids them, and a whole bunch of extra stuff landed to work around that. In Rust they're actually fine, so you can just:
const FOO: bool = unsafe { core::mem::transmute::<i8, bool>(2) };
That blows up at compile time because we claimed the bit pattern for the integer 2 is a valid boolean and it isn't. If we choose instead 0 (or 1) this works and we get the expected false (or true) boolean instead of a compiler diagnostic.
C++ could allow this but it doesn't. Rather than figure out all the tricky edge cases, they just said no, use this other new thing we made.
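The working variants of the transmute above, as a compilable sketch. The constant names are made up, and the failing line is left commented out so the file builds.

```rust
use core::mem::transmute;

// 0 and 1 are the only valid bit patterns for bool, so these evaluate fine
// during constant evaluation.
const NO: bool = unsafe { transmute::<i8, bool>(0) };
const YES: bool = unsafe { transmute::<i8, bool>(1) };

// Uncommenting this is expected to fail the build, because 2 is not a
// valid bool and constant evaluation checks the resulting value:
// const BAD: bool = unsafe { transmute::<i8, bool>(2) };

fn main() {
    println!("{} {}", NO, YES);
}
```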
I am confused by this assertion. You can abuse the hell out of transformations in a constexpr context. The gap between what is possible at compile-time and run-time became vanishingly small a while ago.
I think your example is not illustrative in any case. Many C++ code bases work exactly like your example, enforced at compile-time. That this can be an issue is a hangover from retaining compatibility with C-style code which conflates comparison operators and cast operators. It is a choice.
C++ can enforce many type constraints beyond this at compile-time that Rust cannot, with zero effort or explicit type creation. No one should be passing ints around.
const bool z = (const bool)((int8_t)2);
Is perfectly valid C++.
It's not insane, it's just ... melt-inducing.
Articles discussing new features always have difficult syntax. There have been proposals like this going on from the start.
Fortunately the language team is cognizant of the syntax and usability issues with proposals. There have been a lot of proposals that started off as very unwieldy syntax but were iterated for years until becoming more ergonomic.
Both better and worse.
The current version of idiomatic C++ is much cleaner, more concise, and more powerful than the version of C++ you are familiar with. You don't need C-style macros anymore. The insane template metaprogramming hacks are gone. Some important things that were problematic to express in C++ (and other systems languages to be fair) are now fully defined e.g. std::launder. C++ now has expansive compile-time programming features, which is killer for high-performance systems code, and is more expressive than Rust in important ways.
The bad news is that this was all piled on top of and in addition to the famous legacy C++ mess for backward compatibility. If you are mixing and matching ancient C++ with modern C++, you are going to have a bad time. That's the worst of all worlds.
But if you are lucky enough to work with e.g. an idiomatic C++20 code base, it is a much simpler and better language than legacy C++. I've done a few major C++ version upgrades to code bases over the years; the refactored code base was always smaller, cleaner, safer, and easier to maintain than the old version.
You can mitigate it with some practices but that this is even necessary is a crime. Initialization is one of the most basic things in software development. How do you fuck it up so badly?
On a day to day basis it doesn’t cause me issues but it offends me just on principle.
How does the Rust language team weigh the benefits of solving user problems with new language features against the resulting increased complexity? When I learned Rust, I found it to be quite complex, but I also got real value from most of the complexity. But it keeps growing and I'm not always sure people working on the language consider the real cost to new and existing users when the set of "things you have to know to be competent in the language" grows.
{ "a": 1, "b": 2 }
I use it and want to serialize it as: [ 1, 2 ]
What we’re doing is fine. You should get your serialization and I should get mine. But if either of us declares, process-wide, that one of us has determined the One True Serialization of PairOfInts, I think we are wrong.
Sure, maybe current Rust and current serde make it awkward to declare non-global serializers, but that doesn’t mean that coherence is a mistake.
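One way to get both encodings today without any global impl is a pair of borrowing wrappers. This sketch uses `fmt::Display` as a stand-in for serde's `Serialize` so it needs no dependencies; the field names `a` and `b` and the wrapper names `AsObject` and `AsArray` are assumptions made for the example.

```rust
use std::fmt;

// Hypothetical shared type from the discussion.
pub struct PairOfInts {
    pub a: i32,
    pub b: i32,
}

// Each wrapper names one encoding. Neither claims to be the single
// process-wide serialization of PairOfInts.
pub struct AsObject<'a>(pub &'a PairOfInts);
pub struct AsArray<'a>(pub &'a PairOfInts);

impl fmt::Display for AsObject<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `{{` and `}}` are escaped literal braces.
        write!(f, "{{ \"a\": {}, \"b\": {} }}", self.0.a, self.0.b)
    }
}

impl fmt::Display for AsArray<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[ {}, {} ]", self.0.a, self.0.b)
    }
}

fn main() {
    let pair = PairOfInts { a: 1, b: 2 };
    println!("{}", AsObject(&pair));
    println!("{}", AsArray(&pair));
}
```

The wrappers borrow, so picking an encoding at the call site doesn't require cloning or giving up ownership of the value.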
Well, fine, but then you need to actually implement a module system or something. Currently trait impls are program-wide, and if you say that you're not allowed to make global impls of a trait then that's the same as saying you're not allowed to implement traits at all.
In any case, the OP’s proposed “incoherent” scheme actually is a module system of sorts for conflicting trait impls, and it seems about right for something like serialization.
> If a crate doesn’t implement serde’s traits for its types then those types can’t be used with serde as downstream crates cannot implement serde’s traits for another crate’s types.
You are allowed to do this in Scala.
> Worse yet, if someone publishes an alternative to serde (say, nextserde) then all crates which have added support for serde also need to add support for nextserde. Adding support for every new serialization library in existence is unrealistic and a lot of work for crate authors.
You can easily autoderive a new typeclass instance. With Scala 3, that would be:
trait Hash[A]:
  extension (a: A) def hash: Int

trait PrettyPrint[A]:
  extension (a: A) def pretty: String

// If you have Hash for A, you automatically get PrettyPrint for A
given autoDerive[A](using h: Hash[A]): PrettyPrint[A] with
  extension (a: A) def pretty: String = s"<#${a.hash.toHexString}>"
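For readers who don't know Scala, the nearest Rust counterpart to that `given` is a blanket impl. A rough sketch, with `SimpleHash` as a made-up trait so the output is deterministic:

```rust
// Made-up stand-in for the Hash typeclass in the Scala snippet.
trait SimpleHash {
    fn simple_hash(&self) -> u32;
}

trait PrettyPrint {
    fn pretty(&self) -> String;
}

// If a type has SimpleHash, it automatically gets PrettyPrint.
impl<T: SimpleHash> PrettyPrint for T {
    fn pretty(&self) -> String {
        format!("<#{:x}>", self.simple_hash())
    }
}

// One concrete instance to exercise the blanket impl.
impl SimpleHash for u32 {
    fn simple_hash(&self) -> u32 {
        *self
    }
}

fn main() {
    println!("{}", 255_u32.pretty());
}
```

The catch on the Rust side is the orphan rule again: a blanket impl like this has to live in the crate that defines `PrettyPrint`, so a third party can't add the bridge after the fact.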
> Here we have two overlapping trait impls which specify different values for the associated type Assoc.
trait Trait[A]:
  type Assoc

object A:
  given instance: Trait[Unit] with
    type Assoc = Long

  def makeAssoc: instance.Assoc = 0L

object B:
  given instance: Trait[Unit] with
    type Assoc = String

  def dropAssoc(a: instance.Assoc): Unit =
    val s: String = a
    println(s.length)

@main def entry(): Unit =
  B.dropAssoc(A.makeAssoc) // Found: Playground.A.instance.Assoc Required: Playground.B.instance².Assoc²
Scala catches this too.
This is largely based on a paper I read a long time ago on how one might build a typeclass/trait system on top of an ML-style module system. But, I suspect such a setup can be beneficial even without the full module system.
Would much rather see a bunch of libraries that implement everything for a given use case like web-dev, embedded etc.
Unfortunately this is hard to do in Rust because it is hard to implement the low-level primitives.
A language's goal should be to make building things easier imo. It should be simple to build a serde or a tokio.
From what I have seen in Rust, people tend to over-engineer a single library to the absolute limit instead of just building a bunch of libraries and moving on.
As an example, if it is easy to build a btreemap then you don’t have to have a bunch of traits from a bunch of different libraries pre-implemented on it. You can just copy it, adapt it a bit and move on.
Then you can have a complete thing that gives you everything you need to write a web server and it just works
Having everything compatible with everything else and having everything implement every case means every individual part is over-complicated. So it is bad no matter how you combine it together.
I've written a decent bit of Rust, and am currently messing around with Zig. So the comparison is pretty fresh on my mind:
In Rust, you can have private fields. In Zig all fields are public. The consequences are pretty well shown with how they print structs: In Rust, you derive Debug, which is a macro that implements the Debug trait at the definition site. In Zig, the printing function uses reflection to enumerate the provided struct's fields, and creates a print string based on that. So Rust has the display logic at the definition site, while Zig has the logic at the call site.
It's similar with hash maps: in Rust you derive/implement the Hash and Eq traits, in Zig you provide the hash and eq functions at the call site.
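The Rust half of that comparison, as a small sketch. `Point` and `labels` are made-up names; the derives and `HashMap` are plain std.

```rust
use std::collections::HashMap;

// Printing, equality and hashing are all decided here, at the definition site.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Point {
    x: i32,
    y: i32,
}

// Callers just use the type as a key; they don't supply hash or eq functions.
fn labels() -> HashMap<Point, &'static str> {
    let mut map = HashMap::new();
    map.insert(Point { x: 0, y: 0 }, "origin");
    map
}

fn main() {
    println!("{:?}", Point { x: 1, y: 2 });
    println!("{:?}", labels().get(&Point { x: 0, y: 0 }));
}
```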
Each one has pretty stark downsides:
- Zig: since everything is public, you can't guarantee that your invariants are valid. Anyone can mess around with your internals.
- Rust: once a field is private (which is the convention), nobody else can mess with the internals. This means outside modules can't access internal state, so if the API is bad, you're pretty screwed.
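A tiny sketch of what the private-field side buys you. The `even` module and `Even` type are made up for illustration.

```rust
mod even {
    /// Invariant: the wrapped value is always even.
    pub struct Even(u32); // the field is private to this module

    impl Even {
        /// The only way to construct an Even from outside the module.
        pub fn new(n: u32) -> Option<Even> {
            if n % 2 == 0 {
                Some(Even(n))
            } else {
                None
            }
        }

        pub fn get(&self) -> u32 {
            self.0
        }

        /// Can rely on the invariant without re-checking it.
        pub fn half(&self) -> u32 {
            self.0 / 2
        }
    }
}

fn main() {
    use even::Even;

    // `Even(3)` does not compile out here because the field is private.
    println!("{}", Even::new(3).is_none());
    if let Some(e) = Even::new(10) {
        println!("{} {}", e.get(), e.half());
    }
}
```

The flip side is exactly the complaint above: if `even` had forgotten to expose `get`, code outside the module would have no way to reach the value.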
Honestly, I'm not sure if there is a way to resolve this tension.
EDIT: one more thought: Zig vs Rust also shows up with how object destruction is handled. In Rust you implement a Drop trait, so each object can only have one way to be destroyed. In Zig you use defer/errdefer, so you can choose what type of destructor runs, but this also means you can mess up destruction in subtle ways.
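A minimal sketch of the Rust side of that: one `Drop` impl per type, run automatically when the value goes out of scope. `Resource`, `Log` and `run` are made-up names; the shared log only exists so the order can be observed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

struct Resource {
    name: &'static str,
    log: Log,
}

// The single destructor for Resource. Callers cannot choose a different one.
impl Drop for Resource {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _first = Resource { name: "first", log: Rc::clone(&log) };
        let _second = Resource { name: "second", log: Rc::clone(&log) };
        log.borrow_mut().push("end of scope".to_string());
    } // locals are dropped here, in reverse declaration order
    let entries = log.borrow().clone();
    entries
}

fn main() {
    for line in run() {
        println!("{}", line);
    }
}
```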
Is this really that big a downside? It encourages good APIs.
The alternative of everything being public is the kind of feature that quickly becomes a big disadvantage in larger systems and teams, where saying “just don’t footgun yourself” is not a viable strategy. If there’s a workaround to achieve some goal, people will use it, and you end up with an unmaintainable mess. It’s why languages whose names start with C feature so prominently on CVE lists.
I won't be using Rust moving forward. I do like the language but it's complicated (hard to hold in your head). I feel useless without the LSP and I don't like how taxing the compiler and LSP are on my system.
It feels really wasteful to burn CPU and spin up fans every time I save a file. I find it hard to justify using 30+ GB of memory to run an LSP and compiler. I know those are tooling complaints and not really the fault of the language, but they go hand in hand. I've tried using a ctags-based workflow using vim's built in compiler/makeprg, but it's less than ideal.
I also dislike the crates.io ecosystem. I hate how crates.io requires a GitHub account to publish anything. We are already centralized around GitHub and Microsoft, why give them more power? There's an open issue on crates.io to support email based signups but it has been open for a decade.
I find it slightly humorous that this sentence contains three words which would be understood completely differently by the majority of the English-speaking population.
We want our languages to make it easy to write correct programs. And we want our languages to make it hard to write incorrect programs. And trying to have both at once is very difficult.
The problem with this is that it's systemic and central to Rust's trait-based ecosystem composition.
Go has a version of it, but it's much smaller and more local. In Go, consumer-defined structural interfaces remove most of the pressure that causes the Rust problem in the first place, which is producer-led.
Real, but of more concern to folks designing widely-used libraries than to folks using said libraries.
> Anyone can give me a good read what Traits even are?
You can think of traits as analogous to interfaces in OOP languages (i.e. pure virtual abstract classes in C++ terminology).
They just define a set of methods that types can implement to conform to the trait, and then consumers can treat implementing types as if they were the trait.
The major differences are: traits are implemented outside the actual type implementation, so arbitrary trait implementations can be added after the type has been written (this is why we need coherence), and Rust uses traits as compile-time bounds for generics (templates).
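If it helps, here is a short sketch of those points. All names (`Shape`, `Square`, `Rect`, `describe`, `total_area`) are made up.

```rust
// The trait: a set of methods a type can implement.
trait Shape {
    fn name(&self) -> String;
    fn area(&self) -> f64;
}

// The types are defined on their own...
struct Square(f64);

struct Rect {
    w: f64,
    h: f64,
}

// ...and the trait impls are separate blocks that can be added later.
impl Shape for Square {
    fn name(&self) -> String {
        "square".to_string()
    }
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn name(&self) -> String {
        "rect".to_string()
    }
    fn area(&self) -> f64 {
        self.w * self.h
    }
}

// Trait as a compile-time bound on a generic (resolved per concrete type).
fn describe<S: Shape>(shape: &S) -> String {
    format!("{} with area {:.1}", shape.name(), shape.area())
}

// Trait as an interface with dynamic dispatch (like a virtual call).
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    println!("{}", describe(&Square(2.0)));
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect { w: 3.0, h: 1.0 })];
    println!("{}", total_area(&shapes));
}
```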
So they're finally rediscovering OCaml!