What's New in Rust 1.41
Jon Gjengset: All right, Ben, how about we get started by applauding ourselves this time? For once we are recording the episode the day after the release dropped.
Ben Striegel: Great. And then we’re gonna edit it for about two weeks. And then it’ll be totally stale by then.
Jon: Yeah, we’re gonna release it, same day as 1.42 is released.
Ben: Excellent. That’s a plan.
Jon: So today we are talking about Rust 1.41. It’s pretty exciting.
Ben: I think so. I mean, maybe not as exciting as in previous releases, but again, as we mentioned before, not every release has to be super exciting. It’s maybe a good thing, a sign of maturity. To start having just, like, normal regular releases. But, I mean, every release gives you improvements to various bug fixes and compiler things. It isn’t all about features.
Jon: Well, one thing, actually, that I like about this release is, it really feels as though it’s like, let’s tidy some things up that have been like loose ends. This is something that I think came through a little bit in the Rust 2020 blog post series of like something the community wanted was to see more, sort of, finish things as opposed to start new ones, and this has a bit of that flavor to me, which is nice.
Ben: Yeah, And there are always, like, you know, hundreds and hundreds of commits in every release. But not all of them were gonna be, like, big and flashy. A lot of them are just, like, burning down bugs or implementing new things for future features as well.
Jon: Future futures, you say.
Ben: Future features. I mean, I guess there are maybe some future futures features in there, too.
Jon: It’s true. Before we dive in, though, I just briefly want to sort of re-mention for those who only recently have joined the podcast, which is this is a community podcast. It is not just Ben and me talking about new Rust releases. In particular, this is sort of a community effort where not only can you help us by, sort of, contributing your own episodes or ideas for episodes, or if you want to do sort of interviews, and we can help you get set up with that. But also, if you want to help with any of the stuff that happens behind the scenes like if you know audio editing for example, then jump in to the discord channel we have, and then, like, you could help out that way, too. And it’d be— this is sort of something we all make together. And the more people help, the better the end product ends up.
Ben: Shall we dive in?
Jon: Yeah, let’s do it.
Ben: Okay, what’s up first?
Jon: First up is relaxed restrictions when implementing traits.
Ben: So this has to do with trait coherence, which is kind of a notion that a lot of people don’t want to think about. It’s a thing they understand vaguely, where, okay, I have a trait. I have a type, but sometimes I just can’t implement that trait for that type. Do you want to go into what trait coherence is and just talk about why it’s there? What it’s there to do?
Jon: Yeah, I think it’s useful to give at least some of the intuition for why there are even rules around this, like, when different crates can implement different types or traits in particular. So the thinking here is, you can end up in some really weird positions as the compiler, if you don’t have some rules about who’s allowed to do what.
In particular, imagine that there is some crate, let’s use hashbrown as an example, which is the sort of new fancy hashmap implementation. Imagine that it did not implement serde::Serialize. And then you have two, sort of, unrelated crates that both want to use the hashbrown map, and they both want to be able to serialize it. And so they implement Serialize for hashbrown::HashMap. Let’s imagine that that was totally allowed. And then Ben comes along. And you, Ben, you write your own crate and you depend on both of those crates and you include a map in whatever type you’re writing. And then you want to serialize that. The compiler now has a problem. It has two possible implementations of Serialize for the hashbrown::HashMap, and it doesn’t know which one to pick, right? It could choose the one that whoever wrote crate A that you depend on wrote, or the one that whoever wrote crate B that you depend on wrote, but it doesn’t know which one to pick. And different languages have different approaches to this. In Rust we use coherence, in which the compiler enforces that there’s only ever one choice. And the way it does that is by limiting which crates are allowed to implement which traits for which types.
Ben: Yes, so effectively as a library author, you’d be the one who’d be dealing with this issue. And so you would never get to the point where a user would figure out, hey, I can’t actually use this crate with this crate because they both do this thing, because each individual library would hit this error before they could ever compile their code.
Jon: Yeah, exactly. So in the example we gave before, say one of these intermediate crates tried to implement the Serialize trait, which is a foreign trait to that crate. It’s not defined in that crate; it’s defined in the serde crate. And then they try to implement that trait for hashbrown::HashMap, which is a foreign type. Then the compiler would say, you’re trying to implement a foreign trait for a foreign type, and you are not allowed to do that. And of course, this might be frustrating if you’re the developer of that library, because you need serde::Serialize, but this is what the compiler does, to ensure that you don’t end up in these positions where you don’t know which implementation to choose.
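As a rough sketch of the rule Jon describes, the snippet below shows the two combinations that have always been allowed, plus the foreign-trait-for-foreign-type case that the compiler rejects. The names Meters and Describe are made up for illustration; the standard library's Display and Vec stand in for the serde trait and the hashbrown map from the discussion.

```rust
use std::fmt;

// A type defined in this crate (a "local" type). `Meters` is a made-up name.
struct Meters(f64);

// A trait defined in this crate (a "local" trait). `Describe` is also made up.
trait Describe {
    fn describe(&self) -> String;
}

// Foreign trait (`fmt::Display`) for a local type: allowed.
impl fmt::Display for Meters {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} m", self.0)
    }
}

// Local trait for a foreign type (`Vec<u8>`): also allowed.
impl Describe for Vec<u8> {
    fn describe(&self) -> String {
        format!("{} bytes", self.len())
    }
}

// Foreign trait for a foreign type: rejected by the compiler (error E0117).
// Uncommenting this impl makes the crate fail to compile:
//
// impl fmt::Display for Vec<u8> {
//     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
//         write!(f, "{} bytes", self.len())
//     }
// }

fn main() {
    println!("{}", Meters(2.5));
    println!("{}", vec![1u8, 2, 3].describe());
}
```

Because every crate is held to the same rule, two unrelated crates can never both supply the commented-out impl, which is the conflict described above.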
Ben: And that’s what coherence is. But it’s still sometimes kind of annoying, understandably. And so in this new release, we have some relaxation of the coherence rules. And so, do you want to go over what that means?
Jon: Yeah, I think it’s useful to go through that quickly. I want to point out before we dive into it, that at the bottom of the few paragraphs in the release notes about this, there’s a link to the RFC that proposed this change, and I think that RFC is really well written in terms of making you understand both why this is necessary and also why this relaxation is useful and why it’s, sort of, both sufficient and necessary for things that people want to do in practice.
And the example that they give in the release notes, I think, is a good one, which is imagine that you want to implement, generic over T, you want to implement From<YourVec<T>> for Vec<T>.
Ben: And From is a trait defined in the standard library. As is Vec, a type defined in the standard library, so you don’t own either of those.
Jon: Exactly. So this is an instance of the pattern, which is, you implement some foreign trait for some foreign type, and previously this was just not allowed. You just could not write this. And this is awkward because sometimes you really want to say that you can turn my type into some other type and you don’t control the other type. You don’t control the From trait, and so you don’t have any way to do this. What this RFC proposed that is now, sort of, landed on stable is essentially to relax the rules in such a way that we can still guarantee that there is only ever one implementation for the compiler to choose from. And the way they do that is, to simplify greatly, is to say that you are allowed to have an implementation of a foreign trait for a foreign type if— you have at least one local type that appears in the foreign trait’s type arguments and it appears to the left in that list—
Ben: it’s just the first one of the list, essentially.
Jon: It can be multiple, too. Just the leading type parameters of the trait have to be local. And if there are any generic parameters that appear in the type you’re implementing the trait for, then they need to be covered. That is, they need to appear in— as type parameters to that type. And the rules here might seem somewhat arbitrary, like, why did they choose these rules? And the RFC is really good about going through this and what “covered” means and why it’s necessary. But to try to give some intuition here, it doesn’t really matter whether your local type is to the left or to the right. The RFC just chose one, essentially at random. But the reason you want to do this is if every crate is held to the same standard of “your local types have to appear to the left,” then you can’t have two crates that accidentally overlap in their implementation. For example, imagine that Ben’s crate implements From<(HisType, T)> and my crate implements From<(T, MyType)>, then those two implementations are technically overlapping because my T could be HisType or his T could be MyType. And so, as long as the compiler enforces that all the local type parameters appear to the left, then these would automatically be seen as non-overlapping by the compiler because they couldn’t possibly overlap.
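The release-notes example discussed here can be written out as a small program. This is a minimal sketch: YourVec is the placeholder name from the discussion, defined here as a simple wrapper around Vec, and the conversion body is an assumption about what such a type would do.

```rust
// A local wrapper type. `YourVec` is the placeholder name used in the
// release notes; this definition is only an illustration.
struct YourVec<T>(Vec<T>);

// Foreign trait (`From`) implemented for a foreign type (`Vec<T>`).
// Accepted since Rust 1.41 because the local type `YourVec<T>` is the
// trait's first type parameter, and the generic `T` in the implementing
// type is covered by `Vec<T>`.
impl<T> From<YourVec<T>> for Vec<T> {
    fn from(value: YourVec<T>) -> Vec<T> {
        value.0
    }
}

fn main() {
    let mine = YourVec(vec![1, 2, 3]);
    // `Into` comes for free from the `From` impl above.
    let plain: Vec<i32> = mine.into();
    assert_eq!(plain, vec![1, 2, 3]);
}
```

On compilers older than 1.41 the impl above is rejected, which is the restriction this release relaxes.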
Ben: Yeah, it seems a little bit ad-hoc, but it still works out at the end. It’s mostly there, kind of like, just as a convenience. I don’t think people were really gonna— again, most folks don’t really understand what coherence does or the rules for it, and in this case, it’s kind of just making things work more the way you want them to work.
Jon: Yeah, exactly. And if anything, the discussion here is mostly to tell you that there is such a thing as coherence. Yes, there is a good reason why the compiler prevents you from compiling some of these impls that you want to write. And you really should go read the RFC if this is something that you find interesting.
Ben: So what’s up next? We have cargo install updates packages when outdated. What is cargo install? Is that, like— I have seen some confusion online where people learn about cargo install and they’re like, well, does that actually put dependencies inside my project or something? And no, cargo install is a subcommand that comes with cargo by default. And it is used to install binaries from crates.io. I’m not sure if you can get it from other locations, but for example, what it’s mostly used for, if somebody wants to, say, add a new subcommand to cargo, cargo ____, then you would be able to install that from crates.io by running cargo install. And so previously, there was no real built-in way to get updates for any packages that you had installed through cargo install. There were, in fact, packages you could install with cargo install, that when run would update your packages installed with cargo install. Now those are obsolete. Now, cargo install is just— when you run it, it will update any binaries that you have installed through cargo install. It’s kind of like a little lightweight package manager just for extending cargo itself.
Jon: Yeah, I don’t know whether it will actually— I don’t know if you can run cargo install without any arguments and then have it update everything you’ve installed. I think you still have to give the name for a package.
Ben: I think so, yeah.
Jon: So you can do, like, cargo install foo. And then, in the past, if you later ran cargo install foo again, it would just say “foo is already installed,” no matter which version you had installed and which version was the latest. Whereas now, if you run cargo install foo, it will update foo if a newer version is available. Previously, the only real way to do this was with the --force flag, which just installed foo again, and replaced the old one, regardless of what the new version was, like, even if it was the same version.
There’s one thing I want to mention about cargo install, which is, Ben mentioned that it installs from crates.io, and that is true. There is a way to install something that’s not on crates.io. So you can— with cargo install, you can give --path and then a path to a cargo repository, and it will install the binaries from that repository. Or you can do cargo install --git and give a git URL, and then it will clone that and build it. And this can be useful if you want to run, like, the master version of some particular binary, like, they’ve implemented a feature and you want to test that out before it’s been released.
Ben: Next on the list is the less conflict-prone Cargo.lock format, and this is actually a, kind of, a bait and switch, a sleight of hand, if you will, because this is not a new feature. We actually talked about this feature back in our 1.38 episode. What’s new about it this time is that this new Cargo.lock format is now on by default, which is also still kind of misleading because it is only used if you make a brand new project. If cargo does not find a lock file in your project, then it will generate using this new format, otherwise it will respect the old format. And why would we want to have it for the past, like three or four releases, but not actually have it turned on?
Jon: Well, I mean, I think this is partially just, you want to be careful about introducing very new features without having them tested a lot first.
Ben: I think in my case, the reason that I understand it is that you don’t want people on older toolchains to get, kind of like, blindsided by this new format. Say they’re using a dependency that happens to just run— you know, erase their lock file and then make a new one. But their version of the toolchain, the consumer’s version of the toolchain, doesn’t yet understand it. And so if you’re on an older version of Rust, and then people start using this new version of the lock file— they might not even know they’re doing it, is the thing, too, if you just happen to, like, blow away the lock file somehow. And so the idea is just to be polite to people who don’t update, like, totally in lockstep. Now, as long as you’ve updated any time since July? August? or so, which is when 1.38 came out, then you will understand any new lock files generated by this. And so mostly it was there to get the knowledge of the new lock file into Rust even though it wasn’t being used yet. If you are using an older version, like 1.37 or earlier, now might be a time to think about upgrading. Just because you might start getting some strange errors about, like, hey, this lock file is corrupt, or— I’m not sure what the actual error looks like. But, yeah.
Jon: I see. So it was really sort of a forwards compatibility thing of, like, we want to land understanding this format before we even think about standardizing it. You might never see this in practice, but for future users, we want to make sure.
Ben: It’s one of those strange things where, for example, in the Rust compiler— because Rust is a bootstrapping compiler that is built in Rust— one of the consequences is that you can’t actually use new features in the compiler until the feature is in the previous version of the compiler. So you can implement the feature, but you can’t actually use it until you actually build the compiler with itself.
Jon: So should I immediately remove all my Cargo.lock files and then rebuild them?
Ben: There’s certainly nothing stopping you. And I mean, if you write a library, understand that this will increase, potentially, the minimum Rust version of your library to 1.38. I know these days if you’re in the async ecosystem, then you’re definitely on, or hopefully on 1.39 by now with the async/await. But there are some libraries out there that do try to have, like, rigid— or diligently adhere to certain minimum Rust version standards. I think regex is one of them, kind of thing, or generally very widely used ones try and have some amount of care taken for users on older versions. So, I mean, you can if you want to. Maybe there are other reasons that you’ve already, if you’re using async/await and there’s no reason not to, for example. But it’s up to you. I think, I kind of like it a lot. You want to go— we actually should go over what it does, in fact—
Jon: Yeah, I was about to ask, like, Why do I even have lock files?
Ben: Yeah, so in this case, the idea is— there’s a good RFC too, again, explaining what this does. But the idea is that if you are working on a project that has a lock file, so, a binary release of some kind, then you have multiple developers, kind of, like, committing to the same lock file. If they both have branches with it, both make changes like, say, they both add a dependency to the lock file. In the previous version of the lock file format, the way it worked was, essentially, you have one section where it’s, like, here are all your crates, here is, you know, the crate name and then kind of a little json, the crate’s version, crate’s metadata, that kind of stuff. But all the crates in your project in the lock file also have what’s called a crate hash. The idea is that we just hash the crate and kind of do a sanity check of— saying, hey, so we’re making sure that whenever you grab a version of this crate we’ll hash the source and figure out if anything has gone wrong. And I’m just a little bit— a little security (unintelligible— 16:41) which is a very nice little thing.
But previously those hashes, for whatever reason, weren’t stored in line with all of the various information for each crate. They were all stored together in one big hash blob at the bottom of the file. And the way that git works is that if any two changes to lines, that are within some distance of each other, change then it’ll give you a conflict. And so you would get, like, all kinds of version conflicts of trying to merge it. Hey, like, this thing has changed, but actually really shouldn’t have mattered. Obviously, if you both change— two branches changed the same crate, you want to get a conflict, and you still will in this case, obviously, because you’ve changed the same lines. But now, lines should be far enough apart, that this isn’t a problem. At least hopefully not.
And there’s also a few other changes, too. Here, just reducing some redundancy, making there be less churn in general in the lock file, just to hopefully also reduce chances of certain conflicts happening.
Jon: Do you want to talk a little bit about what a lock file even is? Like, why do we have lock files?
Ben: Do you want to talk about that?
Jon: Sure. So lock files are— they’re a way to get— they buy you a couple of things. A lock file essentially notes down which exact version of every dependency you have, you are currently using. And there are a couple of reasons why you might want to do that. One of them is for reproducible builds. So if I build my crate and it builds correctly and everything runs fine, then I can publish my crate and the lock file and then, sort of, know that anyone who tries to also install whatever that crate is they will get the same version, it will build the same way, it will perform the same way. If I didn’t include the lock file, they might get a later version of whatever things I depend on. And although that shouldn’t break anything, sometimes it does; sometimes there’s a bug or performance regression. So that is one reason. Or maybe I publish a hash of the binary that gets produced so other people can check that they got the true same thing that I got when I built, and then you certainly want the same underlying dependencies to be used.
The other reason to use a lock file is— to some extent at least— for security. Like, you might care that I have reviewed this version of this crate, and I know it doesn’t have any security vulnerabilities in it, and I will update that when I have reviewed the newer version. If I didn’t include a lock file, you might get a later version that, like, someone has introduced malware into or something. And that is sort of a secondary concern that people use with lock files.
Ben: Kind of a minor thing. But I think people who work on products with many contributors, it’s actually a pretty nice quality of life change. I think there— I think it was maybe inspired by (unintelligible— 19:27) like, a year or two ago I saw, I think Facebook, was doing something with Rust, and one of their problems was they just weren’t committing the lock file. Generally, with binaries you’ll commit, libraries, you won’t. And for their binary they were trying to do really weird things of, like, not committing it and running their own tools to try and deal with conflicts. And I was like, why you doing this? Just commit the lock file. And they were, like, well we have a million developers all trying to commit to the same monorepo and having all these conflicts, and that— in that case it is just definitely a pain. And so hopefully this reduces that pain.
Jon: There’s actually one more reason I just thought of, now that you mention that, which is a lock file means that you probably don’t have to re-compile dependencies quite as often. Because otherwise, if you don’t have a lock file, then cargo is just going to check for updates every time. And then there are probably a bunch of point updates for a bunch of your dependencies, and then it’s gonna fetch those and then rebuild that part of the dependency graph and rebuild your library on top of it. With a lock file, it doesn’t have to do that unless you explicitly say you want to update your dependencies.
Ben: So next there’s a line item called “more guarantees when using Boxes in FFI” and, so, I think the actual change itself isn’t quite that interesting. We could go over that definitely, but I want to talk about what we do with FFI types, like, what does it mean for us to say, hey, this type on the Rust side is compatible with the type on the C side. Do you want to go through the Box<T> FFI change real quick?
Jon: Yes. So this is sort of a change that I think many people assumed was the case already, which is that, if you have a heap pointer in Rust, so a Box<T>, and specifically one where T is Sized. So it’s not, like a virtual dynamic dispatch pointer, like it’s not a dyn Trait Box. It is just a Box of some type that is Sized, then what Rust now guarantees is that that type can just be cast to a raw pointer and then be given to C, and it’s fine to use, as if it were just, like, a T* in C. And I think in the past people just assumed that this was the case. But now, in some sense, the Rust team has, like, gone through and checked that this is actually the case and that this is something they do want to guarantee going forward.
Ben: Yeah, that’s kind of what I want to talk about, too, with regard to some of these guarantees that, you know, the Rust team will say, hey, like, this type is compatible because I don’t believe it actually involves any source code changes. This is kind of a social contract. People will tend to use certain types on the Rust side and say, hey, in my mind, a Box<T> is just a pointer to a type. And so I’m going to, kind of, like, this should be compatible, and it should be— a Box is just, you know— it is an address to the heap. And C just has pointer addresses to anything, so it is compatible, conceptually.
But, like, compilers are hard, and we can’t always, you know, take things for granted. And so sometimes it’s nice for the Rust language team to say, yes, we will make sure to always guarantee that if you use a Box in this way, according to these, kind of like, a few restrictions, like the size restriction and so on, that it will always just work. There won’t be any weird unsafety. At least not from this. There’s still plenty of things that can go wrong with FFI.
But we have other things in Rust too, that exhibit the same features. And so, for example a reference to an Option of T is guaranteed by Rust to be compatible with a nullable pointer. Correct?
Jon: An Option<&T>.
Ben: That’s it. Yeah.
Jon: I don’t know whether a reference to an Option of a T has the same guarantee.
Ben: Sorry. Other way around. But there are things that we, kind of, guarantee in Rust. There’s no, there’s no attribute, as far as I know that says, hey, you know, make sure this is true. It’s just something that would exist in a theoretical Rust specification. It might be in the reference manual. Not sure. But yeah, there are certain things that are just, like, social contracts, that are part of the language that we can’t really enforce, it’s kind of, just like, one of those unsafe things that, hey, let’s now say, we have a policy in the compiler of, we won’t do any weird optimization that would break this.
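To make the Option<&T> guarantee concrete, here is a small sketch. The function read_or_default is hypothetical; it stands in for a function whose C signature would take a possibly-null pointer to a 32-bit integer.

```rust
use std::mem::size_of;

// Hypothetical function with the C calling convention. On the C side the
// parameter would look like a possibly-null `const int32_t *`.
extern "C" fn read_or_default(value: Option<&i32>) -> i32 {
    match value {
        Some(v) => *v, // non-null pointer
        None => -1,    // null pointer
    }
}

fn main() {
    // `None` is stored as the null address, so no extra tag is needed and
    // the option is exactly pointer-sized.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<*const i32>());
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<*const i32>());

    let x = 7;
    assert_eq!(read_or_default(Some(&x)), 7);
    assert_eq!(read_or_default(None), -1);
}
```

The size checks hold because references and boxes can never be null, which leaves the null address free to represent None.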
Jon: Yeah, and I think this matters in particular for people writing unsafe code, where, once you write unsafe code there are all these— this exact specification matters a lot to you. And this is why it matters whether or not the compiler gives this guarantee, because that dictates whether you’re allowed to do something in unsafe code or not. So, for example, the reason why there’s a restriction on the T in Box<T> having to be Sized for this to be true is because if you have, for example, a Box<dyn Trait>, then that is actually a fat pointer in Rust. It is not just a normal pointer, and so casting it to a pointer in C would not be legal. It would not produce correct results. And so this is why you really want the compiler team or the lang team, I guess, to sit down and figure out what are the language guarantees that we give you for this type.
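A minimal sketch of what the Box<T> guarantee permits, and of why the Sized restriction exists. The Counter type and the counter_* functions are made up for illustration; a real library would also export them under unmangled names and declare them in a C header, which is left out here.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Hypothetical type shared with C.
#[repr(C)]
pub struct Counter {
    pub value: u32,
}

// On the C side these could be declared roughly as:
//   struct Counter *counter_new(void);
//   uint32_t counter_increment(struct Counter *);
//   void counter_free(struct Counter *);
pub extern "C" fn counter_new() -> Box<Counter> {
    Box::new(Counter { value: 0 })
}

pub extern "C" fn counter_increment(counter: &mut Counter) -> u32 {
    counter.value += 1;
    counter.value
}

pub extern "C" fn counter_free(_counter: Option<Box<Counter>>) {
    // Dropping the box frees the allocation; `None` (null) is a no-op.
}

fn main() {
    // A box of a sized type is a single thin pointer.
    assert_eq!(size_of::<Box<Counter>>(), size_of::<*mut Counter>());
    // A boxed trait object is a fat pointer (data plus vtable), so it
    // cannot be handed to C as a plain pointer.
    assert!(size_of::<Box<dyn Debug>>() > size_of::<*mut Counter>());

    // Round trip through a raw pointer, as C code would hold it.
    let raw: *mut Counter = Box::into_raw(counter_new());
    let mut boxed = unsafe { Box::from_raw(raw) };
    assert_eq!(counter_increment(&mut boxed), 1);
    counter_free(Some(boxed));
    counter_free(None);
}
```

The usual FFI caveats still apply: the memory must have come from Rust's allocator, and C must not free it itself or use it after counter_free.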
You’ll notice there’s a— I don’t think it’s linked in the release notes, but we’ll do it in the notes for this episode, which is— there’s something called the Rust Unsafe Code Guidelines Working Group, which are— they’re basically going through all of the various things you might care about in unsafe and figuring out, what are the actual rules? Like rules around alignment of fields, rules around padding, uninitialized memory, alignment of pointers. Like, all of this stuff, and trying to nail down, what is the specification? What are the guarantees that you can rely on when writing unsafe code.
Ben: And speaking of the Unsafe Guidelines Working Group, one related work is the whole thing with the Miri interpreter, and we have an episode with an interview that I did with an author, a contributor to Miri. Look back in our backlog for that, it’s one of the Rust Fest interviews. And Miri is a pretty cool tool that can dynamically detect undefined behavior from within your Rust code. So kind of like a sanitizer for any, kind of like, clang, we have, the valgrind, and UBSan and ASan and TSan, and so hopefully this should be a one-stop shop for all kinds of Rust things. And it should also hopefully be comprehensive, like no false positives, no false negatives. Still dynamic, obviously, so not quite up to the Rust ideal. But it’ll be good to be able to say, hey, look, I ran my test suite over this with Miri and didn’t find any undefined behavior, so that gives me some confidence that this particular thing, this unsafe code that I’ve written, actually is safe.
Jon: Yeah, I think if you’re writing any unsafe code in your library, your CI should probably include a Miri test. It’s just like, just do it. It’s relatively easy to set up. It is not that slow to run, unless you have particularly weird code. And it just is going to save you from a lot of worrying.
Ben: A few library changes this time. No constification, sadly. No things became const this time, maybe next time. But one thing I want to talk about is, there is a type in Rust called NonZero. And these are integer types, the idea being that, if you use these types, that the value of these types will never be zero, which— they exist for interesting reasons. They’re good for optimizations. Certainly the idea being, that we mentioned Option before, where Option has both a Some and a None case, and in the case of like an Option<&T>, you can kind of squirrel away the None case into the null, like the “000” case. And then that means that you can always apply optimization where you can represent that Option to a pointer to T as just a pointer to T. And so there’s no memory overhead at all to this, which is really important for certain— like passing things to FFI, but also just for general optimizations, of not using up more memory than you need. One of those examples of zero cost abstractions. And I think this actually was inspired by Servo’s use case. The idea of being, hey, we also have these Options to numbers, these integers and we know, we can guarantee that these won’t ever be zero, and so we would like it to not use up, like, an extra byte of memory, just for representing the Option of these numbers. Just to give us some extra type safety, without any extra run time penalty.
Jon: And in fact it’s probably even more than that. It’s not just a byte, It’s like— a full padding and alignment—
Ben: Yeah, which is actually— so if you have a u64 you’d have to, you know, padding— double the size of the type effectively. And so, those aren’t new. The NonZero types themselves aren’t new, but now they implement From<NonZero___> of a smaller type, if those integers— if the smaller type, or if type is smaller. And so, for example, a NonZeroU16 now implements From<NonZeroU8>. And the reason I want to call this out is because there’s kind of a mirror thing, where this just happened to work, even without the NonZero types. And so the u16 type in the standard library implements From<u8>. And so it’s totally lossless, and you might be used to doing this, just using the as keyword.
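A short sketch of both points: the memory saving that the NonZero types give you inside an Option, and the lossless widening conversion that landed in 1.41. The helper name widen is made up for illustration.

```rust
use std::mem::size_of;
use std::num::{NonZeroU16, NonZeroU64, NonZeroU8};

// Lossless widening between NonZero types, available since Rust 1.41.
// `widen` is a hypothetical helper, not a standard library function.
fn widen(small: NonZeroU8) -> NonZeroU16 {
    NonZeroU16::from(small)
}

fn main() {
    // `None` is stored as the all-zero bit pattern, so the option costs
    // no extra space compared with a plain u64.
    assert_eq!(size_of::<Option<NonZeroU64>>(), size_of::<u64>());
    // A plain `Option<u64>` needs room for a tag, plus alignment padding.
    assert!(size_of::<Option<u64>>() > size_of::<u64>());

    // Constructing a NonZero value checks for zero at runtime.
    assert!(NonZeroU8::new(0).is_none());
    let small = NonZeroU8::new(200).expect("200 is not zero");
    assert_eq!(widen(small).get(), 200);

    // The same lossless conversion has long existed for the plain integers.
    assert_eq!(u16::from(200u8), 200u16);
}
```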
And I think the as keyword is, kind of like, if we were ever to do a retrospective of Rust, of things we would do differently, we might not have the as keyword, frankly, just because— it doesn’t cause any problems in this case, but it’s kind of poor for error handling and also for going between some different types. It is less than ideal, specifically, like with— we’re going to— floats to integers, and integers to floats. There is some gnarly undefined behavior that is still (unintelligible— 26:24), kind of like vaguely lurking in the background. I think it is the oldest open unsound issue: the oldest issue tagged “unsound” in the bug tracker is about converting from integers to floats, just because of certain semantics of LLVM and floating points. And it’s hard to do— to resolve without also imposing some runtime overhead. And so it’s, kind of like, I think we shouldn’t have used as in the first place for this. So yeah, consider using From and Into for these numeric conversions.
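To illustrate the error-handling point for integer conversions, here is a small sketch comparing as with From and TryFrom. TryFrom is not mentioned in the discussion; it is added here as the standard library's fallible counterpart to From.

```rust
use std::convert::TryFrom;

fn main() {
    let big: i32 = 300;

    // `as` never fails: it silently truncates or reinterprets the bits.
    assert_eq!(big as u8, 44); // 300 mod 256
    assert_eq!((-1i32) as u32, u32::MAX);

    // `From` only exists where the conversion cannot lose information.
    assert_eq!(i64::from(big), 300i64);

    // `TryFrom` reports values that do not fit instead of wrapping.
    assert!(u8::try_from(big).is_err());
    assert_eq!(u8::try_from(42i32), Ok(42u8));
}
```

With From, a conversion that could lose information simply does not compile, so the mistake is caught before the program runs.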
Jon: This is also something that the Unsafe Code Guidelines Working Group— that’s far too long— is looking at, is stuff like, what are legal bit representations of different types? Like, for example, should we require that the only valid bit patterns for boolean true and false are zero and one?
Ben: That is true. I’m pretty sure.
Jon: I think that is also true, but they have to go through this for a number of different types and things like casting a number to a float is another instance of, like, do you end up with pattern that’s undefined? Or is this supposed to be defined behavior? And if so, what kind of defined behavior?
Ben: And that’s kind of (unintelligible— 27:30) again, what ties into the MaybeUninit type that we got a few releases back, which just got inspired by the working group things of saying, hey, like, some of the ways we’re currently doing things are just totally bogus. We should have a different way of doing things. And so Rust is getting safer. Slowly.
Jon: Rust is also—
Ben: Unsafe Rust is getting safer. Hopefully safe Rust is as safe as ever. Not getting— was never— not implying that was ever unsafe. Or was it?
both: (make dramatic music)
Jon: And, speaking of numbers, there’s another number that’s soon going away, which is that the Rust compiler is going to remove or reduce support for 32-bit Apple targets in the next release. How do you feel about this?
Ben: You heard it here, Rust is now removing support for 32 bit numbers entirely.
Jon: Yep.
Ben: u16 to u64. Right. Going right over it.
Jon: Well, u16 is fine, and u64 is fine, you just can’t use u32.
Ben: No, I really— I don’t really understand, like, I’m not an Apple person. And so I don’t know how old the hardware is, that, like, when was the last 32-bit Apple thing released? I don’t even know.
Jon: Many years ago.
Ben: Because I think the reason for doing this is not because we really want to, I think it’s that LLVM is just removing support for it. So I think it’s, kind of, hey, we’re just marching forward. The endless march of, quote-unquote, progress, I don’t know.
Jon: Well, I don’t think— it’s not that they’re saying, we’re gonna remove support. It’s that it’s no longer going to be like a top-tier supported target. So in all likelihood, it will keep working. But it won’t be a goal to still support it. It also would be, like, if it works, it works.
Ben: Yeah.
Jon: So one thing that got buried, or not buried, but one thing that was noted here, in addition to just the links, is this link to a blog post from the Inside Rust Blog, which is, like, a blog for the people who care more about the internals of the compiler, that talks about constant propagation. And this is a really cool thing that has just landed in 1.41, and it’s sort of the start of a longer process that’s ongoing. At its core, constant propagation is really just, if you have some values that you know are constant, and then you see some operations on those values, then you might be able to just replace that computation with the result of the constant computation. The idea here is to evaluate anything you can at compile time.
What gets really cool in this particular release is that you can propagate this into control flow as well. So one example that they give in the linked blog post is, imagine that you have some constant that’s an `Option<u8>`, and somewhere in your file you set it equal to, like, `Some(12)`, and then later in your code, you have a match on that constant, where if it’s `None`, you mark it as unreachable, and if it’s `Some`, then you do something with it. Well, with constant propagation, you can eliminate that entire branch at compile time because the compiler knows that that constant is `Some`. So it doesn’t even need to evaluate the, sort of, conditional on whether this is `None` or `Some`, it could just proceed, assuming that it’s `Some`, and evaluate that at compile time. And the reason this matters is because it means that we can produce less code to give to LLVM, which is sort of the underlying thing that turns the Rust MIR into executable machine code. And it turns out that one of the big bottlenecks for Rust compilation at the moment is that LLVM compilation phase. And so if we can give less code to LLVM, then LLVM also runs faster. And so the whole Rust compilation process goes faster.
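The kind of code described above might look roughly like the following sketch. The names `VALUE` and `read_value` are invented for illustration, and this is not the exact snippet from the blog post. Whether the `None` arm is actually removed before LLVM sees it depends on the compiler version and optimization settings; the program behaves the same either way.

```rust
// A constant whose variant is known at compile time.
const VALUE: Option<u8> = Some(12);

fn read_value() -> u8 {
    match VALUE {
        // The compiler can see that VALUE is `Some`, so this arm is a
        // candidate for removal by constant propagation.
        None => unreachable!(),
        // With the constant propagated, this arm reduces to the value 12.
        Some(x) => x,
    }
}

fn main() {
    println!("{}", read_value());
}
```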
Ben: Yeah, I was surprised, too, because I mean, obviously the impetus for this is gonna be originally, kind of, by doing more things at compile time, we can do fewer things at runtime and therefore produce code that runs faster at runtime. But you would always suspect, hey, like, we’re doing more at compile time, so actually, it’s gonna, like, be worse for compile-time performance, and we don’t want that, really. But surprisingly, it actually is better for compile-time performance. So in the benchmarks, they said a 2-10% improvement on a variety of test cases in both debug and release mode. And so, by just generating better code in this case, or less-worse code in this case, by doing more operations at compile time, we actually improve compile time, in the end.
Jon: I think maybe the intuition here is that, whenever you go down a level, so like, from MIR to LLVM IR, for example, you’re sort of removing information that could be used to do something smarter. So it’s true that we’re, sort of, now doing some logic in MIR that could be done in LLVM instead. But at the MIR level, we have more of this information. We know more about the guarantees and the invariants that exist in MIR and in Rust code. That means that you can pretty efficiently do this constant propagation where it would be a little bit harder in LLVM because you have relaxed guarantees about your input.
Some other things that were a little bit buried in the lede here, in the Rust and Cargo changes, is that Cargo now has support for what are known as profile overrides. And you seem particularly excited by this, Ben. What was the use case you had in mind?
Ben: I think just in my case, I have a Rust project, and again, we just mentioned compile times. And so in this case, there’s kind of a longer story here, with regards to just talking to, you know, dtolnay, the author of `serde`, and kind of, like, worrying about, hey, if I’m a library author, and I happened to use `serde` to generate some code, then a consumer of my library, why should they need to also get the dependency on `serde` and then compile that, and then run that just to get some extra code? I mean, I could theoretically just copy-paste this code into my project, and they wouldn’t even need `serde`, would they? And so, as the author of `serde`, David Tolnay is obviously kind of sensitive to this. And so he has a few things, like there’s a thing called `watt` of his, which is kind of, I think, a wasm interpreter, which you would then include as an alternative to installing `serde`. And so, apparently in his test, it’s a lot faster, which is kind of, like, insane. But it makes sense. You would have, like, you would be interpreting this thing as opposed to compiling and then running this thing, which could be faster, generally, if you are only running very little code but compiling a lot of code, for example.
And so, an alternative in this case is that now, with Cargo, there’s a few extra little features (we’ll add a link to documentation to go over it more). But now you can say, hey, like, for certain dependencies, anywhere in my crate’s dependency graph, transitively or directly, I want to say, hey, you know, build this with a certain optimization flag or any of the various flags, although I think `opt-level` is the one that people are most excited about. And so one of the things you can override is the `build-override`, which applies to any build scripts, any `proc_macro` stuff, and you can then set it to whatever you want, like `opt-level = 0`, for example. And so, this may or may not change the speed of your compilation. And so, one thing to mention too is that, like, if you change the optimization level of your procedural macros, you’re not changing the speed of your final binary, because you are just changing the speed with which some library generates the exact same code that then goes into your final binary. Because of the whole code gen separation here with what `serde` does and how it works.
And so, for example, I’m just reading the other Reddit comments here of, like, someone tried changing their build override to not optimize at all, and they got a 5% gain. Someone else tried it and they got, like, they cut their time in half, for example. And so it really depends on the— what your code looks like, I think. So you should experiment and see, for example, if this improves your compile times.
Jon: Yeah, I mean, the thinking here is, you might have tried to do something like `cargo check` or `cargo test` or `cargo build`, I guess, is the most obvious example. You run `cargo build` and it takes some amount of time, and then you run `cargo build --release`, and it takes a longer amount of time. And the difference between those two is, with release, it’s going to apply a bunch of optimizations, which takes a bunch of time. And what this will let you do is say, for this dependency or this set of dependencies, like build scripts or `proc_macro`, as has been mentioned, I don’t want to do optimizations because they don’t matter. So even if I compile in release mode, I don’t want to optimize those libraries. And this might buy you a bunch of... basically, it might give you `cargo build` performance for some of your dependencies, even though you’re building with release, without any real cost to your final binary.
The other use case for this, which is what got me a bit excited, is to do it the other way around, where, imagine that you have some library and you depend on a library that, just like, is extremely slow in the default debug mode. And so if you run your test suite and you build this dependency in debug mode, your test suite just takes forever. Examples of this are things that do compression, encryption, dealing with image files, that sort of stuff where you really just always want to compile them in release mode, even if you’re doing a debug build, because otherwise your debug build is just going to be horribly slow to run. And so here you can now do that with these profile overrides as well. You can say, like, `profile.dev`, which is, like, the default development build, `.package.compression-handling-library`, and set `opt-level = 3`. And now it will be built to be fast, and then all the other code will still be built quickly and in debug mode. And so this is a really nice way to sort of customize your build, so that the things that need to be fast are built slowly so that they’re fast, and the things that can be slow are built quickly and are slow to run.
Ben: So I think that’s all for our release notes this time.
Jon: I had one more, actually, that I found while scrolling through the infinite list of changes, which is, you can now arbitrarily nest receiver types in the self position. Now this is a little bit weird. Some of you might be aware that, in addition to being able to write methods in impl blocks that take, say, `&self` or `&mut self` or just take ownership of `self`, you can also write `self:` and then some type that dereferences to `Self`. And the prime example of this is `Pin`, that some of you might be aware of, right? You can write `fn poll(self: Pin<&mut Self>)`. And previously you could do this with other types as well. You could implement a method for your type that’s like `self: Box<Self>`. So that method would only be callable if the caller has a `Box` of your type. This might be handy in certain cases where you want to have this restriction, but previously you could only go one type deep, so you could say `self: Arc<Self>`, `self: Box<Self>`. Now you can nest these arbitrarily; you can do things like `self: Arc<Box<Rc<RefCell<Self>>>>`.
Ben: Hopefully you wouldn’t need that. But I think a better use case is like, you know, maybe a `Pin<&mut Self>`.
Jon: So `Pin<&mut Self>` you could already do, but now you can do, like, `Box<Pin<Self>>`, which you could not previously do. We should note that these types still all have to be types from the standard library. You can’t have arbitrary self receiver types for types that are implemented outside of the standard library. Although that, I think, is coming under a feature flag.
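To make the receiver discussion a bit more concrete, here is a small sketch, assuming a 1.41 or later compiler. The `Counter` type and its method names are hypothetical; the nested receivers shown (`Box<Box<Self>>` and `Rc<Box<Self>>`) are just two combinations of standard library pointer types, chosen to keep the example short.

```rust
use std::rc::Rc;

struct Counter {
    hits: u32,
}

impl Counter {
    // One level deep: only callable on a Box<Counter>.
    // This form was already accepted before 1.41.
    fn hits_boxed(self: Box<Self>) -> u32 {
        self.hits
    }

    // Two levels deep: only callable on a Box<Box<Counter>>.
    fn hits_double_boxed(self: Box<Box<Self>>) -> u32 {
        self.hits
    }

    // Mixing pointer types: only callable on an Rc<Box<Counter>>.
    fn hits_shared(self: Rc<Box<Self>>) -> u32 {
        self.hits
    }
}

fn main() {
    let a = Box::new(Counter { hits: 1 });
    let b = Box::new(Box::new(Counter { hits: 2 }));
    let c = Rc::new(Box::new(Counter { hits: 3 }));
    println!("{} {} {}", a.hits_boxed(), b.hits_double_boxed(), c.hits_shared());
}
```

Each method is only available when the caller holds exactly that wrapper, which is the restriction described in the conversation.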
Ben: There’s definitely an RFC for it. I think the, kind of, impetus for getting the initial implementation out was `Pin`, for having any kind of arbitrary self type. And then, I think, I’m not sure if, like, the `Pin<&mut Self>` was a hard-coded thing, just for `Pin` itself, just because of the design of the `Future` trait. But yeah, now it is gradually becoming less and less hard-coded, less and less special-cased. So hopefully it should come eventually.
Jon: Yeah. And the reason this is neat is, imagine that you implement, like, some hyper-optimized reference-counted type. Currently it’s not that ergonomic to use, whereas once this lands, you could actually have this sort of `self: YourArc<Self>` and have that just work. So that’s something to watch out for in the future, maybe one day.
That, I think, is everything I had.
Ben: All right.
Jon: I want to, yet again, thank the people who volunteer their time to edit these episodes. We are eternally indebted to you. And I also want to thank Ben, for all the great RustFest interviews. It’s been really fun to listen to.
Ben: I have still, like, seven more to edit.
Jon: That’s great. I’m excited to hear all of them.
Ben: They’re coming out, though they’re not going to be super timely. But these things take time, especially when you’re on a volunteer basis. Hey, speaking of: again, the call to action, as we mentioned before. If you want to come, like, edit these— any background at all in audio editing, which I guess— I don’t, maybe Jon does. But also, he has no time for that. Jon’s got pretty busy. He’s got some secret projects he was telling me about that you wish you could hear about. Oh, the secrets. Feels so good to know things that no one else knows. In any case, if you want to help out, come follow those links to any of our Twitter, Discord, etc. and just give us a hail. And then we’ll throw an audio file at you, and you toss it back to us after it’s edited. And then we’ll just put it up, and put your name in the credits.
Jon: And you also get to hear the episode before anyone else.
Ben: That’s true. You still won’t know the secrets. Jon’s terrible dark secrets.
Jon: That’s right. That’s right. They’re buried deep, deep inside. And on that note, we will, I guess, speak again, next release in what, six weeks? Is it six weeks?
Ben: It’s always six weeks.
Jon: I mean, I don’t know. Maybe Rust has its own schedule.
Ben: I mean, Firefox did move to a four-week release schedule.
Jon: That’s true.
Ben: So I mean, hopefully— I don’t want to go to a four-week schedule. That’s too much. Even six weeks is pushing it.
Jon: I guess one thing we didn’t talk about was the Rust 2020 roadmap, which was released.
Ben: I think it’s an RFC still. But I mean—
Jon: Oh, that’s true.
Ben: But also it’s kind of, in this case it is— they’re trying to set fewer concrete goals themselves because people tend to get, like, you know, it’s hard to guarantee that any given thing will get done because there are so many things that are being done on a volunteer basis. You can’t just be like “you’re fired if you don’t get this done,” like, no, you’re a volunteer, what are they going to do? In this case, it’s kind of like, again, polishing up various things from previous years. But people should read it— we’ll leave a link to it in the release notes here.
Jon: Yeah, I think I was a big fan of the direction they chose for this year. Even though it’s relatively high-level, I think it’s a good direction to target.
And with that I guess we’re signing off again. And there’s no futures await pun this time.
Ben: No, no.
Jon: This is very sad. Goodbye then!