What's New in Rust 1.50 and 1.51
Jon Gjengset: Hello, Ben.
Ben Striegel: Hello, Jon. Good to hear you again.
Jon: What a surprise to find you here.
Ben: Not really a surprise, because the Rust release train is right on schedule.
Jon: And I guess also because we planned it, I suppose.
Ben: We did plan it. That is true. Not surprised, because we didn’t just bump into each other right here.
Jon: Yeah. All right. How do you feel about being halfway to 100 releases?
Ben: Wow, it’s really, uh— it’s quite a milestone. I personally am halfway to 100. Excellent.
Jon: Mm. The real question, I guess, is what happens when we tick over to 100. Does that mean that the “1.” gets incremented to “2.”?
Ben: I suppose.
Jon: Scary stuff. Scary stuff. The real question is, does Rust 2 come before Rust 1.100?
Ben: I mean, what do you think? What do you— let’s make— right now, let’s make a bet and see if history will venerate our decision.
Jon: I think there’s a really strong desire to never release Rust 2, because I think it’s just going to cause a lot of pain.
Ben: In that case, then let’s make a bet on at which version Rust will begin doing the Java thing, and just totally omit the one point before every single release number.
Jon: Oh, yeah, that’s a good one. I think that’s going to happen with release 100.
Ben: Maybe. It’d be a good milestone for it.
Jon: Yeah, I like that.
Ben: All right, that’s our prediction. What is it, six more years, we’ll find out if it’s true.
Jon: And this podcast will be around to give you the message.
Ben: We’ll be here. Write it down.
Jon: All right. So today we’re doing Rust 1.50 and Rust 1.51, as has become tradition now. And that actually works really well, because 1.50 is sort of the end of the era before const generics, and 1.51 is the start of the era after. And clearly, this is going to be a land shift moment. Right, Ben?
Ben: Well, it’s been rolling in for a while. I think we have a few things from the previous releases that have been showing the harbingers of const generics.
Jon: And that’s why I’m pretty happy that we’re doing both of these at the same time. Because I feel like 1.50 is sort of cleaning up all the stuff inside the— or rather, inside the standard library that makes use of const generics. And then 1.51 is sort of, now we’re ready to let it loose on the world, at least in some limited fashion.
Ben: I wouldn’t quite say all this stuff. There are still some things that aren’t quite perfect, as we’ll get into today.
Jon: That’s true. But that just gives us the opportunity to make things even better in the future.
Ben: There’s no challenges, just opportunities.
Jon: Exactly. So the first thing in 1.50 is const generic array indexing. This one struck me as a little weird, because I’ve always been like, can’t you always just index into arrays? Like, wasn’t that always a thing? But I think what’s really changed here is that arrays now implement the Index trait for any length.
Ben: Like officially, they do. Previously, it was magic, right?
Jon: Yeah. Yeah, exactly. And I think the magic didn’t implement the trait. Like, you were able to use square brackets to index, but the Index trait wasn’t implemented for arrays of arbitrary lengths.
Ben: So we can remove some edge cases from the compiler, where it’s like, yeah, square brackets should defer to Index, unless it’s an array, in which case use this magic over here.
Jon: Yeah, and I feel like it’s also nice for library authors, right? Like now, if you’re generic over Index, you can actually take things that are arrays without having to force the user to make a slice first.
Ben: That’s pretty cool. I didn’t think about that use case. Yeah, that’s nice.
Jon: I mean, I don’t know how many use cases are generic over Index. The best example I can think of is, imagine that you have a data type that’s generic over its backing storage, and it implements Index if its backing storage implements Index. That will now work with arrays.
Ben: Yeah.
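(editor’s note: here is a minimal sketch of that use case, using a hypothetical Storage wrapper type that forwards indexing to its backing storage:)

```rust
use std::ops::Index;

// Hypothetical wrapper that is generic over its backing storage S and
// forwards indexing to it.
struct Storage<S>(S);

impl<S: Index<usize>> Storage<S> {
    fn get(&self, i: usize) -> &S::Output {
        &self.0[i]
    }
}

fn main() {
    // As of 1.50, [T; N] implements Index for every length N, so an
    // array works here directly, no slicing required.
    let s = Storage([10, 20, 30]);
    assert_eq!(*s.get(1), 20);
}
```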
Jon: The next thing, in 1.50 I guess, is const value repetition for arrays.
Ben: So this isn’t actually— yeah, this is not, in fact, a brand new stabilization. This is just sort of an “oopsie,” as they call it in the compiler world.
Jon: Yeah, which is so funny. Do you want to talk about the oopsie a little?
Ben: Yeah, sure. So, the array repetition syntax— maybe you’ve seen it before. It is [some_expression; some_number], like, you know, 5, 10, whatever. And that gives you an array of that length filled with that value. In this release, it is acknowledged that you are now allowed to use a const as the value here. Not for the length, which has always been allowed, but for the expression itself. But in fact, this has been true since— what is it? Rust 1.38, this actually became true. It was accidentally kind of released into the wild. And in this case, it’s not really a bad thing that it was, but it wasn’t entirely intentional. Just one of those things that kind of went under the radar. And so now this is acknowledged as being stable.
Jon: Yeah, like, accidentally worked.
Ben: Yes. So it’s official. It actually is here, and you can rely on it without feeling bad now.
Jon: I think one thing that made me really excited for this is that you might go, like, well, why are constants interesting here? But with all the work that’s been happening on const— like, the constification that we’ve been talking about over the past, what, year or so?
Ben: Years.
Jon: — is really cool, because now you can do things like create an array of Vec::new()s, because Vec::new is a const method now. Or imagine you have some super complex data structure. You can now store a const None, because None is a const. Well, I guess maybe you could always do that with None? No, it had to implement Clone— or Copy, actually. But now you can have some really complex type that does not implement Copy, and if you have an Option of it, you can still create an array of them by setting them to None. Because None is a const expression.
Ben: Yeah, it’s one of the— I think arrays are the big beneficiaries of the const generic support landing now, where it’s kind of like, they’re going from sort of magical language-provided things, to more like library type, almost. Still a primitive, still fundamental, and privileged, too, with their own syntax, but way less of their own, like, special little snowflake sort of data structure.
Jon: Yeah, it’s really nice.
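(editor’s note: a minimal sketch of const value repetition; the repeat operand can be a named constant even when the element type is not Copy:)

```rust
const EMPTY: Vec<u8> = Vec::new();  // Vec::new is a const fn
const NONE: Option<String> = None;  // String does not implement Copy

fn main() {
    let buffers: [Vec<u8>; 4] = [EMPTY; 4];
    let slots: [Option<String>; 8] = [NONE; 8];
    assert!(buffers.iter().all(|b| b.is_empty()));
    assert!(slots.iter().all(|s| s.is_none()));
}
```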
The next one is kind of interesting because this ties back to I think what we talked about in 1.49 in the previous episode.
Ben: Certainly, yeah.
Jon: Which was— in 1.49, just to sort of recap, it suddenly enabled you to implement Drop for unions, and it allowed union fields to not just have to be Copy, but they could also be ManuallyDrop. And the idea here is that when you drop a union, you don’t know which variant the union is. That’s sort of the point of a union. It might be one of many different things, so you can’t actually drop any of the fields. And any type that implements Copy does not implement Drop. So that’s why a union could implement Drop and contain Copy types. And what changed in 1.49 was that it was also allowed to implement Drop if it contained ManuallyDrop types, because those don’t get dropped automatically; they’re manually dropped. And what was realized in 1.50 was that because they are ManuallyDrop, it should be safe to assign to them as well. Right? So imagine that you have a union. You don’t really know which variant of the union you have. It’s always safe to just assign to a given field of any of the variants of the union if it’s ManuallyDrop, because the act of writing to the union is safe (reading from a union is unsafe anyway), and the value you overwrite would normally get dropped, but because it’s ManuallyDrop, it doesn’t get dropped. So the change here is that now it is safe to assign to a ManuallyDrop union field, whereas previously it was—
Ben: Yeah, this was a surprise.
Jon: Yeah, it was only safe to assign to Copy fields before.
Ben: Because my mental model of unions is like, well, they’re the unsafe version of enums. And so, like, everything is unsafe, in my mind, for a union. But actually, it makes sense, sort of like how making a raw pointer in Rust isn’t unsafe, although we’ll get to that in a sec. But reading it is. In this way, sort of, writing to a union is totally fine in this case. But reading from it isn’t.
Jon: Yeah.
Ben: So Rust is pretty precise about this, to not have to have too much unsafe lying around, where it’s not needed.
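(editor’s note: a minimal sketch of the 1.50 change, using a hypothetical Slot union:)

```rust
use std::mem::ManuallyDrop;

// A union mixing a Copy field with a non-Copy field wrapped in
// ManuallyDrop, which has been allowed since 1.49.
union Slot {
    num: u64,
    text: ManuallyDrop<String>,
}

fn main() {
    let mut slot = Slot { num: 7 };

    // Safe as of 1.50: writing never reads or drops the old contents.
    slot.text = ManuallyDrop::new(String::from("hello"));

    // Reading is still unsafe: we must know which field is active.
    unsafe {
        println!("{}", &*slot.text);
        ManuallyDrop::drop(&mut slot.text); // avoid leaking the String
    }
}
```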
Jon: And I do really like this sort of endeavor to make all the things that are safe be safe. Another thing that I thought was— like, it seems like a sort of small change, but it gets us a really cool optimization that I don’t think we’ve talked about on the podcast before. And this is the idea that on Unix platforms, File now has a niche. nitch? neesh?
Ben: I say neesh. I’m not sure; I’m not an English (unintelligible— 9:57).
Jon: I think it’s nitch, because a neesh, if something is neesh, it’s— Well, maybe, I don’t know.
Ben: This is kind of niche.
Jon: Yeah, I don’t know. Well, I guess a niche is like, something that’s highly specialized and narrow. So maybe that is what it is.
Ben: So, yeah, let’s talk about that real quick. So, a niche for File on Unix platforms. And so we were just talking about unions, and how a union is the unsafe version of an enum. And they’re unsafe because the compiler— or also, the runtime code itself— can’t actually check which variant of your quote-unquote “enum” you are in right now. You’ve got to keep track of that manually. And programmers are distracted sometimes. Sometimes they’re tired, haven’t had their coffee, so they can get it wrong. And that’s the idea: enums can do it safely; unions can do it unsafely. And enums do it safely, at least in the naive sense, by not just having, you know, “thing A” and “thing B” in the enum— the data that you actually want to track— but also having another field, a secret hidden field, that tracks which variant of the enum is currently active. It just exists in the enum.
But Rust, being a language all about zero cost abstractions, wants to try really hard to not impose this extra memory cost on users. And so, what if we could kind of squirrel away the enum variant data somewhere inside the data that’s already being held in the enum? And so the obvious example of this is, imagine a reference. Your enum contains a reference somewhere. References in Rust can’t ever be null; they can’t ever have the value 0. But that bit pattern is not invalid at the machine level— it can exist in memory. And so if Rust wanted to, it could put some data there. And in this case—
Jon: I guess in this case, like a single bit of data.
Ben: All you need is a single bit of data, which doesn’t sound like a lot, but actually that’s enough to completely eliminate a byte’s worth, or more, of data from an Option— an Option being the classic enum where it’s either a thing or it’s not, because in the “not” case, it only takes a single bit of data to describe. So if you have an Option whose Some contains a reference to anything, the size of that Option will only be the size of a single reference, and not the size of two— which, because of alignment and padding, is what you would otherwise need.
Jon: Yeah, and I think it’s interesting because it doesn’t have to be just Option. And I think this is something that’s been worked on internally in rustc, which is to find a way to— and this might have landed already on nightly— support the niche optimization for, I think, any enum with two variants where one variant does not hold data.
Ben: I think it’s been stable for a long, long time. It’s not just nightly. I think this has been around for, like, years now, that users have been able to take advantage of this.
Jon: Yeah, but I think what’s in nightly now is, if the niche is larger, and there are more variants, and all of them hold no data except for the primary one, then you can take advantage of that, too.
Ben: Okay, yeah, it gets smarter all the time, so it’s a pretty involved thing. I think it’s smart enough that if you have nested Options, it might be able to contain the data for both variants inside the niches.
Jon: Yeah, depending on the size of the niche, right?
Ben: Yeah.
Jon: And I guess, to bring this back to the changelog, what has changed in 1.50 is that a File on a Unix platform is represented by what’s known as a file descriptor, which is really just a number. But it is specifically a signed number, and it’s a number that cannot be -1, because -1 is used on Unix to represent that an operation failed and that you need to go check the error codes. And so -1 is a niche for File on Unix platforms, because a File cannot ever hold that value as its file descriptor. And so it can be used to represent None for an Option<File> instead.
Ben: Yeah, squeezing out memory savings. That’s a great use of Rust’s time, I think.
Jon: That’s true.
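(editor’s note: the niche optimization is observable with mem::size_of; the File assertion holds on Unix platforms as of 1.50:)

```rust
use std::fs::File;
use std::mem::size_of;

fn main() {
    // References can never be null, so Option<&T> uses the null bit
    // pattern for None: no extra discriminant byte is needed.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());

    // A File's descriptor can never be -1 on Unix, so Option<File> can
    // use -1 for None and stay the same size as File.
    assert_eq!(size_of::<Option<File>>(), size_of::<File>());
}
```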
Ben: And those are all the language features. We have some library stuff, unless you want to talk more about that.
Jon: No, no, I think the library changes in 1.50 are pretty interesting. The first one is, like, such an innocuous-looking method. So this is the then method on bool. And the then method is pretty straightforward: there’s now a method on bool called then, where you pass in a closure, and it returns an Option of the result of the closure. So the idea is that if the boolean is false, then then will not call the closure and returns None. If the bool is true, then then returns Some of the value of calling the closure. So this is sort of a cheap, hacky way to turn a boolean into an Option. But the amount of work that it has taken to stabilize this method is just ridiculous. Mara, who’s one of the standard library maintainers, commented on Twitter that this took two years of discussions, more than 10 proposed renames, several hundred GitHub comments with over 10,000 words in total, four proposed final comment periods, and several video calls for this to actually land in its current form.
Ben: It’s bikeshedding of the highest order. It’s so small and trivial, that means that everyone has to get a word in.
Jon: Yeah, and it’s interesting, right? Because people had a lot of opinions here. There was everything from, this should never be stabilized because you can do it so simply already that it’s not worth having a special method for; there was the argument that maybe we should just implement, like, Into<Option<()>> for bool, and then people can just use into() combined with Option::map; maybe it should be called and_then instead of then. There were just so many proposals, so many suggestions. And ultimately, I think you’re right that it’s mostly bikeshedding. But I feel like it’s nice to have something that landed that’s just short and sweet. And I know for me, this will come in handy in certain one-liners that can be tidied up.
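(editor’s note: a minimal sketch of bool::then tidying up such a one-liner:)

```rust
fn main() {
    let temperature = 104;

    // Some(closure result) if the bool is true, None if it is false.
    let warning = (temperature > 100).then(|| format!("{} is too hot", temperature));
    assert_eq!(warning.as_deref(), Some("104 is too hot"));

    let none = (temperature > 200).then(|| "scorching");
    assert_eq!(none, None);
}
```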
The other thing that landed in 1.50 is the clamp method on Ord, and on floating point numbers. And the two are sort of related— like, the reason it could be added to Ord was, I think, because it was added to f32 and f64. Hm, that might not be true. It’s a good question.
Ben: They don’t implement Ord, no. Only PartialOrd.
Jon: Yeah, you’re right, you’re right. I think the implementation of clamp for Ord is really interesting, though. clamp, for those of you who don’t know, is a method where you give it a lower bound and an upper bound, and it returns the input value, but sort of saturated to those bounds. So if the input value is lower than the lower bound, it returns the lower bound instead. If the value is higher than the upper bound, it returns the upper bound instead. And this is not hard to implement yourself using min and max, but it is sort of weird— you have to pass the lower bound to max and the upper bound to min to get the right value.
Ben: It’s unintuitive, yeah.
Jon: And it sort of twists your brain. And so clamp is just a much nicer way to do it. And there’s been a clamp method on the integer types for a while. But now that it’s on Ord, if you have your own types that implement Ord, you can now use clamp directly yourself, which is really nice.
Ben: Yes, and clamp is defined on Ord, but the floating point types do not implement Ord, so there are now inherent versions of clamp on those types— you can still use it.
Jon: I wonder what happens if you clamp an f32 by not-a-number.
Ben: Well, the docs talk about this, so— let’s see. This function returns NaN if the initial value was NaN as well. I’m not sure what happens if you give it a NaN as a bound, because that obviously can’t be checked at compile time.
Jon: Yeah.
Ben: We could try that out.
Jon: Nice.
Ben: clamp it to a NaN. Oof.
Jon: What happens if the upper and lower bound are both NaNs— which NaN do you get back?
Ben: Yeah, what is the sign of the NaN that you get back, if they’re both NaNs but they have different signs?
Jon: Yeah.
Ben: Nightmare of floating point.
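(editor’s note: a minimal sketch of clamp on integers and floats; per the documentation, a NaN input propagates, while a NaN bound panics at runtime:)

```rust
fn main() {
    // clamp via Ord works for any Ord type, including your own.
    assert_eq!(15i32.clamp(0, 10), 10);  // above the range: upper bound
    assert_eq!((-3i32).clamp(0, 10), 0); // below the range: lower bound
    assert_eq!(5i32.clamp(0, 10), 5);    // in range: unchanged

    // f32/f64 get inherent clamp methods, since they are not Ord.
    assert_eq!(2.5f32.clamp(0.0, 1.0), 1.0);
    assert!(f32::NAN.clamp(0.0, 1.0).is_nan()); // NaN input propagates
}
```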
Jon: The other stabilizations, I think, are not too exciting. There’s or_insert_with_key on BTreeMap and HashMap entries. If you haven’t used entry, you should definitely look up the entry API— it’s fantastic. But if you have used it, you might have used the or_insert_with method: if you have an Entry into a HashMap or BTreeMap, and that entry doesn’t hold a value yet, or_insert_with inserts the value you get from calling a closure. And or_insert_with_key makes it so that that closure is supplied with the key for the entry that’s about to be inserted. This is like a nice quality of life improvement.
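(editor’s note: a minimal sketch of or_insert_with_key on the entry API:)

```rust
use std::collections::HashMap;

fn main() {
    let mut lengths: HashMap<String, usize> = HashMap::new();

    // The closure receives a reference to the key, so the default
    // value can be computed from it.
    let len = *lengths
        .entry(String::from("rustacean"))
        .or_insert_with_key(|key| key.len());
    assert_eq!(len, 9);
}
```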
Ben: I think that’s it for the new APIs.
Jon: Yeah, and then, of course, there’s the traditional set of constifications. None that struck me as particularly odd this time, except maybe pow. So this is the exponentiation function for integer types; that can now be computed at compile time.
I did also take my usual deep dive into the detailed release notes, and I found two things that were pretty interesting. The first is that the compare_and_swap atomic method has now been deprecated. And this is something that was decided a long time ago— like, back in Rust 1.12 or something— that this is probably something we wanted to do, but it didn’t actually land until 1.50. And the idea behind deprecating compare_and_swap is that atomic operations are already pretty hard to understand and make your way through, and the standard library already has the compare_exchange method, which is strictly more powerful than compare_and_swap— to the point that compare_and_swap is, I think, implemented by calling compare_exchange. And so the idea was, we should surface as few atomic operations as we can get away with, and then just document those really well. And that’s what’s happened here: compare_and_swap is now deprecated, and they recommend compare_exchange instead. If you’re curious about the difference between these, it really gets into atomic memory orderings, which are their own kind of special weird sauce. But basically, compare_exchange is the same as compare_and_swap, except that you can specify different memory orderings for whether the swap succeeded or failed. And this is important in certain lock-free algorithms, and also just algorithms that end up calling compare_exchange in a particularly hot loop or on a hot path, where you can do something useful if the exchange fails. In that case, you don’t want the compiler, or the CPU, to stall because of a stronger memory ordering than you need when the exchange failed. There’s a lot more detail there that I don’t think is worth getting into here, but basically, compare_exchange is a more powerful version of compare_and_swap, so we should just have that be the one that’s in the standard library.
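(editor’s note: a minimal sketch of compare_exchange; note the two orderings, one for success and one for failure, which compare_and_swap could not express:)

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let value = AtomicUsize::new(5);

    // If the current value is 5, replace it with 10.
    match value.compare_exchange(5, 10, Ordering::AcqRel, Ordering::Relaxed) {
        Ok(previous) => assert_eq!(previous, 5), // the swap happened
        Err(current) => panic!("unexpected value {}", current),
    }
    assert_eq!(value.load(Ordering::Relaxed), 10);
}
```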
The other thing I found in the detailed release notes is that Cargo now has reproducible crate builds. And before you get too excited, this does not mean that Cargo builds are reproducible. That’s still something, I think, that’s being investigated, but that’s a much, much larger task. Instead, this is about cargo package, which is also what happens when you run cargo publish. cargo package takes all of your source files and your Cargo.toml manifest and stuff, and basically creates a source tarball for your project, and that’s what ends up getting uploaded to crates.io. So when you run a cargo build and your dependencies are downloaded, it’s those .crate files— which are really source tarballs— that get downloaded from crates.io and then extracted, and that’s what Cargo builds. And now the process of constructing those tarballs is reproducible. So if Ben and I have checked out the same commit of the same repository, and we both run cargo package— or cargo publish, for that matter— we end up producing, and potentially uploading, the exact same tarball. Like, they would hash to the same value. This normally doesn’t matter too much, but it is really handy when you have larger build systems and stuff, where you want to make sure that you only recompile if something has actually changed. Which means that if nothing has changed and you run the same procedure, you don’t want it to create a new file that makes it seem like something has changed.
Ben: I imagine it could also be useful just for security, to like, know that you have read the repo for a crate, and then to actually ensure that the repo contains the code, or the crate contains the code in the repo.
Jon: Yeah, that’s also true.
Ben: Package it yourself and compare the hash.
Jon: Yep, yep. Absolutely. And this is something you can imagine doing— you could check out a bunch of random crates, run cargo package, and then check that the hash matches what’s on crates.io, to see that no funky business has happened.
Ben: Yeah.
Jon: I think that’s all I had for 1.50. I feel like 1.50 was a sort of small release. Like, all of the stuff there is good, but it’s fairly small stuff, and that’s okay. I feel like all of it is good, sort of, forward progress.
Ben: That’s the train model for you. Sometimes, you know, there’s not many passengers on the train. Sometimes there are.
Jon: Would you say that there are many passengers on the train for 1.51?
Ben: There’s at least one really big passenger on this train. Rust 1.51, the newest release of Rust. The headline feature of this one, which we’ve been kind of teasing for quite a long time now, is the const generics MVP. And you’ve got to stress the MVP part; this is not the fully realized, final form of const generics as it will exist someday. This is just the, kind of, most minimally useful subset that works right now and is solid enough to stabilize. So in this case, all of the goodies we’ve been talking about for the past few releases now— where the standard library is implementing, say, Index for arrays, and doing all kinds of stuff— now you, the user, can do that yourself in your own code. It’s all available to you right now. And I’ve seen plenty of folks very thrilled to be able to make use of this in their own crates. So, as a Rust library user, I’m sure within the next few months you will start seeing your own dependencies update to make use of this in various cases. Do you want to talk more about the const generics support here, Jon?
Jon: Yeah, I think one thing we’re going to see is a lot of projects deleting macros that they’ve written to implement a trait for arrays of various lengths. Right? Like, people have a macro.
Ben: That feels good.
Jon: Yeah. There’s so many of those, that just implement the trait for all different array lengths up to 32, 64, 100 or whatnot, and all those can now go away. And suddenly arrays of longer lengths will be much more usable in the language. This also means that arrays are more useful for things like, if you want to store, say, an image file that has fixed size— like, known resolution size, you can now actually store that as a contiguous array with a known type, rather than having it always be dynamically sized. And that’s just really nice. It just feels more proper. It feels like you’re encoding more information in the type system. That said, there are still some libraries I’ve seen that have not been able to fully replace their, either macro, or type system shenanigans with const generics, because there are a few restrictions still.
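(editor’s note: a minimal sketch of the kind of per-length impl that can now be deleted, using a hypothetical Zeroed trait; a single const-generic impl covers every array length:)

```rust
trait Zeroed {
    fn zeroed() -> Self;
}

// Before 1.51 this impl would typically have been generated by a macro,
// once per supported length, usually only up to 32. Now one impl covers
// every N.
impl<const N: usize> Zeroed for [u8; N] {
    fn zeroed() -> Self {
        [0; N]
    }
}

fn main() {
    let big: [u8; 1024] = Zeroed::zeroed(); // lengths above 32 just work
    assert_eq!(big.len(), 1024);
}
```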
Jon: Yeah, so I mean, the restrictions are— they’re definitely for more involved use cases. So one is, you can currently only be const generic over, I think it’s bools, chars, and the integer types. And I think, sort of, in the ultimate version, you’ll be able to be generic over any constant value of any particular type you give. That’s not currently something you can do. You also currently can’t have the const generic’s value depend on other generic parameters. So you can’t say that, like, I’m generic over some const M, and this other type is going to be generic over M + 1, for example. Like, you can’t put in expressions that are themselves generic.
Ben: And I think expressions in general are kind of limited right now. You can’t have, like, a where clause for const generics, like where X = M + N or anything.
Jon: Yeah, exactly. And it also can’t depend on other generic types. So if you have some type that’s generic over T, you can’t use a const generic inside of there and then pass in, like, mem::size_of::<T>(). That’s also not supported yet. So there are some limitations, but they’re ones that I think are expected to be lifted over time. It was more that the current subset enables so many things that weren’t possible before, so we should get it out the door, because it’s just super valuable in and of itself. And then over time we can relax the restrictions that are in place.
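(editor’s note: a minimal sketch of what the MVP does and does not allow, using a hypothetical Buffer type:)

```rust
// Generic over a usize constant: allowed in the 1.51 MVP.
struct Buffer<const N: usize> {
    data: [u8; N],
}

impl<const N: usize> Buffer<N> {
    fn new() -> Self {
        Buffer { data: [0; N] }
    }
}

fn main() {
    let small = Buffer::<16>::new();
    let large = Buffer::<4096>::new();
    assert_eq!(small.data.len() + large.data.len(), 4112);

    // Not allowed in the MVP; these would not compile:
    // struct OffByOne<const N: usize>([u8; N + 1]);     // generic expression
    // struct OfType<T>([u8; std::mem::size_of::<T>()]); // depends on T
}
```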
Ben: Yeah, there’s a great blog post on the Inside Rust blog that we can link to. Well, let’s assume that we have linked to it and you can click it below.
Jon: That our future selves have already added links to.
Ben: That, one, future yourself is now remembering to link, and that future listeners are now remembering to actually click on.
Jon: Nice. The other thing that’s really cool, I think, and that’s sort of related to this const-generic-ification, is that now we finally have a way to take an array and turn it into an owned iterator. With a slight caveat. But do you want to talk a little bit about this?
Ben: Yeah, sure. So, just briefly, for loops in Rust: pretty cool. If you have, say, a vector, you can just say for thing in vector, and then curly braces, and it iterates over the elements for you. And that is a pretty nice little piece of syntax that any user type can opt into, not just the standard types, by implementing the IntoIter trait. Actually, it’s specifically the iter::IntoIter trait. (editor’s note: the trait is called IntoIterator, and it contains a method called into_iter.) And so if you do this, then whenever you use this type in a for loop, it will implicitly, behind the scenes, call into_iter on that type, and then iterate over the value that it gives you.
So now we can possibly implement into_iter for arrays— like, remember how we just talked about implementing Index for arrays. It should be possible to do it, and it is possible to do it. But there is a bit of a problem. So in Rust, there is a thing called “autoref.” Methods have receiver types: there’s self and &self and &mut self. How do you call these? If you have an &self method and you have a value foo, you don’t need to say (&foo).whatever(); you can just say foo.whatever(), and Rust will, behind the scenes, generate a reference to that value for you. This has some complications with method resolution, though, with how arrays and slices interact. So slices have always been usable in for loops; you’ve always been able to say for x in some_slice {, and that operates via the into_iter transformation. But the thing is, slices are always reference types, and so even if you take a slice by value, the elements inside are going to be by reference. And so in this case, if you had an array and you tried to call .into_iter() on it, what it’s going to do is say: hey, I don’t see into_iter on this, but I do see that if I reference this, I will get the into_iter implementation on slices. And so that is, today, what happens if you have an array and you call .into_iter() on it. Now it’s not—
Jon: Or just pass it to a for loop, right?
Ben: Yeah. Any of this. Well, so today if you pass it to a for loop— if you have a for item in some_array— nothing happens. It doesn’t work, and— does it? I don’t think it does.
Jon: I thought that worked. But let me—
Ben: I think the problem is specifically about calling .into_iter on an array, an actual array. And the thing is, because it’s going to reference your array and then turn it into an iterator, you’re only going to be able to get the elements by reference. And so it’s actually no better than if you had called array.iter(), because iter is the method that naturally gives you a by-reference iterator. Where into_iter is by value, iter is by reference.
Jon: You are completely right that it doesn’t work, if you do, like, for item in and then you give it an array—
Ben: Yeah, it wants you to add on the ampersand there, to turn it into a slice first. So the idea is, we could add the implementation of into_iter for arrays. But now suddenly that would break code, because it changes the actual value you get back from the iterator: previously it was a reference, and now it’s the raw, owned value. So for at least, maybe, a year and a half now, this problem has been foreseen, even though const generics hasn’t been stable, or even usable, for that long. And so Rust has been warning for quite a while now that if you have an array and you call into_iter, hey, in the future, this might change. Just so you know. This is an example of a future compatibility warning, which Rust uses occasionally, often at edition boundaries, to change behavior in ways that otherwise might break some code. And so there have been various Crater runs— Crater is the tool that gets used to check whether any of the code on crates.io would actually break from doing this— and there have been some regressions from theoretically doing this. And so patches have been sent, but many of these projects are older and obviously aren’t being maintained anymore; they’re kind of just languishing out there. And so the question is, well, is there a way of doing this over the edition, implementing into_iter for arrays? Is there a way of maybe just breaking it, or a way of mitigating it somehow? And the compromise that exists right now is to say: hey, let’s not worry about implementing the trait yet, but let’s still provide a way to iterate an array by value, without having to make it into a slice first and only getting the elements by reference. You could do it via the cloned method on iterators, to get a cloned copy of the values inside the iterator, but we don’t want to introduce unnecessary copies. This is Rust. So there is now a new struct in the array module. It’s called IntoIter, kind of confusingly, but the idea is that this is hopefully a temporary, transitional sort of thing. So if you import array::IntoIter, you can call IntoIter::new with an array, and that gives you a by-value iterator over the array, with no extra copies, no fuss. And so, in the future, this might become redundant, or maybe someday even deprecated, hopefully in favor of an actual into_iter implementation on arrays. But in the meantime, there’s no reason that arrays can’t take part in the fun of being turned into by-value iterators.
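(editor’s note: a minimal sketch of array::IntoIter as stabilized in 1.51; arrays later got a real IntoIterator implementation in Rust 1.53, and this constructor was eventually deprecated:)

```rust
use std::array;

fn main() {
    let names = [String::from("a"), String::from("b"), String::from("c")];

    // Iterating by value: each `name` is an owned String, not a &String.
    for name in array::IntoIter::new(names) {
        println!("{}", name);
    }
}
```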
Jon: I was really hoping that maybe it would be possible to have IntoIter implement From<[T; N]>, so you could just do for item in array.into(). But unfortunately, the type inference gets too complicated. But I wonder— it sounds like maybe this will happen on an edition boundary?
Ben: Yeah, there are some complications, to be determined. I think right now there’s been another Crater run queued up. One of the problems, kind of getting into the behind-the-scenes here, is that a lot of people wouldn’t have noticed the warning, because it might have been happening inside of their dependencies. And with Cargo, when you compile, a warning in a dependency isn’t necessarily shown. And so the idea is that these future compatibility warnings will bypass Cargo’s automatic hiding of warnings, and actually be shown to the user. And I believe that recently landed in Cargo, though it’s not stable yet, as far as I know. And so once that does land, we can definitely be more confident that people will have seen this in their dependencies, and will have either fixed it, forked the dependencies, or transitioned to new ones somehow. But that might not happen in time for the edition. And it’s unclear whether an edition could have, like, two different IntoIter traits to control whether or not you’re doing this or that. So it’s up in the air still; we’ll see what happens.
Jon: It’s an exciting future prospect.
Ben: It’s always exciting with Rust.
Jon: Speaking of Cargo, actually— Cargo got a pretty major new feature in this 1.51 release, too. And that is, Cargo got an entirely new feature resolver. And that might not mean much to you if you haven’t been digging around in Cargo’s internals, but one of Cargo’s primary jobs is to resolve all your dependencies. So that is, given the dependencies that you specify in your Cargo.toml, and all of the versions that are available on crates.io: which versions of which crates do I have to compile, in which order, with which features? Solving that problem turns out to be very, very, very hard, but Cargo does a pretty good job at it. And one of the, sort of, let’s call it simplifying assumptions that Cargo made in the past, and arguably for good reason, concerns what happens when you have two paths to one dependency. So imagine that you depend on bar and you depend on baz, and both bar and baz depend on foo— think of it as a diamond shape. And then imagine that foo has two different features, feature a and feature b, and bar enables a while baz enables b. Then Cargo will sort of unify these features. Cargo tries pretty hard to avoid compiling anything twice if it can avoid it, so rather than compiling foo once with a and once with b, Cargo will just compile foo once with both a and b enabled.
And in general that is the behavior you want, because you don’t want to compile things lots of extra times and then have lots of duplicated-but-not-quite-duplicated artifacts sitting around. But the downside of this unification is that Cargo sometimes goes a little too far. So one example of this would be, imagine that bar is a dev dependency and baz is a non-dev dependency. Then you don’t really want to merge the features across them, because if you’re doing a release build, and you’re not planning to do a debug build or a test build, you just want to build the release. You don’t then want to compile foo with the features that are only used in testing. The best example of this might be, imagine you pull in something like tokio. So tokio has a lot of different features. Some of them are very small— like, if you have tokio with no features, it compiles basically nothing; it’s just a bunch of traits and maybe a couple of types. But there are some features that bring in a lot of stuff, like the multi-threaded executor feature— that’s a lot of code that has to be compiled. Most projects will only include the multi-threaded runtime feature in their dev dependencies, because they need it for tests, but not in their normal dependencies. But the way that Cargo used to work, it would always compile tokio with all of those features, even if they were only used for testing and you were doing a release build. And that is, of course, unfortunate. It means that you spend a lot more time compiling than you otherwise would. And there are cases where this even breaks builds, and that is part of the reason why it was decided that a change had to be made. Imagine that you’re doing cross-compilation. So let’s say that I’m on Linux and I want to compile for Windows. Now imagine that in my build dependencies, which are run on Linux, I enable the Linux feature of foo, and in my regular dependencies— what gets built for Windows when I do the cross-compilation— I enable the Windows feature. In the old resolver, Cargo would then go, oh, I’m going to unify and help you out, and compile foo with both the Linux and Windows features. But the Windows features don’t compile on Linux, and the Linux features don’t compile on Windows. So I end up just being unable to compile foo when doing this kind of cross-compilation. And so this is a problem. There are some other examples around hosts and proc_macros and stuff that we don’t need to get into. But basically, you can see how this becomes a problem if you end up over-unifying, in a sense.
So what happened was that the fantastic Cargo maintainers implemented a new version of the resolver that understands when it needs to keep features separate and when it can unify them. So for example, you can’t unify across targets or across host types, and you probably shouldn’t unify across dev dependencies unless you are actually doing a test build. And that landed with an additional field you can set in Cargo.toml that says resolver = "2". When you set that, you’re telling Cargo to use this new, updated, smarter resolver. Unfortunately, the sort of second-generation resolver can’t be the default, because it’s not backwards compatible. There might be some crates out there that have a feature enabled in their dev dependencies that is actually needed to build the crate even when you’re not building the tests— but that wouldn’t be visible with the original resolver, because it just unifies the features for you. So it might be that if resolver = "2" landed as the default, some crates wouldn’t build, because they would end up with too few features enabled in a normal build. So for now, it’s a specific thing you have to opt into when you create a new package. But I think the plan, based on the RFC, is that this new resolver is going to be the default in the next edition. It’s a pretty cool change. I’m very glad this landed; I think it’s going to make a lot of compilations faster, because you end up not compiling more than you need unless you need it. It also has some implications, I think, for things like no_std crates, where you might want to compile a dependency with std support when it’s a build dependency, but without std support when it’s a real dependency. Really, this second-generation resolver is doing things the right way. Of course, the downside is that you might now end up compiling some dependencies more than once. That’s something it has to do for correctness, but it is a cost you should be aware of that might be coming down the pike.
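(editor’s note: opting in is one line in Cargo.toml; the package name here is just a placeholder:)

```toml
[package]
name = "my-crate"
version = "0.1.0"
edition = "2018"
resolver = "2"   # use the new feature resolver (Rust 1.51+)
```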
Ben: Yeah, I think it’s a bit too early to say whether or not it will actually improve compile times, because it might also make them worse. But the idea being that, hopefully you aren’t building your dependencies from scratch very often, so it shouldn’t be a recurring cost.
Jon: Yeah, I think the idea is that it’s going to improve the time for release builds. And it’s going to fix some of these problems that are just inherent, right? Like, if you’re doing a cross-compilation, or building no_std, it might be that you currently just cannot build at all. And it will fix those.
Ben: Yeah. I agree that it’s necessary for correctness.
Jon: Yeah. There’s another feature that landed in, I guess, Cargo and rustc— it’s sort of an overlapping feature. And this is this notion of splitting debug information on macOS. This is a little bit hairy and weird, but basically, when you do a debug build— or any build that has debug symbols enabled— this is the kind of stuff where, like, you run your program and get a backtrace, or your program crashes, or you try to debug it in GDB, and rather than just getting pointer addresses for everything, you actually get the names. Those are pulled out of the debug information. And how that debug information is stored and compiled varies from platform to platform. On macOS, it used to be that all of the debug information was passed into this tool called dsymutil, which constructed a big folder that contained all of the debug information. And that process was fairly slow, especially if you had a large binary with lots of debug information and lots of, sort of, instructions. And in particular, it was slow because dsymutil had to be run after linking. So you had to produce your full program binary, then run dsymutil, and then end up with all this debug information. And that means that even if you make a small change to your program, you might need to spend a lot of time regenerating all of the debug information. And that’s really unfortunate. It means that your builds are slower than they need to be.
So what’s cool is that in 1.51, macOS will now take advantage of a different way of storing debug symbols, where rather than extracting them and storing them separately, the debug tool that you use— this might be GDB, for example, or LLDB, I guess, is another example— understands how to find the .o files, the binary intermediate artifacts that get produced during compilation. So you can just leave the debug information in there, and there are pointers in the final binary that point back to those files. So if you try to debug your final binary, it will just go to those files to find the debug information. Which means that if you make a small change to your program, only the .o files that need to be regenerated get regenerated, and there’s no longer a need to run dsymutil at the end. And this really should speed up a bunch of compilation cycles on macOS. It is also something, I think, that is technically possible on other platforms, but just isn’t supported yet.
Ben: It’s on the way, I believe. Yeah, so for— I think for Windows this has always been the default. I believe— I’m not a Windows developer, but I believe that this is just how debug info has always worked on Windows for all time. And I think that on Unix recently, DWARF v5 came out, and then, like standardized how this should be done for DWARF, the file format. And I think that is being worked on right now. I think LLVM understands it and can do it. I think it just requires Rust to do some more legwork and make sure that it all works.
Jon: Yeah, I think you’re right. Although I’m not sure about Windows, actually. I think Windows uses the same kind of structure—
Ben: I could be wrong.
Jon: I think Windows uses the same kind of structure as macOS, in that it like, puts the debug information in its own separate file. But I think it’s like a file and not a folder, and you don’t have to run this tool, so it just ends up being a lot more efficient, like it’s really just the whole debug section of the final program binary gets dumped into a separate file.
Ben: Yeah. That’s what I expect.
Jon: As opposed to on macOS, where you have to extract everything into this folder. So this is like a neat quality of life improvement for macOS development, I think, which has traditionally been a little bit of a pain. Because it’s not— it’s worked fine, but there’s been some of this, like, we have to call this extra tool, and that makes compilation slower and stuff. This should definitely improve the experience.
I think we can also speak very briefly to why it took this long to land this on macOS, because it’s kind of interesting. So under the hood, Rust needs to be able to produce things like backtraces, right? So if your program crashes— if there’s a panic, for example— it needs to be able to print where the panic happened. And that happens by unwinding the stack: it walks the callers backwards from where the panic happened and prints the name of each one. And traditionally, Rust used a particular unwinding library or implementation that didn’t support this new debug format on macOS. It only supported the dsymutil one. But then, somewhat recently, there was a re-implementation of this unwinding component in native Rust, and that native one does support the new format. And so once that landed, we were able to move the macOS default to this new format and still have things like unwinding backtraces work. I think it was a really cool example of this snowball effect: once this lands, then this can land, and then this can land. Which is similar to what we saw with const generics.
Ben: That’s it for all the toolchain and language changes, in the blog post anyway, and then we have new stabilized APIs.
Jon: Yeah, so here there are some funny ones, too. One is called out in the release announcement itself, which is ptr::addr_of! and ptr::addr_of_mut!. So I think we’ve touched on this briefly, but to recap: in Rust, you’re not allowed to create references that are unaligned. Every Rust reference must be aligned. Exactly what alignment means is sort of too technical to get into here, but roughly, there are rules for what a reference can and cannot point to. But the same rules don’t apply to raw pointers; a *mut, for example, is allowed to be unaligned. And this comes up in things like packed structs— ones where there’s no padding, and the fields come in exactly the order you list them— where it might be impossible to generate a reference to some of the fields. And this is a problem, because that means it’s also impossible to generate a raw pointer to those same fields: generally, the way you create a raw pointer is to take a reference to the field and then turn it into a raw pointer, but it’s illegal— it’s undefined behavior— to create an unaligned reference. So you had no way to do it, at least if the struct didn’t have repr(C) and was using the Rust representation. And so there’s been a lot of discussion of, like, how do we fix this soundness hole? Because it’s necessary for things like the offset_of macro, which tells you how far into a struct a given field is. That fundamentally needs to do this operation, but there was no sound way to do it. And what’s happened here is that the standard library team isn’t willing to commit to any particular mechanism yet. They have some in mind— I think there’s an RFC for this idea of raw references that would solve this problem— but that’s still some way off from stabilizing. But the need for this particular feature of getting a raw pointer directly from a field is so great that they’ve landed these two macros which, behind the scenes, use this yet-to-be-stabilized feature. So they’ve stabilized the macros themselves, but not the implementation. I thought this was a really neat example of being able to compartmentalize what you stabilize.
Ben: Yeah, so as we’ve mentioned before, maybe you’ve inferred the standard library itself is allowed to use unstable features, and that’s because the language and library are pretty closely knit. And so the people behind the language are also behind the library, and so they can guarantee that they won’t make any language changes that will break anything in the library, and vice versa, that the library won’t use or expose any stable APIs that are using nightly features, unless those features are like, vaguely stable-ish. Or at least you can tell that they’re unstable in some way. But it’s funny because in this case, the reason that these are macros is because they literally expand to nightly-only features. And because it’s a macro, you can think of it as just kind of copying and pasting some code out of the macro into your own code. And then you think to yourself, okay, so how does this work, if I’m making my own code? And suddenly it just happens to contain some totally unstable code? And I’m not on nightly. I’m not using any feature flags. And there’s just compiler magic, it turns out, that allows certain macros in the standard library to expand to unstable features, and doesn’t cause any problems for downstream users. And so if you are using this macro, it doesn’t mean that you can suddenly use this unstable, raw reference feature that’s being prototyped. It does mean that the interface will remain stable forever, but it just means that whatever it expands to will compile on your code, even though it’s kind of a cheat.
Jon: Yeah, it’s a really neat cheat. I didn’t know this even existed, but it makes a lot of sense.
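(editor’s note: a minimal sketch of addr_of! with a packed struct:)

```rust
use std::ptr;

// repr(packed) removes padding, so `b` may end up misaligned.
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u64,
}

fn main() {
    let p = Packed { a: 1, b: 2 };

    // `&p.b` would create an unaligned reference, which is undefined
    // behavior. addr_of! creates the raw pointer without a reference.
    let ptr_b: *const u64 = ptr::addr_of!(p.b);

    // Reading through a possibly unaligned pointer needs read_unaligned.
    let b = unsafe { ptr_b.read_unaligned() };
    assert_eq!(b, 2);
}
```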
Ben: It’s also a little bit exciting for me, and it’s for a tiny reason. If you’re used to Rust, you’ll kind of, like, understand without really understanding why: all of the macros that you see in Rust, it seems like they’re all in the prelude— panic!, println!, format!. All these things, you don’t need to ever import them. And this is a relic of the time of Rust 1.0, when it was impossible to namespace macros. They just had to always live in the root namespace. And so if you put them inside the standard library, they’re just in everyone’s code by default. I believe this is the first macro to be stabilized that is not in the root namespace. It is actually namespaced under the ptr module. And it does this using some, itself kind of unstable, support for quote-unquote “macros 2.0,” which has been in the works for quite a while now, and is stable enough to expose here and use to implement the addr_of macros. So that’s exciting for me, because I like to have nice-looking documentation, and for me it’s kind of gross to have, on the front page of the standard library API docs, a nice list of all the useful modules, and then a list of totally random, un-namespaced macros that could live somewhere else, but don’t.
Jon: Yeah, I wonder, do you think we’re going to see sort of modularization or namespacing of existing macros?
Ben: I’ve honestly thought about writing the RFC for this— just, kind of like, hey, find places to live for all these macros. Because it’s weird, in a way: you would expect, say, the vec! macro to live in the vec module, but it doesn’t. And so it’s kind of unintuitive in that way. And obviously you can’t, quote-unquote, remove them from the prelude; if you did this, you would need to implement the macro in a module and then expose it in the prelude, so that no one’s code would break.
Jon: Well, so there’s also the difference here between the prelude and just the root of the standard library, right? So there are macros that are at, like, std::vec— that’s where the vec! macro currently lives, I think. But it would be a matter of moving it into the vec module, and then you would still probably have to expose it under std, because of backwards compatibility and whatnot.
Ben: Yeah.
Jon: Speaking of macros, there’s another macro-related change here, which is panic!. In particular, there’s a new panic_any function, which might strike some people as weird. Like, why do we need a panic function when we already have the panic! macro? Do you want to talk a little bit about the difference here? I was really surprised to even learn that there was a difference between std::panic and core::panic, and why this was needed.
Ben: Well, that’s kind of getting ahead of yourself. But actually, the idea is that nowadays— at least, I’m not sure if it’s stabilized yet, but forthcoming— there won’t be a difference between core::panic and std::panic. But in the meantime, there is a feature in the pipe right now that I’m kind of excited for, and it is implicit formatting arguments for all the formatting macros. And so, if you’ve used, for example, pretty much any other language that has— what do you call it? It’s— my mind is—
Jon: Exceptions?
Ben: No, no, no, no. Not exceptions. In Python the “F-strings”. What’s like, the general term for that?
Jon: Like formatting strings?
Ben: Sure, we’ll just call it formatting strings. There’s a better— there’s a more general term for it, but like strings—
Jon: Interpolated strings?
Ben: That’s it. String interpolation. Yeah. And so, like, you know, JavaScript has this, and the idea is you can have a string literal and just refer to variables in the surrounding scope, and it will pull those into the literal for you and kind of format it nicely. In Rust, the way you do it right now is, you have, say, the format! macro, which makes a string: format!( and then your template— kind of like the string literal with all the curlicues inside— and then after that, you have a list of various identifiers that the macro itself will capture, and the macro will expand to something that can use these things. And so the idea is that there’s nothing actually stopping Rust from letting a string literal used this way, in a formatting macro, refer to identifiers in the surrounding scope. And so there was an RFC that was accepted and implemented, and it is now on its way. It’s maybe being stabilized, probably this year, I would hope? Not going to give any promises; I’m not involved. But the idea is it should bring Rust closer to other languages that you are familiar with, with string interpolation, where you don’t need to list out the name of an identifier both inside the string and also outside of it.
Jon: That is really exciting.
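(editor’s note: the feature being described, implicit captured identifiers in format strings, was later stabilized in Rust 1.58:)

```rust
fn main() {
    let name = "Ferris";

    // What Rust 1.51 requires: name both inside and outside the string.
    println!("Hello, {}!", name);

    // The forthcoming form discussed here (stable since Rust 1.58):
    println!("Hello, {name}!");
}
```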
Ben: So that’s cool. And that’s forthcoming. But there’s a problem. And that
problem is that, so this machinery is going to apply to any macro that
internally calls the format_args!
macro. And format_args!
is the secret
sauce behind all of Rust’s formatting stuff that, like, checks your types are
all correct, and all this. But there is one exception. Just one exception. And
that is, if the panic!
macro, if it is called with an argument that is just a
string literal, it does not call format_args!
. Every other macro, even if
they’re just called with a single argument string literal, still calls
format_args!
. But because panic!
doesn’t, it means that right now it is
legal to invoke the panic!
macro with just— like, imagine, "{foo}"
. And so
any other macro, that would give you an error because format_args!
is like,
hey, you haven’t passed in a foo
, I have no idea what foo
is. panic!
just
panics with the string {foo}
. And that’s the only thing that is different
between panic!
and, like, every other macro at this point. So the idea being, well, what do
we do about this? There was one weird kind of historical exception. How do we
fix this? And so this is kind of the first stage, which is you provide an
alternative: if you are doing this and, for whatever reason, need it to keep
working, and you don’t want to make any changes on your own end, this
panic_any
function will do the behavior that you expect.
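As a rough sketch of the discrepancy and the new escape hatch, on Rust 1.51 (pre-2021 edition):

```rust
use std::panic::panic_any;

fn main() {
    // panic! with a single string literal bypasses format_args!, so this
    // would panic with the literal text "{foo}" rather than erroring
    // about a missing `foo` variable:
    // panic!("{foo}");

    // panic_any is the explicit way to panic with a payload that is
    // taken as-is, with no formatting machinery involved:
    panic_any("{foo}");
}
```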
Jon: Maybe it’s worth talking about, too, like, why is it useful to panic with a type that’s not a string?
Ben: Just for, like, messages. This is not, you know, error handling or
exceptions. It’s kind of like, if you want to panic with a message, it’s for
println!
debugging, I guess, is the idea. But nowadays, a thing that exists, that did
not exist at 1.0 when this was implemented, is the debug macro, the dbg!
macro, which is way better for println!
debugging than panic!
is.
Jon: So what I was thinking of was actually something slightly different,
which is, if you panic, you can actually supply a value to the panic, and that
value gets returned by whoever catches the panic. So if you call panic!
with a
value that’s not a string, and then let’s say that— so that causes the current
thread to start unwinding, right? Imagine some other thread joins with the
thread that panicked. The value that was supplied to panic
is available to the
thread that joins with that panicking thread. So when you do join
on a thread handle, the error type of what you get back is something that you
can downcast into the value that was passed into panic!. And so this is
one way to communicate, like, why did I panic? And it doesn’t have to be a
string. And that’s sort of on purpose, because maybe you want to provide some
richer information about why you panicked, and that’s available through this
mechanism.
Ben: I’ve heard that one before.
Jon: Yeah, so this is why panic_any
doesn’t take a string. It takes any
type that implements Any + Send + 'static
so that you can downcast into it if
you catch the panic. So it’s going to either be through the
catch_unwind
function in the panic
module, or it could be because you’re
joining with a thread that panicked.
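Here is a small sketch of that flow. The Failure payload type is hypothetical, just to show a non-string panic value being recovered on join:

```rust
use std::panic::panic_any;
use std::thread;

// A hypothetical payload carrying structured information about the failure.
#[derive(Debug)]
struct Failure {
    code: i32,
}

fn main() {
    let handle = thread::spawn(|| {
        // Panic with a rich value instead of a string.
        panic_any(Failure { code: 42 });
    });

    // join() returns Err with the panic payload as Box<dyn Any + Send>,
    // which we can downcast back to the concrete type.
    if let Err(payload) = handle.join() {
        if let Some(failure) = payload.downcast_ref::<Failure>() {
            println!("thread failed with code {}", failure.code);
        }
    }
}
```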
Ben: Anyway, the idea is that— go on, if you want to continue, I was going to continue, but—
Jon: Yeah, so I was just going to say, like, this is the reason why some
people might want to call the panic!
macro in the past, with just a single
argument that is a string that they don’t want interpolated, or something that
isn’t a string at all. Like, it’s just some other value. And this panic_any
function is now the way to do that, rather than using the panic!
macro, which
is going to start formatting.
Ben: Yeah, I’d say it’s still pretty rare. And not a thing you really want to do very often.
Jon: Oh, yeah, no, not at all.
Ben: Anyway, so the idea being that with the edition, this discrepancy will
be papered over and, like, fixed quote-unquote. So panic!
will begin to
participate in the machinery of getting the implicit arguments. But this is just
kind of like adding in the ability for people to migrate as well. And just what
you said.
Jon: Yeah, I think this is really nice, and it sort of separates things out:
the panic!
macro will be for the same things the print!
macro is for, except you’re also panicking. And then if you specifically want
the feature where you’re able to propagate a panic value, you use the
panic_any
function rather than going through the macro. I think it’s the correct
cleanup. It’s great.
I think there are two other things I wanted to talk about from the stabilized
APIs. The first of them is pretty straightforward, so it’s a bunch of methods
that have been added to slice. This is stuff like split_inclusive
,
strip_prefix
, and strip_suffix
. And these are, at least in my mind, ways of
making slices feel more like strings, or rather, give more of the conveniences
that you currently have with strings. So to take strip_prefix
as an example,
what strip_prefix
does is, if you have a string
that says "foobar"
, and you call .strip_prefix
and pass in the prefix
"foo"
, it will return Some("bar")
, so it will slice
the string for you to not include the prefix you gave. And if the prefix isn’t
there, it returns None
. So this is a really handy way to, well, strip a prefix
from a string, if it’s there, and to learn whether or not it’s there in the
process. And previously, if you had something like a byte array, and you
wanted to strip a prefix off it if it started with a given byte sequence, like,
for example, if you’re dealing with ASCII text and you have a u8
slice and you want to see whether it starts with some other u8
slice, that was actually really annoying to do. And it’s nice to see slices get some of these,
like, just quality of life improvements that we’ve had for strings for a long
time and really could just have for slices too. I think we’re going to see more
of these land in the coming releases, and I’m pretty excited about it.
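A quick sketch of these methods on both strings and byte slices (the slice versions are the ones stabilized here):

```rust
fn main() {
    // Strings have had this convenience for a while:
    assert_eq!("foobar".strip_prefix("foo"), Some("bar"));
    assert_eq!("foobar".strip_prefix("baz"), None);

    // Now byte slices get the same treatment:
    let bytes: &[u8] = b"foobar";
    assert_eq!(bytes.strip_prefix(&b"foo"[..]), Some(&b"bar"[..]));
    assert_eq!(bytes.strip_suffix(&b"bar"[..]), Some(&b"foo"[..]));

    // split_inclusive keeps the matched element at the end of each piece:
    let parts: Vec<&str> = "a\nb\nc".split_inclusive('\n').collect();
    assert_eq!(parts, ["a\n", "b\n", "c"]);
}
```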
The other one is task::Wake
, so we’re not going to get into all the details
of, like, how async/await works and stuff. But a Waker
in async/await is a way
for a Future
to say that, or for something to say, this Future
is now ready
to make progress. Like, it returned Pending
in the past, it was waiting. And
now wake it up, try to poll it again, try to make progress. And previously, if
you wanted to, like, write your own little executor or something, creating a
Waker
to pass into polling a future was really, really annoying. You had to
deal with, like, manually constructing, like, a vtable from raw pointers. It was
really hairy, and for good reason, because it should be low overhead. But if you
just wanted something to, like, get up and running and the overhead wasn’t
that important to you, like you were mocking an executor or something, it was
a huge pain. And now what’s landed is the Wake
trait, and
the Wake
trait is sort of a helper trait. And this might be the first of these
I’ve seen in the standard library, where what the trait does is: you implement
the Wake
trait for your type, your waker type, but all of the methods receive an
Arc<Self>
.
And then you implement these sort of methods that are required for wakers that
way. And then the Wake
trait takes care of implementing Waker
so that you
can now pass that into the context that’s needed for a future. It’s a little bit
convoluted, but basically the Wake
trait makes it so that normal human beings
can implement wakers themselves. And that way, they can implement executors
themselves, without too much of this additional raw, unsafe business with
vtables that you had to do in the past.
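A minimal sketch of what implementing it looks like. FlagWaker is a made-up toy type; a real executor's waker would do something more useful, like re-queueing a task:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

// A toy waker that just records that it was woken.
struct FlagWaker {
    woken: AtomicBool,
}

impl Wake for FlagWaker {
    // Note the receiver: the trait is implemented for your type,
    // but the method takes Arc<Self>.
    fn wake(self: Arc<Self>) {
        self.woken.store(true, Ordering::SeqCst);
    }
}

fn main() {
    let flag = Arc::new(FlagWaker {
        woken: AtomicBool::new(false),
    });
    // The standard library provides From<Arc<W>> for Waker for any
    // W: Wake, so no hand-rolled vtables or unsafe code needed:
    let waker = Waker::from(Arc::clone(&flag));
    waker.wake_by_ref();
    assert!(flag.woken.load(Ordering::SeqCst));
}
```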
Ben: One extremely minor API I wanted to kind of call out here: the
Peekable::next_if
functions that were now stabilized. And not for the
functions themselves necessarily, but just for the Peekable
combinator
on iterators, which allows you to take an iterator and then make it so that you
can see the next element of the iterator without advancing the iterator. So
it’s a very kind of nice, convenient thing that I think needs more press.
Jon: Oh, yeah, Peekable
is fantastic. I use that a lot. And in fact, I
think the easiest way to get a Peekable
is if you have any iterator, you can
just call peekable()
on it and that gives you a Peekable
iterator.
Ben: Yeah, it’s great.
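A short sketch of Peekable and next_if in action:

```rust
fn main() {
    let mut iter = [1, 2, 3, 10, 20].iter().copied().peekable();

    // peek() looks at the next element without consuming it:
    assert_eq!(iter.peek(), Some(&1));

    // next_if() consumes the next element only if it satisfies a
    // predicate, which is handy in hand-written parsers and lexers:
    let mut small = Vec::new();
    while let Some(n) = iter.next_if(|&n| n < 10) {
        small.push(n);
    }
    assert_eq!(small, [1, 2, 3]);
    assert_eq!(iter.next(), Some(10));
}
```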
Jon: And then I did my deep dive into the other changes. Here too, I think
there are two that I want to call out. The first one is a really nice quality of
life improvement for rustdoc
, where now it will show you the documentation for
methods that you get through deref, even if you go multiple levels deep. So the
example from the PR is, imagine you have some type foo
that implements Deref
to PathBuf
, or let’s use String
instead. String
might be a better example.
So foo
implements Deref
to String
, and String
, of course, implements
Deref
to str
, and previously, if you rendered the documentation for foo
, you would only get the methods from String
, not the ones from String
’s Deref
to str
. But now you will get the methods sort of inherited through Deref
all the way down. So it would include both String
and str
. Or if you had something that Deref
s to PathBuf
, it would include both the methods from PathBuf
and the methods from Path
, since PathBuf
implements Deref
to Path
. So that’s nice. It just means that our documentation is going to be more
complete, like it’ll actually show off all the things that you can call without
you having to, like, click your way through a large hierarchy of types.
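As a tiny illustration, here is a hypothetical wrapper type whose docs benefit from this change:

```rust
use std::ops::Deref;

// A hypothetical wrapper around String, just to illustrate the change.
pub struct Foo(String);

impl Deref for Foo {
    type Target = String;
    fn deref(&self) -> &String {
        &self.0
    }
}

// Running `cargo doc` on 1.51+ now lists methods reachable through the
// whole Deref chain on Foo's page: String's methods (e.g. capacity) and,
// because String itself derefs to str, str's methods (e.g. len) too.
```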
The other one is much lower level. And that is, there’s this configuration
option you can set for rustc
called target-cpu
, which is basically, which
CPU should you compile this code for, and that dictates things like which
optimizations are enabled, but also which instructions are enabled. So different
processors often have different sort of instructions that do particular things
in a more optimized fashion. And there’s a value you can pass for target-cpu
,
which is target-cpu = native
, which is basically, figure out what CPU I have,
and then compile the fastest code you can for my CPU. And this is something that
can often lead to pretty significant improvements in runtime. But it had this
unfortunate, I guess bug, or it was just very— it was very naive, in that if you
gave target-cpu = native
, it used to be that it just sort of figured out the
name of your processor, and then enabled the features that a processor by that
name should have. And this has two failure modes. One is if your processor
happens to not implement a feature that’s implied by its name, then now you’re
going to get a binary that you can’t run because it’s using instructions your
CPU doesn’t support. There’s also the other way around, where if your CPU
reports to be a processor of a given name or a given family, and that family of
processors doesn’t usually support a given instruction, but your processor does,
then you would lose out on the use of that instruction. And
so now what’s landed is, if you specify target-cpu = native
, it will actually
detect all of the different features, rather than going by processor family. And
so now you should only get the features that your processor supports. And that
might mean that you now get more features, like faster code and wider use of
specialized instructions. And it also means fewer miscompilations. So this is
just like a nice thing for those who want to squeeze every last little droplet
of performance out of their programs.
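For reference, this flag is typically passed through RUSTFLAGS; a minimal invocation looks something like the following (shown as shell commands; the grep filter is just one way to inspect the result):

```
$ RUSTFLAGS="-C target-cpu=native" cargo build --release

# You can inspect which target features end up enabled with:
$ rustc -C target-cpu=native --print cfg | grep target_feature
```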
Ben: Kind of reminds me of websites that used to test the User-Agent
. That’s kind of like the old behavior—
Jon: Yeah, it’s exactly like that.
Ben: And now it’s about testing which APIs you support, which is the nice way of doing things.
Jon: Yeah, it’s really the right way to do things, I think, is the way to think about it. At the tail end here, I didn’t have anything more about 1.51, but there’s one more thing. We like, on this podcast, to call out larger ecosystem efforts that you might want to get involved with. And one of these that you mentioned, Ben, was the Async Rust shared vision document and working group. You want to talk a little bit about this?
Ben: Sure. So, async Rust: just like how const generics became an MVP this release, async support in Rust is still kind of an MVP. There are a lot of future improvements that are needed to make it more ergonomic, more useful, more capable. And a lot of the planning for this has been kind of stalled on getting user feedback. And so the people working on async right now are requesting feedback, saying, hey, how do you use async? How is it currently, and what would you expect it to be like? And they want to create this vision document to figure out, okay, where should we go? What should we pursue next? What are our goals? Where should Rust async be in, like, five years? So we will link to this; there’s a great blog post by Niko Matsakis. And you can participate. I think he’s been doing weekly meetings with folks in the community, just to ask: what is your need? What do you envision? So if you use async Rust, feel free to chip in, give your experience and what you think Rust should go towards.
Jon: Yeah, and I’ve been watching— they’ve set up, like, a repository for this where you can sort of submit issues and PRs with your story, if you will. And one thing that I think is worth calling out is that it can be really simple. It’s not— you’re not expected to propose, like a solution. You’re not expected to know everything about the problem or anything like that. It’s really just if there— if you’ve ever tried something with async and it didn’t work or something was a pain, then tell us about it so that we know about this use case and know to sort of optimize for it. Know that it’s a real one. And it can really just be about writing a short story about some experience you had with async and Rust. I think that the phrase from the blog post is, “our goal is to engage the entire community in a collective act of the imagination,” which I think is a very nice way to phrase it.
I think that’s everything. I think we got through two more releases, Ben.
Ben: Yeah, we’re getting older. As Rust gets older, so do we.
Jon: Yeah, I hope we don’t age with the Rust version numbers. That would be unfortunate. I think as a last call to action, remember that Rustacean Station is a community podcast. So if you have anything you want to see an episode on, whether it’s a technical contribution, or a project, or a feature, or a feature request, or just a person you want to talk to, like someone you want to interview, anything like that, then please reach out to us. We have a Discord where you can reach both Ben and me and a bunch of people in the community. And we’ll help you get started, get set up, record an episode, and get it out there. It gets better the more people put things that they are interested in out there. And we’re happy to help.
Ben: Yeah.
Jon: And with that, I guess we will now shut down, Ben, and restart our cron jobs. And then we will turn back on again in 12 weeks.
Ben: Three months.
Jon: Nice.
Ben: Alright.
Jon: See you then, Ben.
Ben: See you around, folks.
Jon: Bye.
Ben: Bye.