What's New in Rust 1.52 and 1.53
Jon Gjengset: Hello, Ben.
Ben Striegel: Hello, Jon. Welcome back to Rustacean Station.
Jon: Thank you. It’s— I guess welcome back is appropriate for the audience, but for us, we’ve just been sitting right here waiting patiently since 1.51, haven’t we?
Ben: Legs are asleep right now. They are fused to the floor. My skin has grown over the chair; I will need an ambulance, or some kind of, at least, knife or scalpel, to ever leave this spot ever again.
Jon: Yeah, I’m very glad that I have, like, a swivel chair at least, so I’ve been able to rotate, but that’s the extent of it really for me.
Ben: I’m actually not even in my house right now, I’m actually reduced to begging from passersby for food and water. But we do it all for the viewers. Or the listeners.
Jon: Yeah, I managed to, like, break a window so I can scream outside to get people to bring me stuff, but I’ve had mixed reactions.
Ben: Anyway, let’s do what we came here to do, and have been here to do, and talk about the next Rust releases, 1.52 and 1.53. These two aren’t super big, but in between those two releases, the Rust team has also released a sketch of what the Rust 2021 edition will feature. So we figured we might as well go over that as well. And that should bring us to a nice full length episode. What do you say, Jon?
Jon: I’m very excited, I think we should dive in.
Ben: Let’s dive in. So the first thing on our list here for 1.52: there’s not many big features for me. I have written down some of the newly stabilized things on the char type. I’ve gone over this so many times before that I’m not going to reiterate, but as of Rust 1.0 many things that should reasonably be associated consts and functions weren’t, because those features didn’t exist back then. And so very slowly, parts of std are being migrated, here many years later, to finally have various things living where they belong, as associated functions and consts, and char is the beneficiary in this release: some associated constants on there, some new functions, so you may no longer need to import the char module from std. It should be much nicer, much more intuitive.
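As a rough illustration of what Ben means, the sketch below uses a few of the associated items on char without importing std::char. The helper names digit_to_char and decode_or_replacement are made up for this example and are not part of the standard library.

```rust
// Hypothetical helper: convert a decimal digit value into its character.
fn digit_to_char(d: u32) -> Option<char> {
    // Associated function on `char`; returns None if `d` is not a valid digit in radix 10.
    char::from_digit(d, 10)
}

// Hypothetical helper: turn a raw code point into a `char`, falling back to U+FFFD.
fn decode_or_replacement(code_point: u32) -> char {
    // Surrogates and values above char::MAX are not valid `char`s.
    char::from_u32(code_point).unwrap_or(char::REPLACEMENT_CHARACTER)
}

fn main() {
    println!("{:?}", digit_to_char(7));
    println!("{:?}", decode_or_replacement(0xD800));
    println!("{:?}", char::MAX);
}
```

Before these associated items existed, the same calls would typically have been spelled std::char::from_digit, std::char::from_u32, and so on.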
How about you, Jon, what do you think is the biggest feature of this release?
Jon: So I think there are two things that make me excited. The first one is sort of the headline of 1.52, which is that now if you finish compiling your project with, like, cargo check or something, and then you run cargo clippy, cargo clippy will actually run and actually give you output. This has been a constant headache for so long, where if you ran cargo check, and your thing ran correctly, then running cargo clippy did absolutely nothing. And so you had to make this arbitrary change, or like, touch one of your source files, just to run clippy, and finally with 1.52 you don’t have to do that anymore. It makes me very happy.
But there’s also a stabilization that I think is really cool, which is the str type now has a split_once method. And this comes in really handy if you have a string, and you just want to split it by some delimiter, and you want to get, like, what was before the delimiter and what was after. Because previously you only really had the split method, which gives you an iterator over all of the split pieces. But it’s just so common that you just sort of want the left and the right. And you can do this with splitn, passing 2 and the delimiter so that you get at most two pieces, and then call next on the iterator twice, and then unwrap and whatnot. But with split_once, you just have one method that does the thing you want. This comes in handy for things like parsing key-value pairs that are separated by an equals sign. It’s very nice to have this just built in now.
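To make the comparison concrete, here is a minimal sketch of parsing a key-value pair both ways. The function names parse_pair and parse_pair_splitn are invented for the example.

```rust
// With split_once: one call, returns None when the delimiter is absent.
fn parse_pair(line: &str) -> Option<(&str, &str)> {
    line.split_once('=')
}

// The older approach: ask splitn for at most two pieces, then pull them out by hand.
fn parse_pair_splitn(line: &str) -> Option<(&str, &str)> {
    let mut parts = line.splitn(2, '=');
    let key = parts.next()?;
    let value = parts.next()?;
    Some((key, value))
}

fn main() {
    println!("{:?}", parse_pair("name=ferris"));
    println!("{:?}", parse_pair_splitn("name=ferris"));
    // Only the first '=' splits; the rest stays in the value.
    println!("{:?}", parse_pair("a=b=c"));
}
```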
Ben: That is pretty nice.
Jon: And if you weren’t already aware of the rsplit methods, and the splitn methods, I recommend you go look those up. So rsplit is split from the right instead of from the left, and splitn is split, but only into at most this many pieces, and then stop splitting. They’re very handy to have in your repertoire.
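A small sketch of the methods Jon mentions, with made-up helper names. It also uses rsplit_once, the from-the-right counterpart of split_once.

```rust
// splitn(2, ..) yields at most two pieces; the second keeps any remaining delimiters.
fn first_and_rest(s: &str) -> Vec<&str> {
    s.splitn(2, ',').collect()
}

// rsplit yields the pieces starting from the right-hand end.
fn fields_from_right(s: &str) -> Vec<&str> {
    s.rsplit(',').collect()
}

// rsplit_once splits at the last occurrence of the delimiter.
fn parent_and_name(path: &str) -> Option<(&str, &str)> {
    path.rsplit_once('/')
}

fn main() {
    println!("{:?}", first_and_rest("a,b,c"));
    println!("{:?}", fields_from_right("a,b,c"));
    println!("{:?}", parent_and_name("src/bin/main.rs"));
}
```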
There were also a bunch of functions that were made const in this release, as always. But one thing that was hidden a little bit in the detailed release notes was that all of the division related functions on numbers are now const. This was surprising to me, that they weren’t already. But I think basically, there was some discussion around what should happen if you divide by zero, or take modulo zero in a constant, because you can’t panic because you’re in the compiler. So what do you do? I think the stabilized behavior is just, your code doesn’t compile, and you get sort of a compiler error that says, any use of this value is sort of undefined behavior or invalid. But that’s, like, a nice stabilization, that now you can do division, whereas previously you couldn’t.
Ben: It wouldn’t be undefined behavior, it would always just panic at compile time if you tried to use it. So yeah, it is kind of strange that division happened so long after; addition’s been there for a long time, subtraction, multiplication. But I think, first of all, there was resolving the idea of what the error message looks like if you panic at compile time. It should give you a nice well-formatted compiler error, should be readable, that kind of thing, up to Rust’s usual standards. But also, Rust has had some limited form of expressions in const position since forever, just to support things like: if you have a const in your file, you can do 1 + 1, and that’s a bit different from defining a const function that returns 1 + 1, because they’ve always just been different machinery; back in the day there was no const fn. So for a long time you’ve been able to do, like, 1 / 0 in your const, and it would just panic at runtime. And the idea is that’s now becoming a hard error. It’s a lint right now, on by default. If you have an instance of this, you probably shouldn’t, and you would know pretty quickly if you did, but it’s becoming a hard error. So it is technically a breaking change, where something that did compile before because it was doing const x: i32 = 1 / 0 will just stop compiling sometime in the future.
Jon: I mean, I think the other big benefit, or big difference maybe, of const fns is that they don’t have to be executed at compile time, right? Like a const fn is something that, let’s say it takes an argument, right, that is a number or something. And it does like argument + 1 and that’s your const fn. Then if the argument is itself a constant, it can be evaluated at compile time. But if it’s a runtime value, like it’s read from the user or something, then it gets run at runtime. And this is sort of the nice duality you get of const fns. And what also makes them more complicated.
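The duality Jon describes can be sketched with one of the division methods that became const. The function average and the constant names are hypothetical, chosen only for this example.

```rust
/// Hypothetical helper: integer average, or None when `count` is zero.
const fn average(total: u32, count: u32) -> Option<u32> {
    // checked_div is one of the division-related methods usable in const contexts.
    total.checked_div(count)
}

// Constant arguments: the compiler evaluates these while building the program.
const AVERAGE: Option<u32> = average(100, 4);
const EMPTY: Option<u32> = average(100, 0);
const WRAPPED: i32 = (-7i32).rem_euclid(3);

// A plain division by a constant zero is rejected at compile time:
// const BAD: i32 = 1 / 0;

fn main() {
    // Runtime argument: the very same function runs as ordinary code.
    let count = std::env::args().count() as u32;
    println!("{:?} {:?} {} {:?}", AVERAGE, EMPTY, WRAPPED, average(100, count));
}
```

Using the checked variant sidesteps the divide-by-zero question entirely, since the zero case becomes an ordinary None value rather than a compile-time failure.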
Ben: Speaking of compile time, there was an interesting development that happened in this release as well, that you won’t get from the release notes for 1.52, but for the point release of .1 that came shortly after. So do you want to talk about this, Jon?
Jon: Yeah, so this was Rust 1.52.1. And it’s always interesting when we get these patch releases for the compiler, because usually there’s some interesting thing underlying there. And I actually thought the 1.52.1 blog post was really good about explaining what the problem was, why the decision was taken the way it was, and sort of where we go from here. Essentially what happened was, it turns out that there’s been a decent number of bugs in Rust’s incremental compilation feature for a while, where artifacts were reused when they couldn’t safely be reused. And what would happen is, like, you got a compiler error or a linker error or something, and in reality, you would run, like, cargo clean and then you would build again and then it’s fine.
But in practice this shouldn’t happen, right? Incremental compiles should just always give you the right result. And what happened was, they added a check to the compiler that would sort of crash early if it ever detected that incremental compile would do the wrong thing.
And the hope, of course, was to fix all of the errors that that change would find, before the actual change landed. I don’t know exactly how this happened, whether it was a lack of communication, or whether it was sort of, they didn’t anticipate that too many errors would be found. But ultimately what happened was, in 1.52.0, this check was added to the compiler. But there were still cases that hadn’t been fixed. And so when the release came out, suddenly a bunch of people just couldn’t compile their code anymore with incremental compilation, because they would just get these internal compiler errors saying incremental compile would do the wrong thing. And it was such a big problem and happens for so many cases that they ended up cutting this 1.52.1 release, which disables incremental compilation entirely. And this is sort of a controversial decision, right? Because like, shouldn’t you just make it not panic? But the argument is that, well, if we make it not panic, it’s still going to do the wrong thing. And so we would rather never do the wrong thing than just hide the panic from you. And so the incremental compilation actually remains off even in 1.53. But I believe the plan is that for 1.54 they sort of fixed the remaining major issues there.
And some people are like, well, why don’t you just undo the changes that cause these panics to happen, and in reality this is just sort of detecting errors and bugs that have been there for a long time. So it’s not as though in 1.51 incremental compilation was just fine, it’s just you wouldn’t be told about it until something went wrong later in the process. Which it didn’t always do, you might just end up with a slightly incorrect binary, which in and of itself is a problem.
Ben: Yeah, it should be mentioned too that this mode, incremental compilation, was not on by default for release builds, so it’s unlikely that any kind of code generation error actually made its way into any Rust binary. It was only for debug builds where you mostly see this, because it affects code generation negatively in general. So it’s not really a security concern, but it is not really a thing you want to ever worry about. Debug builds are for debugging bugs in your code, not for bugs in the compiler.
Jon: Yeah, exactly.
There were some other smaller things in 1.52 that I think are worthwhile calling out. The first one is the upgrade to LLVM 12. This is something that probably won’t matter to most users, but I think, sort of, for the people who are internally working on the compiler, this is like, an exciting upgrade. Do you want to talk a little about the noalias discussion?
Ben: Yeah. So I mean, it’s not controversial exactly, it’s been just kind of a bit of a saga at this point. So if you know a little bit of C, maybe you’re aware of this restrict keyword, which allows you to tell the compiler, hey, I know that this pointer doesn’t alias any other pointer. And this is an important optimization in some cases. So for example, I believe in FORTRAN every pointer is restricted in this way, and this allows certain kinds of code to be extremely fast; you can have extremely aggressive optimizations applied. So C added this, but unlike in, say, Rust, it’s kind of difficult to really know whether or not a pointer is aliased in C. There’s no borrow checker, you’re kind of just on your own, so it doesn’t really get used extremely much in C. It gets used some. So there’s obviously support for this in LLVM, and Rust has had quite a long time of trying to enable this by default for all of the mutable pointers, all mutable references. Because that’s kind of, you know, Rust’s whole thing: if you have a mutable reference, you can’t have two pointers to the same thing; they’re always unique. And so you should in theory be able to do this kind of transparently, you don’t need any kind of keyword in your code. The compiler should say, hey, I know that this pointer is not aliased, and give me all those sweet sweet optimizations. And it turns out Rust uses a lot of mutable references, and kind of...
Jon: Who would have thought, right?
Ben: —overloads LLVM, and finds all kinds of edge cases. I think it was maybe enabled shortly after 1.0, or even before, and then disabled a bit after that, and re-enabled, and then disabled again, and then re-enabled earlier this year, I think? Yeah, for this— in this release. And then, never made it into stable because people testing on nightly found miscompilations in their code from having this enabled.
Jon: Yeah, we should be clear that these miscompilations are because LLVM has just, like, bugs in how it handles things that are marked as noalias.
Ben: So, I mean, programs have bugs. It’s kind of like, you know, we have testing to reveal these and so, the nightly process works: someone said, hey, my code’s no longer compiling. I think, you know, I can narrow it down to when the LLVM upgrade enabled noalias for all these pointers. And so actually, in contrast to past efforts, the problem in LLVM was identified almost immediately, a patch was sent in, pretty quickly accepted, and then Rust had to update, pull in the new fork of LLVM, update its version, and then update a submodule, and then get all that stuff squared away. But it only took maybe a month or so for the new nightly to have the fix. And so this is not enabled, again, in the next release, 1.53, and so we won’t talk about it until 1.54, our next podcast. But hopefully in the future, just like with incremental compilation, we will have some shiny new features enabled, fixing old old bugs.
Jon: I also just think this is a really cool example of just, sort of, a force for good in the ecosystem of— this sort of positive cycle of improvement between the Rust project and LLVM. It’s really cool to see.
Ben: Because like, you know, I can clearly imagine that if you’re writing C code and you were hitting one of these crashes using restrict, you’d almost certainly assume it’s your fault. Because again, it’s pretty hard to really be sure that you’re not aliasing a pointer in C in every single case. And so you’d assume it’s your fault, not a compiler bug. And so if you are writing C and you use the new LLVM version, you’ll also be fixed. So it’s a great, like, you know, example of Rust giving back to LLVM.
There’s even a blog post, in a different capacity. I forget why, but there was an LLVM blog post between like, Firefox, LLVM and Rust. It was really cool; I forget what the topic was or the actual bug that was fixed, but it was just another example. I’ll try and find out for the release notes, but it’s always great to see projects working together and giving back. Open source. It’s pretty cool.
Jon: Yeah. Another cool change I thought from 1.52 was, there’s a new lint in town. So this is the unsafe_op_in_unsafe_fn lint, which I’m really excited about. So this is from, I think RFC 2585. The basic idea here is that if you declare an unsafe function, then previously by default, and still actually, the body of the function is considered to sort of automatically be an unsafe block. And this is sort of nice in that, if you declare an unsafe function, you can just use unsafe functions within it without having to explicitly annotate them, but it becomes a little unfortunate when you write longer, more complex unsafe functions because it’s now very hard to tell which parts of the body of that unsafe function are actually unsafe, like which ones rely on things that are unsafe. And so, what’s changed with this lint, which I think is still off by default, but I think the idea is to turn it on for the next edition.
Ben: No, no, it’s not actually the plan. For this year’s edition, or the one after that?
Jon: For the one after.
Ben: Maybe. We’ll see. There’s been some talk about that, but go on.
Jon: Well, it’s true, it’s too far off to say. But basically this lint is going to make it so that, if you turn it on, it’s going to tell you to add unsafe blocks for any unsafe operations, even if you are inside the context of an unsafe function. And so basically you can think of this as sort of a lint that opts you out of saying that the body of an unsafe function is itself unsafe. I think this is going to be a little bit of a pain if you’re just writing short unsafe functions where the entire body is just calling some unsafe thing. But for anything that’s larger, this is going to be, I think, an improvement to the explicitness of what is actually unsafe in there. And that’s also the motivation of the RFC.
Ben: I can see, like, products that have really like, you know, strict security requirements mandating that this be on, when they have a kind of like, really hardened Rust. I assume this will be like part of the repertoire of flags that you enable, like, you know, all the clippy stuff. Yeah, it’s kind of like unsafe means two different things. Like, obviously, it means the same thing, like be careful, but if it’s part of a function signature, it means that this function should require an unsafe block to call, but as an unsafe block, it kind of means the opposite, where it’s like, hey, trust me, I think this works.
Jon: Yeah. The way I often think about it is the unsafe keyword on a function is a, like, declaration of a contract saying, to call me, you have to promise these things. And an unsafe block is saying, I have checked the contract of all the things inside the unsafe block, and I assert that I uphold them. And those are very different things.
Ben: It’s nice too, because there is no “safe” keyword. If like, in theory, you were to write an unsafe function that had an entirely safe body, like that’s not, like, I think— I can’t really think of, right now, of a reason why you could do that, but like, I mean, it’s a thing that should be allowed. And like, parts of your function too, like obviously, can definitely be safe. You want to have, like, tightly scoped unsafe blocks. There is no way to say, this part is safe and this part isn’t safe if the entire body is unsafe. And so I think it’s a good change.
Jon: Yeah, I mean there are definitely cases where you can have a completely safe body, too. Like you can imagine an unsafe function that decrements the reference count for an Arc. But like, that’s a safe operation, but it might be unsafe in the context of your particular library because of other invariants you’re maintaining in the code. And so the function really does need to be unsafe, even though the body of it contains no unsafe code.
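A minimal sketch of what the lint asks for, assuming it is enabled on a single function via an attribute. The function read_and_double is invented for this example: the unsafe on the signature states the contract, and the inner unsafe block marks the one operation that relies on it.

```rust
/// # Safety
/// `ptr` must be non-null, properly aligned, and point to an initialized `u32`.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn read_and_double(ptr: *const u32) -> u32 {
    // SAFETY: the caller promises `ptr` is valid for reads (see the contract above).
    let value = unsafe { *ptr };
    // Everything below is ordinary safe code, and now visibly so.
    value.wrapping_mul(2)
}

fn main() {
    let x = 21u32;
    // SAFETY: a reference to a live local is a valid, aligned pointer to a u32.
    let doubled = unsafe { read_and_double(&x) };
    println!("{}", doubled);
}
```

With the lint enabled, removing the inner unsafe block around the dereference would be reported, even though the function itself is declared unsafe.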
I had two other ones for 1.52. One of them is pretty short; they both really are. The first one is that now you can use task lists in rustdoc markdown. So task lists are things like, you might have seen on GitHub, like you can have in markdown, like a dash and then square brackets, with either an “X” in it, or a space in it and it gets rendered as, like, a list of check boxes. That is just now supported in rustdoc markdown as well, which is really nice.
And then the other, which I guess is a much more serious change is, there’s this magical environment variable called RUSTC_BOOTSTRAP which you probably should never use. Or rather I should say, you should never use this environment variable. If you are absolutely positive that this is the exact right thing for you to use, then go ask on Zulip whether it is first. Like, be really sure that you need it. It’s an environment variable that lets you use unstable features on the stable compiler, which sort of violates all the stability guarantees of Rust itself.
Ben: Obviously, you opt out of stability at this point. Just understand.
Jon: Yes, exactly. Understand that there are lots of warning signs here, and you should probably not do this. But it used to be that it was possible to set this environment variable from a build script. And what that meant was, I could write a crate that I upload to crates.io or something, that uses unstable features, but is compilable on stable by setting this environment variable. And this is just really bad, because it means I can violate all of the stability guarantees for Rust, even in a published crate, without the caller knowing that this has happened. And so as of 1.52, that is just no longer allowed. If you try to set that environment variable in a build script, it just gets ignored by Cargo. So it’s like a— it’s a good just, sort of, additional safety mechanism for “don’t use this”, especially not for this purpose.
Ben: I will say that on the issue tracker, I do often see a certain jonhoo’s name associated with many of these discussions about RUSTC_BOOTSTRAP. I wonder if he’s being a naughty boy, using it in his own projects.
Jon: I definitely am, but I have asked many people for many permissions first.
Ben: Ooh. I’m telling.
Jon: I think that’s all I had for the 1.52, I guess release series, because it had two patch releases. But then that— I guess that brings us to 1.53, which is, in and of itself, very exciting. Lots of cool stuff in this one.
Ben: Yeah, the coolest thing in this one is definitely the into_iter implementation for arrays. So this has been, again, a very long saga, in this case for the sake of backwards compatibility. I think we talked about this previously, one of the... did we?
Jon: Yeah, we talked about this in, I think 1.50 or 1.51.
Ben: Yeah, that’s right, we were talking about the upcoming release. It’s an exciting fun hack, and I think that it is entirely natural. It’s a use of const generics, where now that const generics are here, they allow arrays to be kind of like, more first class than they used to be, in terms of like, being able to implement things for them in a way that makes sense. And it’s consistent with the rest of the language. Now, you can just write for i in [1, 2, 3]. Whereas previously, to actually iterate over an array, you’d have to either call like, .iter() or .into_iter(), or take it as a slice with a reference in front. Now, you can just have a by-value iterator over an array, which is pretty cool, which didn’t happen for a long time, because specifically, people were calling .into_iter() on arrays, which, because of autoref, would give you the iterator for slices, which yields elements by reference. And you can work around this by calling, you know, .iter().cloned() on your array to get things out by value. And, like, if your types are Copy, it’s not going to be super expensive to get them out, but really you should be able to just do this and it’s really cool that they found a way to make it happen. Do you want to talk about how they made it happen, Jon?
Jon: Yeah, I mean, the hack is sort of beautifully simple. The basic idea is that in the 2015 and 2018 editions, so basically in all the current editions, if you have an array and you call into_iter on it, then autoref will sort of take place. So autoref is this, we’re going to call it a compiler feature or language feature, where if you have a method that takes a reference, and what you have is a value, then it will take the reference of that value for you before calling the method. So this is what enabled, like if you had an actual array and you call .into_iter() on it, that led you to get an iterator of a slice of the array, rather than the array by value. What they’re going to do is basically have that autoref continue to take place for the 2018 and 2015 edition, even though there is an IntoIterator implementation for array, so not for the slice that requires the autoref. And then in the 2021 edition, what they’ll do is not have that sort of special casing, so that if you have an array, you call .into_iter(), it will do the thing that will normally happen for any sort of dispatch, which is, if the method exists on the thing by value, then use that, and only apply autoref if it doesn’t exist for value. So this means that now they can sort of stabilize this impl of IntoIterator for arrays without breaking any code that relies on the fact that you can use .into_iter() and get a slice. And then still not add complexity to the compiler, but rather sort of remove complexity in the later editions.
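A short sketch of by-value array iteration that behaves the same in every edition. It deliberately avoids the array.into_iter() method-call syntax, since that is the one spelling whose meaning depends on the edition, as Jon explains. The helper names are made up.

```rust
// A `for` loop goes through the IntoIterator impl for arrays, so `v` is a u32, not a &u32.
fn sum_by_value(values: [u32; 3]) -> u32 {
    let mut total = 0;
    for v in values {
        total += v;
    }
    total
}

// Calling the trait function explicitly also selects the by-value impl,
// so owned Strings can be moved out of the array without cloning.
fn into_strings(names: [String; 2]) -> Vec<String> {
    IntoIterator::into_iter(names).collect()
}

fn main() {
    println!("{}", sum_by_value([1, 2, 3]));
    println!("{:?}", into_strings([String::from("a"), String::from("b")]));
}
```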
Ben: Yeah, so it’ll be like fixed in the next edition we’ll talk about in a sec. So stay tuned for that. Next up I have Unicode identifiers. Jon was telling me that he had a need for this, just the other day.
Jon: Oh yeah, it was a very important need, which is I was doing a live stream where I had to name a type. And naming types is hard, so I decided that for a little bit of fun, I was going to name the type using a Norwegian word. But that Norwegian word had the letter Å in it, which is like an A with a circle above it, and it is not in ASCII, which means that the compiler yelled at me and said you can’t use that as the name. And it was very sad and I had to choose a different Norwegian word that didn’t have any weird characters in it. To the viewers’ great chagrin. But now as of Rust 1.53, you can use Unicode in identifiers. So that is, you can use it in the names of variables, you can use it in the name of constants, and you can use it in the names of—
Ben: Functions and types.
Jon: —fields and types.
Ben: Any identifier position.
Jon: Any identifier.
Ben: It’s not full Unicode though. It’s only— there’s a subset of Unicode called UAX #31, which defines things that are kind of like name-like glyphs that get used as parts of names. So this notably does not include any emoji. So you can’t just put random emoji as the names of things. You could say, because greek letters—
Jon: Yet. This is important, Ben. You cannot yet do this.
Ben: We’ll see. We’ll see. But just know that it’s not coming any time soon, if it ever comes. I know that certain languages allow this, like Swift allows this.
Jon: And I think part of the reason is this nice warning that Rust will currently give you if you use Unicode and you use a character that looks an awful lot like an ASCII character. The compiler will actually warn you that like, maybe you meant this other identifier that looks similar, but it’s actually a different Unicode symbol sequence. And it can do that probably in part because it only uses a constrained part of Unicode.
Ben: And there are actually, I think, no less than three different but related lints trying to warn people against, like, the threat of, say, like confusable attacks or that sort of thing. So if you’re— I think you can just deny it the entire feature if you want with a lint. So you could, if you like, for whatever reason if you’re like, I don’t know, the CIA, and you’re worried about Russian hackers inserting, like, you know—
Jon: Very topical.
Ben: —like cyrillic A into your Rust code that the CIA uses. You can just like be like, aha! Foiled you, KGB. Single lint.
Jon: Deny those Unicode characters that Jon wants.
Ben: Rust is so safe.
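For illustration, here is a sketch of non-ASCII identifiers in a type name, a function name, and a variable. The Norwegian names are invented for this example (they are not the word Jon wanted on his stream), and the comment about non_ascii_idents refers to the lint that can forbid such identifiers for a whole crate.

```rust
// A crate that wants to forbid identifiers like these entirely can add
// `#![deny(non_ascii_idents)]` at the top of its root module.

// Hypothetical type: a basket of blueberries ("blåbær").
struct Blåbær {
    antall: u32,
}

// Hypothetical function with a non-ASCII name.
fn dobbel_blåbær(kurv: &Blåbær) -> u32 {
    kurv.antall * 2
}

fn main() {
    let blåbærkurv = Blåbær { antall: 21 };
    println!("{}", dobbel_blåbær(&blåbærkurv));
}
```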
Jon: The other feature of 1.53 that I’m really happy about is “or patterns.” So this is when you write, like a match, then it used to be that you could only really match sort of one pattern or another pattern. So you could say, let’s say you’re matching on like, an error type, like an io::Error and you wanted to match on like one of... oh, that’s actually a bad example. Let’s say you’re matching on a number, to take something trivial. Then you could do, like, match this number: 1 | 2 | 3 would match 1 or 2 or 3. But it gets more complicated if you wanted to do something, like say you had an Option of a number, and you wanted to match, like Some(1) or Some(2). Previously you would have to like, list those out in that way. But now with or-patterns, you can put the pipe character anywhere inside of a pattern to signify an “or” in that location. So you can say something like Some(1 | 2) to say, match a Some where the contents of that matches 1 or 2. So it just means you can condense your patterns a lot more, you don’t have to repeat the full pattern multiple times. It’s just a really nice ergonomics improvement, I think.
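Jon’s Option example can be sketched like this; classify is a made-up function name.

```rust
fn classify(n: Option<u32>) -> &'static str {
    match n {
        // Nested or-pattern: previously this had to be written `Some(1) | Some(2)`.
        Some(1 | 2) => "one or two",
        Some(_) => "something else",
        None => "nothing",
    }
}

fn main() {
    println!("{}", classify(Some(2)));
    println!("{}", classify(Some(7)));
    println!("{}", classify(None));
}
```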
Ben: One of the library additions in this version caused some consternation. So there is now an associated constant on the integer types called BITS. And the idea is pretty simple: if you have an integer type, you just write ::BITS and it tells you how many bits are in the type. So good for knowing how big a type is at compile time. Pretty useful, but the problem is that a popular library had already done this and was being used in the wild, and so it was causing an error because the compiler wasn’t sure which one you meant, and it didn’t know that they were probably the same thing; it just knows that it sees two instances of these names and it says, hey, whoa, I don’t know which one to use. I’m going to just back off here.
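As a quick sketch of the new constant in use (highest_set_bit is a hypothetical helper, not a standard library function):

```rust
// Index of the highest set bit, counting from 0, or None for zero.
fn highest_set_bit(x: u32) -> Option<u32> {
    if x == 0 {
        None
    } else {
        // u32::BITS is 32; subtracting the leading zeros gives the bit width in use.
        Some(u32::BITS - 1 - x.leading_zeros())
    }
}

fn main() {
    println!("{} {} {}", u8::BITS, u32::BITS, u64::BITS);
    println!("{:?}", highest_set_bit(0x80));
}
```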
And so it’s a bit different from the case that comes up so often in std. How naming works in Rust is, if you define a thing locally and you use a thing, the thing you define locally will overshadow whatever you imported. And so like, if Rust wants to say, like, add a method to a trait, and then you define that thing locally, I’m not sure if this is a good example, but anyway, the point is, in most cases the thing that you define locally just gets used. And so I think one example is if you have a module and you import everything from the module, then you define a thing locally, Rust, you know, isn’t going to complain in that case, because it knows you want the thing you defined locally. Which allows std to add things over time, because otherwise it might be impossible to ever add anything, ever, anywhere in std, right?
In this case though, this is not a case where the compiler understands, like hey, we want the local thing. It’s just two libraries both defining the same thing. One is std, yeah, but it’s still a library. So what should happen? And people wondered for a while, you know, as usual whenever anything like this happens, the usual approach is, okay, we’re going to send out patches to every library that uses this library as a dependency. We’re going to, first of all, fix it in the dependency, then we’re going to ask them to update, and they, like, backported it to, like, every single major version of the library, so that anyone on any version only needs to ever do a cargo update to have it fixed. But still, there’s always code out there that never gets updated. And so how do you deal with that? And so I think eventually they just realized, well, we have to be able to add things sometimes. We’ve done our best to mitigate this. We can’t do it over an edition, so we’re just going to add it.
But it’s an interesting example of like, you know, trying to evolve a library when users can do anything and define anything, how do you just deal with that? How do you like, you know, there’s always this tension and so— every language has this problem too, where like, you know, people like, you know, talk about like Java being enterprise, C++ being like enterprise ready. But if you ever look at a Java release, there’s always a huge compatibility guide, of things that have changed either like, you know, slightly in behavior or sometimes even been outright removed. So it’s a tough problem making a language. The problem is the users, really.
Jon: It’s interesting too, because this, like, I think initially no one really thought that this was going to be a problem. So it almost landed in an earlier release, but then someone caught it on nightly and was like, this broke my compilation. And so they sort of reverted it, and then worked with the ecosystem to try to get rid of the conflicting uses, and then try to stabilize it after all. And it’s like, it’s a little, it’s like, as you say, it’s a weird complicated situation, right, where you don’t want the standard library to stagnate completely either. So you sort of want to have some of this collaboration with the ecosystem, but it’s a challenge, because not all of the ecosystem is necessarily available to you.
This is where things like Crater comes in really useful, but currently you can’t really do a Crater run to test whether a change breaks all the code that companies use, for example. It’s a difficult problem.
Ben: The problem is too many folks use Rust these days, like if no one used Rust then they can change whatever they want to.
Jon: Exactly. There’s some other cool stabilizations too, off the top of my
head, like, the BTree `retain` methods are nice. So this is something that has
existed on vectors and `HashMap`s for a while. `retain` is just a method that,
you give a closure, and it gets called for every element in the collection, and
if you return true then the element is kept, if you return false, then it gets
removed. It’s a really easy way to just sort of filter down the elements in a
collection. Rather than, like, turning it into an iterator and then filtering,
and then collecting back into a thing, which can be much less efficient. And
that has traditionally been missing from the BTree data structures, but that’s
now been added.
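A minimal sketch of the `retain` behavior Jon describes, applied to a `BTreeMap`. The helper name `drop_low_scores` and the score map are hypothetical, chosen only for illustration.

```rust
use std::collections::BTreeMap;

/// Keeps only the entries whose score is at least `min`, removing the rest
/// in place. `drop_low_scores` is a hypothetical name for illustration.
fn drop_low_scores(scores: &mut BTreeMap<String, u32>, min: u32) {
    // The closure runs once per entry; returning `true` keeps the entry.
    scores.retain(|_name, score| *score >= min);
}
```

Calling it on a map holding `("a", 10)` and `("b", 90)` with `min = 50` should leave only the `"b"` entry, without building a second collection.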
Durations now have a method called `is_zero` that you can call to determine
whether this duration is zero, whereas previously you had to compare it against
like a `Duration::new`, give it two arguments, that are both zero. It was just
really awkward to construct a zero duration and now you can just `is_zero()`.
It’s very handy.
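As a small sketch of the difference, assuming a hypothetical helper named `timed_out`:

```rust
use std::time::Duration;

/// Hypothetical helper: returns `true` when there is no time left to wait.
fn timed_out(remaining: Duration) -> bool {
    // Before 1.53 this would have been `remaining == Duration::new(0, 0)`.
    remaining.is_zero()
}
```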
Ben: One of the minor changes, at least I thought was interesting, was in Rust’s handling of parsing and printing floating point numbers. In particular, being compliant with the floating point spec. Which is kind of— it’s always a rabbit hole, the floating point spec is large and subtle, and in this case there were a few cases where Rust was not in compliance with the spec. So that’s been fixed, as far as we know. But interestingly, this means that it’s a slight behavioral change where, previously, if you had a value that was negative zero— and yes, there is negative zero in floating point— it would print it as zero. And you might be thinking, well isn’t that what you want? Well, I mean semantically it is usually the same thing. Usually being the key word, because you know, with floating point numbers it sometimes does matter. But the guideline is that numbers should be able to round trip. And the idea is that if I print out a number and then re-parse that printed number, I should get the same thing back. And so as you might expect, if you have zero as a string, say, and then parse it as a float, you get the bit pattern for zero. But zero and negative zero are different things, and so, you know, if you had negative zero printed as zero, you get the wrong thing back after you parse that. So that’s just one of those interesting little float quirks. And speaking of float quirks, I know Jon is just dying to tell you about this rabbit hole that he’s found.
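The round-trip guideline Ben mentions can be sketched like this. The helper name `round_trip` is an assumption for illustration, and the sign check relies on the post-1.53 printing behavior described above.

```rust
/// Prints a float with `Display` and parses the text back into an `f64`.
/// `round_trip` is a hypothetical helper for illustration.
fn round_trip(x: f64) -> f64 {
    let printed = format!("{}", x);
    printed
        .parse::<f64>()
        .expect("Display output of a float should parse")
}
```

With the fix in place, `round_trip(-0.0)` should keep its sign bit, which can be checked with `is_sign_negative()` or by comparing `to_bits()`.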
Jon: Oh, just absolutely dying. Okay, so brace for this one. There is a
stabilization of a method in the standard library called `is_subnormal`. This is
a method on `f32` and `f64` and it tells you whether a given floating point
number is subnormal. And if you don’t know what “is subnormal” means, well,
you’re not the first and indeed I had the same problem and had to look it up. So
subnormal is basically a sort of alternative representation for floating point
numbers that are very close to zero, where you set the exponent to zero. And
then the bits of the floating point number get evaluated in a slightly different
way. That gives you less range for which numbers can be represented by it, but
gives you more resolution close to zero. And as far as I can tell, subnormal
floating point numbers are often not supported by the hardware, they’re often
much slower. I don’t know exactly when you would care whether a floating point
number is subnormal, but now you can. So that’s good. It’s one of those fun
things where, when you’re writing a general purpose programming language, that’s
also designed to work for really sort of low level code, then things like this
just matter. People need to be able to check for all these weird
representational things. And I guess `is_subnormal` is now one of them.
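A short sketch of how the check might be used. The function `describe` is hypothetical; `f64::MIN_POSITIVE` is the smallest positive normal value, so halving it lands in the subnormal range.

```rust
/// Hypothetical helper that names the category a float falls into.
fn describe(x: f64) -> &'static str {
    if x.is_subnormal() {
        "subnormal"
    } else if x.is_normal() {
        "normal"
    } else {
        // Neither normal nor subnormal: zero, infinity, or NaN.
        "zero, infinite, or NaN"
    }
}
```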
Another interesting implementation that I saw, that I was sort of scrolling
through, it’s like huh, what does this mean? So in 1.53 there’s an
implementation of `SliceIndex` for `(Bound<usize>, Bound<usize>)`. And there’s a
lot to unpack in that sort of implementation. First of all, there’s a
`SliceIndex` trait. So the `SliceIndex` trait is a trait that describes types
that can index into a slice. So that sounds complicated because it’s a little
bit weird to think about, but the idea is that if you have a `usize`, it can
index into a slice. If you have a `Range`, then that `Range` can index into a
slice. And now you can also use a tuple of two `Bound`s, which is often how you
construct a range in the first place, right? You could say a tuple where the
first element is `Included(0)` and the last element is `Unbounded`. You can use
that to index into a slice to get everything from zero and onwards, for example.
So it’s sort of an alternative representation for a range. And you can now use
that as the index for slices, which I don’t know if you ever have a use case
for it. But I thought it was interesting to sort of unpack why it’s even a thing
and what it means. It sort of stuck out as the longest thing on the list of
stabilized APIs.
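To make the tuple-of-bounds index concrete, here is a minimal sketch. The helper name `tail_from` is an assumption for illustration.

```rust
use std::ops::Bound;

/// Returns everything from index `start` onwards, using a pair of `Bound`s
/// as the slice index. `tail_from` is a hypothetical helper.
fn tail_from(items: &[i32], start: usize) -> &[i32] {
    // Equivalent to `&items[start..]`.
    &items[(Bound::Included(start), Bound::Unbounded)]
}
```

This is mostly useful when the bounds arrive as data, for example from a generic `RangeBounds` argument, rather than being written inline as `start..`.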
I had another couple of smaller things from the changelog around Cargo. The
first one of them is that Cargo will no longer assume that the default head of a
git repository is named `master`. This is in line with the change that GitHub
and other repositories have been making, of moving from `master` often to
`main`. And that will now just work in Cargo. You don’t have to specify like
`branch = "main"` in your `Cargo.toml`, it will just use whatever the head of
the repository is. So that’s a nice sort of ergonomic improvement for things
that don’t use `master`.
1.53 also landed RFC 3052 in Cargo, which makes it so that the `authors` field
of `Cargo.toml` manifests is now optional. It will not be generated by default
when you run `cargo new`. And the idea here is that, it’s both sort of an issue
for leaking personal information, somewhat accidentally, but it’s also so that
if you want to sort of change your name or remove your name from the internet in
some meaningful way, now the crates.io team has sort of a mechanism for doing
that, or you have a way to do that when you’re publishing crates, which is just
to remove it from the `Cargo.toml`, because the field is now considered
optional. So that was a nice improvement I think, that landed in 1.53.
There was also a cool change to rustdoc, which is— and you may be surprised to learn this. In markdown, if you just have a URL that doesn’t have any sort of syntax around it, it doesn’t get automatically turned into a clickable link in the rendered documentation. That’s actually a— like, you can do this on GitHub because it’s a feature that GitHub supports, but in general in markdown and in rustdoc markdown in particular, those kind of bare URLs are not rendered. And in 1.53 there’s now a lint that will tell you this. And if you’ve written a lot of documentation for your crates, which I hope that you have, you will probably see this lint crop up a lot. I know I did, where now if I compile with 1.53 I get like, a screen full of warnings saying all of these URLs are not clickable and then sort of suggesting the syntax for making them clickable, which is just I think put angle brackets around it and that will happen. But this is like another way to sort of gently and slowly evolve documentation to make it more pleasant to consume.
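A tiny sketch of the angle-bracket form the lint suggests. The function `homepage` is hypothetical and exists only to carry the doc comment.

```rust
/// Hypothetical function, present only to carry the doc comment.
///
/// Wrapping the URL in angle brackets makes rustdoc render it as a link:
/// <https://www.rust-lang.org>
fn homepage() -> &'static str {
    "https://www.rust-lang.org"
}
```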
I think that’s all I had from the sort of detailed release notes too. Did you have anything else, Ben?
Ben: That’s it for me.
Jon: I think that brings us to the proposal for the next edition. It’s a big change.
Ben: Let’s go over real quick what an edition means. The idea of an edition
is that it’s an opt-in change to, like, some parts of Rust that might be
backwards incompatible. So the idea is, if the language wants to evolve, like we
talked about the `BITS` change recently, and that for whatever reason couldn’t
be an edition change, but we’re talking about some things here where adding
things to `std` in certain ways would have been a breaking change. And rather
than just go ahead and do that, the idea is to mitigate that by saying, hey, if
you want this, you have to opt in to this change.
Jon: So Ben, are you telling me that this is just Rust 2.0?
Ben: No, no, no, no, no. Because here’s the crux, here’s the essential difference. Like people, like, you know, learned a lot from the Python 3 / Python 2 split, and crates from any edition always work together. Because fundamentally they’re all just the same code underneath. Editions only ever kind of exist in like, the very front-iest front-end of the compiler. Right? So there’s no code generation difference, there’s no back-end difference. There’s no, like, really even any kind of optimization or middle-end difference, in terms of like, how the compiler understands how the code works or is structured. It’s all just surface-level kind of cosmetic, cosmetic-ish changes, as deep— they can kind of get a little bit deep sometimes, which we will see. But mostly it’s all just things that kind of like, get tweaked and moved around a bit. And the idea is, if you don’t want to use them, you don’t have to. You can just remain blissfully unaware and use your current stuff because almost everything lands in every edition. You don’t need to stop using your edition, if you want to still use, like, the original Rust as it was, like in 2015, you can still do that. You can do that to this day, and you always will be able to do that.
Jon: You almost certainly do not.
Ben: You do not, and you shouldn’t, but you can. And that won’t be a problem because there’s no, like— you can still upgrade your compiler without fear. You can still use libraries that use new stuff. So it’s a pretty sweet solution. And so it happens every three years or so, and it’s about time for the new one. And the last one was pretty big. This one’s going to be much smaller. The last one had like a lot of headline stuff, and it was the first edition ever, and they had a brand new shiny toy, they wanted to advertise it and be real big, and kind of like put up a big, like, marketing event for Rust. We were talking about like everything we’d done in the past three years. And since then, it’s kind of been like, well actually we’re pretty good at just talking about things as they happen. So it’s more of a low-key edition. No, like, real huge headlining feature. Which is maybe a good thing.
Jon: Yeah, and, I think it’s easy to sort of understand what the edition means when you— like, this blog post, I thought, was really well written. And I think just seeing the kind of changes that happened too, on an edition boundary, helps a lot in understanding the kind of things that they can change and sort of by exclusion, the kind of things that they can’t.
Ben: And we should mention too, that if you ever like— the standard way of
making a new project in Rust is like, you type `cargo new` or `cargo init`, if
you’re already in the repository, and it just adds your `Cargo.toml`, and the
secret sauce, the magic, the final, like, magic sprinkle of this is that in the
new edition, any new crate always begins with the most recent edition. So like,
the ecosystem gradually moves forward as folks write new things, even without
having to opt-in their old stuff.
Ben: So yeah, and let’s get into some of the changes. There’s not too many. Not too major, but they should be pretty nice. First up is new additions to the prelude. So, in Rust the prelude is sort of— you can imagine— you’ve written Rust before, you don’t need to—
Jon: What? No, I haven’t.
Ben: You haven’t? Oh, let me talk about it’s written language, it’s super safe, super fast, it’s super cool.
Jon: Rust, isn’t that a game?
Ben: Well it is, actually. It is both. We’ll talk about it later, maybe in a
future episode. But, so the idea is in a Rust program, you don’t need to like—
to write a Hello World, there’s no like— you know,
`fn main() { println!("hello world"); }`, you don’t need to import `println`,
you don’t need to import many various things that are fundamental to many Rust
programs, stuff like `Option`, stuff like `Vec`. You know, it’s always just
there lying around, ready for you. And if you, you know, if you know the module
system, you know that you can like, type like, you know, `use crate_name::*`.
And there is effectively in every Rust program, a line at the top, kind of like
hidden, that is `use std::prelude::v1::*`. The `v1` just kind of being, you
know, like forward thinking, it’s invisible, who cares. The idea being that this
prelude is what gets brought into scope in every Rust program. And so normally,
it’s not a big deal to add things to the prelude because as mentioned before, in
Rust if you define a thing locally it’ll take precedence over whatever you
import, so they can add things there, like types for example, without problems.
So right now, HashMap for example is not in the prelude, but if they wanted to,
they could add it without an edition, just whenever they wanted to, and it
wouldn’t be a breaking change.
But for traits specifically, the idea is if you’re calling a method on certain
things, then the method might become ambiguous if they added a trait to the
prelude. So this new addition, or the addition to the edition is that three
traits widely requested are being added to the prelude, available only with the
new edition. They are `TryInto`, `TryFrom`, and `FromIterator`. And so `TryInto`
and `TryFrom` are just the counterpart to the `Into` and `From` traits, which
are already in the— they’re in the prelude, right? I’m pretty sure those are
already in the prelude.
Jon: Yeah, they are.
Ben: You know, definitely they are, because you can do like, you know,
`x = String::from` and then a string literal, to make a `String`. So now
you have this for anything that is not an infallible conversion. And then
`FromIterator` makes sense because `IntoIterator` is already in the prelude,
because it’s what lets you kind of use `for` loops, pretty much, and it’s kind
of just the reverse. And so it’s very useful for taking a thing that’s already
an iterator and turning it into— it’s actually, it’s how `collect` works if
you’ve ever seen, you know, a long iterator chain and then `.collect()`. This
lets you just use `FromIterator` directly, which powers `collect` under the
hood, which is useful in a bunch of contexts.
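A brief sketch of the three traits in use. The explicit `use` lines are what pre-2021 editions need; in the 2021 edition they should be unnecessary because the traits are in the prelude. The helper names `to_byte` and `squares` are hypothetical.

```rust
// Needed before the 2021 edition; redundant once the traits are in the prelude.
use std::convert::TryFrom;
use std::iter::FromIterator;

/// Fallible conversion: succeeds only when `n` fits in a `u8`.
fn to_byte(n: i32) -> Option<u8> {
    u8::try_from(n).ok()
}

/// Builds a `Vec` by calling `FromIterator` directly instead of `.collect()`.
fn squares(limit: u32) -> Vec<u32> {
    Vec::from_iter((1..=limit).map(|n| n * n))
}
```

Naming the target type up front with `Vec::from_iter(...)` can read more naturally than a trailing `.collect::<Vec<_>>()` in some generic code.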
Jon: Yeah, it can also make it a little bit nicer to write certain types of
generic code, where the annotations get a little ugly if you have to use
`.collect()`, whereas if you can use `FromIterator`, you can sort of— you get to
turn the chain around, and sometimes that restructuring makes the code a
little nicer to read.
The other thing that’s on by default in the 2021 edition, or will be on by
default, is we talked I think in the previous episode on Rust 1.51, about
Cargo’s new feature resolver, which you can activate currently by setting
`resolver = "2"` in your `Cargo.toml`. And in 2021, or in the 2021 edition I
should say, that resolver version will be the default. Because it’s just better,
like, you should just always use this. But it’s backwards incompatible for any
given crate, to switch it from 1 to 2 because now your features mean slightly
different things, and your dependencies are compiled in a slightly different
way. But because the edition is an opt-in anyway, it’s totally safe to say if
you opt into the edition, you’re implicitly also opting into the resolver. Those
are— that’s a totally safe transformation to make at an edition boundary.
Ben: We also already mentioned the `IntoIterator` for arrays change and how
there is this kind of like very specific, targeted, laser-targeted hack, which
allows code that exists currently that calls, you know, `some_array.iter()` to
still function, even though nowadays that should give you the by-value iterator
and not the by-ref iterator. So that hack will only exist for editions prior
to 2021. So all the code will keep compiling, but now in the new edition and
beyond, it will work as you expect. If you’re writing, like, new code that uses
iterators, and remember that also, you don’t need to normally write `.iter()` if
you’re doing things with, you know, arrays and iterators, you could also just
use that in a for loop to say, `for blah in the_array`. Or there are definitely
other contexts where you can just, like, make an iterator without using
`.iter()`. So it’s not even the case that, like, this is usually a problem, but
it will become more consistent.
Jon: Yeah, it’s only in the specific case of using like, `.into_iter()` as a
method on an array.
Ben: I think just `.iter()` is the problem. I think `.iter()`. Maybe? I’m
not sure, one of those two.
Jon: No, `.into_iter()` is the problem. `.iter()` has always worked, because
it always gives you a by-reference iterator, but `.into_iter()` on an array
gives you a by-value one.
Ben: OK. You’re right, yes.
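To illustrate the by-value behavior without depending on the edition, the sketch below avoids the `array.into_iter()` method-call form (the one the compatibility hack covers) and uses a `for` loop and the trait function instead. Both helper names are hypothetical.

```rust
/// Consumes the array and joins the owned strings.
fn join_owned(words: [String; 3]) -> String {
    let mut out = String::new();
    // A `for` loop over an array yields items by value (since 1.53).
    for word in words {
        out.push_str(&word);
    }
    out
}

/// Same idea through the trait function, which yields `i32`, not `&i32`.
fn sum_by_value(values: [i32; 3]) -> i32 {
    IntoIterator::into_iter(values).sum()
}
```

In the 2021 edition, `values.into_iter()` written as a method call should behave like the second function; in earlier editions it still resolves to the by-reference iterator.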
Jon: The next one is, and this is I think a big one for me, is that in the
2021 edition, closures are going to capture only the specific fields that are
moved into them. So if you have a struct that has two fields `x` and `y`, and
let’s say you have a variable `a`, that’s an instance of that struct, and you
access like `a.y` inside the closure, but you don’t access `a.x`. Then the
closure will only move `a.y` into the closure and not `a.x`. This is a really
good sort of quality of life improvement, or rather it’s just less surprising to
users, I think, where previously you would get these unfortunate borrow checker
errors that are really annoying to work around. If you say, like drop or move
`a.x`, and then you try to use `a.y` in the closure, it, like, fails, and you
have to assign a temporary variable the value of `a.y` outside the closure and
then move it in. It would just require these awkward workarounds that are hard
to discover. And in the 2021 edition, finally, closures will only capture what
they need. And so those kinds of weird errors or warnings from the compiler will
no longer happen, and that’s really nice.
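The workaround Jon describes can be sketched as follows. The code is written in the pre-2021 style so it compiles in every edition; the comment marks what the 2021 edition should make unnecessary. `Pair` and the function name are hypothetical.

```rust
struct Pair {
    x: String,
    y: String,
}

/// Drops `a.x`, then uses `a.y` inside a `move` closure.
fn y_len_after_dropping_x(a: Pair) -> usize {
    drop(a.x); // `a` is now partially moved.

    // Pre-2021 workaround: pull the field into its own variable first.
    // In the 2021 edition the closure could mention `a.y` directly,
    // because it would capture only that field rather than all of `a`.
    let y = a.y;
    let len = move || y.len();
    len()
}
```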
Ben: This next one we spoke about last time too, I believe. It’s panic macro
consistency, the idea being that— there’s a new feature incoming, sometime this
year, hopefully, which allows you to implicitly capture formatting arguments
from the environment. So if you’re used to string interpolation from other
languages, used to probably writing kind of, you know, like in your string
literal, have, like, the name of some variable and it just kind of like plugs it
in right there, with no more ceremony needed. And so that is coming to Rust in
limited form later on, but while doing this, it was noticed that the `panic!`
macro wasn’t actually— in the case of a single argument to `panic!`, it wasn’t
actually properly forwarding to the `format_args!` machinery. So like, if you
had a `panic!` of just a string containing some braces, it would not interpret
that as a thing to format. So that was inconsistent with all the other, like
`println!` and so on, and so there’s been some shuffling around where, like, the
`panic!` macro will now do this, but there’s now a new macro called `panic_any`
for the old behavior, and so on. And so that’s just moving forward by default in
the new edition.
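A hedged sketch of the two forms: `panic!` with a format string and arguments, and `std::panic::panic_any` for an arbitrary payload. The helper names are hypothetical, and `catch_unwind` is used here only so the payloads can be inspected.

```rust
use std::panic::{self, catch_unwind};

/// Panics with a formatted message and returns that message if the payload
/// is a `String`.
fn formatted_panic_message(code: u32) -> Option<String> {
    let result = catch_unwind(|| -> () { panic!("failed with code {}", code) });
    result.err()?.downcast_ref::<String>().cloned()
}

/// Panics with a non-string payload via `panic_any` and recovers the value.
fn payload_from_panic_any(value: i32) -> Option<i32> {
    let result = catch_unwind(move || -> () { panic::panic_any(value) });
    result.err()?.downcast_ref::<i32>().copied()
}
```

Both functions still print the usual panic message to stderr; only the unwinding is caught.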
Jon: Yeah, and I think `panic_any` is actually— it’s a function, not a
macro, specifically for this reason. And then another one is reserving syntax.
So the idea here is it doesn’t actually change any behavior, but essentially
what they’re doing is saying, we’re going to reserve the syntax, like some
prefix, followed by either a pound and then an identifier or followed by a
string in double quotes, a character in single quotes, or a number. So this is
something like— you’ve already seen this, if you’ve ever written, like, a byte
string literal, which is something like `b"a string"`, which gives you a slice
of `[u8]`, like a byte string, rather than a Unicode or UTF-8 encoded string.
And this sort of reserving this syntax of anything that matches this pattern of
prefix and then `#identifier` or `"string"`, and so on, means that later on
without an edition boundary, they can add support for more prefixes. So
currently there’s only I think `b` for byte strings and byte chars, and `r` for
raw or un-interpolated strings. But you can imagine that you want to add
additional prefixes in the future, for things like a shorthand for format
strings, or for null-terminated strings, or for like, using a keyword like
`async` as an identifier, like the name of a variable. But those things would
all require edition boundaries, if they made it so— if they added a new prefix,
because the general syntax of say `k#keyword` would not be reserved. But now
they’re just, like, blanket-reserving anything that looks like this, so that
they can add these prefixes later on, without that requiring an edition
boundary.
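For reference, the prefixes that already exist look like this. The function names are hypothetical; using `match` as a function name is only there to show the `r#` form.

```rust
/// A byte string literal has type `&[u8; N]`, not `&str`.
fn magic_bytes() -> &'static [u8] {
    b"RUST"
}

/// A raw string literal leaves backslashes uninterpreted.
fn windows_path() -> &'static str {
    r"C:\temp\new"
}

/// `r#` turns a keyword into an ordinary identifier.
fn r#match(a: u8, b: u8) -> bool {
    a == b
}
```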
Ben: I’m qualified to talk about this because I wrote the RFC for this actually.
Jon: Oh, nice.
Ben: Yeah, you may be, like, so the actual syntax here, like, from the
previous edition, 2018, the idea was, you mentioned you want to be able to
actually write, you know— you have code that maybe has `async` as an identifier.
And Rust wants to use it as a keyword. So now there’s this thing called raw
keywords in the language, or raw identifiers in the language, which lets you
just, like, use any keyword as an identifier by having this `r#` in front of it.
And it’s funny because the reason that was chosen is because that was already
a, like, syntactic construction that was, just by chance, a syntax error in all
versions of Rust, because it was trying to, like— it thought you were trying to
parse a raw string. So because of that, it could reserve this syntax.
And so we’re kind of expanding that, the hashtag syntax, to other things,
potentially. And so one of the previously proposed ideas for the edition was a,
kind of, a `k#` prefix which does the reverse, which lets identifiers be used
as keywords. And it’s kind of, like, there’s no actual use for that yet. But
the idea was, well it’s an edition, so we have to like, you know, it would
require a change to do, so we’ll just do it right now. But the idea is, we don’t
actually do that right now. We can still, like— there’s no time pressure,
there’s no crunch. We can just, like, reserve this syntax now and then like,
worry about the details later.
And the same thing for strings. And so like I mentioned before, format strings,
like Python-style F-strings for example, are a big inspiration for the current
implicit formatting arguments. And so like, at some point people might want to
actually move to having actual F-strings. So that’s now a possibility.
Again, just a possibility. Nothing guaranteed yet. But also it allows things
like say, if you wanted to have, like, just actual string, like capital S
`String` literals, you could have those. Maybe an `s` prefix on there. A `c`, or
like a `z` for like C-style null-terminated strings. It’s also, like, the bytes
have— or sorry, the char literals have it too. And also numeric literals have
it. And this is actually, it’s funny because the exact same syntax Rust uses for
these raw keywords, the hashtag, gets used by languages like Ada and various
other languages to denote, like, prefixes. Usually for the choice of what radix
the literal should be in. I’m not sure, maybe there could be like a wrapping
string literal or wrapping integer literals, or like, you know, that kind of
thing someday.
Jon: Encryption. An encryption prefix.
Ben: Yeah, like, you know, secrets.
Jon: Yeah.
Ben: Yeah, that could be a thing. Maybe even user defined someday, like overloadable. But the idea— normally almost no code ever actually needs to worry about this. And so you may be wondering, why does this actually need an edition change? And so the idea is that macros that consume syntax, like, token by token, previously could observe the difference, and they might be broken by this. And so because macros consume things per token. Actually you can fix your macros if you have this, by putting in white space anywhere in this, to like, force it to be tokenized separately and not as a single, like, prefixed literal or whatever. But you know, it’s still like, it’s an incredibly unlikely and minor problem to actually affect anybody except maybe if you already have a crate giving you, like, F-strings, say in macros.
Jon: Right, but the nice thing here is the edition boundary lets us not
worry about whether it might be breaking, and just say on the edition boundary
we’re allowed to make this change. And in fact, sort of talking about macros,
that’s also the motivation for another edition change, which is: we mentioned
or-patterns earlier, right? And when you write a macro definition using
`macro_rules`, there’s this, like, fragment specifier there called `pat` for a
pattern that matches anything that is a valid Rust pattern as its input.
And one thing that they had to do when they landed the support for or-patterns
is that a `pat` fragment specifier in a `macro_rules` does not match anything
that includes an “or”. And the reason for that is because those can’t
necessarily— like if you were to match them, those patterns can’t necessarily go
anywhere patterns can go. That is, there are still some places where you cannot
use or-patterns, and so you don’t want the macro to match an or-pattern when it
can’t necessarily go there, in what the macro expands to. And so they can’t
change what `pat` means, within the current edition, but across editions they
can. So in the 2021 edition, `pat` will match or-patterns as well, and then
there will be this like, `pat_param` fragment specifier, that can be used to
match anything that doesn’t include an “or”, that can be used if you
specifically don’t want to match them. But it would of course be a breaking
change, to change what `pat` matches now, but the edition boundary lets us do
it.
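A minimal sketch of `pat_param` in a `macro_rules` definition. The macro `is_one_of` and the function `is_vowel` are hypothetical names; the point is that `pat_param` stops at a top-level `|`, so the macro can use `|` as its own separator.

```rust
/// Hypothetical macro: true when the value matches any of the
/// `|`-separated alternatives.
macro_rules! is_one_of {
    // `pat_param` never swallows a top-level `|`, so `|` can act as the
    // repetition separator here.
    ($value:expr, $($alt:pat_param)|+) => {
        match $value {
            $($alt)|+ => true,
            _ => false,
        }
    };
}

fn is_vowel(c: char) -> bool {
    is_one_of!(c, 'a' | 'e' | 'i' | 'o' | 'u')
}
```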
Ben: And then finally, people who were around for the previous
edition change will remember that some things that used to exist
became warnings, some syntax. So in particular, you used to be able to just— or
you still can, it’s not an error just yet, but it used to not give you any kind
of guff for just using a trait object without the `dyn` keyword. So you could
just kind of like say, like if you want an object, you just say like hey, the
name of the trait, and that’s just now a trait object. And so like, when
`impl Trait` was coming around, it was kind of like, well, do we want to like,
really privilege, like dynamic stuff over static stuff here, like, who should
have the keyword and who shouldn’t? And so I think in the future, it’s going to
be just that like, if you want dynamic dispatch, you need to use the `dyn`
keyword. Static, you need to use `impl`, and like, there was talk, like years
ago, of like, you know, not requiring the `impl` keyword if you want to have,
like, you know, a statically dispatched kind of like a, you know, type for
returning, or as inputs to a function. Which I’m not sure will happen, but
it’s just the case that now we will require the `dyn` keyword in the new edition
if you want to dynamically dispatch your traits.
And similarly, in Rust 1.0 it was kind of like, they weren’t yet sure about the
syntax of inclusive ranges, where like, some languages have like three dots
meaning one and two dots meaning the other. And so eventually it was chosen to
use the `..=` syntax for inclusive ranges, as opposed to the just plain `..` for
exclusive. But there had already been stabilization for using `...` in pattern
position. And so that’s been deprecated since the previous edition and that will
now become a hard error in the new edition.
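Both spellings that the new edition requires already work today. A small sketch, with hypothetical `Shape`, `Square`, and helper names:

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Dynamic dispatch is spelled out with `dyn`; a bare `Box<Shape>` is the
/// form the 2021 edition rejects.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Inclusive range patterns use `..=`; the old `...` form is the one the
/// 2021 edition rejects.
fn is_digit(c: char) -> bool {
    match c {
        '0'..='9' => true,
        _ => false,
    }
}
```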
Jon: I think the big question on everyone’s mind now, Ben, is, but when will I get this edition?
Ben: Sometime in 2021.
Jon: Nice. I think the goal is Rust 1.56. But as this blog post points out as well, like Rust is a project that’s run as an open source project, mainly by volunteers. And so there aren’t really hard deadlines, like 1.56 is the goal, but that might slip, either for sort of personal reasons or for technical reasons, and it’s all fine. The hope though is that it will land in 2021, otherwise I’m guessing the name would change.
Ben: And I believe there will— the current timeline does include a testing, a period of open and public testing in July, so I’m sure they’ll announce that, so keep an eye on the Rust blog and then if all goes well, it should be riding the trains by end of July, and out sometime in September or so, I think.
Jon: It’s very exciting. I’m very excited. Are you excited, Ben?
Ben: I am. I’m thrilled. I’m just bursting with joy.
Jon: Well, I think that’s all we had for this time around. So I suppose all that’s left for us to do now is just sort of strap back down in our chairs, and just conserve energy until the next time.
Ben: I’m here. I’ve been here, I’ll be here, I’m going back into stasis, I’m hibernating. I will set my alarm for six weeks from now.
Jon: It’s going to be great. I’m already looking forward to being able to stretch my legs again.
Ben: That said, let’s play the outro. Nice seeing you again, Jon.
Jon: Likewise, Ben. And I will see you in 12 weeks from now.
Ben: All right, and see you too, folks. Goodnight.
Jon: Bye.