perl v5.36.0 has been released

Yesterday, I released perl v5.36.0. I think this is the most exciting release of perl in quite some time, and I’m hoping that in a few months, I’ll still be as pleased with it as I am today. Here’s a summary of what we got done, what we didn’t get done, and (to some extent) how it got done.

What’s new?

Every new version of perl ships with the “perldelta”, which provides a fairly detailed list of what’s new, down to small optimizations and bugfixes. It’s not comprehensive, but it’s more detailed than you usually want for casual reading. The perldelta for v5.36.0 is weighted to put the good stuff first, but if you want, you can just read my summary here!

A bunch of the things documented below are experiments, and will warn you when you use them. Check them out, use them day to day, but don’t stick them in your critical money-making code just yet!

reducing boilerplate

Perl 7 was announced a little less than two years ago, and from there, differences in priorities within the perl development community came to a head, eventually resulting in perlgov, a new system of governance for the project. If you want to know more about it, you can find a lot about this on the web. I talk about some of it in my What’s New in Perl 5.34 talk from the Perl Conference last year. I won’t be dwelling on it here, but it’s important context for one of the big priorities in v5.36.

Perl 5 is a fairly old and complex language, and over time quite a few things have become acknowledged almost universally as best or worst practices. Always start by using strict. Never use bareword filehandles. Perl 7 was going to change the default behavior of the language so you didn’t need to opt in to best practice enforcement. Instead, you’d have had to opt out explicitly if you didn’t want it. This would cause work when upgrading, which was (I think) the primary objection to the plan.

Even those of us who didn’t want to change the default behavior of perl generally acknowledged that perl’s defaults are not great. Making it easier for people to get the best possible perl was a priority. We started talking about how Perl 7 changed the behavior of the program starting at line 1, while having to write “use strict” would change your program by line 2. The goal became “reduce the entire set of best practice opt-ins to a single line”. This would keep the “no opt-in” behavior as the sometimes annoying or dangerous Perl 5.0 default, but would make it possible to get everything that we think you should be using in a single line. That line is:

use v5.36;

The use VERSION syntax is nothing new, but we made it a priority to focus on what it should do, and to make that happen. I now think it’s practical to write long-lived programs that have use v5.36 as their only boilerplate line. Here’s what it does, in brief:

  • turns on strict (use strict)
  • turns on warnings (use warnings) ⬅ that’s new!
  • turns on all the same new features as v5.34…
  • …except switch, which is finally left out!
  • turns off indirect object syntax
  • turns off multidimensional array emulation
  • turns on subroutine signatures ⬅ woo!
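To make the first two bullets concrete, here is a minimal sketch, assuming perl v5.36 or later. The variable names are made up for illustration, and the string eval and the warning handler are only there so the script can report what happened instead of dying or writing to STDERR.

```perl
use v5.36;

# strict is on: using an undeclared variable fails to compile. The string
# eval is only here so the script can report the error and keep going.
my $compiled = eval q{ $undeclared = 1; 1 };
my $strict_error = $compiled ? "" : $@;

# warnings are on: collect them instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };
my $missing;
my $text = "value: " . $missing;

# say is one of the features in the bundle.
say "strict error:  $strict_error";
say "warning count: ", scalar @warnings;
```

Nothing but the single use v5.36 line was needed to get all three behaviors.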

turned on: warnings

This was the first big reason that use v5.34 wasn’t enough for me. I also always wanted warnings. Now I get them. Excellent!

turned off: indirect object syntax

“indirect object syntax” is the thing that lets you write $obj = new Class(...) instead of Class->new(...). The big win from turning it off is that many bugs become compile time errors instead of runtime errors. For example, this code will now fail at compile time, instead of runtime:

#!perl
use v5.36;

# Note that I didn't use Try::Tiny here.

try { some_expensive_operation };

With indirect, this will call some_expensive_operation, then try to call the method try on it. This will (probably) fail, because there is no such method. You meant to use Try::Tiny to get the try function, but forgot. With the indirect feature disabled, this becomes a compile time error. (It’s a slightly arcane error, but it happens at compile time rather than run time, and I think it’s a big win.)

turned off: multidimensional array emulation

I’ll miss this one, because I liked it, but I agree it was really weird and probably caused a lot of confusion.

for my $i (0..10) {
  for my $j (0..10) {
    $value{ $i, $j } = get_value($i, $j);
  }
}

See how I put a list of values in the subscript for %value? That’s not a mistake. Perl will see the list and implicitly join the list’s values into a single string which is used as the hash key. Join with what delimiter, you ask? Why, whatever is in $;, of course!

Without the multidimensional feature in play, you’ll have to write the join by hand. Or do anything else you like.
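Writing the join by hand might look like the sketch below. It assumes perl v5.36 or later, and it stores a stand-in value ($i * $j) because get_value isn't defined in this post.

```perl
use v5.36;

my %value;
for my $i (0..2) {
  for my $j (0..2) {
    # Explicit version of what $value{ $i, $j } used to do implicitly.
    $value{ join($;, $i, $j) } = $i * $j;
  }
}

# Lookups build the key the same way.
my $corner = $value{ join($;, 2, 2) };
say "value at (2, 2): $corner";
say "keys stored: ", scalar keys %value;
```

Any delimiter that can't appear in your data would work just as well as $; here.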

turned on: subroutine signatures

Well, this took us long enough, huh?

We’ve been kicking this one down the road for a lot of reasons. The biggest one is that we still think we can make subroutines with signatures much faster than subroutines without signatures. We didn’t do it yet, but it’s been so long that these have been sitting unchanged in experimental that it just seemed wrong to leave them there. Now, when you write use v5.36, you get signatures available on your subroutines.

There’s one change, though: you’re now warned against using @_ inside a subroutine that has a signature. Right now, it works, but later it might not. What will it do? Probably nothing you’d ever want to rely on. So just don’t do it!
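Here's a small sketch of a signature with a mandatory parameter, a default, and a slurpy array, assuming perl v5.36 or later. The describe subroutine and its arguments are invented for the example.

```perl
use v5.36;

# Mandatory, defaulted, and slurpy parameters in one signature.
sub describe ($name, $count = 1, @tags) {
  my $tags = @tags ? " [" . join(", ", @tags) . "]" : "";
  return "$count x $name$tags";
}

my $simple = describe("widget");
my $tagged = describe("gadget", 3, "red", "small");
say $simple;
say $tagged;

# Calling with too few arguments dies at run time.
my $error = eval { describe(); 1 } ? "" : $@;
say "error: $error";
```

Note that nothing in the subroutine body touches @_, which is exactly the habit the new warning is nudging you toward.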

n-at-a-time iteration

There’s a new (experimental) feature called for_list. It lets you do this:

for my ($left, $middle, $right) (@list_of_triples) {
  printf "%20s %20s %20s\n", $left, $middle, $right;
}

…and it will iterate over the list three-at-a-time. If the list’s length isn’t divisible by three, the missing values are presented as undef. This means, of course, you can write this:

for my ($k, $v) (%hash) {
  ...
}

I will definitely be using this to replace the tedious pattern of iterating over keys and immediately fetching out values. If you want to iterate over indexes and values in an array, we have you covered, too:

for my ($i, $v) (indexed @array) {
  ...
}

What’s indexed? It takes a list of values and returns the same list, zipped together with indexes. It’s useful, but not very interesting. What is interesting is…

the builtin namespace

Ages ago (well, 12 years ago), Yves Orton suggested we start using some namespace for new Perl functions so that we’d stop cluttering up the core namespace. This year, we finally did it. The namespace builtin exists to serve as a place we can put new functions without worrying about clobbering user functions.

If you want to use indexed you can either write builtin::indexed or you can import it:

use builtin 'indexed';

for my ($i, $v) (indexed @array) {
  ...
}

(Those imports are lexical, unlike the package-scoped functions you usually get when importing.)
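Putting the pieces together, here is a runnable sketch, assuming perl v5.36 or later. The sample data is made up, and the two experimental warning categories are switched off only to keep the output quiet.

```perl
use v5.36;
use builtin 'indexed';
no warnings qw(experimental::builtin experimental::for_list);

my @array = qw(apple banana cherry);

# Index and value together, two at a time.
my @lines;
for my ($i, $v) (indexed @array) {
  push @lines, "$i: $v";
}
say for @lines;

# The same loop shape walks a hash as key/value pairs.
my %price = (apple => 3, banana => 1);
my $total = 0;
for my ($name, $cost) (%price) {
  $total += $cost;
}
say "total: $total";

# A list that runs out early is padded with undef.
my @padded;
for my ($left, $right) (1, 2, 3) {
  push @padded, [ $left, $right ];
}
say "last pair has a right-hand value: ", defined $padded[1][1] ? "yes" : "no";
```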

There are a bunch of functions already in builtin in v5.36:

  • weaken, unweaken, is_weak - for managing weak references, now core
  • blessed, refaddr, reftype - for information about refs, now core
  • ceil, floor - for converting floats to integers; no more loading POSIX for these
  • trim - to efficiently trim whitespace from a string; this was a source of enormous contention!

And then there were:

  • created_as_string, created_as_number - these will tell you whether perl can remember that the given value was created as a string or as a number (as a literal, or as the result of a string or numeric operation); perl doesn’t really have types for scalar values, but it can be useful to know what its gut feeling is
  • is_bool - the same thing, but for booleans
  • true, false - produce boolean values without having to write 1 == 1

These are very special-purpose tools, largely there to make doing things like serializing to JSON more predictable. You probably don’t want to use these often, but when you want them, these are a lot nicer than the old way, which was basically “screw around with B”.
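To give a feel for a few of these, here is a short sketch, assuming perl v5.36 or later. The input values and the Some::Class package name are invented for the example.

```perl
use v5.36;
use builtin qw(trim ceil floor false is_bool created_as_number reftype);
no warnings 'experimental::builtin';

my $trimmed = trim("  hello world \n");
my $up      = ceil(2.1);
my $down    = floor(-2.1);
say "[$trimmed] $up $down";

# is_bool distinguishes real booleans from values that are merely truthy.
my $flag     = false;
my $is_flag  = is_bool($flag) ? "yes" : "no";
my $is_plain = is_bool(0)     ? "yes" : "no";
say "false is a boolean: $is_flag; plain 0 is a boolean: $is_plain";

# created_as_number reports how the value started out.
my $from_number = created_as_number(42)   ? "yes" : "no";
my $from_string = created_as_number("42") ? "yes" : "no";
say "42 created as number: $from_number; \"42\" created as number: $from_string";

# reftype looks through the blessing to the underlying reference type.
my $kind = reftype(bless [], "Some::Class");
say "underlying type: $kind";
```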

defer blocks

If you’ve used Go (or some other languages) you know defer. It lets you write code that won’t be executed until the program is exiting the enclosing block:

{
  my $resource = construct_resource();

  defer { $resource->cleanup(leave_files => 1) };

  $resource->do_stuff;
}

Here, the defer block’s code won’t be executed until the whole enclosing block ends, after do_stuff is called.

Sometimes code like this could be reduced to “well, there’s a destructor on $resource”, but not always. For example, here, we’re passing some arguments in that might not do what the normal destructor would do. Also, there might not be a clear thing to handle the logic on destruction. This has led to a bunch of CPAN modules that are just sugar for “take a code ref, return a useless object, and when the object is destroyed, the code ref is called”. These defer blocks don’t make that entirely obsolete, but they go a long way.
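Here's a self-contained sketch of the ordering, assuming perl v5.36 or later. Instead of a real resource, it records events in a made-up @events array so you can see when the deferred code runs.

```perl
use v5.36;
use feature 'defer';
no warnings 'experimental::defer';

my @events;

{
  push @events, "acquire";
  defer { push @events, "cleanup" }
  push @events, "work";
}

# The deferred code also runs when the block is left by an exception.
eval {
  defer { push @events, "cleanup after die" }
  die "boom\n";
};

say join " -> ", @events;
```

The "cleanup" entry lands after "work" even though the defer block appears first in the source.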

finally blocks

Perl v5.34 added try and catch as experiments. In v5.36, we’ve added finally, which goes after catch and provides code that will run after the try block succeeds or the catch block executes, whichever happens. It’s roughly sugar for a defer block tied to the try/catch.
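A minimal sketch of all three blocks together, assuming perl v5.36 or later. The attempt subroutine and the @log array are hypothetical names used only to show the order of execution.

```perl
use v5.36;
use feature 'try';
no warnings 'experimental::try';

my @log;

sub attempt ($should_fail) {
  try {
    die "something broke\n" if $should_fail;
    push @log, "try succeeded";
  }
  catch ($e) {
    chomp $e;
    push @log, "caught: $e";
  }
  finally {
    push @log, "finally ran";
  }
  return;
}

attempt(0);
attempt(1);
say for @log;
```

Either way the call goes, the finally block gets the last word.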

lots and lots of quote delimiters

Okay, this one is one of my personal favorites, even though it isn’t the most transformative feature we’ve ever added:

use v5.36;
use feature 'extra_paired_delimiters';
no warnings "experimental::extra_paired_delimiters";

my $error = qq«No value found at \$foo->[0]{$_} (fatal!)»;

With extra paired delimiters enabled, many new pairs join (...) and <...> and [...] and {...} as delimiters for quote-like operators. This means you’re much less likely to have to think about escaping or counting brackets. I’m pretty sure I’ll almost always use «...», but there are quite a few good choices.

This whole area was interesting because:

  1. It starts making a foray into non-ASCII syntax, which I think is good.
  2. It exposes the complexity in picking paired brackets; I’m hopeful that ongoing Unicode Consortium work is going to make this their problem, not ours.
  3. It found some very weird code on the CPAN, where someone had (more or less) written s/ ̈/ ̃/g to replace diaereses with tildes. But really, the way their code was written, it was more like s/̈/̃/g, where the combining marks combined with the slashes. So weird!

I look forward to more non-ASCII syntax over time. Probably.

…and more!

Look, there are a lot of other changes in there, but this post is pretty long already and I want to cover other stuff.

What isn’t new?

There were quite a few things we’d talked about getting into v5.36 that we didn’t, and some of them, I hope, are still on the list of things to be gotten done for v5.38.

disabling bareword filehandles

What if we got rid of barewords being usable as filehandles? It’d get rid of a major source of non-lexical variables and at the same time help ensure filehandle discipline (that is: closing things when you’re done!). Turning off the bareword_filehandles feature was meant to forbid using barewords as filehandles, but quite late in the game we realized that it was insufficiently tested, and sometimes forbade using a bareword for a class on the left hand side of an arrow.

dinah:~$ perl -E 'no feature "bareword_filehandles"; print Class->new'
Bareword filehandle "Class" not allowed under 'no feature "bareword_filehandles"' at -e line 1.

Hopefully this just needs some more testing and bugfixing.
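In the meantime, nothing stops you from using lexical filehandles everywhere today. A minimal sketch, assuming perl v5.36 or later, reading from an in-memory string (an illustrative stand-in for a real file) so that it runs anywhere:

```perl
use v5.36;

my $text = "first line\nsecond line\n";

# A lexical filehandle lives in a "my" variable and is closed automatically
# when that variable goes out of scope.
my @lines;
{
  open my $fh, '<', \$text or die "can't open in-memory file: $!";
  chomp(@lines = <$fh>);
}
say scalar(@lines), " lines read";
```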

mandating clearer source code encoding

I had really wanted to make use v5.36 also imply use utf8, which would declare that the source file on disk is encoded in UTF-8 and should be decoded as read. Eventually, I was convinced that this would be too confusing, because it would break programs that printed their own literal strings to normal filehandles. They’d be printing Unicode codepoints to a byte stream, which would trigger warnings and, sometimes, produce the wrong output.

Eventually, the proposal became RFC 7, which creates a new pragma, use source::encoding. It lets you declare that the source file is encoded in UTF-8, equivalent to use utf8, but it also lets you declare that the file is in ASCII. Then, the version bundle can default to implying ASCII. It won’t let you just write non-ASCII in your source, but it will turn non-ASCII source code into a compile time error rather than weird runtime behavior. (It has other nice properties, too.)

finishing more experiments

I really wanted to make declared_refs and refaliasing non-experiments. Unfortunately there are still some serious bugs that make them not good enough. Bummer! If you don’t remember what these even are, they’d let you do this:

for my (\@values) (@array_of_arrayrefs) {
  @values = reverse @values; # affects the array referenced from @array_of_arrayrefs
}

Making this all work nicely leads to more general-use syntax, I hope.

converting some features into standards

We’re probably not going to turn strict on by default, roughly ever. But there are some things turned on by use v5.36 that should be on by default, I think. These include the postderef_qq feature, which allows postfix dereferencing inside of interpolative (double-quoted) strings, and the unicode_strings feature, which fixes a set of similar bugs in Perl’s handling of non-ASCII strings. (The bugs were so significant that many people ended up relying on them, or building workarounds that would be broken by the fix, which is basically the same thing as relying on them. Still, they’re bugs, and should go.)

Then there’s bitwise, which makes the | operator, and the other bitwise operators, always treat their arguments like numbers. This correctly brings them in line with other operators: operators impose type, not values. I hope we can make the bitwise feature a default.
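Here is a small sketch of two of those behaviors as you get them under use v5.36 today, assuming perl v5.36 or later: postfix dereferencing inside a double-quoted string, and the numeric-only | alongside its string counterpart |. from the same feature. The values are arbitrary.

```perl
use v5.36;

# Postfix dereference inside a double-quoted string.
my $aref = [ 1, 2, 3 ];
my $interpolated = "items: $aref->@*";
say $interpolated;

# With the bitwise feature on, | always treats its operands as numbers,
# even when both are strings.
my $numeric = "12" | "3";
say "numeric or: $numeric";

# The string version of the operation has its own operator.
my $string = "12" |. "3";
say "string or:  $string";
```

Without the bitwise feature, "12" | "3" would have been a string operation because both operands are strings, which is the kind of value-dependent behavior the feature removes.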

SSL out of the box

We don’t have a very clear plan here, but since forever, it’s been a pain to go from “I have working perl” to “I have a working HTTPS client”. We’ve talked about ways to fix this. I’d like to see one happen, but I’m also not holding my breath.

runtime module loading

Given a variable with a module name in it, if you want to load the module, you can either use eval or Module::Load. RFC 6 proposes adding builtin::load, which always takes a scalar expression and always treats it as a module name, eliminating some complicated behavior in other libraries.

This has been implemented on the CPAN a dozen times. We just want a single, simple, in-core version.
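For reference, the two usual do-it-yourself approaches look roughly like this. It's a sketch, assuming perl v5.36 or later, and it uses List::Util as the example module only because it ships with perl. The string eval form is the one to be careful with, since it will happily run whatever is in the variable.

```perl
use v5.36;

my $module = "List::Util";   # any installed module name held in a variable

# Option 1: string eval, which lets "require" see a bareword module name.
eval "require $module; 1" or die "can't load $module: $@";

# Option 2: convert the name to a file path and pass that to require.
(my $file = "$module.pm") =~ s{::}{/}g;
require $file;

my $max = $module->can("max")->(3, 9, 4);
say "max via $module: $max";
```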

a lexical importer

I think it’s important that we offer a simple means to say “I want to export this subroutine, but as a lexical, not package, subroutine.” builtin now does this, and other things should be able to follow suit. It would go a long way to eliminating weird namespace cleaning hacks.
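You can see the lexical behavior of builtin imports with a sketch like this one, assuming perl v5.36 or later. The string being trimmed is arbitrary, and UNIVERSAL's can method is used only as a convenient way to ask whether a package subroutine exists.

```perl
use v5.36;
no warnings 'experimental::builtin';

my $inside;
{
  use builtin 'trim';          # visible only until the end of this block
  $inside = trim("  scoped  ");
}

# Nothing was installed into the package, so there is no main::trim.
my $in_package = main->can('trim') ? "yes" : "no";
say "inside block: [$inside]; installed in package: $in_package";
```

That's the property that would make namespace cleaning unnecessary: there's nothing left behind to clean.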

lots of other stuff

People suggested many other changes, some well specified, some vague, some exciting, some mundane. In the next year, some of these will happen, some will not, and some will evolve into new ideas that might happen or not. If you want to stay abreast of all this, you can subscribe to perl5-porters, where “changes to perl” (including bugfixes, documentation, and so on) are the order of the day.

How perl v5.36.0 happened

There were a couple big changes in how we got perl v5.36.0 put together.

The PSC

The first was the steering council. Although v5.34.0 was also released during the era of the PSC (Perl Steering Council), it was already halfway developed when the PSC was seated. We had our first meeting in January 2021, when the code freeze for v5.34 was just getting underway. Also, the future of the project (for example, the question of Perl 7) was still up in the air, and that was our business much more than specific changes.

Perl v5.36.0 was managed by Neil Bowers, Paul Evans, and me for very close to its whole lifespan. We quickly established our priorities and set about working on them, keeping lists of things we wanted to push on and how they were going. The three of us were generally very busy with non-Perl things, and I think our weekly calls helped keep us engaged and helped to keep things moving. We didn’t always agree, but we were always able to come to a consensus that we felt good about putting forward. I think this was good for Perl, and it was definitely good for me. I am happiest when working in collaboration with others where I don’t need to worry about the question of good faith or shared goals, and I think this was very much the case in the current PSC.

This, by the way, led to my choice of release epigraph for v5.36.0, from The Three Musketeers:

“What!” cried he, in an accent of greater astonishment than before, “your second witness is Monsieur Aramis?”

“Doubtless! Are you not aware that we are never seen one without the others, and that we are called among the Musketeers and the Guards, at the court and in the city, Athos, Porthos, and Aramis, or the Three Inseparables?”

RFCs

The other big change was the RFC process. This was something we’d talked about off and on for many years, with increasing frequency over the last few. A formal process for submitting language changes was introduced last year, and we used it to manage suggestions for v5.36. I’ve linked to it a few times already in this post.

My take is that the RFC process has been a qualified success. That is: it’s been a success, but I think we can improve it significantly. I think it lacks clarity. It’s fairly transparent, but insufficiently centered in what we’re doing. It feels like it has too much process without enough rigor. That said, it’s a lot better than, “Hey what ever happened to that idea the one person had that one time?”

I’d like to take a little time in the next month or two, assuming I’m still with the PSC (because there’s an election coming up) to make some refinements to the process. I’d like to use the same time to help clean up our issue tracker, which feels less chaotic than RT did, but is still sort of a big pile of messy stuff.

In Conclusion…

I don’t really actually have a conclusion to draw. Personally, I think v5.36 is the most compelling release of perl in a long time. I look forward to using its features. I think that the exact set of people on this PSC helped make this possible, and I’m anxious about how the next council will do. Probably, they’ll do fine. I just don’t like change.

I think a lot of people say, “Perl can reclaim the high ground in programming languages if only…” and others say, “Because Perl can’t ever become as popular as it once was, spending any time making it better is a waste.” I guess I’m oversimplifying, but I feel like I hear both these opinions almost exactly on a relatively frequent basis. It’s probably not surprising that I think my take is better than those. Perl has a lot of users, including me. If we can improve the life of people who are writing and maintaining code in Perl, we should do it. I don’t think we should focus on overtaking (say) Go as the trendy modern language, and I don’t think we should all give up and go rewrite everything in (say) Go. Programming languages are tools, and tools retain value for long after they cease to be the exciting new innovation. I just want to keep my saw sharp.

Written on May 28, 2022