Reflecting, celebrating, and looking forward

As I write, the Perl 6 Christmas release is taking place. It goes without saying that it’s been a long journey to this point. I’ve walked less than half of it, joining the effort 7-8 years ago. Back then, I was fresh from university, had enjoyed the courses on compiler construction, type systems, and formal semantics, and was looking for an open source project in one or more of these fields to contribute to.

For many years before I got involved with Perl 6, I’d been using Perl extensively for web application development. It helped me live more comfortably through my college and university years, and I had sufficient work coming in that I kept a few other part-timers in a job too. That would be reason enough for being fond of a language, but there was more. Perl is really the language where I “grew up” as a programmer. It’s the first language where I used regexes and closures, it helped develop my early understanding of OOP, and it introduced me to testing culture. It’s the first language where I went to a conference, and realized the value a language’s community can have. All of this has been foundational to what I’ve done in my career since then, which besides Perl 5/6 has seen me deliver code and training in a wide range of languages, including C, C#, Java, and Python.

Through using Perl I got to know about the Perl 6 project – and it seemed a good fit with my interest in languages, compilers, and types. I knew it was a long-running project with plenty of history, but looking at what it was trying to achieve, I was convinced it was worth trying to help out a little to make it happen. It’s surprising just how successful Perl 6 has been over the years at attracting really great people to work on it – knowing full well its long, and at times difficult, history.

From patcher to architect

As usual with anyone new to an open source project, my first contributions were small. Portability things, small fixes here and there, minor features, and the like. Over time, I found myself taking on increasingly large language features. By 2009 I was a regular contributor, and had a decent grasp of much of the Rakudo compiler code base. I caused Patrick Michaud, the lead developer of Rakudo at the time, a good number of “oh no!” moments, as I didn’t yet have a great picture of the overall design – but was putting in notable features anyway. He, and others, did a great job of steering me gently in the right kind of direction.

2010 probably goes down as the most difficult year of my involvement in the Perl 6 project. The first Rakudo Star, a “useful, usable, early adopter’s release of Perl 6”, was generally graded as “not good enough”. More problematic, as I took some steps back and reflected on how to rectify this, was that the issues went right to the very heart of the architecture of Rakudo itself. Rakudo in those days was a traditional compiler: you fed it code in Perl 6, it spat out code in an intermediate language, which was executed on the Parrot VM. Of course, this is what you learn a compiler does at university, and it’s all very clean and simple…and not at all what Perl 6 demands.

Compile time and runtime are not so cleanly distinct concepts in Perl. They never have been, thanks to things like BEGIN blocks. But in Perl 6, we really wanted to both handle BEGIN time – and have lots of meta-programming stuff going on at BEGIN time – and still be able to have separate compilation. This, it turns out, is a tricky problem. And, as I looked into how to solve it, it became clear that it was going to require a deep, drawn-out overhaul. Further, it became clear that this was an effort I was going to need to play architect on and largely lead. History has shown it to be the right call; the architecture put in place then has largely survived intact to today’s release, and it enabled a lot of the great things we simply take for granted today. But at the time, it was a lonely path to walk.

In the years since then, I led the way on getting Rakudo ported to the JVM – and have been happy to see that work taken forward by others. I was also a founder of MoarVM, the VM that the Perl 6 Christmas release primarily targets. Most recently, I took up the long-neglected task of getting Perl 6’s concurrency design into decent shape, doing the initial implementations of the features. It goes without saying that none of this would have been possible without the incredible bunch of people in the Perl 6 community who not only contributed their great technical skills to these efforts, but also a good deal of encouragement and friendship along the way.

For 7 years I made a really great job of just being “this guy who hacks on stuff” while managing to disclaim wearing any particular hat – until Larry went and called me out as architect at the Swiss Perl Workshop this last summer. It’s a role I’m proud to hold for the moment, and I look forward to continuing to contribute to the Perl 6 project for some years to come.

So, about the release…

By this point into writing the post – the Christmas release of Perl 6 has already taken place! Hurrah! But…what does it mean?

First, it’s important to understand that we’ve actually released two things (and that there are a couple more to come in the next days).

The first of these is the specification of the Perl 6 language itself, which is expressed as a suite of over 120,000 tests. Versions of the Perl 6 language will be named after festivals or celebrations; our alpha was Advent, our beta was Birthday, and we’re now at Christmas. This is referred to as “Perl 6 Christmas”, or in short as Perl 6.c. The next major language release will most likely be called Diwali, though I’m not sure we’ve worked out how to spell it yet. :-)

The second is a release of the Rakudo Perl 6 compiler that complies with this specification. We don’t imagine we’ll manage to stop people blurring language specification and language implementation, and know full well that when most people say they want “Perl 6 Christmas” they actually want a compiler that implements that version of the language. All the same, it’s a valuable distinction, as it means we remain open to alternative implementations – something that may not be that important now, but may be in a decade or two.

In the coming days, we’ll also produce a Rakudo Star release – which consists of the compiler along with documentation and a selection of modules – and that will also have an MSI, to make life easier for Windows folks.

What happens next?

For me? Rest. A lot of rest. Really a lot of rest. It’s been an exhausting last few months in the run up to the release, chasing down lots of little semantic details that we really wanted to get straightened out ahead of the freezing of the Perl 6 Christmas language specification. It was worth it, and I’m really happy with the language we’ve ended up with. But now it’s time to take care of myself for a while.

Come 2016, the work will go on. However, the focus will shift. For compiler and runtime folks like me, the focus will be largely on performance and reliability engineering. Now we have a stable language definition, it makes much more sense to invest more heavily in optimizing things. That isn’t to say a great deal of optimization work hasn’t already taken place; many of us have worked hard on that. But there’s a lot more that can, and will, be done.

Even more important than that work, however, will be the work that takes place on the Perl 6 ecosystem in the year to come. Since we announced the Christmas target for a stable language/compiler release, a number of new faces showed up to help with writing modules and building supporting infrastructure. Now, their work won’t have to contend with us compiler hackers breaking the world under them every week – and that hopefully will encourage more to dive into the ecosystem work also. Maturity here will take time, but there’s plenty of expertise and wisdom on these matters in the Perl community.

Rakudo will stick to a monthly release cycle. We’ll be making a number of process changes to help us deliver those monthlies at a higher quality, especially with regard to not regressing on the 6.c language test suite, key modules, and ecosystem tooling. These changes will also introduce a stability level that lies between bleeding edge commits and monthlies. We also expect the language specification itself to have a small number of minor versions between now and Diwali, and we will treat these a lot like we have the Christmas release, with some extra attention going into the release than normal monthlies will get. Those releases will tend to have seen a greater focus on semantics detail, so will for now serve as our “stable track” for those who want something more occasional than the monthlies. We’ll see how that serves us and our userbase, and adjust as needed. It’s all about keeping the ceremony of contributing and releasing low, while keeping the quality of releases up.

Last but not least…

…I’d like to say thank you. Thank you to all those who have been my fellow contributors on the Perl 6 project, for being among the best people I’ve ever worked with on anything. Thank you to all those who came to my Perl conference talks and read my ramblings here over the years, and provided feedback and encouragement. Thank you to those who donated financially to the Perl 6 project, and so enabled Perl 6 to be part of my day job. And last, but absolutely not least, thank you to those of you who have written and run Perl 6 programs over the years, and shared that you were doing so – because perhaps the greatest reward of all for a compiler/VM hacker is seeing others use their work to build their own great creations.

Together, we’ve breathed life into a new Perl. I’m damn proud of what we’ve built together – and I can’t wait to see how people put it to work.

Merry Christmas!


Getting closer to Christmas

The list of things we “really want to do” before the Christmas release is gradually shrinking. Last time I wrote here, the xmas RT list was around 40 tickets. Now it’s under 20. Here’s an overview of the bits of that I’ve been responsible for.

Supply API cleanup

I did the original implementation of supplies a couple of years back. I wasn’t sure how they would be received by the wider community, so focused on just getting something working. (I also didn’t pick the name Supply; Larry is to thank for that, and most of the other naming). Supplies were, it turns out, very well received and liked, and with time we fleshed out the available operations on supplies, and 6 months or so back I introduced the supply, whenever, and react syntactic sugar for supplies.

What never happened, however, was a cleanup of the code and model at the very heart of supplies. We’ve had to “build one to throw away” with nearly everything in Perl 6, because the first implementation tended to show up some issues that really wanted taking care of. So it was with supplies. Thankfully, since everything was built on a fairly small core, this was not to be an epic task. And, where the built-ins did need to be re-worked, it could be done much more simply than before by using the new supply/whenever syntax.
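
To give a flavour of that syntax, here is a minimal sketch of a supply operation written with supply/whenever. The `doubled` sub is a made-up name for illustration, not one of the actual built-ins, and this is not how Rakudo itself implements anything:

```perl6
# Hypothetical operation: produce a new supply that doubles every
# value arriving from the source supply.
sub doubled(Supply $source) {
    supply {
        whenever $source -> $value {
            emit $value * 2;
        }
    }
}

# react runs until every whenever inside it is done.
react {
    whenever doubled(Supply.from-list(1..3)) -> $value {
        say $value;    # 2, then 4, then 6
    }
}
```

The supply block takes care of passing on done and quit, so the operation only has to say what happens to each value.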

While much of the cleanup was to the internals, there are some user-facing things. The most major one is a breaking change to code that was doing Supply.new to create a live supply. As I started cleaning up the code, and with experience from using related APIs in other languages, it became clear that making Supply be both the thing you tapped and the thing that was used to publish data was a design mistake. It not only would make it harder to trust Supply-based code and enforce the Supply protocol (that is, emit* [done | quit]), but it also would make it hard to achieve good performance by forcing extra sanity checks all over the place.

So, we split it up. You now use a Supplier in order to publish data, and obtain a Supply from it to expose to the world:

# Create a Supplier
my $supplier = Supplier.new;

# Get a Supply from it
my $supply = $supplier.Supply;
$supply.tap({ .say });

# Emit on it (any value will do)
$supplier.emit('a');

This also means it’s easy to keep the ability to emit to yourself, and expose the ability to subscribe:

class FTP::Client {
    has $!log-supplier = Supplier.new;
    has $.log = $!log-supplier.Supply;
    # ... the rest of the class emits via $!log-supplier
}

Since you can no longer call emit/done/quit on a Supply, you can be sure there won’t be values getting sneaked in unexpectedly.

The other change is that we now much more strongly enforce the supply protocol (that is, you’ll never see another emit after a done/quit unless you really go out of your way to do so) and that only one value will be pushed through a chain of supplies at a time (which prevents people from ending up with data races). Since we can ask supplies if they are already sane (following protocol) and serial (one at a time), we can avoid the cost of enforcing it at every step along the way, which makes things cheaper. This is just one of the ways performance has been improved. We’ve some way to go, but you can now push into the hundreds of thousands of messages per second through a Supply.

Along the way, I fixed exceptions accidentally getting lost when unhandled in supplies in some cases, a data loss bug in Proc::Async and IO::Socket::Async, and could also resolve the RT complaining that the supply protocol was not enforced.

Preparing ourselves for stronger back-compatibility

Once the Perl 6 Christmas release of the language is made, we’ll need to be a lot more careful about not breaking code that’s “out there”. This will be quite a change from the last months, where we’ve been tweaking lots of things that bothered us. To help us with this change, I wrote up a proposal on how we’ll manage not accidentally changing tests that are part of the Perl 6 Christmas language definition, allow code to be marked with the language version it expects, and how we’ll tweak our process to give us a better chance of shipping solid releases that do not introduce regressions. Further feedback is still welcome; as with all development process things, I expect this to continue to evolve over the years.

I/O API cleanups

A few tickets complained about inconsistencies in a few bits of the I/O APIs, such as the differing ways of getting supplies of chars/bytes for async processes, sockets, and files. This has received a cleanup now. The synchronous and asynchronous socket APIs also got a little further alignment, such that the synchronous sockets now also have connect and listen factory methods.

Bool is now a real enum

This is a years-old issue that we’ve finally taken care of in time for the release: Bool is now a real enum. It was mostly tricky because Bool needs setting up really quite early on in the language bootstrap. Thankfully, nine++ spent the time to figure out how to do this. His patch nearly worked – but ran into an issue involving closure semantics with BEGIN and COMPOSE blocks. I fixed that, and was able to merge in his work.

Interaction of start and dynamic variables

A start block can now see the dynamic variables that were available where it was started.

my $*TARGET_DIR = 'output/';
await start { say $*TARGET_DIR } # now works

Correcting an array indexing surprise

Indexing with a range would always auto-truncate to the number of elements in an array:

my @a = 1, 2, 3;
say @a[^4]; # (1 2 3)

While on the surface this might be useful, it was rather good at confusing people who expected this to work:

my @a;
@a[^2] = 1, 2;
say @a;

Since it auto-truncated to nothing, no assignment took place. We’ve now changed it so only ranges whose iterators are considered lazy will auto-truncate.

my @a = 1, 2, 3;
say @a[^4];      # (1 2 3 (Any)) since not lazy
say @a[0..Inf];  # (1 2 3) since infinite is lazy
say @a[1..Inf];  # (2 3) since infinite is lazy
say @a[lazy ^4]; # (1 2 3) since marked lazy

Phaser fixes

I fixed a few weird bugs involving phasers.

  • RT #123732 noted that return inside of a NEXT phaser but outside of a routine would just cause iteration to go to the next value, rather than give an error (it now does, and a couple of similarly broken things also do)
  • RT #123731 complained that the use of last in a NEXT phaser did not correctly exit the loop; it now does
  • RT #121147 noted that FIRST only worked in for loops, but not other loops; now it does

Other smaller fixes

Here are a number of other less notable things I did.

  • Fix RT #74900 (candidate with zero parameters should defeat candidate with optional parameter in no-arg multi dispatch)
  • Tests covering RT #113892 and RT #115608 on call semantics (after getting confirmed that Rakudo already did the right thing)
  • Review RT #125689, solve the issue in a non-hacky way, and add a test to cover it
  • Fix RT #123757 (semantics of attribute initializer values passed to constructor and assignment was a tad off)
  • Hunt down a GC hang blocking module precomp branch merge; hopefully fix it
  • Review socket listen backlog patch; give feedback
  • Write up rejection of RT #125400 (behavior of unknown named parameters on methods)

What one Christmas elf has been up to

Here’s a look over the many things I’ve been working on in recent weeks to bring us closer to the Christmas Perl 6 release. For the most part, I’ve been working through the tickets we’ve attached to our “things to resolve before the Christmas release” meta-ticket – and especially trying to pick off the hard/scary ones sooner rather than later. From a starting point of well over 100 tickets, we’re now down to less than 40.

NFG improvements

If you’ve been following along, you’ll recall that I did a bunch of work on Normal Form Grapheme earlier on in the year. Normal Form Grapheme is an efficient way of providing strings at grapheme level. If a single grapheme (that is, a thing a human would consider a character) is represented by multiple codepoints, then we create a synthetic codepoint for it, so we can still get O(1) string indexing and cheaply and correctly answer questions like, “how many characters is this”.
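
As a small illustration of what that means in practice (the string is made up for the example; q followed by a combining acute accent has no precomposed codepoint in Unicode, so it has to become a synthetic):

```perl6
# "q" + COMBINING ACUTE ACCENT, followed by four plain letters
my $text = "q\x[0301]uite";

say $text.chars;                 # 5 graphemes
say $text.codes;                 # 6 codepoints
say $text.substr(0, 1).codes;    # 2: the first grapheme is base + combiner
```

Indexing and substr work in graphemes, so the accent never gets separated from its base character by accident.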

So, what was there to improve? Being the first to do something gives a low chance of getting everything right first time, and so was the case here. My initial attempt at defining NFG was the simplest possible addition to NFC, which works in terms of a property known as the Canonical Combining Class. NFC, to handwave some details, takes cases where we have a character followed by a combining character, and if there is a precomposed codepoint representing the two, exchanges them for the single precomposed codepoint. So, I defined NFG as: first compute NFC, then if you still see combining characters after a base character, make a synthetic codepoint. This actually worked pretty well in many cases. And, given the NFC quick-check property can save a lot of analysis work, NFG could be computed relatively quickly.

Unfortunately, though, this approach was a bit too simplistic. Unicode does actually define an algorithm for grapheme clusters, and it’s a bit more complex than doing an “NFC++.” Fortunately, I’d got the bit of code that would need to change rather nicely isolated, in expectation that something like this might happen anyway. So, at least 95% of the NFG implementation work I’d done back in April didn’t have to change at all to support a new definition of “grapheme”. Better yet, the Unicode consortium provided a bunch of test data for their grapheme clustering algorithm, which I could process into tests for Perl 6 strings.

So far so good, but there was a bit of a snag: using the NFC quick check property was no longer going to be possible, and without it we’d be in for quite a slowdown when decoding bytes to strings – which of course we do every time we get input from the outside world! So, what did I do? Hack our Unicode database importer to compute an NFG quick check property, of course. “There, I fixed it.”

So, all good? Err…almost. I’d also back in April done some optimizations around assuming that anything in the ASCII range was not subject to NFG. Alas, \r\n has been defined as a single grapheme. And yes, that really does mean:

> say "OMG\r\n".chars
4

I suspect this will be one of those “you can’t win” situations. Ignore that bit of the Unicode spec, and people who understand Unicode will complain that Perl 6 implements it wrong. Follow it, and folks who don’t know the reasoning will think the above answer is nuts. :-) By the way, asking “how many codepoints” is easy:

> say "Hi!\r\n".codes
5

Making “\r\n” a single grapheme was rather valuable for a reason I hadn’t expected: now that something really common (at least, on Windows) exercised NFG, a couple of small implementation bugs were exposed, and could be fixed. It was also rather a pain, because I had to go and fix the places that wrongly thought they needn’t care for NFG (for example, the ASCII and Latin-1 encodings). The wider community then had to fix various pieces of code that used ord – a codepoint level operation – to see if there was a \r, then expected a \n after it, and then got confused. So, this was certainly a good one to nail before the Christmas release, after which we need to get serious about not breaking existing code.

As a small reward for slogging through all of this, it turned out that \r\n being a single grapheme made a regex engine issue involving ^^ and $$ magically disappear. So, that was another one off the Christmas list.

There were a few other places where we weren’t quite getting things right with NFG:

  • Case folding, including when a synthetic was composed out of something that case folded to multiple codepoints (I’m doubtful this code path will ever be hit for text in any real language, but I’m willing to be surprised)
  • Longest Token Matching in grammars/regexes (the NFA compiler/evaluator wasn’t aware of synthetics; now it is)

And with that, I think we can call NFG done for Christmas. Phew!

Shaped arrays

I’ve finally got shaped arrays fleshed out and wired into the language proper. So, this now works:

> my @a[3;3] = 1..3, 4..6, 7..9;
[[1 2 3] [4 5 6] [7 8 9]]
> say @a[2;1]
8
> @a[3;1] = 42
Index 3 for dimension 1 out of range (must be 0..2)

This isn’t implemented as an array of arrays, but rather as a single blob of memory with 9 slots. Of course, those slots actually point to Scalars, so it’s only so much more efficient. Native arrays can be shaped also, though. So, this:

my int8 @a[1024;1024];

Will allocate a single 1MB blob of memory and all of the 8-bit integers will be packed into it.

Even if you aren’t going native, though, shaped arrays do have another sometimes-useful benefit over nested arrays: they know their shape. This means that if you ask for the values, you get them:

> my @a[3;3] = 1..3, 4..6, 7..9; say @a.values;
(1 2 3 4 5 6 7 8 9)

Whereas if you just have an array of arrays and asked for the values, you’d just have got the nested arrays:

> my @a = [1..3], [4..6], [7..9]; say @a.values;
([1 2 3] [4 5 6] [7 8 9])

The native array design has been done such that we’ll be able to do really good code-gen at various levels – including down in the JIT compiler. However, none of that is actually done yet, nor will it be this side of Christmas, so the performance of shaped arrays – including the native arrays – isn’t too hot. In general, we’re focusing really hard on places we need to nail down semantics at the moment, because we’ll have to live with those for a long time. We’re free to improve performance every single monthly release, though – and will be in 2016.

Module installation and precompilation

I spent some time pondering and writing up a gist about what I thought management of installed modules and their precompilations should look like, along with describing a precompilation solution for development time (so running module test suites can benefit “for free” from precompilation). I was vaguely hoping not to have to wade into this area – it’s just not the kind of problem I consider myself good at and there seem to be endless opinions on the subject – but got handed my architect hat and asked to weigh in. I’m fairly admiring of the approach taken under the .git directory in Git repositories, and that no doubt influenced the solution I came up with (yes, there are SHA-1s aplenty).

After writing it, I left for a week’s honeymoon/vacation, and while I was away, something wonderful happened: nine++ started implementing what I’d suggested! By this point he’s nearly done, and it’s largely fallen out as I had imagined, with the usual set of course corrections that implementing a design usually brings. I’m optimistic we’ll be able to merge the branch in during the next week or so, and another important piece will have fallen into place in time for Christmas. Thanks should also go to lizmat++, who has done much to drive module installation related work forward and also provided valuable feedback in earlier drafts of my design.

Line endings

Windows thinks newlines are \r\n. Most of the rest of the world think they are \n. And, of course, you end up with files shared between the two, and it’s all a wonderful tangle. In regexes in Perl 6, \n has always been logical: it will happily match \r\n or the actual UNIX-y \n. That has not been the case for \n in strings, however. Thus, on Windows:

say "foo!";

Would until recently just spit out \n, not \r\n. There actually are relatively few places that this actually causes problems today: the command shell is happy enough, pretty much every editor is happy (of course, Notepad isn’t), and so forth. Some folks wanted us to fix this, others said screw it, so I asked Larry what we should do. :-) The solution we settled on is making \n in strings also be logical, meaning whatever the $?NL compile-time constant contains. And we pick the default value of that by platform. So on Windows the above say statement will spit out \r\n. (We are smart enough to recognize that a \r\n sequence in a string is a “single thing” and not go messing with the “\n” inside of it!) There are also pragmas to control this more finely if you don’t want the platform specific semantics:

use newline :lf;   # Always UNIX-y \x0A
use newline :crlf; # Always Windows-y \x0D\x0A
use newline :cr;   # \x0D for...likely nothing Perl 6 runs on :-)

Along with this, newline related configuration on file handles and sockets has been improved and extended. Previously, there was just nl, which was the newline for input and output. You can now set nl-in to either a string separator or an array of string separators, and they can be multiple characters. For output, nl-out is used. The default nl-in is [“\r\n”, “\x0A”], and the default nl-out is “\n” (which is logically interpreted by platform).
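
A minimal sketch of a custom input separator, assuming a scratch file in the temporary directory (the file name and its contents are invented for the example):

```perl6
# Write some records separated by ";" rather than by newlines.
my $path = $*TMPDIR.child("nl-in-demo.txt");
spurt $path, "alpha;beta;gamma";

# Ask the handle to treat ";" as the line separator on input.
# Chomping is on by default, so the separator is stripped.
my $fh = open $path, :nl-in(";");
.say for $fh.lines;    # alpha, beta, gamma on separate lines
$fh.close;

unlink $path;
```

Passing an array such as :nl-in(";", "\n") instead would accept either separator.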

Last but not least, the VM-level I/O layer is now aware of chomping, meaning that rather than it handing us back a string that we then go and chomp at Perl 6 level, it can immediately hand back a string with the line ending already chopped off. This was an efficiency win, but since it is sensitive to the current set of separators, it also fixed a longstanding bug where we couldn’t support auto-chomping of custom input line separators.

Encodings


A couple of notable things happened with regards to encodings (the things that map between bytes and grapheme strings). On MoarVM, we’ve until recently assumed that every string coming to us from the OS was going to be decodable as UTF-8 (except on Windows, which is more into UCS-2). That often works out, but POSIX doesn’t promise you’ll get UTF-8, or even ASCII. It promises…a bunch of bytes. We can now cope with this properly – surprisingly enough, thanks to the power of NFG. We now have a special encoding, UTF-8 Clean-8-bit, which turns bytes that are invalid as UTF-8 into synthetics, from which we can recover the original bytes again at output. This means that any filename, environment variable, and so forth can be roundtripped through Perl 6 problem-free. You can concat “.bak” onto the end of such a string, and it’ll still work out just fine.
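
Here is a rough sketch of that round trip, using the encoding’s name in Rakudo, utf8-c8. The byte sequence is invented for the example; 0xFF can never appear in valid UTF-8:

```perl6
# "fo", then a byte that is invalid as UTF-8, then "o"
my $bytes = Buf.new(0x66, 0x6F, 0xFF, 0x6F);

# Decoding does not fail; the bad byte becomes a synthetic.
my $name = $bytes.decode('utf8-c8');

# Ordinary string operations still work on the result...
my $backup = $name ~ ".bak";

# ...and encoding again should give back the original bytes,
# followed by the bytes of ".bak".
say $backup.encode('utf8-c8').list;
```

The same bytes decoded as plain utf8 would be rejected, which is exactly the situation this encoding is meant to avoid for filenames and environment variables.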

Another Christmas RT complained that if you encoded a string to an encoding that couldn’t represent some characters in it, it silently replaced them with a question mark, and that an exception would be a better default. This was implemented by ilmari++, who also added support for specifying a replacement character. I just had to review the patches, and apply them. Easy!
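
A short sketch of both behaviours, with a made-up string that ASCII cannot represent:

```perl6
my $text = "na\x[EF]ve";    # "naïve"; U+00EF is not in ASCII

# With an explicit replacement, encoding succeeds.
say $text.encode('ascii', :replacement('?')).decode('ascii');   # na?ve

# Without one, the encode throws rather than silently losing data.
my $encoded = try $text.encode('ascii');
say $encoded.defined ?? "encoded" !! "failed: {$!.message}";
```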

Here, here

I fixed all of the heredoc bugs in the Christmas RT list:

  • RT #120788 (adverbs after :heredoc/:to got “lost”)
  • RT #125543 (dedent bug when \n or \r\n showed up in heredocs)
  • RT #120895 (\t in heredoc got turned into spaces)

The final regex fixes

Similarly, I dealt with the final regex engine bugs before Christmas, including a rather hard to work out backtracking one:

  • RT #126438 (lack of error message when quantifying an anchor, just a hang)
  • RT #125285 (backtracking/capturing bug)
  • RT #88340 (backreference semantics when there are multiple captures)

Well, or so I thought. Then Larry didn’t quite like what I’d done in RT #88340, so I’ll have to go and revisit that a little. D’oh.

Other smaller Christmas RTs

  • Fix RT #125210 (postfix ++ and prefix ++ should complain about being non-associative)
  • Fix RT #123581 (.Capture on a lazy list hung, rather than complaining it’s not possible)
  • Add tests to codify that behavior observed in RT #118031 (typed hash binding vs assignment) is correct
  • Fix RT #115384 (when/default should not decont), tests for existing behavior ruled correct in RT #77334
  • Rule on RT #119929 and add test covering ruling (semantics of optional named parameters in multi-dispatch)
  • Fix RT #122715 and corrected tests (Promise could sink a Seq on keep, trashing the result)
  • Fix RT #117039 (run doesn’t fail); update design docs with current reality (Proc will now throw an exception in sink context if the process is unsuccessful), and add tests
  • Fix RT #82790 (indecisive about $*FOO::BAR; now we just outright reject such a declaration/usage)
  • Check into RT #123154; already fixed on Moar, just not JVM, so removing from xmas list
  • Review RT #114026, which confused invocation and coercion type literals. Codify the response by changing/adding tests.
  • Get ruling on RT #71112 and update tests accordingly, then resolve it.
  • Work on final bits needed to resolve RT #74414 (multi dispatch handling of `is rw`), building on work done so far by psch++
  • Fix RT #74646 (multi submethods were callable on the subclass)
  • Implement nextcallee and test it; fix nextsame/nextwith on nowhere to defer; together these resolved RT #125783
  • Fix RT #113546 (MoarVM mishandles flattening named args and named args with respect to ordering)
  • Fix RT #118361 (gist of .WHAT and .WHO isn’t shortname/longname respectively); RT #124750 got fixed along the way
  • Tests codifying decision on RT #119193 (.?/.+/.* behavior with multis)
  • Review/merge pull requests to implement IO::Handle.t from pmurias++, resolving RT #123347 (IO::Handle.t)

Busy times!

And last but not least…

I’ll be keynoting at the London Perl Workshop this year. See you there!


Last week: Unicode case fixes and much more

This report covers a two week period (September 28th through Sunday 11th October). However, the first week of it was almost entirely swallowed with teaching a class and the travel to and from that – so I’d have had precisely one small fix to talk about in a report. The second week saw me spend the majority of my working time on Perl 6, so now there’s plenty to say.

A case of Unicode

I decided to take on a couple of Unicode-related issues that were flagged for resolution ahead of the release. The first one was pretty easy: implementing stripping of the UTF-8 BOM. While it makes no sense to have a byte order mark in a byte-level encoding, various Windows programs annoyingly insert it to indicate to themselves that they’re looking at UTF-8. Which led to various fun situations where Perl 6 users on Windows would open a UTF-8 file and get some junk at the start of the string. Now they won’t.

The second task was much more involved. Unicode defines four case-changing operations: uppercasing, titlecasing, lowercasing, and case folding. We implemented the first three – well, sort of. We actually implemented the simple case mappings for the first three. However, there are some Unicode characters that become multiple codepoints, and even multiple graphemes, on case change. The German sharp S is one (apparently controversial) example, ligatures are another, and the rest are from the Greek and Armenian alphabets. First, I implemented case folding, now available as the fc method on strings in Perl 6. Along the way I made it handle full case folds that expand, and added tests. Having refactored various codepaths to cope with such expansion, it was then not too hard to get uc/tc/lc updated also. The final bit of fun was dealing with the interaction of all of this with NFG synthetics (tests here). Anyway, now we can be happy we reach Christmas with the case folding stuff correctly implemented.
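To make the expansion concrete, here is a minimal sketch, assuming a current Rakudo with the full case mappings in place. The sample strings are illustrative and not taken from the test suite:

```raku
# The German sharp S has no single-codepoint uppercase in the full case
# mappings, so uppercasing expands it into two characters.
say "ß".uc;                         # SS
say "ß".uc.chars;                   # 2

# Case folding is meant for caseless comparison rather than display.
say "Straße".fc;                    # strasse
say "Straße".fc eq "STRASSE".fc;    # True
```

Comparing the folded forms, rather than the lowercased ones, is what makes the last line come out true.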

Fixing some phasing issues with phasers

RT #121530 complained that when a LEAVE block threw an exception, it should not prevent other LEAVE and POST blocks running. I fixed that, and added a mechanism to handle the case where multiple LEAVE blocks manage to throw exceptions.

Amusingly enough, after fixing a case where we didn’t run a LEAVE block when we should, RT #121531 complained about us running them when we shouldn’t: namely, when the preconditions in PRE phasers failed. I fixed this also.

The usual bit of regex engine work

When you call a subrule in a regex, you can pass arguments. Normally positional ones are used, but RT #113544 noted that we didn’t yet handle named arguments, nor flattening of positional or named arguments. I implemented all of the cases, and added tests.
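As a small sketch of what a subrule call with both kinds of argument looks like (the grammar and token names are invented for illustration, and this assumes a current Rakudo):

```raku
grammar Bracketed {
    # The subrule call passes one positional and one named argument.
    token TOP { <wrapped('(', :close(')'))> }

    # Both string arguments are interpolated into the regex as literals.
    token wrapped($open, :$close) { $open \w+ $close }
}

say so Bracketed.parse('(abc)');    # True
say so Bracketed.parse('(abc]');    # False
```

Flattening follows the usual call syntax, so an argument list held in an array or hash can be passed with a leading `|`.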

I reviewed and commented on a patch to implement the <?same> assertion from the design docs, which checks that the characters either side of it are the same. I noted a performance improvement was possible in the implementation, which was happily acted upon.

Finally, I started looking into an issue involving LTM, character classes, and the ignorecase flag. No fix yet; it’s going to be a bit involved (and I wanted to get our case handling straightened out before really attacking this one).


We suddenly started getting some bizarre mis-compiles in CORE.setting, where references to classes near the end of it would point to completely the wrong things. It turned out to be a (MVMuint16) cast that should have been an (MVMuint32) down in MoarVM’s bytecode assembler – no doubt wrongly copied from the line above. It’s always a relief when utterly weird things end up not being GC bugs!

A little profiler fix

If you did --profile on a program that called exit(), the results were utterly busted. Now they’re correct.

Other little bits

Here’s a collection of other things I did that are worth a quick mention.

  • Reviewing new RT tickets and commits; of note, reviewing patch for making :D/:U types work in more places
  • Eliminating remaining method postcircumfix:<( )> uses in Rakudo/tests. Looking into coercion vs. call distinction.
  • Reviewing “make Bool an enum” branch
  • Looking further into call vs. coercion and coercion API as part of RT #114026; post a proposal for discussion
  • Fix CORE::.values to actually produce values (same for other pseudo-packages); fix startup perf regression along the way
  • Studying latest startup profile, identifying a recent small regression
  • Investigate RT #117417 and RT #119763; request design input to work out how we’ll resolve it
  • Fix RT #121426 (routines with declared return type should do Callable[ThatType])
  • Review RT #77334, some discussion about what the right semantics really are, file notes on ticket
  • Fix RT #123769 (binding to typed arrays failed to type check)
  • Write up a rejection of RT #125762
  • Resolve RT #118069 (remove section on protos auto-multi-ing from design docs as agreed, remove todo’d tests)
  • Fix RT #119763 and RT #117417 (bad errors due to now-gone colonpair trait syntax not being implemented)
  • Reading up on module installation writings: S11, S22, gist with input from many in preparation for contributing to design work in the area

And that’s it. Have a good week!


Those weeks: much progress!

This post is an attempt to summarize the things I worked on since the last weekly report up until the end of last week (so, through to Friday 25th). Then I’ll get back to weekly. :-)


The Great List Refactor accounted for a large amount of the time I spent. Last time I wrote here, I was still working on my prototype. That included exploring the hyper/race design space for providing for data parallel operations. I gave an example of this in my recent concurrency talk. I got the prototype far enough along to be comfortable with the API and to have a basic example working and showing a wallclock time win when scaling over multiple cores – and that was as far as I went. Getting the GLR from prototype mode to something others could help with was far more pressing.

Next came the long, hard slog of getting the new List and Array types integrated into Rakudo proper. It was pleasant to start out by tossing a bunch of VM-specific things that were there “for performance” that were no longer going to be needed under the new design – reducing the amount of VM-specific code in Rakudo (we already don’t have much of it anyway, but further reductions are always nice). Next, I cleared the previous list implementation from CORE.setting and put the new stuff in place, including all of the new iterator API. And then it was “just” a case of getting the setting to compile again. You may wonder why getting it to even compile is challenging. The answer is that the setting runs parts of itself while it compiles (such is the nature of bootstrapping).

Once I got things to a point where Rakudo actually could build and run its tests again, others felt confident to jump in. And jump in they did – especially nine++. That many people were able to dive in and make improvements was a great relief – not only because it meant we could get the job done sooner, but because it told me that the new list and iterator APIs were easy for people to understand and work with. Seeing other people pick up my design, get stuff done with it, and make the right decisions without needing much guidance, is a key indicator for me as an architect that I got things sufficiently right.

Being the person who initially created the GLR branch, it was nice to be the person who merged it too. Other folks seemed to fear that task a bit; thankfully, it was in the end straightforward. On the other hand, I’ve probably taught 50 Git courses over the last several years, so one’d hope I’d be comfortable with such things.

After the merge, of course, came dealing with various bits of fallout that were discovered. Some showed up holes in the test suite, which were nice to fill. I also did some of the work on getting the JVM backend back up and running; again, once I got it over a certain hump, others eagerly jumped in to take on the rest.

GLR performance work

After landing the GLR came getting some of the potential optimizations in place. I wanted a real world codebase to profile to see how it handled things under the GLR, rather than just looking at synthetic benchmarks. I turned to Text::CSV, and managed a whole bunch of improvements from what I found. The improvements came from many areas: speeding up iterating lines read from a file, fixing performance issues with flattening, improving our code-gen in a number of places… There’s plenty more to be had when I decide it’s time for some more performance work; in the meantime, things are already faster and more memory efficient.


I also did some work on S07, the Perl 6 design document for lists and iteration. I ended up starting over, now I knew how the GLR had worked out. So far I’ve got most of the user-facing bits documented in there; in the coming weeks I’ll get the sections on the iterator API and the parallel iteration design fleshed out.

Syntactic support for supplies

At YAPC::Asia I had the pleasure of introducing the new supply/react/whenever syntax in my presentation. It’s something of a game-changer, making working with streams of asynchronous data a lot more accessible and pleasant. Once I’d had the idea of how it should all work, getting to an initial working implementation wasn’t actually much effort. Anyway, that’s the biggest item ticked off my S17 changes list.
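For readers who have not seen the syntax, here is a minimal sketch of how the three keywords fit together, assuming a current Rakudo. The variable names and the doubling step are illustrative only:

```raku
# An on-demand supply: the block runs for each tap and emits three values.
my $numbers = supply {
    emit $_ for 1..3;
};

my @seen;
react {
    # whenever taps the supply and runs its block for each emitted value.
    whenever $numbers -> $n {
        @seen.push($n * 2);
    }
}

# react only returns once the supply is done, so every value has arrived.
say @seen;    # [2 4 6]
```

The react block keeps the asynchronous handling in one lexical place, and it finishes by itself when every whenever inside it is done.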

Other concurrency improvements

A few other concurrency bits got fixed. RT #125977 complained that if you sat in a tight loop starting and joining threads that themselves did nothing, you could eat rather a lot of memory up. It wasn’t a memory leak – the memory was recovered – just a result of allocating GC semispaces for each of the threads, and not deallocating them until a GC run occurred. The easy fix was to make joining a thread trigger a GC run – a “free” fix for typical Perl 6 programs which never touch threads directly, but just have a pool of them that are cleaned up by the OS at process end.

The second issue I hunted down was a subtle data race involving closure semantics and invocation. The symptoms were a “frame got wrong outer” on a good day, but usually a segfault. Anyway, it’s gone now.

Last but not least, I finally tracked down an issue that’s been occasionally reported over the last couple of months, but had proved hard to recreate reliably. Once I found it, I understood why: it would only show up in programs that both used threads and dynamically created types and left them to be GC’d. (For the curious: our GC is parallel during its stop the world phase, but lets threads do the finalization step concurrently so they can go their own way as soon as they get done finalizing. Unfortunately, the finalization of type tables could happen too soon, leaving another thread finalizing objects of that type with a dangling pointer. These things always sound like dumb bugs in hindsight…)

Fixed size/shaped arrays

Work on fixed size arrays and shaped arrays continued, helped along by the GLR. By now, you can say things like:

my @a := Array.new(:shape(3,3));
@a[1;1] = 42;

Next up will be turning that into just:

my @a[3;3];
@a[1;1] = 42;

Preparing for Christmas

With the release of the Perl 6 language getting closer, I decided to put on a project manager hat for a couple of hours and get a few things into, and out of, focus. First of all, I wrote up a list of things that will explicitly not be included in this release, and so are deferred to a future Perl 6 language release.

And on the implementation side, I collected together tickets that I really want to have addressed in the Rakudo we ship as the Christmas release. Most of them relate to small bits of semantic indecision that we should really get cleaned up, so we don’t end up having to maintain (so many…) semantics we didn’t quite want for years and years to come. Having compiler crashes and fixing them up in the next release is far more forgivable than breaking people’s working code when they upgrade to the next release, so I’m worrying about loose semantic ends much more than “I can trigger a weird internal compiler error”.

The `is rw` cleanup

One of the issues on my Christmas list was getting the “is rw” semantics tightened up. We’ve not been properly treating it as a constraint, as the design docs wish, meaning that you could pass in a value rather than an assignable container and not get an error until you tried to assign to it. Now the error comes at signature binding time, so this program now gives an error:

sub foo($x is rw) { }
foo(42); # the 42 fails to bind to $x
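For contrast, a short sketch of the case that does bind, assuming a current Rakudo (the sub and variable names are made up for illustration):

```raku
sub bump($x is rw) { $x++ }

my $count = 41;
bump($count);    # $count is an assignable container, so binding succeeds
say $count;      # 42
```

Because the parameter is bound to the caller's container rather than a copy, the increment is visible after the call.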

Error reporting improvements

I improved a couple of poor failure modes:

  • Fix RT #125812 (error reporting of with/without syntax issues didn’t match if/unless)
  • Finish fixing RT #125745 (add hint to implement ACCEPTS in error about ~~ being a special form)
  • Remove leftover debugging info in an error produced by MoarVM

Other bits

Finally, the usual collection of bits and pieces I did that don’t fit anywhere else.

  • Test and look over a MoarVM patch to fix VS 2015 build
  • Reject RT #125963 with an explanation
  • Write response to RT #126000 and reject it (operator lexicality semantics)
  • Start looking into RT #125985, note findings on ticket
  • Fix RT #126018 (crash in optimizer when analysing attribute with subset type as an argument)
  • Fix RT #126033 (hang when result of a match assigned to variable holding target)
  • Reviewing the gmr (“Great Map Refactor”) branch
  • Fix crash that sometimes happened when assembling programs containing labeled exception handlers
  • Review RT #125705, check it’s fixed, add a test to cover it. Same for RT #125161.
  • Cut September MoarVM release
  • Hunt JIT devirtualization bug on platforms using x64 POSIX ABI and fix it
  • Tests for RT #126089
  • Fix RT #126104 (the `is default` type check was inverted)
  • Investigate RT #126029, which someone fixed concurrently; added a test to cover it
  • Fix RT #125876 (redeclaring $_ inside of various constructs, such as a loop, caused a compiler crash)
  • Fix RT #126110 and RT #126115.
  • Fixed a POST regression
  • Fix passing allomorphs to native parameters and add tests; also clear up accidental int/num argument coercion and add tests
  • Fix RT #124887 and RT #124888 (implement missing <.print> and <.graph> subrules in regexes)
  • Fix RT #118467 (multi-dispatch sorting bug when required named came after positional slurpy)
  • Looking into RT #75638 (auto-export of multis), decided we’d rather not have that feature; updated design docs and closed ticket
  • Investigate weird return compilation bug with JIT/inline enabled; get it narrowed down to resolution of dynamics in JIT inlines
  • Fix RT #113884 (constant scalars interpolated in regexes should participate in LTM)
  • Investigate RT #76278, determine already fixed, ensure there is test coverage


It’s amazing just how long has gone by since I last wrote something here. I intend to get back to writing weekly posts about my Perl 6 work again starting around now, but here’s a few big-picture updates of what I’ve been up to.

First came YAPC::Asia. I gave a talk on Perl 6’s concurrent, parallel, and asynchronous features; the slides as well as a video are available. The talk was well received, and I was happy to have some interesting questions at the end. I was also happy to find myself talking with some Go compiler folks during the conference dinner; of course, we talked about concurrency models, and plenty besides. While the conference was great, and I’m glad I took up the invitation to go and speak there, my body basically refused to sleep at local night time. Given I’d only just managed to shake off the exhaustion that had been troubling me for a while before the trip, this was rather sub-optimal – and meant I was pretty much beat by the time I got back home. Just in time for…the Swiss Perl Workshop.

Switzerland is, thankfully, in the same timezone as Prague, so no trouble with being able to sleep on a night! And the hotel was very comfortable too. Sadly, I still arrived to the workshop really quite exhausted – no thanks to needing to make the trip from Prague to Zurich via a night in Stockholm for annoying immigration-related reasons (I’m generally appreciative of the time the Schengen agreement saves me, but there’s certainly some improvement needed). Despite my tiredness, it was still a useful time. Day 1 was a hackathon, and lots of GLR-related decisions got made – much thanks to having the right people in the same room. On day 2, I gave my concurrency talk again, along with a talk on NFG. And on day 3, I taught a day-long Perl 6 introduction course – to a totally full room.

Despite feeling dizzy and exhausted in some periods, SPW was a great time. But by the time it was over, I really felt my body telling me I was pushing it way too far. The mere thought of getting on another flight gave me a headache. So, sadly I ended up cancelling my trip to YAPC::Europe this year. After attending 10 years’ worth of YAPC::Europe conferences in a row, I was sad to miss it – and it was a pity to have to cancel my session there too. Happily, lizmat++ stepped up to take my slot and deliver my concurrency talk – which she’d already seen in Japan and Switzerland. I spent several days offline, working my way back home overland with a couple of stops in Austria, thankfully with my wife to take good care of me along the way.

In terms of big Perl 6 implementation news: in the days between my last post here and YAPC::Asia, I got the GLR (Great List Refactor) to a point where others could contribute. And contribute they did! A special mention goes to nine++, who put in an incredible amount of work, meaning that I could merge the GLR work in early September. Since then we’ve been working through various details, but things are looking really good there.

Obviously, I’ve been having to take it somewhat easy in the last weeks. It’s going to be a while before I’m back to full health, and critically I need to avoid doing too much. I’ve been finding a decent amount of time to do Perl 6 things, but with limited energy have prioritized doing things (and helping others do things) rather than writing about doing things. By this point, I think I’ll be able to get back to regular progress reports again, so you’ll have those to look forward to. I’ll hopefully have one out for the last week or so sometime near the weekend.


This week: fixing lots of RTs, digging into the GLR

This report covers the week starting 27th July. The first half was spent teaching the course that I was busy finishing up writing last week; happily, it went extremely well and was a lot of fun to deliver. And since then, I’ve been back to Perl 6 things pretty much exclusively. Anyway, here’s what I got up to.

Fixing the &?ROUTINE pessimization

I noted in a recent episode that I fixed &?ROUTINE’s semantics with regards to closures. And indeed I had. Unfortunately, in doing so, I managed to make a whole bunch of things impossible for the MoarVM inliner to handle. Given how much fiddly work it was to implement multi-level inlining and deoptimization, I guess I should be happy that ending up with far fewer things being inlined would lead to a very measurable slowdown in real-world code – in this case, Text::CSV. This week I took another crack at &?ROUTINE, retaining the correct semantics, but making it so you only pay the cost of having it if you actually use it. And with that, the inlining started working much more effectively again (and Text::CSV regained some speed – and likely other code too).
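As a reminder of what &?ROUTINE is for, here is a minimal sketch, assuming a current Rakudo (the routine names are illustrative). It refers to the routine currently executing, which is handy for recursion in anonymous subs:

```raku
# &?ROUTINE lets a routine call itself without mentioning its own name.
sub factorial($n) {
    $n <= 1 ?? 1 !! $n * &?ROUTINE($n - 1)
}
say factorial(5);    # 120

# This matters most for anonymous routines, which have no name to call.
my &countdown = sub ($n) { $n == 0 ?? 'liftoff' !! &?ROUTINE($n - 1) };
say countdown(3);    # liftoff
```

Routines that never mention &?ROUTINE are the ones that no longer pay for it.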

The status of our-scoped things inside roles

There were four RT tickets on the subject of our-scoped things inside of roles. They all hinged on a seemingly easy question. If this works:

class C {
    our sub s() { }
}

Then why doesn’t this:

role R {
    our sub s() { }
}

There’s actually not one, but two reasons why. The first is that a given role is actually rather like a template, generic on the type of the class that it actually ends up being composed into (think about how ::?CLASS is generic inside the body of a role, then realize you could mention it in the our sub; you’ll note the our sub you’re referencing – if you could – would be ambiguous). The second is that you could later define:

role R[::T] {
}

That’s fine, this role and the previous one are disambiguated by their parameter lists. But it means that the symbol R that we install actually refers to a role disambiguator. (If you’re thinking this sounds like proto and multi subs all over again – you’re spot on. In fact, it’s implemented using the very same mechanism.)

Once you know these two things, it makes sense that you’re not going to reach an our-scoped thing buried two levels deep in genericity. Of course, when you start out with Perl 6 you don’t know those two things; you’ll more likely flail around in frustration. Now we reject our-scoped declarations inside of roles at compile time, with an explanation of why.
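The disambiguation by parameter list can be seen in a small sketch, assuming a current Rakudo. The method and class names here are made up for illustration:

```raku
# Two roles share the name R; the parameter list selects between them.
role R      { method kind() { "plain R" } }
role R[::T] { method kind() { "R of " ~ T.^name } }

class A does R      { }
class B does R[Int] { }

say A.kind;    # plain R
say B.kind;    # R of Int
```

The bare name R therefore does not identify a single package in which an our-scoped sub could live.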


The Great List Refactor, or Great List Re-implementation, or Great List Re-design, was identified as one of the Big Three Tasks for Perl 6 ahead of the release we’re working towards later this year. In summary, the goal is to take on the semantic, speed, and memory issues with the current list design and implementation. I wasn’t expecting to be the person who led implementation work on this, but in the end it has fallen to me. Thankfully, a lot of the design thinking has already taken place, so it really is a case of focusing on the lower-level data structure design and implementation. Anyway, it takes a while to get from no code to something that’s ready for the wider community to have a conversation around, and so my GLR time in the week covered by this report started out with isolated contemplation and fleshing out code. Spoiler: I actually released it this Monday, and have continued evolving it since; you can find the latest in a Gist I’m keeping updated (though I suspect in the next days I’ll be moving over to working in a Rakudo branch).

Other fixes

I fixed quite a few other tickets too:

  • Fix RT #125675 (tighten up various signatures so we get bind failures, not .count/.arity dispatch failures)
  • Fix RT #125670 (rx{foo} as a parameter default caused compiler crash when it tried to do some static analysis)
  • Fix RT #125715 (problems using EXPORT-d type as a type constraint on an attribute)
  • Fix RT #125694 (private method calls in roles bound too tightly to role type)
  • Fix getting ugly low-level backtrace when sinking last statement in a program
  • Verify RT #125346 is fixed, write a test for it
  • Fix a MoarVM crash involving lexotic (return) handlers and a race condition in frame validation
  • Fix RT #125480 (program counter corruption due to bad interaction of LEAVE/return/closures)
  • Trying to hunt down a MoarVM parallel GC bug. Found one issue and patched it, but it’s seemingly not The One…

See you next week!
