What one Christmas elf has been up to

Here’s a look over the many things I’ve been working on in recent weeks to bring us closer to the Christmas Perl 6 release. For the most part, I’ve been working through the tickets we’ve attached to our “things to resolve before the Christmas release” meta-ticket – and especially trying to pick off the hard/scary ones sooner rather than later. From a starting point of well over 100 tickets, we’re now down to less than 40.

NFG improvements

If you’ve been following along, you’ll recall that I did a bunch of work on Normal Form Grapheme earlier on in the year. Normal Form Grapheme is an efficient way of providing strings at grapheme level. If a single grapheme (that is, thing a human would consider a character) is represented by multiple codepoints, then we create a synthetic codepoint for it, so we can still get O(1) string indexing and cheaply and correctly answer questions like, “how many characters is this”.

So, what was there to improve? Being the first to do something gives a low chance of getting everything right first time, and so was the case here. My initial attempt at defining NFG was the simplest possible addition to NFC, which works in terms of a property known as the Canonical Combining Class. NFC, to handwave some details, takes cases where we have a character followed by a combining character, and if there is a precomposed codepoint representing the two, exchanges them for the single precomposed codepoint. So, I defined NFG as: first compute NFC, then if you still see combining characters after a base character, make a synthetic codepoint. This actually worked pretty well in many cases. And, given the NFC quick-check property can save a lot of analysis work, NFG could be computed relatively quickly.

Unfortunately, though, this approach was a bit too simplistic. Unicode does actually define an algorithm for grapheme clusters, and it’s a bit more complex than doing an “NFC++.” Fortunately, I’d got the bit of code that would need to change rather nicely isolated, in expectation that something like this might happen anyway. So, at least 95% of the NFG implementation work I’d done back in April didn’t have to change at all to support a new definition of “grapheme”. Better yet, the Unicode consortium provided a bunch of test data for their grapheme clustering algorithm, which I could process into tests for Perl 6 strings.

So far so good, but there was a bit of a snag: using the NFC quick check property was no longer going to be possible, and without it we’d be in for quite a slowdown when decoding bytes to strings – which of course we do every time we get input from the outside world! So, what did I do? Hack our Unicode database importer to compute an NFG quick check property, of course. “There, I fixed it.”

So, all good? Err…almost. I’d also back in April done some optimizations around assuming that anything in the ASCII range was not subject to NFG. Alas, \r\n has been defined as a single grapheme. And yes, that really does mean:

> say "OMG\r\n".chars

I suspect this will be one of those “you’ll can’t win” situations. Ignore that bit of the Unicode spec, and people who understand Unicode will complain that Perl 6 implements it wrong. Follow it, and folks who don’t know the reasoning will think the above answer is nuts. :-) By the way, asking “how many codepoints” is easy:

> say "Hi!\r\n".codes

Making “\r\n” a single grapheme was rather valuable for a reason I hadn’t expected: now that something really common (at least, on Windows) exercised NFG, a couple of small implementation bugs were exposed, and could be fixed. It was also rather a pain, because I had to go and fix the places that wrongly thought they needn’t care for NFG (for example, the ASCII and Latin-1 encondings). The wider community then had to fix various pieces of code that used ord – a codepoint level operation – to see if there was a \r, then expected a \n after it, and then got confused. So, this was certainly a good one to nail before the Christmas release, after which we need to get serious about not breaking existing code.

As a small reward for slogging through all of this, it turned out that \r\n being a single grapheme made a regex engine issue involving ^^ and $$ magically disappear. So, that was another one off the Christmas list.

There were a few other places where we weren’t quite getting things right with NFG:

  • Case folding, including when a synthetic was composed out of something that case folded to multiple codepoints (I’m doubtful this code path will ever be hit for text in any real language, but I’m willing to be surprised)
  • Longest Token Matching in grammars/regexes (the NFA compiler/evaluator wasn’t aware of synthetics; now it is)

And with that, I think we can call NFG done for Christmas. Phew!

Shaped arrays

I’ve finally got shaped arrays fleshed out and wired into the language proper. So, this now works:

> my @a[3;3] = 1..3, 4..6, 7..9;
[[1 2 3] [4 5 6] [7 8 9]]
> say @a[2;1]
> @a[3;1] = 42
Index 3 for dimension 1 out of range (must be 0..2)

This isn’t implemented as an array of arrays, but rather as a single blob of memory with 9 slots. Of course, those slots actually point to Scalars, so it’s only so much more efficient. Native arrays can be shaped also, though. So, this:

my int8 @a[1024;1024];

Will allocate a single 1MB blob of memory and all of the 8-bit integers will be packed into it.

Even if you aren’t going native, though, shaped arrays do have another sometimes-useful benefit over nested arrays: they know their shape. This means that if you ask for the values, you get them:

> my @a[3;3] = 1..3, 4..6, 7..9; say @a.values;
(1 2 3 4 5 6 7 8 9)

Whereas if you just have an array of arrays and asked for the values, you’d just have got the nested arrays:

> my @a = [1..3], [4..6], [7..9]; say @a.values;
([1 2 3] [4 5 6] [7 8 9])

The native array design has been done such that we’ll be able to do really good code-gen at various levels – including down in the JIT compiler. However, none of that is actually done yet, nor will it be this side of Christmas, so the performance of shaped arrays – including the native arrays – isn’t too hot. In general, we’re focusing really hard on places we need to nail down semantics at the moment, because we’ll have to live with those for a long time. We’re free to improve performance every single monthly release, though – and will be in 2016.

Module installation and precompilation

I spent some time pondering and writing up a gist about what I thought management of installed modules and their precompilations should look like, along with describing a precompilation solution for development time (so running module test suites can benefit “for free” from precompilation). I was vaguely hoping not to have to wade into this area – it’s just not the kind of problem I consider myself good at and there seem to be endless opinions on the subject – but got handed my architect hat and asked to weigh in. I’m fairly admiring of the approach taken under the .git directory in Git repositories, and that no doubt influenced the solution I came up with (yes, there are SHA-1s aplenty).

After writing it, I left for a week’s honeymoon/vacation, and while I was away, something wonderful happened: nine++ started implementing what I’d suggested! By this point he’s nearly done, and it’s largely fallen out as I had imagined, with the usual set of course corrections that implementing a design usually brings. I’m optimistic we’ll be able to merge the branch in during the next week or so, and another important piece will have fallen into place in time for Christmas. Thanks should also go to lizmat++, who has done much to drive module installation related work forward and also provided valuable feedback in earlier drafts of my design.

Line endings

Windows thinks newlines are \r\n. Most of the rest of the world think they are \n. And, of course, you end up with files shared between the two, and it’s all a wonderful tangle. In regexes in Perl 6, \n has always been logical: it will happy match \r\n or the actual UNIX-y \n. That has not been the case for \n in strings, however. Thus, on Windows:

say "foo!";

Would until recently just spit out \n, not \r\n. There actually are relatively few places that this actually causes problems today: the command shell is happy enough, pretty much every editor is happy (of course, Notepad isn’t), and so forth. Some folks wanted us to fix this, others said screw it, so I asked Larry what we should do. :-) The solution we settled on is making \n in strings also be logical, meaning whatever the $?NL compile-time constant contains. And we pick the default value of that by platform. So on Windows the above say statement will spit out \r\n. (We are smart enough to recognize that a \r\n sequence in a string is a “single thing” and not go messing with the “\n” inside of it!) There are also pragmas to control this more finely if you don’t want the platform specific semantics:

use newline :lf; # Always UNIX-y \x0A
 use newline :crlf; # Always Windows-y \x0D\x0A
 use newline :cr; # \x0D for...likely nothing Perl 6 runs on :-)

Along with this, newline related configuration on file handles and sockets has been improved and extended. Previously, there was just nl, which was the newline for input and output. You can now set nl-in to either a string separator or an array of string separators, and they can be multiple characters. For output, nl-out is used. The default nl-in is [“\r\n”, “\x0A”], and the default nl-out is “\n” (which is logically interpreted by platform).

Last but not least, the VM-level I/O layer is now aware of chomping, meaning that rather than it handing us back a string that we then go and chomp at Perl 6 level, it can immediately hand back a string with the line ending readily chopped off. This was an efficiency win, but since it was done sensitive to the current set of seperators also fixed a longstanding bug where we couldn’t support auto-chomping of custom input line separators.


A couple of notable things happened with regards to encodings (the things that map between bytes and grapheme strings). On MoarVM, we’ve until recently assumed that every string coming to us from the OS was going to be decodable as UTF-8 (except on Windows, which is more into UCS-2). That often works out, but POSIX doesn’t promise you’ll get UTF-8, or even ASCII. It promises…a bunch of bytes. We can now cope with this properly – surprisingly enough, thanks to the power of NFG. We now have a special encoding, UTF-8 Clean-8-bit, which turns bytes that are invalid as UTF-8 into synthetics, from which we can recover the original bytes again at output. This means that any filename, environment variable, and so forth can be roundtripped through Perl 6 problem-free. You can concat “.bak” onto the end of such a string, and it’ll still work out just fine.

Another Christmas RT complained that if you encoded a string to an encoding that couldn’t represent some characters in it, it silently replaced them with a question mark, and that an exception would be a better default. This was implemented by ilmari++, who also added support for specifying a replacement character. I just had to review the patches, and apply them. Easy!

Here, here

I fixed all of the heredoc bugs in the Christmas RT list:

  • RT #120788 (adverbs after :heredoc/:to got “lost”)
  • RT #125543 (dedent bug when \n or \r\n showed up in heredocs)
  • RT #120895 (\t in heredoc got turned into spaces)

The final regex fixes

Similarly, I dealt with the final regex engine bugs before Christmas, including a rather hard to work out backtracking one:

  • RT #126438 (lack of error message when quantifying an anchor, just a hang)
  • RT #125285 (backtracking/capturing bug)
  • RT #88340 (backreference semantics when there are multiple captures)

Well, or so I thought. Then Larry didn’t quite like what I’d done in RT #88340, so I’ll have to go and revisit that a little. D’oh.

Other smaller Christmas RTs

  • Fix RT #125210 (postfix ++ and prefix ++ should complain about being non-associative)
  • Fix RT #123581 (.Capture on a lazy list hung, rather than complaining it’s not possible)
  • Add tests to codify that behavior observed in RT #118031 (typed hash binding vs assignment) is correct
  • Fix RT #115384 (when/default should not decont), tests for existing behavior ruled correct in RT #77334
  • Rule on RT #119929 and add test covering ruling (semantics of optional named parameters in multi-dispatch)
  • Fix RT #122715 and corrected tests (Promise could sink a Seq on keep, trashing the result)
  • Fix RT #117039 (run doesn’t fail); update design docs with current reality (Proc will now throw an exception in sink context if the process is unsuccessful), and add tests
  • Fix RT #82790 (indecisive about $*FOO::BAR; now we just outright reject such a declaration/usage)
  • Check into RT #123154; already fixed on Moar, just not JVM, so removing from xmas list
  • Review RT #114026, which confused invocation and coercion type literals. Codify the response by changing/adding tests.
  • Get ruling on RT #71112 and update tests accordingly, then resolve it.
  • Work on final bits needed to resolve RT #74414 (multi dispatch handling of `is rw`), building on work done so far by psch++
  • Fix RT #74646 (multi submethods were callable on the subclass)
  • Implement nextcallee and test it; fix nextsame/nextwith on nowhere to defer; together these resolved RT #125783
  • Fix RT #113546 (MoarVM mishandles flattening named args and named args with respect to ordering)
  • Fix RT #118361 (gist of .WHAT and .WHO isn’t shortname/longname respectively); RT #124750 got fixed along the way
  • Tests codifying decision on RT #119193 (.?/.+/.* behavior with multis)
  • Review/merge pull requests to implement IO::Handle.t from pmurias++, resolving RT #123347 (IO::Handle.t)

Busy times!

And last but not least…

I’ll be keynoting at the London Perl Workshop this year. See you there!

Posted in Uncategorized | 2 Comments

Last week: Unicode case fixes and much more

This report covers a two week period (September 28th through Sunday 11th October). However, the first week of it was almost entirely swallowed with teaching a class and the travel to and from that – so I’d have had precisely one small fix to talk about in a report. The second week saw me spend the majority of my working time on Perl 6, so now there’s plenty to say.

A case of Unicode

I decided to take on a couple of Unicode-related issues that were flagged for resolution ahead of the 6.christmas release. The first one was pretty easy: implementing stripping of the UTF-8 BOM. While it makes no sense to have a byte order mark in a byte-level encoding, various Windows programs annoyingly insert it to indicate to themselves that they’re looking at UTF-8. Which led to various fun situations where Perl 6 users on Windows would open a UTF-8 file and get some junk at the start of the string. Now they won’t.

The second task was much more involved. Unicode defines four case-changing operations: uppercasing, titlecasing, lowercasing, and case folding. We implemented the first three – well, sort of. We actually implemented the simple case mappings for the first three. However, there are some Unicode characters that become multiple codepoints, and even multiple graphemes, on case change. The German sharp S is one (apparently controversial) example, ligatures are another, and the rest are from the Greek and Armenian alphabets. First, I implemented case folding, now available as the fc method on strings in Perl 6. Along the way I made it handle full case folds that expand, and added tests. Having refactored various codepaths to cope with such expansion, it was then not too hard to get uc/tc/lc updated also. The final bit of fun was dealing with the interaction of all of this with NFG synthetics (tests here). Anyway, now we can be happy we reach Christmas with the case folding stuff correctly implemented.

Fixing some phasing issues with phasers

RT #121530 complained that when a LEAVE block threw an exception, it should not prevent other LEAVE and POST blocks running. I fixed that, and added a mechanism to handle the case where multiple LEAVE blocks manage to throw exceptions.

Amusingly enough, after fixing a case where we didn’t run a LEAVE block when we should, RT #121531 complained about us running them when we shouldn’t: in the case PRE phasers with preconditions failed. I fixed this also.

The usual bit of regex engine work

When you call a subrule in a regex, you can pass arguments. Normally positional ones are used, but RT #113544 noted that we didn’t yet handle named arguments, nor flattening of positional or named arguments. I implemented all of the cases, and added tests.

I reviewed and commented on a patch to implement the <?same> assertion from the design docs, which checks that the characters either side of it are the same. I noted a performance improvement was possible in the implementation, which was happily acted upon.

Finally, I started looking into an issue involving LTM, character classes, and the ignorecase flag. No fix yet; it’s going to be a bit involved (and I wanted to get our case handling straightened out before really attacking this one).


We suddenly started getting some bizzare mis-compiles in CORE.setting, where references to classes near the end of it would point to completely the wrong things. It turned out to be a (MVMuint16) cast that should have been an (MVMuint32) down in MoarVM’s bytecode assembler – no doubt wrongly copied from the line above. It’s always a relief when utterly weird things end up not being GC bugs!

A little profiler fix

If you did –profile on a program that called exit(), the results were utterly busted. Now they’re correct.

Other little bits

Here’s a collection of other things I did that are worth a quick mention.

  • Reviewing new RT tickets and commits; of note, reviewing patch for making :D/:U types work in more places
  • Eliminating remaining method postcircumfix:<( )> uses in Rakudo/tests. Looking into coercion vs. call distinction.
  • Reviewing “make Bool an enum” branch
  • Looking further into call vs. coercion and coercion API as part of RT #114026; post a proposal for discussion
  • Fix CORE::.values to actually produce values (same for other pseudo-packages); fix startup pref regression along the way
  • Studying latest startup profile, identifying a recent small regression
  • Investigate RT #117417 and RT #119763; request design input to work out how we’ll resolve it
  • Fix RT #121426 (routines with declared return type should do Callable[ThatType])
  • Review RT #77334, some discussion about what the right semantics really are, file notes on ticket
  • Fix RT #123769 (binding to typed arrays failed to type check)
  • Write up a rejection of RT #125762
  • Resolve RT #118069 (remove section on protos auto-multi-ing from design docs as agreed, remove todo’d tests)
  • Fix RT #119763 and RT #117417 (bad errors due to now-gone colonpair trait syntax not being implemented)
  • Reading up on module installation writings: S11, S22, gist with input from many in preparation for contributing to design work in the area

And that’s it. Have a good week!

Posted in Uncategorized | 1 Comment

Those weeks: much progress!

This post is an attempt to summarize the things I worked on since the last weekly report up until the end of last week (so, through to Friday 25th). Then I’ll get back to weekly. :-)


The Great List Refactor accounted for a large amount of the time I spent. Last time I wrote here, I was still working on my prototype. That included exploring the hyper/race design space for providing for data parallel operations. I gave an example of this in my recent concurrency talk. I got the prototype far enough along to be comfortable with the API and to have a basic example working and showing a wallclock time win when scaling over multiple cores – and that was as far as I went. Getting the GLR from prototype mode to something others could help with was far more pressing.

Next came the long, hard, slog of getting the new List and Array types integrated into Rakudo proper. It was pleasant to start out by tossing a bunch of VM-specific things that were there “for performance” that were no longer going to be needed under the new design – reducing the amount of VM-specific code in Rakudo (we already don’t have much of it anyway, but further reductions are always nice). Next, I cleared the previous list implementation from CORE.setting and put the new stuff in place, including all of the new iterator API. And then it was “just” a case of getting the setting to compile again. You may wonder why getting it to even compile is challenging. The answer is that the setting runs parts of itself while it compiles (such is the nature of bootstrapping).

Once I got things to a point where Rakudo actually could build and run its tests again, others felt confident to jump in. And jump in they did – especially nine++. That many people were able to dive in and make improvements was a great relief – not only because it meant we could get the job done sooner, but because it told me that the new list and iterator APIs were easy for people to understand and work with. Seeing other people pick up my design, get stuff done with it, and make the right decisions without needing much guidance, is a key indicator for me as an architect that I got things sufficiently right.

Being the person who initially created the GLR branch, it was nice to be the person who merged it too. Other folks seemed to fear that task a bit; thankfully, it was in the end straightforward. On the other hand, I’ve probably taught 50 Git courses over the last several years, so one’d hope I’d be comfortable with such things.

After the merge, of course, came dealing with various bits of fallout that were discovered. Some showed up holes in the test suite, which were nice to fill. I also did some of the work on getting the JVM backend back up and running; again, once I got it over a certain hump, others eagerly jumped in to take on the rest.

GLR performance work

After landing the GLR came getting some of the potential optimizations in place. I wanted a real world codebase to profile to see how it handled things under the GLR, rather than just looking at synthetic benchmarks. I turned to Text::CSV, and managed a whole bunch of improvements from what I found. The improvements came from many areas: speeding up iterating lines read from a file, fixing performance issues with flattening, improving our code-gen in a number of places… There’s plenty more to be had when I decide it’s time for some more performance work; in the meantime, things are already faster and more memory efficient.


I also did some work on S07, the Perl 6 design document for lists and iteration. I ended up starting over, now I knew how the GLR had worked out. So far I’ve got most of the user-facing bits documented in there; in the coming weeks I’ll get the sections on the iterator API and the parallel iteration design fleshed out.

Syntactic support for supplies

At YAPC::Asia I had the pleasure of introducing the new supply/react/whenever syntax in my presentation. It’s something of a game-changer, making working with streams of asynchronous data a lot more accessible and pleasant. Once I’d had the idea of how it should all work, getting to an initial working implementation wasn’t actually much effort. Anyway, that’s the biggest item ticked off my S17 changes list.

Other concurrency improvements

A few other concurrency bits got fixed. RT #125977 complained that if you sat in a tight loop starting and joining threads that themselves did nothing, you could eat rather a lot of memory up. It wasn’t a memory leak – the memory was recovered – just a result of allocating GC semispaces for each of the threads, and not deallocating them until a GC run occurred. The easy fix was to make joining a thread trigger a GC run – a “free” fix for typical Perl 6 programs which never touch threads directly, but just have a pool of them that are cleaned up by the OS at process end.

The second issue I hunted down was a subtle data race involving closure semantics and invocation. The symptoms were a “frame got wrong outer” on a good day, but usually a segfault. Anyway, it’s gone now.

Last but not least, I finally tracked down an issue that’s been occasionally reported over the last couple of months, but had proved hard to recreate reliably. Once I found it, I understood why: it would only show up in programs that both used threads and dynamically created types and left them to be GC’d. (For the curious: our GC is parallel during its stop the world phase, but lets threads do the finalization step concurrently so they can go their own way as soon as they get done finalizing. Unfortunately, the finalization of type tables could happen too soon, leaving another thread finalizing objects of that type with a dangling pointer. These things always sound like dumb bugs in hindsight…)

Fixed size/shaped arrays

Work on fixed size arrays and shaped arrays continued, helped along by the GLR. By now, you can say things like:

my @a := Array.new(:shape(3,3));
@a[1;1] = 42;

Next up will be turning that into just:

my @a[3;3];
@a[1;1] = 42;

Preparing for Christmas

With the 6.christmas release of the Perl 6 language getting closer, I decided to put on a project manager hat for a couple of hours and get a few things into, and out of, focus. First of all, I wrote up a list of things that will explicitly not be included in 6.christmas, and so deferred to a future Perl 6 language release.

And on the implementation side, I collected together tickets that I really want to have addressed in the Rakudo we ship as the Christmas release. Most of them relate to small bits of semantic indecision that we should really get cleaned up, so we don’t end up having to maintain (so many…) semantics we didn’t quite want for years and years to come. Having compiler crashes and fixing them up in the next release is far more forgivable than breaking people’s working code when they upgrade to the next release, so I’m worrying about loose semantic ends much more than “I can trigger a weird internal compiler error”.

The `is rw` cleanup

One of the issues on my Christmas list was getting the “is rw” semantics tightened up. We’ve not been properly treating it as a constraint, as the design docs wish, meaning that you could pass in a value rather than an assignable container and not get an error until you tried to assign to it. Now the error comes at signature binding time, so this program now gives an error:

sub foo($x is rw) { }
 foo(42); # the 42 fails to bind to $x

Error reporting improvements

I improved a couple of poor failure modes:

  • Fix RT #125812 (error reporting of with/without syntax issues didn’t match if/unless)
  • Finish fixing RT #125745 (add hint to implement ACCEPTS in error about ~~ being a special form)
  • Remove leftover debugging info in an error produced by MoarVM

Other bits

Finally, the usual collection of bits and pieces I did that don’t fit anywhere else.

  • Test and look over a MoarVM patch to fix VS 2015 build
  • Reject RT #125963 with an explanation
  • Write response to RT #126000 and reject it (operator lexicality semantics)
  • Start looking into RT #125985, note findings on ticket
  • Fix RT #126018 (crash in optimizer when analysing attribute with subset type as an argument)
  • Fix RT #126033 (hang when result of a match assigned to variable holding target)
  • Reviewing the gmr (“Great Map Refactor”) branch
  • Fix crash that sometimes happened when assembing programs containing labeled exception handlers
  • Review RT #125705, check it’s fixed, add a test to cover it. Same for RT #125161.
  • Cut September MoarVM release
  • Hunt JIT devirtualization bug on platforms using x64 POSIX ABI and fix it
  • Tests for RT #126089
  • Fix RT #126104 (the `is default` type check was inverted)
  • Investigate RT #126029, which someone fixed concurrently; added a test to cover it
  • Fix RT #125876 (redeclaring $_ inside of various constructs, such as a loop, caused a compiler crash)
  • Fix RT #126110 and RT #126115.
  • Fixed a POST regression
  • Fix passing allomorphs to native parameters and add tests; also clear up accidental int/num argument coercion and add tests
  • Fix RT #124887 and RT #124888 (implement missing <.print> and <.graph> subrules in regexes)
  • Fix RT #118467 (multi-dispatch sorting bug when required named came after positional slurpy)
  • Looking into RT #75638 (auto-export of multis), decided we’d rather not have that feature; updated design docs and closed ticket
  • Investigate weird return compilation bug with JIT/inline enabled; get it narrowed down to resolution of dynamics in JIT inlines
  • Fix RT #113884 (constant scalars interpolated in regexes should participate in LTM)
  • Investigate RT #76278, determine already fixed, ensure there is test coverage
Posted in Uncategorized | 1 Comment


It’s amazing just how long has gone by since I last wrote something here. I intend to get back to writing weekly posts about my Perl 6 work again starting around now, but here’s a few big-picture updates of what I’ve been up to.

First came YAPC::Asia. I gave a talk on Perl 6’s concurrent, parallel, and asynchronous features; the slides as well as a video are available. The talk was well received, and I was happy to have some interesting questions at the end. I was also happy to find myself talking with some Go compiler folks during the conference dinner; of course, we talked about concurrency models, and plenty besides. While the conference was great, and I’m glad I took up the invitation to go and speak there, my body basically refused to sleep at local night time. Given I’d only just managed to get shake off the exhaustion that had been troubling me for a while before the trip, this was rather sub-optimal – and meant I was pretty much beat by the time I got back home. Just in time for…the Swiss Perl Workshop.

Switzerland is, thankfully, in the same timezone as Prague, so no trouble with being able to sleep on a night! And the hotel was very comfortable too. Sadly, I still arrived to the workshop really quite exhausted – no thanks to needing to make the trip from Prague to Zurich via a night in Stockholm for annoying immigration-related reasons (I’m generally appreciative of the time the Schengen agreement saves me, but there’s certainly some improvement needed). Despite my tiredness, it was still a useful time. Day 1 was a hackathon, and lots of GLR-related decisions got made – much thanks to having the right people in the same room. On day 2, I gave my concurrency talk again, along with a talk on NFG. And on day 3, I taught a day-long Perl 6 introduction course – to a totally full room.

Despite feeling dizzy and exhausted in some periods, SPW was a great time. But by the time it was over, I really felt my body telling me I was pushing it way too far. The mere thought of getting on another flight gave me a headache. So, sadly I ended up cancelling my trip to YAPC::Europe this year. After attending 10 years worth of YAPC::Europe conferences in a row, I was sad to miss it – and it was a pity to have to cancel my session there too. Happily, lizmat++ stepped up to take my slot and deliver my concurrency talk – which she’d already seen in Japan and Switzerland. I spent several days offline, working my way back home overland with a couple of stops in Austria, thankfully with my wife to take good care of me along the way.

In terms of big Perl 6 implementation news: in the days between my last post here and YAPC::Asia, I got the GLR (Great List Refactor) to a point where others could contribute. And contribute they did! A special mention goes to nine++, who put in an incredible amount of work, meaning that I could merge the GLR work in early September. Since then we’ve been working through various details, but things are looking really good there.

Obviously, I’ve been having to take it somewhat easy in the last weeks. It’s going to be a while before I’m back to full health, and critically I need to avoid doing too much. I’ve been finding a decent amount of time to do Perl 6 things, but with limited energy have prioritized doing things (and helping others do things) rather than writing about doing things. By this point, I think I’ll be able to get back to regular progress reports again, so you’ll have those to look forward to. I’ll hopefully have one out for the last week or so sometime near the weekend.

Posted in Uncategorized | 2 Comments

This week: fixing lots of RTs, digging into the GLR

This report covers the week starting 27th July. The first half was spent teaching the course that I was busy finishing up writing last week; happily, it went extremely well and was a lot of fun to deliver. And since then, I’ve been back to Perl 6 things pretty much exclusively. Anyway, here’s what I got up to.

Fixing the &?ROUTINE pessimization

I noted in a recent episode that I fixed &?ROUTINE’s semantics with regards to closures. And indeed I had. Unfortunately, in doing so, I managed to make a whole bunch of things impossible for the MoarVM inliner to handle. Given how much fiddly work it was to implement multi-level inlining and deoptimization, I guess I should be happy that ending up with a lot less things being inlined would lead to a very measurable slowdown in real-world code – in this case, Text::CSV. This week I took another crack at &?ROUTINE, retaining the correct semantics, but making it so you only pay the cost of having it if you actually use it. And with that, the inlining started working much more effectively again (and Test::CSV regained some speed – and likely other code too).

The status of our-scoped things inside roles

There were four RT tickets on the subject of our-scoped things inside of roles. They all hinged on a seemingly easy question. If this works:

class C {
    our sub s() { }

Then why doesn’t this:

role R {
    our sub s() { }

There’s actually not one, but two reasons why. The first is that a given role is actually rather like a template, generic on the type of the class that it actually ends up being composed into (think about how ::?CLASS is generic inside the body of a role, then realize you could mention it in the our sub; you’ll note the our sub you’re referencing – if you could – would be ambiguous). The second is that you could later define:

role R[::T] {

That’s fine, this role and the previous one are disambiguated by their parameter lists. But it means that the symbol R that we install actually refers to a role disambiguator. (If you’re thinking this sounds like proto and multi subs all over again – you’re spot on. In fact, it’s implemented using the very same mechanism.)

Once you know these two things, it makes sense that you’re not going to reach an our-scoped thing burried two levels deep in genericity. Of course, when you start out with Perl 6 you don’t know those two things, you’ll more likely flail around frustratedly. Now we reject our-scoped declarations inside of roles at compile time, with an explanation of why.


The Great List Refactor, or Great List Re-implementation, or Great List Re-design, was identifed as one of the Big Three Tasks for Perl 6 ahead of the release we’re working towards later this year. In summary, the goal is to take on the semantic, speed, and memory issues with the current list design and implementation. I wasn’t expecting to be the person who led implementation work on this, but in the end it has fallen to me. Thankfully, a lot of the design thinking has already taken place, so it really is a case of focusing on the lower-level data structure design and implementation. Anyway, it takes a while to get from no code to something that’s ready for the wider community to have a conversation around, and so my GLR time in the week covered by this report started out with isolated contemplation and fleshing out code. Spoiler: I actually released it this Monday, and have continued evolving it since; you can find the latest in a Gist I’m keeping updated (though I suspect in the next days I’ll be moving over to working in a Rakudo branch).

Other fixes

I fixed quite a few other tickets too:

  • Fix RT #125675 (tighten up various signatures so we get bind failures, not .count/.arity dispatch failures)
  • Fix RT #125670 (rx{foo} as a parameter default caused compiler crash when it tried to do some static analysis)
  • Fix RT #125715 (problems using EXPORT-d type as a type constraint on an attribute)
  • Fix RT #125694 (private method calls in roles bound too tightly to role type)
  • Fix getting ugly low-level backtrace when sinking last statement in a program
  • Verify RT #125346 is fixed, write a test for it
  • Fix a MoarVM crash involving lexotic (return) handlers and a race condition in frame validation
  • Fix RT #125480 (program counter corruption due to bad interaction of LEAVE/return/closures)
  • Trying to hunt down a MoarVM parallel GC bug. Found one issue and patched it, but it’s seemingly not The One…

See you next week!

Posted in Uncategorized | 1 Comment

This week: too little sleep and too little Perl 6

This report covers the week starting the 20th July – which turned out to be the week I had to finish preparing course material and labs for one of Edument’s new courses. That managed to swallow most of the week, so only a handful of Perl 6 things got done. Happily, August is free of any teaching and authoring responsibilities, and will be dominated with Perl 6 work.

Digging into multi-dimensional arrays in Perl 6

I started a Rakudo branch to work on the multi-dimensional array support, and made various decisions about how things will work. I got some way into the changes to Array itself (and had it basically working provided you worked in terms of the *-POS API directly), and started to look at the ways the slicing implementation will need to change to pass multiple dimensions along (so you’ll actually be able to do the access with […] as you’d expect).

Banning confusion

I took on one of the oldest RTs in the queue, which wished for there to be a compile time error on:

say $*a; my $*a;

There’s long been one on:

say $a; my $a;

Basically, it catches use-before-declaration confusion (since the declaration happens at compile time). The second case is clear cut; the first is less so as it involves a dynamically scoped variable and so we’re naturally a bit looser about those. But after a little discussion, Larry said he’d like to outlaw the first just like the second if it wasn’t too hard to implement. I took a look at the code, decided it wasn’t too bad at all, and fixed it.

Unblocking the release

During the preparation for the 2015.07 release, somebody noticed a regression in reporting “return outside of routine” errors, that the tests had missed. I jumped in to get it fixed up, and added a better test so we don’t bust it again.

This week’s token regex engine patches

I fixed RT #125648 (no syntax error for /00:11:22/), as well as looking into RT #77524 (Rakudo treated /a:/ as legal syntax and STD did not; it turns out Rakudo was right on this one).

Other bits

  • Fix RT #125642 and RT #121308 (traits expecting types didn’t report bad type or make suggestions)
  • Analyze RT #125634 and get it down to a much smaller example of what’s wrong; no fix yet
Posted in Uncategorized | 1 Comment

This week: concurrency stuff, multi-dimensional stuff, stuff stuff…

Finally, I got a week of peaceful hacking time at home and not-too-bad health, and so Stuff Got Done. Here’s what.

Progress on multi-dimensional arrays

Last time, I’d gotten decent support into MoarVM for multi-dimensional arrays (including packing natively typed values into a single piece of memory), and had started on porting this functionality to the JVM. This week, I finished that JVM porting work. Doing something for the second time is often pretty good at showing up things you didn’t think through well enough the first time. In doing the port, I thought up a couple of tests I’d not written on MoarVM that would be explodey, or faily, or otherwise unhappy – and found one that I’d written that really didn’t make a lot of sense. So I wrote some extra tests, fixed things, and we now have MoarVM and JVM equipped with the guts needed to dig into implementing multi-dimensional arrays at the Perl 6 level.

Concurrency thinking and tooling

I did two interesting things relating to Perl 6’s concurrency support, both of which are interesting to discuss a little.

For a while, I’ve known that our syntactic relief for concurrent programming needed some attention. For one, our await keyword – so far implemented as the simplest possible thing – does not offer the semantics that would make it genuinely powerful. If you await a Promise today, you block the waiting thread until it’s ready. Sure, it’s a kernel-supported, OS-scheduler-efficient blocking, not busy waiting, but it still swallows one of the thread pool’s threads – which are real OS threads. And that makes it impossible to have many thousands of start blocks “active” and awaiting something to happen in order to make progress. What we want is for an await in a start block scheduled on the thread pool to return control to the thread pool, so the OS thread can be used for something else. Then, once the Promise being waited on reaches some conclusion, the rest of the start block is scheduled for resumption. That was always my vision for it, but until now I never got around to defining the API through which that happens, let alone implementing it. This week I tackled the API design part of the job. I’ll work on the implementation in the coming weeks.

Next in line were supplies. I like where we’ve gone with them so far, but working with them is very much an exercise in functional programming. It’s a bit like not having for loops and if statements, and having to write everything with map and grep. You can certainly do it, but plenty of normal people find code written that way harder to follow. Heck, some of the less-normal folks like me recognize that some solutions just read better when they’re more imperatively specified. I’ve been pondering this for a while. I really want the asynchronous aspects of Perl 6 to be accessible, and I really want people to be able to write operations that combine many asynchronous data sources – including time – without epic functional contortions. Having done my share of teaching Rx.Net, I’ve had plenty of chance to see people grapple with asynchronous data using an API a lot like we have for supplies. When there’s a nice built-in that does Just What You Want, it works out great. But sometimes there’s not, and you have to get creative, and then the result tends to feel clever rather than clear.

So, I also proposed some syntactic relief for working with supplies. It comes in two parts: supply { … } blocks for constructing supplies, and an asynchronous looping construct called whenever that works with it. So far, feedback has been positive; Larry said it looked sane, and other responses ranged from approving up to excited. So, it’s looking promising.

I didn’t update S17 yet, but rather wrote my proposals in a gist, which has all of the details.

The second big thing I did this week was work on a MoarVM bytecode instrumentation that can identify when one thread writes to an object that was created by another thread while not holding a lock. While there are of course patterns where you can legitimately do such a thing, they are the exception rather than the norm – and so having a tool that tells you when such things happen can help identify bugs. I wrote it to help me get a better insight into some of the threading bugs we have in the RT queue. It’s turned on by setting an environment variable, and it instruments bytecode (using the same approach the profiler does) to detect and record such cross-thread writes. It was also not a lot of code to implement, which I guess is good news on MoarVM’s architecture. And yes, there are sophisticated data race detection algorithms out there, but they’d all take a good bit more work to get in place (maybe some day in the future, I’ll take this one). For now, this first, simplistic, approach should help us hunt down a bunch of issues.

Meanwhile, in regex land…

I was active in the regex engine again.

  • Fix RT #125608 (Longest Token Matching did not factor in the first branch of a sequential alternation)
  • Verify RT #125391 (order of zero-width delimiters in .caps) already fixed and write a test for it
  • Fix RT #117955 (quantified captures only captured last items when used in a conjunction)
  • Investigate ways to deal with RT #67128 (calling another grammar); discussion, prototype a fix, find it needs lang design input

Better errors

This week had its share of improved failure modes and better feedback, to enhance Perl 6 user experience.

  • Fix RT #125595 (improve error reporting on bad loop specification, in line with STD)
  • Fix RT #125600 (good error message for running a directory, plus make sure we report such issues on STDERR)
  • Fix RT #115398 and RT #115400 (give good error with location info on trying to parameterize a non-parametric type)
  • Fix RT #125591 (failed to detect various mis-uses of $.x and $!x in signatures at compile time)
  • Fix RT #125625 (misleading/malformed error for useless accessor method generation with my $.a and our $.a)
  • Fix RT #125620 (gist method on custom exceptions with no message method would crash)

Other assorted bits

As usual, there are a few other small things I did that are worth a quick mention:

  • Fixing MSVC MoarVM build after an otherwise-good patch busted it
  • Fix a Proc::Async test file for Windows and add it to those we run
  • Fix RT #124121 (using “but” for role mixins with Array literal did the wrong thing, plus bad behavior with Parcel)
  • Implement does trait on variables (resolves RT #124747)
Posted in Uncategorized | 1 Comment