That week: concurrency fixes, control exceptions, and more

I’m back! I was able to rest fairly well in the days leading up to my wedding, and had a wonderful day. Of course, after the organization leading up to it, I was pretty tired again afterwards. I’ve also had a few errands to run with getting our new apartment equipped with various essentials – including a beefy internet connection. And, in the last couple of days, I’ve been steadily digging back into work on my Perl 6 grant. :-) So, there will be stuff to tell about that soon; in the meantime, here’s the May progress report I never got around to writing.

Before fixing something, be sure you need it

I continued working on improving stability of our concurrent, async, and parallel programming features. One of the key things I found to be problematic was the frame cache. The frame cache was introduced to MoarVM very early on, to avoid memory allocations for call frames by keeping them cached. Back then, we allocated them using malloc. The frames were cached per thread, so in theory no thread safety issues, right?

Well, wrong, unfortunately. The MoarVM garbage collector works in parallel, and while it does have to pause all threads for some of the work, it frees them to go on their way when it can. Also, it steals work from threads that are blocked on doing I/O, acquiring a lock, waiting on a condition variable, and so forth. One of the things that we don’t need to block all threads for is clearing out the per-thread nursery, and running any cleanup functions. And some of those cleanups are for closures, and those hold references to frames, which when freed go into the frame pool. This is enough for a nasty bug: thread A is unable to participate in GC, thread B steals its GC work, the GC gets mostly done and things resume, thread A comes back from its slumber and begins executing frames, touching the frame pool…which thread B is also now touching because it is freeing closures that got collected. Oops.

I started pondering how best to deal with this…and then had a thought. Back when we introduced the frame cache, we used malloc to allocate frames. However, now we use MoarVM’s own fixed size allocator – which is a good bit faster (and of course makes some trade-offs to achieve that – there’s no free lunch in allocation land). Was the frame cache worth it? I did some benchmarking and discovered that it wasn’t worth it in terms of runtime (and timotimo++ did some measurements that suggested we might even be better off without it, though it was in the “noise” region). So, we could get less code, and be without a nasty threading bug. Good, right?

But it gets better: the frame cache kept hold of memory for frames that we only ever executed during startup, or only ran during the compile phase and not at runtime. So eliminating it would also shave a bit off Rakudo’s memory consumption too. With that data in hand, it was a no-brainer: the frame cache went away, and with it an icky concurrency bug. However, the frame cache had one last “gift” for us: its holding on to memory had hidden a reference counting mistake for deserialized closures. Thankfully, I was able to identify it fairly swiftly and fix it.

More concurrency fixes

I landed a few more concurrency-related fixes also. Two of them related to parametric role handling and role specialization, which needed some concurrency control. To ease this I added a Lock class to NQP, with pretty much the same API as the Perl 6 one. (You could use locks from NQP before, but there was some boilerplate. Now it’s easy.)

One other long-standing issue we’ve had is that using certain features (such as timers) could drive a single CPU core up to 100% usage. This came from using an idle handler on the async event loop thread to look for new work and check if we needed to join in with a garbage collection run. It did yield to the OS if it had nothing to do – but on an unloaded system, a yield to the OS scheduler will simply get you scheduled again in pretty short order. Anyway, I spent some time looking into a better way and implemented it. Now we use libuv async handles to wake up the event loop when there’s new work it needs to be aware of. And with that, we stopped swallowing up a core. Unfortunately, while this was a good fix, our CPU hunger had been hiding another nasty race involving lazy deserialization, which until now only showed up occasionally. Without all of that resource waste, the race started showing up all over the place. This is now fixed – but it happened just in these last couple of days, so I’ll save talking about it until the next report.
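Back to that Lock class for a moment: since its API mirrors the Perl 6 one, usage reads roughly like this (a minimal sketch; %cache and compute-specialization are made-up names for illustration):

my $lock = Lock.new;
my %cache;

sub compute-specialization($key) { "specialized($key)" }   # stand-in for real work

sub specialization-for($key) {
    # .protect runs the block while holding the lock and releases it
    # afterwards (even if the block throws), so concurrent callers
    # can't corrupt %cache
    $lock.protect({ %cache{$key} //= compute-specialization($key) });
}

say specialization-for('R[Int]');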

Startup time improvements

Our startup time has continued to see improvements. One of the blockers was a pre-compilation bug that showed up when various lazily initialized symbols were used (of note, $*KERNEL, $*VM, and $*DISTRO). It turns out the setup work of these poked things into the special PROCESS package. While it makes no sense to serialize an updated version of this per-process symbol table…we did. And there was no good mechanism to prevent that from happening. Now there is one, resolving RT #125090. This also unblocked some startup time optimizations. I also got a few more percent off startup by deferring constructing a rarely used DateTime object for the compiler build date.

Control exception improvements

Operations like next, last, take, and warn throw “control exceptions”. These are not caught by a normal CATCH handler, and you don’t normally need to think of them as exceptions. In fact, MoarVM goes to some effort to allow their implementation without ever actually creating an exception object (something we take advantage of in NQP, though not yet in Rakudo). If you do want to catch them and process them, you can use a CONTROL block. Trouble is, there was no way to talk about what sort of control exceptions were interesting. Also, we had some weird issues when a control exception went unhandled (RT #124255), which manifested as a segfault a while back, though by the time I looked at it we were merely giving a poor error. Anyway, I fixed that, and introduced types for the different kinds of control exception, so you can now do things like:

CONTROL {
    when CX::Warn {
        log(.message);
    }
}

to do custom logging of warnings. I also wrote various tests for these things.
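Here’s a slightly fuller sketch of the same idea (log-warning is just a made-up stand-in); resuming the control exception lets execution carry on after the warn, just as the default warning handler would:

sub log-warning($message) { note "[warn] $message" }   # stand-in logger

sub risky() {
    warn "this could be a problem";
    say "...and yet we carried on";
}

{
    CONTROL {
        when CX::Warn {
            log-warning(.message);
            .resume;    # continue running after the warn
        }
    }
    risky();
}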

Collaboration

I helped out others, and they helped me. :-)

  • FROGGS++ was working on using the built-in serialization support we have for the module database, but was running into a few problems. I helped resolve them…and now we have faster module loading. Yay.
  • TimToady++ realized that auto-generated proto subs were being seen by CALLER, auto-generated method ones were not, and hand-written protos that only delegated to the correct multi also were not. He found a way to fix it; I reviewed it and tweaked it to use a more appropriate and far cheaper mechanism.
  • I noticed that calling fail was costly partly because we constructed a full backtrace. It turns out that we don’t need to fully construct it, but rather can defer most of the work (for the common case where the failure is handled). I mentioned this on channel, and lizmat++ implemented it. There’s a small example of the handled-failure case just after this list.
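To illustrate that common case (my own example, not lizmat++’s patch): the failure below is handled by the defined-or operator, so the bulk of the backtrace construction work can now be deferred and, in the end, skipped entirely.

sub numify($s) {
    fail "not a number: $s" unless $s ~~ /^ \d+ $/;
    +$s
}

say numify("42");           # 42
say numify("nope") // -1;   # -1: the Failure is handled, so a fully
                            # built backtrace was never needed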

Other bits

I also did a bunch of smaller, but useful things.

  • Fixed two tests in S02-literals/quoting.t to work on Win32
  • Did a Panda Windows fix
  • Fixed a very occasional SEGV during CORE.setting compilation due to a GC marking bug (RT #124196)
  • Fixed a JVM build breakage
  • Fixed RT #125155 (.assuming should be supported on blocks)
  • Fixed a crash in the debugger, which made it explode on entry for certain scripts

More next time!


Taking a short break

Just a little update. When I last posted here, I expected to write the post about what I got up to in the second week of May within a couple of days. Unfortunately, in that next couple of days I got quite sick, then after a few days a little better again, then – most likely due to my reluctance to cancel anything – promptly got sick again. This time, I’ve been trying to make a better job of resting so I can make a more complete recovery.

By now I’m into another round of “getting better”, which is good. However, I’m taking it as easy as I can these days; at the weekend I will have my wedding, and – as it’s something I fully intend to only do once in my life – I would like to be in as good health as possible so I can enjoy the day! :-)

Thanks for understanding, and I hope to be back in Perl 6 action around the middle of next week.


Last week: smaller hashes, faster startup, and many fixes

I’m a little behind with writing up these weekly-ish reports, mostly thanks to attending OSDC.no and giving about 5 hours worth of talks/tutorials. That, along with spending time with the various other Perl 6 folks who were in town, turned out to be quite exhausting. Anyway, I’m gradually catching up on sleep. I’ll write up what I did in the first week of May in this post, and I’ll do one in a couple of days for the second week of May.

Shrinking hashes

In MoarVM, we’ve been using the UT Hash library since the earliest days of the VM. There was no point spending time re-implementing a common data structure for what, at the time, was decidedly an experimental project, especially when there was an option with an appropriate license and all contained in a single header file. While we started out with it fairly “vanilla”, we’ve already tweaked it in various ways (for example, caching the hash code in the MVMString objects). UT Hash does one thing we didn’t really need nor want for Perl 6: retain the insertion order. Naturally, we paid for this: to the tune of a pointer per hash, plus two pointers per hash item (for a doubly linked list). I ripped these out (which involved re-writing a few things that depended on them), as well as gutting the library of other functionality we don’t need. This means hashes on MoarVM on 64-bit builds are now 8 bytes smaller per hash, and 16 bytes smaller per hash item. Since we use hashes for plenty of things, this is a welcome saving; it even produces a measurable decrease to Rakudo’s base memory. This also paves the way for further improvements to hashes and string storage. (For what it’s worth, the time savings as a result of these changes were tiny compared to the memory savings: measurable under callgrind, but a drop in the ocean. Turns out a doubly linked list is a rather cheap thing to maintain.)

Decreasing startup time

I did a number of things to improve Rakudo’s startup time on MoarVM. One of them was changing the way we store string literals in the bytecode file. Previously, they were all stored as UTF-8. Now, those that are in the Latin-1 range are stored as Latin-1. Since this doesn’t need any kind of decoding or normalization, it is far cheaper to process them. This gave a 10% decrease in instructions executed at startup. Related to this, I aligned the serialization blob string heap with the bytecode file one, meaning that we don’t have to store a bunch of mappings between the two (another 2% startup win, and a decrease in bytecode file size, probably 100KB over all the bytecode files we load into memory as part of Rakudo).

Another 2% fell off when I realized that some careful use of static inlines and a slightly better design would lower the cost of temporary root handling (used to maintain the “GC knows where everything is” invariant) – a change that will be an improvement in a lot of other code too.

Another small win was noticing that we built the backend configuration data hash every time we were asked for it, rather than building it once and caching it. This was an easy win, and saved building it 7 times over during Rakudo’s startup (adding up more things to GC each time, of course). Finally, I did a little optimization of bounds checking when validating bytecode. These two improvements were both below the 1% threshold, though together added up to about that.

The revenge of NFG

Shortly after landing NFG, somebody reported a fairly odd output related bug. After a while, it was isolated to UTF-8 encoding, and when I looked I saw…a comment saying the encoder code would need updating post-NFG. D’oh. So, I took care of that, saving us some memory along the way. I also noticed some mess with NULL-terminated string handling (MoarVM ones aren’t, of course, but we do have to deal with the C world now and then), which I took the opportunity to get cleaned up.

Digging into concurrency bugs

Far too much of the concurrency work I did in Rakudo and the rest of the stack was conducted under a good amount of time pressure. Largely, I focused on getting things sufficiently in place that we could play with them and explore semantics. This has served us well; there was a chance to show running examples to the community at conferences and workshops and gather feedback, and for Larry and others to work on the API and language level of things. The result is that we’ve got a fairly nice bunch of concurrency features in the language now, but anybody who has tried to use them much will have noticed they behave in a rather flaky way in a number of circumstances. With NFG largely taken care of, I’m now going to be turning more of my attention to these issues.

Here are some of the early things I’ve done to start making things better:

  • Fixing a couple of bugs so the t/concurrency set of tests in NQP work much better on MoarVM
  • Hunting down and resolving an ABA problem in the MoarVM fixed size allocator free list handling, which occurred rarely but could cause fairly nasty memory corruption when it did (RT #123883)
  • Eliminating a bad assumption in the ThreadPoolScheduler which caused some oddities in Promise execution order (RT #123520)
  • Adding missing concurrency control to the multi-dispatch cache. While it was designed carefully to permit reads (concurrent with both other reads and additions), so you never need to lock when doing a multi-dispatch cache lookup, the additions need to be done one at a time (something foreseen when it was designed, but not handled). These additions are now serialized with a lock.

Other assorted bits

Here are a few other small things I did, worth a quick mention.

  • Fix and tests for RT #123641 and #114966 (cases where we got a block, but should have got an anonymous hash)
  • Fix RT #77616 (/a ~ (c) (b)/ reversed positional capture order)
  • Review and reject RTs #116525 and #117109 (out of line with decided flattening semantics)
  • Add test for and resolve RT #118785 (“use fatal” semantics were already fixed, but good to have an explicit test case)
  • Tests for RT #122756 and #114668 (a couple of already-fixed mixin bugs)

This week: the big NFG switch on, and many fixes

In my Perl 6 grant time during the last week, I continued the work on Normal Form Grapheme, along with fixing a range of RT tickets.

Rakudo on MoarVM uses NFG now

Since the 24th April, all strings you work with in Rakudo on MoarVM are at grapheme level. This includes string literals in program source code and strings obtained through some kind of I/O.

my $str = "D\c[COMBINING DOT ABOVE]\c[COMBINING DOT BELOW]";
say $str.chars;     # 1
say $str.NFC.codes; # 2

This sounds somewhat dramatic. Thankfully, my efforts to ensure my NFG work had good test coverage before I enabled it so widely meant it was, for most Rakudo users, something of a non-event. (Like so much in compiler development, you’re often doing a good job when people don’t notice that your work exists, and instead are happily getting on with solving the problems they care about.)

The only notable fallout reported (and fixed) so far was a bizarre bug that showed up when you wrote a one byte file, then read it in using .get (which reads a single line), which ended up duplicating that byte and giving back a 2-character string! You’re probably guessing off-by-one at this point, which is what I was somewhat expecting too. However, it turned out to be a small thinko of mine while integrating the normalization work with the streaming UTF-8 decoder. In case you’re wondering why this is a somewhat tricky problem, consider two packets arriving to a socket that we’re reading text from: the bytes representing a given codepoint might be spread over two packets, as may the codepoints representing a grapheme.

There were a few other interesting NFG things that needed attending to before I switched it on. One was ensuring that NFG is closed over concatenation operations:

my $str = "D\c[COMBINING DOT ABOVE]";
say $str.chars; # 1
$str ~= \c[COMBINING DOT BELOW];
say $str.chars; # 1

Another is making sure you can do case changes on synthetics, which has to do the case change on the base character, and then produce a new synthetic:

my $str = "D\c[COMBINING DOT ABOVE]\c[COMBINING DOT BELOW]";
say $str.NFD; # NFD:0x<0044 0323 0307>
say $str.NFC; # NFC:0x<1e0c 0307>
say $str.lc.chars; # 1
say $str.lc.NFD; # NFD:0x<0064 0323 0307>
say $str.lc.NFC; # NFC:0x<1e0d 0307>

(Aside: I’m aware there are some cases where Unicode has defined precomposed characters in one case, but not in another. Yes, it’s an annoying special case (pun intended). No, we don’t get this right yet.)

Uni improvements

I’ve been using the Uni type quite a bit to test the various normalization algorithms as I’ve implemented them. Now I’ve also filled out various of the missing bits of functionality:

  • You can access the codepoints using array indexing
  • It responds to .elems, and behaves that way when coerced to a number also
  • It provides .gist and .perl methods, working like Buf
  • It boolifies like other array-ish things: true if non-empty

I also added tests for these.
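Putting those together in one place (my own example; the output comments are indicative, and the exact .gist formatting may differ slightly):

my $u = Uni.new(0x0044, 0x0323, 0x0307);
say $u[1].base(16);   # 323 (array indexing gives the codepoints)
say $u.elems;         # 3
say +$u;              # 3 (numeric coercion agrees)
say ?$u;              # True (non-empty boolifies to True)
say ?Uni.new();       # False
say $u.gist;          # Uni:0x<0044 0323 0307>, Buf-style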

What the thunk?

There were various scoping related bugs when the compiler had to introduce thunks. This led to things like:

try say $_ for 1..5;

not having $_ correctly set. There were similar issues that sometimes forced you to put braces around gather blocks when a bare statement should have worked fine. These issues were at the root of 5 different RT tickets.

Regex sub assertions with args

I fixed the lexical assertion syntax in regexes, so you can now say:

<&some-rule(1, 2)>
<&some-rule: 1, 2>

Previously, trying to pass arguments resulted in a parse error.
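For example, with a made-up lexical regex that takes an argument, both call forms now work (a small sketch):

my regex range-sep($sep) { \d+ $sep \d+ }

say so "10..20" ~~ / ^ <&range-sep('..')> $ /;   # True
say so "10-20"  ~~ / ^ <&range-sep('..')> $ /;   # False
say so "10-20"  ~~ / ^ <&range-sep: '-'> $ /;    # True (colon form of the same call)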

Other assorted fixes

Here are a few other RTs I attended to:

  • Fix and tests for RT #124144 (uniname explodes in a weird way when given negative codepoints)
  • Fix and test for RT #124333 (SEGV in MoarVM dynamic optimizer when optimizing huge alternation)
  • Fix and test RT #124391 (use after free bug in error reporting)
  • Fix and test RT #114100 (Capture.perl flattened hashes in positionals, giving misleading output)
  • Fix and test RT #78142 (statement modifier if + bare block interaction; also fixed it for unless and given)

This week: digging into NFG, fixing “use fatal”, and more

It’s time for this week’s grant report! What have I been up to?

NFG

Last time, I talked about normalization forms in Unicode, and how Rakudo on MoarVM now lets you move between them and examine them, using the Uni type and its subclasses. We considered an example involving 3 codepoints:

my $codepoints = Uni.new(0x0044, 0x0323, 0x0307);
.say for $codepoints.list>>.uniname;

They are (as printed by the code above):

LATIN CAPITAL LETTER D
COMBINING DOT BELOW
COMBINING DOT ABOVE

We also noted that if we put them into NFC (Normal Form Composed – where we take codepoint sequences and identify where we can use precomposed codepoints), using this code:

my $codepoints = Uni.new(0x0044, 0x0323, 0x0307);
.say for $codepoints.NFC.list>>.uniname;

Then we get this:

LATIN CAPITAL LETTER D WITH DOT BELOW
COMBINING DOT ABOVE

Now, if you actually render that, and presuming you have a co-operating browser, it comes out as Ḍ̇ (you should hopefully be seeing a D with a dot above and below). If I was to stop people on the street and ask them how many characters they see when I show them a “Ḍ̇”, they’re almost certainly all going to say “1”. Yet if we work at codepoint level, even having applied NFC, we’re going to consider those as 2 “thingies”. Worse, we could do a substr operation and chop off the combining dot above. This is actually the situation in most programming languages.

Enter NFG. While it’s not on for all strings by default yet, if you take any kind of Uni and coerce it to a Str, you have a string represented in Normal Form Grapheme. The first thing to note is that if we ask .chars of it, we get 1:

my $codepoints = Uni.new(0x0044, 0x0323, 0x0307);
say $codepoints.Str.chars; # 1

How did it do this? To get Normal Form Grapheme, we first calculate NFC. Then, if we are left with combining characters (formally, anything with a non-zero value for Canonical_Combining_Class), we compute a synthetic codepoint and use it inside of the string. You’ll never actually see these. When we output the string, or turn it back into codepoints, the synthetics unravel. But when you’re working with a Str, you’re at grapheme level. Better still, operations like .chars and .substr are O(1).
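Tying this together with the earlier example: the synthetic makes grapheme-level operations cheap, but unravels back into ordinary codepoints the moment you leave Str land. A small sketch, using the same D-with-two-dots string as above:

my $str = Uni.new(0x0044, 0x0323, 0x0307).Str;   # NFG string holding one synthetic
say $str.chars;                                  # 1
say $str.NFC.list.map(*.base(16));               # 1E0C 307: real codepoints again
say $str.substr(0, 1) eq $str;                   # True: substr can't split the grapheme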

Under the hood, synthetics are represented by negative integers. This gives a cheap and easy way to know if we’re looking at a synthetic codepoint or not. And because we intern the synthetics, things like string equality testing can be a cheap and fast memory compare operation. On a further implementation note, I went with a partially lock-free trie to do the interning, meaning that no locks are needed to do lookups, and we only acquire one if we have to register a new synthetic – giving thread safety with very little overhead for real-world use cases.

I’m gradually assembling test cases for NFG. Some can be generated systematically from the Unicode NormalizationTests.txt file, though they are rather more involved since they have to be derived from the normalization test data by considering the canonical combining class. Others are being written by hand.

For now, the only way to get an NFG string is to call .Str on a Uni. After the 2015.04 release, I’ll move towards enabling it for all strings, along with dealing with some of the places where we still need to work on supporting NFG properly (such as in the leg operator, string bitwise operators, and LTM handling in the regex/grammar engine).

use fatal

For a while, Larry has been noting that a $*FATAL dynamic variable was the wrong way to handle use fatal – the pragma that makes lazy exceptions (produced by fail) act like die. The use fatal pragma was specified to automatically apply inside of a try block or expression, but until this week a Failure could escape a try, with the sad consequence that things like:

say try +'omg-not-anumber';

Would blow up with an exception saying you can’t sanely numify that. When I tried to make the existing implementation of use fatal apply inside of try blocks, it caused sufficient fireworks and action at a distance that it became very clear that not only was Larry’s call right, but we had to fix how use fatal works before we could enable it in try blocks. After some poking, I managed to get an answer that pointed me in the direction of doing it as an AST re-write in the lexical scope that did use fatal. That worked out far better. I applied it to try, and found myself with only one spectest file that needed attention (which had code explicitly relying on the old behavior). The fallout in the ecosystem seems to have been minimal, so we’re looking good on this. A couple of RTs were resolved thanks to this work, and it’s another bit of semantics cleaned up ahead of this year’s release.
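So, post-change, things behave along these lines (a sketch; the exact exception type and message may differ):

say try +'omg-not-anumber';    # Nil: the Failure is fatalized inside try and caught

{
    use fatal;
    my $n = +'nope';           # throws right here rather than returning a Failure
    CATCH { default { say "caught: " ~ .^name } }
}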

Smaller things

Finally, here are a handful of smaller things I’ve worked on:

  • Fix and test for RT #77786 (poor error reporting on solitary backtrack control)
  • Make ‘soft’ and ‘MONKEY-TYPING’ work lexically
  • Fix and test for RT #124304 (interaction of ‘my \a = …’ declarations with EVAL)
  • Fix for RT #122914 (binding in REPL didn’t persist to the next line); also add tests and make REPL tests work on Win32

Have a good week!


This week: Unicode normalization, many RTs

After some months where every tuit was sacred, and so got spent on implementing Perl 6 rather than writing about implementing Perl 6, I’m happy to report that for the rest of 2015 I’ll now have time to do both again. \o/ I recently decided to take a step back from my teaching/mentoring work at Edument and dedicate a good bit more time to Perl 6. The new Perl 6 core development fund is a big part of what has enabled this, and a huge thanks must go to WenZPerl for providing the initial round of funding to make this happen. I wrote up a grant application, which contains some details on what I plan to be doing. In short: all being well with funding, I’ll be spending 50% of my working time for the rest of 2015 doing Perl 6 things, to help us towards our 2015 release goals.

As part of the grant, I’ll be writing roughly weekly progress reports here. While the grant application is open for comments still, to make sure the community are fine with it (so far, the answer seems to be “yes!”), I’ve already been digging into the work. And when I looked at the list of things I’ve already got done, it felt like time already to write it up. So, here goes!

Uni and NF-whatever

One of the key goals of the grant is to get Normal Form Grapheme support in place. What does that mean? When I consider Unicode stuff, I find it helps greatly to be clear about 3 different levels we might be working at, or moving between:

  • Bytes: the way things look on disk, on the wire, and in memory. We represent things at this level using some kind of encoding: UTF-8, UTF-16, and so forth. ASCII and Latin-1 are also down at this level. In Perl 6, we represent this level with the Buf and Blob types.
  • Code points: things that Unicode gives a number to. That includes letters, diacritics, mathematical symbols, and even a cat face with tears of joy. In Perl 6, we represent a bunch of code points with the Uni type.
  • Graphemes: things that humans would usually call “a character”. Note that this is not the same as code points; there are code points for diacritics, but a typical human would tell you that it’s something that goes on a character, rather than being a character itself. These things that “go on a character” are known in Unicode as combining characters (though a handful of them, when rendered, visually enclose another character). In Perl 6, we represent a bunch of graphemes with the Str type – or at least, we will when I’m done with NFG!

Thus, in Perl 6 we let you pick your level. But Str – the type of things you’d normally refer to as strings – should work in terms of graphemes. When you specify an offset for computing a substring, then you should never be at risk of your “cut the crap” grapheme (which is, of course, made from two codepoints: pile of poo with a combining enclosing circle backslash) getting cut into two pieces. Working at the codes level, that could easily happen. Working at the level of UTF-16 code units (as C# and Java strings do, for example), you can even cut a code point in half, dammit. We’ve been better than that in Perl 6 pretty much forever, at least. Now it’s time to make the final push and get Str up to the grapheme level.

Those who have played with Perl 6 for a while will have probably noticed we do have Str (albeit at the code point level) and Buf (correctly at the bytes level), but will not have seen Uni. That’s because it’s been missing so far. However, today I landed some bits of Uni support in Rakudo. So far this only works on the MoarVM backend. What can we do with a Uni? Well, we can create it with some code points, and then list them:

my $codepoints = Uni.new(0x1E0A, 0x0323);
say $codepoints.list.map(*.base(16)); # 1E0A 323

That’s a Latin capital letter D with dot above, followed by a combining dot below. Thing is, Unicode also contains a Latin capital letter D with dot below, and a combining dot above. By this point, you’re probably thinking, “oh great, how do we even compare strings!” The answer is normalization: transforming equivalent sequences of code points into a canonical representation. One of these is known as Normal Form Decomposed (aka. NFD), which takes all of the pre-composed characters apart, and then performs a stable sort (to handwave a little, it only re-orders things that don’t render at the same location relative to the base character; look up Canonical Combining Class and the Unicode Canonical Sorting Algorithm if you want the gory details). We can compute this using the Uni type:

say $codepoints.NFD.list.map(*.base(16)); # 44 323 307

We can understand this a little better by doing:

say uniname($_) for $codepoints.NFD.list;

Which tells us:

LATIN CAPITAL LETTER D
COMBINING DOT BELOW
COMBINING DOT ABOVE

The other form you’ll usually encounter is Normal Form Composed (aka. NFC). This is (logically, at least) computed by first computing NFD, and then putting things back together into composed forms. There are a number of reasons this is not a no-op, but the really important one is that computing NFD involved a sorting operation. We can see that NFC has clearly done some kind of transform just by trying it:

my $codepoints = Uni.new(0x1E0A, 0x0323);
say $codepoints.NFC.list.map(*.base(16)); # 1E0C 307

We can again dump out the code point names:

LATIN CAPITAL LETTER D WITH DOT BELOW
COMBINING DOT ABOVE

And see that the normal form is a Latin capital letter D with dot below followed by a combining dot above.

There are a number of further details – not least that Hangul (the beautiful Korean writing system) needs special algorithmic handling. But that’s the essence of it. The reason I started out working on Uni is because NFG is basically going a step further than NFC. I’ll talk more about that in a future post, but essentially NFG needs NFC, and NFC needs NFD. NFD and NFC are well defined by the Unicode consortium. And there’s more: they provide an epic test suite! 18,500 test cases for each of NFD, NFKD, NFC, and NFKC (the K is for “compatibility”; those forms perform some extra mappings upon decomposition).
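As a quick illustration of what the K forms add (using the same Uni coercions as above; .NFKC works like .NFC): the ‘ﬁ’ ligature is a compatibility character, so the canonical forms keep it intact while the compatibility forms break it apart.

say Uni.new(0xFB01).NFC.list.map(*.base(16));    # FB01: canonical forms leave it alone
say Uni.new(0xFB01).NFKC.list.map(*.base(16));   # 66 69: compatibility maps it to 'f', 'i'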

Implementing the Uni type gave me a good way to turn the Unicode conformance tests into a bunch of Perl 6 tests – which is what I’ve done. Of course, we don’t really want to run all 70,000 or so of them every single spectest, so I’ve marked the full set up as stress tests (which we run as part of a release) and also generated 500 sanity tests per normalization form, which are now passing and included in a normal spectest run. These tests drove the implementation of the four normalization forms in MoarVM (header, implementation). The inline static function in the header avoids doing the costly Unicode property lookups for a lot of the common cases.

So far the normalization stuff is only used for the Uni type. Next up will be producing a test suite for mapping between Uni and (NFG) Str, then Str to Uni and Buf, and only once that is looking good will I start getting the bytes to Str direct path producing NFG strings also.

Unicode 7.0 in MoarVM

While I was doing all of the above, I took the opportunity to upgrade MoarVM’s Unicode database to Unicode 7.0.

Unsigned type support in NativeCall

Another goal of my work is to try and fix things blocking others. Work on Perl 6 database access was in need of support for unsigned native integer types in NativeCall, so I jumped in and implemented that, and added tests. The new functionality was put to use within 24 hours!
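As a rough sketch of what that enables (assuming a POSIX-ish libc where htonl is a real exported symbol; it takes and returns a 32-bit unsigned value):

use NativeCall;

# htonl(3): host to network byte order for a 32-bit unsigned integer
sub htonl(uint32 $hostlong) returns uint32 is native { * }

say htonl(1);   # 16777216 on a little-endian machine (0x01000000)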

RTs

Along with the big tasks like NFG, I’m also spending time taking care of a bunch of the little things. This means going through the RT queue and picking out things to fix, to make the overall experience of using Perl 6 more pleasant. Here are the ones I’ve dealt with in the last week or so:

  • Fix, add test for, and resolve RT #75850 (a capture \( (:a(2)) ) with a colonpair in parens should produce a positional argument but used to wrongly produce a named one)
  • Analyze, fix, write test for, and resolve RT #78112 (illegal placeholder use in attribute initializer)
  • Fix RT #93988 (“5.” error reporting bug), adding a typed exception to allow testability
  • Fixed RT #81502: added missing undeclared routine detection for BEGIN blocks, and included location info about BEGIN-time errors
  • Fixed RT #123967; provided good error reporting when an expression in a constant throws an exception
  • Tests for RT #123053 (Failure – lazy exceptions – can escape try), but no fix yet. I tried a few approaches that didn’t work out, though they led to two other small improvements along the way.
  • Typed exception for bad type for variable/parameter declaration; test for and resolve RT #123397
  • Fix and test for RT #123627 (bad error reporting of “use Foo Undeclared;”)
  • Fix and test for RT #123789 (SEGV when assigning type object to native variable), plus harden against similar bugs
  • Implemented “where” post-constraints on attributes/variables (see the small example after this list). Resolves RT #122109; unfudged existing test case.
  • Handle “my (Int $x, Num $y)” (RT #102414) and complain on “my Int (Str $x)” (RT #73102)
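On the “where” post-constraints item, here’s a tiny example of the sort of thing that now works (illustrative only; the exact exception type may vary):

my Int $even where * %% 2 = 42;           # fine
say $even;                                # 42

try { my Int $odd where * %% 2 = 43 }     # fails the where constraint...
say "rejected: {$!.^name}" if $!;         # ...so we get a type-check error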

What I’ve been working on, and what’s coming up

It’s been a little while since I wrote an update here, and with a spare moment post-teaching I figured it was a nice time to write one. I was lucky enough to end up being sent to the arctic (or, near enough) for this week’s work, meaning I’m writing this at 11pm, sat outside – and it’s decidedly still very light. It just doesn’t do dark at this time of year in these parts. Delightful.

Asynchronous Bits

In MoarVM and Rakudo 2014.05, basic support for asynchronous sockets landed. By now, they have also been ported to the JVM backend. In 2014.06, there were various improvements – especially with regard to cancellation and fixing a nasty race condition. Along the way, I also taught MoarVM about asynchronous timers, bringing time-based scheduling up to feature parity with the JVM backend. I then went a step further and added basic file watching support and some signals support; these two sadly didn’t yet make it to the JVM backend. On signals, I saw a cute lightning talk by lizmat using signal handling and phasers in loops to arrange for Ctrl+C to exit a program once a loop had completed its current iteration.
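One way to get that effect with the new signal support looks roughly like this (my own sketch rather than lizmat’s actual code; do-some-work is a stand-in):

my $interrupted = False;
signal(SIGINT).tap({ $interrupted = True });

sub do-some-work($i) { sleep 0.5; say "finished iteration $i" }

for 1 .. 100 -> $i {
    do-some-work($i);
    if $interrupted {
        say "Ctrl+C seen; stopping after a complete iteration";
        last;
    }
}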

While things basically work, they are not yet as stable as they need to be – as those folks implementing multi-threaded asynchronous web servers and then punishing them with various load-testing tools are discovering. So I’ll be working in the next weeks on hunting down the various bugs here. And once it’s stable, I may look into optimizations, depending on if they’re needed.

Various Rakudo fixes and optimizations

I’ve also done various assorted optimizations and fixes in Rakudo. They’re all over the map: optimizing for 1..100000 { } style loops into cheaper while loops, dealing with various small bugs reported in the ticket system, implementing a remaining missing form of colonpair syntax (:42nd meaning nd => 42), implementing Supply.on_demand for creating supplies out of…well…most things, optimizing push/unshift of single items…it goes on. I’ll keep trucking away at these sorts of things over the coming months; there’s some bugs I really want to nail.
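For the curious, the new colonpair form in action:

say (:42nd);          # nd => 42
say (:42nd).key;      # nd
say (:42nd).value;    # 42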

MoarVM’s dynamic optimizer

A bunch of my recent and current work revolves around MoarVM’s dynamic bytecode optimizer, known as “spesh” (because its primary – though not only – strategy is to specialize bytecode by type). Spesh first appeared in the 2014.04 release of MoarVM. Since then, it’s improved in numerous ways. It now has a logging phase, where it gathers extra type information at a number of places in the code. After this, it checks if the type information is stable – which is often the case, as most potentially polymorphic code is monomorphic (or put another way, dynamic languages are mostly just eventually-static). Provided we did get consistent types recorded, guard clauses are inserted (which cheaply check we really did get the expected type, and if not trigger deoptimization – falling back to the safe but slower bytecode). The types can then be assumed by the code that follows, allowing a bunch of optimizations to code that the initial specializer just couldn’t do much with.

Another important optimization spesh learned was optimizing dispatch based on the type information. By the time the code gets hot enough to specialize, multiple dispatch caches are primed. These, in combination with type information, are used to resolve many multiple dispatches, meaning that they become as cheap as single dispatches. Furthermore, if the callee has also been specialized – which is likely – then we can pick the appropriate specialization candidate right off, eliminating a bunch of duplicate guard checks. Everything described so far was on 2014.05.

So, what about 2014.06? Well, the 2014.05 work on dispatch – working out exactly what code we’re going to be running – was really paving the way for a much more significant optimization: inlining. 2014.06 thus brought basic support for inlining. It is mostly only capable of inlining NQP code at the moment, but by 2014.07 it’ll be handling the majority of basic ops in Rakudo that the static optimizer can’t already nail. Implementing inlining was tricky in places. I decided to go straight for the jugular and support multi-level inlining – that is, inlining things that also inline things. There are a bunch of places in Perl 6 this will be useful; for example, ?$some_int compiles to prefix:<?>($some_int), which in turn calls $some_int.Bool. That is implemented as nqp::p6bool(nqp::bool_I(self)). With inlining, we’ll be able to flatten away those calls. In fact, we’re just a small number of patches off that very case actually being handled.

The other thing that made implementing inlining tricky is that spesh is a speculative optimizer. It looks at what types it’s seeing, and optimizes assuming it will always get those types. Those optimizations include inlining. So, what if we’re inlining a couple of levels deep, are inside one of the inlined bits of code, and something happens that invalidates our assumptions? This triggers de-optimization. However, in the case of inlining, it has to go back and actually create the stack frames that it elided creating thanks to having applied inlining. This was a bit fiddly, but it’s done and was part of the 2014.06 release.

Another small but significant thing in 2014.06 is that we started optimizing some calls involving named parameters to pull the parameters out of the caller’s callsite by index. When the optimization applies, it saves doing any string comparisons whatsoever when handling named parameters.

So 2014.07 will make inlining able to cope better with Rakudo’s generated code, but anything else? Well, yes: I’m also working on OSR (On Stack Replacement). One problem today is that if the main body of the program is a hot loop doing thousands of iterations, we never actually get to specialize the loop code (because we only enter the code once, and so far it is repeated calls to a body of code that triggers optimization). This is especially an issue in benchmarks, but can certainly show up in real-life code too. OSR will allow us to detect that such a hot loop exists in unoptimized code, go off and optimize it, and then replace the running code with the optimized version. Those following closely might wonder if this isn’t just a kind of inverse de-optimization, and that is exactly how I’ll implement it: just use the deopt table backwards to work out where to shove the program counter. Last but not least, I also plan to work on a range of optimizations to generic code written in roles, to take away the genericity as part of specialization. Given the grammar engine uses roles in various places, this should be a healthy optimization for parsing.

JIT Compilation for MoarVM

I’m not actually implementing this one; rather, I’m mentoring brrt++, who is working on it for his Google Summer Of Code project. My work on spesh was in no small part to enable a good JIT. Many of the optimizations that spesh does turn expensive operations with various checks into cheap operations that just go and grab or store data, and thus should be nicely expressible in machine code. The JIT that brrt is working on goes from the graph produced by spesh. It “just” turns the graph into machine code, rather than improved bytecode. Of course, that’s still a good bit of work, especially given de-optimization has to be factored into the whole thing too. Still, progress is good, and I expect the 2014.08 MoarVM release will include the fruits of brrt’s hard work.
