This week: fixing lots of RTs, digging into the GLR

This report covers the week starting 27th July. The first half was spent teaching the course that I was busy finishing up writing last week; happily, it went extremely well and was a lot of fun to deliver. And since then, I’ve been back to Perl 6 things pretty much exclusively. Anyway, here’s what I got up to.

Fixing the &?ROUTINE pessimization

I noted in a recent episode that I fixed &?ROUTINE’s semantics with regard to closures. And indeed I had. Unfortunately, in doing so, I managed to make a whole bunch of things impossible for the MoarVM inliner to handle. Given how much fiddly work it was to implement multi-level inlining and deoptimization, I guess I should be happy that ending up with far fewer things being inlined led to a very measurable slowdown in real-world code – in this case, Text::CSV. This week I took another crack at &?ROUTINE, retaining the correct semantics, but making it so you only pay the cost of having it if you actually use it. And with that, the inlining started working much more effectively again (and Text::CSV regained some speed – and likely other code too).

The status of our-scoped things inside roles

There were four RT tickets on the subject of our-scoped things inside of roles. They all hinged on a seemingly easy question. If this works:

class C {
    our sub s() { }
}
C::s();

Then why doesn’t this:

role R {
    our sub s() { }
}
R::s();

There’s actually not one, but two reasons why. The first is that a given role is actually rather like a template, generic on the type of the class that it actually ends up being composed into (think about how ::?CLASS is generic inside the body of a role, then realize you could mention it in the our sub; you’ll note the our sub you’re referencing – if you could – would be ambiguous). The second is that you could later define:

role R[::T] {
    ... 
}

That’s fine, this role and the previous one are disambiguated by their parameter lists. But it means that the symbol R that we install actually refers to a role disambiguator. (If you’re thinking this sounds like proto and multi subs all over again – you’re spot on. In fact, it’s implemented using the very same mechanism.)

Once you know these two things, it makes sense that you’re not going to reach an our-scoped thing buried two levels deep in genericity. Of course, when you start out with Perl 6 you don’t know those two things, and you’ll more likely flail around frustratedly. Now we reject our-scoped declarations inside of roles at compile time, with an explanation of why.

The GLR

The Great List Refactor, or Great List Re-implementation, or Great List Re-design, was identified as one of the Big Three Tasks for Perl 6 ahead of the release we’re working towards later this year. In summary, the goal is to take on the semantic, speed, and memory issues with the current list design and implementation. I wasn’t expecting to be the person who led implementation work on this, but in the end it has fallen to me. Thankfully, a lot of the design thinking has already taken place, so it really is a case of focusing on the lower-level data structure design and implementation. Anyway, it takes a while to get from no code to something that’s ready for the wider community to have a conversation around, and so my GLR time in the week covered by this report started out with isolated contemplation and fleshing out code. Spoiler: I actually released it this Monday, and have continued evolving it since; you can find the latest in a Gist I’m keeping updated (though I suspect in the next days I’ll be moving over to working in a Rakudo branch).

Other fixes

I fixed quite a few other tickets too:

  • Fix RT #125675 (tighten up various signatures so we get bind failures, not .count/.arity dispatch failures)
  • Fix RT #125670 (rx{foo} as a parameter default caused compiler crash when it tried to do some static analysis)
  • Fix RT #125715 (problems using EXPORT-d type as a type constraint on an attribute)
  • Fix RT #125694 (private method calls in roles bound too tightly to role type)
  • Fix getting ugly low-level backtrace when sinking last statement in a program
  • Verify RT #125346 is fixed, write a test for it
  • Fix a MoarVM crash involving lexotic (return) handlers and a race condition in frame validation
  • Fix RT #125480 (program counter corruption due to bad interaction of LEAVE/return/closures)
  • Trying to hunt down a MoarVM parallel GC bug. Found one issue and patched it, but it’s seemingly not The One…

See you next week!


This week: too little sleep and too little Perl 6

This report covers the week starting the 20th July – which turned out to be the week I had to finish preparing course material and labs for one of Edument’s new courses. That managed to swallow most of the week, so only a handful of Perl 6 things got done. Happily, August is free of any teaching and authoring responsibilities, and will be dominated by Perl 6 work.

Digging into multi-dimensional arrays in Perl 6

I started a Rakudo branch to work on the multi-dimensional array support, and made various decisions about how things will work. I got some way into the changes to Array itself (and had it basically working provided you worked in terms of the *-POS API directly), and started to look at the ways the slicing implementation will need to change to pass multiple dimensions along (so you’ll actually be able to do the access with […] as you’d expect).

Banning confusion

I took on one of the oldest RTs in the queue, which wished for there to be a compile time error on:

say $*a; my $*a;

There’s long been one on:

say $a; my $a;

Basically, it catches use-before-declaration confusion (since the declaration happens at compile time). The second case is clear cut; the first is less so as it involves a dynamically scoped variable and so we’re naturally a bit looser about those. But after a little discussion, Larry said he’d like to outlaw the first just like the second if it wasn’t too hard to implement. I took a look at the code, decided it wasn’t too bad at all, and fixed it.
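
To make the check concrete, here is a minimal sketch of the idea, written in Python rather than Perl 6 so it can be run on its own. It is not how Rakudo implements it; the event format and the function name `find_use_before_declaration` are made up for illustration. It walks one scope in source order and flags any name that is declared after it has already been used.

```python
# Hypothetical sketch: the event format and names are illustrative only,
# not Rakudo's actual data structures.

def find_use_before_declaration(events):
    """Return names that are used in a scope before being declared in it.

    `events` is a list of ("use", name) or ("declare", name) tuples, in
    source order, for a single lexical scope.
    """
    used = set()        # names seen in a use before any declaration
    declared = set()    # names declared so far in this scope
    offenders = []
    for kind, name in events:
        if kind == "use":
            if name not in declared:
                used.add(name)
        elif kind == "declare":
            if name in used and name not in offenders:
                offenders.append(name)
            declared.add(name)
    return offenders


# say $*a; my $*a;   (use, then declare: flagged)
print(find_use_before_declaration([("use", "$*a"), ("declare", "$*a")]))
# my $a; say $a;     (declare, then use: fine)
print(find_use_before_declaration([("declare", "$a"), ("use", "$a")]))
```

The sketch treats dynamic and lexical names identically, which matches the outcome described above: both forms are now rejected.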

Unblocking the release

During the preparation for the 2015.07 release, somebody noticed a regression in reporting “return outside of routine” errors, that the tests had missed. I jumped in to get it fixed up, and added a better test so we don’t bust it again.

This week’s token regex engine patches

I fixed RT #125648 (no syntax error for /00:11:22/), as well as looking into RT #77524 (Rakudo treated /a:/ as legal syntax and STD did not; it turns out Rakudo was right on this one).

Other bits

  • Fix RT #125642 and RT #121308 (traits expecting types didn’t report bad type or make suggestions)
  • Analyze RT #125634 and get it down to a much smaller example of what’s wrong; no fix yet

This week: concurrency stuff, multi-dimensional stuff, stuff stuff…

Finally, I got a week of peaceful hacking time at home and not-too-bad health, and so Stuff Got Done. Here’s what.

Progress on multi-dimensional arrays

Last time, I’d gotten decent support into MoarVM for multi-dimensional arrays (including packing natively typed values into a single piece of memory), and had started on porting this functionality to the JVM. This week, I finished that JVM porting work. Doing something for the second time is often pretty good at showing up things you didn’t think through well enough the first time. In doing the port, I thought up a couple of tests I’d not written on MoarVM that would be explodey, or faily, or otherwise unhappy – and found one that I’d written that really didn’t make a lot of sense. So I wrote some extra tests, fixed things, and we now have MoarVM and JVM equipped with the guts needed to dig into implementing multi-dimensional arrays at the Perl 6 level.

Concurrency thinking and tooling

I did two things relating to Perl 6’s concurrency support this week, both of which are worth discussing a little.

For a while, I’ve known that our syntactic relief for concurrent programming needed some attention. For one, our await keyword – so far implemented as the simplest possible thing – does not offer the semantics that would make it genuinely powerful. If you await a Promise today, you block the waiting thread until it’s ready. Sure, it’s a kernel-supported, OS-scheduler-efficient blocking, not busy waiting, but it still swallows one of the thread pool’s threads – which are real OS threads. And that makes it impossible to have many thousands of start blocks “active” and awaiting something to happen in order to make progress. What we want is for an await in a start block scheduled on the thread pool to return control to the thread pool, so the OS thread can be used for something else. Then, once the Promise being waited on reaches some conclusion, the rest of the start block is scheduled for resumption. That was always my vision for it, but until now I never got around to defining the API through which that happens, let alone implementing it. This week I tackled the API design part of the job. I’ll work on the implementation in the coming weeks.
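
As a loose analogy only, Python’s asyncio shows the kind of behavior I’m after: an await gives control back to the scheduler instead of parking an OS thread, so thousands of tasks can be waiting at once. This is a sketch of the desired semantics, not of the Perl 6 API or its implementation (asyncio is single-threaded and cooperative, whereas the plan above involves a thread pool); the names `worker` and `main` are made up.

```python
import asyncio


async def worker(ready, results, n):
    # Awaiting suspends this task and hands control back to the event loop,
    # so no OS thread is blocked while we wait.
    await ready.wait()
    results.append(n)


async def main(count=10_000):
    ready = asyncio.Event()
    results = []
    tasks = [asyncio.create_task(worker(ready, results, n)) for n in range(count)]
    await asyncio.sleep(0)    # let every task run up to its await
    ready.set()               # the awaited condition is now satisfied
    await asyncio.gather(*tasks)
    return len(results)


print(asyncio.run(main()))
```

All ten thousand workers sit "active" at their await simultaneously, which is exactly what a blocking await on real OS threads makes impractical.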

Next in line were supplies. I like where we’ve gone with them so far, but working with them is very much an exercise in functional programming. It’s a bit like not having for loops and if statements, and having to write everything with map and grep. You can certainly do it, but plenty of normal people find code written that way harder to follow. Heck, some of the less-normal folks like me recognize that some solutions just read better when they’re more imperatively specified. I’ve been pondering this for a while. I really want the asynchronous aspects of Perl 6 to be accessible, and I really want people to be able to write operations that combine many asynchronous data sources – including time – without epic functional contortions. Having done my share of teaching Rx.Net, I’ve had plenty of chance to see people grapple with asynchronous data using an API a lot like we have for supplies. When there’s a nice built-in that does Just What You Want, it works out great. But sometimes there’s not, and you have to get creative, and then the result tends to feel clever rather than clear.

So, I also proposed some syntactic relief for working with supplies. It comes in two parts: supply { … } blocks for constructing supplies, and an asynchronous looping construct called whenever that works with it. So far, feedback has been positive; Larry said it looked sane, and other responses ranged from approving up to excited. So, it’s looking promising.

I didn’t update S17 yet, but rather wrote my proposals in a gist, which has all of the details.

The second big thing I did this week was work on a MoarVM bytecode instrumentation that can identify when one thread writes to an object that was created by another thread while not holding a lock. While there are of course patterns where you can legitimately do such a thing, they are the exception rather than the norm – and so having a tool that tells you when such things happen can help identify bugs. I wrote it to help me get a better insight into some of the threading bugs we have in the RT queue. It’s turned on by setting an environment variable, and it instruments bytecode (using the same approach the profiler does) to detect and record such cross-thread writes. It was also not a lot of code to implement, which I guess is good news on MoarVM’s architecture. And yes, there are sophisticated data race detection algorithms out there, but they’d all take a good bit more work to get in place (maybe some day in the future, I’ll take this on). For now, this first, simplistic, approach should help us hunt down a bunch of issues.
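
The rule being checked is simple enough to sketch. The following Python is an illustration of the rule only, under my own assumptions: it hooks attribute writes rather than instrumenting bytecode, and `Tracked`, `TrackedLock` and `cross_thread_writes` are hypothetical names, not anything in MoarVM.

```python
import threading

cross_thread_writes = []          # recorded (attribute, owner, writer) tuples
_log_guard = threading.Lock()     # protects the log itself
_locks_held = threading.local()   # per-thread count of tracked locks held


class TrackedLock:
    """A lock wrapper that lets the checker know the thread holds a lock."""

    def __init__(self):
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        _locks_held.count = getattr(_locks_held, "count", 0) + 1
        return self

    def __exit__(self, *exc):
        _locks_held.count -= 1
        self._lock.release()
        return False


class Tracked:
    """An object that remembers its creating thread and checks every write."""

    def __init__(self):
        object.__setattr__(self, "_owner", threading.get_ident())

    def __setattr__(self, name, value):
        writer = threading.get_ident()
        holding_lock = getattr(_locks_held, "count", 0) > 0
        if writer != self._owner and not holding_lock:
            with _log_guard:
                cross_thread_writes.append((name, self._owner, writer))
        object.__setattr__(self, name, value)


obj = Tracked()                   # created on the main thread
lock = TrackedLock()


def unlocked_write():
    obj.x = 1                     # foreign thread, no lock: recorded


def locked_write():
    with lock:
        obj.y = 2                 # foreign thread, lock held: not recorded


for target in (unlocked_write, locked_write):
    t = threading.Thread(target=target)
    t.start()
    t.join()

print([name for name, _, _ in cross_thread_writes])
```

Like the real tool, this reports suspicious writes rather than proven races; a legitimate lock-free pattern would show up in the log too.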

Meanwhile, in regex land…

I was active in the regex engine again.

  • Fix RT #125608 (Longest Token Matching did not factor in the first branch of a sequential alternation)
  • Verify RT #125391 (order of zero-width delimiters in .caps) already fixed and write a test for it
  • Fix RT #117955 (quantified captures only captured last items when used in a conjunction)
  • Investigate ways to deal with RT #67128 (calling another grammar); discussion, prototype a fix, find it needs lang design input

Better errors

This week had its share of improved failure modes and better feedback, to enhance Perl 6 user experience.

  • Fix RT #125595 (improve error reporting on bad loop specification, in line with STD)
  • Fix RT #125600 (good error message for running a directory, plus make sure we report such issues on STDERR)
  • Fix RT #115398 and RT #115400 (give good error with location info on trying to parameterize a non-parametric type)
  • Fix RT #125591 (failed to detect various mis-uses of $.x and $!x in signatures at compile time)
  • Fix RT #125625 (misleading/malformed error for useless accessor method generation with my $.a and our $.a)
  • Fix RT #125620 (gist method on custom exceptions with no message method would crash)

Other assorted bits

As usual, there are a few other small things I did that are worth a quick mention:

  • Fixing MSVC MoarVM build after an otherwise-good patch busted it
  • Fix a Proc::Async test file for Windows and add it to those we run
  • Fix RT #124121 (using “but” for role mixins with Array literal did the wrong thing, plus bad behavior with Parcel)
  • Implement does trait on variables (resolves RT #124747)

This week: less than hoped, but still good stuff

This week actually means “the week starting 6th July”, which is around the time much of Europe was being unreasonably hot. I spent the week in lovely Kyiv with my wife – where the weather was, predictably, also hot. I’d hoped for a nice mix of hacking, sight-seeing, and nice food. Well, I got a decent amount of nice food. Unfortunately, I also had a pretty bad time with the pesky hayfever thing that’s been bothering me this year (mostly due to bad sleep and hot weather), got frustrated enough that I decided to try a different anti-allergy medication, reacted less than awesomely to it, and generally spent plenty of time feeling crap. Happily, things are improving a good bit now that I’m back home, and ahead of me are five weeks with only 3 nights that I need to be away from home. After two and a half months in which I was never in the same place for much more than a week, you can’t imagine how happy I am about that – I can only imagine it’ll do wonders for my productivity, and I’m hopeful for my health too. Anyway, let’s take a look through the few bits I did manage to get done.

Multi-dimensional array progress

In the last report, I mentioned I’d got most of the way with the new MultiDimArray representation I was implementing in MoarVM, along with various new ops. This week I got the last loose ends tied up, adding support for cloning, serialization, and deserialization.

With that done, it was time to move on to porting the work over to the JVM. I stubbed in all the new op mappings, so the NQP test file I established while working on the MoarVM implementation would compile on the JVM backend also (the strategy of writing tests at NQP level to exercise backend-level stuff continues to serve us very well). Then I set about working to make them pass. I got up to 71 out of 188 tests passing, which is a decent start – especially given there’s various bits of setup work to do early on.

&?BLOCK and &?ROUTINE

The &?BLOCK symbol refers to the current block of code we’re in (which amongst other things provides a way to write recursive, yet anonymous, blocks). &?ROUTINE is the same, but for the current enclosing routine (sub, method, token, etc.). We’ve had &?ROUTINE for a while, but not &?BLOCK. I set out to implement it, and noted that one had to be careful that it refers to the current closure. Glancing at &?ROUTINE, I noticed it didn’t take sufficient care over closure semantics, and soon had a failing test case exposing the issue. So, I fixed the &?ROUTINE bug, wrote tests for &?BLOCK, and got that in place too. So, one missing feature added and one potential nasty bug down.

when/default semantics

All the way back in April, I tried to deal with RT #71368, which noted that our when/default semantics were out of line with the design in S04. Trouble is, when I tried to bring us in line with them, I found that I would break a bunch of folks’ code. Of note, this pattern would not be allowed:

sub foo($n) {
    $_ = bar($n);
    # use when/default here against $_
}

That is, setting $_ – which you get fresh per block anyway – for the sake of using when/default. Nobody really felt this should be outlawed. I agreed and put aside my changes.

This week I finally got around to returning to the issue to try and bring it to some kind of conclusion. The result was a commit to the design docs to bring them in line with the semantics that folks seem to prefer (which was a nice simplification also), along with adding some more tests to give us better coverage.

One more regex engine bug down

A while back, we got dynamic quantifiers, so you can use a variable (or any expression, really) to decide how many times to match something:

/'x' ** {$n}/

RT #125521 pointed out an icky bug that showed up when you tried to mix this feature with captures. Thankfully, this turned out to be one of the easier kinds of regex engine bugs to figure out: some capture-related code paths simply hadn’t been updated to understand dynamic quantifiers.

And the usual other little bits

Here are a few other assorted small things that I dealt with:

  • Fix RT #125537 (type variable resolution failed to look in outer scopes) and add test
  • Fix RT #124940 (for type variable T, my T $x = … could fail to assign)
  • Review test mentioned in RT #125003, correct it, resolve ticket.
  • Fix RT #125574 (missing error on too-late application of ‘is repr(…)’ trait)
  • Fix RT #125513 (could auto-gen a %_ when there already was one in some unusual cases)
  • Review RT #80694, observe weird .^can behavior is gone, add a test for that, suggest a good solution to problem and test it too

Stay tuned for next week’s report, which already has as much to talk about as this one – and we’re only half way through the week!


This week: digging into multi-dimensional arrays – and plenty more

This report covers what I got up to during the closing days of June and the opening days of July.

Multi-dimensional array support in MoarVM

I’ve been pondering how to approach the multi-dimensional array aspects of the S09 Perl 6 design document for a while. I started out the implementation phase by taking another pass through the document, with an eye for things that were likely to hurt, or simply that did not fit with the current Perl 6 language as we have it today. That resulted in a gist that I tossed in Larry’s direction. Thankfully, nothing in that list was a huge blocker for getting the majority of the work done. With the top-down bit out of the way, it was time to move on to the bottom-up. MoarVM’s opset wanted a few additions, the representation API wanted a few extensions, and a new representation (named MultiDimArray) was needed. To recap, a representation is a memory allocation, layout, and access strategy, and is one half of what makes up the Perl 6 notion of “type” (the other half being the meta-object, which cares for all the high-level, VM-independent bits like method dispatch and type relations). Gradually, I test-drove my way through implementing the new multi-dimensional APIs on the existing 1D dynamic array representation, then started to flesh out the new multi-dimensional representation. By the end of the week, I had the majority of it in place. The new representation can happily, for example, store a 10x10x10 array of 8-bit integers in a single 1000-byte blob. To come is filling out a few more missing pieces, the JVM port of these new guts (thankfully with a nice set of tests to guide the way), and then onwards and upwards to making use of it all at Rakudo level, so we can have multi-dimensional array support in Perl 6.
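
To give a feel for what “packed into a single blob” means, here is a small Python sketch of the general technique: one flat buffer plus an index calculation. It is not MoarVM’s MultiDimArray code. The class name `PackedInt8Array`, the row-major ordering, and the choice of signed values are all my assumptions for the illustration.

```python
class PackedInt8Array:
    """Fixed-shape array of signed 8-bit integers stored in one flat buffer."""

    def __init__(self, *dims):
        self.dims = dims
        size = 1
        for d in dims:
            size *= d
        self.storage = bytearray(size)      # one byte per element

    def _offset(self, indices):
        if len(indices) != len(self.dims):
            raise IndexError("wrong number of dimensions")
        offset = 0
        for index, dim in zip(indices, self.dims):
            if not 0 <= index < dim:
                raise IndexError("index out of range")
            offset = offset * dim + index   # row-major flattening
        return offset

    def __getitem__(self, indices):
        raw = self.storage[self._offset(indices)]
        return raw - 256 if raw > 127 else raw   # reinterpret byte as signed

    def __setitem__(self, indices, value):
        if not -128 <= value <= 127:
            raise ValueError("value does not fit in 8 bits")
        self.storage[self._offset(indices)] = value & 0xFF


cube = PackedInt8Array(10, 10, 10)
cube[3, 4, 5] = -7
print(len(cube.storage), cube[3, 4, 5])   # 1000 -7
```

There are no per-element objects and no nested arrays: the element at (3, 4, 5) simply lives at byte 345 of the buffer.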

Another pre-compilation bug nailed

One of the most annoying kinds of bugs people run into are pre-compilation bugs: ones where your modules work fine, but a version of the module pre-compiled to bytecode breaks in some way. While that’s not always the compiler’s fault (for example, if you monkey-patch recklessly, or meta-program carelessly), most of the time it is. This week I hunted down a bug involving variables typed with subset types running into pre-compilation issues. Thankfully, it wasn’t overly difficult to fix once I worked out what was going on – but most happily, it was also the pre-compilation bug afflicting Text::CSV. It now works just fine pre-compiled.

A few tasks down in the regex engine

For a while, there’s been some debate about the failure semantics of the goal-matching syntax. That is, should:

/'(' ~ ')' \d+/

Backtrack on not finding the closing parenthesis, as if you’d just written:

/'(' \d+ ')'/

Or should it throw an exception? Now, all of the uses of this construct in the Perl 6 grammar want the exception semantics. So, that’s the behavior we’ve had. However, it was argued (on a few occasions over the years) that this was not a desirable behavior for using the construct in normal regexes. My argument was always, “so just write it the other way” – but after enough tickets on the issue it was time for a review. Patrick Michaud wrote up a possible way forward a while ago, and this week I ran that by Larry, who agreed we’d change things. So, I set about putting the change into effect. Here, the design of the goal matching error mechanism came in handy. Actually, the syntax:

/'(' ~ ')' \d+/

Desugars to:

/'(' \d+ [ ')' || <.FAILGOAL(')')>]/

And the FAILGOAL method threw the exception. So, the behavior change simply meant adding:

token FAILGOAL($missing) { <!> }

The compiler toolchain already overrode FAILGOAL to throw a more helpful exception, so things continued to work for the Perl 6 grammar’s own needs. The only folks left in the middle are those using the goal matching syntax who wanted the exception. Thankfully, that’s easy to get back in your own grammars:

method FAILGOAL($missing) {
    die "Oh noes, I needed a '$missing'";
}
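
The shape of that design (a hook that fails quietly by default and that a subclass can override to complain loudly) can be sketched outside of grammars too. The Python below is an illustration only, with hypothetical names (`Grammar`, `StrictGrammar`, `fail_goal`), and it handles just the one pattern from the example above.

```python
import re


class Grammar:
    """Tiny matcher for the pattern  '(' ~ ')' \\d+  (illustrative, not NQP)."""

    def fail_goal(self, missing):
        # Default: simply fail to match, so the caller can backtrack.
        return None

    def match_parenthesized_digits(self, text):
        m = re.match(r"\((\d+)", text)
        if m is None:
            return None
        if text[m.end():m.end() + 1] == ")":
            return m.group(1)
        return self.fail_goal(")")      # the closing goal was not found


class StrictGrammar(Grammar):
    def fail_goal(self, missing):
        # Opt back in to the old behavior: a missing goal is an error.
        raise SyntaxError(f"Oh noes, I needed a '{missing}'")


print(Grammar().match_parenthesized_digits("(42)"))   # 42
print(Grammar().match_parenthesized_digits("(42"))    # None
```

Calling `StrictGrammar().match_parenthesized_digits("(42")` raises instead of returning `None`, which mirrors overriding FAILGOAL in your own grammar.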

I also fixed another obscure, but potentially infuriating bug involving a mis-guided optimization. I’ll just reference RT #72440 and the patch that fixed it.

More failure mode improvements

Here’s a selection of things I did related to improving error reporting, to improve overall user experience:

  • Fix RT #125120 (bad error reporting if you declared a type X then made a syntax error)
  • Fix RT #108462 (missing redeclaration checks)
  • Fix RT #125335 (lack of escaping in error message about illegal numification)
  • Fix RT #125227 (trait warnings pointed to useless internals line, not relevant position in the source code)
  • Implement RT #112922 (catch impossible default values on parameters at compile time)
  • Add test for and resolve RT #123897 (bad error reporting, improved by implementing RT #112922)

And finally…

There’s the usual collection of things not worth a headline mention, but that are gladly dealt with.

  • Fix RT #125505 (getting Capture elements stripped away Scalar containers)
  • Working on RT #125110 (leading combining characters mis-handled in Str.perl) and unfudge tests, plus further fixes to combining chars and Str.perl
  • Fix RT #125509 (=== didn’t work on Complex), plus a few other issues observed with ===
  • Add test coverage for RT #115868, plus improve the two errors that are produced
  • Fix RT #116102 (ENTER phaser did not work as an r-value)
  • Add test coverage for RT #125483 (the “;;” syntax actually influences multi-dispatch when placed prior to the first parameter)
  • Update test now “my ${a}” is parsed as legal Perl 6 (allows anonymous hash with key type declarations)

Grant status update

Over the last 3 months, I’ve been working on my Perl 6 Development Fund grant. Those of you following the blog will have seen plenty of posts in that time about what I’ve been up to. This post, 3 months in, is more administrative than technical, but it contains some nice statistics. (And there’ll be another of the regular weekly-ish reports coming in the next few days!)

Time worked

I have worked a total of 165 hours and 38 minutes. This means there are around 84 hours worth of funding remaining on the grant. I expect to deliver nearly all of these hours by the end of July.

April was by far the most productive month (75 hours). May was the month I got married, and I took time from all of my work for that, so worked only 41 hours on the grant. June was a little better (50 hours) but still around 20 hours short of what I’d hoped, largely due to poor health at the start of the month.

Major achievements

The biggest achievement of the work so far is the implementation of NFG (Normal Form Grapheme) in Rakudo on MoarVM. Along the way, I also implemented the Uni class, which provides all of the Unicode normalization forms. The last two monthly releases of Rakudo, made in late May and late June, have had strings working at grapheme level. By now, there are no known outstanding issues in RT relating to NFG support.

Much progress has been made on improving the stability of our concurrent programming features, with various reports from users of improvement. Issues certainly remain, but a number of the most serious bugs have now been addressed.

Startup time is now decidedly lower than at the start of the grant. Measuring informally on my development machine against Perl 5, Rakudo now starts up in just 40% of the time taken to load Perl 5 with Moose, and 160% of the time taken to load Perl 5 with Moo. I’d like to note that I’m certainly not the only person to thank for these startup time improvements; some have come from other Rakudo and MoarVM contributors. And, of course, there’s room for improvement yet!

Both the startup improvements and other work have also helped to lower the base memory footprint, though we certainly have some work to go in this area. Private memory consumed by Rakudo Perl 6 on MoarVM running the simple infinite loop program is still twice that of Perl 5 with Moose loaded, for example. On the other hand, this is just half the memory we once swallowed. One notable improvement I worked on is getting hashes to use a good bit less memory.

RT tickets

I’ve been addressing bugs and missing features noted in the RT queue. 85 unique RT tickets are mentioned in my work log, almost all of which had an outcome of being resolved by the work I did. They were all over the map: semantic wrongness, bad error reporting, crashes, unimplemented things, and so forth.

Further funding needed

There’s enough funding “in the pot” for me to continue my work through July. In that time, I plan to deliver multi-dimensional arrays (including the native packed variety), further address concurrency issues, and resolve more RT issues.

I’d very much like to continue this work for the rest of the year, as we approach the Christmas release. Any potential donors may like to read more about the Perl 6 Core Development Fund.


This week: Unicode 8, loads of fixes, preparing for shaped arrays

It’s been another week of getting lots of small things done – but also gearing up to working on various array related things, including fixed size and shaped arrays. I expect to have some progress to report on that next week; in the meantime, here’s what I got up to this week.

Update to Unicode 8.0

I got MoarVM’s Unicode database upgraded to the recently released Unicode 8. Part of the work was reading the changes list to see if there were any code changes needed. After establishing that there were not, the actual upgrade was very easy, since we have a script that takes the Unicode database, extracts the bits we need, and packs it into various compact data structures. The final step was looking into 3 test failures in one of the specification test files for Perl 6. It turned out the tests in question were written in a way that made them vulnerable to Unicode adding additional ideographs; I fixed the tests so we should not run into this problem with them in future Unicode upgrades. And with that, Rakudo on MoarVM has Unicode 8 support. Make the SIGN OF THE HORNS, and grab a HOTDOG.

Improving error reporting

We work pretty hard to report the programmer’s mistakes in a good and understandable way. Of course, sometimes we fall short, and people file tickets to let us know. I’m keen to fix various of these; many of them are not that hard to fix, but can make quite a big difference to user experience. Here are the ones I’ve just recently fixed:

  • Fix RT #125228 (bad error reporting when ‘is’ trait argument referenced undeclared symbol)
  • Fix RT #125259 (bad error reporting when parameterizing a type in an illegal way)
  • Fix RT #125427 (Rakudo silently accepted trying to overload special-compiler-form operators, which will never work)
  • Fix RT #125339 (no location information for unhandled control exceptions)
  • Fix RT #125441 (bad error reporting when trying to declare a class with the same name as an existing enum element)

Parsing issues

For a while, there’s been a tricky issue in RT regarding parsing of parameters with both where clauses and defaults:

sub bar($percent where { 1 <= $_ <= 100 } = 100) {
    ...
}

This did actually parse. Unfortunately, it parsed as something like:

sub bar($percent where ({ 1 <= $_ <= 100 } = 100)) {
    ...
}

That is, trying to assign 100 to a block. This, of course, explodes when we try to evaluate the constraint. Clearly it was a precedence issue – but oddly, the precedence limit set on parsing the constraint matched what the standard grammar did exactly. I dug a bit deeper – and found a discrepancy in how we interpreted the precedence limit (exclusive vs. inclusive). That was a one line change – but fixing it caused quite a lot of fallout in the specification tests. In all, three or four separate issues had cropped up.

One of them was easy and rewarding: I removed a hack that was working around the precedence limit handling bug. That left the remaining ones, involving adverbs getting attached to the wrong bits of the AST and chained assignments parsing wrongly. Again, I looked for a place where we were out of line with the standard grammar – and failed. I couldn’t see how STD could come out with a different result than Rakudo was. Of course, since STD only parses, we could easily have not noticed it was broken. I didn’t have a build of STD to hand; thankfully, someone on channel did and could quickly paste me the ASTs it produced. And…they were wrong. So, I’d uncovered a bug in the standard grammar that had gone without being noticed for years. Anyway, I worked out some kind of solution, and left Larry to think up a neater one if possible. Since then he’s given me another suggestion, which I’ll try out. Anyway, the bug is gone.

The others were thankfully simpler. In one case, a program that ended in an invocant colon, for indirect object syntax, would report a syntax error; this was just a case of failing to look for end of source code as a valid thing to have after it.

Finally, I looked into some oddities around item and list precedence analysis for assignment. This one also ended up with me discovering a discrepancy with the standard grammar. However, even after fixing that, others found the rules still a little surprising. I looked deeper, and found a good way to explain the current semantics. Later, Larry read the discussion and committed a further tweak. So, we’re ahead of the standard grammar in another place now.

Down in MoarVM

I spent some time on various fixes down at the VM level. I jumped on a couple of segmentation faults (RT #125376), which I like to get on top of since they can, in the worst case, be security issues. I also reviewed a set of patches resulting from running the clang static analyzer over the codebase, and also looked through some tickets and pull requests.

The most significant work, however, was adding a “free at next safepoint” mechanism to the fixed size allocator. A common problem in concurrent systems, when you have memory not managed by the GC, is making sure that when you free it, no other thread can possibly be reading it. For things the garbage collector manages, this isn’t an issue; it already is considering every live reference to an object when it does GC no matter what thread has the reference. But in some places, we do want to employ concurrent algorithms, but not have the GC manage the memory. Generally, these are situations where we’re producing a new “version” of the data structure, putting it in place, and freeing the old one. It’s fine if another thread obtained the previous version and is looking at it – but that won’t end well if another thread frees it right away.

Thankfully, there’s an easy solution also inspired by how the GC works: add the memory to a list of things that we should free at the next “safe point”. A safe point is one where we know the state of each thread is such that it could not possibly still be in code looking at the memory in question. It turns out GC runs are natural safe points, so for now it just postpones the freeing until the next GC run. However, the API will let us consider some smarter option in the future if needed.
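
In outline, the mechanism looks something like the following Python sketch. This is a toy model under my own assumptions, not MoarVM’s fixed size allocator: the class and method names are hypothetical, blocks are just integers, and a real version would also need to make the pending list safe to append to from several threads.

```python
class FixedSizeAllocator:
    """Toy allocator that can defer frees until a safe point."""

    def __init__(self):
        self.live = set()          # blocks currently allocated
        self.pending_free = []     # blocks to release at the next safe point
        self._next_id = 0

    def allocate(self):
        block = self._next_id
        self._next_id += 1
        self.live.add(block)
        return block

    def free_at_safepoint(self, block):
        # Do not release now: another thread may still be reading the block.
        self.pending_free.append(block)

    def safepoint(self):
        # Called when no thread can still be in code reading old blocks
        # (for now, that means during a GC run).
        for block in self.pending_free:
            self.live.discard(block)
        self.pending_free.clear()


alloc = FixedSizeAllocator()
old = alloc.allocate()
new = alloc.allocate()            # the new "version" of the data structure
alloc.free_at_safepoint(old)      # readers still holding `old` stay safe
print(old in alloc.live)          # True: not freed yet
alloc.safepoint()
print(old in alloc.live)          # False: released at the safe point
```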

I immediately used this for the NFG synthetics table (a concurrent data structure that can safely be read from without ever taking a lock, and locking is only used to establish an ordering on additions). I’ll also use it to fix an issue with memory safety of dynamic arrays.

Too many spectests for Windows

Recently, “make spectest” stopped working on Windows. I suspected somebody had done some kind of platform unaware change to the testing related bits – but a quick glance at the git log ruled out that hypothesis. Finally, I figured it out: on Windows, there is a 16-bit string length field on the process launch data structure, so you can’t send along a command line longer than 32KB. We were invoking the fudge tool with all of the spectests, and had reached the number of tests where we had too long a command line for Windows. Thankfully, I could get it to fudge in batches.
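
The batching itself is nothing fancy. Here is a Python sketch of the idea, not the actual change to the spectest harness; the command name `fudge-tool`, the file names, and the 30,000-character budget are placeholders I picked for the example.

```python
def batch_arguments(base_command, arguments, max_length):
    """Split `arguments` into batches so each command line fits `max_length`.

    Length is estimated as the characters of each word plus one separating
    space; quoting overhead is ignored in this sketch. An argument that is
    too long on its own still gets a batch to itself.
    """
    batches = []
    current = []
    current_length = len(base_command)
    for arg in arguments:
        needed = len(arg) + 1
        if current and current_length + needed > max_length:
            batches.append(current)
            current = []
            current_length = len(base_command)
        current.append(arg)
        current_length += needed
    if current:
        batches.append(current)
    return batches


# Placeholder inputs: 5000 made-up test file names.
files = [f"t/spec/test-{n:04d}.t" for n in range(5000)]
batches = batch_arguments("fudge-tool", files, max_length=30_000)
for batch in batches:
    command_line = " ".join(["fudge-tool"] + batch)
    assert len(command_line) <= 30_000
print(len(batches), "batches")
```

Each batch would then be passed to a separate invocation of the tool, with the results concatenated in order.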

Loads of other small fixes

I also chewed my way through a number of other tickets, and fixed up a few more things that I spotted along the way.

  • Verify RT #125365 fixed and add test coverage
  • Fix RT #109322 (bare blocks didn’t take a closure properly, causing reentrancy issues)
  • Review and remove/update tests mentioned as needing review in RT #125016 and RT #125018
  • Fix RT #125015, which complains about HOW(…) sub not existing
  • Fix RT #113950 (optimizer could sometimes cause LEAVE phaser to not execute)
  • Fix an issue where submethods with a role mixed in would not be considered submethods; add a test case
  • Fix RT #125445 (.can and .^can broken on enum values)
  • Fix RT #125402 (cannot assign non-Str to substr-rw; fixed by making it DWIM and coerce)
  • Fix RT #125455 (variable phasers applied to variables directly in package/class bodies didn’t work properly)