A few quick updates

I’ve been rather snowed under with work in November, thus the absence of posts here. Sorry about that. It turns out the sales folks at $dayjob did their job a little too well, and besides my own heavy teaching load I had to step in to rescue a few other things. Anyway, that’s done now, and I’m back to having a bit more time. (It’s not that I had absolutely no time at all since the last post. It was just more important to spend the bits I had clearing blockers in the way of others rather than writing blog posts. :-)) So, some quick bits of news.

Concurrency

At the Austrian Perl Workshop, I worked with lizmat on getting the Perl 6 concurrency spec aligned with the things I’d been working on for Rakudo on JVM. Happily, this had some good consequences. Firstly, Larry did a load of work on it, vastly improving naming of things and giving some syntactic relief in various areas where I’d just done simple function and method APIs. Secondly, both lizmat and timotimo dug in to bring the implementation in line with these spec changes, doing some other improvements to boot. So, now the bus number on the Rakudo concurrency support has increased. You can find the latest spec here. Also, you can see my Nordic Perl Workshop slides about it.

Rakudo on MoarVM progress

Last time I wrote here, we had NQP support for MoarVM pretty much complete, and were ready to dig into working on getting Rakudo running on MoarVM. The first step was to get the core of the compiler itself building. This wasn’t a terrible amount of work, since the majority of it is just NQP code. There’s more of it than is found in the NQP compiler, but that aside it doesn’t do too much that’s new. Next came the Perl 6 MOP, which is written in NQP. Things went very smoothly with that. Beyond there, things got more interesting.

The next big piece to make work was the BOOTSTRAP. This uses the MOP to start piecing together the key types at the heart of Perl 6, doing enough so we can write the rest of the built-ins in Perl 6 itself. Most of it is one huge BEGIN block. Here there were various unimplemented things, plus some of the VM-specific bits of Rakudo needed porting. And beyond that lay…the setting. When I did the JVM port, we had around 14,000 lines of built-ins there. These days it’s closer to 17,000 – and that’s excluding the concurrency stuff. Since compiling the setting actually involves running little bits of Perl 6 code, thanks to traits and BEGIN blocks, getting through it means making quite a lot of things work. We got there a week or so back.

That still didn’t give us “Hello, world”, however. The setting does various bits of initialization work as it loads, which naturally hit more things that didn’t yet work. Finally, yesterday, we reached the point where setting loading worked and we could do “Hello, world”. Actually, once the setting loaded, this worked right off.

Of course, there’s still lots to do from here. The next step will be to work on the sanity tests. There’s a couple of large and complex features that will need some porting work; of note, gather/take will need doing. Also there are a couple of stability and algorithmic things that need taking care of in MoarVM itself. Building CORE.setting is easily the biggest workout it’s had so far, and naturally it highlighted a couple of issues. And, of course, beyond the sanity tests like the spectests…

Better use of invokedynamic, and other optimization

I’ve got a talk at Build Stuff next week on invokedynamic, the instruction added to the JVM in JDK7 to support those implementing languages of a more dynamic nature So, in the last week I spent some time tweaking our usage of it to get some further wins in various areas, and to make sure I was nicely familiar with it again in time for my talk. That work got merged into the mainline development branches today. I did some other optimizations along the way that are a win for all backends, too; lizmat noted a 12.5% decrease in spectest time. Not Bad.

Posted in Uncategorized | 2 Comments

NQP gets MoarVM support, cursor reduction, and other news

I thought it was high time for a round-up of news about the Perl 6 related stuff that I’ve been working on – as usual, with the help of many others. So, here goes!

MoarVM Backend in NQP master

In the last progress update, I mentioned that MoarVM could host NQP and pass most of the NQP language test suite. We got to that point by using a cross-compiler running on Parrot to compile the NQP sources into MoarVM bytecode. Fast forward a month, and we’ve reached the point of having NQP bootstrapped on MoarVM. What’s the difference? Simply, that you can use NQP running on MoarVM in order to build an NQP from source. This means that both bytecode generation and serialization are now working. Since some of the source files we’re compiling are also fairly large (4000 lines), it’s also been a fairly good hardening exercise; many bugs have been fixed.

The MoarVM support has now been added to the NQP repository. Just as you only need a JVM to build NQP for the JVM, you also only need MoarVM to build an NQP for MoarVM. That is, it’s bootstrapped and can stand alone now. NQP monthly releases from here on will thus come with support for three backends: Parrot, JVM and MoarVM.

Better still, NQP on MoarVM now passes the full NQP test suite. It also does so faster than any other backend. This doesn’t mean MoarVM is faster for all things; if you write NQP code with a tight integer loop, or something long running, the JVM’s JIT will kick in and typically come out ahead after the slow start. MoarVM just gets moving a lot faster than the JVM, which is what matters for running lots of short-lived tests.

This means we’re now ready to start work on getting Rakudo running on MoarVM. I don’t have any predictions just yet on precisely when we might land this; the rate of progress in the next couple of weeks, as we dig into it, will provide an indication. Reaching the point of being able to bootstrap NQP on MoarVM is a significant achievement along the way, though. While NQP is both smaller and simpler than full Perl 6, it still requires being able to execute Perl 6 grammars, eval code, do a whole range of OO-related things (including classes, roles and meta-programming), perform multiple dispatch, handle BEGIN time, and so on. These are all things that a full Perl 6 needs, so it’s very good to have them in place.

Allocating Less Cursors

Allocation profiling work by hoelzro and timotimo (I hope I remembered right; correct me if not!) indicated that both NQP and Rakudo were allocating a huge number of Cursor objects. So what are these? They are objects that keep track of parsing state. They are created as we enter a rule or token in the grammar, thrown away if it fails, or typically placed into a tree of cursors if it passes (though sometimes we only care for pass/fail and so quickly throw it away again). Naturally, we’re going to allocate quite a few of these, but 2.14 million of them being allocated while parsing Rakudo’s CORE.setting seemed decidedly excessive.

I decided to spend some time trying to understand where they all came from. NQP itself tends to be a bit easier to analyze than Rakudo, so I started there, and added some instrumentation to record the number of times each production rule or token was hit and led to allocation of a Cursor. I then used this to gather statistics on a fairly large source file from the NQP build. It started out at 284,742 Cursor allocations.

The first big win from this came when I realized that a huge number of these Cursors were just allocated and returned, to indicate failure. A Cursor is born in a failed state, and many places were just creating them and returning them without any further work, to say “I failed to match”. Thing is, failed Cursor objects all look about the same. The only thing they differ by is type, and for each Cursor type we have a ParseShared instance. Thus, it wasn’t too hard to create a shared “fail Cursor” and just look it up in a bunch of places, rather than allocating. That shaved off over 50,000 allocations. A related realization about improving MARKED and MARKER led to another 30,000 or so allocations chopped. Note that none of this really leads to overall reduced memory usage; all of these objects were quickly collectable. But allocation makes GC run more often, and thus carries some amount of runtime cost.

Further examination of the statistics showed hotspots that seemed unreasonable. The problem wasn’t that they allocated a Cursor needlessly, but rather than we should never have been calling the grammar rule in question so many times. This led to a number of grammar optimizations. In one case, just making sure that a certain rule got a declarative prefix meant the LTM mechanism could very quickly rule it out, saving 15,000 calls to the production rule, and thus cursors. The other big discovery was that the ident built-in was not having an NFA properly generated for it. This is notable because ident features in various paths in the grammar, and thus meant we were failing to quickly rule out a lot of impossible paths through the grammar using the NFA (which is the cheap way to rule things out). With a few other tweaks, all told, we were down to 172,424 Cursor allocations, just 60% of what we started out with.

The statistics for Rakudo’s CORE.setting showed we started out doing 2,196,370 Cursor allocations. Some of the fixes above also aided Rakudo directly, making a good cut into this number. However, analysis of the statistics revealed there were more wins to be had. So far, we managed to bring it down to 1,231,085 – a huge reduction. There’s likely a lot more wins to be had here, but I think we’ve picked all of the large, low-hanging fruit by now.

Rakudo Debugger on JVM

I spent some time getting the Rakudo Debugger to also work on the JVM. I’ve still got some work to do on making it easily buildable, and so it can be installed with Panda on the JVM like it can on Parrot. But the rest of the work is done, and I was able to step through programs, set breakpoints, examine variables and so forth. So, that’s another of the “things missing on the JVM port” mostly gone.

Rakudo on JVM spectest

I also did a little work to fix more of the spectests that pass on Rakudo Parrot, but fail on Rakudo JVM. The result: 99.9% of the spectests that pass on Rakudo Parrot also pass on Rakudo JVM. To put an exact number on it, that’s 27 tests different. Around half are Unicode related, which isn’t so surprising since we’ve borrowed Java strings for the time being; in the end, we’ll need to do an NFG implementation on the JVM. The others still need to be hunted down. I think it’s fair to say that you’re rather unlikely to run into this 0.1% in day-to-day usage of Rakudo on the JVM, however. In fact, the thing you’re most likely to miss today is NativeCall – which arnsholt has been working on, and I plan to join in with during the coming days.

All in all, things are coming along very nicely with Rakudo’s JVM support. It was just 11 months ago – not even a year – that I started to build a little bit of infrastructure so we’d be able to create a tree-like data structure in NQP and get JVM bytecode made from it. By now, thanks to the contributions of many besides me, Rakudo runs on the JVM and is very capable there. When it comes to concurrency, it’s the most capable Rakudo you can get. And, once it gets past the slow startup, it’s usually the fastest. It’ll be interesting to see where we stand on these things in another six months or a year’s time. Here’s hoping for another productive time ahead!

Posted in Uncategorized | 2 Comments

Material from the Rakudo and NQP Internals course

A little over a month ago, lizmat++ contacted my employer, Edument AB, to discuss using our training services to deliver a 2-day workshop aimed at current and potential Rakudo and NQP contributors. For those of you who don’t know, aside from working on Perl 6 I also work teaching/mentoring on various topics (typically software architecture, TDD, and advanced C# programming). The goal was for me to spend a couple of days explaining a bunch of topics – including NQP, grammars, QAST, 6model, and bounded serialization – to help people know where to start or to help them get into unfamiliar areas of the code.

The workshop took place this last weekend in Frankfurt. I’d expected we might have 5-6 people sign up; in the end, we had around 15 people attending! The workshop involved a combination of teaching and exercises, with plenty of chances to ask questions. Happily, there were tasty lunches and dinners too (though I can’t take any of the credit for this side of things). I greatly enjoyed teaching the course; I like both working on Perl 6 and doing my teaching work at Edument, and being able to do both at once was delightful! :-)

One aim of developing the course was to help with the state of documentation of the NQP and Rakudo internals. Typically at Edument, we only give out our material to those attending a delivery of our courses, and even then not the original source! However, for this course, we’ve made an exception and released the course material under a Creative Commons license. So, while I hope that we’ll be able to put together future “live” deliveries of the material for other new (or potential) contributors, this means it will now always be available to the community at large. :-)

I hope this will prove a valuable resource for those contributing or interested in contributing to Perl 6 development, and would like to take a moment to thank Edument for once again being supportive of my work on the Perl 6 project!

Posted in Uncategorized | 4 Comments

A MoarVM Progress Update

A while back, I wrote here to introduce the MoarVM project, which aims to build a Virtual Machine specially tailored for the needs of NQP and Rakudo. Since then, a lot of progress has been made and the project has attracted a number of new contributors. So, it’s time for a round-up of the latest MoarVM news.

MoarVM Hosts NQP

Back when I first introduced MoarVM, we had a partly complete cross-compiler that used NQP on Parrot to parse NQP source code and build its AST, then turned it into MoarVM bytecode. This enabled us to cross-compile some of the NQP test suite to MoarVM. Since then, we’ve been improving the cross-compiler so it is able to translate NQP itself into MoarVM bytecode, as well as making the VM capable of doing the things that NQP requires to run.

In the last week, we’ve reached the point of having a self-hosted NQP on MoarVM that passes most of the NQP test suite. What does this really mean? That MoarVM is capable enough not only to run programs written in NQP, but also to run the NQP compiler (meaning it has pretty complete Perl 6 grammar support), its own backend code-generator and to be able to evaluate the result of this compilation.

We’re down to 5 files from t/nqp that either fail outright or fail some tests. I expect we’ll be able to get these fixed within the next week, and also make progress on the serialization tests. Beyond this, it will be time to start working on getting this self-hosted NQP able to build itself. At this point, the cross-compiler will no longer be needed, and the MoarVM backend code will migrate to the main NQP repository, joining the Parrot and JVM backends.

How slow is it?

I’ve been pushing pretty hard for us to focus on getting things working before getting things fast; for now keeping the code easy to work on and implementing features on the path to having a Rakudo on MoarVM is far, far more important. That said, there is already an encouraging result. While it’s not quite a fair test given the five of the 77 test files that do not run to completion, currently MoarVM can run the NQP test suite in 16s on my main development machine, This compares to 21s on Parrot (which means we come out ahead at startup time, execution time, or a mixture). While it’s not incredibly faster (though of course running lots of small test files is startup-dominated), it isn’t a bad starting point given we’ve barely scratched the surface of what we’ll be able to do in terms of optimization (for example, no JIT yet, known algorithmic issues in handling of GC inter-generational pointers, and comparatively little time invested in optimization all over the VM). By the way, thanks to its much greater start-up time, NQP on the JVM takes a huge 139s. However, much like in the story of the tortoise and the hare, leave things to run for long enough and the JVM backend still typically comes out ahead, sometimes very significantly so.

On memory usage, to compile and run a simple “hello, world” in NQP, MoarVM’s memory usage clocks in at just over half that which Parrot uses, which in turn clocks in at about half that of the hungry JVM. :-)

While the JVM backend doesn’t come out looking too awesome here resource usage wise unless you’re doing something long-running, it’s worth remembering that the JVM port of Rakudo is just a couple of months old, and the whole porting effort less than a year old. In that sense, we’ve a lot of room for improvement yet.

Goodbye APR, welcome libuv

Originally, MoarVM was using the Apache Portable Runtime for IO, multi-platform thread support and a handful of other things. By now, we’ve moved over to using libuv. This isn’t because there was anything inherently wrong with the APR, and while we’ve ended up not using it, I’ve come out with a positive impression of it. So, thanks to the APR developers! The primary reason we moved to using libuv instead was to ensure we can provide support for asynchronous I/O on a relatively wide variety of platforms. We were also using a mix of the atomic primitives from the APR and from libatomic_ops; we’ve now standardized completely on libatomic_ops for this.

Build and Dependencies

There’s a bunch of stuff I’m decent at doing, but build systems is not one of them. My first pass was a case of, “make something work, and hope somebody replaces it.” Happily, that has now happened, so the build system is in much better shape. We’ve also got things in place for building MoarVM as a shared library, and for pulling in varius dependencies while keeping them at enough of a distnace that we should be able to depend on system versions of them (trying to balance easy source builds with keeping future packagers happier).

Contributors

Since becoming a public project, MoarVM has grown its contributor base. In total, 16 authors now show up in the commit log and I’d say we’re at around 5 core developers who commit notable things relatively often. I’m happy that people have felt inspired to contribute, and that things are progressing fairly swiftly now (I hoped we’d be a little further by this point, but my tuit supply over the summer was not what I’d hoped). Currently, it looks like the October NQP release will be the first to include MoarVM support, and from there we’ll focus on getting Rakudo onto MoarVM.

Posted in Uncategorized | 4 Comments

YAPC::EU 2013 Slides

I’m just back from this year’s YAPC Europe in Kiev. I’ve liked Kiev since the first time I visited many years ago, and after around two years since the last visit, I was glad of an excuse to return. It was the same beautiful, rather hilly city I remembered, though decidedly warmer than I remember it – probably because this is the first visit I made there in summer. In the company of the many good Perl folks attending YAPC, I enjoyed plenty of nice food, and even caught some good Belgian beers thanks to the wonderful Belgian Beer Cafe.

This year I submitted three talks, expecting one or two would be accepted. Instead, all three were! So, I talked about:

  • Rakudo on JVM – this session explained the motivation for adding a JVM backend, the compiler architecture that enabled it, how it was implemented, the current status, the support so far for Java interoperability and the plan from here. Seems to have been well received.
  • Concurrency, Parallelism and Asynchrony – this session showed the work I have been doing to build basic support for parallel, asynchronous and concurrent programming in Perl 6. This was the best attended and also, I believe, the most discussed of my talks this year. There is still much work to do in this area, but what’s done so far caught some interest. Once I’m recovered from YAPC, I’ll dig back into it.
  • MoarVM – this session talked about the motivation for building a new VM, its overall design, the current status and what’s in store. It’s also the first talk I ever gave on MoarVM. The most lightly attended and most hastily prepared, but still it seemed to be appreciated by those who attended.

Enjoy the slides, and hopefully the videos will make it online soon too.

I also agreed to attend this years Austrian Perl Workshop, where the hills will be alive with the sound of Perl 6 in the lovely Salzburg, sometime in November. :-)

Posted in Uncategorized | 1 Comment

Rakudo JVM News: More tests, plus Thread and Promise prototypes

Last time I wrote, the Rakudo on JVM port was passing around 92% of the specification tests that Rakudo on Parrot can. In the last couple of weeks, we’ve continued hunting down and fixing failures. I’m happy to report that we have already passed the 98% mark – well beyond the 95% I was aiming for by the July release! I’m optimistic that we may be able to push that up to 99% in the coming days. Either way, we’re closing in on the goal spectest wise, meaning the focus should soon move to getting tools like Panda working, followed by the module ecosystem. Happily, arnsholt++ has already started working on the NativeCall support that many modules depend on.

One of the reasons for adding a JVM backend is to unblock work on Rakudo’s support for asynchronous, parallel and concurrent programming. With a YAPC::EU talk on these topics looming, and hating to take to the stage without anything to demonstrate, I’ve started working on this area. It’s early days yet, but here is a quick look at what’s already possible.

There is some basic support for doing stuff in threads.

say "Creating a couple of threads...";

my $t1 = Thread.start({ sleep 1; say "Thread 1 done"; });
my $t2 = Thread.start({ sleep 2; say "Thread 2 done"; });

say "Waiting for joins...";
.join for $t1, $t2;
say "Joined!";

This does what you’d naturally expect. However, threads are kind of like the assembly language of parallel programming: occasionally you want to work at that level, but usually it’s much better to work in terms of higher level constructs. Thus, while you can do the above, I don’t suggest it. So what’s available at a higher level? Well, so far, promises are.

say "Creating two promises...";

my $a = async { sleep 2; say "omg slept 2"; 27 }
my $b = async { sleep 1; say "omg slept 1"; 15 }

say "Scheduler has $*SCHEDULER.outstanding() tasks";
say "Waiting for results...";
say $a.result + $b.result;

The async construct evaluates to a Promise object (name subject to change; Future or Task are other options we could steal from other languages). A Promise is an object that represents a piece of ongoing work. It is not backed by a thread of its own; instead, it is scheduled onto a pool of threads that are spun up on demand, up to a limit. Alternatively, a Promise could be backed by some kind of asynchronous I/O. The point is that it doesn’t much matter what the exact nature of the work is, just that there’s a common way to talk about concurrent work and write combinators over them.

When you ask for the result of a Promise and it is not available, you will block until it is available. If the inside of the async block died, then the exception will be thrown at the point the result method is called. There is also a method “then” for chaining on extra work to be done once the promise is either completed or fails due to an exception:

my $a2 = $a.then(-> $res { say "Got $res.result() from promise a" });

This returns another Promise, thus allowing chaining. There is also a sub await that for now just calls the result method for you, on a whole list of promises if you pass them. Here’s an example:

say [+] await dir('docs/announce').map({
    async { .IO.lines.elems }
});

This creates a Promise per file in the directory that will count the number of lines in the file. Then, await will wait for each of the promises to give a result, handing them back as they come in to the reduction. Note that in the future, this could probably just be:

say [+] hyper dir('docs/announce').map(*.IO.lines.elems)

But we didn’t implement that yet, and when it does happen then it will most likely not work in terms of simply creating a promise per element.

Remember that promises are much lighter than threads! We’re not spinning up dozens of threads to do the work above, just spreading the load over various threads. And yes, the parallel version does run faster on my dual core laptop than the one that isn’t using async/await.

Future plans for promises include combinators to wait for any or all of them, an API for backing them with things other than work in the thread pool, and making await a bit smarter so that it can suspend an ongoing piece of work in the thread pool when it blocks on another promise, thus freeing up that thread for other work.

Of course, all of this is early and experimental; any/all of the above can change yet, and it’s a long, long way from being exercised to the degree that many other parts of Rakudo have been. Expect changes, and expect many more things to land in the coming months; on my list I have asynchronous I/O and an actors module, and I know pmichaud++ has been thinking a bit about how to evolve the list model to support both race and hyper.

Anyway, that’s the latest news. Next time, hopefully there will be yet more spectest progress and some nice sugar for sorear++’s ground work on Java interop, which is what I used to build the threads/promises implementation.

Posted in Uncategorized | 9 Comments

Rakudo on JVM Progress Update

In the spirit of “release early, release often”, last month’s Rakudo compiler release was the first one to come with very basic support for running on the JVM. It supported a small number of the features that Rakudo on Parrot does. Of note, it could pass the sanity tests, a small test suite we keep in the Rakudo repository. The sanity tests ensure that the compiler is functional enough to compile Test.pm, which is a prerequisite for running the specification tests. Essentially, they make sure that we can crawl our way to the starting line, where the real race begins. :-)

So, since that release, we’ve been working hard on getting Rakudo on JVM able to pass more of the specification test suite, gradually increasing its capabilities, hunting bugs, adding missing bits of infrastructure and guts code in order to get features working, and so forth. Progress has been rather swift, and today’s automated test run by Coke++ shows that Rakudo on JVM is now passing 92% of the tests that Rakudo on Parrot does!

So what does this number really mean? It means that Rakudo passes 92% of the individual test cases on the JVM that it does on Parrot. To put that into perspective, that’s more tests than either of Niecza (82%) or Pugs (36%), meaning that Rakudo on JVM is now the second most spectest-passing Perl 6 implementation. Not bad, to say around 8 months ago not a single line of implementation code related to Rakudo JVM targeting had been written.

However, the raw number should be taken with a pinch of salt. Here’s why. The test suite is not especially uniform in terms of tests it dedicates to each feature. For example, there are hundreds of passing tests dedicated to providing good coverage of Unicode character property matching in regexes, and trigonometry easily has many hundreds too. By contrast, the slurp function has more like dozens of tests – since you can comfortably cover it that way. However, these are failing still, and I’m pretty sure that more Perl 6 users depend on slurp that Unicode character properties and trig.

Anyway, my aim is that we’ll be some way past the 95% mark in time for the July compiler release. And yes, I’ll make sure slurp gets fixed before then! :-) I suspect the last couple of percent will be the most tedious, but it feels good to be reaching towards them.

In the coming weeks, I expect the focus to start shifting from the compiler to the ecosystem: getting Panda working with Rakudo on JVM, and starting to run the module tests and work through the issues we find there. Calling Java libraries and work on parallelism are also planned, not to mention digging in to optimization work, since this initial phase has very much been focused on “get things working”. All in all, plenty to be working on, and plenty to look forward to.

Finally, a week or so ago I spent a couple of hours being interviewed by Nikos Vaggalis for Josette Garcia’s blog. It was a very wide-ranging interview and I managed to spout sufficiently in response to the questions that it’s being published over three posts in the coming weeks. You can find the first part of the interview here; enjoy!

Posted in Uncategorized | 3 Comments