Optimization, concurrency, and Moar

It’s been a while since I wrote an update here. Happily, timotimo has taken up the role of being our weekly Perl 6 reporter, so there’s a good place to follow for regular updates. However, I wanted to take a moment to provide the bigger picture of what’s been happening in the last couple of months, share my perspectives on it, and talk a bit about where things are headed from here.

Optimization, optimization, optimization!

A lot of recent effort has gone on optimization. NQP, the subset of Perl 6 that we use to implement much of the Rakudo Perl 6 compiler, has performance on MoarVM that starts to approach that of Perl 5, and on JVM sometimes exceeds that of Perl 5 for longer running things (it typically runs the forest fire benchmark from our benchmark suite faster once the JIT has had time to get going, for example). By contrast, Rakudo’s performance itself has been comparatively awful. Thankfully, things have been getting better, as we’ve worked to improve optimization, reduce costs of common things, and gradually begun to close the gap. This has involved work on both Rakudo and NQP’s optimization phases, along with work on improving the built-ins and some performance-oriented factoring changes. There’s still plenty of work to go, but anybody using Rakudo on MoarVM will likely feel the results in the next release. To give an idea of the improvement in HEAD Rakudo on MoarVM, which will appear in the April monthly:

  • Array and hash access is more than 3 times faster
  • Most multi-dispatches are now enormously cheaper
  • Many, many unrequired scalar allocations are optimized away
  • The forest fire benchmark on Rakudo can render twice as many frames per second
  • Compiler performance is around 15% better (estimate going on CORE.setting compile time)

Compared to Rakudo on Parrot the difference is more marked. On compiler performance alone, the difference is enormous: you can build the entire of Rakudo on MoarVM on my box in around 80s (which I’m not happy with yet, though you rarely change so much that you have to do the whole thing). That is less time than it takes for Rakudo on Parrot to complete the parse/AST phases of compiling CORE.setting (the built-ins). Running the spectest suite happens in half the time. Both of these times are important because they influence how quickly those of us working on Perl 6 implementation can try out our changes. Unsurprisingly, most of us do the majority of our development on MoarVM first these days.

One consequence of this work is that Rakudo on MoarVM is often sneaking ahead of Rakudo on JVM on some benchmarks now, even once the mighty JVM JIT kicks in. This won’t last long, though; a couple of the optimizations done will not be a great deal of work to port to the JVM, and then it can re-claim its crown. For now! :-)

Introducing “spesh”

A whole other level of performance related work has been taking place in MoarVM itself. The first goal for the project was “just run NQP and Perl 6”, and the VM simply got on with interpreting the bytecode we threw at it. That doesn’t mean it wasn’t carefully designed along the way – just that the focus in terms of execution was to be simple, correct and sufficiently complete to serve as a solid Rakudo backend. With that goal achieved, the next phase of the project is underway: implementing dynamic optimization based on program information available at runtime, speculative optimizations that can be undone if things they rely on are broken, and so forth.

The first steps in that direction will be included in this month’s MoarVM release, and are to thank for much of the improved compiler performance (since the compiler is a program running on the VM too). The easiest way to get an overview is for me to walk you through the pieces in src/spesh in MoarVM.

  • graph is about building a representation of the bytecode (at a frame level) suitable for analysis and transformation (the two steps involved in optimization). It starts out by building a Control Flow Graph. It then computes the dominance tree, which it uses to rename variables so as to produce a Static Single Assignment form of the bytecode. This is a form whereby a given name is only written to once, which eases many, many aspects of analysis.
  • args takes a tuple of incoming arguments, considers their types, arity, and so forth. It produces a set of guard clauses that indicate when a given specialization of the code applies (that is, a version of the code improved by making assumptions about what was passed), and then re-writes various argument access instructions to “unsafe” but fast ones that it can prove will always work out.
  • facts takes the graph, looks through it for sources of type information (including the incoming arguments) and does an initial propagation of that information through the graph. It creates usage counts to be later used in dead code elimination.
  • optimize takes this annotated graph, and makes a pass through it, applying a number of optimizations. Granted, there are not so many yet; so far we’ve mostly worked on getting to the point of having a framework to prove safety of transformations, and adding more of them comes next. However, those there so far can do things like:
    • Resolve methods to avoid hash lookups
    • Install monomorphic method caching if that doesn’t work out
    • Resolve type checks, possibly eliminating entire branches of code
    • Re-write attribute binds into “unsafe” pointer operations
    • Eliminate dead code
  • codegen takes the optimized graph and produces bytecode from it again. However, in the future (if we’re lucky, then hopefully through a GSoC project), this is where we would produce machine code instead.
  • deopt deals with the situation where some invariant specialized code may be relying on gets broken, and yet that specialized code is still on the call stack. It walks the call stack, looking for specialized code on it and tweaking return addresses and other data structures so that when we return into the code, we’re back in the safe (though of course slower) unspecialized code that checks things as needed.

By and large, this is essentially giving MoarVM a JIT. Of course, it’s not producing machine code yet, but rather JITing back to improved bytecode. While we tend to think of JITs primarily as “turn the program into machine code”, that’s really just one small part of any modern JIT. Analysis and specialization of the program before the machine code is produced is just as important; with this approach, we get to focus in on that part first and get some of the benefits now.

Concurrency

Progress on Perl 6 concurrency continues. The JVM implementation of the concurrency features has had various performance improvements since the March release, and MoarVM now has support for most of the Perl 6 concurrency features also. However, the MoarVM support for these features is early and most certainly not yet production ready. We’ll include it in the April release, but stick with Rakudo on the JVM for concurrency things if you’re doing anything that matters. If you just want to play with the basics, either will do.

Rakudo on MoarVM takes the spectest crown

Rakudo on its various backends hold the top spots on the Perl 6 specification test suite pass rate. However, nowadays Rakudo on MoarVM has worked its way into the lead. How so? Because it has the best Unicode database introspection support, opening up a range of tests that no other backend handles yet. Additionally, because it gets some of the Unicode stuff right that that Parrot does, but JVM doesn’t. And, finally, because on top of that it can now pass a bunch of the concurrency tests.

A multi-backend Rakudo Star

I’d hoped we would get a Rakudo Star release with support for all three backends out in March. It didn’t happen; the module tests showed up some holes. We’ve by now largely fixed those for Rakudo on MoarVM, and we’re looking quite good for the April Rakudo Star coming with support for both Parrot and MoarVM. With some effort, I’m optimistic we’ll have a JVM Star release in good shape for April too. This will provide users who want the “batteries included” release a choice of backends, and of note give those using Parrot a chance to switch over to using MoarVM, getting some substantial performance improvements on most programs and lower startup time and memory use.

Where next?

In the coming months, I’m going to be focusing on the following areas:

  • More improvements to the Rakudo and NQP optimizers, so we generate better (faster, smaller) code.
  • More improvements to the Rakudo built-ins, so they operate more efficiently.
  • Making the MoarVM concurrency support more robust, and improving the parallel GC.
  • Advancing the state of asynchronous I/O, so we’ll have good support for it on both the JVM and MoarVM.
  • Teaching spesh to specialize better. There are a bunch of data access things that can be made cheaper, as well as being able to better optimize multiple dispatch. Beyond that, both inlining and escape analysis are on the menu.
  • Improving native type support, including providing native arrays.

I’ve been hidden away coding for much of the year so far, apart from putting in an appearance at FOSDEM. But I’m getting on the road soon! I’ll be at the Dutch, Polish and Czech Perl Workshops, and look forward to seeing folks and sharing what I’ve been working on. Hope to see some of you out there!

Posted in Uncategorized | 5 Comments

January Rakudo Compiler Release: MoarVM support and much more

This month’s Rakudo compiler was cut today, and there’s a bunch of good stuff in there. In this post I’ll take a quick look at what’s been done.

MoarVM Support

This is the first Rakudo compiler release to have support for building and running on MoarVM, a new VM being built especially for Perl 6 and NQP (the Perl 6 subset a sizable chunk of Rakudo is written in). MoarVM support is not quite up to the level of the JVM and Parrot backends yet. It passes less specification tests than either of them – though it’s getting awfully close (Rakudo on MoarVM passes over 99% of the specification tests that Rakudo on the JVM – the current leader – does). Thus, you can actually run a heck of a lot of Perl 6 code just fine on it already. I used it recently in a pair programming session and we only hit one bug in the couple of hours we were using it.

The fast-path for signature binding that I mentioned in my previous post has also been put in place. It did, as hoped, lead to a fairly dramatic speedup. The workload of building Rakudo’s built-ins and running the specification test suite was also a good basis for doing some GC tuning, which led to further improvements. By this point, on my box, Rakudo on MoarVM now has:

  • The lowest startup time of any Rakudo backend
  • The shortest spectest time of any Rakudo backend
  • For the CORE.setting build and spectests, the smallest memory footprint of any Rakudo backend

Other Rakudo developers have reported similar findings. I need more time to look into the exact numbers, but it would appear that Rakudo on MoarVM is also the fastest to build. CORE.setting build time is roughly competitive with on the JVM now (but how roughly seems to vary quite widely – I think it depends on what JVM or even version is being used), but startup time for NQP on MoarVM is rather lower, meaning that those parts of the build go by faster.

The focus for the next month or two will be getting into a position where we can produce a Rakudo Star release that uses MoarVM. This means digging through the last 1% of failing spectests and dealing with them, finishing the work of getting Panda (our module installer) to work with Rakudo on MoarVM, and then hunting bugs that keep us from running the modules. Getting NativeCall working will also be a notable task, although given we already have a NativeCall in C working against 6model (the one we built for Parrot), there is a lot of prior art this time – unlike on the JVM.

On performance – we’re not even scratching the surface of what’s possible. MoarVM’s design means it has a lot of information to hand to do a good amount of runtime specialization and optimization, but none of this is implemented yet. I aim to have a first cut of it in place within the next few months. Once we have this analysis and specialization framework in place, we can start thinking about things such as JIT compilation.

Rakudo on JVM Improvements

Probably the best news in this release for anybody working with Rakudo on JVM is that the gather/take stack overflow bug is now fixed. It was a fun one involving continuations and a lack of tailcall semantics in an important place, but with doing the MoarVM implementation of continuations in my recent past, I was in a good place to hunt it down and get a fix in. A few other pesky issues are resolved, including a regex/closure interaction issue and sometimes sub-optimal line number reporting.

The other really big piece of JVM-specific progress this month has been arnsholt++ continuing to work on the plumbing to get us towards full NativeCall support for JVM. This month, a number of the big missing pieces landed. NativeCall working, and the modules that depend on it working, is the last big blocker for a Rakudo Star on JVM release, and it’s now looking quite possible that we’ll see that happen in the February one.

General Rakudo Improvements

While a lot of energy went on the things already mentioned, we did get some nice things in place that are independent of any of the particular backend: improvements to the Nil type, the sequence operator, sets and bags, adverb syntax parsing, regex syntax errors, aliased captures in regexes, and numerics. MoarVM’s rather stricter interpretation of closure semantics than we’ve had in place on other backends has also led to various code-gen fixes, which may lead to better performance in certain scenarios across the board too (one of those, “I know it probably should but I didn’t benchmark” situations).

I’d like to take a moment to thank everyone who contributed to this month’s release. This month had the highest Rakudo contributor count in a good while – and I’m hopeful we can maintain and exceed it in the months to come.

Posted in Uncategorized | 4 Comments

A Rakudo on MoarVM update

Almost exactly a month ago, I reported that Rakudo on MoarVM could do “Hello, world”. Despite the general slow-down as folks took Christmas and New Year breaks (myself very much included), progress has been very good. Here are the key things to know:

  • All of the Rakudo sanity tests pass on Rakudo on MoarVM
  • At present, Rakudo on MoarVM passes over 92% of the specification tests that the current leader (Rakudo on the JVM) does
  • We’re on course to have the January compiler release be the first Rakudo release with some level of MoarVM support; I’m hopeful we’ll be passing over 95% of the specification tests Rakudo on the JVM does by that point

The work has been taking place in the moar-support branch. We didn’t merge it yet, but that should happen within the next week or so.

MoarVM itself has been improving steadily, also. Here are a few of the features that have landed in the last weeks:

  • Block exit handlers, used to implement LEAVE and friends
  • Continuations, used to make gather/take properly lazy
  • Updated the Unicode database to Unicode 6.3
  • Sized native arrays
  • State variables

However, another extremely important bit of work has been taking place that is focused on stability. Nicholas Clark has been conducting garbage collection torture tests, by now down to forcing a GC run every single object allocation and using memory protection to catch illegal accesses to moved objects. Most of the things we call GC bugs are not actually inside the garbage collector implementation, but rather are other places in the codebase where mistakes have been made that break invariants that must be upheld for the GC to do its job correctly. I’ve not been counting, but I’d estimate that a bit over a dozen bugs have been turned up by this work so far – bugs that would have been a real pain to find if they had just happened to crop up some weeks, months or years down the line in some user’s code. At this point, NQP can be built and tested under the toughest torture test that exists so far without showing any issues. The Rakudo build is the current torture subject. I’m incredibly happy this work is taking place; it means that by the time Rakudo on MoarVM starts getting used more widely, we can be reasonably confident that users are unlikely to run into GC-related issues.

So, what’s the path from here? Here’s what I’m hoping for in the coming month:

  • We reach the point of passing more than 99% of the spectests that Rakudo on the JVM does
  • Progress towards Panda being able to run and install some modules (not sure we’ll get all the way there, but I hope we’ll get reasonably close)
  • Dealing the with very-slow-path signature binding. If you try anything on Rakudo on MoarVM, you’ll discover it’s slow. This is thanks to a hot path being implemented in a slow way, in the name of getting stuff correct before optimizing. As soon as we’re beyond the 99% point with the spectests, I’ll dig into this. It should make a dramatic difference. I’m aiming to do some work in this area to make things better for Rakudo on JVM also.
  • Whatever fixes are needed to get the Rakudo build and sanity tests surviving the GC torture

I’ll be back with some updates in a couple of weeks, to let you know how we’re getting on. :-)

Posted in Uncategorized | 1 Comment

A few quick updates

I’ve been rather snowed under with work in November, thus the absence of posts here. Sorry about that. It turns out the sales folks at $dayjob did their job a little too well, and besides my own heavy teaching load I had to step in to rescue a few other things. Anyway, that’s done now, and I’m back to having a bit more time. (It’s not that I had absolutely no time at all since the last post. It was just more important to spend the bits I had clearing blockers in the way of others rather than writing blog posts. :-)) So, some quick bits of news.

Concurrency

At the Austrian Perl Workshop, I worked with lizmat on getting the Perl 6 concurrency spec aligned with the things I’d been working on for Rakudo on JVM. Happily, this had some good consequences. Firstly, Larry did a load of work on it, vastly improving naming of things and giving some syntactic relief in various areas where I’d just done simple function and method APIs. Secondly, both lizmat and timotimo dug in to bring the implementation in line with these spec changes, doing some other improvements to boot. So, now the bus number on the Rakudo concurrency support has increased. You can find the latest spec here. Also, you can see my Nordic Perl Workshop slides about it.

Rakudo on MoarVM progress

Last time I wrote here, we had NQP support for MoarVM pretty much complete, and were ready to dig into working on getting Rakudo running on MoarVM. The first step was to get the core of the compiler itself building. This wasn’t a terrible amount of work, since the majority of it is just NQP code. There’s more of it than is found in the NQP compiler, but that aside it doesn’t do too much that’s new. Next came the Perl 6 MOP, which is written in NQP. Things went very smoothly with that. Beyond there, things got more interesting.

The next big piece to make work was the BOOTSTRAP. This uses the MOP to start piecing together the key types at the heart of Perl 6, doing enough so we can write the rest of the built-ins in Perl 6 itself. Most of it is one huge BEGIN block. Here there were various unimplemented things, plus some of the VM-specific bits of Rakudo needed porting. And beyond that lay…the setting. When I did the JVM port, we had around 14,000 lines of built-ins there. These days it’s closer to 17,000 – and that’s excluding the concurrency stuff. Since compiling the setting actually involves running little bits of Perl 6 code, thanks to traits and BEGIN blocks, getting through it means making quite a lot of things work. We got there a week or so back.

That still didn’t give us “Hello, world”, however. The setting does various bits of initialization work as it loads, which naturally hit more things that didn’t yet work. Finally, yesterday, we reached the point where setting loading worked and we could do “Hello, world”. Actually, once the setting loaded, this worked right off.

Of course, there’s still lots to do from here. The next step will be to work on the sanity tests. There’s a couple of large and complex features that will need some porting work; of note, gather/take will need doing. Also there are a couple of stability and algorithmic things that need taking care of in MoarVM itself. Building CORE.setting is easily the biggest workout it’s had so far, and naturally it highlighted a couple of issues. And, of course, beyond the sanity tests like the spectests…

Better use of invokedynamic, and other optimization

I’ve got a talk at Build Stuff next week on invokedynamic, the instruction added to the JVM in JDK7 to support those implementing languages of a more dynamic nature So, in the last week I spent some time tweaking our usage of it to get some further wins in various areas, and to make sure I was nicely familiar with it again in time for my talk. That work got merged into the mainline development branches today. I did some other optimizations along the way that are a win for all backends, too; lizmat noted a 12.5% decrease in spectest time. Not Bad.

Posted in Uncategorized | 2 Comments

NQP gets MoarVM support, cursor reduction, and other news

I thought it was high time for a round-up of news about the Perl 6 related stuff that I’ve been working on – as usual, with the help of many others. So, here goes!

MoarVM Backend in NQP master

In the last progress update, I mentioned that MoarVM could host NQP and pass most of the NQP language test suite. We got to that point by using a cross-compiler running on Parrot to compile the NQP sources into MoarVM bytecode. Fast forward a month, and we’ve reached the point of having NQP bootstrapped on MoarVM. What’s the difference? Simply, that you can use NQP running on MoarVM in order to build an NQP from source. This means that both bytecode generation and serialization are now working. Since some of the source files we’re compiling are also fairly large (4000 lines), it’s also been a fairly good hardening exercise; many bugs have been fixed.

The MoarVM support has now been added to the NQP repository. Just as you only need a JVM to build NQP for the JVM, you also only need MoarVM to build an NQP for MoarVM. That is, it’s bootstrapped and can stand alone now. NQP monthly releases from here on will thus come with support for three backends: Parrot, JVM and MoarVM.

Better still, NQP on MoarVM now passes the full NQP test suite. It also does so faster than any other backend. This doesn’t mean MoarVM is faster for all things; if you write NQP code with a tight integer loop, or something long running, the JVM’s JIT will kick in and typically come out ahead after the slow start. MoarVM just gets moving a lot faster than the JVM, which is what matters for running lots of short-lived tests.

This means we’re now ready to start work on getting Rakudo running on MoarVM. I don’t have any predictions just yet on precisely when we might land this; the rate of progress in the next couple of weeks, as we dig into it, will provide an indication. Reaching the point of being able to bootstrap NQP on MoarVM is a significant achievement along the way, though. While NQP is both smaller and simpler than full Perl 6, it still requires being able to execute Perl 6 grammars, eval code, do a whole range of OO-related things (including classes, roles and meta-programming), perform multiple dispatch, handle BEGIN time, and so on. These are all things that a full Perl 6 needs, so it’s very good to have them in place.

Allocating Less Cursors

Allocation profiling work by hoelzro and timotimo (I hope I remembered right; correct me if not!) indicated that both NQP and Rakudo were allocating a huge number of Cursor objects. So what are these? They are objects that keep track of parsing state. They are created as we enter a rule or token in the grammar, thrown away if it fails, or typically placed into a tree of cursors if it passes (though sometimes we only care for pass/fail and so quickly throw it away again). Naturally, we’re going to allocate quite a few of these, but 2.14 million of them being allocated while parsing Rakudo’s CORE.setting seemed decidedly excessive.

I decided to spend some time trying to understand where they all came from. NQP itself tends to be a bit easier to analyze than Rakudo, so I started there, and added some instrumentation to record the number of times each production rule or token was hit and led to allocation of a Cursor. I then used this to gather statistics on a fairly large source file from the NQP build. It started out at 284,742 Cursor allocations.

The first big win from this came when I realized that a huge number of these Cursors were just allocated and returned, to indicate failure. A Cursor is born in a failed state, and many places were just creating them and returning them without any further work, to say “I failed to match”. Thing is, failed Cursor objects all look about the same. The only thing they differ by is type, and for each Cursor type we have a ParseShared instance. Thus, it wasn’t too hard to create a shared “fail Cursor” and just look it up in a bunch of places, rather than allocating. That shaved off over 50,000 allocations. A related realization about improving MARKED and MARKER led to another 30,000 or so allocations chopped. Note that none of this really leads to overall reduced memory usage; all of these objects were quickly collectable. But allocation makes GC run more often, and thus carries some amount of runtime cost.

Further examination of the statistics showed hotspots that seemed unreasonable. The problem wasn’t that they allocated a Cursor needlessly, but rather than we should never have been calling the grammar rule in question so many times. This led to a number of grammar optimizations. In one case, just making sure that a certain rule got a declarative prefix meant the LTM mechanism could very quickly rule it out, saving 15,000 calls to the production rule, and thus cursors. The other big discovery was that the ident built-in was not having an NFA properly generated for it. This is notable because ident features in various paths in the grammar, and thus meant we were failing to quickly rule out a lot of impossible paths through the grammar using the NFA (which is the cheap way to rule things out). With a few other tweaks, all told, we were down to 172,424 Cursor allocations, just 60% of what we started out with.

The statistics for Rakudo’s CORE.setting showed we started out doing 2,196,370 Cursor allocations. Some of the fixes above also aided Rakudo directly, making a good cut into this number. However, analysis of the statistics revealed there were more wins to be had. So far, we managed to bring it down to 1,231,085 – a huge reduction. There’s likely a lot more wins to be had here, but I think we’ve picked all of the large, low-hanging fruit by now.

Rakudo Debugger on JVM

I spent some time getting the Rakudo Debugger to also work on the JVM. I’ve still got some work to do on making it easily buildable, and so it can be installed with Panda on the JVM like it can on Parrot. But the rest of the work is done, and I was able to step through programs, set breakpoints, examine variables and so forth. So, that’s another of the “things missing on the JVM port” mostly gone.

Rakudo on JVM spectest

I also did a little work to fix more of the spectests that pass on Rakudo Parrot, but fail on Rakudo JVM. The result: 99.9% of the spectests that pass on Rakudo Parrot also pass on Rakudo JVM. To put an exact number on it, that’s 27 tests different. Around half are Unicode related, which isn’t so surprising since we’ve borrowed Java strings for the time being; in the end, we’ll need to do an NFG implementation on the JVM. The others still need to be hunted down. I think it’s fair to say that you’re rather unlikely to run into this 0.1% in day-to-day usage of Rakudo on the JVM, however. In fact, the thing you’re most likely to miss today is NativeCall – which arnsholt has been working on, and I plan to join in with during the coming days.

All in all, things are coming along very nicely with Rakudo’s JVM support. It was just 11 months ago – not even a year – that I started to build a little bit of infrastructure so we’d be able to create a tree-like data structure in NQP and get JVM bytecode made from it. By now, thanks to the contributions of many besides me, Rakudo runs on the JVM and is very capable there. When it comes to concurrency, it’s the most capable Rakudo you can get. And, once it gets past the slow startup, it’s usually the fastest. It’ll be interesting to see where we stand on these things in another six months or a year’s time. Here’s hoping for another productive time ahead!

Posted in Uncategorized | 2 Comments

Material from the Rakudo and NQP Internals course

A little over a month ago, lizmat++ contacted my employer, Edument AB, to discuss using our training services to deliver a 2-day workshop aimed at current and potential Rakudo and NQP contributors. For those of you who don’t know, aside from working on Perl 6 I also work teaching/mentoring on various topics (typically software architecture, TDD, and advanced C# programming). The goal was for me to spend a couple of days explaining a bunch of topics – including NQP, grammars, QAST, 6model, and bounded serialization – to help people know where to start or to help them get into unfamiliar areas of the code.

The workshop took place this last weekend in Frankfurt. I’d expected we might have 5-6 people sign up; in the end, we had around 15 people attending! The workshop involved a combination of teaching and exercises, with plenty of chances to ask questions. Happily, there were tasty lunches and dinners too (though I can’t take any of the credit for this side of things). I greatly enjoyed teaching the course; I like both working on Perl 6 and doing my teaching work at Edument, and being able to do both at once was delightful! :-)

One aim of developing the course was to help with the state of documentation of the NQP and Rakudo internals. Typically at Edument, we only give out our material to those attending a delivery of our courses, and even then not the original source! However, for this course, we’ve made an exception and released the course material under a Creative Commons license. So, while I hope that we’ll be able to put together future “live” deliveries of the material for other new (or potential) contributors, this means it will now always be available to the community at large. :-)

I hope this will prove a valuable resource for those contributing or interested in contributing to Perl 6 development, and would like to take a moment to thank Edument for once again being supportive of my work on the Perl 6 project!

Posted in Uncategorized | 4 Comments

A MoarVM Progress Update

A while back, I wrote here to introduce the MoarVM project, which aims to build a Virtual Machine specially tailored for the needs of NQP and Rakudo. Since then, a lot of progress has been made and the project has attracted a number of new contributors. So, it’s time for a round-up of the latest MoarVM news.

MoarVM Hosts NQP

Back when I first introduced MoarVM, we had a partly complete cross-compiler that used NQP on Parrot to parse NQP source code and build its AST, then turned it into MoarVM bytecode. This enabled us to cross-compile some of the NQP test suite to MoarVM. Since then, we’ve been improving the cross-compiler so it is able to translate NQP itself into MoarVM bytecode, as well as making the VM capable of doing the things that NQP requires to run.

In the last week, we’ve reached the point of having a self-hosted NQP on MoarVM that passes most of the NQP test suite. What does this really mean? That MoarVM is capable enough not only to run programs written in NQP, but also to run the NQP compiler (meaning it has pretty complete Perl 6 grammar support), its own backend code-generator and to be able to evaluate the result of this compilation.

We’re down to 5 files from t/nqp that either fail outright or fail some tests. I expect we’ll be able to get these fixed within the next week, and also make progress on the serialization tests. Beyond this, it will be time to start working on getting this self-hosted NQP able to build itself. At this point, the cross-compiler will no longer be needed, and the MoarVM backend code will migrate to the main NQP repository, joining the Parrot and JVM backends.

How slow is it?

I’ve been pushing pretty hard for us to focus on getting things working before getting things fast; for now keeping the code easy to work on and implementing features on the path to having a Rakudo on MoarVM is far, far more important. That said, there is already an encouraging result. While it’s not quite a fair test given the five of the 77 test files that do not run to completion, currently MoarVM can run the NQP test suite in 16s on my main development machine, This compares to 21s on Parrot (which means we come out ahead at startup time, execution time, or a mixture). While it’s not incredibly faster (though of course running lots of small test files is startup-dominated), it isn’t a bad starting point given we’ve barely scratched the surface of what we’ll be able to do in terms of optimization (for example, no JIT yet, known algorithmic issues in handling of GC inter-generational pointers, and comparatively little time invested in optimization all over the VM). By the way, thanks to its much greater start-up time, NQP on the JVM takes a huge 139s. However, much like in the story of the tortoise and the hare, leave things to run for long enough and the JVM backend still typically comes out ahead, sometimes very significantly so.

On memory usage, to compile and run a simple “hello, world” in NQP, MoarVM’s memory usage clocks in at just over half that which Parrot uses, which in turn clocks in at about half that of the hungry JVM. :-)

While the JVM backend doesn’t come out looking too awesome here resource usage wise unless you’re doing something long-running, it’s worth remembering that the JVM port of Rakudo is just a couple of months old, and the whole porting effort less than a year old. In that sense, we’ve a lot of room for improvement yet.

Goodbye APR, welcome libuv

Originally, MoarVM was using the Apache Portable Runtime for IO, multi-platform thread support and a handful of other things. By now, we’ve moved over to using libuv. This isn’t because there was anything inherently wrong with the APR, and while we’ve ended up not using it, I’ve come out with a positive impression of it. So, thanks to the APR developers! The primary reason we moved to using libuv instead was to ensure we can provide support for asynchronous I/O on a relatively wide variety of platforms. We were also using a mix of the atomic primitives from the APR and from libatomic_ops; we’ve now standardized completely on libatomic_ops for this.

Build and Dependencies

There’s a bunch of stuff I’m decent at doing, but build systems is not one of them. My first pass was a case of, “make something work, and hope somebody replaces it.” Happily, that has now happened, so the build system is in much better shape. We’ve also got things in place for building MoarVM as a shared library, and for pulling in varius dependencies while keeping them at enough of a distnace that we should be able to depend on system versions of them (trying to balance easy source builds with keeping future packagers happier).

Contributors

Since becoming a public project, MoarVM has grown its contributor base. In total, 16 authors now show up in the commit log and I’d say we’re at around 5 core developers who commit notable things relatively often. I’m happy that people have felt inspired to contribute, and that things are progressing fairly swiftly now (I hoped we’d be a little further by this point, but my tuit supply over the summer was not what I’d hoped). Currently, it looks like the October NQP release will be the first to include MoarVM support, and from there we’ll focus on getting Rakudo onto MoarVM.

Posted in Uncategorized | 4 Comments