Digging into the performance/reliability work

This week, I finally started digging into Rakudo and MoarVM things again, as part of my performance and reliability grant. Here’s what I got up to.

Faster accessors and an inlining fix

Making object accessors and object construction faster was one of the bullet points on the grant, and I decided to start out with accessors – largely because I’d already figured out pretty much exactly what I wanted to do with them. Here’s a small microbenchmark:

class A {
    has $.x;
}
my $a = A.new(x => 42);
my $y;
loop (my int $i = 0; $i < 10000000; $i++) {
    $y = $a.x;
}
say $y;

The hot loop just calls an accessor method ten million times and assigns the result. Before I started on any improvements, this code took 3.34s to run, including startup/compilation time. That overhead was around 0.15s (measured by timing the program with the loop reduced to a single iteration), so let’s call it 3.2s for the iterations themselves. That makes for about 0.32 microseconds per iteration. Being under a millionth of a second might not seem like something to complain about, but it’s actually a little shy of a thousand CPU clock cycles on a 3 GHz CPU. Not cheap, for something you would expect to be very cheap indeed!
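
For completeness, here’s the arithmetic behind those numbers (taking the idealized 3 GHz clock at face value):

say (3.34 - 0.15) / 10_000_000;   # 0.000000319, so about 0.32 microseconds per iteration
say 0.319e-6 * 3e9;               # 957, a little shy of a thousand clock cycles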

So, where is the cost? Here’s a bit of the profile output:

[Profile output before the changes]

That <anon> is the accessor, which is called ten million times. It’s been JIT-compiled (the green), which is good. The column on the right, however, tells a less happy story: the accessor method was not inlined. Inlining is an optimization that takes the body of a called method and splices it into the caller, completely eliminating the dispatch and calling overhead. Another point on my grant is to look into decreasing call overhead, which really wants to be lower anyway, but being able to inline and get rid of it altogether is better still. And accessor methods are tiny, so they should be great inlining candidates. So what gives?
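
To make that concrete, here’s a rough sketch of what inlining the accessor into the hot loop amounts to (my illustration, not actual optimizer output):

# Illustration only; real inlining happens inside MoarVM's optimizer, not in source code.
use nqp;

class A { has $.x }
my $a = A.new(x => 42);
my $y;

# Before inlining, every iteration pays for a method dispatch and a call frame:
$y = $a.x;

# After inlining, the accessor's body is spliced into the caller, leaving
# (conceptually) just the attribute fetch:
$y = nqp::getattr($a, A, '$!x');

say $y;   # 42 either way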

Well, at the heart of it was the way that we’d implemented accessor methods. Down in the metamodel code, we did something like this (simplified):

$package.^add_method($accessor-name, method () {
    nqp::getattr(self, $package, $attr-name)
});

The method being added there is a closure. It closes over the package the attribute is from and the name of the attribute, which together are used in the lookup. MoarVM’s inliner can not inline things that access lexicals from the surrounding scope, because then you actually need a callframe with a correctly bound outer frame to resolve them. There’s a further cost, though. If we write an accessor method by hand:

method x() { $!x }

Then, looking at the generated code, we see these two instructions:

 wval r4(3), liti16(0), liti16(12)
 getattr_o r5(1), r0(1), r4(3), lits($!x), liti16(0)

The wval is just grabbing a constant symbol, and the name of the attribute is a literal string. Since all those things are known, the MoarVM code specializer is able to re-write those two into:

 sp_p6ogetvc_o r5(1), r0(1), liti16(8), sslot(3)

This is known as a “spesh op”, which only the optimizer is allowed to emit. The 8 there is an offset in bytes from the start of the object’s memory, and the op is interpreted by taking the object in register 0, adding 8 bytes to its address, and dereferencing – without having to do much in the way of “safety checks”. It also JIT-compiles into machine code rather cheaply.

But when we have a late-bound package and attribute name, this optimization cannot happen. So not only do we miss out on inlining, we also lose out because the accessor code we generate isn’t as good as the code you’d get by writing the accessors by hand.

So, what to do? Well, since most of the time the MOP is invoked by the compiler, and the compiler’s job is code generation, the neat solution is to have the compiler provide a compiler services object to the MOP, which the MOP can then use to generate the accessor methods. When no compiler services object is supplied, the MOP falls back to the existing closure-based approach (this happens, for example, when classes are generated dynamically at runtime). So, that’s what I implemented.

The result? 2.43s, or about 2.3s with startup time removed. So it was now running in 72% of the time it did before. That’s an improvement, but not what I was looking for. Disappointed, I looked at the new profiling data and found that we still weren’t actually inlining the accessor methods. Looking at their code, they were free of outer lexical references and otherwise really simple. So what was going on?

It turned out to be an inlining bug that was also leading to many more missed opportunities. I’d heard from other folks looking at profiles that they were surprised certain things didn’t get inlined, but in the push to get things to work before Christmas, rather than to get them fast, I didn’t look into it. Now I have the chance to take care of such things, and so I investigated. The bug turned out to be an unfortunate accident introduced while teaching the inliner about multiple dispatch, so that it could inline multi subs and methods. It left us, oddly, pretty good at inlining multi-dispatch calls, but missing out on lots of (supposedly easier) single-dispatch calls. Once I figured out what was going on, the patch was short and simple. This should have much wider benefits than just for accessors.

But, back to accessors. With inlining of various single-dispatch things fixed up, the benchmark now ran in 0.63s, so let’s call it around 0.5s once startup time is removed. That’s 0.05 microseconds per iteration of the loop, or around 150 CPU cycles per loop. That’s still too high, but running that loop in 15% of the time we used to is a nice step forward.

Just to put that figure on the map a little, I wrote the following Perl 5 program, which I hope is reasonably equivalent:

package A;
use Moo;
has x => ( is => 'ro' ); 

package Main;
use feature 'say';   # needed for say() below
my $a = A->new(x => 42);
my $y;
for (my $i = 0; $i < 10000000; $i++) {
    $y = $a->x; 
}
say $y;

This ran in 5.12 seconds, with 0.08s startup/compilation time, so let’s call it 5 seconds for the iterations, or around 0.5 microseconds, or 1500 CPU cycles on our over-idealized 3GHz CPU. (For those wondering why I didn’t stick use integer into the Perl 5 benchmark to reflect the native int I used in the loop of the Perl 6 one: I tried, and on my box that made things slower.)

There was another small benefit: CORE.setting got a bit smaller. While we’re producing a bit more bytecode now, we’re serializing fewer closures, and it turns out that the bytecode approach weighs a little less. My guess going into this was that I’d about break even, so coming out with a 130KB lighter CORE.setting was very nice. That means a smaller resident memory size also.

A big leak

Away from performance work, I spent some time hunting down a reported memory leak. Tracing things back from the valgrind output I was sent, I recreated a one-liner to reproduce it. The leak needed certain combinations of features to be used; there were various paths to trigger it, though my reproduction of it involved a multiple dispatch sub with a where constraint, called with flattening arguments. If such situations happened a lot, we could leak rather heavily. It’s gone now.

The analysis of this issue was helped by my having spent some time in the run-up to this grant getting MoarVM’s --full-cleanup mode into better shape. Usually we don’t waste time neatly clearing everything up at exit, because the OS can do it way faster (yes, I measured). However, MoarVM invoked with the --full-cleanup flag will try to do such cleanup, freeing everything it allocated. This is useful because it makes actual leaks, not just incomplete cleanup, clear to see. I’m not all the way there with this work yet; however, NQP and most of its test suite are already valgrind-clean. I’ll be continuing to pick off the missing bits of cleanup over the course of the grant, and at some point want to set up a spectest run with valgrind + --full-cleanup to give ourselves a good chance of hunting all the leaks down.

Work in progress: lazy string heap decoding

A decent chunk of the CORE.setting compiled output, along with various other bits of the compiler, is string literals. Of course, not all of them are equally used (think about the hundreds of error messages you never normally encounter). To date, we’ve always taken all of the strings and decoded them at bytecode load time, creating an NFG MVMString data structure (and so doing the bit of analysis needed to see if they need any synthetic codepoints generating). Looking at our memory use, it became clear there might be a win from deserializing them on-demand. The patch doing so isn’t quite ready for prime-time yet (though a follow-up fix by Timo may have addressed the crash my initial patch produced in an NQP test), but it revealed that we can knock around 1.3MB off Rakudo’s base memory usage with such an approach. (Fun fact: with the patch, on my box NQP running while 1 { } uses 1.8MB less memory than the JVM running the equivalent Java program – except NQP actually has the NQP compiler in memory and compiled the script too!) Anyways, expect this improvement to land in the next couple of days.

What next?

There’s so much to do! One task will be applying the same approach I took for accessors to object construction, to see how far that helps. Our object construction speed is a known bottleneck, and a simple benchmark against Perl 5 + Moo shows Rakudo being a disappointing 7 times slower. So, some work will be needed there to reach parity, and then to get ahead. And on the reliability track, I’ll be picking out some bugs, leaks, etc. to squish too.


A few words on Perl 6 versioning and compatibility

Recently, I put together a set of guidelines for Perl 6 versioning and backward compatibility. They have been refined with input from the Perl 6 community. Together, they amount to a read of over 4,000 words, and are fairly detailed. You can consider this post a TL;DR version of the guidelines, giving the big picture idea of what they mean.

There’s more than one thing to version

In common with many languages, Perl 6 has a clean separation of language version and compiler version. Just as C has language versions like C89, C99, and C11, Perl 6 has versions 6.c, 6.d, 6.e, and so forth. To compile C you might be using GCC, Clang, MSVC, or something else, and these have their own version schemes. To compile and run Perl 6 you’ll for now probably be using Rakudo, which has monthly releases with a boringly predictable naming scheme: 2015.11, 2015.12, 2016.01, etc.

Perl 6 language version = a test suite

A given Perl 6 language version is defined by a test suite. An implementation that passes that test suite can claim to provide that version of the Perl 6 language. Rakudo 2015.12 was the first release to provide Perl 6.c, because it passed the test suite. Rakudo 2016.01 also passes the test suite, and thus provides Perl 6.c too.

The master branch of the test suite represents the latest development of the Perl 6 language. Language releases are represented by tags (implying that 6.c, for example, is immutable).

Programs may declare which version they want

The first statement of a program (either a script or a module) may contain a use statement requesting a particular language version:

use v6.c;

Implementations are required to refuse to run code for versions of Perl 6 they do not provide.

No language version declaration implies latest released version

Pretty much self explanatory. Asking for nothing in particular gets you the latest and greatest Perl 6 version the implementation has to offer. Meaning that 20 years down the line, people wanting to show off the latest things won’t have to prefix all their example snippets with use v6.n.

Module installers/directories may be more demanding, and require modules to specify a language version. That’s a decision for those making such tooling.

An implementation may support many language versions at once

Assuming that we release a Perl 6.d sometime during 2016 (remember that this means “release a test suite defining what 6.d is”), then Rakudo 2016.12 will provide both Perl 6.c and Perl 6.d. To do that, it will be required to pass both the 6.c and 6.d test suites.

Implementations will need to go to moderate effort to do this

New syntax (new operators, new phasers, etc.) introduced by Perl 6.d should not be available to a program declaring use v6.c. This is to ensure that the syntactic additions do not cause breakage. For example, adding a DONE phaser could break any code that had a DONE sub and called it with listop syntax. Rakudo will, for now, support this by guarding such syntax in the parser with a version assertion. (Another strategy is to ship multiple parsers, presumably loading “on demand” those that are actually needed.)

Lexical aspects of the CORE.setting (the builtins) get the same treatment. Recall that in Perl 6, the builtins are not imported, but rather are in the outer lexical scope of the program. If Perl 6.d adds new subs, different implementations of subs, new classes or roles, or new constants, they should go in a CORE.d.setting, or some equivalent. Implementations are free to choose exactly how they will structure things. In Rakudo, we’ll retain CORE.setting with the “base” set of things, and have a CORE.d.setting that adds or overrides things as needed for Perl 6.d. This means that new Perl 6 language versions can change the behavior of existing operators or subs in ways that are not backward compatible, without breaking code that declared in a use statement it wants a previous version of Perl 6.

Late-bound things (methods, dynamic variables) have different rules. See the full guidelines for the details, but the upshot is that Perl 6 designers have a bit less flexibility in how they evolve methods on built-in types compared to subs, syntax, etc. Backward incompatible changes (as judged by the test suite) are not allowed; new methods or new overloads that don’t conflict with existing behaviors are. The other key difference is that no effort is made to “hide” methods added as part of newer versions of Perl 6 from callers using an earlier version of the Perl 6 language.

TL;DR, supposing you have a module that declares use v6.c:

  • The onus is on Perl 6 implementations to not let your program run into 6.d syntax changes or new subs/constants in the builtins
  • The onus is on you to not use methods that did not exist in 6.c. However, tooling will probably come to exist to help out (for example some kind of Module::FlightCheck that uses rakudobrew to grab the 2016.01 release, which is known to support nothing later than 6.c, and run your module’s tests on it).

If you think the second part of this sucks, then read the guidelines to get an idea why the alternatives all suck harder. :-)

Long Term Support and security releases

Rakudo releases every month, and since each release will provide a range of Perl 6 language versions by passing their official test suites, then in a sense every one is an “official” release. To be very clear, there is no release we should talk about as being “the 6.c release of Rakudo” or “the 6.d release of Rakudo” (though some folks probably will anyway, no matter what I say, and they’ll most likely mean “the first Rakudo release that supported 6.c” or so).

What we will do is declare some releases as “Long Term Support” releases. This label will be applied to releases some time after they have been made, so we can support releases that we know behaved reasonably well in the wild – at least in their first month or so “out there”. For example, suppose that 2016.02 is fairly well received. We might declare it a LTS release, and we’ll declare with that the period of time we intend to “support” it for.

What does support mean? It means that, in the event of security patches or serious bug fixes, we’ll produce bug compatible releases of all current Long Term Support versions of Rakudo. For example, suppose 2016.02 and 2016.10 were marked as LTS releases for a period of 12 months. In December 2016, we find a serious security bug in Rakudo. We’d release a 2016.02.1 and 2016.10.1, which would branch from the 2016.02 and 2016.10 tags and have the required patch(es) cherry-picked in. This would allow upgrades to get the security fix with a very high degree of confidence that existing code will not break.

Minor language versions

One thing we’d prefer to avoid is people declaring dependencies in their Perl 6 code on particular compiler versions. There’s no way to prevent it, but we can try to reduce the temptation to do so. The typical use case would be wanting to depend on a particular bug fix. Fixes get coverage in the language test suite, and so will be part of the next language release – but since the major language versions will tend to have at least a year between them, that could be a bit too long to wait.

Therefore, we’ll also have minor language versions, named 6.c.1, 6.c.2, etc. These give something implementation-independent to depend on. Chances are these will be needed more in the short term than in the long term.

What will we market?

Major language versions, primarily. We’ll use minor language versions to focus on incremental improvement and refinement. The interesting “next big thing” will be the major language versions. Each major language version will get a name. We’ve picked celebrations as the naming theme; 6.c was “Christmas”, and 6.d will be “Diwali”. (That doesn’t mean we’ll be pushing ourselves to actually ship it anywhere near where Diwali falls in the calendar. We already did that with Christmas, taking care to release on Christmas day.) So, look out for a talk on “What’s coming in Perl 6 Diwali” at some conference later on in the year. :-)

Trying out the future

One question the above raises is how to try out the latest implementation work towards the next major version. For that, we’ll use versions such as 6.d.a (6.d “alpha”). So:

use v6.d.a;

Will get you access to the stuff we expect to be in 6.d. Note these lettered versions will really be giving you the current work in progress and come with absolutely no backward-compatibility or stability promises, and support for them will be dropped swiftly after the actual language release of 6.d.

Where will new spectests go? Do I need to tag them somehow?

In the master branch of the repository of all spectests, and they need no tagging up besides the usual fudging. Released language versions are handled as tags, which are immutable.

Are the spectests really enough to specify a language?

I think they’re the best tool we have for the job at the moment. We might want to look towards property based testing ala QuickCheck some more, but that’s still a test-based approach. Natural language doesn’t have the precision, but more critically lacks the verifiability (that is, you can’t run a Perl 6 compiler against a natural language specification). Formal methods, such as operational or denotational semantics, offer greater precision than tests, but the intersection of people who know how to apply those and who want to contribute to Perl 6 is probably tiny. Certainly they lack the accessibility of a test suite expressed in Perl 6, and so would take us away from the goal of Perl 6 being a community’s language.

All that said, it’s fairly clear we need anywhere between 2 and 10 times the current number of tests to have comfortable coverage of both the language and its wide array of built-ins. We’ll be looking into coverage analyses to help us understand where those tests are most lacking.

You didn’t answer my really important question about Perl 6 versioning!

Then leave a comment, and maybe I’ll do a follow-up post to answer it. :-)


Not guts, but 6: part 5

It’s time for me to start building a simple Stomp::Server class, test-driven. I’ll need to extend my Test::IO::Socket::Async to make this possible, as it currently doesn’t handle listening sockets.

The simplest start

I’ll start out by stubbing an almost empty Stomp::Server class, which goes in lib/Stomp/Server.pm6:

class Stomp::Server {
    method socket-provider() {
        IO::Socket::Async
    }
}

Once again, I’ll give it a socket-provider method so I can inject a socket test double. Then, it’s time for a new test file, server.t:

use Test;
use Test::IO::Socket::Async;
use Stomp::Server;

constant $test-socket = Test::IO::Socket::Async.new;
my \TestableServer = Stomp::Server but role {
    method socket-provider() {
        $test-socket
    }
}

So, where to begin? To set me off in a consistent direction, I take a look at Stomp::Client and notice it expects to be constructed with a host and port. That seems like a good starting point. So, some tests:

constant $test-host = 'localhost';
constant $test-port = 1234;
dies-ok { TestableServer.new }, "Must provide host and port to new (1)";
dies-ok { TestableServer.new(host => $test-host) }, "Must provide host and port to new (2)";
dies-ok { TestableServer.new(port => $test-port) }, "Must provide host and port to new (3)";

These are easily passed, by adding to Stomp::Server:

has Str $.host is required;
has Int $.port is required;

A typical pattern for asynchronous server-like things in Perl 6 is to expose a supply of incoming connections. I may as well call that listen. Here’s the simplest test I can write for that:

my $test-server = TestableServer.new(host => $test-host, port => $test-port);
my $listen-supply = $test-server.listen();
isa-ok $listen-supply, Supply, "Stomp::Server listen method returns a Supply";

It explodes since there’s no listen method. Stubbing one in that contains a supply block gets me a pass:

method listen() {
    supply {
    }
}

So far, so easy.

The audience is listening

Now for something a little more involved. I want to make sure that a listening socket is only opened once I tap the supply that comes back from Stomp::Server’s listen method:

my $socket-listening = $test-socket.start-listening;
is $socket-listening.status, Planned, "Not listening before supply is tapped";
my $listen-tap = $listen-supply.tap(-> $incoming { });
my $socket-listener = await $socket-listening;
ok $socket-listener, "Listening once supply is tapped";

I’d also like to make sure it listens on the correct host/port:

is $socket-listener.host, $test-host, "Listening on correct host";
is $socket-listener.port, $test-port, "Listening on correct port";

And that closing the tap on the supply Stomp::Server gives me back will also close the listen supply from the socket:

$listen-tap.close;
ok (await $socket-listener.is-closed), "Closing supply tap also closes socket";

So, that’s the tests, but Test::IO::Socket::Async isn’t up to the job yet – so that’s first in line. I’ll want an object that represents a listening socket, and I can see that at the very least it’ll need a host and a port. I’ll also need to deal with the same race between tests and code under test that I had with connect, meaning that Listener itself should hold the Supply that I will use to simulate incoming connections. Finally, I need to provide a way to test that at some point it stops listening, exposing a Promise that is kept when that happens. That’s quite a few things that need wiring together. Happily, it falls out really quite easily, by wiring things up at construction time:

class Listener {
    has $.host;
    has $.port;
    has $.is-closed = Promise.new;
    has $!is-closed-vow = $!is-closed.vow;
    has $!connection-supplier = Supplier.new;
    has $.connection-supply = $!connection-supplier
        .Supply
        .on-close({ $!is-closed-vow.keep(True) });
}

That’s really quite pretty. Perl 6 promises that attribute initializers run in order, so I can safely rely on $!is-closed containing the Promise I next take a vow from – and also keep that vow private. I also keep the ability to inject new connections private, and then tweak the Supply with on-close, which lets me run some logic to keep the is-closed promise when a tap on the supply is closed. Since everything exposed is either immutable or concurrent, there’s no need for this to be a monitor rather than a class.
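
As a small aside, here’s on-close in isolation (my own illustration, separate from the module):

# Illustration: on-close runs its callback when a tap on the resulting Supply is closed
my $supplier = Supplier.new;
my $closed   = Promise.new;
my $supply   = $supplier.Supply.on-close({ $closed.keep(True) });

my $tap = $supply.tap({ .say });
$supplier.emit("still tapped");    # printed by the tap
$tap.close;                        # triggers the on-close callback
say await $closed;                 # True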

That just leaves me to write a couple of methods in Test::IO::Socket::Async. One is listen, which should match the IO::Socket::Async method. The other is start-listening, which returns a Promise that tests can await to get the Listener instance. As with the connect testing, I’ll need a couple of attributes too.

has @!waiting-listens;
has @!waiting-start-listening-vows;

method listen(Str() $host, Int() $port) {
    my $listener = Listener.new(:$host, :$port);
    with @!waiting-start-listening-vows.shift {
        .keep($listener);
    }
    else {
        @!waiting-listens.push($listener);
    }
    $listener.connection-supply
}

method start-listening() {
    my $p = Promise.new;
    with @!waiting-listens.shift {
        $p.keep($_);
    }
    else {
        @!waiting-start-listening-vows.push($p.vow);
    }
    $p
}

Recall that Test::IO::Socket::Async is a monitor, meaning there are no data races here. Installing these updates, and running my tests, things get a bit further, then hang here:

my $socket-listener = await $socket-listening;

That’s not surprising, because the code under test never actually starts to listen. Let me make that happen with the simplest possible addition to Stomp::Server’s listen method:

method listen() {
    supply {
        whenever self.socket-provider.listen($!host, $!port) {

        }
    }
}

With that, all the tests pass. But wait…how did the socket closed test pass? That’s thanks to supply blocks being smart enough to keep track of all the things tapped by a whenever inside of them, and closing them automatically when the corresponding tap on the supply block itself is closed. Resource management is one of the things that supply blocks quietly take care of, avoiding all kinds of potential resource leaks.
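
Here’s a tiny standalone illustration of that behaviour (my own example, not part of the module):

# Illustration: closing the outer tap also closes the whenever's inner tap
my $inner-closed = Promise.new;
my $inner = Supply.interval(0.1).on-close({ $inner-closed.keep(True) });

my $outer = supply {
    whenever $inner -> $n {
        emit $n;
    }
}

my $tap = $outer.tap({ say "tick $_" });
sleep 0.35;
$tap.close;                  # also closes the whenever's tap on $inner
say await $inner-closed;     # True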

Incoming connections

Next, I want to test and implement the server side of the incoming connection handshake. This will need me to finish up Test::IO::Socket::Async. As usual, I’ll start by writing the tests I want to have:

constant $test-login = 'user';
constant $test-password = 'correcthorsebatterystaple';

my $test-server = TestableServer.new(host => $test-host, port => $test-port);
my $listen-tap = $test-server.listen().tap(-> $conn { });
my $socket-listener = await $test-socket.start-listening;
my $test-conn = $socket-listener.incoming-connection;

$test-conn.receive-data: Stomp::Message.new(
    command => 'CONNECT',
    headers => (
        login => $test-login,
        passcode => $test-password,
        accept-version => '1.2'
    ));

my $message-text = await $test-conn.sent-data;
my $parsed-message = Stomp::Parser.parse($message-text);
ok $parsed-message, "Server responded to CONNECT with valid message";
my $message = $parsed-message.made;
is $message.command, "CONNECTED", "Server sent CONNECTED command";
ok $message.headers<accept-version>:exists, "Server sent accept-version header";
is $message.body, "", "Server sent no message body";

The key new piece is the incoming-connection method. The API for simulating received data and obtaining sent data works just like in the client sockets testing – suggesting they’ll want to share a lot of the same code. But can they share all of the same code?

Glancing at the Connection monitor I already wrote, it seems the answer is no. The first four attributes:

has $.host;
has $.port;
has $.connection-promise = Promise.new;
has $!connection-vow = $!connection-promise.vow;

Aren’t really interesting for an incoming connection. The rest of the code, which deals purely with sending and receiving data, seems relevant, however. So, I’ll rename Connection to ClientConnection. I’ll then add a role called Connection, and factor the common bits out there. This gives me:

role Connection {
    has @!sent;
    has @!waiting-sent-vows;
    has $!received = Supplier.new;

    # print, write, sent-data, Supply, receive-data...
    ...
}

monitor ClientConnection does Connection {
    has $.host;
    has $.port;
    has $.connection-promise = Promise.new;
    has $!connection-vow = $!connection-promise.vow;

    method accept-connection() {
        $!connection-vow.keep(self);
    }

    method deny-connection($exception = "Connection refused") {
        $!connection-vow.break($exception);
    }
}

I’ll then define a ServerConnection monitor, which for now simply composes the Connection role:

monitor ServerConnection does Connection {
}

I note how nice it is that code factored out to a role can happily be composed into classes and monitors, and will automatically get the desired mutual exclusion behaviour when composed into a monitor. Now, I can implement the incoming-connection method in Listener:

method incoming-connection() {
    my $conn = ServerConnection.new;
    $!connection-supplier.emit($conn);
    $conn
}

And I’m done extending Test::IO::Socket::Async to support testing server sockets. Committed! And yes, I owe that module some tests and a README some time soon…maybe on the plane tomorrow. For now, the STOMP must go on!

So, how can I make my test pass? By doing the absolute easiest thing possible, of course. Here it is:

method listen() {
    supply {
        whenever self.socket-provider.listen($!host, $!port) -> $conn {
            whenever $conn {
                await $conn.print: Stomp::Message.new:
                    command => 'CONNECTED',
                    headers => ( accept-version => '1.2' );
            }
        }
    }
}

This is a fairly epic cheat. It just responds with a CONNECTED message when it receives anything on the socket! It passes the test, though. I’ve learned – mostly when doing ping-pong pair programming – that being willing to “cheat” my way to passing simple tests is actually a good thing. It makes it clear what tests I need to write next.

Sharing the parsing

To check I do treat CONNECT messages correctly, I’ll now write a test case where I send complete junk to the server:

my $test-server = TestableServer.new(host => $test-host, port => $test-port);
my $listen-tap = $test-server.listen().tap(-> $conn { });
my $socket-listener = await $test-socket.start-listening;
my $test-conn = $socket-listener.incoming-connection;

$test-conn.receive-data: "EPIC FAIL!";

my $message-text = await $test-conn.sent-data;
my $parsed-message = Stomp::Parser.parse($message-text);
ok $parsed-message, "Server responded to invalid message with valid message";
is $parsed-message.made.command, "ERROR", "Server sent ERROR command";

Obviously, it fails, sending back a CONNECTED message instead of an ERROR. So what to do about it? Clearly, I’ll need to start paying a bit more attention to the messages coming in. That is, I need to turn a sequence of packets into a sequence of Stomp::Message objects. Hmm, that sounds familiar! Glancing in Stomp::Client, I see this:

method !process-messages($incoming) {
    supply {
        my $buffer = '';
        whenever $incoming -> $data {
            $buffer ~= $data;
            while Stomp::Parser::ServerCommands.subparse($buffer) -> $/ {
                given $/.made -> $message {
                    die $message.body if $message.command eq 'ERROR';
                    emit $message;
                }
                $buffer .= substr($/.chars);
            }
        }
    }
}

Well, that’s precisely what I want – except it’s parsing server commands, and I need to parse client commands. Factoring a method out feels like a job for a role – and needing to make the factored out code be parametric on something makes a parametric role just the ticket. So, I’ll add a Stomp::MessageStream role:

role Stomp::MessageStream[::MessageGrammar] {
    method !process-messages($incoming) {
        supply {
            my $buffer = '';
            whenever $incoming -> $data {
                $buffer ~= $data;
                while MessageGrammar.subparse($buffer) -> $/ {
                    given $/.made -> $message {
                        die $message.body if $message.command eq 'ERROR';
                        emit $message;
                    }
                    $buffer .= substr($/.chars);
                }
            }
        }
    }
}

And use it in Stomp::Client:

class Stomp::Client does Stomp::MessageStream[Stomp::Parser::ServerCommands] {
    ...
}

The client.t tests still happily pass, so it goes in as a commit. Now I can also compose the role into my Stomp::Server and use it:

class Stomp::Server does Stomp::MessageStream[Stomp::Parser::ClientCommands] {
    has Str $.host is required;
    has Int $.port is required;

    method listen() {
        supply {
            whenever self.socket-provider.listen($!host, $!port) -> $conn {
                whenever self!process-messages($conn) {
                    await $conn.print: Stomp::Message.new:
                        command => 'CONNECTED',
                        headers => ( accept-version => '1.2' );
                }
            }
        }
    }

    method socket-provider() {
        IO::Socket::Async
    }
}

That turns my test fail into…a hang. Why? Because the process-messages private method simply assumes that if it didn’t manage to parse a message, the reason must be that it’s incomplete – not that it’s broken. It will therefore just accumulate broken data in its buffer. Clearly, the parser needs to fail more violently if the incoming message is utterly bogus.

So, over in lib/Stomp/Parser.pm6, I’ll add an exception:

class X::Stomp::MalformedMessage is Exception {
    has $.reason;
    method message() {
        "Malformed STOMP message: $!reason"
    }
}

Then, I’ll tweak the TOP token to:

token TOP {
    [ <command> \n || <.maybe-command> ]
    [<header> \n]*
    \n
    <body>
    \n*
}

The addition here is the sequential alternation, calling maybe-command (the dot indicates to not capture the result). The idea of maybe-command is to check if the data so far might viably parse as a command if some more data were to arrive. It can look like this:

token maybe-command {
    <[A..Z]>**0..15 $ || <.malformed('invalid command')>
}
method malformed($reason) {
    die X::Stomp::MalformedMessage.new(:$reason);
}

With that, I’ve gone from a hanging test to an exploding test. Happily, all of the client.t tests that also use the parser still work, though, so it can go in as a commit of its own.

It’s at this point that I finally ran into my first Rakudo bug of this series. The code I want to now write looks like this:

method listen() {
    supply {
        whenever self.socket-provider.listen($!host, $!port) -> $conn {
            whenever self!process-messages($conn) {
                await $conn.print: Stomp::Message.new:
                    command => 'CONNECTED',
                    headers => ( accept-version => '1.2' );

                QUIT {
                    when X::Stomp::MalformedMessage {
                        await $conn.print: Stomp::Message.new:
                            command => 'ERROR',
                            body => .message;
                    }
                }
            }
        }
    }
}

Unfortunately, QUIT phasers suffer a scoping bug if there is an exception before any other messages have been received. Thankfully, it’s easy enough to drop the whenever sugar in this case and fall back on the tap method:

method listen() {
    supply {
        whenever self.socket-provider.listen($!host, $!port) -> $conn {
            self!process-messages($conn).tap:
                {
                    await $conn.print: Stomp::Message.new:
                        command => 'CONNECTED',
                        headers => ( accept-version => '1.2' );
                },
                quit => {
                    when X::Stomp::MalformedMessage {
                        await $conn.print: Stomp::Message.new:
                            command => 'ERROR',
                            body => .message;
                    }
                };
        }
    }
}

And with that, the test passes.

And that’s Stomp::Server’s first steps

There’s still a bit more to do to ensure the first message really is a CONNECT, and to provide a hook for authentication. And then I’ll need to work out how the API for incoming messages is going to look, along with subscription/unsubscription requests. But that’ll be for next time. In the meantime, here’s a commit of the server work so far.


Not guts, but 6: part 4

I’ve managed to marry myself into getting two Christmases a year. The Orthodox one takes place on the 7th of January, so I’ve been celebrating that. And now the trek back home is underway, stopping off to enjoy the snow and nice mood in Kiev for a couple of nights before returning to Prague and normal life and work. (And, if you’re wondering, yes, I shall eat a Chicken Kiev while here.)

In today’s post, I’ll be keeping it simple: improving my test coverage, fixing a couple of small design issues, supporting unsubscription, and using a new little module I wrote to deal with a pesky data race.

Tweaking send

The next easy thing to write tests for is the send method, so I’ll start there. Here’s the tests:

constant $test-destination = "/queue/shopping";
constant $test-body = "Buy a karahi!";
my $send-promise = $client.send($test-destination, $test-body);
$message-text = await $test-conn.sent-data;
$parsed-message = Stomp::Parser.parse($message-text);
ok $parsed-message, "send method sent well-formed message";
$message = $parsed-message.made;
is $message.command, "SEND", "message has SEND command";
is $message.headers<destination>, $test-destination, "destination header correct";
is $message.headers<content-type>, "text/plain", "has default content-type header";
is $message.body, $test-body, "message had expected body";
is $send-promise.status, Kept, "Promise returned by send was kept";

A little wordy, but there’s nothing new going on. One of them fails, though:

not ok 13 - destination header correct
# Failed test 'destination header correct'
# at t\client.t line 75
# expected: '/queue/shopping'
#      got: '/queue//queue/shopping'

Hmm. Let me look at send:

method send($topic, $body) {
    self!ensure-connected;
    $!connection.print: Stomp::Message.new:
        command => 'SEND',
        headers => (
            destination  => "/queue/$topic",
            content-type => "text/plain"
        ),
        body => $body;
}

Ah, there it is. My advent post hard-coded the RabbitMQ queue path, but the module really should allow full control over the destination. That’s easily fixed, and I’ll take the time to do a little rename also:

method send($destination, $body) {
    self!ensure-connected;
    $!connection.print: Stomp::Message.new:
        command => 'SEND',
        headers => (
            destination  => $destination,
            content-type => "text/plain"
        ),
        body => $body;
}

It’s easy to under-value simple things like renaming variables to keep up with the evolving language of a design, but I’ve found it to be really worthwhile. I tend to call such refactors “domain refactors”. They are often small and subtle, but together they help make the code easier to follow, improve consistency, and so ease future development. Anyway, committed!

There’s one other thing that stands out to me here, which is that it’d be good to be able to choose the content type also. First, a test:

constant $test-type = "text/html";
$send-promise = $client.send($test-destination, $test-body,
    content-type => $test-type);
$message = Stomp::Parser.parse(await $test-conn.sent-data).made;
is $message.headers<content-type>, $test-type, "can set content-type header";

It’s easily implemented, adding an optional named parameter that defaults to the text/plain content type. With the variable names perfectly matching the header names, this means I can get some repetition out of the code with the variable colon pair syntax:

method send($destination, $body, :$content-type = "text/plain") {
    self!ensure-connected;
    $!connection.print: Stomp::Message.new:
        command => 'SEND',
        headers => ( :$destination, :$content-type ),
        body => $body;
}

And there’s my second commit.

Subscription and unsubscription

Now I’ll turn to receiving messages. Once again, the tests aren’t too difficult to write, and follow a sufficiently common pattern that I’m already starting to ponder whether it’s time to factor things out a bit:

my $sub-supply = $client.subscribe($test-destination);
isa-ok $sub-supply, Supply, "subscribe returns a Supply";
my $sent-data-promise = $test-conn.sent-data;
is $sent-data-promise.status, Planned, "did not yet send subscription request";
my @messages;
my $sub-tap = $sub-supply.tap({ @messages.push($_) });
$message-text = await $sent-data-promise;
$parsed-message = Stomp::Parser.parse($message-text);
ok $parsed-message, "subscribe method sent well-formed message";
$message = $parsed-message.made;
is $message.command, "SUBSCRIBE", "message has SUBSCRIBE command";
is $message.headers<destination>, $test-destination, "destination header correct";
ok $message.headers<id>:exists, "had an id header";

One fails. Once again, it’s the destination header. Here’s how my subscribe method looks:

method subscribe($topic) {
    self!ensure-connected;
    state $next-id = 0;
    supply {
        my $id = $next-id++;

        $!connection.print: Stomp::Message.new:
            command => 'SUBSCRIBE',
            headers => (
                destination => "/queue/$topic",
                id => $id
            );

        whenever $!incoming {
            if .command eq 'MESSAGE' && .headers<subscription> == $id {
                emit .body;
            }
        }
    }
}

Ah, yes, it’s the topic/destination discrepancy again. And, given I have a $id variable, I’ll be able to use the colon pair variable form again. Here goes:

method subscribe($destination) {
    self!ensure-connected;
    state $next-id = 0;
    supply {
        my $id = $next-id++;

        $!connection.print: Stomp::Message.new:
            command => 'SUBSCRIBE',
            headers => ( :$destination, :$id );

        ...
    }
}

That’s better but…something is not quite right still. I cheated a bit when I wrote this for the advent post, and nobody was observant enough to call me out on it – so I guess I’ll just have to out myself. There’s a data race on $next-id, should two threads end up making subscriptions at the same time. It’s not likely to crop up, but it still wants dealing with. I’ll do that in a moment.
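
To see why it matters, here’s a tiny standalone demonstration (my own illustration, not the module’s code) of what unsynchronized increments from several threads can do:

# Illustration only: bumping a shared counter from four threads without synchronization
my $next-id = 0;
await (^4).map: { start { $next-id++ for ^100_000 } };
say $next-id;   # usually well short of 400000, because increments got lost

A lost increment means two threads read the same value and both hand it out, so two subscriptions could end up sharing an id: exactly the kind of rare, hard-to-reproduce bug worth designing away.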

Before that, I’d like to get unsubscription handled. Closing the tap should do an unsubscribe. First, some tests:

my $expected-id = $message.headers<id>;
$sub-tap.close;
$message-text = await $test-conn.sent-data;
$parsed-message = Stomp::Parser.parse($message-text);
ok $parsed-message, "unsubscribing sent well-formed message";
$message = $parsed-message.made;
is $message.command, "UNSUBSCRIBE", "message has UNSUBSCRIBE command";
is $message.headers<id>, $expected-id, "id matched the subscription";

This hangs on the await, because at present nothing is sent when the tap on the supply is closed. Happily, the CLOSE phaser makes it easy to write logic that will be run on tap close:

method subscribe($destination) {
    self!ensure-connected;
    state $next-id = 0;
    supply {
        my $id = $next-id++;

        $!connection.print: Stomp::Message.new:
            command => 'SUBSCRIBE',
            headers => ( :$destination, :$id );
        CLOSE {
            $!connection.print: Stomp::Message.new:
                command => 'UNSUBSCRIBE',
                headers => ( :$id );
        }

        ...
    }
}

I could write the CLOSE phaser wherever I wanted inside of the supply block, and so chose to put it near the logic to send a SUBSCRIBE message. Phasers are often handy in that way: they specify code that runs at certain phases in the program, and so free me to place that code in the most helpful place for the reader. And with that, the tests pass. Commit!

Dealing with that data race

So, how to deal with getting ascending IDs in a safe way? There are a couple of options that come to mind:

  • Make Stomp::Client a monitor. That’s probably overkill, however. It’s quite capable of otherwise having methods invoked on it concurrently, since it has no state beyond that set up in connect.
  • Use Lock. But using Lock is generally a last resort, not a first one.

What I really want is a mechanism that can just give me ascending integers. If I generalize that thought a little, I want a safe way to grab the next value available from some sequence. And sequences of values in Perl 6 are typically handled by iterators. However, an Iterator is only safe for consumption from one thread at a time.

So, I wrote another little module: Concurrent::Iterator. It weighs in at well under 50 lines of code, and does a bit more than I need for this use case. Using it, I can just ask for a concurrent iterator over the range of integers from 1 up to infinity, and keep it around in an attribute:

has $!ids = concurrent-iterator(1..Inf);

And then use it in subscribe:

method subscribe($destination) {
    self!ensure-connected;
    supply {
        my $id = $!ids.pull-one;
        ...
    }
}
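
As an aside, the core idea is small enough to sketch here (my own illustration, not the real module’s code or API): wrap an iterator in a monitor, so that pull-one can safely be called from many threads at once.

use OO::Monitors;

# Sketch only: the real Concurrent::Iterator differs in detail and in its API.
monitor Concurrent::Iterator::Sketch {
    has Iterator $.iterator is required;
    method pull-one() { $!iterator.pull-one }
}

my $ids = Concurrent::Iterator::Sketch.new(iterator => (1..Inf).iterator);
say $ids.pull-one for ^3;   # 1, then 2, then 3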

Message arrival

I’m almost up to having tests covering all the stuff that matters in Stomp::Client, but there’s one glaring exception: receiving messages from a subscription. I already set up an array that such messages can be pushed to:

my $sub-tap = $sub-supply.tap({ @messages.push($_) });

So, I’ll now sneak some extra tests in between the subscription and unsubscription tests:

my $expected-id = $message.headers<id>;
is @messages.elems, 0, "no messages received yet";
$test-conn.receive-data: Stomp::Message.new(
    command => 'MESSAGE',
    headers => ( subscription => $expected-id ),
    body    => $test-body
);
is @messages.elems, 1, "one message now received";
isa-ok @messages[0], Stomp::Message, "it's a Stomp::Message";
is @messages[0].command, "MESSAGE", "has the command MESSAGE";
is @messages[0].body, $test-body, "has the correct body";

And…epic fail!

not ok 26 - it's a Stomp::Message
# Failed test 'it's a Stomp::Message'
# at t\client.t line 108
# Actual type: Str

Since the Stomp::Message headers may well contain information relevant to processing the message (such as a content-type header), it would be a good idea to pass those along to the consumer. Thankfully, that’s an easy change to the whenever block, to emit the Stomp::Message itself rather than its body:

whenever $!incoming {
    if .command eq 'MESSAGE' && .headers<subscription> == $id {
        emit $_;
    }
}

And that’ll be the final commit for this time.

I live to server

Next time, I’ll add support to Test::IO::Socket::Async for testing listening sockets, and then use it to start implementing a Stomp::Server class.


Not guts, but 6: part 3

To me, one of the most important things about the asynchronous programming support in Perl 6 is the uniform interfaces the language provides. Promises represent asynchronous operations that will produce a single result, while supplies represent asynchronous operations that may produce a stream of values (which we might find more natural to call “events”) over time.

Perl may well embrace There’s More Than One Way To Do It. However, being able to quickly put together programs that combine our selection of preferred modules still hinges on there being things they do all agree to use. It’s typically the little, unspoken things: the basic data structures (scalars, arrays, hashes – and in Perl 6 lazy iterators too), and that method calls look the same no matter what magic may lie behind their dispatch.

Promises and supplies are the basic asynchronous data structures. Whether we are working against sockets, message queues, time, domain events, or GUI events, we can talk about these sources of asynchronous values or value streams using the Promise and Supply types. And, since it’s easy to create a Promise or Supply and back it with whatever data we feel like, we can use them in writing tests for our asynchronous code too. Which brings me nicely to the next steps for my Stomp::Client.
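
For instance, a test can hand code under test a Promise or Supply backed by nothing more than canned data (a small illustration):

# A Promise kept with canned data, standing in for some asynchronous operation
my $fake-result = Promise.new;
$fake-result.keep("canned result");
say await $fake-result;            # canned result

# A Supply backed by a Supplier, so the test decides what "arrives" and when
my $supplier = Supplier.new;
my $fake-events = $supplier.Supply;
$fake-events.tap({ say "got: $_" });
$supplier.emit("canned event");    # got: canned event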

Sketching out the double I want

I tend to find that the point I actually have a concrete need for something is a good time to design and build it. It gives me a use case, or use cases, to check the design against. To move ahead with testing Stomp::Client – something I wish to do before evolving it further – I need a test double for IO::Socket::Async. Just as a stunt double stands in for a real actor for the purpose of doing dangerous things in a film, a test double stands in for a real object for the purpose of testing code that uses it. Stub objects and mock objects are common examples of test doubles.

I’m going to use my need to test Stomp::Client to drive out the design and implementation of a Test::IO::Socket::Async. I’ll just write tests as I’d like them to look, and then do what’s needed to make things work. First, I’ll add a few constants providing some test data, so I won’t have to repeat it:

constant $test-host = 'localhost';
constant $test-port = 1234;
constant $test-login = 'user';
constant $test-password = 'correcthorsebatterystaple';

Then, pretending I have a Test::IO::Socket::Async already, I’ll take the Stomp::Client type and derive an anonymous type from it that overrides the socket-provider method I added back on day 1. I’ll arrange for the method to return my test socket instance.

constant $test-socket = Test::IO::Socket::Async.new;
my \TestableClient = Stomp::Client but role {
    method socket-provider() {
        $test-socket
    }
}

With the setup out of the way, it’s time to sketch out the first couple of tests. Checking that Stomp::Client connects to the host and port it was constructed with seems like a good start. So, here goes:

my $client = TestableClient.new(
    host => $test-host, port => $test-port,
    login => $test-login, password => $test-password
);
my $connect-promise = $client.connect();
my $test-conn = await $test-socket.connection-made;
is $test-conn.host, $test-host, "Connected to the correct host";
is $test-conn.port, $test-port, "Connected to the correct port";

The first two statements look just like a normal usage of Stomp::Client. The connect method gives back a Promise, which for now I’ll just stick in a variable and worry about later. The third statement is where the test double is used. Since IO::Socket::Async is asynchronous, interactions with its test double also should be. An asynchronous socket may be used by multiple threads, and the code under test may end up interacting with it or creating it on a different thread than our test is running on. Therefore, the test double has a connection-made method that returns a Promise that will be kept when a connect call is made on the test double. The Promise will be kept with some object that represents a test connection, and provides the host and port that were supplied to connect. These are examined in the final two statements.

Implementing the test double

First, the easy part. I’m not yet sure what the thing representing a test connection is going to look like when it’s completed, but I know it must have both a host and a port. So, I’ll just declare a simple class for it inside of Test::IO::Socket::Async:

class Test::IO::Socket::Async {
    class Connection {
        has $.host;
        has $.port;
    }
    ...
}

Next, I’ll do the connect and connection-made methods. Some care is needed here, because there’s a race condition just waiting to happen. Two orderings of events are possible. Either:

  1. The connect call is made on the test double
  2. The connection-made call is made on the test double

Or:

  1. The connection-made call is made on the test double
  2. The connect call is made on the test double

It doesn’t matter which happens, but it does matter that the behaviour is the same. Also, while I don’t immediately have a use case for it, it’s clear that other users of such a test double may wish to test code that connects to many things. Therefore, I’ll add to Test::IO::Socket::Async a pair of attributes:

has @!waiting-connects;
has @!waiting-connection-made-vows;

The first will hold Connection objects for any connect calls that were made, but that are not yet matched up with a connection-made call from the test code. The second plays the opposite role: it holds vows (the thing that is used to keep or break a Promise) for promises returned by connection-made that are not yet matched up with a connect call. The two methods will have a similar kind of symmetry:

method connect(Str() $host, Int() $port) {
    my $conn = Connection.new(:$host, :$port);
    with @!waiting-connection-made-vows.shift {
        .keep($conn);
    }
    else {
        @!waiting-connects.push($conn);
    }
    my $p = Promise.new;
    $p.keep($conn);
    $p
}

method connection-made() {
    my $p = Promise.new;
    with @!waiting-connects.shift {
        $p.keep($_);
    }
    else {
        @!waiting-connection-made-vows.push($p.vow);
    }
    $p
}

Note that a with block is like an if block, but it tests for definedness instead of truth, and sets $_ to the tested object.
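
A quick illustration of the difference:

with 0   { say "runs: 0 is defined, even though it is false" }
if   0   { say "never runs: 0 is false" }
with Nil { say "never runs" } else { say "runs: Nil is undefined, so else fires" }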

Coping with concurrency

There’s one more important thing I need to take care of. Running the tests at this point reveals it. They hang. I try again. Ooh, a pass. Third time? Hang. So, what’s going on? Well, I only told a half-truth earlier when discussing the ordering between connect and connection-made. It’s also possible for the two to be called at the same time! Thankfully, that’s easily fixed. My class needs to become a monitor, which enforces one-at-a-time semantics on the methods of a particular instance. So, it’s off to the ecosystem:

panda install OO::Monitors

OO::Monitors uses Perl 6’s meta-programming features to good effect. All that is needed to make a class into a monitor is to replace the class declarator with a monitor declarator, which is provided by OO::Monitors. Here’s how my test double ends up looking:

use OO::Monitors;

monitor Test::IO::Socket::Async {
    class Connection {
        has $.host;
        has $.port;
    }

    has @!waiting-connects;
    has @!waiting-connection-made-vows;

    method connect(Str() $host, Int() $port) {
        my $conn = Connection.new(:$host, :$port);
        with @!waiting-connection-made-vows.shift {
            .keep($conn);
        }
        else {
            @!waiting-connects.push($conn);
        }
        my $p = Promise.new;
        $p.keep($conn);
        $p
    }

    method connection-made() {
        my $p = Promise.new;
        with @!waiting-connects.shift {
            $p.keep($_);
        }
        else {
            @!waiting-connection-made-vows.push($p.vow);
        }
        $p
    }
}

Not bad.

In denial

This is a promising start. That’s the great thing about Perl 6: every start { … } is Promise-ing. But something is just a little off. While it would be easy to plough ahead and write the next test on the happy path – where the connection to a STOMP server is successful – an even easier one to write is the case where the socket connection fails, and Stomp::Client never gets so far as doing the handshake. But right now, there’s no way to write such a test. The connect Promise is immediately kept when a connect call is made in the test.

Once again, I’ll write the test as I’d like to express it:

my $client = TestableClient.new(
    host => $test-host, port => $test-port,
    login => $test-login, password => $test-password
);
my $connect-promise = $client.connect();
my $test-conn = await $test-socket.connection-made;
$test-conn.deny-connection();
dies-ok { await $connect-promise },
    "Failed STOMP server connection breaks connect Promise";

Now for the changes. First, I’ll extend Connection a bit. It will hold the Promise that will be returned by the connect method. Then, the accept-connection and deny-connection methods will use the vow on that Promise.

class Connection {
    has $.host;
    has $.port;
    has $.connection-promise = Promise.new;
    has $!connection-vow = $!connection-promise.vow;

    method accept-connection() {
        $!connection-vow.keep(self);
    }

    method deny-connection($exception = "Connection refused") {
        $!connection-vow.break($exception);
    }
}

Finally, back in Test::IO::Socket::Async, I’ll update the connect method to just return this Promise:

method connect(Str() $host, Int() $port) {
    my $conn = Connection.new(:$host, :$port);
    with @!waiting-connection-made-vows.shift {
        .keep($conn);
    }
    else {
        @!waiting-connects.push($conn);
    }
    $conn.connection-promise
}

And the test passes. Hurrah. It confirms that Stomp::Client passes socket connect errors on to its consumer, which is certainly a behaviour worth having covered.

Testing what was sent

Now I’d like to start filling out a test case for the CONNECT handshake that Stomp::Client should do with a STOMP server. Here’s the first bit, checking that a well-formed CONNECT message is sent with the correct information:

my $client = TestableClient.new(
    host => $test-host, port => $test-port,
    login => $test-login, password => $test-password
);
my $connect-promise = $client.connect();
my $test-conn = await $test-socket.connection-made;
$test-conn.accept-connection();

my $message-text = await $test-conn.sent-data;
my $parsed-message = Stomp::Parser.parse($message-text);
ok $parsed-message, "Client sent valid message to server";
my $message = $parsed-message.made;
is $message.command, "CONNECT", "Client sent a CONNECT command";
is $message.headers<login>, $test-login, "Client sent login";
is $message.headers<passcode>, $test-password, "Client sent password";
ok $message.headers<accept-version>:exists, 'Client sent accept-version header';
is $message.body, "", "Client sent no message body";

The only new thing here with regard to socket testing is the sent-data method on a test connection. It returns a Promise that will be kept when something is sent using the socket. It will be kept with what was sent. The code that follows checks that the message contains what was expected of it. Note that a few things are done here to avoid test fragility:

  • The headers are tested in a way that does not depend on their ordering, as any order is valid
  • The accept-version is not hard-coded, so the test will not break if the module is later updated to cope with newer protocol versions

I was able to use the previously factored-out Stomp::Parser to avoid testing directly against the raw text of the message, which would have made for an overly specific test.

So, what happens in my test Connection class to support this? First of all, it’s time to make it a monitor, since I’m about to give it mutable state:

monitor Connection {
    ...
}

I’ll then do something similar to what I did when testing connects: keep an array of sent things and an array of vows waiting to be kept. Here’s the code that I add to the Connection class:

has @!sent;
has @!waiting-sent-vows;

method print(Str() $s) {
    @!sent.push($s);
    self!keep-sent-vows();
    self!kept-promise();
}

method write(Blob $b) {
    @!sent.push($b);
    self!keep-sent-vows();
    self!kept-promise();
}

method sent-data() {
    my $p = Promise.new;
    @!waiting-sent-vows.push($p.vow);
    self!keep-sent-vows();
    $p
}

method !keep-sent-vows() {
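    # all(...) builds a junction over the two arrays; in boolean context it is
    # true only while both are non-empty, so keep pairing sent data with vows.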
    while all(@!sent, @!waiting-sent-vows) {
        @!waiting-sent-vows.shift.keep(@!sent.shift);
    }
}

method !kept-promise() {
    my $p = Promise.new;
    $p.keep(True);
    $p
}

The switch to a monitor is critical to avoiding various races that could easily occur. The print and write methods return a kept Promise in order to match the API of IO::Socket::Async itself. And…the test is happy.

Testing what was received

My previous test wasn’t quite a complete test of the connection process, since the Promise returned by Stomp::Client’s connect method is not completed until a CONNECTED frame is received from the server. As usual, I’ll sketch out the test I want to have:

$test-conn.receive-data: Stomp::Message.new(
    command => 'CONNECTED',
    headers => ( version => '1.2' )
);
ok (await $connect-promise), "CONNECTED message completes connection";

IO::Socket::Async uses a Supply for incoming data received by the socket. That makes it rather straightforward to fake up. First, I’ll add an attribute to my test Connection monitor that holds a Supplier:

has $!received = Supplier.new;

The test version of IO::Socket::Async’s Supply method will simply return the Supply that goes with it:

method Supply() {
    $!received.Supply
}

And then I’ll use the Supplier to emit the data we want to fake the socket receiving, taking care to make sure I only pass along blobs of binary data or strings, conveniently stringifying any other type that isn’t already one:

multi method receive-data(Str() $data) {
    $!received.emit($data);
}
multi method receive-data(Blob $data) {
    $!received.emit($data);
}

And with that, I’ve a passing test.

A module is born

I developed Test::IO::Socket::Async at the top of the test file in which I was fleshing out the client tests. However, it really wants to be a separate module, so others can use it in their own tests. So, I gave it its own git repo, with a META.info. Even before adding it to the module list, I could simply do:

panda install .

And use it from my client tests:

use Test::IO::Socket::Async;

Which I then committed.

And what next?

The tests only cover one method of Stomp::Client. However, it should be fairly easy to test the rest, now I’ve got a test double for IO::Socket::Async. This also means I can more confidently move on to implementing some further aspects of the client, working test-first to add features such as unsubscription, disconnecting, and transactions.


Not guts, but 6: part 2

It’s time for more hacking on my Perl 6 STOMP module. Today: parsing.

Pulling out the parser

Given my plans for adding a Stomp::Server to go with my Stomp::Client, I need to factor my STOMP message parser out so it can be used by both. That will be an easy refactor. First, the parser moves off into a file of its own and gets called Stomp::Parser:

grammar Stomp::Parser {
    token TOP {
        <command> \n
        [<header> \n]*
        \n
        <body>
        \n*
    }
    token command {
        < CONNECTED MESSAGE RECEIPT ERROR >
    }
    token header {
        <header-name> ":" <header-value>
    }
    token header-name {
        <-[:\r\n]>+
    }
    token header-value {
        <-[:\r\n]>*
    }
    token body {
        <-[\x0]>* )> \x0
    }
}

Then it’s just a use statement and a small tweak back in Stomp::Client. Done!
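
Roughly speaking, that tweak amounts to something like this (a sketch only; the surrounding buffer-handling code in Stomp::Client is elided):

use Stomp::Parser;

# Where the previously inline grammar was referenced, point at the
# freshly extracted one instead:
while Stomp::Parser.subparse($buffer) -> $/ {
    ...
}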

Testing parsing of commands – and a discovery

Perhaps the most basic test I should write is for being able to parse all of the recognized commands, but not unrecognized ones. So, here goes:

use Test;
use Stomp::Parser;

plan 16;

my @commands = <
    SEND SUBSCRIBE UNSUBSCRIBE BEGIN COMMIT ABORT ACK NACK
    DISCONNECT CONNECT STOMP CONNECTED MESSAGE RECEIPT ERROR
>;

for @commands {
    ok Stomp::Parser.parse(qq:to/TEST/), "Can parse $_ command (no headers/body)";
        $_

        \0
        TEST
}

nok Stomp::Parser.parse(qq:to/TEST/), "Cannot parse unknown command FOO";
    FOO

    \0
    TEST

This doesn’t pass yet, because it turns out the grammar only supports the commands that a server may send, not those a client may send. That’s an easy fix:

token command {
    <
        SEND SUBSCRIBE UNSUBSCRIBE BEGIN COMMIT ABORT ACK NACK
        DISCONNECT CONNECT STOMP CONNECTED MESSAGE RECEIPT ERROR
    >
}

That makes me stop and think a bit, though. I just took a parser suitable for Stomp::Client and generalized it. But now it will also accept messages that a client should never expect to receive. That means I’ll have to add an extra error path for them, which feels suboptimal. Thankfully, since grammars are just funky classes, I can easily introduce variants of the parser that accept just the client commands or just the server commands:

grammar Stomp::Parser::ClientCommands is Stomp::Parser {
    token command {
        <
            SEND SUBSCRIBE UNSUBSCRIBE BEGIN COMMIT ABORT ACK NACK
            DISCONNECT CONNECT STOMP
        >
    }
}

grammar Stomp::Parser::ServerCommands is Stomp::Parser {
    token command {
        < CONNECTED MESSAGE RECEIPT ERROR >
    }
}

And yes, I added tests to cover these too, in the resulting commit.

From parse tree to message

It’s fairly common in Perl 6 for a grammar to come paired with actions, which process the raw parse tree into a higher level data structure. I certainly have a desired data structure: Stomp::Message. So how is it being made today? Here is the code in question:

while Stomp::Parser::ServerCommands.subparse($buffer) -> $/ {
    $buffer .= substr($/.chars);
    if $<command> eq 'ERROR' {
        die ~$<body>;
    }
    else {
        emit Stomp::Message.new(
            command => ~$<command>,
            headers => $<header>
                .map({ ~.<header-name> => ~.<header-value> })
                .hash,
            body => ~$<body>
        );
    }
}

Clearly, part of this would end up getting duplicated in a Stomp::Server, so it’d be better pulled out, and stuck in an actions class. So, I’ll define an actions class nested inside my grammar, and put the logic there:

grammar Stomp::Parser {
    ...

    class Actions {
        method TOP($/) {
            make Stomp::Message.new(
                command => ~$<command>,
                headers => $<header>
                    .map({ ~.<header-name> => ~.<header-value> })
                    .hash,
                body => ~$<body>
            );
        }
    }
}

It’s nice to notice how this is basically a cut-paste refactor. Now for a test:

{
    my $parsed = Stomp::Parser.parse(qq:to/TEST/);
        SEND
        destination:/queue/stuff

        Much wow\0
        TEST
    ok $parsed, "Parsed message with header/body";

    my $msg = $parsed.made;
    isa-ok $msg, Stomp::Message, "Parser made a Stomp::Message";
    is $msg.command, "SEND", "Command is correct";
    is $msg.headers, { destination => "/queue/stuff" }, "Header is correct";
    is $msg.body, "Much wow", "Body is correct";
}

The test fails, because I forgot to set the actions class when calling parse. Hmm…I’d need to do that in Stomp::Client too…and in Stomp::Server. In fact, I can’t think of a case off-hand where I’d want to avoid producing a Stomp::Message. That probably means it wants to be the default, which is easily taken care of by overriding parse and subparse to set the actions:

method parse(|c) { nextwith(actions => Actions, |c); }
method subparse(|c) { nextwith(actions => Actions, |c); }

I use |c to swallow up all the incoming arguments, and then pass them along. Notice how I take care to put my default first, and then splice in anything the caller specifies. This means there’s still a way to provide alternate actions, or to pass Nil to get none at all. Test passes. Commit. Yay.
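
In other words, usage can look something like this (where $frame is assumed to hold a raw STOMP frame string):

# Default: the Actions class is applied, so .made gives a Stomp::Message
my $message = Stomp::Parser.parse($frame).made;

# Still possible to opt out of any actions at all
my $raw-match = Stomp::Parser.parse($frame, actions => Nil);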

Finally, I can go back and tidy up the code in the buffer processing some:

method !process-messages($incoming) {
    supply {
        my $buffer = '';
        whenever $incoming -> $data {
            $buffer ~= $data;
            while Stomp::Parser::ServerCommands.subparse($buffer) -> $/ {
                given $/.made -> $message {
                    die $message.body if $message.command eq 'ERROR';
                    emit $message;
                }
                $buffer .= substr($/.chars);
            }
        }
    }
}

It no longer needs to dig into the parse tree to find the command and body for the error handling. Generally, the code in this method is now much more focused on doing a single thing: turning a stream of incoming characters into a stream of messages, coping with messages that span packet boundaries. Win!

Simplifying the actions

Refactoring feels nicer when there are tests. So, is there anything in the code I now have nicely covered that I fancy cleaning up? Well, perhaps there is a little bit of simplification on offer in my small Actions class:

class Actions {
    method TOP($/) {
        make Stomp::Message.new(
            command => ~$<command>,
            headers => $<header>
                .map({ ~.<header-name> => ~.<header-value> })
                .hash,
            body => ~$<body>
        );
    }
}

For one, I don’t actually need to explicitly do the hash coercion there. The default semantics of construction perform assignment, not binding, and a list of pairs can happily be assigned to a hash (there’s a tiny standalone sketch of this at the end of the section). That map is digging into the parse tree too, and it’d probably be neater to handle the pair construction in a second action method. So, here goes:

class Actions {
    method TOP($/) {
        make Stomp::Message.new(
            command => ~$<command>,
            headers => $<header>.map(*.made),
            body    => ~$<body>
        );
    }
    method header($/) {
        make ~$<header-name> => ~$<header-value>;
    }
}

I think I like that better. It’s not really any shorter, but it breaks the work up into smaller chunks that are easier to digest. So, it’s in.
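
As promised, here’s a small standalone sketch of the assignment-not-binding point (the class and values are invented purely for illustration):

class Example {
    has %.headers;
}

# A list of pairs assigns happily into the %.headers hash attribute:
my $e = Example.new(headers => (destination => '/queue/stuff', ack => 'auto'));
say $e.headers<destination>;   # /queue/stuff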

Pretty nice progress

That’ll do me for this time. By now, I’ve got the things I’d need to build my Stomp::Server module nicely factored out. Better still, they’re covered by some tests. Stomp::Client itself is now much more focused, and down to under a hundred lines of code.

Next, I’ll want to look into getting some testing in place for Stomp::Client. And that will mean taking a little diversion: there’s no test double in the ecosystem for IO::Socket::Async yet, so I’ll need to build one.


Not guts, but 6: part 1

After the Christmas release of Perl 6, I spent the better part of a week in bed, exhausted and somewhat sick. I’m on the mend, but I’m going to be taking it easy for the coming weeks. I suspect it’ll be around February before I’m feeling ready for my next big adventures in Perl 6 compiler/VM hacking. It’s not so much a matter of not having motivation to work on stuff; I’ve quite a lot that I want to do. But, having spent six months where I was never quite feeling well, just somewhere between not well and tolerably OK, I’m aware I need to give myself some real rest, and slowly ease myself back into things. I’ll also be keeping my travel schedule very light over the coming months. The Perl 6 Christmas preparations were intense and tiring, but certainly not the only thing to thank for my exhaustion. 3-4 years of always having a flight or long-distance train trip in the near future – and especially the rather intense previous 18 months – has certainly taken its toll. So, for the next while I’ll be enjoying some quality time at home in Prague, enjoying nice walks around this beautiful city and cooking plenty of tasty Indian dishes.

While I’m not ready to put compiler hat back on yet, I still fancied a little gentle programming to do in the next week or two. And, having put so much effort into Perl 6, it’s high time I got to have the fun of writing some comparatively normal code in it. :-) So, I decided to take the STOMP client I hacked up in the space of an hour for my Perl 6 advent post, and flesh it out into a full module. As I do so, I’m going to blog about it here, because I think in doing so I’ll be able to share some ideas and ways of doing things that will have wider applicability. It will probably also be a window into some of the design thinking behind various Perl 6 things.

Step 0: git repo

I took the code from the blog post, and dropped it into lib/Stomp/Client.pm6. Then it was git init, git add, git commit, and voila, I’m ready to get going. I also decided to use Atom to work on this, so I can enjoy the nice Perl 6 syntax highlighting plug-in.

Testing thinking

Since my demos for the blog post actually worked, it’s fairly clear that at this point I have “working code”. Unfortunately, it also has no tests whatsoever. That makes me uneasy. I’m not especially religious about automated testing; I just know there have been very few times where I wrote tests and regretted spending time doing so, but a good number of times when I “didn’t need to spend time doing that” and later made silly mistakes that I knew full well would have been found by a decent suite of tests.

More than that, I find that testable designs tend to also be extensible and loosely coupled designs. That partly falls out of my belief that tests should simply be another client of the code. Sometimes on #perl6, somebody asks how to test their private methods. Of course, since I designed large parts of the MOP, I can rattle off “use .^find_private_method(‘foo’) to get hold of it, then call it” without thinking. But the more thoughtful answer is that I don’t think you should be testing private methods. They’re private. They’re an implementation detail, like attributes. My expectation in Perl 6 is that I can perform a correct refactor involving private methods or attributes without having to be aware of anything textually outside the body of the class in question. This means that flexibility for the sake of testability will need to make it into the public interface of code – and that’s good, because it will make the code more open to non-testing extension too.
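
For the curious, that MOP incantation looks something like this (class and method names invented; and, per the above, I wouldn’t actually recommend testing this way):

class Greeter {
    method !secret() { "hello from the inside" }
}

# Fish the private method out via the MOP, then call it with an explicit invocant:
my $meth = Greeter.^find_private_method('secret');
say $meth(Greeter.new);   # hello from the inside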

My current Stomp::Client is not really open to easy automated testing. There is one not-so-easy way that’d work, though: write a fake STOMP server to test it against. That’s probably not actually all that hard. After all, I already have a STOMP message parser. But wait…if my module already contains a good chunk of the work needed to offer server support, maybe I should build that too. And even if I don’t, I should think about how I can share my message parser so somebody else can. And that means that rather than being locked up in my Stomp::Client class, it will need to become public API. And that in turn would mean a large, complex part of the logic…just became easily testable!

I love these kinds of design explorations, and it’s surprising how often the relatively boring question of “how will I test this?” sets me off in worthwhile directions. But wait…I shouldn’t just blindly go with the first idea I have for achieving testability, even if it is rather promising. I’ve learned (the hard way, usually) that it’s nearly always worth considering more than one way to do things. That’s often harder than it should be, because I find myself way too easily attached to ideas I’ve already had, and wanting to defend them way too early against other lines of thought. Apparently this is human nature, or something. Whatever it is, it’s not especially helpful for producing good software!

Having considered how I might test it as is, let me ponder the simplest change I could make that would make the code a lot easier to test. The reason I’d need a fake server is because the code tightly couples to IO::Socket::Async. It’s infatuated with it. It hard-codes its name, declaring that we shall have no socket implementation, but IO::Socket::Async!

my $conn = await IO::Socket::Async.connect($!host, $!port);

So, I’ll change that to:

my $conn = await self.socket-provider.connect($!host, $!port);

And then add this method:

method socket-provider() {
    IO::Socket::Async
}

And…it’s done! My tests will simply need to do something like:

my \TestClient = Stomp::Client but role {
    method socket-provider() {
        Fake::Client::Socket
    }
}

And, provided I have some stub/mock/fake implementation of the client bits of IO::Socket::Async, all will be good.

But wait, there’s more. It’s also often possible to connect to STOMP servers using TLS, for better security. Suppose I don’t support that in my module. Under the previous design, that would have been a blocker. Now, provided there’s some TLS module that provides the same interface as IO::Socket::Async, it’ll be trivial to use it together with my Stomp::Client. Once again, thinking about testability in terms of the public interface gives me an improvement that is entirely unrelated to testability.

I liked this change sufficiently that I decided it was time to commit. Here it is.

Exposing Message

I’m a big fan of the aggregate pattern. Interesting objects often end up with interesting internal structure, which is best expressed in terms of further objects. Since classes, grammars, roles and the like can all be lexically scoped in Perl 6, keeping such things hidden away as implementation details is easy. It’s how I tend to start out. For example, my Message class, representing a parsed STOMP message, is lexical and nested inside of the Stomp::Client class:

class Stomp::Client {
    my class Message {
        has $.command;
        has %.headers;
        has $.body;
    }

    ...
}

The grammar for parsing messages is even lexically scoped inside of the one method that uses it! Lexical scoping is another of those things Perl 6 offers for keeping code refactorable. In fact, it’s an even stronger one than private attributes and methods offer. Those you can go and get at using the MOP if you really want. There’s no such trickery on offer with lexical scoping.
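
In case that sounds exotic, here’s a minimal sketch of the pattern (the names and the format being parsed are invented):

class Settings {
    method port-ok(Str $line) {
        # This grammar exists only inside port-ok; nothing outside the
        # method can see it, so it can be reshaped freely in a refactor.
        my grammar PortLine {
            token TOP { 'port=' \d+ }
        }
        so PortLine.parse($line)
    }
}

say Settings.port-ok('port=8080');    # True
say Settings.port-ok('port=eight');   # False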

So, that’s how I started out. But, by now, I know that for both testing and later implementing a Stomp::Server module, I’d like to pull Message out. So, off to a Stomp/Message.pm6 it goes. Since it was lexical before, it’s easy to fix up the references. In fact, the Perl 6 compiler will happily tell me about them at compile time, so I can be happy I didn’t miss any. (It turns out there is only one). Another commit.

Oh, behave!

At the point I expose a class to the world, I find it useful to step back and ask myself what its responsibilities are. Right now, the answer seems to be, “not much!” It’s really just a data object. But generally, objects aren’t just data. They’re really about behaviour. So, are there any behaviours that maybe belong on a Message object?

Looking through the code, I see this:

await $conn.print: qq:to/FRAME/;
    CONNECT
    accept-version:1.2
    login:$!login
    passcode:$!password

    \0
    FRAME 

And, later, this:

$!connection.print: qq:to/FRAME/;
    SEND
    destination:/queue/$topic
    content-type:text/plain

    $body\0
    FRAME

There’s another such case too, for subscribe. It’s quite easy for a string with a bit of interpolation to masquerade as being too boring to care about. But what I really have here is knowledge about how a STOMP message is formed, scattered throughout my code. As this module matures from 1-hour hack to a real implementation of the STOMP spec, this is going to have to respect a number of encoding rules – or risk being vulnerable to injection attacks. (All injection attacks really come from failing to treat languages as languages, and instead just treating them as strings that can be stuck together.) Logic that is therefore security sensitive absolutely does not want to be scattered throughout my code.

So, I’ll move the logic to Stomp::Message. First, a failing test goes into t/message.t:

use Test;
use Stomp::Message;

plan 1;

my $msg = Stomp::Message.new(
    command => 'SEND',
    headers => ( destination => '/queue/stuff' ),
    body    => 'Much wow');
is $msg, qq:to/EXPECTED/, 'SEND message correctly formatted';
    SEND
    destination:/queue/stuff

    Much wow\0
    EXPECTED

I find it reassuring to see a test actually fail before I do the work to make it pass. It tells me I actually did something. Now for the implementation:

method Str() {
    qq:to/END/
        $!command
        %!headers.fmt('%s:%s')

        $!body\0
        END
}

The fmt method is one of those small, but valuable Perl 6 features. It’s really just a structure-aware sprintf. On hashes, it can be given a format string for each key and value, along with a separator. The default separator is \n, which is exactly what I need, so I don’t need to pass it. This neatly takes a loop out of my code, and means I can lay out my heredoc to look quite like the message I’m producing. Here’s the change.
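
A quick illustration of fmt on a hash (the values are made up, and the order of the output lines follows hash iteration order, which isn’t guaranteed):

my %headers = destination => '/queue/stuff', content-type => 'text/plain';
say %headers.fmt('%s:%s');
# destination:/queue/stuff
# content-type:text/plain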

Construction tweaks

With a passing test under my belt, I’d like to ponder whether there are any more interesting tests I might like to write Right Now for Stomp::Message. I know I will need to make a pass through the spec for encoding rules, but that’s for later. Putting that aside, however, are there any other ways that I might end up with my Stomp::Message class producing malformed messages?

The obvious risk is that an instance may be constructed with no command. This can never be valid, so I’ll simply forbid it. A failing test is easy:

dies-ok
    { Stomp::Message.new( headers => (foo => 'bar'), body => 'Much wow' ) },
    'Stomp::Message must be constructed with a command';

So is this fix: just mark the attribute as required!

has $.command is required;

It is allowable to have an empty body in some messages. At present, the class kind of supports that without the body having to be passed explicitly, but a warning will be emitted. The fix is 4 characters. It’s really rather borderline whether this is worth a test, for me. But I’ll write one anyway:

{
    my $msg = Stomp::Message.new(
        command => 'CONNECT',
        headers => ( accept-version => '1.2' ));
    is $msg, qq:to/EXPECTED/, 'CONNECT message with empty body correctly formatted';
        CONNECT
        accept-version:1.2

        \0
        EXPECTED
    CONTROL {
        when CX::Warn { flunk 'Should not warn over uninitialized body' }
    }
}

It fails. And then I do:

has $.body = '';

And it passes. The boilerplate there makes me think there’s some market for an easier way to express “it doesn’t warn” in a test, but I’ll leave that yak for somebody else.

Those went in as two commits, because they’re two separate changes. I like to keep my commits nice and atomic that way.

Eliminating the duplication

Finally, I go and replace the various places that produced formatted STOMP messages with use of the Stomp::Message class:

$!connection.print: Stomp::Message.new:
    command => 'SUBSCRIBE',
    headers => (
        destination => "/queue/$topic",
        id => $id
    );

3 changes, 1 commit, done.

Enough for this time!

Next time, I’ll be taking a look at factoring out the parser, and writing some tests for it. Beyond that, there’ll be faking the async socket API, supporting unsubscription from topics, building STOMP server support, and more.
