Async::Trampoline

This module makes it easier to write trampolined functions: each function returns a thunk or continuation that will be invoked until a final result is produced. This allows us to implement funky control flow, like Python-style lazy generators or tail recursion. I use this module for a few handwritten recursive-descent parsers and interpreters that deal with recursive structures.

This module started as a small function in a hand-written parser, but it turned out that the code would spend more time in the trampoline dispatch loop than doing useful work (I profiled this carefully).

So I rewrote it in XS, Perl's C language binding. That helped a bit, but using the Perl data structures through C is not more efficient than using them through Perl. Also, writing an ad-hoc C program is fairly error-prone.

Now that I knew what I had to program, I decided to rewrite this code as a separate module and avoid the Perl API for the trampoline dispatch loop. Initially I tried writing it in C, but writing correct C code is very very difficult, especially around correct ownership tracking. I also found that I needed to re-implement many fundamental data structures. While it's often not difficult writing these data structures, it is difficult to test this code carefully. I finally stopped once I found myself researching fast hash functions for a custom hash table implementation. (For the record, I settled on FNV as a likely candidate.)

So I decided to rewrite the code again, this time as C++. The better type system and especially RAII allowed me to write much more robust code than with C. The standard library is a big help. Refcounting in the internal object graph became a non-issue since that can be handled automatically with RAII. I still had a few bugs and segfaults, but in fixing them I learned lots about GDB and Valgrind.

To my surprise, a good part of CPAN Testers supports C++11 compilers. Most modern features are broadly available since GCC 4.7. However, the Perl toolchain is not very C++-friendly. E.g. your XS code must catch all exceptions manually. And XS in general is a bit of a minefield: The docs are very sparse. You have to carefully refcount all Perl objects. And when you peek at the toolchain source code, it looks like the kind of write-once Perl you're always warned about.

After all this effort, the trampoline performs adequately. Not very good since calling Perl subroutines is fairly costly, but it's probably not going to get much faster than this. Along the way I learned a ton about Perl internals, C, C++, and software packaging.