Amos.me

AOT vs JIT: Why don't we do both?

2012-12-03

I wanted to take some time to write about a piece of software I've been working
on lately, just so you know how I've been spending the last few weeks.

Rationale

A few years ago, I designed a programming language:
ooc. Even though I've done my fair share of Java, C,
Ruby, JavaScript, and even some Perl, Scala, Python, PHP, etc., I still
find myself going back to ooc because it gives me access to C libs, relatively
high-level constructs, and it forces me to write code that's not too smart.

The toolchain is far from perfect, even though we've closed almost 500 issues
on the main compiler in about 3 years, but at least I know the bugs and usually
where to start to fix / work around them.

One of the nice things is that it handles interfacing with C libraries really
easily, and I've gotten all my code to compile+run on Windows, OSX, and Linux
without too many headaches. (Other platforms have been successfully tested, but
aren't actively supported).

There are bindings for most of the game-related libraries around: OpenGL,
OpenAL, SDL, GLU, glut, fmod, cubeb, Chipmunks Physics, libvorbis, DevIL, etc.
etc. So all is good to write the logic code in nice ooc itself.

Why am I still not happy though? Because the very nature of game development is
incremental. You build a core game mechanic, try it out, adjust it, try it
again, and so on. With a compiled language like ooc, this means that every time
you have to recompile the program and launch it again.

Compilation times are not really an issue, especially with libcaching (basically,
rock caches the result of compilation in .a archives, and only recompile
the modules that changed), but still, having to relaunch the application each
time slows development down a tad.

So I had an idea: what if, instead of having to recompile the app and launch it
again, I could just pause execution, quickly adjust an object's method, swap it
out for the new one, and then resume the game? Sounds sweet, right?

AOT vs JIT

What currently happens when compiling an ooc program is:

rock parses the ooc source and generates C code

rock launches a C compiler (GCC, Clang/LLVM, etc.)

the C compiler generates an executable or a library

Then, when you launch it, it's always the same code that's executed. This is
called AOT (Ahead-Of-Time) compilation.

By contrast, when launching a a program written in a dynamic language, here's
roughly what happens:

The interpreter parses the source code and creates an AST (Abstract Syntax Tree)

The AST is (usually) converted to byte code

Byte code is interpreted as the program runs

Hot spots (functions that are often called) are identified, and compiled
JIT (Just-In-Time) for performance

That's the gist of it. Of course, there are variations on that theme. For example,
Java has a compilation phase to JVM bytecode, and the runtime loads bytecode directly
into the JVM, which then handles JIT itself. There are also dynamic languages
running on the JVM.

There are debates as to how AOT and JIT compare in terms of performance: one argument
is that AOT leaves a lot of time for the compiler (and the linker, with LTO) to
figure out complex optimizations.

On the other hand, JIT compilers have first-hand information on what actually
goes on when a program is running, and are thus able to fine-tune optimizations,
specifically for the case the program is currently being used for. JITs however,
have downsides as well: warmup time (it takes time to find the hotspots and compile
them while the program is running), and it creates entirely new security issues

However, we're not interested in getting the most performance here: we know the
AOT ooc compiler, coupled with GCC or Clang/LLVM, is enough for our needs (because
I tested it), and we know JIT can help us develop by compiling new versions of methods
while the game is running. Let's see more in details how we can do it.

Getting there

What I described in the intro is not that crazy. The JVM supports hotswapping, for instance.
And Eclipse, with its incremental compiler, has had that feature integrated for
a while. Notch, of Minecraft fame, uses that heavily during Ludum Dare events.

So, back to ooc, what do we need to pull it off?

A way to inspect all objects of the game

A way to parse the game code & retrieve the AST, along with the
method boundaries in the source files

A way to parse method bodies alone to the same AST format

A way to resolve that AST

A way to compile single methods from the modified AST, using a
JIT compiler

A way to swap the old method with the address of the new, JIT-compiled
method.

And that's exactly what scissors plans on doing. Some code is already there,
messing around with the different parts of it.

Basically, here's the strategy:

scissors is used as a library that you import in your game project

When using it, you need to define a 'root object' so we can explore from there.

It uses rock as a library to parse the existing game code and know where the
code for each method is in your source code.

From there, it will be able to display a web interface so you can explore the
state of all your objects, browse the code of various methods, and modify it.

New code for modified methods is also parsed with rock, then scissors' LLVM
backend is able to compile it to executable code at runtime.

Using our knowledge of class layout, scissors is able to swap the old method's
address for the new one.

Swapping methods is really not a big deal. In ooc, like in most class-based systems
in C, the class is simply a struct with a vtable in it. The vtable contains the
addresses of each method. So swapping methods is simply a matter of overwriting the
address of the method in the vtable with the new, freshly compiled version.

What we can't do however, is adding fields and/or methods to classes. Since that
would require changing the layout of the class struct. With already-running objects,
that wouldn't be possible. However, subclassing and instanciating some objects from
the REPL just to test out something would be very possible. Or you could store your
game objects' properties in a HashMap, and then the class structure wouldn't change
at all. It's really up to how you design your game engine.

We could, of course, even go further. scissors could remember the state of your
experimentation. It could support an XPath-like query language to define triggers on
objects (for example "world/objects[type:enemy]"). You could add sliders to modify
some game values in real time (gravity, friction, jump height...). And on the next
run, all your experiments would still be there.

Continuing with the idea, since scissors know where the method's code lies, it can
replace it for you. While experimenting with a running instance of the game, of course,
it could keep track of the different versions of a method, allowing you to revert
back to a previous version if you made a mistake (doesn't replace git, it's local to
one "session" which shouldn't be too long-lived).

Of course, scissors could provide a full REPL, ideally with syntax highlighting,
auto-completion. It could instrument method calls and give all kind of useful information
about the parameters and return values of methods. For example, display spatial values
in a graph. Or show a frequency distribution fo an integer parameter. It could also
make a rough profiling of what time is spent in various methods.. although there would
be better-suited tools for that.

Why bother, and the road ahead

Of course, one of the questions this raises is why? Why bother trying to do all that
with a compiled language, whereas we could just use a dynamic, typically interpreted
language (Python, Ruby)? Well, there are some arguments in the vein of "performance",
"type-safety", etc. but mostly it's just because I like ooc.

Now, how will all this be implemented? As I mentioned before, I have all the pieces.
ooc-llvm has a demo that shows how to compile a function at runtime. rock
can already be used as a library, and in the current scissors repo, there's a demo
that shows parsing the old code, parsing the new code of a single method, swapping
them in the AST, resolving the new method.

The next step is obviously writing an LLVM backend for rock. For now, I think I'll
do it in the scissors repo, walking the rock AST and working with ooc-llvm from there.
I might have to add an intermediate pass where we clean up the mess generated by rock
in the resolving pass (like adding extraneous casts), but I'll figure it out, one feature
at a time.

After that, I'll probably start working on the web interface. Again, nothing too
fancy at first. I think web is the right choice here because 1) then I don't have to
rely on a heavy GUI framework like GTK and 2) you get remote debugging/editing for
free. Plus web dev is in much nicer shape than it was a few years ago. Having a simple
http server in ooc is no big deal, it's a dev tool so no need to worry too much about
security. It runs locally so no need to worry too much about small optimizations.

Note that scissors should only be used during the development part of a game. The
released game still consists of pure .ooc code, compiled ahead of time by rock, and
having very few dependencies. The whole scissors/rock/LLVM stack is only needed for the
dev version.

Closing thoughts

I don't know if I'll be able to put all the pieces together for this project, but it
still sounds pretty exciting to me, because this kind of hybrid (compiled language,
with a C backend, but adding JIT to the mix for rapid development) hasn't been tried
before, to the best of my knowledge.

What makes it attainable is that ooc is a relatively small language, and its self-hosting
compiler, while not perfect, is quite accessible as a library.

If you're interested in making games with ooc in general, feel free to drop by
at ooc gaming!