Amos.me

S-exps in your browser

2014-11-03

The front end of the pool

I’ve been interested in reactive JavaScript for a while. At memoways, we
strive to build snappy user interfaces for clients who like to interact with
their data with as little latency as possible.

In the past two years, I learned front-end development on-the-fly, as the needs
of the clients required it. Two years ago, I was still using jQuery. Then, I
discovered space-pen thanks to my colleague Nicolas. It was nice to have
proper ‘view’ objects, and use jQuery’s event system to have messages propagate
throughout a hierarchy.

But state management was still an issue. Anything a tiny bit complex broke way
too easily. Surely with a better code organization, I might’ve been able to
hold on to space-pen longer (after all, it’s used by Atom), but no
matter how much effort you put into it, space-pen isn’t really suited to
rapid prototyping, something you need to do a lot of when you touch domains like
dataviz.

Then I discovered Ractive.js, by the wonderful team at The Guardian.
It seemed to me like a version of React without the baggage. Two-way
binding is magical when it works: just store dynamic properties within the data
object, use the getter/setter methods (or built-into-JS accessors if you’re
feeling adventurous and independent from all IE compatibility needs) and everything
updates smoothly.

However, Ractive proved very hard to debug. Its custom sort-of-HTML parser (as
is the tradition with almost all reactive frameworks nowadays) didn’t give much
context as to where errors were, when I last tried it. I had to quickly hack
support for “printing surrounding code when a syntax error occurs” into my copy,
just so my app wouldn’t collapse.

Error reporting is often overlooked by new language implementors. “You should
be writing correct code in the first place!”, right? The truth is: good error
reporting & debugging support is essential - when refactoring, tests are only
useful to a degree. After that, you’re left to the whims of whatever compiler,
processor, macro expander you’re using.

Enter ClojureScript

I had been hearing good stuff about Om for a while. I think it’s one of the
recent Prismatic blog posts that sold me on it. That was before
I learned that they’re pretty much the only high-profile company using
ClojureScript in production!

But at this point, I was fed up with suddenly-breaking Ractive code, and
anything new was good to distract me from my daily dev problems. Turns out,
Om is probably awesome, but it’s too verbose for me.

Here’s what the start of Om’s tutorial looks like:

Now, when you’re discovering a new paradigm and a new language, that’s a bit steep.

Let’s walk through things step by step. First off, it’s S-expressions all the
way down. That means instead of writing:

You’d write:

Yes, even + looks like a function call. It’s handy, too, because you can
pass it any number of arguments, unlike the infix version:

Same goes for =, and lots of other primitives.

So, now we can see that we’re calling something called om/root. The slash
is Clojure’s way of saying it likes to keep things nice and tidy, arranged into
namespaces. In this case, what they don’t tell you is that on the top of this
file is probably something like:

And so, whenever we type om/something, we refer to something within the
om.core namespace.

Note that ClojureScript allows you to refer symbols so that you can use them
without any qualification. For example, we could’ve done the following:

You see what I’m getting at.

Then we have array syntax:

Like arguments to a function, arrays are space-separated. Maps are about the same,
except you need to have an even number of elements and the odd ones are the keys.

Usually we’ll use keywords instead of strings as map keys:

As for function definitions, they look like a call as well:

Now with our newly-acquired notions, let’s go back to our original code:

We have a call to om/root, and the first argument is a function:

The function returns a type - really, an interface implementation if you ask me:

So, we implement the om/IRender interface, and it has only one method:

It takes one argument, which we bind to _ to mark that we really don’t care
about it (in this case it’s the local state of the component, but let’s not get
into this…)

And then we call the h1 dom construction function, with no special options
(nil is Clojure for null), and the value stored under the key :text in app.

In the interest of clarity, (:text app) is roughly equivalent to (get app :text),
or app['text'] in JavaScript.

The second argument to om/root is simply app-state, an atom (we’ll get back to that).

And the third argument is a map, with a single key, :target, whose value is:

That’s ClojureScript/JavaScript interoperability - equivalent to the following JavaScript:

Now that we’ve learned a few basics to be able to read ClojureScript code, what does
this actually do? Well, anything implementing the om/IRender interface is an Om
component, and calling om/root mounts it to a DOM element somewhere. Somewhere
in our HTML, we probably have that:

And in there we’ll have an h1 tag containing whatever is in (:text app).

So then, how do we put stuff in the app state? As we mentioned earlier, app-state is an
atom. We could define it like that:

And then, if we want to change the text, we can do:

Which is really just a shorthand for:

Itself a shorthand for:

What does swap! do? Well, an atom is basically a reference to a value. The
value itself is immutable, like all Clojure data structures, but we can change
the reference to something else.

So, wherever in Ractive you’d do state.set(key, value), changing the state
itself, in ClojureScript you’d rather swap the reference to a slightly changed
version of the state.
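To make the "swap the reference" idea concrete, here is a minimal sketch of an atom in plain JavaScript. The names createAtom, swap, reset and assoc are made up for illustration (they only mimic the Clojure vocabulary), and the text values are placeholders:

```javascript
// Minimal atom: a mutable reference to a value that is treated as immutable.
// createAtom / swap / reset are illustrative names, not a real library API.
function createAtom(initial) {
  let value = initial;
  const watchers = [];
  return {
    deref: () => value,
    // Point the atom at a new value and notify watchers.
    reset(next) {
      const prev = value;
      value = next;
      watchers.forEach((w) => w(prev, next));
      return next;
    },
    // Compute the next value from the current one, then reset to it.
    swap(fn, ...args) {
      return this.reset(fn(value, ...args));
    },
    watch: (fn) => { watchers.push(fn); },
  };
}

// assoc returns a changed copy instead of mutating its argument.
const assoc = (obj, key, val) => Object.freeze({ ...obj, [key]: val });

const appState = createAtom(Object.freeze({ text: "Hello world!" }));
const before = appState.deref();
appState.swap(assoc, "text", "Multiple roots!");

console.log(before.text);           // Hello world!
console.log(appState.deref().text); // Multiple roots!
```

The old value is never touched: anything that still holds a reference to it keeps seeing the old text, while the atom now points at the new version.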

However, the whole state isn’t copied - the parts that didn’t change are
immutable as well, so they’re just referenced from the new data structure.

Let’s take a more complex state as an example.

In this example, not only are the old and new (:b @state) value-equal but
they’re also reference-equal - they’re the same data structures, at the same
place in memory.

Note: @some-atom is the atom dereference operation. It evaluates to the value
the atom references, rather than the atom itself. And pr-str is basically
a toString.

So, whenever the state atom is changed, comparing the old state and the new state
is very fast - still in our “John Doe” example above, no need to compare
whether (:first-name (:b @state)) has changed, for example - we can see that
(:b @state) still has the same address, so all the substructure has to be
the same - because of immutability.

Even when the app state gets complicated, small updates to part of the state tree
are still fast, because it’s inexpensive to find out which parts of the tree are
changed. Hence, only the minimal amount of re-rendering is done.
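The sharing and the cheap comparison can be sketched in plain JavaScript too. This is only an approximation: ClojureScript and mori use persistent tree structures rather than shallow object copies, and assocIn / changedKeys are hypothetical helpers written for this example, as is the shape of the state:

```javascript
// Return a copy of `state` with the value at `path` replaced.
// Only objects along `path` are copied; every other branch is reused as-is.
function assocIn(state, path, value) {
  if (path.length === 0) return value;
  const [key, ...rest] = path;
  const child = state[key] === undefined ? {} : state[key];
  return { ...state, [key]: assocIn(child, rest, value) };
}

// List the top-level keys whose subtree changed, using identity checks only.
function changedKeys(oldState, newState) {
  const keys = new Set([...Object.keys(oldState), ...Object.keys(newState)]);
  return [...keys].filter((k) => oldState[k] !== newState[k]);
}

const oldState = {
  a: { count: 1 },
  b: { firstName: "John", lastName: "Doe" },
};
const newState = assocIn(oldState, ["a", "count"], 2);

console.log(newState.b === oldState.b);       // true: the untouched branch is shared
console.log(changedKeys(oldState, newState)); // [ 'a' ]
```

changedKeys never looks inside b: one identity check is enough to know that nothing below it changed, which is what lets a renderer skip whole subtrees.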

Exit Om

It might not seem like much, but writing all those reify forms and having to
require and call methods for every DOM element gets old really quickly. Call me
impatient, but I knew there was something simpler lurking around the corner.

And there was! Reagent provides a much simpler interface to React. So,
in the same ballpark as Om but without all the running around in circles.

In reagent, components are simply functions, and HTML tags are specified
as keywords, with a Hiccup-like syntax.

For example, I love FontAwesome icons. Here’s a simple
component that displays the icon of your choice:

Using it is as simple as any HTML tag:

That would produce HTML like:
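The "components are functions, markup is data" idea can be approximated in plain JavaScript. This is a rough sketch, not how Reagent works internally (Reagent hands the data to React instead of building strings); render and icon are names made up for the example:

```javascript
// Render a Hiccup-style array [tag, attrs?, ...children] to an HTML string.
function render(node) {
  if (!Array.isArray(node)) return String(node); // text node (not escaped in this sketch)
  const [tag, ...rest] = node;
  const first = rest[0];
  const hasAttrs = first !== null && typeof first === "object" && !Array.isArray(first);
  const attrs = hasAttrs ? first : {};
  const children = hasAttrs ? rest.slice(1) : rest;
  const attrText = Object.entries(attrs)
    .map(([k, v]) => ` ${k}="${v}"`)
    .join("");
  return `<${tag}${attrText}>${children.map(render).join("")}</${tag}>`;
}

// A component is just a function returning such an array.
const icon = (name) => ["i", { class: `fa fa-${name}` }];

console.log(render(["p", "Take a picture ", icon("camera")]));
// <p>Take a picture <i class="fa fa-camera"></i></p>
```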

I was pretty happy with Reagent, and wrote the UI for the memoways issue
tracking system in it. It works very well, and I can prototype quickly,
although errors are still kind of painful.

ClojureScript: the bad

But now I’m seriously looking at ClojureScript alternatives.

I was okay with having to learn a completely new build tool, leiningen.
In fact, it’s really nice to have mostly one tool to interact with for the
whole language.

I was okay with ugly JavaScript interop, leading to code like this:

I was okay with syntax errors yielding 50-lines-long stacktraces in
my console.

I was even okay with the REPL taking around 4s to start (on a 2.2GHz Core i7 with
a recent SSD).

But I’m not okay with the completely erratic behavior of lein-cljsbuild and
the Google Closure compiler.

Let me get you up-to-speed on the whole ClojureScript/Closure shenanigans.

Google Closure has two main components:

An SDK covering a lot of ground

An optimizing/minifying JavaScript compiler

ClojureScript relies on the former to reimplement most of the Clojure standard
library in JavaScript (while also using its module loading facilities), and
on the latter to keep the compiler’s output to reasonable sizes.

And by “reasonable size” I mean 330k minified, 80k minified+gzipped, with
just a few libraries. Turns out cljs.core (imported by default) is huge.
Like, over nine thousand lines in a single file huge.

In dev, it’s not that bad. Sure, you have to jump through a few hoops, like
figuring out the right options to put in your project.clj, including some
goog/base.js file on top of your yourproject.js file and having an
inline JavaScript call to goog.require.

With all that figured out, it’s really not that bad. Again, in dev.
Recompilations with lein cljsbuild auto dev are fast, around 200ms, it loads
all dependencies correctly with XHR calls (about 20 separate .js files to do
anything useful), everything’s fine.

In production, it’s another story entirely. First off, you want to set
the Closure compiler to perform advanced optimizations. Basically it’ll attempt
to:

Perform dead code elimination (remove the parts you don’t need)

Minify by renaming symbols (variables, functions) to shorter things like
Gl or aQ instead of reverse or forEach

And of course, removing all the whitespace it can get away with.

So first off, it’s slow. It’s really really slow. If you have 9000+ lines of
ClojureScript as input, you can imagine how much JavaScript it spits out.
The Closure compiler has several megabytes of JavaScript to chew through, and
it’s not uncommon for it to take around 30 seconds to process completely.

Second, you have to be extra careful about foreign libraries. Because:

Chances are, they don’t use the google module system at all
(goog.require, goog.provide, etc.)

They define symbols that the Closure compiler must not rename
in your code.

But there’s a thing for that! Just add them to an externs list as an option
to the compiler and you’re good to go. Except, again, it’s slow. Very slow.
And it won’t hesitate to emit thousands of warnings on sources over which you
have absolutely no control.

And if that wasn’t enough, once it’s done taking dozens of seconds compiling all
that stuff - sometimes it just emits erroneous code. I suspect that, when scanning
a directory for source files, the tree it constructs is wrong, because if you
run it in watch mode (so that it recompiles automatically on file changes)
and touch just the right file (the one you’d pass to the compiler if it did
allow that), then it takes another 8 to 10 seconds and generates
correct code.

Oh, and if you want to modularize your codebase, say you want to output
one .js file per “view type” of your application, say one for the issue tracker
UI, one for the client-side slideshow visualizer, etc. - not only do you have
to maintain a contrived directory infrastructure and specify tons of
redundant dependencies in your project.clj - you also get the 30s to get
the wrong code then 10s to get the right one but only if you launch it in
watch mode for every separate target file you want to have.

In that light, it appears that ClojureScript is - currently - quite painful
to use in production. Sure, the tooling may improve in the coming months,
but in a fast-moving world, I don’t really have the time to wait.

Honestly, the ecosystem feels just wrong - even though I love reading Clojure
code (the language itself is a pleasure) - and it’s full of nice people,
fetching .js files as a .pom + .jar from a Maven repository… it feels wrong,
and so dissociated from the rest of the JavaScript world.

Mori + Sweet.js = Ki

When looking for alternatives, I knew that I wanted something:

Lighter than ClojureScript

With a similar syntax (S-exps are <3)

With immutable data structures.

Ki is just that. It uses mori, which brings ClojureScript’s data
structures, and is, besides that, mostly a set of sweet.js macros to turn
S-expressions back into good ol’ JavaScript.

I won’t detail the whole language, it has a reference page on the website.
The main differences from ClojureScript are missing features (at the time
of this writing: destructuring assignment is the one I miss the most), and
snake_case instead of kebab-case.

Of course, a bunch of macros (as sophisticated as they might get) will never
do as much as a full compiler will - but in this case, it’s kind of the point.
A compiled ki program is usually pretty small, the only constant cost being mori,
which is rather small (by cljs standards).

Ki presents itself as a JavaScript script: the command-line tool provided
with the distribution is run with Node.js by default. After having played around
with the language a bit, I was seduced. Moving on to the next problem: how
the hell do I integrate that to my Rails application?

With ClojureScript, I had given up - even though Rails asset pipeline support
for ClojureScript does exist, the documentation made it clear that it had
to spin up a JVM each time any asset was modified. So I was just using the
lein cljsbuild auto command by hand.

In Ki’s case, though… I had a feeling we might do better. A lot better.
We could, obviously, hook into the Rails asset pipeline and, when a .ki.js file
needs to be compiled, write it to a temporary file, then call the ki command-line
tool on it, ask it to write to another temporary file, then read that file and
return it to Rails.

But that’s no fun at all! There’s a reason tools like the Coffee compiler, sass
compiler, ki compiler, have “watch” modes - they take longer to initialize than
they do to recompile something. And we only need to pay that cost once.

So, let’s go with a much harder - but more rewarding - alternative. Keeping our
own instance of V8 around and running the ki compiler through that. A
handy way to interact with V8 from Ruby is therubyracer. Ideally we’d
use something runtime-agnostic like execjs - but we’ll have enough trouble
making our stuff work in a single JavaScript engine like that.

therubyracer is very nice to play with. The first thing we want to do is
create a context.

Then we can eval some JavaScript expressions in it:

You can interact with JS values from Ruby. For example, JS arrays
come back as V8::C::Array instances, which you can coerce to real ruby
arrays, or just iterate with each.

Similarly, you can iterate over JavaScript objects (i.e. dictionaries/maps)
with each if you want.

The context retains globals across eval calls, so we can store whatever
we want in this (the equivalent of window when there’s no browser around.)

As always, though, globals are evil, if only because they might collide with
other globals, so we’ll use that sparingly.

We can load JavaScript files via eval using a little trick to escape the string
properly:

This approach has an advantage - you can fence the code with a preamble and epilogue
of your choice. You can make your code execute within a scope where you override
some variables, like define or exports for example.
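The fencing itself doesn’t depend on Ruby, so here is a sketch of it in plain JavaScript: the library source is wrapped in a function whose parameters shadow define and exports. evalFenced, fakeDefine and librarySource are all made up for the illustration; on the Ruby side, the same preamble + source + epilogue string would simply be handed to the context’s eval.

```javascript
// Wrap library source so that names like `define` and `exports` resolve to
// values we supply, instead of whatever lives in the global scope.
function evalFenced(source, overrides) {
  const names = Object.keys(overrides);
  const preamble = `(function (${names.join(", ")}) {\n`;
  const epilogue = `\n})`;
  // Indirect eval evaluates the wrapper in the global scope and returns the function.
  const wrapper = (0, eval)(preamble + source + epilogue);
  return wrapper(...names.map((n) => overrides[n]));
}

// Instead of running the factory, just record what the library asked for.
const captured = [];
const fakeDefine = (deps, factory) => captured.push({ deps, factory });

const librarySource =
  'define(["exports"], function (exports) { exports.answer = 42; });';

evalFenced(librarySource, { define: fakeDefine, exports: {} });
console.log(captured[0].deps); // [ 'exports' ]
```

The price of this approach is that the engine only sees an anonymous string, so stack traces lose the file name.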

If you don’t need to do that, you can just tell V8 to load the file directly:

In this case, you don’t control the scope the file is evaluated in (it’s always
the global scope) - but you do retain file / line number information within
stack traces, which is invaluable.

So, now that we know how to evaluate arbitrary code within V8, and retrieve
the results, and how to load any JS file in it, we’re done right? We just
load ki.js, call ki.compile and be on our merry way. Right? Riiiiiight?

JavaScript loaders

Wrong. You see, in the JS world (the real, vibrant, “god save us lest we forget
a var” JS world, not the compiles-to-JS world), they’ve been busy figuring out
the best way to have modular code that can be loaded in different environments
(say, Node.js, or a browser module loader).

The idea is for libraries to avoid clobbering the global scope with stuff like
_, or $, or jQuery, or React, or whatever variable they’ve chosen.

That’s why it’s not uncommon to see JS libraries nowadays structured like this:

In the case of ki.js, the main file for the ki compiler, the fuck-it style
isn’t included at all. Which means we can’t get away with the naive gross approach
of just loading everything in the right order, crossing our fingers, and sacrificing
a virgin veal to the gods of globals.

The Node.js style is kind of painful to handle for us from the Ruby side.
Whenever require is called, something must be done - but we load files not by XHR
like some JS module loaders do, but by reading files from the filesystem ourselves
(or calling context.load, or…). Point is - we need some way to get back to the
Ruby side from that JS call.

I know that it’s possible. In particular, commonjs.rb does it - but it doesn’t solve
problem #2 with this approach, which is that the ki compiler and its dependencies
assume that, if it’s loaded with the Node.js style, it can access Node.js modules
such as “fs” (the async, libuv-based thingy to interact with the filesystem. so
cute.) And since we’re running in V8, not in node.js, we have zero access to that
kind of module. We’ll do the I/O ourselves, thank you very much.

Which leaves us with the AMD approach - the cleanest, in my view. I like the idea.
Each module calls define with a list of dependencies, and a factory function that
should be called with these dependencies.

It leaves us plenty of time to inspect those ‘module specifications’ from the Ruby
side, load the dependencies and then pass them back to the factory function.

Now, at this stage of my journey I lost a lot of time trying to do something sane.
I thought - well it’s easy: we’ll start by loading up ki.js, then see what its
dependencies are, then load them up, see what their dependencies are and so
forth until we have a complete dependency graph. Then we know in which order to call
the factory functions.

It looked a little bit like this:

Or, in graphviz form:

From this graph, though, it’s immediately apparent that we have a serious problem.
Circular dependencies are lurking everywhere! parser depends on expander, but
expander depends on parser as well. And many others.

So, my beautiful plan of loading everything in order has quickly fallen apart.
Instead, we can do the next best thing: load them in the order they are
required, and when something’s not properly loaded yet, reserve an empty-object
space in our modules list that we’ll pass to the dependants. I suspect that’s
how most AMD loaders work, because this approach actually worked.

In the case of ki-rails, I’ve decided to store modules in a global object, so
that I can easily access them later. At the time of this writing, the code isn’t
entirely cleaned up, so don’t hit me with a large trout - just be patient.

There’s a few things I haven’t mentioned yet that need to be taken care of:

First off, exports is a special requirement. It’s not a js file we load,
it’s the object in which all the exported symbols of a library will be stored.
That must be accounted for.

Second, my dependency graph only shows sanitized module names. In the real
world, it’s not uncommon to see modules require ./module or module.js
instead of just module.

Thirdly, AMD loaders allow two different define signatures - one that goes:
name, dependencies, factory, and an anonymous one. Most modules I’ve encountered
use the latter, but named modules exist (within the sweet.js codebase) and must
be handled correctly.

Lastly, dependencies starting with text! are not JavaScript files and must
be passed as a string without any evaluation, to the module asking for it.

With all that, we can finally call our ki compiler from V8, a naive
version of which looks something like:

However, this approach is pretty naive:

It parses the ki core macros every time

It has no source map support

Source map support

Thankfully, Sweet.js has source map support built-in. And the ki compiler
interface allows its usage, like so:

To hook that into ki-rails, I took inspiration from Mark Bates’
coffee-rails-source-maps. Basically, the original ki file along with its source
map are written into public/assets/source_maps and the correct URL is specified
in sourceMappingURL, in the compiled JavaScript file.

Apart from erroring in production because the server was very protective about
where the web app could and could not write, it worked very nicely!

Macro support and speed

I said one of the problems with our approach was that the ki macros were being
parsed every time. This is actually pretty slow. At the time of this writing,
they only amount to about 900 lines of rules, but the sweet.js parser is quite
an involved piece of work, and as such, I was seeing end-to-end compilation
times of about a second or two.

We can avoid that by precompiling ki_core and keeping it around in the global
context.

And then, later, passing it as the modules option to the ki compiler instead of
passing the ki_core option.

This has a negative side-effect, however. See, the ki compiler has a piece of code
like this:

i.e. when passed the source of the ki macros instead of pre-compiled modules,
it calls joinModule on the input (i.e. src) and the ki macros, then asks
sweet to load that. What does joinModule do exactly?

How interesting! A parseMacros method. That’s where I saw that my little
trick didn’t quite work. Sure, it made compilations faster, but it broke
ki macro support.

To understand why, we have to learn a bit more about how ki macros work and how
ki is implemented. Ki is implemented, as previously mentioned, via a set
of sweet.js macros. They look like this:

The largest macro, by far, is _sexpr. It contains rules like:

That’s the anonymous function definition syntax! Sweet.js is definitely
readable. Here’s an example of a ki macro in action:

The parseMacros call we’ve seen earlier in the ki compiler looks for
the “ki macro” string with a regular expression, then converts the
definitions to sweet.js rules, like so:

That leaves us with this mysterious line in joinModule:

Apparently it’s replacing a comment within the ki core macros’ source with
the user’s macro definitions. Why is it doing that? Let’s see where
it gets inserted:

Of course! It’s smack dab in the middle of the _sexpr macro, so that we
may use ki user-defined macros anywhere in an S-expression.
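The splicing step boils down to a string replacement. Here is a sketch of the idea, not the real joinModule: the marker text, the joinMacros name and the rule bodies are all placeholders made up for the illustration.

```javascript
// Hypothetical marker; the real comment text in the ki core macros may differ.
const MARKER = "/*__user_macros__*/";

// Stand-in for the core macro source, with the marker inside _sexpr.
const coreMacros = [
  "macro _sexpr {",
  "  rule { (fn [$args ...] $body ...) } => { /* core rule */ }",
  "  " + MARKER,
  "  rule { ($fn $args ...) } => { /* generic call */ }",
  "}",
].join("\n");

function joinMacros(core, userRules) {
  // A function replacement avoids special treatment of "$" in the inserted text.
  return core.replace(MARKER, () => userRules.join("\n  "));
}

const joined = joinMacros(coreMacros, [
  "rule { (unless $c $body) } => { /* user rule */ }",
]);
console.log(joined);
```

Since user rules end up inside the core macro’s own source, the core macros can’t be parsed once and reused unchanged whenever the user rules differ.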

However, it’s bad news for our Asset pipeline integration dreams. Does this
mean that if we want to support user-defined ki macros, we’ll have to recompile
the whole thing every time, and thus have a minimum of 1-2s compilation time?

Sprockets dependencies

Not so fast! First off, ki has no concept of import or require or include.
It allows one to define namespaces, but there’s no built-in support for importing
other files, unlike, for example, sass. (Not a language that compiles to JS, but
it is well-supported by rails and has its own @import statement).

So, we have to fall back to the next best thing: Sprockets
directives. Sprockets is the little processing meta-engine that could, powering
the whole Rails asset pipeline, from preprocessing to transforming to postprocessing
to fingerprinting, manifest-generating, caching, and serving-with-the-right-headers.

As anyone who has written JavaScript or CoffeeScript in Rails knows, Sprockets has
directives, the most used of which are require, require_tree and require_self:

Now, since we want to write modular ki code, we want to be able to require stuff.

In the case of functions, the ki compiler doesn’t have to know about it. They’re
just regular JavaScript functions, it’s not like any compilation-time checking is
done.

In the case of macros, however, the ki compiler has to know about every macro
definition in the dependencies of our file, so that it may apply the right
transformations.

In other words, we need to do the following (in pseudo code):

Now, the Sprockets API is not that hard to figure out, although
I wish better documentation existed on the subject. Turns out,
from within a tilt template called by Sprockets, like this one:

..we can access the set of dependencies with a simple
scope._dependency_assets, and we can even coerce it to an array with a
.to_a call. Giving us something like:

Reading all those files and concatenating them is done in two lines of Ruby.
Extracting the macro part can be done by calling the JS method ki.parseMacros
via our V8 context and all that’s left is to call sweet.loadModule and pass
the result to the ki compiler.

Of course, that only solves the problem of “if A requires B, A should be able
to use macros defined in B”. It doesn’t solve the problem of “I want compilations
to be fast because brain switches are expensive”.

However, we can easily solve that with a simple cache. I’ve opted for a very
naive solution for the time being: I simply have a global object __macrocache
in the V8 environment that stores compiled macros, indexed by the SHA-1 hash
of their sources.

Often, I’ll change the ki code without touching any macro definition. In that
case, the SHA-1 of the macro sources stays the same, and no reloading of the
sweet.js module happens. When that happens, recompilation time is not even noticeable!
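The cache is about as simple as it sounds. A sketch of the idea using Node’s crypto module (the real thing lives in a global object inside the V8 context; compileMacros here is a stand-in for the expensive sweet.js step, and the macro source is placeholder text):

```javascript
const crypto = require("crypto");

const macroCache = new Map(); // SHA-1 of macro sources -> compiled macros
let compileCount = 0;

// Stand-in for the expensive step (parsing macro sources with sweet.js).
function compileMacros(source) {
  compileCount += 1;
  return { sourceLength: source.length };
}

function cachedMacros(source) {
  const key = crypto.createHash("sha1").update(source, "utf8").digest("hex");
  if (!macroCache.has(key)) macroCache.set(key, compileMacros(source));
  return macroCache.get(key);
}

const macroSource = "placeholder macro definitions, version 1";
cachedMacros(macroSource);
cachedMacros(macroSource);       // same source: cache hit, nothing recompiled
cachedMacros(macroSource + " "); // any change in the source yields a new key
console.log(compileCount); // 2
```

Nothing is ever evicted in this version, so every distinct set of macro sources seen during a session stays in memory. For a dev server that’s an acceptable trade.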

Shameless plug

I hope you liked this article! I’m currently not available for hire, but
memoways is looking for client work!

If you have a project involving the web, video, and perhaps a dash of data
visualization, then send us an e-mail, or ping
@fasterthanlime on Twitter.