2013-07-10

Clojure and Clojurians at Factual

How We Use It

Factual began deploying Clojure to production in October of 2009. We used it cautiously and experimentally at first, confining ourselves to a narrow corner of our stack related to query interpretation.

As we experimented further with Clojure and applied it to more problems, we formed a very favorable impression of the overall language and related technologies. It's true that Lisp is seen by many as a "weird" language, and to be sure, Clojure is not without its warts. On the other hand, if you're looking for a fun and productive functional language that runs seamlessly on the JVM, Clojure brings massive value to the table.

SOME THINGS WE LIKE:

Significant developer productivity gains for certain use cases

Capitalizes on the superior power of Lisp

Everything Leo Polovets, a Factual engineer, wrote in his Quora answer to "Why would someone learn Clojure?"

Often allows a large reduction in code bloat vs. the equivalent Java solution. An unscientific survey of selected projects where we ported functionality suggests that the Java version needs roughly 3X the code.

SOME THINGS WE DON’T LIKE:

Not always super easy for non-Lispers to pick up

Some immaturity around things like tooling, stack traces, and the library ecosystem

May not be as performant as Java if you really, really need fast code

Aaron Crow, one of Factual's early Clojure advocates, has a nuanced take on the matter:

"Please don't make me write any more Java."

Before long, we discovered Cascalog, a Clojure-based query language that runs on top of Hadoop. For certain use cases, we found Cascalog to be a huge win over the alternatives, especially when considering clarity of code and developer productivity. Since Cascalog queries provide a higher level of abstraction than pure Java-based MapReduce jobs, there may be concessions made to performance, but the code is also much easier to understand and maintain.

Chun Kok, author of “Clojure on Hadoop: A New Hope”, is philosophical about the trade-offs:

"Why not use Hive?"

Evan Gamble, our most seasoned Lisp veteran, applies Clojure to solve some of Factual's more challenging Machine Learning problems. He explains his love of Clojure thusly:

"It's still not as good as Common Lisp."

Zach Tellman, creator of Aleph, is currently helping us build an engineering team in Factual's Palo Alto office. Zach describes his seasoned, measured approach to using Clojure to conquer Factual's complex engineering challenges:

"Yeah, I can hack that together this weekend."

Of course, like every tool, Clojure is not a silver bullet and we don't treat it as such. Our broader goal is to have a wide variety of tools in our toolbox, and carefully choose the best tool for each job. Boris Shimanovsky, Factual’s Director of Engineering, clarifies the subtlety here:

"Have you shipped yet?"

Fortunately for those of us who love us some parentheses (and who doesn't (right?)), Clojure is often chosen as the right tool for a project. We're now using Clojure and related tech throughout our stack, including:

Drake, our open-sourced "Make for data" tool

Various Machine Learning solutions

An internal API server on top of our entity data

An internal task management and resource queueing system

Ad hoc querying of data housed in HBase

We've also fielded Clojure-based libraries for using Factual's public API:

Factual Clojure Driver

FactQL

We're becoming more confident in when and where it's most appropriate to apply Clojure as a solution, and we're excited about the possibilities and potential gains. Boris elucidates:

"Sometimes, the Lisp weenies are right."

Read More

Clojure on Hadoop: A New Hope

Thinking in Clojure for Java Programmers (Part 1 — A Gentle Intro)

Thinking in Clojure for Java Programmers (Part 2 — Macros FTW)

Work with Us

If you’re an engineer and interested in helping Factual with hard stuff, check out our current openings.
