Martinfowler.com

Bliki: EmbeddedDocument

2013-06-04

Flowing JSON data structures through a server is something I'm
seeing more these days. JSON documents can be persisted directly,
either by using an AggregateOrientedDatabase or a serialized LOB in
a relational database. JSON documents can also be served directly
to web browsers or used to transfer data to server-side page
renderers. When JSON is being used in this way, I hear people saying
that using an object-oriented language gets in the way because the
JSON needs to be translated into objects only to be rendered out
again - a waste of programming effort [1]. I agree
with the point about waste, but I argue that it's not a problem with objects but
a failure to understand encapsulation.

Let's imagine we're storing an order as a JSON document and
serving it up with minor server-side processing, again as JSON. An
example document might be like this.
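
As a rough illustration, an order document of this kind could look like the JSON below, held here in a Ruby heredoc so it can be parsed with the standard json library. Every field name and value (id, customer, items, deliveries, shipDate and so on) is a hypothetical chosen for this sketch.

```ruby
require 'json'

# Hypothetical order document; every field name and value here is an
# illustrative assumption.
ORDER_JSON = <<~JSON
  {
    "id": 1234,
    "customer": "martin",
    "items": [
      {"product": "talisker", "quantity": 500},
      {"product": "macallan", "quantity": 800}
    ],
    "deliveries": [
      {"id": 7722, "shipDate": "2013-04-19",
       "items": [{"product": "talisker", "quantity": 300}]},
      {"id": 6533, "shipDate": "2013-04-18",
       "items": [{"product": "talisker", "quantity": 200},
                 {"product": "macallan", "quantity": 300}]}
    ]
  }
JSON

# Parsing confirms the text is well-formed JSON and shows its top-level keys.
document = JSON.parse(ORDER_JSON)
puts document.keys.inspect
```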

We'll assume we haven't got much server-side processing to do, but we do have
some. Let's also assume we're using an OO language. A naive approach
might be to read in the JSON document, convert the data to the
appropriate object graph (with orders, line-items, and deliveries),
apply any processing, and then serialize the object graph to JSON
for the client.

In many of these situations a better way to proceed is to keep
the data in a JSONish form, but still wrap it with objects to
coordinate manipulation. Most programming environments provide
generic libraries that take a document and deserialize it to generic
data structures. So a JSON document would deserialize to a structure
of lists and dictionaries, an XML document to a tree of XML nodes.
We can then take this generic data structure and put it into a field
of an order object - here's an example with Ruby and JSON.
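
A minimal sketch of such a wrapper, assuming Ruby's standard json library. The class name Order, the @data field, and the sample document are assumptions made for this sketch.

```ruby
require 'json'

class Order
  # Keep the parsed document (nested hashes and arrays) in a single field.
  def initialize(json_document)
    @data = JSON.parse(json_document)
  end

  # Serialize the embedded document straight back out.
  def to_json(*args)
    @data.to_json(*args)
  end
end

# Hypothetical sample data.
order = Order.new('{"id": 1234, "customer": "martin", "items": []}')
puts order.to_json  # {"id":1234,"customer":"martin","items":[]}
```

Nothing outside the class sees the hash, so the choice of a generic structure stays an internal detail of the order.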

When we want to manipulate the data, we can define methods on the
object as usual, and implement them by accessing this data structure.
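
For instance, simple accessors might look like the sketch below. The method names customer and quantity_for, and the document fields they read, are hypothetical.

```ruby
require 'json'

class Order
  def initialize(json_document)
    @data = JSON.parse(json_document)
  end

  def customer
    @data['customer']
  end

  # Quantity ordered for a product, or 0 if the product is not on the order.
  def quantity_for(product)
    item = @data['items'].find { |i| i['product'] == product }
    item ? item['quantity'] : 0
  end
end

# Hypothetical sample data.
ACCESSOR_SAMPLE = '{"customer": "martin", "items": [{"product": "talisker", "quantity": 500}]}'
order = Order.new(ACCESSOR_SAMPLE)
puts order.customer                  # martin
puts order.quantity_for('talisker')  # 500
puts order.quantity_for('ledaig')    # 0
```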

This includes cases with more complex logic. [2]
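
One way such a method could be written is as a chain of collection operations over the embedded document. This is a sketch: shipped_quantity_for and the field names it reads are assumptions, and Array#sum needs Ruby 2.4 or later.

```ruby
require 'json'

class Order
  def initialize(json_document)
    @data = JSON.parse(json_document)
  end

  # Total quantity of a product across all deliveries: flatten the
  # delivery items, keep the matching ones, add up their quantities.
  def shipped_quantity_for(product)
    @data['deliveries']
      .flat_map { |delivery| delivery['items'] }
      .select { |item| item['product'] == product }
      .sum { |item| item['quantity'] }
  end
end

# Hypothetical sample data.
SHIPPED_SAMPLE = <<~JSON
  {"deliveries": [
    {"id": 7722, "items": [{"product": "talisker", "quantity": 300}]},
    {"id": 6533, "items": [{"product": "talisker", "quantity": 200},
                           {"product": "macallan", "quantity": 300}]}
  ]}
JSON

order = Order.new(SHIPPED_SAMPLE)
puts order.shipped_quantity_for('talisker')  # 500
puts order.shipped_quantity_for('ledaig')    # 0
```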

The embedded document can be enriched before sending to the client.
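
As a sketch of enrichment, the method below writes a derived field into the document before it is serialized. The method name enrich and the field earliestShipDate are hypothetical, and the sketch assumes ship dates are ISO 8601 strings.

```ruby
require 'json'

class Order
  def initialize(json_document)
    @data = JSON.parse(json_document)
  end

  # Add a derived field to the document itself. ISO 8601 date strings
  # sort chronologically, so a plain string minimum is enough here.
  # With no deliveries the field is set to nil (JSON null).
  def enrich
    @data['earliestShipDate'] = @data['deliveries'].map { |d| d['shipDate'] }.min
    self
  end

  def to_json(*args)
    @data.to_json(*args)
  end
end

# Hypothetical sample data.
ENRICH_SAMPLE = '{"id": 1234, "deliveries": [{"shipDate": "2013-04-19"}, {"shipDate": "2013-04-18"}]}'
puts Order.new(ENRICH_SAMPLE).enrich.to_json
```

The client receives the stored document plus the extra field, without the server ever building a separate object graph.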

If needed, you can form similar objects on sub-trees of the
embedded document.
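
A sketch of a wrapper over a sub-tree, assuming each delivery node carries an id and an ISO 8601 shipDate. The Delivery class and its methods are hypothetical.

```ruby
require 'json'
require 'date'

# Wraps one delivery node of the embedded document. It shares the node
# with the parent order rather than copying it.
class Delivery
  def initialize(data)
    @data = data
  end

  def id
    @data['id']
  end

  def ship_date
    Date.iso8601(@data['shipDate'])
  end
end

class Order
  def initialize(json_document)
    @data = JSON.parse(json_document)
  end

  # Builds a fresh wrapper for each delivery node on every call.
  def deliveries
    @data['deliveries'].map { |d| Delivery.new(d) }
  end
end

# Hypothetical sample data.
DELIVERY_SAMPLE = '{"deliveries": [{"id": 7722, "shipDate": "2013-04-19"}, {"id": 6533, "shipDate": "2013-04-18"}]}'
Order.new(DELIVERY_SAMPLE).deliveries.each do |delivery|
  puts "#{delivery.id} ships on #{delivery.ship_date}"
end
```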

One thing to be wary of here is that such object wrappers aren't
quite the same as normal objects. The delivery objects returned in
the above code fragment don't have the same equality semantics that
you'd expect from objects arranged in the more usual structure.
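
The sketch below shows the pitfall, assuming wrappers are built afresh on each call: two wrappers over the same delivery node are different objects, so Ruby's default == reports them as unequal. IdentityDelivery is a hypothetical variant that compares the wrapped nodes instead. It is one possible remedy, not the only one.

```ruby
require 'json'

class Delivery
  def initialize(data)
    @data = data
  end
end

# Hypothetical variant: two wrappers are equal when they wrap the very
# same node of the embedded document.
class IdentityDelivery < Delivery
  def ==(other)
    other.is_a?(IdentityDelivery) && data.equal?(other.data)
  end
  alias eql? ==

  # Keep hash consistent with eql? so wrappers behave in hashes and sets.
  def hash
    data.object_id.hash
  end

  protected

  attr_reader :data
end

class Order
  def initialize(json_document)
    @data = JSON.parse(json_document)
  end

  # Builds fresh wrappers on every call.
  def deliveries(wrapper = Delivery)
    @data['deliveries'].map { |d| wrapper.new(d) }
  end
end

# Hypothetical sample data.
order = Order.new('{"deliveries": [{"id": 7722}, {"id": 6533}]}')
puts order.deliveries.first == order.deliveries.first  # false
puts order.deliveries(IdentityDelivery).first == order.deliveries(IdentityDelivery).first  # true
```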

Despite its comparative rarity, an embedded document fits well
with object-orientation. The point of encapsulated data is the
hiding of the data structure, so that users of the object don't know or care about
the internal structure of the order.

Those familiar with functional programming will recognize the
style of flowing a generic data structure through a series of
functions - you can think of the object as providing a namespace for
manipulating the generic data structures.
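
To make the comparison concrete, here is a sketch of the same kind of logic written as plain functions over the generic structure, with a Ruby module acting purely as a namespace. The module OrderFunctions, its functions, and the totalQuantity field are all hypothetical.

```ruby
require 'json'

# A namespace of functions that take and return generic hashes.
module OrderFunctions
  module_function

  def quantity_for(order, product)
    item = order['items'].find { |i| i['product'] == product }
    item ? item['quantity'] : 0
  end

  # Returns a new hash with a derived field and leaves the input untouched.
  def with_total_quantity(order)
    order.merge('totalQuantity' => order['items'].sum { |i| i['quantity'] })
  end
end

# Hypothetical sample data.
FUNCTIONAL_SAMPLE = '{"items": [{"product": "talisker", "quantity": 500}, {"product": "macallan", "quantity": 800}]}'
data = JSON.parse(FUNCTIONAL_SAMPLE)
puts OrderFunctions.quantity_for(data, 'macallan')      # 800
puts OrderFunctions.with_total_quantity(data).to_json
```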

The sweet spot for an embedded document is when you're providing
the document in the same form that you get it from the data store,
but still want to do some manipulation of that data. If you don't
have any need to access the contents of the JSON document, then
there's no need to even deserialize it into a generic data
structure. The order object needs only a constructor and a method to
return its JSON representation. On the other hand, as you do more
work on the data - more server-side logic, transforming into
different representations - then it's worth considering whether it's
easier to turn the data into an object graph.
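
At the pass-through end of that range, where nothing needs to read the contents, the order object can hold the raw text and never parse it. A minimal sketch with hypothetical names:

```ruby
class Order
  # Keep the document exactly as it came from the data store.
  def initialize(json_document)
    @json = json_document
  end

  # Hand the stored text back without deserializing it.
  def to_json(*)
    @json
  end
end

# Hypothetical sample data.
RAW_SAMPLE = '{"id": 1234, "customer": "martin"}'
puts Order.new(RAW_SAMPLE).to_json
```

Because callers only ever see to_json, the class can switch to a parsed hash later, once some method needs the contents, without its clients changing.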

1:
Some might argue it's a waste of computing effort too - although
I would be surprised if it were significant. I would certainly
not accept a performance argument against converting to an
object graph unless it was accompanied by measurements - just
like any performance argument.

2:
Note the chaining of CollectionLambdas in this
method. One of my pet annoyances is hearing some functional fans
say that this style of code isn't object-oriented. While it may
seem foreign to those with a C++/Java background, this style is
perfectly natural to Smalltalkers.