2013-12-09

We here at Liip are currently building a JSON REST API for a customer. At least initially it will only be used by 3 internal projects, one of which we are building while the 2 others are built by partner companies of the customer. Now we want to define a game plan for how to deal with BC breaks in the API. The first part of the game plan is to define what we actually consider a BC break, and what therefore requires a new API version. We decided that only removed fields, renamed fields or existing fields whose content has changed should count as a BC break. In other words, adding a new field to the response should not be considered a BC break. Furthermore, changes in how results are sorted are generally not to be considered a BC break, as loading more data or an upgrade of the search server can always result in minor reordering. However, we would consider a change in any defaults (e.g. a change in the default sort order, or changing the default output from "full" to "minimal") to be a BC break that requires an API version increase. But I guess we would not consider changing from "minimal" to "full" a BC break, as it would just add more fields by default. That being said, for caching reasons we try not to work with too many such defaults anyway and rather have more required parameters. With these definitions we should ideally only rarely have to bump the API version. But there will be the day where we will have to nonetheless.
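To make the definition above a bit more concrete, here is a minimal sketch of a check that compares an old and a new decoded JSON response. The function name and the sample fields are made up for illustration, and "content has changed" is interpreted here simply as "the JSON type of the value changed", which is only one possible reading of the rule.

```python
def find_bc_breaks(old, new, path=""):
    """Return a list of BC breaks between two decoded JSON objects.

    Assumption: a field counts as broken if it disappeared (removed or
    renamed) or if its value changed type. Added fields are ignored.
    """
    breaks = []
    for key, old_value in old.items():
        field = f"{path}/{key}"
        if key not in new:
            # A rename looks exactly like a removal from the client's side.
            breaks.append(f"removed or renamed: {field}")
            continue
        new_value = new[key]
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            # Recurse into nested objects.
            breaks.extend(find_bc_breaks(old_value, new_value, field))
        elif type(old_value) is not type(new_value):
            breaks.append(f"changed content type: {field}")
    return breaks


# Hypothetical responses: "id" changes from number to string, "created" is new.
old_response = {"id": 1, "name": "foo", "meta": {"tags": ["a"]}}
new_response = {"id": "1", "name": "bar", "meta": {"tags": ["a"], "lang": "en"},
                "created": "2013-12-09"}

print(find_bc_breaks(old_response, new_response))
```

Run against recorded responses of the current API version, a check along these lines could tell us whether a change needs a version bump or not.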

First up, we do not want to use the URL to version the resources, for the obvious reason that this violates the concepts of REST, as it would imply that "/v1/foo" and "/v2/foo" are not the same resource. Remember the "RE" in REST stands for "REpresentational", which means we are talking about representation of state. Therefore a single URI should be used per resource as the unique identifier. Instead, to get different representations we should use media types. There isn't really a universally accepted standard for how to encode version information into a media type. So far so bad. To make things worse, it gets a bit iffy to define custom media types (i.e. "application/vnd.my_api+v1.1") for different versions, as then you would logically also return that as the Content-Type in the response. This in turn means generic tools will not be able to pick up that the response is in fact JSON if it's not just "application/json". A convention to at least make it human understandable is to add "+json" to the custom media type, and I guess clients like browsers could be made to understand this convention. It also seems like browsers sometimes ignore the Content-Type entirely and simply try to guess by looking at the actual content.
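As a small illustration of how a generic tool could pick up the "+json" convention, here is a sketch of a check on the Content-Type value. The helper name is made up, and it assumes the suffix is the trailing part of the subtype, which is how RFC 6839 registers "+json" as a structured syntax suffix.

```python
def is_json_media_type(content_type):
    """Guess whether a Content-Type value denotes a JSON payload.

    Assumption: a type is JSON if it is "application/json" or if its
    subtype ends in the "+json" suffix.
    """
    # Drop parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/json":
        return True
    subtype = media_type.partition("/")[2]
    return subtype.endswith("+json")


print(is_json_media_type("application/json; charset=utf-8"))
print(is_json_media_type("application/vnd.my_api+json"))
print(is_json_media_type("application/vnd.my_api+v1.1"))
```

Note that under this assumption a tool would only recognise the type as JSON if "+json" comes last, so where exactly the version goes in the media type matters.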

Then again, there is no standard that defines what a web application is actually supposed to do with an Accept header in the strict sense. Sure, there is the "q" parameter, which defines the priorities of the different media types in the Accept header. But technically it is up to the server to decide what to make of these priorities, and as far as I know it can also choose to respond with any media type it wants to. Meaning a request with "Accept: application/vnd.my_api+json+v1.1" could come back with "Content-Type: application/json". Obviously if the server does not support the requested media type (e.g. it never did or no longer does) it should return a 406 (Not Acceptable) HTTP status code.
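Here is a rough sketch of what such a negotiation could look like on the server side. It is deliberately simplified: the list of supported media types and the function names are hypothetical, only the "q" parameter is parsed, and wildcards are handled in the most basic way. 406 (Not Acceptable) is the status code HTTP defines for the case where none of the requested types can be served.

```python
# Hypothetical list of versions the API still serves, newest first.
SUPPORTED = [
    "application/vnd.my_api+json+v1.1",
    "application/vnd.my_api+json+v1.0",
]


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q value."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when not given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(accept_header, supported=SUPPORTED):
    """Return (status_code, content_type) for the given Accept header."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in supported:
            return 200, media_type
        if media_type in ("*/*", "application/*"):
            # The client accepts anything, so serve the newest version.
            return 200, supported[0]
    return 406, None


print(negotiate("application/vnd.my_api+json+v1.0;q=0.5, "
                "application/vnd.my_api+json+v1.1"))
print(negotiate("application/vnd.my_api+json+v0.9"))
```

In this sketch the server honours the client's priorities, but as said above nothing forces it to; it could just as well always answer with the newest version it has.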

However, this brings us to the next issue: should we have separate version numbers for different parts of the API? Having different versions would make sense if we will likely have releases that only touch smaller parts. But this will complicate the life of the API user, who would then need to worry about sending the proper media types for different parts of the API. A release-often approach might still make this quite sensible, as long as we then keep older versions supported for a longer time. However, if we have bigger releases it might be easier for everyone if we just increment the version of the entire API whenever there are any BC breaks. Given that we have a small group of API users, we might then even force everyone to update to the new API within a defined timeframe. This could be nice, because with every such big release we could remove potentially deprecated code for previous versions.

But this brings us to the topic of caching. Caching really requires that we also include the version in the response. So returning "Content-Type: application/json" would be a no-go. It would be better if we returned the actual version of the returned structure in the response, so we are back at returning "Content-Type: application/vnd.my_api+json+v1.1". This way we could ensure we do not put duplicates into the cache, even if parts of the API got their version number incremented without any actual change in the response. But actually it does not reall

Truncated by Planet PHP, read more at the original (another 1428 bytes)
