HTTP API versioning

I've been thinking about HTTP API versioning lately. Troy Hunt's Your API versioning is wrong... is the first hit when googling "HTTP API versioning", but I think its scope is limited to the GET verb in public, non-authenticated APIs. This isn't explicit in the article, but Troy was writing about the API for his public Have I Been Pwned service, which only serves GET requests.

I want to extend this discussion to other HTTP verbs and to authenticated APIs. I'm considering the following versioning methods; the first three appeared in Troy's article, while the other two are mentioned in its comments:

  • Accept headers versioning
  • A custom versioning header
  • URL path versioning
  • Subdomain versioning
  • Query versioning

Before I continue:

  • I have never actually created a public API that I had to maintain through different iterations. I work on closed APIs meant for known clients, usually controlled by the same organization (or individual, in my case). So take all of this with a huge grain of salt.
  • Pertinent to this discussion is whether the service should treat unversioned requests as if they target the latest version or the original version, which all the clients may have been using.
  • HTTP APIs include REST APIs but also other types of APIs, such as RPC.
  • As noted above, this is about versioning all kinds of HTTP APIs, not just public anonymous GET APIs.

So here are my thoughts on the subject.

  • Versioning data is orthogonal to resource state data - resources are what they are, no matter which version of the API is serving or accepting them.
  • Resources, however, may also be versioned - they can change not only their state but also their meta-state: the set of all possible states can grow or shrink (e.g. more or fewer properties on a JSON object accepted/served by the API).
  • It makes no sense to conflate these two kinds of version: if a resource is versioned, then it needs to describe its own version, and clients need to know what to do with it (or not). A single API version, on the other hand, may choose to serve/accept many different versions of resources.
  • Troy's preferred versioning method is using Accept headers. However, per the RFC (RFC 2616, section 14.1):

Accept request-header field can be used to specify certain media types which are acceptable for the response

To me this doesn't read like something that should (rather than could) be used to inform the server about what the client is sending; it explicitly covers only what the client will accept from the server. So IMO Accept headers don't cut it as a general rule - they could work for GET but not for the verbs that change the state of a resource. For those verbs they are wrong even philosophically, and the philosophical argument is exactly where Accept headers are at their strongest for GET requests.
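To make the distinction concrete, here is a minimal sketch of how a server might read versions from headers. Everything in it is an assumption for illustration: the vendor media type application/vnd.example.vN+json is hypothetical, headers are a plain dict with exact key names, and q-values are ignored. The point is only that Accept can describe the response, while the body of a state-changing request would have to be described somewhere else (here, Content-Type).

```python
import re

# Hypothetical vendor media type, e.g. application/vnd.example.v2+json
_VERSIONED_TYPE = re.compile(r"^application/vnd\.example\.v(\d+)\+json$")


def version_from_media_type(header_value):
    """Return the version of the first matching media type, or None."""
    if not header_value:
        return None
    for item in header_value.split(","):
        # Drop parameters such as "; charset=utf-8" or "; q=0.8".
        media_type = item.split(";", 1)[0].strip().lower()
        match = _VERSIONED_TYPE.match(media_type)
        if match:
            return int(match.group(1))
    return None


def versions_for_request(method, headers):
    """Return (request_body_version, response_version).

    Accept only says what the client will accept in the response.
    For verbs that send a body, the body's version is read from
    Content-Type instead.
    """
    response_version = version_from_media_type(headers.get("Accept"))
    body_version = None
    if method in ("POST", "PUT", "PATCH"):
        body_version = version_from_media_type(headers.get("Content-Type"))
    return body_version, response_version
```

With this reading, a GET needs one version and a POST may need two, which is part of why Accept alone does not generalize beyond GET.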

  • As noted in Troy's post, using a custom versioning header is more cumbersome than the other possibilities. But this is mostly felt on anonymous requests, as authenticated requests already have to carry authentication values (e.g. tokens), usually in other headers (e.g. Authorization). This limits the concerns about this method to its use in anonymous APIs.
  • Versioning through the URL path is the method that I personally have used the most. It's easy, and you get to keep the parts of the API that don't change separate from the parts that do (so your clients can call /v1 and /v2 at the same time). I'm absolutely convinced (now) that this is the wrong way to go about it, as it confuses versioning of the API with versioning of resources/procedures. On the other hand, if you are versioning the entire API, you need an easy way to create multiple versioned endpoints for essentially the same handler functions. This could be done through configuration rather than code to keep the boilerplate down.
  • Subdomain versioning is a really nice way to not worry about complicating code with any backward compatibility scaffolding. If the back ends can be separated then just keep the old architecture running until all the clients have moved to the new version (or you end-of-life it). If you can't separate the back end then:
    • Release version 1 at your v1.api.domain.tld
    • When releasing version 2, put it on v2.api.domain.tld
    • Instead of keeping and maintaining version 1, rewrite it in terms of the version 2 API, so that it just passes requests on to v2.api.domain.tld.
    • When releasing version 3, put it on v3.api.domain.tld
    • Now rewrite version 2 in terms of version 3, passing requests to v3.api.domain.tld.
    • And so on.

Thus you will be building a chain of APIs, which will heavily affect the performance of the old APIs. That is another discussion, though, and it might even help you move your clients to newer versions of your API (in a rather passive-aggressive way, but if you disclose it then it should be fine IMO). This chaining can also be done within a single server instance, but having an HTTP hop between two versions of the API is, in the general case, the better solution in my opinion: you ensure that your new public API can do everything the old API could, and you separate the state of the new and old APIs, so the temptation to reuse code at the implementation level never comes into play.
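The chain described above can be sketched as a set of thin adapters. This is only an illustration under loud assumptions: the field names and payload shapes are made up, and plain function calls stand in for the HTTP requests that would go to v2.api.domain.tld and v3.api.domain.tld. Only the newest version holds real behavior; each older version converts the request, forwards it one step, and converts the response back.

```python
def handle_v3(request):
    """Newest version: the only place with real behavior (hypothetical)."""
    first, last = request["first_name"], request["last_name"]
    return {"display_name": f"{first} {last}", "version": 3}


def handle_v2(request):
    """v2 rewritten in terms of v3: convert, forward, convert back."""
    v3_request = {"first_name": request["first"], "last_name": request["last"]}
    # Stands in for an HTTP call to v3.api.domain.tld.
    v3_response = handle_v3(v3_request)
    return {"name": v3_response["display_name"], "version": 2}


def handle_v1(request):
    """v1 rewritten in terms of v2; it knows nothing about v3."""
    first, _, last = request["full_name"].partition(" ")
    # Stands in for an HTTP call to v2.api.domain.tld.
    v2_response = handle_v2({"first": first, "last": last})
    return {"full_name": v2_response["name"], "version": 1}
```

A v1 request passes through two extra conversions here, and over real HTTP through two extra network hops, which is the performance cost mentioned above.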

  • That said, subdomain versioning might be the toughest sell here, as you need to convince ops and business to run two or more different versions of the server. But this is the standard engineering trade-off: either development incurs technical debt or operations/business incur complexity/price debt. Figure out what's cheaper for you both short term and long term.
  • Query versioning seems like a good hybrid solution to me: it doesn't break the REST paradigm and yet you get to specify the version in the URL itself. My personal concerns are more in the backward-compatibility-scaffolding department - now you have a single endpoint and a single function responding to that endpoint, so it becomes the responsibility of that function to switch between different versions of behavior. This is not hard of course - it's actually enticingly easy - but in my experience it leads to worse code and faster accumulation of technical debt with each new version that the function has to understand.
  • Mandatory versioning, even when using a method where it's possible for it to be optional (e.g. versioning through headers), seems like a sound choice to me to enforce from day one. Then you never have to worry about clients that "don't use version number" and what to serve them.
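As a rough sketch of the last two points, query versioning can be made mandatory, and the per-version switching can be kept out of the handler bodies by dispatching to one function per version. The parameter name version, the /users/42 path, and the handler names are all hypothetical; this is one possible arrangement, not a recommendation of a specific design.

```python
from urllib.parse import parse_qs, urlsplit


def get_user_v1(user_id):
    return {"id": user_id, "name": "Ada Lovelace"}


def get_user_v2(user_id):
    return {"id": user_id, "first_name": "Ada", "last_name": "Lovelace"}


# One handler per version instead of one handler full of version checks.
USER_HANDLERS = {"1": get_user_v1, "2": get_user_v2}


def handle_get_user(url):
    """Return (status, body) for a URL such as /users/42?version=2."""
    parts = urlsplit(url)
    values = parse_qs(parts.query).get("version", [])
    # Mandatory versioning: a missing or repeated parameter is an error,
    # so there is never a question of what to serve unversioned clients.
    if len(values) != 1:
        return 400, {"error": "exactly one 'version' query parameter is required"}
    handler = USER_HANDLERS.get(values[0])
    if handler is None:
        return 400, {"error": f"unsupported version: {values[0]}"}
    user_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return 200, handler(user_id)
```

The same check would work with a custom header in place of the query parameter; only the place where the version is read changes.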

As I noted above, what I personally have been doing so far is using the version as part of the path. The APIs I usually create are a mixture of REST and RPC, just by the nature of the systems they expose, and I didn't even think of the philosophical argument until I read Troy's post.

Going forward, I find subdomain versioning very interesting. This method is where I suddenly (and surprisingly to me) spent most of my time in writing this post. It's definitely not an easy strategy in terms of convincing operations and business to follow your lead, but where I control the API and am both operations and business, I think I'll be trying out how it works for me. The obvious cost here is rewriting v(k) in terms of v(k+1), but I have some ideas on how to make that easier. In general I have been strongly moving away from backward-compatibility-interspersed-with-current-functionality in both internal and external APIs, and instead I try to use more backward-compatibility-through-conversion, so right now this is something close to my heart.

Where I don't get to choose and experiment, I think that versioning through a custom header is the cleanest option for REST APIs. For RPC APIs, though, there is literally no philosophical argument to be made: I would just use the version as part of the URL path, as the path is a procedure's name rather than a resource's name. That said, I would version the entire API rather than just separate procedures (as I have been doing so far).
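For the RPC case, versioning the entire API through the path while keeping boilerplate down could look roughly like the sketch below. The procedure names, handlers, and the shape of the configuration are all invented for illustration. Each API version lists every procedure it exposes, and unchanged procedures simply point at the same handler function.

```python
def create_invoice(payload):
    return {"created": payload}


def create_invoice_v2(payload):
    # Hypothetical change in v2: the response also carries a currency.
    return {"created": payload, "currency": payload.get("currency", "USD")}


def list_invoices(payload):
    return {"invoices": []}


# Configuration, not code: every version names every procedure it exposes,
# even when the handler is unchanged from the previous version.
API_VERSIONS = {
    "v1": {"invoices.create": create_invoice, "invoices.list": list_invoices},
    "v2": {"invoices.create": create_invoice_v2, "invoices.list": list_invoices},
}


def build_routes(versions):
    """Expand the configuration into a flat {path: handler} table."""
    routes = {}
    for version, procedures in versions.items():
        for name, handler in procedures.items():
            routes[f"/{version}/{name}"] = handler
    return routes


ROUTES = build_routes(API_VERSIONS)
```

A client then talks to /v2/... for everything, rather than mixing /v1 and /v2 calls per procedure.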

Author

Ivan Erceg

Software shipper, successful technical co-founder, $1M Salesforce Hackathon 2014 winner