
Docs! Docs! Docs!

Ocean Beach

The past couple of weeks: a lot of work building out and tightening up documentation, with special emphasis on API reference.

We’ve had an API reference live for some time, but before now it was maintained manually. The loop was: hopefully a product person would remember that a new API feature was shipping, and would poke a dev rel person to update the docs. The dev rel person would write some documentation about a feature they understood by way of two hops’ worth of telephone, hit some endpoints with cURL, and try to capture what came back as best they could.

The results were about what you’d expect. The docs were enough to get something working, but only about half our total API endpoints were documented. Amongst those that were, fields were often missing or extraneous as the responses had changed since the docs were originally written. Some of it was just flat out wrong.

But although imperfect, in a major sense the handwritten docs did what they were supposed to do: serve as a stopgap long enough to give us a chance to build something better. That took a bit of doing – dragging a few monolithic foundational blocks into place over about six months all in – but we finally got there, and launched our generated API reference last week.


This is the third major push to build API documentation that I’ve been involved in, each of which has produced a very different pipeline.

Heroku’s API reference is found on Devcenter. Here’s how it gets built:

Screenshot of Heroku's API reference

All in all, it worked pretty well, and given the state of the art at the time (this was ~2013), it wasn’t half bad – I would’ve pitted what we had against any other Silicon Valley company’s public API docs of that era.

But there were obvious downsides. A few that come to mind:



That brings us to Stripe, and the well-known stripe.com/docs/api, a URL that I’ll forever remember better than my own name.

Here’s the rough loop there:

OpenAPI was a bit of an unauthorized skunkworks project when we started it, but it turned out to be one of the highest-leverage technical investments the company ever made. (Although I’ll note that it’s not anything particular to OpenAPI that makes this true – the important part is just to have some kind of intermediary format that lives between the backend implementation and the generators that build derivatives.)
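To make the idea concrete, here’s a minimal sketch of the pattern in Go. None of these names come from Stripe’s (or anyone’s) real system, and a real spec would be OpenAPI rather than this toy structure – the point is only the shape: the backend describes itself once, and every generator consumes that description instead of reading backend code.

package main

import (
	"encoding/json"
	"fmt"
)

// endpoint is a toy stand-in for the intermediary format: a record
// describing one API endpoint, extracted from the backend.
type endpoint struct {
	Method      string `json:"method"`
	Path        string `json:"path"`
	Description string `json:"description"`
}

func main() {
	// The backend serializes its self-description into the intermediary
	// document (in real life, an OpenAPI spec checked into the repo or
	// published to the web).
	spec, _ := json.MarshalIndent([]endpoint{
		{Method: "GET", Path: "/charges/{id}", Description: "Fetch a single charge by ID."},
	}, "", "  ")

	// A generator – API reference, client library, CLI, whatever –
	// consumes only the document, never the backend implementation.
	var endpoints []endpoint
	if err := json.Unmarshal(spec, &endpoints); err != nil {
		panic(err)
	}
	for _, e := range endpoints {
		fmt.Printf("%s %s: %s\n", e.Method, e.Path, e.Description)
	}
}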

Screenshot of Stripe's API reference

It wasn’t bad, but the process was far from perfect:


So that brings us to the latest stack at Crunchy, where, having learnt from more than a few old mistakes, I was determined to build something not only accurate and useful, but also fast, fluid, and as automatic as it could be.

Here’s how it works:

So to recap: on a successful merge to master, CI pushes a new OpenAPI spec to the web. A separate GitHub Action wakes up, runs the doc generator, and commits any changes. That commit triggers a Heroku deployment and pushes the changes live. Aside from the initial merge on GitHub, no human intervention is required at any point. The result looks like this.

Screenshot of the Crunchy Bridge API reference

We’re also taking speed seriously. The OpenAPI generator runs in less than a second and that’s without any optimization effort on my part. The docs generator is even faster. Go compiles quickly, so even when I’m iterating on either program, it’s always fast. No thirty-second development loops in sight, and gods help me there never will be.
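Part of why it stays fast is that the core of a docs generator doesn’t have to be fancy. As a rough sketch – this is not the actual Crunchy Bridge generator, and a toy spec fragment stands in for a real OpenAPI document – the heart of it is just: decode the spec, walk every path and verb, and emit a documentation stanza for each.

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// A toy fragment of an OpenAPI document, just enough to drive the loop.
const openAPIDoc = `{
  "paths": {
    "/clusters": {
      "get": {
        "summary": "List clusters",
        "description": "Returns every cluster the caller has access to."
      }
    }
  }
}`

// operation mirrors the handful of OpenAPI operation fields rendered here.
type operation struct {
	Summary     string `json:"summary"`
	Description string `json:"description"`
}

func main() {
	var spec struct {
		Paths map[string]map[string]operation `json:"paths"`
	}
	if err := json.Unmarshal([]byte(openAPIDoc), &spec); err != nil {
		log.Fatal(err)
	}

	// Walk every path/verb pair and print a stanza for each. A real
	// generator would render templates and write files for the site
	// build instead of printing to stdout.
	for path, verbs := range spec.Paths {
		for verb, op := range verbs {
			fmt.Printf("%s %s\n%s\n\n%s\n\n",
				strings.ToUpper(verb), path, op.Summary, op.Description)
		}
	}
}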

But while it’s the best API reference pipeline I’ve been involved with yet, nothing is perfect. A few things that come to mind:


So while generated docs are great from an effort standpoint, a hill I’m willing to die on is that even when generated, all docs should include a healthy dose of humanity. The computer handles iterating over endpoint/struct/field ad nauseam, but a human should augment what the machine would do to add as much background and context as possible.

Here’s an example of the worst kind of documentation, unfortunately all too common everywhere in the computing world:

// GenerateHTTPResponse generates an HTTP response.
func GenerateHTTPResponse() ([]byte, error) {
	...
}

Oh, so that’s what GenerateHTTPResponse does. Hallelujah – I was lost, but now I’m found. That documentation isn’t just of no value; it’s actually of negative value, because someone might see there’s documentation on a function and go there to read it, only to realize they’ve completely wasted their time.

So where possible, I encourage an internal convention of writing docstrings that aren’t just useful to us, but have enough context that they’d be useful to anyone.
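As a contrast with the example above, here’s the hypothetical shape I’d rather see for that same function – the body’s still elided, but the comment now tells a reader when it runs, what comes back, and what an error actually means:

// GenerateHTTPResponse serializes the API resource built for the current
// request into the JSON body that will be written back to the client.
// It runs at the tail end of every successful request, after validation
// and database work are finished, so the returned error is non-nil only
// when the resource fails to marshal – which usually points to a
// programming error rather than to bad user input.
func GenerateHTTPResponse() ([]byte, error) {
	...
}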

Check out the Keycloak REST API for an example of what inhuman API documentation looks like – exhaustive, but frustratingly context-free in every possible way. I’m aiming very explicitly for our docs not to look like that.


Like sci-fi, want a TV recommendation, and still trust me after my horribly over-optimistic Wheel of Time review from a few months back? Raised by Wolves, which just started airing season two. Directed by Ridley Scott, this is the purest science fiction to make its way to a big-budget production in years, and one of the precious few original ideas to be found in modern culture. It’s seriously crazy – flying snakes, acid oceans, androids performing ad-hoc facial reconstructive surgery, Travis Fimmel reprising his role as Ragnar Lothbrok – nothing makes sense, yet there’s just enough there to make me believe there’s a method to the madness. I never have any idea what’s going to happen next.

Until next week.

1 We look for Go “pseudo-enums” using basically the same technique as the exhaustive linter, which is open source if you want to look at the code.
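For anyone who hasn’t run into the term: a Go “pseudo-enum” is just a named type plus a block of constants of that type – Go has no first-class enums, so this is the conventional stand-in, and it’s the shape that tools like the exhaustive linter detect. The names below are invented for illustration.

package apitypes

// ClusterState is a Go "pseudo-enum": a named type whose members are
// declared as a const block of that type. Tooling finds the pattern by
// collecting constants whose declared type matches the named type.
type ClusterState string

const (
	ClusterStateCreating  ClusterState = "creating"
	ClusterStateReady     ClusterState = "ready"
	ClusterStateDestroyed ClusterState = "destroyed"
)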