I’ve been thinking about idempotency in APIs a lot this week as I was working to improve the idempotency story within one of our internal services so that we could retry requests to it and minimize data inconsistencies.
For our public API, I’ll likely implement something resembling the Idempotency-Key
IETF draft, at least for POST
and PATCH
verbs. We’re not likely to have idempotency keys for all internal services though, so I’ve been leveraging alternate idempotency techniques instead. A good example of an alternative technique is to allow clients to inject an ID that they generated when creating a resource, which easily allows the receiver to determine whether they’ve already processed the request. This might not work so well for public APIs (it introduces the possibility of malicious IDs), but it works fine internally.
After tackling idempotency on our create endpoints, I moved on to DELETE
. This is an interesting verb because DELETE
is supposed to be idempotent according to RFC7231 (along with OPTIONS
, HEAD
, GET
, and PUT
) and in many implementations technically is, it’s often somewhat unhelpful about what it signals back to the caller.
Here’s a delete implementation from a Rails generated controller:
def destroy
@article = Article.find(params[:id])
@article.destroy
redirect_to root_path
end
A first call to destroy
looks up and destroys the record, then 302
s the user back to root. A second request will fail to find the object and 404
instead.
Now technically, this is idempotent because idempotency doesn’t guarantee anything with respect to status codes returned – it just concerns itself with side effects of each operation, and that second request had no additional side effects compared to the first. So from a strict correctness perspective, it’s correct.
However, it’s also not very helpful. If a caller lost the response of the first request and got a 404
back on the second, it’s hard for them to tell whether one of their requests did something, or were invalid as that object was never present in the first place.
On this go around, I’m going to try and do something with our DELETE
APIs. In many product implementations some kind of “acts as paranoid” facility is common, where you add a deleted_at
timestamp to tables or keep a separate “tombstone” collection to soft delete objects instead of culling them permanently, and we’ve had that since well before I started.
Conveniently, soft deletion acts as an easy stepping stone do something a little smarter with our delete implementations:
204 No Response
, as is our usual convention.410 Gone
instead.404 Not Found
.As a client, you can now reliably use either a 204
or 410
to know that the request succeeded, and that you’ll get back one of these responses even in the presence of multiple attempts and intermittent network errors. A 404
tells you something unexpected happened. This is also nicely symmetrical with the convention of returning either a 201
or 200
on an idempotent create endpoint depending on whether the object already existed or not.
This is also not to say that those old objects need to be saved forever so that a 410
is always returned. The value of something like this or Idempotency-Key
is not to have a permanent archive of old requests, they’re to be able to reliably make sense of recent ones, usually those that have occurred just in the last few seconds as you’re engaged in a retry loop. It’s fair game for that 410
to become a 404
after 24 hours 1 or so as old objects are pruned.
I’ll be the first to acknowledge that this isn’t a ground breaking new idea, and may not even be worth the effort in many cases – for the most part, clients are mostly just concerned that objects are gone, and getting a 404
back is good enough – but it is a small convenience on the road to a retry-friendly (and friendly in general) API.
1 24 hours being the lifetime of an Idempotency-Key
at Stripe before it gets pruned, although I’m sure this was chosen somewhat arbitrarily.
Did I make a mistake? Please consider sending a pull request.