About a year ago I wrote a short article asking when it’s okay to use GET to do POST’s job. Since then I’ve learnt a bit about web standards and web pragmatism, but also about the specifics: safety and idempotency.
Recently I found a blog devoted to well-designed URLs. I love well-designed URLs (enough to have made DecentURL.com, a web service that turns ugly URLs into decent ones). For instance, there’s no question about which of these URLs is better:
Unfortunately, I think Mike Schinkel got a bit carried away in his somewhat ranty post about how SnipURL’s GET-based API is bad.
On the web, it’s not simply a case of “GET is always evil for requests that aren’t safe”. I’m afraid he’s ignoring the evidence that GET sometimes just works better. Paul Buchheit sums it up nicely:
There’s no question that POST is the “right” way and generally safer, but sometimes it’s annoying. Even though GET isn’t “supposed” to work, it often can be made to. Don’t believe me? Google does billions of dollars a year in GET based transactions in the form of CPC ad clicks (which can cost over $50/click).
Okay, perhaps I’m a touch biased. Perhaps I’m being a little defensive because DecentURL’s API works just like SnipURL’s. :-)
But apart from the pragmatics (“it works”), I believe Mr SnipURL and I are sticking to the standard. It’s not just about GET vs POST. There’s this other little point: the distinction between safe and idempotent.
Safe means a request doesn’t cause any side effects. A safe request just grabs data from a database and displays it. Static pages, browsing source code, reading your email online — these are all “safe” requests.
Idempotent means that doing the request 10 times has the same effect as doing it once. An idempotent request might create something in a database the first time, but it won’t create it again; subsequent requests simply return a reference to the existing record. As a friend said to me:
From the browser’s perspective, there is no difference than if the response had always existed for all time prior to the first request. One can cache that response without any perceptible effect, for instance, and bots can request it again and again without damaging anything.
Idempotent is exactly what creating a DecentURL or a SnipURL is, and it’s why we’re allowed to use GET. You do it the first time, and the service creates a record in the database. But there’s no harm in GETting it again — the service simply grabs the existing database entry.
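That get-or-create pattern can be sketched in a few lines (the function name, ID scheme and in-memory “database” here are all hypothetical stand-ins, not DecentURL’s or SnipURL’s actual code):

```python
# Hypothetical in-memory "database" standing in for the service's real store.
_urls = {}

def create_short_url(long_url):
    """Idempotent create: the first call stores a new record;
    every later call for the same URL returns the existing one."""
    if long_url in _urls:
        return _urls[long_url]           # no side effect on repeat requests
    short = "u%d" % (len(_urls) + 1)     # toy ID scheme for illustration
    _urls[long_url] = short
    return short

first = create_short_url("http://example.com/some/very/long/path")
again = create_short_url("http://example.com/some/very/long/path")
assert first == again  # GETting it ten times has the same effect as once
```

The key point is the early return: repeating the request changes nothing, which is what makes it safe for caches and bots to replay.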
As the HTTP standard notes:
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered “safe”.
It’s a “SHOULD”, a rule of thumb for good reason. But the standard also says (section 9.1.2) that GET, along with HEAD, PUT and DELETE, has the property of idempotence.
And it seems the makers of all the popular URL redirection services realise this: TinyURL, SnipURL, Metamark and notlong.com all either use GET normally or allow GET in the API call that creates a URL.
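In practice such an API call is nothing more than a URL with the long address in the query string. A sketch of building one (the endpoint and parameter names are hypothetical, not any particular service’s real API):

```python
from urllib.parse import urlencode

def make_api_url(long_url, alias=None):
    """Build a GET-style creation request for a hypothetical shortener API."""
    params = {"url": long_url}
    if alias:
        params["alias"] = alias            # optional custom alias
    return "http://example.com/api/create?" + urlencode(params)

# Fetching this URL twice should return the same short URL both times,
# precisely because the creation call is idempotent.
print(make_api_url("http://python.org/doc/current/lib/lib.html"))
```

Because the whole request fits in a URL, you can test it from a browser’s address bar, bookmark it, or drop it into a shell script, which is a large part of why GET-based APIs are so convenient.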
But what about the spider trap that Mike proposes? (See the PHP code in his blog entry.) Won’t it fill our databases? Won’t it suck our non-safe services into recursive oblivion? Well, apparently it doesn’t happen, or TinyURL and co would have big problems on their hands.
Also, if you create a decent robots.txt, you’ll stop spider-trap problems, at least from well-behaved spiders. And if a spider has malicious intent, it could just as easily break a POST-only service as one that allows GETs. Just as with XSRF, using POST isn’t a catch-all against baddies.
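For example, a robots.txt along these lines (assuming the creation endpoint lives under a path like /api/, which is a hypothetical layout) keeps well-behaved spiders away from the non-safe URLs while leaving the redirects themselves crawlable:

```
User-agent: *
Disallow: /api/
```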
29 March 2008 by Ben