My last post took a very high-level look at what an API is and what REST APIs are. In this post we’ll look in some more depth at how REST works, and start using some of the right lingo to describe it. It gets a little long, for which I apologise, but please stick with it: at the end I’ll pull it all together with a (hopefully enlightening) real-world analogy, which should make things clear if it all gets too confusing.
Previously we saw HTTP is a way of interacting with something over the internet using four things:
- resource name + the location of that resource (also known as a URL)
- a method of interacting with the resource (e.g. GET)
- the payload (the resource itself)
- other information – which gives some more context and helps to extend this very limited model
Note: I use REST and HTTP interchangeably (which I shouldn’t), but to try to pick apart the difference here is to add more complexity than I feel comfortable with. Consider REST the concept and HTTP the technical way of doing the concept. Or just read them both as the same thing and excuse me for using both.
Ingredients of REST
Uniform Resource Locator (URL)
URLs are something most of us see every day, e.g. https://en.wikipedia.org/wiki/Representational_state_transfer. The URL includes both the server (en.wikipedia.org) and the resource on that server (wiki/Representational_state_transfer). A URL is unambiguous: no two things anywhere on the internet share the same URL.
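To make the server/resource split concrete, here is a minimal sketch using Python's standard urllib.parse module to pull the Wikipedia URL above apart. Nothing here talks to the network; it only splits the text of the URL.

```python
from urllib.parse import urlsplit

url = "https://en.wikipedia.org/wiki/Representational_state_transfer"
parts = urlsplit(url)

print(parts.scheme)  # the protocol: https
print(parts.netloc)  # the server: en.wikipedia.org
print(parts.path)    # the resource on that server: /wiki/Representational_state_transfer
```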
One of the elegant things about REST is that it is very simple, having only a small set of methods, yet still very flexible. HTTP defines nine standard methods (more if you count extensions), but only five are used on a regular basis:
- GET: Used by the vast majority of the web. Viewing webpages, downloading PDF files, watching videos etc. are all GETs (i.e. the client says to the server “I want to GET this resource”).
- POST: The HTTP way of saying ‘create’. This is the least intuitively named of the methods – but really means posting a thing to the server the same way one might post something by mail.
- DELETE: As the name suggests, this removes a resource.
- PUT: This is an ‘overwrite’. It puts the whole resource back to a given location, overwriting the old copy.
- PATCH: This is a newer addition to the REST family (introduced in 2010 – compared to 1996 for GET and POST). Patch allows a ‘partial update’ of a resource – patching over just a little bit of it. Whilst this isn’t great if uploading things like pictures, it’s useful for textual data and is an important and useful part of REST APIs - as we’ll see later.
These five essentially cover the whole of REST – they’re the only ones I’ve ever used. There are others, such as OPTIONS (what methods does this resource support?) and HEAD (like a GET, but just give me the information about the resource and not the resource itself), but they’re quite niche.
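The five methods can be sketched with a toy, in-memory "server". This is purely illustrative: the TinyServer class, its method names and the /users path are all made up for this example, and there is no real networking involved. It just shows what each method does to a stored resource.

```python
class TinyServer:
    """A toy stand-in for a server: resources live in a dict keyed by path."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def get(self, path):
        # GET: hand back the resource, change nothing
        return self.resources[path]

    def post(self, collection, resource):
        # POST: create a new resource; the server picks its location
        path = f"{collection}/{self.next_id}"
        self.next_id += 1
        self.resources[path] = dict(resource)
        return path

    def put(self, path, resource):
        # PUT: overwrite the whole resource at this location
        self.resources[path] = dict(resource)

    def patch(self, path, changes):
        # PATCH: update only the fields supplied
        self.resources[path].update(changes)

    def delete(self, path):
        # DELETE: remove the resource (no complaint if already gone)
        self.resources.pop(path, None)


server = TinyServer()
path = server.post("/users", {"name": "Sam", "email": "old@example.com"})
server.patch(path, {"email": "email@example.com"})
print(server.get(path))   # name kept, email patched
server.put(path, {"name": "Samantha"})
print(server.get(path))   # whole resource replaced, so the email is gone
server.delete(path)
print(path in server.resources)
```

Note the difference between the PATCH and the PUT: the PATCH leaves the name alone, whereas the PUT replaces everything, so any field not sent is lost.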
Methods are defined as being safe and/or idempotent:
- Safe methods can be used without changing anything. GET is safe, but methods such as PUT, PATCH, POST and DELETE are not (obviously deleting a resource is a change).
- Idempotent methods can be performed one or many times with the same result. A PUT is idempotent. Consider overwriting your old email address with email@example.com – this can be done once, twice or a hundred times and the result will always be the same. POST is not idempotent. If you send a POST to create a new order, and then do it many more times, you’ll end up with many orders – and pay a lot of money! The more times you repeat it the more you’ll get, so it is not idempotent.
This is important to think about when something goes wrong: can (or should) you just retry or not? All safe methods are by definition also idempotent.
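Here is a small sketch of the retry question. The table and the can_retry_blindly helper are my own illustrative names. The GET, PUT and POST rows come from the description above; the DELETE and PATCH rows follow the HTTP specification (DELETE is idempotent, PATCH is not guaranteed to be). The loop at the end repeats a PUT-style overwrite and a POST-style create three times to show the difference.

```python
METHOD_PROPERTIES = {
    "GET":    {"safe": True,  "idempotent": True},
    "PUT":    {"safe": False, "idempotent": True},
    "DELETE": {"safe": False, "idempotent": True},
    "POST":   {"safe": False, "idempotent": False},
    "PATCH":  {"safe": False, "idempotent": False},
}


def can_retry_blindly(method):
    """Repeating an idempotent request cannot make things worse."""
    return METHOD_PROPERTIES[method.upper()]["idempotent"]


profile = {}
orders = []
for _ in range(3):
    profile["email"] = "email@example.com"  # PUT-style: overwrite
    orders.append({"item": "kettle"})       # POST-style: create another one

print(profile)      # still just the one email address
print(len(orders))  # three orders, and three bills
print(can_retry_blindly("PUT"), can_retry_blindly("POST"))
```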
The body is the main part of a request or reply between the client and the server. It’s sometimes known as the payload. The body contains the actual substance of the resource being transferred (e.g. an image or file). A body can be sent in the request, the reply, or both, depending on the method.
When doing a GET the response-body contains the resource; when doing a PUT the request-body contains the resource. Occasionally it’s in both: for instance, sometimes a POST will have a body in the request (the client is effectively saying to the server: “please create this for me”) and the resource is then returned in the response (the server is effectively saying: “this is what I’ve created”). This can be useful in case the server adds things during the act of creating. If you POST an order, you might get a copy of the order back, but with extra bits added (e.g. order number, expected delivery date – this is key if you want to subsequently GET the order, because without the order number how do you request it?).
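The order example can be sketched as follows. Everything here is hypothetical: the handle_post_order function stands in for a server, and the field names (orderNumber, expectedDelivery), the starting order number, the date and the /orders path are invented for illustration. The point is only that the response-body is the request-body plus whatever the server added.

```python
import json


def handle_post_order(request_body, next_order_number=1001):
    """Pretend server: create the order and return (status code, response-body)."""
    order = json.loads(request_body)
    order["orderNumber"] = next_order_number  # added by the server
    order["expectedDelivery"] = "2030-01-15"  # added by the server (made-up date)
    return 201, json.dumps(order)


# Client side: the request-body says "please create this for me"
request_body = json.dumps({"item": "kettle", "quantity": 1})
status, response_body = handle_post_order(request_body)

# The response-body says "this is what I've created"
created = json.loads(response_body)
print(status)
print(created)

# The order number from the response is what makes a later GET possible
print(f"/orders/{created['orderNumber']}")
```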
Something I left out of the first post (for simplicity) is the “query string”. This is an extra bit of information stuck onto the end of the URL to tell the server how to process the request. Logically this only makes sense for a GET (hence the name query). It can be used for other methods, but it probably shouldn’t be.
So far resources have been described as static things (e.g. a picture), which they were in the early days of the web, but resources can also be created dynamically on the fly, such as a search results page on Google. The resource is the search results page (which we want to GET) but the content of the results page is dependent on what we search for. So the client needs to tell the Google server what results it wants; this is done by adding the query onto the URL (e.g. https://www.google.co.uk/search?q=what+is+a+query+string ).
A query string starts with a question mark and consists of one or more name/value pairs, separated by ampersands if there is more than one pair, for example http://server/resource?color=red&height=100. In the Google example above, plus signs are used in place of spaces because URLs and query strings cannot contain spaces.
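Building and reading query strings by hand is fiddly, so most languages have helpers for it. Here is a short sketch with Python's standard urllib.parse module, reusing the two example URLs above (nothing is actually sent to Google).

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Building: name/value pairs in, query string out (spaces become plus signs)
query = urlencode({"q": "what is a query string"})
print(query)  # q=what+is+a+query+string
search_url = "https://www.google.co.uk/search?" + query

# Reading: split the URL, then split the query into its name/value pairs
parts = urlsplit("http://server/resource?color=red&height=100")
pairs = parse_qs(parts.query)
print(pairs)  # {'color': ['red'], 'height': ['100']}
```

Each value comes back as a list because the same name is allowed to appear more than once in a query string.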
Another idea wonderfully elegant in its simplicity is that of returning a three-digit status code with every response. These tell the client if a request was processed successfully, and if not why not. The reason I say elegant is that the codes aren’t just a random list (1=worked, 2=failed etc) but they follow a pattern which has two layers of meaning.
The first digit shows the overall result. Codes starting with a 2 show it worked; those with a 4 show it didn’t because of something the client did wrong; those with a 5 show it didn’t work either – but because something went wrong at the server end, so the client cannot do anything about it.
The last two digits are then a well defined list of sub-reasons which give more detail – but if the client doesn’t want that extra detail it can just look at the first number. Here are some common examples.
2xx (it worked):
- 200 – OK: the request was processed successfully (e.g. you asked for that cat picture – I’ve returned it in the response-body)
- 201 – Created: a new thing has been created (e.g. from a POST)
- 202 – Accepted: like a 201 but although the message was received, it hasn’t been processed yet (e.g. it’s been put in a queue of things to do later).
4xx (it failed – client fault)
- 400 – Bad request: You sent something which I couldn’t deal with (a somewhat vague catch-all)
- 404 – Not found: the most famous code on the web. You asked for a resource I don’t have e.g. rather than http://server.com/logo.jpg you asked for http://server.com/loogo.jpg
- 405 – Method not allowed: just because HTTP allows all the methods listed above, it doesn’t mean they’re always expected. Google doesn’t support doing a DELETE to http://google.co.uk/search – either to delete the page, or to delete search results!
5xx (it failed – server fault)
- 500 – Internal Server Error: something went wrong in the server (also rather vague)
- 503 – Service Unavailable: e.g. it’s down for maintenance.
Sitting in-between success (2xx) and failure (4xx/5xx) there is a middle ground. 3xx isn’t a success, but it’s not necessarily a failure - if the client does what the server suggests it might still get a 2xx – happy days! What the server suggests is generally for the client to look in a different place – so these are the ‘redirect’ codes:
- 301 moved permanently: look over there, and don’t bother coming back (the client may choose to remember this for next time)
- 302 found – but you need to go elsewhere before I’ll let you have it (we’ll come back to this later)
- 307 temporary redirect: look over there, but in future come back here as this is only temporary.
The 3xx codes are a bit strange but they’re used a lot in security (e.g. re-direct off to a login page then re-direct back).
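The "two layers of meaning" idea is easy to show in code. In this sketch the FAMILIES table and the family function are my own naming; the detailed names come from Python's built-in http.HTTPStatus list. A client that doesn't care about the detail only needs the first digit.

```python
from http import HTTPStatus

FAMILIES = {
    2: "it worked",
    3: "redirect: look elsewhere",
    4: "it failed: client fault",
    5: "it failed: server fault",
}


def family(code):
    """First layer of meaning: look only at the first digit."""
    return FAMILIES.get(code // 100, "other")


for code in (200, 201, 301, 307, 404, 406, 500, 503):
    # Second layer of meaning: the specific, well-defined reason
    print(code, HTTPStatus(code).phrase, "->", family(code))
```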
Everything covered so far allows for the basic interactions of the web – but this is still fairly limited. If I can PUT my profile picture, what stops someone else putting a different one? It’s also quite inefficient: if every page on Amazon displays the logo in the top left, then I need to GET that exact same logo when each page loads. The answer to these and many other challenges is headers. They allow all the unseen magic of the internet to occur – whilst keeping all the above as simple as I’ve described it. Effectively they’re the extensible set of add-ons to do “other stuff”.
Headers are name-value pairs in the format headerName: value(s). The value can be a single value or multiple values separated by commas; and headers can be used everywhere: in both the request and response, and for every method.
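As a rough sketch of that format, the snippet below splits a couple of raw header lines into names and lists of values. The two header lines are made-up examples. Header names are not case-sensitive, so the sketch lower-cases them. Splitting every value on commas is a simplification: a few real headers (dates, for example) contain commas that aren't separators.

```python
raw_headers = (
    "Accept: text/html, application/json\n"
    "Cache-Control: max-age=3600"
)

headers = {}
for line in raw_headers.splitlines():
    # Everything before the first colon is the name, the rest is the value(s)
    name, _, value = line.partition(":")
    headers[name.strip().lower()] = [v.strip() for v in value.split(",")]

print(headers["accept"])         # two values
print(headers["cache-control"])  # one value
```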
Some common examples of headers sent with requests include:
- Authorization: How do we stop just anyone being able to create or delete a resource, or for that matter GET it? Only I should be able to GET my bank details! The Authorization header provides a place to pass credentials to the server.
- Accept: The Accept header tells the server what format (or formats) of reply I’m able to accept. Some servers can generate responses in multiple formats; other times they only do one, and it may not be what the client wants. When the server can only provide something the client cannot accept, it just sends back an error: 406 – Not Acceptable (if you’ve been following this far, the fact it’s a 4xx shouldn’t be a surprise – the client asked for something the server cannot do, so it’s the client’s “fault”; if the client asks for something else it might still get a 2xx reply). This check stops clients downloading a potentially huge thing which they’ll never be able to process (which would be annoying).
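The Accept check can be sketched like this. The choose_format function is a hypothetical, heavily simplified stand-in for what a server does: it ignores wildcards such as */* and the optional preference weights (the ";q=" part), and simply takes the first format the client listed that the server can produce.

```python
def choose_format(accept_header, server_formats):
    """Return (status code, chosen format) for a given Accept header."""
    # Keep just the format name from each entry, dropping any ";q=..." suffix
    wanted = [item.split(";")[0].strip() for item in accept_header.split(",")]
    for media_type in wanted:
        if media_type in server_formats:
            return 200, media_type
    # Nothing the server offers is acceptable to the client
    return 406, None


print(choose_format("application/json, text/html;q=0.9", ["text/html"]))
print(choose_format("application/xml", ["application/json"]))
```

In the first call the server can't do JSON but can do HTML, which the client also accepts, so all is well. In the second there is no overlap, so the client gets a 406.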
Examples of common response headers include:
- Cache-Control: How long the client can keep a local copy of this resource before requesting it again. Some things (e.g. site logos) appear a lot but change very infrequently. There is no need to download the logo on each page. Rather, web browsers cache (remember) a copy for minutes, hours or even days to speed up page loads. Given that the client doesn’t understand the context, it’s up to the server to explain how long the resource can be cached for – this header tells it.
- Content-Type: What type is this file? This goes with the Accept header – if the client said it could accept A, B or C this says which one they’re getting.
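Here is a minimal sketch of how a client might use Cache-Control to decide whether its remembered copy of the logo is still good. The is_fresh function is my own illustration and only understands the max-age directive (a number of seconds); real browsers handle many more directives than this. Times are plain numbers of seconds to keep the example simple.

```python
def is_fresh(cache_control, fetched_at, now):
    """True if a copy fetched at `fetched_at` may still be reused at `now`."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return now - fetched_at < int(value)
    # No max-age given: this sketch plays safe and asks the server again
    return False


header = "public, max-age=3600"  # "you may keep this for an hour"
print(is_fresh(header, fetched_at=0, now=1800))  # half an hour later
print(is_fresh(header, fetched_at=0, now=7200))  # two hours later
```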
There are lots more common headers, and rather nicely anyone can add their own – so long as both the client and server understand them (e.g. if they’re agreed in the API Contract – which ties this nicely back to the concept of APIs).
Although it was slightly long, that’s everything important about REST and HTTP (at least before HTTP/2 came along… but that’s another story). There are more headers, more status codes and more methods; but they all fit into this framework, so if you’ve followed this then well done, you understand REST and HTTP. If this is all starting to feel a bit complex and disjointed, then hopefully an analogy will help…
In large warehouses dispatchers instruct pickers and stackers (driving forklifts, or pushing shopping carts) to either pick or stack items in specific places within the warehouse. When doing a “pick”, the forklift goes from the dispatcher to the specified location then comes back with the requested item (or, when things go wrong, comes back to say the item cannot be found and gets its next instruction). When stacking, the forklift goes from the dispatcher to the location with an item, then comes back to say it’s been successfully deposited – potentially with some more information such as where in the bay it is (or, when things go wrong, comes back to say it cannot be deposited – maybe the bay is full). Hopefully you see where I’m going here.
When packing before IT automated everything, the driver was handed a paper pick instruction. This paper instruction is to pick (Method=GET) an item (Resource) from a location (server). It even has a space for extra notes about the pick (Query Parameter).
In addition, the driver has knowledge not contained on the pick note – such as that kettles must be in boxes, or they’re not acceptable to load onto a truck (Accept header).
The driver returned with either the item (Response-body) and/or a verbal update that it was OK (200) or that there was a problem: “there were no kettles” (404) or “none of the kettles were acceptable” (406).
Some warehouses even have secure areas where high value items (e.g. jewellery/tablets/phones) are stored, and pickers cannot get in without an access badge (Authorization header). Other times a driver may get to an area and find a note saying ‘bay being renovated, items in bay 5’ (307 temporary redirect), or find an aisle blocked off temporarily for cleaning or because it’s full of items (503 unavailable).
This post has covered all the parts which make up the REST/HTTP technology. Much of this has been described in terms of the web (web pages, videos, pictures etc). Whilst this has hopefully been useful and interesting, in the next post we’ll go back to using REST for APIs and see why this is a good way of doing things. Why use REST for our APIs? What alternatives are there? Why is REST (for now at least) the most widely adopted method (no pun intended)? 😊 That will all be covered in REST vs the rest – comparing REST with other options.