API Ecosystem


The first three posts in this series have looked at REST APIs, how these use the technology of the web (REST/HTTP) and how they compare to alternatives. In this last post we’ll look at how APIs operate at scale.

It doesn’t take much to get going with an API – you just need a programming language which can understand HTTP (which is virtually all of them) and a bit of code. To show how little code, I tried to write the shortest API which did something useful (i.e. didn’t just return some static text such as “Hello”) and followed the rules laid out in blog #2 (Methods from Madness). This is what I came up with (written in NodeJS):

const http = require('http'), url = require('url');
http.createServer(function(req, res) {
  var j = {};                                  // the response body object
  var url_parts = url.parse(req.url);
  if (req.method != 'GET') {
    res.writeHead(405, {'Content-Type': 'text/plain'});         // only GET is supported
    res.end('Method not allowed');
  } else if (url_parts.pathname == '/hour') {
    var date = new Date();
    j.value = date.getHours();                 // the current hour, 0 to 23
    res.writeHead(200, {'Content-Type': 'application/json'});   // success
    res.end(JSON.stringify(j));
  } else {
    res.writeHead(404, {'Content-Type': 'text/plain'});         // unknown resource
    res.end('Resource not found');
  }
}).listen(1337, '127.0.0.1');

Don’t worry if you can’t read NodeJS; essentially, what it does is:

  1. Listens for incoming requests - when it gets one...
  2. Checks the method used - if it's anything other than GET, it returns a 405 (Method not allowed)
  3. Assuming it is a GET and the resource is “/hour”, sends back the number of the current hour (0 to 23)
  4. Otherwise, if the GET was for a resource other than “/hour”, returns a 404 (Not found)
This is actually a usable (if simple) API, and it took only 17 lines of code to get working! If I hadn’t worried about checking methods and returning the correct codes (404 and 405) I could have done it in fewer than 10.
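
To see it in action, a client can be almost as small. Here’s a minimal sketch using Node’s built-in http module (the status code and hour shown in the comment are just an example – the value will depend on when you run it):

// A tiny client for the API above
const http = require('http');
http.get('http://127.0.0.1:1337/hour', function(res) {
  let body = '';
  res.on('data', chunk => body += chunk);                  // collect the response body
  res.on('end', () => console.log(res.statusCode, body));  // e.g. 200 {"value":14}
});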

All of this reinforces what I said in the last post: REST got popular by being quick and easy to adopt. But if it’s so simple, why do we need anything else? In fact, why would we need an “ecosystem”?

Running APIs

The first answer is (as so often in life) that it’s sometimes easier to get something to work once than to keep it working. Once an API (or 100 APIs) is up and running, how do you keep an eye on them all? How do you spot when they stop working – or better still, predict this before they fall over? Finally, how do you protect them from freak events or malicious attacks? The answer most people rely on is a product called an ‘API Gateway’.

API Gateways sit between clients and the servers which run the API – and add a layer of monitoring, protection, and management on top. API Gateways require each client to pass a ‘key’ to identify themselves (in an HTTP header, as you might guess by now), and provide a range of valuable features.

Rate limiting (a.k.a. throttling)

To ensure APIs are not overwhelmed by requests, the gateway can limit the number of requests they let through. This can be done per API (e.g. the GET /hour API can safely manage 100 requests/minute, but the GET /calendar API can only cope with 50 requests/minute). Alternatively, it can be done per client-key (e.g. this client is only allowed 5 requests a minute and can use them for whichever API they like).

API throttling ensures a single API doesn’t get swamped with requests.

Key throttling serves two purposes: first, to ensure that one hungry client doesn’t crowd out others; and second, to provide tiered layers of access. This allows an organisation to prioritise its own consumers above a business partner, or to give partners a 'free tier' which becomes paid if they get popular. An example of this tiered access model is TomTom, which offers free access to its maps API for small numbers of requests, but charges big users (like Apple Maps, which uses TomTom to calculate routes) a price per request (or per thousand, ten thousand, etc.) – this is called API monetization.
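
To make this concrete, here’s a rough sketch (in NodeJS, like the API above) of the kind of per-key check a gateway might perform. The fixed one-minute window and the limit of 5 requests are illustrative assumptions – real gateways use more sophisticated algorithms and share state across servers:

// Illustrative per-key rate limiter: a fixed one-minute window, held in memory
const LIMIT_PER_KEY = 5;      // hypothetical: requests allowed per key per minute
const windows = new Map();    // client key -> { start, count }

function allowRequest(apiKey) {
  const now = Date.now();
  let w = windows.get(apiKey);
  if (!w || now - w.start >= 60000) {   // window expired: start a fresh one
    w = { start: now, count: 0 };
    windows.set(apiKey, w);
  }
  w.count++;
  return w.count <= LIMIT_PER_KEY;      // false means: respond 429 (Too many requests)
}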

Security

The act of sending a key with an API request gives a basic degree of security, but it’s fairly weak by itself. Gateways support a range of well-known (and trusted) security standards to provide:
  • Authentication: Who is the client?
  • Authorization: Are they allowed to do what they’re trying to?
  • Auditing: Who did what and when?
  • Encryption: sending sensitive data (including passwords/keys) in encrypted form, rather than in ‘clear text’.
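
As an illustration, the most basic of these – authenticating a client by its key – might look something like the sketch below. The 'x-api-key' header name and the in-memory key store are assumptions made for the example; real gateways integrate with proper identity and key-management systems:

// Illustrative key check a gateway might perform before forwarding a request
const validKeys = new Set(['key-abc-123']);   // hypothetical registered client keys

function authenticate(req) {
  const key = req.headers['x-api-key'];       // the client's key, passed in an HTTP header
  if (!key || !validKeys.has(key)) {
    return { ok: false, status: 401 };        // we don't know who this client is
  }
  return { ok: true, client: key };           // authenticated: now check what they may do
}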

Threat protection

In addition to securing the APIs, many Gateways offer some level of threat protection to detect well-known cyber-attacks. This includes blocking very large files which might slow down or overload an API; spotting and blocking deliberately corrupted data which an API cannot handle; and scanning messages for viruses.
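
The simplest of these protections – rejecting oversized requests before they reach the API – could be sketched like this (the 1MB limit is an arbitrary example):

// Illustrative size check, applied before a request is passed to the API
const MAX_BODY_BYTES = 1024 * 1024;     // hypothetical 1MB limit

function tooLarge(req) {
  const length = Number(req.headers['content-length'] || 0);
  return length > MAX_BODY_BYTES;       // true means: respond 413 (Payload too large)
}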

Caching

As we noted in Post #2 (Methods from Madness), things returned by a GET can have a Cache-Control header to say how long they can be remembered for. Whilst this was described in terms of a client trying to GET a picture only once, responses can also be remembered (cached) by the gateway.

If an API takes a lot of effort to run (such as a GET /pi?decimalPlaces=99999999 API which calculates Pi to many decimal places), then caching the result may be a good idea not only for the client which receives it, but also for the gateway. Then, if other consumers want the same result, the gateway can send it back without re-calculating it. This both speeds up responses and reduces the cost of running lots of servers.

This is a great example of how using existing HTTP norms pays dividends. The API Gateway doesn’t need to know what’s in a response, or its format (picture, XML, JSON or just a very long number) – all it needs to know about is the Cache-Control header, and it can provide a caching service.
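
As a sketch of the principle: a gateway-side cache only needs to store a response when it carries a Cache-Control max-age, and can serve it back until that age expires (real gateways handle many more cache directives than shown here):

// Illustrative gateway cache, keyed on URL and honouring Cache-Control max-age
const cache = new Map();    // url -> { body, headers, expires }

function getCached(url) {
  const entry = cache.get(url);
  return (entry && Date.now() < entry.expires) ? entry : null;  // null if absent or expired
}

function storeResponse(url, headers, body) {
  const cc = headers['cache-control'] || '';
  const match = cc.match(/max-age=(\d+)/);    // how long may this response be remembered?
  if (match) {
    cache.set(url, { body: body, headers: headers,
                     expires: Date.now() + Number(match[1]) * 1000 });
  }
}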

Logging/reporting

Finally, people who keep things running need to know what’s happening. The gateway will usually keep metrics on what’s passing through it. These reports can be broken down by API, client-key, or both. Generally, metrics include things like:
  • Number of requests: How busy are the APIs? Over time are we getting busier? Do we predict we’re heading for a capacity problem? Who are our big consumers?
  • Number of successful/error responses: What types of errors are we getting (e.g. how many 401/403 auth errors, and how many 404 not-found errors)? This can show whether there are bugs in the client, or whether someone might be trying to hack our APIs!
  • Response times: How quick are our APIs, for both successful and error responses? Is this getting better or worse over time?
  • Active clients: Is anyone using the old version of our API? Can we turn it off? If only a couple of systems are still using it, this tells us who they are, so we can warn them to move to a newer version or find a replacement API.
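
As a rough sketch, the counters behind such reports might be recorded per API/key combination like this (the field names are illustrative):

// Illustrative metrics, recorded per client key and API path
const metrics = new Map();    // "key|path" -> { requests, errors, totalMs }

function record(apiKey, path, status, elapsedMs) {
  const id = apiKey + '|' + path;
  const m = metrics.get(id) || { requests: 0, errors: 0, totalMs: 0 };
  m.requests++;
  if (status >= 400) m.errors++;      // count 4xx/5xx responses as errors
  m.totalMs += elapsedMs;             // average response time = totalMs / requests
  metrics.set(id, m);
}
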
That largely covers everything needed to ‘run’ APIs, but what about the people wanting to use them in their own development? This is where Portals come in.

Documenting and sharing APIs

APIs may have names which make them intuitive (such as GET /orders) and may hyperlink to others to aid discoverability (e.g. a GET /order API may link to the GET /product and GET /customer APIs), but that doesn’t help a developer trying to find the APIs and use them for the first time. For this, API Portals were developed.

An API portal is essentially a website which catalogues the APIs offered by an organisation. The portal may be open to all, or require a developer account. Developers can search for APIs, find documentation about them (their API contract) and sometimes ‘try it now’ – making a request of the API directly from the portal user interface. Portals are usually linked into the Gateway, so developers registered on the portal can immediately use some or all the APIs on the gateway (often with a low access tier). If developers find that they like the API, they can contact an administrator through the portal to request a higher tier. The portal can also be used to report bugs, request new features, or chat with the rest of the community using these APIs.

API gateways and portals make use of another product developed to fill a niche: a standardised way of documenting an API contract. There are several open-source options, including the older WADL, the newer Swagger (“why waddle when you can swagger”), and RAML. Swagger seems to be gaining the widest adoption, and not just because it has a name created at WADL’s expense. These all offer a standardised way of documenting an API’s resources, methods, status codes, and bodies. Once a Swagger/WADL/RAML file has been created (or, often, generated directly from code), it can be imported into the Gateway/Portal, meaning there is no need to set these up manually. Clients of an API can also use these contracts: not only can developers read about an API, they can automatically generate the code to call it – all of which reduces time and human error when getting started.
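
For a flavour of what such a contract looks like, here’s a sketch of how the GET /hour API from the start of this post might be described in Swagger (2.0) – the “Clock API” title is invented for the example, but it shows resources, methods, status codes and formats all in one machine-readable document:

{
  "swagger": "2.0",
  "info": { "title": "Clock API", "version": "1.0" },
  "paths": {
    "/hour": {
      "get": {
        "summary": "Returns the current hour (0 to 23)",
        "produces": ["application/json"],
        "responses": {
          "200": { "description": "The current hour" },
          "405": { "description": "Method not allowed" }
        }
      }
    }
  }
}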

Conclusion

In this series we’ve seen how APIs allow applications to talk to each other. In an increasingly connected world, most applications now use the internet, and expectations of speed and agility are always increasing. APIs are one tool which allows developers to develop new products faster than ever (coupled with other trends such as Cloud Computing, Agile, DevOps, and Open Source Software). In fact, these often work to complement and magnify each other’s effects:
  • APIs written in Open Source languages are exposed through Open Source or Cloud (SaaS) Gateways and Portals. They're hosted on Cloud platforms; developed in an Agile way with DevOps principles and tooling.
  • As Cloud vendors offer new services, these are exposed through APIs: for example, APIs to manage cloud compute (turn it off or on), to access cloud storage (GET or PUT a file), and to access cognitive services such as Azure Cognitive Services or AWS Rekognition – all of which are used through REST APIs.
  • DevOps tools have their own APIs to allow remote control of build and deployment pipelines.
  • Developers build new products in an Agile way by finding existing APIs to service their needs rather than reinventing the wheel.
REST has become (for now at least) the de facto standard for APIs - so much so that people use "API" to mean "REST API" (something I've tried to avoid in these blogs). The API Ecosystem has spawned a set of niche tools to extend REST, making it a complete answer for developing and running APIs at scale.

Hopefully this series of blogs has been an interesting and enlightening overview of these topics. If there are any questions or feedback, please feel free to leave a comment below – and thanks for reading.
