Integration a decade on – what remained


A decade after working on my first Integration project, I’ve been reflecting on what changed and what hasn’t. My last post focussed on how things were then, and which technologies and techniques have fallen away; but there is as much in common as there is different. Much “best practice” in 2007 is still good now, and sadly some of the challenges I saw then have still to be truly addressed.

Open standards

A big theme of the early 2000s was a push to use open standards – although older colleagues may point out that CORBA predated this. In my “what changed” blog I took a rather tongue-in-cheek attitude to the myriad of WS-* standards; but although this trend got a little carried away, it was essentially a good thing.

Back then, committees decided standards; now a more Darwinian “survival of the fittest” Open Source approach has taken hold. Standards are still open, but rather than a single negotiated compromise, several candidates emerge and then one – or perhaps two – win out. This was seen when Swagger, and to a lesser extent RAML, emerged as successors to WSDL. These can be more elegant (not having gone through the committee sausage machine), but there may also be less uniformity. We don’t all quite talk one language; but we are, at heart, still standards based.

Discoverability is still elusive

Last time, I discussed how HATEOAS displaced UDDI. In truth, there’s still work to do on discoverability. In my experience, HATEOAS links in REST APIs are sporadically implemented at best, and often ignored entirely.

At the extreme end of the scale, some REST evangelists suggest that apps should discover the available APIs at run-time (e.g. call api.host.com to get a completely dynamic list). This means apps don’t need to “know” about any “/resource” APIs; but I’ve never seen this done (even as a POC).

Most organisations I’ve seen adopting a REST approach will easily reach Level 2 on the Richardson Maturity Model, but many stop there and are happy to hard-code their apps to use fixed APIs (host.com/something). Others go one step further, using HATEOAS where it is most useful to link directly related items between APIs (e.g. products in a shopping cart to a product details API), but don’t link everything.
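As a rough sketch of that middle ground (the host, paths and response shape here are invented for illustration), a cart API might embed links to the product details API so the client follows links for the things it cares about rather than building URLs itself:

```python
# Hypothetical example: a cart response embedding HATEOAS links to a separate
# product details API, so clients follow links rather than hard-coding URLs.
import json

cart_response = {
    "cartId": "abc-123",
    "items": [
        {
            "productId": "42",
            "quantity": 2,
            "_links": {
                "product": {"href": "https://api.example.com/products/42"}
            },
        }
    ],
}

def product_urls(cart: dict) -> list:
    """Follow the embedded links instead of building host.com/products/{id} by hand."""
    return [item["_links"]["product"]["href"] for item in cart["items"]]

if __name__ == "__main__":
    print(json.dumps(product_urls(cart_response), indent=2))
```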

The truth is that a fully dynamic approach isn’t much help unless all logic is server side. An app which doesn’t expect an endpoint won’t know what to do with it even if it suddenly becomes available. Even without that extreme, writing lots of HATEOAS links into an API takes effort to both develop and maintain. Linking only what is useful, rather than absolutely everything (in an attempt to be “pure”), is probably a trend which will continue, and it’s probably sensible that it does. This means most APIs will never fully reach Richardson Maturity Level 3 – but it’s good to keep pushing for at least some HATEOAS. Perhaps trying to get everyone to Level 2.5?

Versioning is still hard, and still overlooked

The start of too many APIs goes something like this:
  1. Create an API.
  2. Want to change the API.
  3. “Oops, I didn’t think about versioning.”
  4. Create Version 2 of the API.

Versioning is very often overlooked. At one level it’s simple: try to make changes backwards compatible so you don’t need to introduce a new version; up-version when you must; and retire the old version once all users have upgraded.
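As a minimal sketch of that principle (using Flask, with invented endpoint and field names): additive, optional changes can stay on the existing version, while a breaking change gets a new versioned path so existing consumers keep working until they migrate.

```python
# Sketch only: additive changes (an extra optional field) are served from /v1,
# while a breaking change (renaming/restructuring a field) moves to /v2.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/v1/orders/<order_id>")
def get_order_v1(order_id):
    # Adding "currency" later is backwards compatible: old clients just ignore it.
    return jsonify({"orderId": order_id, "total": 9.99, "currency": "GBP"})

@app.route("/v2/orders/<order_id>")
def get_order_v2(order_id):
    # Renaming "orderId" and nesting the total is breaking, hence the new version.
    return jsonify({"id": order_id, "total": {"amount": 9.99, "currency": "GBP"}})

if __name__ == "__main__":
    app.run(port=8080)
```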

Sadly, it’s tricky when there’s lots of reuse, and webs of interdependence between APIs (the cost of HATEOAS again). This is true in large Object Oriented projects; it was true in large SOA Web Services projects; and it’s true in RESTful APIs.

In OO, reuse is achieved by distributing copies – so if people run old versions it’s not your problem. If you’ve got webs of interdependence within your own monolithic application you can at least (fairly) easily trace and refactor where things are used.

Services on a network are more difficult. If the service is only used by a couple of internal applications you can force them to upgrade (perhaps even as a big bang). If APIs are used by external consumers you don’t control, or by apps given to “Joe Public”, then this can be really tricky. As the API owner, you’re responsible for keeping them running – or suffering the fallout if you turn them off too soon. Tooling such as API gateways with developer keys can help you see who uses an API, as can sensible logging; but even knowing who is using an API isn’t always enough.

As an example, if a company has four versions of a mobile app running on each of Android, iOS and Windows Phone (a few do, I’m told), then they have 12 apps using their APIs. Hopefully they had the foresight to include functionality to force upgrades, or a kill switch triggering a polite message saying, “I won’t work anymore”. They probably didn’t think of that (versioning is overlooked), so they need to maintain all the APIs supporting these apps – or disappoint their customers when their apps suddenly stop working with no warning.
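To make the kill-switch idea concrete, here is a minimal sketch of the app side; the /app-config endpoint, its response fields and the version numbers are all invented for illustration.

```python
# Hypothetical sketch: at start-up the app asks a config endpoint for the
# minimum supported version, and bows out politely if it is too old.
import requests  # assumes the 'requests' package is available

APP_VERSION = (2, 1, 0)

def still_supported(config_url="https://api.example.com/app-config"):
    """Return True if this app version is still supported by the API owner."""
    config = requests.get(config_url, timeout=5).json()
    minimum = tuple(int(part) for part in config["minSupportedVersion"].split("."))
    return APP_VERSION >= minimum

if __name__ == "__main__":
    if not still_supported():
        print("This version is no longer supported - please upgrade.")
```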

In some ways the tooling is better than it was, but mobile apps mean it’s more important than ever to think about versioning early.

Middleware still exists

This is sometimes overlooked in the present buzz around Microservices.
  • Old view: Systems provide services and an ESB connects them (with discovery, transformation, orchestration and routing) – but this scales poorly and is ultimately inflexible.
  • New view: Microservices encapsulate single functions, provide well-defined interfaces, do transformation, and orchestrate other Microservices. REST allows discovery. It all scales well.

This sounds great but has two problems: Microservice architectures aren’t going to become the norm overnight, and they aren’t the best answer for every problem – so they may never become ubiquitous.

Microservices are sold on their ability to adapt quickly and scale. Scalability comes in the form of high throughput, but also scalable development. Not having a shared database helps solve the former; breaking the problem down into small “two-pizza-sized” teams solves the latter.

Whilst all this is great, a microservice architecture has drawbacks: distributed applications incur a cost (multiple databases, added latency of more network calls, an “eventually consistent” data architecture) and can be more complex. In cases where scale isn’t that important, and where time to market isn’t key, the cost might not be worth it.

Areas of business differentiation can really benefit from an innovative Microservices approach – but commodity IT (finance, HR, supply-chain management, etc.) will likely remain COTS packages, which may never be re-architected into Microservices. Add to that the fact that most organisations have legacy products that they don’t plan to retire or replace any time soon.

In most cases, therefore, there is a need to integrate COTS/legacy products with each other – or to expose a COTS/legacy product to custom builds. In these cases, having something flexible in the middle is still useful. My last post discussed “The fall of the ESB”, but whilst it has lost its golden place as a “MUST USE”, middleware is still a useful tool in the toolbox for solving some problems.

Are services an asset or just part of a user story?

A common ideal (then as now) is that services/APIs are a reusable asset. A common reality (then as now) is that they are just something needed to deliver a particular widget or journey. Registries & Repositories, and API Portals, have both tried to solve the technical challenge of re-use (finding an existing service/API and understanding it). As I highlighted last time, the fact that APIs are inherently open to the internet makes them reusable outside the organisation. But the design challenge remains: if a service is designed well, as a generic component, it will be more reusable than if it’s designed narrowly, in a rush, for an immediate need.

There are best practices (then as now) – most recently approaches such as Mulesoft’s “API-led architecture” for layering APIs of different granularities – but ultimately this takes discipline. It works best if someone with authority (if not the CEO, then at least an Architect or a Product Owner) sees APIs as an asset and not just a bit of plumbing to connect an app to its data source.

Data is central

Fundamentally Integration involves only two things:
  1. Moving data around
  2. Non-functional requirements (see below)

The only truly tricky integration challenges are where data doesn’t “fit” somehow: when trying to connect two systems, a required field is missing, there’s no shared key, or the formats don’t match. Ultimately most integration design/build is data mapping and data flow; the rest is solving silly problems, or triaging defects which had nothing to do with integration but for which integration got the blame until someone proved otherwise (guilty until proven innocent).
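A tiny illustrative mapping (system names and fields invented) of the kind that makes up most of that design/build work – reshaping one system’s record into another’s, and deciding what to do when the data doesn’t quite fit:

```python
# Illustrative sketch only: reshape a CRM-style customer record into the shape
# a billing system expects, handling the classic "doesn't fit" problems.
from datetime import datetime

def map_crm_customer_to_billing(crm_record: dict) -> dict:
    """Map a (hypothetical) CRM customer record to a (hypothetical) billing format."""
    return {
        # No shared key: billing uses its own scheme, so prefix the CRM id
        "accountRef": f"CRM-{crm_record['customer_id']}",
        # Formats don't match: CRM uses dd/mm/yyyy, billing wants ISO 8601
        "since": datetime.strptime(crm_record["created"], "%d/%m/%Y").date().isoformat(),
        # Required field missing upstream: fall back to a documented default
        "vatNumber": crm_record.get("vat_number", "UNKNOWN"),
    }

if __name__ == "__main__":
    print(map_crm_customer_to_billing({"customer_id": "10042", "created": "03/05/2007"}))
```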

Finally... Fundamentals, and non-functionals

Some things are provable (like the CAP theorem). Others are obvious common sense once they’ve been pointed out, but are easily done wrong until then. Lots has changed in the last decade, but most fundamentals have remained (and because they’re fundamental, they’re not unique to integration).

The unhappy path is always followed eventually

Data can be sent at-most-once (try and give up), at-least-once (try, then retry, but it might arrive more than once), or exactly-once (but only if you use a product with transaction commit guarantees such as JMS – and even then there are issues). This needs to be thought about, and tested. REST APIs help by making as much as possible idempotent – although the consumer still needs to know to retry. POST APIs need thought about who is responsible for avoiding duplicate records when a call times out: either the client does a GET to find out whether the original request made it, or the server ignores replays – if it can detect them.
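As a client-side sketch of that last point (the orders endpoint and the “clientRef” de-duplication field are invented for illustration; a real API may offer its own mechanism), the client retries a timed-out POST only after a GET confirms the original never arrived:

```python
# Hypothetical sketch: an at-least-once client for a POST that must not create
# duplicate orders when the unhappy path (a timeout) is followed.
import uuid
import requests

API = "https://api.example.com/orders"

def create_order_safely(order: dict, max_attempts: int = 3) -> dict:
    # Attach a client-generated reference the server (or we) can use to spot replays.
    order = {**order, "clientRef": str(uuid.uuid4())}
    for _attempt in range(max_attempts):
        try:
            response = requests.post(API, json=order, timeout=5)
            response.raise_for_status()
            return response.json()
        except requests.Timeout:
            # Unhappy path: we don't know if the POST arrived. Check before retrying.
            existing = requests.get(API, params={"clientRef": order["clientRef"]}, timeout=5)
            if existing.ok and existing.json():
                return existing.json()[0]  # it did arrive - don't create a duplicate
    raise RuntimeError("Order creation failed after retries")
```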

Non-functionals are key

Non-functionals could be a blog post in their own right, and this one is already far too long. Suffice it to say that security, scalability, reliability, operability (including monitoring and alerting), latency, and maintainability are all important, both for individual APIs and for the platform as a whole.

Conclusion

The biggest mistake I made when I was first exposed to SOA and Web Services was assuming we were reaching an integration “end of history”, and that there was a “correct answer”. I came out of university having been taught a lot of good things, but forgetting that all science is evolution and there are a lot of missteps and re-appraisals along the way. XML was “better” than what came before; it was an open standard which everyone agreed on; thus it would be that way from now on. Right?

What I didn’t see was that it was bloated, that end-developers hated XML DOMs, and that namespaces were overkill for most situations.

XML and SOAP fell faster than one might have expected, but the fact that they were replaced by something new should not have been a surprise. Now that I’m a little older, and hopefully a little wiser, I’m trying not to make the same mistake with the tools and techniques of today. They too have advantages over what came before, but they too have problems. “Old” ideas like RPC are being talked of again, this time with fashionable new implementations – so the days of REST might be numbered. What I do expect is that at least some of the things which have stayed constant for the last decade (especially around data, fundamentals and non-functionals) will be around for a while yet.

Disclaimer

My postings reflect my own views and do not necessarily represent the views of my employer.

For more by me follow me on Twitter @JAGLees
