Building Resilient API Clients

by Kyle Fuller

Aug 24 2015

Current methods of API versioning don’t actually solve the problem of breaking changes — they just delay them. Kyle Fuller presents the REST architecture as a means of allowing changes in API & API clients to develop independently. While dispelling common misconceptions about REST, he’ll demonstrate how to build such resilient API clients.

Building Resilient API Clients (0:00)

Today, we’re here to talk about building resilient API clients that can adapt to API changes without requiring a deep understanding of the API’s implementation details. Let’s first talk about how a client is built and common problems that arise. I’ve made a simple demo iOS application called Polls, which lets you view a list of multiple-choice questions and vote on your favorite choice. You can swipe a question to delete it or tap on one to view it. When you tap on a question, you can also cast your own vote. Finally, you can go back to the list of questions and hit the create button to create your own question.

How does it work? (1:37)

Now that we know what this app can do, let’s talk about its implementation and how it communicates with our API. To load the initial list of questions, the app queries our API by performing an HTTP GET request. We’ve already made an assumption here: that our API is going to return a JSON representation of the questions. When we swipe to delete, we construct a URI to delete the question. In this case, we construct the URI from the question’s ID, assume that the user is allowed to delete this question, and perform the operation.

When we go into a question itself, we need to construct another URI from the question’s ID to download it. Remember that the ID is a server-side implementation detail. In this case, it comes from using a relational database with auto-incremented IDs. It seems absurd that the client has a notion of this ID and relies on it to work, when it is really a detail of how the question is stored on the server behind our API. When a user taps on a choice to vote, we again construct a URI, this time with two IDs.

Now, we probably need to duplicate some business logic. If we have already voted on a question, can we vote on it a second time? We need to have an interpretation of such rules on our client itself. When we go back to the first screen and hit create, we’re going to be shown a form, and we make an assumption about what the API accepts and how it’s encoded. We assume that the API accepts a JSON-encoded body. It’s clear that our application has been built from an out-of-band specification, and we’ve designed our application in a certain way because of our understanding of how the API is going to work. By designing our application around this out-of-band specification, we’ve tightly coupled our application to how the API works. This might not seem like a problem initially, but eventually you’re going to need to change something.
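To make the coupling concrete, here is a minimal sketch of the kind of client code described above, written in current Swift. The host, the paths, the `Request` type, and the function names are all hypothetical; they only illustrate how an out-of-band specification ends up compiled into the client.

```swift
import Foundation

// Hypothetical base URL; every path below is an assumption compiled into the app.
let baseURL = "https://polls.example.com"

// A plain description of an HTTP request, kept free of networking code.
struct Request {
    let method: String
    let uri: String
    var contentType: String? = nil
    var body: Data? = nil
}

// Deleting builds a URI from a server-side database ID.
func deleteQuestion(id: Int) -> Request {
    return Request(method: "DELETE", uri: "\(baseURL)/questions/\(id)")
}

// Voting needs two IDs, so the URI layout is assumed twice over.
func vote(questionID: Int, choiceID: Int) -> Request {
    return Request(method: "POST",
                   uri: "\(baseURL)/questions/\(questionID)/choices/\(choiceID)")
}

// Creating assumes the method, the content type, and the shape of the body.
func createQuestion(_ question: String, choices: [String]) throws -> Request {
    let payload: [String: Any] = ["question": question, "choices": choices]
    let body = try JSONSerialization.data(withJSONObject: payload)
    return Request(method: "POST", uri: "\(baseURL)/questions",
                   contentType: "application/json", body: body)
}
```

If the server renames a path, switches to a different ID scheme, or changes the body encoding, every one of these functions breaks until a new build of the app ships.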

API Versioning (4:03)

People often solve this problem with API versioning: they create a completely different version of the API. When we move fast with our APIs, we often change how the API works in breaking ways, but that shouldn’t cause our old clients to break. We still want to move fast without breaking things.

When we make a breaking change in our API, we often need to release an updated application through the App Store. With iOS, we can’t do this immediately because there’s often a one to three week review period. If we don’t offer an updated client in time, our users are left in the dark with a broken application. It’s not like the web, where you can simply redeploy quickly.

Some API providers say they’ll keep each version of their API around for quite a long time. For example, Facebook says they’re going to keep their API around for two years, which is great, but eventually you’re going to have to switch to the new API. Freezing versions of the API doesn’t actually solve the problem; it just delays it. Doing so also requires the API team to maintain the legacy API. It also means they might be reluctant to change the version and how it currently works. They often won’t be able to innovate on their APIs quickly because they’re still holding on to these existing concepts.

Move Fast and Break Nothing (5:39)

Wouldn’t it be great if we could move fast and not break anything by building more resilient clients that could adapt to future change? I’m sure you’ve all heard of REST. It’s a common architecture for building APIs, primarily designed to allow us to embrace change. REST is built from a set of constraints that allow the server and clients to change independently, and the web is built on it. Many of the constraints in REST are designed specifically to embrace and anticipate change. It would be ridiculous if clients had to be updated every time a website was redeployed.

People often think of REST as CRUD, pretty URLs, JSON, HTTP verbs, and routing, but this isn’t actually what REST is about. Instead, it’s about resources and representations of those resources. It’s about using hypermedia as the engine of application state. We can’t allow change in our API if our clients hard-code the controls for how the API works at build time. Out-of-band specifications used to build API clients bake assumptions about the API into the client. REST is not about exposing your database through your API, and doing so will result in tight coupling.

We can achieve this by using representational state transfer, hence REST. We can transfer representations of state from the server to the client. If we view a question, the server should tell us how we can transition from that current state to a different state.

HATEOAS (7:27)

This is achieved by something called HATEOAS, or Hypermedia as the Engine of Application State, one of the key constraints behind REST. When you look at the questions resource, which is a collection of resources, it offers two things: one, to create new questions, and two, to list them. Each individual question resource could also offer one of two things: the ability to delete that resource, or the ability to view it and its choices. Each of those choice resources offers one thing in this particular case, which is to vote on it.

We can build better API clients by teaching the client the semantic meaning of our domain, in this case the Polls domain. Instead of making compile-time assumptions about our API, we should learn about these controls at run time. The API should offer them without divulging implementation details, and we should avoid building our application on top of such details. We can instead load our application and download roughly how the API works. We can ask the API what transitions it offers us and what states are possible with it. Does it offer us the ability to view a list of questions?

When it comes to each individual question, we should determine if the API allows us to transition to different states. We should check if the API offers the transition to vote on a choice. We shouldn’t assume that the user can vote on all of the choices, or duplicate the server’s business logic about when a user can vote on a choice. We should check if the API allows us to create questions and what kind of parameters are used to create those questions. By hard-coding this kind of information, we prevent the server from changing without the client being updated.
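The run-time checks described above can be sketched with a few simplified types. `Transition`, `Resource`, and `Affordances` are hypothetical names made up for this illustration in current Swift; they are not the API of any particular framework.

```swift
// Hypothetical, simplified types for illustration only.
struct Transition {
    let uri: String
    let method: String
    var parameters: [String] = []
}

struct Resource {
    var attributes: [String: String] = [:]
    var transitions: [String: Transition] = [:]
}

// The controls the interface should show for a resource, decided at run time.
struct Affordances: Equatable {
    var canDelete = false
    var canVote = false
    var createParameters: [String]? = nil
}

// Ask the resource what it offers instead of assuming it at build time.
func affordances(for resource: Resource) -> Affordances {
    var result = Affordances()
    result.canDelete = resource.transitions["delete"] != nil
    result.canVote = resource.transitions["vote"] != nil
    result.createParameters = resource.transitions["create"]?.parameters
    return result
}

// A choice that currently offers only a "vote" transition.
let choice = Resource(
    attributes: ["choice": "Swift"],
    transitions: ["vote": Transition(uri: "/questions/1/choices/1", method: "POST")])
let controls = affordances(for: choice)
// controls.canVote is true, controls.canDelete is false
```

The client never inspects the URI or the method to decide what to show. It only checks whether a transition with a name it understands is present.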

HAL and Siren (9:51)

How do we get run time information from our server into the client? Unfortunately, many APIs don’t offer us information about how the API works at run time. One way for an API to expose this information is to use special hypermedia content types. There are two popular ones called HAL and Siren. They’re both standardized content types, and both of them are built on top of JSON. Unfortunately, both require you to implement your API to support either HAL or Siren.

HAL (10:24)

HAL stands for Hypertext Application Language. It’s a simple format that gives you a consistent and easy way to hyperlink between resources in your API. Adopting HAL will make your API explorable and its documentation easily discoverable from within the API itself. It will make your API easier to work with and therefore perhaps more attractive to client developers.

HAL extends your data structure by adding two new properties. It adds _links, which is a collection of all the links from that resource to other resources, and _embedded, which allows you to embed other resources instead of only linking to them. This means you can change how your API works: instead of making the client follow a link, you can embed the resource, and an API client will simply understand the change. HAL is currently one of the most popular hypermedia standards, and it’s even used by the new GitHub API. A simple link to another resource can look like this.

{
    "_links": {
        "question": { "href": "/questions/1" }
    }
}

It has the relation of that resource and an href to it. You can also embed another resource, so that instead of the client having to follow a link, the server can change how it works to optimize the client’s usage. If the server notices that every client is following certain links, it can embed those resources so the clients don’t have to. As a client developer, you should first look in the embedded resources before you follow a link.

{
    "_embedded": {
        "choices": [
            {
                "_links": {
                    "self": { "href": "/questions/1/choices/1" }
                },
                "choice": "Swift",
                "votes": 2048
            }
        ]
    }
}
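The advice to check the embedded resources before following a link can be sketched as a small lookup over a parsed HAL document. This is a minimal illustration in current Swift using only Foundation. The `Resolved` type and `resolve` function are hypothetical names, and the sketch handles only a single link object per relation.

```swift
import Foundation

// The outcome of resolving a relation on a HAL document.
enum Resolved {
    case embedded([[String: Any]])   // resources the server already included
    case link(String)                // an href the client still has to follow
    case missing                     // the relation is not offered
}

// Prefer `_embedded`, then fall back to `_links`.
func resolve(_ relation: String, in document: [String: Any]) -> Resolved {
    if let embedded = document["_embedded"] as? [String: Any] {
        if let many = embedded[relation] as? [[String: Any]] {
            return .embedded(many)
        }
        if let one = embedded[relation] as? [String: Any] {
            return .embedded([one])
        }
    }
    if let links = document["_links"] as? [String: Any],
       let link = links[relation] as? [String: Any],
       let href = link["href"] as? String {
        return .link(href)
    }
    return .missing
}

// A question that both links to and embeds its choices.
let json = """
{
    "_links": { "choices": { "href": "/questions/1/choices" } },
    "_embedded": {
        "choices": [ { "choice": "Swift", "votes": 2048 } ]
    }
}
"""
let document = try! JSONSerialization.jsonObject(with: Data(json.utf8)) as! [String: Any]

switch resolve("choices", in: document) {
case .embedded(let choices): print("Use \(choices.count) embedded choice(s), no request needed")
case .link(let href): print("Follow \(href)")
case .missing: print("Choices are not offered")
}
```

Because the lookup hides where the resources came from, the server can start or stop embedding them without the client changing.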

Siren (11:54)

Siren is another hypermedia specification, this one for representing entities. Just as HTML is used to visually represent documents on websites, Siren is a specification for presenting entities via a web API. Siren offers structures to communicate information about entities, actions for executing state transitions, and links for clients to navigate. Siren allows you to define actions on your API along with a name, the URI, and the HTTP method. It also allows you to express the fields or attributes required to make the transition and to send the client a suggested content type.
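As a rough illustration of what a Siren action carries, the sketch below decodes a subset of a Siren entity in current Swift and looks up an action by name. The JSON document is a made-up example for the Polls domain, and the `SirenEntity`, `SirenAction`, and `SirenField` types are hypothetical and cover only the properties mentioned above.

```swift
import Foundation

// A subset of Siren's action structure: enough to drive a request.
struct SirenField: Decodable {
    let name: String
    let type: String?
}

struct SirenAction: Decodable {
    let name: String
    let href: String
    let method: String?       // Siren treats a missing method as GET
    let type: String?         // suggested content type for the request body
    let fields: [SirenField]?
}

struct SirenEntity: Decodable {
    let actions: [SirenAction]?

    // Find an action by the name the client understands.
    func action(named name: String) -> SirenAction? {
        return actions?.first { $0.name == name }
    }
}

// A hypothetical Siren representation of a question list.
let siren = """
{
    "class": ["questions"],
    "actions": [
        {
            "name": "create",
            "method": "POST",
            "href": "/questions",
            "type": "application/json",
            "fields": [
                { "name": "question", "type": "text" },
                { "name": "choices" }
            ]
        }
    ],
    "links": [ { "rel": ["self"], "href": "/questions" } ]
}
"""

let entity = try! JSONDecoder().decode(SirenEntity.self, from: Data(siren.utf8))
if let create = entity.action(named: "create") {
    let fieldNames = (create.fields ?? []).map { $0.name }
    print("\(create.method ?? "GET") \(create.href) takes \(fieldNames)")
}
```

The client only knows the action name "create". The method, URI, content type, and fields all come from the server at run time.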

Both content types require you to change your API to implement them. I’ve built my framework, Hyperdrive, to also work without you doing so. It supports a language called API Blueprint, which is a specification of an API. This lets you communicate with plain APIs but side-load the specification of how the API works, so the framework can use it to provide a similar interface. API Blueprint is a specification language that allows you to design and communicate your API before you build the concrete implementation. Since you’ve already written the specification of your API, and it’s machine-readable, API Blueprint can power your actual client: when you change your API specification, your client can make use of those changes.

Hyperdrive (13:35)

Hyperdrive allows us to enter an API by its root URI. In this case we’ve passed in an HTTPS URL, and we can determine how the API works at run time. This allows us to change how the API works between application launches. When we enter the API we pass in a closure, which is going to be executed asynchronously. When the request succeeds, we’re going to be given the representor.

hyperdrive.enter("https://polls.apiblueprint.org/") { result in
    // On success, `result` carries the representor for the root of the API
}

A representor is a canonical interface to an API resource. In this case, it represents the root of our API. The representor is a representation of the resource and includes various properties. The one you’re probably most familiar with is the attributes on the resource itself. It also gives us transitions, which are the transitions we can make from this resource to a different resource. We can introspect the transitions from the current API resource to learn about new features at run time. We’re also given a property called representors, which is a collection of other related resources that might be embedded. A client should look here before following a transition. This again allows the server to optimize.

Once we have a root resource back we can introspect what is on it. In this example, we can see if there is a transition to "questions", which is what we know as the question list. If so, we can perform that and follow this transition. This will result in Hyperdrive following the hyperlink and giving us a new representor for the resultant resource.

if let questions = representor.transitions["questions"] {
    // Follow the transition; Hyperdrive gives us a new representor
    hyperdrive.request(questions) { result in

    }
}

In this case, we get back a resource including a collection of question resources. We’re going to loop over each of those and call the function viewQuestion using map. Since our API can change, we shouldn’t just expect that these questions exist. We should handle the fact that the feature might later be missing and gracefully adapt our interface.

if let questions = representor.representors["questions"] {
    map(questions, viewQuestion)
}

Initially, viewQuestion is going to look for the question attribute and just print it out. Then, it’s going to look for related choice resources and, if there are any, map over them, calling the viewChoice function for each choice. Finally, we check if there is a transition to delete, which tells us for each individual resource whether we can swipe to delete.

func viewQuestion(question:Representor<HTTPTransition>) {
    println(question.attributes["question"])
    if let choices = question.representors["choices"] {
        map(choices, viewChoice)
    } else {
        println("-> This question does not have any choices.")
    }

    if let delete = question.transitions["delete"] {
        // User may delete this question
    }
}

Our viewChoice function is called for each choice in the question. It prints out the choice and the number of votes it has, and then checks to see if there is a transition to vote on it. We should check this individually because it could vary between resources. We don’t need to worry about business logic such as whether we can vote twice, because our server will handle it.

func viewChoice(choice:Representor<HTTPTransition>) {
    let text = choice.attributes["choice"]
    let votes = choice.attributes["votes"]
    println("-> \(text) (\(votes))")
    
    if let vote = choice.transitions["vote"] {
        // User may vote on this choice
    }
}

This allows our clients to adapt to change. We can change the API. We can change the logic on the API and the client doesn’t have to change a thing. When there is a transition to vote, we can follow it without any understanding of how the API works, how its URIs are laid out, which HTTP method to use, how to encode data, or even that HTTP is involved. When we view the list of questions, we can also determine if our API offers us the transition to create a new question. Of course, the feature for creating new questions could be removed. We should determine if that’s the case, and then gracefully remove it from our interface.

if let create = questions.transitions["create"] {
    // We may create a new question

    for (name, _) in create.attributes {
        // Creation takes an attribute called `name`
    }
} else {
    // Gracefully handle the lack of being able to create a question
}

Why? (18:00)

Why would we bother architecting our API client and server in such a way? One of the key benefits is that we can change the business logic around our application. We can completely change how a feature works, and our clients can simply adapt. We can remove an API feature, either completely or temporarily. We get feature negotiation for free. We can expose different transitions to different users. We can do A/B testing. We can change how our API works.

Let’s say we want to make it so only admins can delete questions. Our API can simply offer this transition only to admins. Without any change to the client’s code, the client simply adapts. We can add new transitions that didn’t exist when the app was built, which could be surfaced as a button that uses information from the server for its display. A client can be built with an understanding of the semantic meanings of certain keywords. Our client could understand the term create, which means it can create a new resource, or the term delete, and when it has that affordance we can just add a delete button.

Toaster as a Service (19:24)

Without tightly coupling our API and our client, we can build generic applications that can support a wide range of services. Imagine toaster as a service. We could build a toaster application that understands the semantic meaning behind toast. It understands the concepts of toast and defrost. The application could point to a different toaster API regardless of the manufacturer or feature set. An API may offer us transitions, such as delay timers and other things, depending on the toaster. Our application can adapt its UI based on the available transitions for every single toaster in the world, as long as it conforms to the same semantic meaning of how toasters work. We can build forms at run time based off attributes from our API, because we have all this transition information.

Let’s say a register transition takes in an email, password, name, gender, date of birth, and age. We could take these available fields and perform validation. The server has told us that the top field is an email field, and it has told us the rules for validating it. We could download how validation works at run time. This could be really useful for things like the maximum length of password fields, because hard-coding the maximum length of your password into your app makes it really hard to change. When users enter valid values, we can validate them and enable the button to submit the form.
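Validation driven by server-supplied rules could look roughly like the sketch below, written in current Swift. The `FieldRule` type, its properties, and the example rules are assumptions made up for this illustration. A real hypermedia format would define its own way of describing field constraints.

```swift
// Hypothetical description of one input field, as a server might send it.
struct FieldRule {
    let name: String
    var required = true
    var maxLength: Int? = nil
    var mustContain: Character? = nil   // e.g. "@" for a crude email check
}

// Returns the names of fields whose values break the server-supplied rules.
func invalidFields(in values: [String: String], rules: [FieldRule]) -> [String] {
    return rules.filter { rule in
        let value = values[rule.name] ?? ""
        if value.isEmpty { return rule.required }
        if let limit = rule.maxLength, value.count > limit { return true }
        if let needle = rule.mustContain, !value.contains(needle) { return true }
        return false
    }.map { $0.name }
}

// Rules like these would arrive with the transition, not ship inside the app.
let rules = [
    FieldRule(name: "email", mustContain: "@"),
    FieldRule(name: "password", maxLength: 64),
    FieldRule(name: "name", required: false),
]

let errors = invalidFields(in: ["email": "kyle", "password": "secret"], rules: rules)
let submitEnabled = errors.isEmpty   // false here, because "email" fails its rule
```

If the server later raises the maximum password length, the form picks up the new limit the next time it loads the transition, with no new build of the app.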

Demo (20:56)

The Polls application I was showing you at the start of this talk actually exists, and I’ve built it to use Hyperdrive. I’m going to give you a quick demo of how it and our API work. We’re going to change the API’s feature set as a test. This API supports both Siren and HAL, as well as plain JSON described by API Blueprint. Here we have defined how the API works, and this is what the client loads at launch or when I pull to refresh. It downloads this specification and uses it to create the interface. The client looks for specific transitions, and if they’re no longer available, it just hides them from the interface. Our clients adapt to change.

Conclusion (27:29)

We’ve seen how we can adapt our client to learn how the API works at run time. This could be used to move faster and write less code without breaking our API clients. It allows us to consume APIs without prior knowledge of how they work because we’re simply learning this at run time. It allows us to move fast and break nothing. Everything I’ve shown today, including Hyperdrive, the model objects, and the handling of transitions in an API from HAL, Siren, or JSON, is fully open source.

Q&A (28:17)

Q: Let’s say you want to store this data in Core Data or Realm. How can you change attributes and things like that without versioning?

Kyle: What an API tells you about how it works may not hold in the future; it is only true at a certain point in time. Maybe the API allowed us to delete a question when we first downloaded it, but that might not be the case later. When you reload the application, you must expect that a stored transition could fail at a later date. After deserializing a stored representation, you might want to refresh it, provided it has a transition to reload itself. You can call this transition to get an updated version if changes occurred.

Q: How would this work with some sort of a migration, or a schema migration?

Kyle: You could simply serialize the structure itself into a database. The canonical interface for representing state is never going to change, so you wouldn’t need migrations, though in theory you could change it. You’d then need custom rules for converting the representation from one version to another.

Q: How would you define REST? I’ve always looked at REST as a mapping between HTTP verbs to methods in Rails, but it seems like it could be any kind of mapping from verbs to methods, in any protocols.

Kyle: REST is built from a set of constraints, eight of which I think are in Roy Fielding’s dissertation. One of the most important ones I covered here is HATEOAS, which is using hypermedia as the engine of application state by following transitions from one state to a different one. It’s often misunderstood: many people don’t realize that these constraints are actually there, and so they’re not really followed these days. REST is not about CRUD: create, read, update and delete. There are different kinds of transitions you can perform. Rails has had its own concept of what REST is and has sort of put it upon the world, but that’s not actually how REST is defined.

About the content

This content has been published here with the express permission of the author.

Kyle Fuller

Kyle is a developer from the UK. He’s been working in open source for a long time. Software Developer. Creator of Palaver, a beautiful IRC client for iPhone and iPad. Part of the core team in open source projects such as CocoaPods, Pelican, and many others.

From his own words: “I craft beautiful applications and developer tools. Mostly focusing on iPhone and iPad. Active in many open source communities”

Website: https://fuller.li
