einarvalur.co

Application Programming Interface

Where I work, we are really fond of micro-services, and we tend to write a lot of them. In general, we can split our micro-services into two types: those that keep state and those that don't. Micro-services that don't keep state are usually something like micro-processes that receive a message, process it and then pass it on. Micro-services that do keep state are usually something like APIs that store data in relational databases or data-stores. That data can be read, manipulated and removed through an API. When I say state, I mean persistence of data, not user-state (as in, who is logged into the system). Because we have so many of these micro-services, it would be beneficial to write them in such a way that a programmer who has worked on one API feels comfortable working on another. But how can we write them in a consistent way? What follows is my take on that problem...

Transfer protocol.

While it might seem like a "no brainer" to use the Hypertext Transfer Protocol, there are other options available for system-to-system communication. For systems on the internet, we are pretty much stuck with HTTP. That isn't such a bad thing. Almost every programming language can open a TCP/IP connection, and if it can, it can parse what arrives as HTTP messages. Because of this it is easy to gravitate towards API/HTTP (application programming interface over hypertext transfer protocol), but by doing so we are (willingly or unwillingly) subscribing to the HTTP/1.1 standard. The problem with that standard is that it is ambiguous and in many parts open to interpretation.

REST, really???

The term representational state transfer (REST) was introduced and defined in 2000 by Roy Fielding in his doctoral dissertation. Fielding's dissertation explained the REST principles that were known as the "HTTP object model" beginning in 1994, and were used in designing the HTTP 1.1 and Uniform Resource Identifiers (URI) standards. The term is intended to evoke an image of how a well-designed Web application behaves: it is a network of Web resources (a virtual state-machine) where the user progresses through the application by selecting resource identifiers such as http://www.example.com/articles/21 and resource operations such as GET or POST (application state transitions), resulting in the next resource's representation (the next application state) being transferred to the end user for their use. — wikipedia.

The term REST or RESTful is thrown around a lot, but most of the APIs that I have worked on have very little to do with REST. Much has been written about this topic on the internet and I'm not going to re-hash it here, but I want to go through the 3 (+1) stages that can take us to a REST API.

Glory of REST
level 3: Hypermedia controls
level 2: HTTP Verbs
level 1: Resources
level 0: The swamp of pox

level 0 - The swamp of pox.

The HTTP/1.1 spec is not utilized in this level. All requests come in through a single endpoint. That is, resources are not identified by a unique resource locator. The action to be taken on a resource is in the payload itself (the payload could contain {action: update} to describe the action). This is how SOAP worked and, more recently, GraphQL. Because the transfer is not using any of the HTTP features, it could in theory run directly on other protocols like TCP/IP. This level sits outside the path to the Glory of REST and is therefore labeled as zero.

level 1 - Resources.

In this level, each resource has a unique resource locator. The API is no longer accessed through a single endpoint; instead, each resource has an endpoint of its own. Actions on each resource are not addressed in this level. The actions can be embedded into the payload as described above, or they can be part of the resource locator. Adding an action to a resource locator means having a verb in the URL, like /users/10/delete. This level is not concerned with the hierarchy of the URL: user 10 can be represented as /ae6bvakdth and user 11 could be /7nbhatg58v.

level 2 - HTTP Verbs.

This level introduces HTTP verbs as the indicator of which action should be taken on the resource (GET, POST, PUT, PATCH, DELETE...). This implicitly states that the URL should not contain verbs. Level 2 also puts constraints on response codes. Successful operations should return 2xx, while client errors should return 4xx and server errors should be reported with 5xx (a table of response codes is located at the end of this article). A fetch action would look like this:

GET /users/10 HTTP/1.1
Host: example.org

while delete looks like this

DELETE /users/10 HTTP/1.1
Host: example.org
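The response-code constraint of level 2 can be sketched in a few lines. This is only an illustration in Python; the function name status_class and the labels it returns are my own and not part of any standard.

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to the broad class that level 2 cares about."""
    if 200 <= code <= 299:
        return "success"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    return "other"  # 1xx informational and 3xx redirection


# A client can branch on the class without knowing every individual code.
outcome = status_class(404)  # "client error"
```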

Note:

Before we go into level 3 I want to point out that everything we've discussed so far sits comfortably within the HTTP/1.1 spec. None of the constraints we have put in place are REST or RESTful. If these are the only constraints you have put on your own API, you don't have a RESTful API. You just have an API that runs over HTTP/1.1.

level 3 - Hypermedia controls.

Here is where things get RESTful. Level 3 adds requirements to the returned payload, mainly statefulness and discoverability. A returned resource should not overstep its scope. If a resource is made out of many smaller resources, they should not be embedded into the payload but rather referenced via a URL.

// DON'T
{
    "id": 10,
    "name": "John",
    "friends": [{
        "id": 11,
        "name": "Jane"
    },{
        "id": 12,
        "name": "Susan"
    }]
}

// DO
{
    "id": 10,
    "name": "John",
    "__resources": {
        "friends": [
            "/users/11",
            "/users/12"
        ]
    }
}

This makes the resources discoverable and the client can navigate the resources' links.
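As a rough sketch of the difference between the two payloads, the following Python function turns the embedded form into the referenced form. The function name to_references is hypothetical, and the /users/:id link shape is simply taken from the example above.

```python
def to_references(user: dict) -> dict:
    """Replace embedded friend records with links to those records."""
    # Keep every top-level field except the embedded collection.
    flat = {key: value for key, value in user.items() if key != "friends"}
    # Reference each friend by URL instead of embedding it.
    flat["__resources"] = {
        "friends": [f"/users/{friend['id']}" for friend in user.get("friends", [])]
    }
    return flat


embedded = {
    "id": 10,
    "name": "John",
    "friends": [{"id": 11, "name": "Jane"}, {"id": 12, "name": "Susan"}],
}
referenced = to_references(embedded)
```

The client that wants Jane's name now has to follow /users/11, which is exactly the navigation that discoverability is after.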

Statefulness says that a fetched resource should reflect the state of the client. Additional resources or actions can be attached or omitted depending on the state of the client:

// User 10 is logged in and authenticated
{
    "id": 10,
    "name": "John",
    "__links": {
        "actions": {
            "edit": "/users/10",
            "delete": "/users/10"
        }
    }
}

// User 10 is NOT logged in
{
    "id": 10,
    "name": "John",
    "__links": {}
}

In this example, because user 10 is logged in, when he navigates to his own user record he gets access to the delete and edit links. When he is not logged in, he doesn't get access to these actions. In effect, the server is returning responses based on state. Roy Fielding didn't define a response format in his dissertation, so it has been left up to the community to come up with a standard. Arguably the most successful one is HAL, a JSON-formatted response that provides fields for discoverability and embedding of related resources.
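The state-dependent response can be sketched like this. It is a minimal Python illustration, assuming the server already knows who the viewer is; render_user and viewer_id are names I made up for the example, and the __links shape follows the payloads shown earlier rather than HAL.

```python
from typing import Optional


def render_user(record: dict, viewer_id: Optional[int]) -> dict:
    """Attach action links only when the viewer is the owner of the record."""
    links: dict = {}
    if viewer_id is not None and viewer_id == record["id"]:
        path = f"/users/{record['id']}"
        links["actions"] = {"edit": path, "delete": path}
    # The record itself is the same either way; only the links differ.
    return {**record, "__links": links}


john = {"id": 10, "name": "John"}
logged_in = render_user(john, viewer_id=10)
anonymous = render_user(john, viewer_id=None)
```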

This part of REST is sometimes referred to as Hypermedia as the Engine of Application State (HATEOAS).

The main problem with level 3 is that each API is responsible for maintaining user-state. That doesn't work well with horizontal scaling of services. In a micro-services architecture, you want to be able to add or remove instances of a service without disturbing the user. If a service that holds the user-session goes down, the session goes with it and the client is forced to log in again. You want your API to be a dumb data-store that will happily fulfill any request that is passed to it. In front of the API you have an authentication/authorization service that blocks or passes requests to the API based on user-permissions. The authorization service consults an external service to make these pass/block decisions. This works well with JSON Web Tokens and attribute-based access control. Additionally, access-control logic isn't baked into the API's source-code, which also aligns with a separation-of-concerns architecture.

[Figure: load balancer, authentication/authorization service, and four API instances, each labeled with authorization]
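To give a feel for the pass/block decision that sits in front of the API, here is a small Python sketch. Everything in it is an assumption made for illustration: the function name authorize, the idea that the claims have already been verified and decoded from a JSON Web Token by the authentication layer, and the policy itself (reads are open, writes are only allowed on the caller's own user record).

```python
READ_ONLY = {"GET", "HEAD", "OPTIONS"}


def authorize(claims: dict, method: str, path: str) -> bool:
    """Decide, before the request reaches the API, whether it may pass."""
    if method in READ_ONLY:
        return True  # assumed policy: anyone may read
    subject = claims.get("sub")  # "sub" is the standard JWT subject claim
    if subject is None:
        return False  # no identity, no writes
    # Assumed policy: a user may only modify their own record.
    return path == f"/users/{subject}"


allowed = authorize({"sub": "10"}, "DELETE", "/users/10")
blocked = authorize({"sub": "10"}, "DELETE", "/users/11")
```

The API behind this check stays a dumb data-store: it never sees the blocked request and holds no session of its own.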

The point that I'm trying to drive home is that only when an API design reaches level 3 can it be considered REST or RESTful. If your API sits at level 2, it complies with the HTTP/1.1 standard and should not be associated with REST; rather, it should be called API/HTTP (application programming interface over hypertext transfer protocol). In my experience, level 3 has not suited my needs and therefore I am reluctant to label my API designs as REST/RESTful.

As stated previously, the HTTP/1.1 spec is somewhat loose and open to interpretation. To come back to the original question: how can we write APIs in a consistent way? To me, the answer feels simple: build APIs based on the HTTP/1.1 spec, but define best-practice guidelines that guide decisions in design and development. What follows is my take on a best-practice guide.

URIs, Records and Collections.

An API is responsible for resources. Within an API there might be many shapes of resources: artist records, album records and song records, for example. It is up to the API to manage them and keep them in order. Furthermore, an API exposes either a single resource (let's call them records) or a collection, which is just a fancy name for a list of records.

Each collection or record needs to be uniquely identifiable so it can be accessed through the API. While we could give each one a random sequence of characters, it is better to provide descriptive identifiers. For that we turn to the Uniform Resource Identifier (URI) specifications.

We add further restrictions on the identifiers and say that, reading from left to right, the identifier should go from broad to narrow while describing their relationship in the hierarchy of the API.

/artists/the-beatles/albums/please-please-me/songs/love-me-do

In this example, we go from all artists, to one artist, to all their albums, to one album, to all the songs on the album, to one song.

An identifier is made up of collection-ids and record-ids separated by a forward slash. The collection-ids are constants in the URI and don't change, while record-ids are variables. A generic way of writing a URI for any song is:

/artists/:artist_id/albums/:album_id/songs/:song_id

A collection can contain a record, and a collection can also contain another collection. A record can contain a collection, but a record cannot contain another record. Therefore, in a URI, a collection-id can be followed by another collection-id or a record-id. A record-id can be followed by a collection-id but not by a record-id. All URIs need to start with a collection-id. Because collection-ids always name collections, they should always be plural. Don't do album, do albums.
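These rules are mechanical enough to check in code. The Python sketch below assumes the API knows its collection-ids up front (the COLLECTIONS set is made up for this example) and treats every other path segment as a record-id. That means a record-id that happens to equal a collection-id would be misread, which a real implementation would have to deal with.

```python
COLLECTIONS = {"artists", "albums", "songs"}  # assumed collection-ids


def is_valid_uri(path: str) -> bool:
    """Check the collection/record ordering rules for a URI path."""
    segments = [segment for segment in path.split("/") if segment]
    # Every URI must start with a collection-id.
    if not segments or segments[0] not in COLLECTIONS:
        return False
    previous_was_record = False
    for segment in segments:
        is_record = segment not in COLLECTIONS
        # A record-id may not directly follow another record-id.
        if is_record and previous_was_record:
            return False
        previous_was_record = is_record
    return True


ok = is_valid_uri("/artists/the-beatles/albums/please-please-me/songs/love-me-do")
bad = is_valid_uri("/artists/the-beatles/please-please-me")
```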

It's perfectly reasonable to have two endpoints that go up or down the hierarchy tree respectively.

/artists/the-beatles/albums/please-please-me/songs/love-me-do

/songs/love-me-do/albums/please-please-me/artists/the-beatles

URIs can have modifiers/filters in the form of query-parameters.

/artists?order=descending
/artists?from=1970&to=1980

Be extra careful not to include such modifiers/filters in the URI path:

/artists/order/descending

There is no record or a collection in the API corresponding to order. It makes no sense when the URI is read from left to right: go from all artists, to all orders of this artist, to one descending of all orders.

URIs should not contain verbs: no create, delete, run...

Verbs.

Verbs describe which action should be taken on a record or a collection. Verbs can be idempotent or non-idempotent. An idempotent verb leaves the Server in the same state regardless of how often it is called, provided that it is called with the same parameters. A non-idempotent verb can change the state of the Server every time it is called.

GET.

GET returns a record or a collection. It is idempotent and should return the same result every time. A URI ending in a record-id should return a single record, while a URI ending in a collection-id should return a collection. The URI /artists/the-beatles returns a record. The URI /artists/the-beatles/albums returns a collection.

GET requests should return, as their payload, the content which was requested. APIs that provide many formats should honor the Accept HTTP header. The client can therefore state which formats it is interested in.

Accept: application/json, application/xml
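How a Server might act on that header can be sketched as follows. This Python example is deliberately simplified: it takes the media types in the order the client listed them and ignores q-values, which a complete implementation would need to respect. The function name choose_format is my own.

```python
from typing import Optional, Sequence


def choose_format(accept: str, supported: Sequence[str]) -> Optional[str]:
    """Pick the first media type from the Accept header that the API supports."""
    for item in accept.split(","):
        # Drop parameters such as ";q=0.8" (ignored in this sketch).
        media_type = item.split(";")[0].strip().lower()
        if media_type == "*/*" and supported:
            return supported[0]  # client accepts anything: use our default
        if media_type in supported:
            return media_type
    return None  # nothing matched; 406 Not Acceptable would be a fitting reply


chosen = choose_format(
    "application/json, application/xml",
    ["application/xml", "application/json"],
)
```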

Pagination can be done through HTTP Headers. A Server can state that it supports returning a subset of a collection by issuing an Accept-Ranges Header.

Accept-Ranges: record

The client will then make a request with the following Range header:

Range: record=0-100

This states that it is requesting records zero to one hundred. The Server then responds with a Content-Range Header:

Content-Range: record 0-100/4724

Indicating that it is indeed returning the range requested and additionally providing the full size of the collection.
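The header exchange above can be sketched in Python like this. The helper names parse_range and content_range are hypothetical, "record" is the custom range unit from the example, and the sketch assumes a non-empty collection and a start index that lies inside it. Note that HTTP ranges are inclusive at both ends, so 0-100 covers 101 records.

```python
import re
from typing import Optional, Tuple


def parse_range(header: str, unit: str = "record") -> Optional[Tuple[int, int]]:
    """Parse a Range header value such as 'record=0-100' into (first, last)."""
    match = re.fullmatch(rf"{re.escape(unit)}=(\d+)-(\d+)", header.strip())
    if not match:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    return (first, last) if first <= last else None


def content_range(first: int, last: int, total: int, unit: str = "record") -> str:
    """Build the Content-Range value, clamping the end to the collection size."""
    last = min(last, total - 1)
    return f"{unit} {first}-{last}/{total}"


requested = parse_range("record=0-100")
header_value = content_range(0, 100, 4724)
```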

When successful, GET should return a status-code of 200 OK for a single record. A collection can either return a 200 OK or in the case of pagination, 206 Partial Content.

When a record is not found, 404 Not Found should be returned. If a collection is empty, a 200 OK should be returned along with a payload representing an empty list (in JSON, it would look like []).

The Server should also return Content-Type and Content-Length HTTP Headers. Cache headers are optional but if they can be provided, they are preferred.

OPTIONS.

The Options verb is used for discovery. It will return all Verbs available for a collection or a record. It is idempotent.

Collection request:

OPTIONS /artists HTTP/1.1
Host: example.com

Response:

HTTP/1.1 200 OK
Allow: GET, OPTIONS

Record request:

OPTIONS /artists/the-beatles HTTP/1.1
Host: example.com

Response:

HTTP/1.1 200 OK
Allow: GET, PUT, PATCH, DELETE

Options should return 200 OK.

HEAD.

HEAD is requested in the same way as GET, and the response is the same. The only difference is that HEAD doesn't return a payload. It might seem sensible for HEAD to return 204 No Content, but in this context it makes more sense to return whatever status GET would have responded with. This Verb is idempotent.

HEAD is often used in conjunction with caching. Cross-origin preflight checks, which browsers issue to get around the same-origin policy, are made with OPTIONS rather than HEAD; for those the Server should return the Access-Control-Allow-Origin HTTP Header.

POST.

POST is a non-idempotent operation. It should be used for creating records when you want the API to generate the unique identifier on your behalf. It therefore only makes sense to issue a POST request on a URI that ends in a collection-id. Issuing this Verb on a URI ending in a record-id would result in a URI that has two record-ids in a row, which is not allowed. Reissuing the same request with the same dataset should produce a new record every time.

When issuing a POST request, you should provide the full dataset that is required. An API servicing the request should reject it if it has an incomplete dataset. In other words, POST is used to add a record to a collection, and therefore this Verb should be called on a collection.

POST /artists/comments HTTP/1.1
Host: example.com

text=this is the best band ever

The server should respond with an HTTP Location Header.

HTTP/1.1 201 Created
Location: /artists/comments/1756352

This tells the client where the new record is located. The server should also respond with a 201 Created status code. If the dataset is incomplete or invalid, the server should respond with a 4xx client error, preferably 400 Bad Request.

There is no requirement on the Server to provide the newly created record in the response. If the client wants a copy, a request should be made using the value provided in the HTTP Location Header.

If a POST request is made two times (or more) containing a value that is stored under a unique key in a relational database, and therefore cannot be stored again, a 409 Conflict response can be issued. In theory however, by definition, POST requests can be made multiple times resulting in multiple records without errors.
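Put together, the POST rules could look something like the following in-memory Python sketch. The Collection class, its constructor arguments and the sequential ids are all invented for illustration; a real API would sit on top of a database and an HTTP framework.

```python
import itertools


class Collection:
    """An in-memory stand-in for a collection that accepts POST requests."""

    def __init__(self, path, required, unique=()):
        self.path = path                  # e.g. "/artists/comments"
        self.required = set(required)     # fields every new record must have
        self.unique = tuple(unique)       # fields that act as unique keys
        self.records = {}
        self._ids = itertools.count(1)    # the API generates the identifier

    def post(self, data: dict):
        """Return (status code, response headers) for a POST on the collection."""
        if not self.required.issubset(data):
            return 400, {}  # incomplete dataset
        for field in self.unique:
            if any(r.get(field) == data.get(field) for r in self.records.values()):
                return 409, {}  # value already stored under a unique key
        record_id = next(self._ids)
        self.records[record_id] = dict(data)
        return 201, {"Location": f"{self.path}/{record_id}"}


comments = Collection("/artists/comments", required=["text"])
first = comments.post({"text": "this is the best band ever"})
second = comments.post({"text": "this is the best band ever"})
```

The two identical requests produce two different records, which is what non-idempotent means in practice.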

PUT.

PUT can be thought of as a create or replace operation. If a request is made on a URI ending in a record-id, the Server should replace the existing record if present, or create a new record identified by the given URI if none exists. It is therefore important that the complete dataset is provided in the request.

PUT /artists/the-rolling-stones HTTP/1.1
Host: example.com

name=The Rolling Stones

PUT is idempotent: the Server's state should not change for subsequent requests after the first one has been made.

The Server returns a 201 Created if a new record was created. If a record is replaced with a new one, a 202 Accepted, 204 No Content or 205 Reset Content status code should be provided. As with POST, a 400 Bad Request should be returned if the dataset is incomplete or invalid.

PUT can also be issued on a collection URI, in which case the Server should remove all records in the collection and replace them with the records provided in the dataset.

PUT /artists HTTP/1.1
Host: example.com

name[]=The Beatles
name[]=The Rolling Stones
id[]=the-beatles
id[]=the-rolling-stones

It is important when this is done, that the payload contains the :id part of the URI in the dataset. 201, 202 or 205 are acceptable status codes returned by the Server.

The Server has no obligations to return the newly created record(s).
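The create-or-replace behavior on a record URI can be sketched as below. This is a minimal Python illustration using a plain dict as the store; the function name put_record is hypothetical, and 204 is picked from the acceptable replace codes listed above.

```python
def put_record(store: dict, uri: str, data: dict, required: set) -> int:
    """Create or fully replace the record at `uri`; return the status code."""
    if not required.issubset(data):
        return 400  # PUT needs the complete dataset
    created = uri not in store
    store[uri] = dict(data)  # replace wholesale, never merge
    return 201 if created else 204


store: dict = {}
payload = {"name": "The Rolling Stones"}
first_status = put_record(store, "/artists/the-rolling-stones", payload, {"name"})
second_status = put_record(store, "/artists/the-rolling-stones", payload, {"name"})
```

After the second call the store looks exactly as it did after the first, which is the idempotency that PUT promises, even though the status code differs.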

PATCH.

PATCH is used to partially update a record. It only needs a subset of the dataset, only the fields that need to be updated. When issued on a record URI (ending in a record-id) it will update only that record and only the fields that are requested. If the record does not exist, the Server should return a 404 Not Found.

PATCH can also be issued on a collection (URI ending in a collection-id), in which case all records in the collection should be updated. A subset of records in the collection can be updated by providing a query-parameter:

PATCH /artists?created-before=2010-01-01 HTTP/1.1
Host: example.com

status=old-records

This will result in all records created before the 1st of Jan 2010 being updated, while newer records are unaffected.

It should be noted that the PATCH Verb is not part of the HTTP RFC 2616 spec and is interpreted a bit differently in this document from the formal definition in RFC 2068 (and, more recently, RFC 5789).

PATCHing both records and collections should result in a 205 Reset Content if records did change, or a 202 Accepted if no changes were made. A 400 Bad Request can be issued by the Server if the input was invalid. PATCH is idempotent: the state of the Server doesn't change after the initial request. The status code might. On the first request, a 205 is returned. All subsequent requests return a 202.
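The status-code behavior described here, following this document's interpretation of PATCH rather than the formal one, can be sketched in Python as follows. The function name patch_record and the dict-based store are assumptions made for the example.

```python
def patch_record(store: dict, uri: str, changes: dict) -> int:
    """Apply a partial update to one record and return the status code."""
    record = store.get(uri)
    if record is None:
        return 404  # PATCH never creates records
    # If every requested field already has the requested value, nothing changes.
    if all(key in record and record[key] == value for key, value in changes.items()):
        return 202
    record.update(changes)  # only the fields that were sent are touched
    return 205


store = {"/artists/the-beatles": {"name": "The Beatles", "status": "active"}}
first_status = patch_record(store, "/artists/the-beatles", {"status": "old-records"})
second_status = patch_record(store, "/artists/the-beatles", {"status": "old-records"})
```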

DELETE.

As the name suggests, the DELETE Verb deletes records or collections, depending on whether the request is made on a URI ending in a collection-id or a record-id. For records, a 205 Reset Content should be returned for a successful deletion, or a 404 Not Found if the record did not exist. For a collection, a 205 Reset Content should be returned if at least one record was deleted, else 202 Accepted.

Response Payloads.

I generally find that a response payload should not be deeply nested. For the most part, it should not contain objects that contain other objects, and so on and so forth. A payload that does so extends the scope of its resource locator into that of other locators.

// GET /artists/the-beatles HTTP/1.1
{
    "name": "The Beatles",
    "albums": [{
        "name": "Please, please me",
        "songs": [{
            "name": "Love me, do"
        }]
    }, {
        "name": "With The Beatles",
        "songs": [{
            "name": "It Won't Be Long"
        }]
    }]
}

In this example, there wouldn't be any need for the /artists/the-beatles/albums locator as all of the data has already been presented. If, on the other hand, the client is not interested in the collection of albums, the majority of the payload is redundant. There are other technologies better suited for this type of aggregation, like GraphQL.

The format itself can be any format: XML, JSON, YAML...

Request payload.

Many APIs accept JSON as their payload. While I don't think that is bad, I often question if such a format isn't too verbose. Generally, a record is one-dimensional: just a collection of key/value pairs where each value is scalar (strings, numbers, booleans and NULL). Most HTTP code-libraries already support application/x-www-form-urlencoded and multipart/form-data, so no additional parsing is required to support the incoming payload. Both these types support arrays of scalar values, and multipart/form-data additionally supports binary data. For my projects I have usually gone with application/x-www-form-urlencoded.

What is the benefit of this

PUT /artists/the-beatles HTTP/1.1
Host: example.com

{
    "name": "the beatles",
    "genres": ["Rock", "Pop"]
}

over this?

PUT /artists/the-beatles HTTP/1.1
Host: example.com

name=the beatles
genres[]=Rock
genres[]=Pop
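The form-encoded body is written one pair per line here for readability. On the wire the pairs are joined with & and special characters are escaped. A quick Python sketch with the standard library shows the round trip; note that the [] suffix is only a naming convention that some frameworks use for arrays, and Python's parser treats it as part of the key.

```python
from urllib.parse import parse_qs, urlencode

record = {"name": "the beatles", "genres[]": ["Rock", "Pop"]}

# doseq=True emits one key=value pair per list element.
body = urlencode(record, doseq=True)

# Parsing needs no schema: repeated keys simply become lists.
decoded = parse_qs(body)
```

Every value comes back as a list of strings, so numbers and booleans have to be converted by the receiver, which is the price paid for the simpler format.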

Conclusion

Everybody wants to say that their APIs are REST/RESTful; it has become such a buzz-word. I would argue that most of them use the definition quite liberally, and that the designers of such APIs should define their guidelines a bit better and utilize the HTTP/1.1 spec more.

As a final note I'll leave you with all available HTTP status codes for reference.

1xx Information
100 Continue
101 Switching Protocol
103 Early Hints

2xx Successful
200 OK
201 Created
202 Accepted
203 Non-Authoritative Information
204 No Content
205 Reset Content
206 Partial Content
226 IM Used (HTTP Delta encoding)

3xx Redirection
300 Multiple Choice
301 Moved Permanently
302 Found
303 See Other
304 Not Modified
305 Use Proxy
306 unused
307 Temporary Redirect
308 Permanent Redirect

4xx Client error
400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 Not Acceptable
407 Proxy Authentication Required
408 Request Timeout
409 Conflict
410 Gone
411 Length Required
412 Precondition Failed
413 Payload Too Large
414 URI Too Long
415 Unsupported Media Type
416 Range Not Satisfiable
417 Expectation Failed
418 I'm a teapot
421 Misdirected Request
425 Too Early
426 Upgrade Required
428 Precondition Required
429 Too Many Requests
431 Request Header Fields Too Large
451 Unavailable For Legal Reasons

5xx Server error
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
505 HTTP Version Not Supported
506 Variant Also Negotiates
510 Not Extended
511 Network Authentication Required