I've recently been reading REST In Practice, by Ian Robinson, Jim Webber and Savas Parastatidis. I had some knowledge of the hypermedia-driven architectural style, but this book has really helped me clarify my understanding of both the how and why. It's also convinced me that I should definitely consider Atom for event-based integration requirements.
There was one concept I found puzzling. In Chapter 5 (the callout box on page 114, if you have the book), the authors recommend using POST to update the state of a resource. This is driven by a choice of interpretation of the semantics of PUT. According to the HTTP spec (RFC 2616, section 9.6):
If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server.
In the book, the authors interpret this to mean that the body enclosed with the PUT request should contain the same elements as the representation served by GET requests at the same URI. Since we are using the HATEOAS style, representations include links and other hypermedia controls. The implication is therefore that clients should PUT representations containing both business data (for example the new contents of the coffee order) and hypermedia controls (the available next steps in the workflow). To quote the book:
this obliges a client to PUT all the resource state, including any links, as part of the representation it sends
The problem here is that the client has no business determining what workflow steps are available. It's the server's job to understand what steps are available given the current resource state, and advertise those to clients using hypermedia controls. Therefore, we don't want the client to send the complete representation including both data and hypermedia controls.
I see four potential resolutions to this conflict:
- Use PATCH
  - The PATCH HTTP verb is designed to explicitly support partial updates to a resource. However, it's not widely supported. It also feels semantically wrong to me. From the client's point of view, the business data comprising an order (how many lattes?) is the whole resource. The client doesn't see this as a PATCH, but as a replacement, i.e. a PUT. PATCH might make sense to express concepts like "use skimmed milk in the latte, instead of the full fat I originally ordered".
- Use POST
  - This is the approach suggested in the book. For me, it has similar drawbacks to PATCH. POST implies appending to a resource. POSTing one cappuccino to a coffee order resource feels like it should add one cappuccino, not replace the existing set of ordered coffees with one cappuccino.
- Use PUT, including hypermedia
  - To follow the strict interpretation that PUT should include entire representations, the client sends both the entire new coffee order and whatever hypermedia controls the service last sent it. The service then ignores these controls, since it's the service's job to determine what they should be, and on future GETs the service sends whatever controls it deems appropriate at that time. This feels nasty; we are sending unnecessary data just to satisfy some architectural OCD (hat tip to Seb Lambla for that phrase!).
- Use PUT, don't include hypermedia
  - The client sends a complete representation of the new order, but no links. To me, this feels conceptually right. The client fulfils the expectations of PUT by sending a complete representation of the parts of the data for which it is responsible, but does not pretend to be responsible for determining what hypermedia controls are available.
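To make the fourth option concrete, here is a rough Python sketch of how a service might treat GET and PUT differently. Everything in it is made up for illustration rather than taken from the book: the in-memory `ORDERS` store, the URIs, the `payment` link relation and the function names are all assumptions.

```python
from copy import deepcopy

# Hypothetical in-memory store: order id -> state held by the service.
ORDERS = {
    "1234": {"items": [{"drink": "latte", "quantity": 1}], "status": "unpaid"},
}


def links_for(order_id, order):
    """Only the service decides which workflow steps are available."""
    links = [{"rel": "self", "href": f"/orders/{order_id}"}]
    if order["status"] == "unpaid":
        links.append({"rel": "payment", "href": f"/payments/{order_id}"})
    return links


def get_order(order_id):
    """GET: business data plus hypermedia controls derived from current state."""
    order = ORDERS[order_id]
    return {
        "items": deepcopy(order["items"]),
        "links": links_for(order_id, order),
    }


def put_order(order_id, body):
    """PUT: the client replaces the part it owns (the items) wholesale.

    The request body carries no links; the service recomputes them.
    """
    ORDERS[order_id]["items"] = deepcopy(body["items"])
    return get_order(order_id)
```

A client that wants one cappuccino instead of one latte would call `put_order("1234", {"items": [{"drink": "cappuccino", "quantity": 1}]})`. The response still carries the `self` and `payment` links, because the service worked them out from the order's state rather than reading them from the request.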
I explored some of these thoughts through a Twitter conversation with @serialseb, @iansrobinson and @jimwebber. Being able to explore these thoughts in conversation with some of the experts is incredibly rewarding, compared to sitting alone pondering, so thanks to those guys for their contributions. Ultimately we came up with a simple rule of thumb, which seemed to attract mutual agreement:
In response to GET requests, services serve a complete representation of the current known state, including business data and available hypermedia controls. Clients PUT complete representations of the parts for which they are responsible.
HTTP places two significant expectations on PUT requests: idempotency, and the concept that the enclosed entity-body is a complete representation. This rule of thumb satisfies both, while absolving clients of the need to include data over which they have no control or responsibility, provided we accept that GET and PUT representations need not be the same.
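The idempotency expectation is easy to see in a small Python sketch. It contrasts a PUT that replaces the client-owned items with the append-style reading of POST described earlier. The function names and the order structure are hypothetical, and POST is modelled as an append purely for the sake of the comparison.

```python
def put_items(order, new_items):
    """PUT-style update: replace the client-owned items wholesale."""
    return {**order, "items": list(new_items)}


def post_item(order, item):
    """Append-style POST: add one more item to the existing order."""
    return {**order, "items": order["items"] + [item]}


order = {"items": [{"drink": "latte", "quantity": 2}]}
cappuccino = {"drink": "cappuccino", "quantity": 1}

# Repeating the same PUT leaves the order in the same state.
put_once = put_items(order, [cappuccino])
put_twice = put_items(put_once, [cappuccino])

# Repeating the same append-style POST keeps changing the order.
post_once = post_item(order, cappuccino)
post_twice = post_item(post_once, cappuccino)
```

Here `put_once` and `put_twice` are equal, so a client can safely retry the PUT after a timeout. `post_twice` contains one more cappuccino than `post_once`, so a retried POST would change the order again.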
This raises another question. Available hypermedia controls are just one example of representation elements that only the service should be generating. Other examples include service-computed values (for example the total cost of a coffee order) and resource state that is owned by the service (for example, whether or not the order has been paid for). We don't want the client updating the total cost, or sending state fields telling the service that the client has paid, when it hasn't. Either of those could be bad for business! Should these data items be included in client PUT requests? My feeling is no. According to the rule of thumb, clients only include the parts for which they are responsible. However, I have a feeling that this may be a controversial standpoint. From the Twitter conversation, I learned that the authors' aversion to partial PUT came from none other than Mark Nottingham.
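Here is one way the "clients only send what they own" position could look in Python. The service accepts a PUT body made up of exactly the client-owned fields, and derives the total and the payment state itself. The field names, the prices (in pence) and the choice to reject rather than silently ignore service-owned fields are all my own assumptions for the sketch, not something from the book.

```python
CLIENT_OWNED = {"items"}

# Made-up prices in pence, so the arithmetic stays in integers.
PRICES = {"latte": 300, "cappuccino": 280}


def apply_put(order, body):
    """Replace the client-owned part of an order and recompute the rest.

    Raises ValueError if the body is not exactly the client-owned fields,
    for example if the client tries to send "total" or "paid".
    """
    if set(body) != CLIENT_OWNED:
        raise ValueError(
            f"PUT body must contain exactly {sorted(CLIENT_OWNED)}, "
            f"got {sorted(body)}"
        )
    items = [dict(item) for item in body["items"]]
    total = sum(PRICES[item["drink"]] * item["quantity"] for item in items)
    # "paid" is carried over from the service's own state, never from the client.
    return {"items": items, "total": total, "paid": order["paid"]}
```

With this policy, a body of `{"items": [{"drink": "latte", "quantity": 2}]}` produces an order whose `total` the service computed, while a body that also includes `"paid": True` is refused outright. Ignoring the extra fields instead would be closer to the "PUT, including hypermedia" option above.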
Disclaimer: I hope that I have not misrepresented the viewpoint of the book and its authors. If I have, I apologise and welcome feedback and corrections.
