Argot, Colony and stuff about internet protocol stacks.

Sunday, September 07, 2008

Part 6 - Cache Constraint

The BORED protocol already meets the first two constraints of REST: client-server and stateless. We've also extended the client-server constraint to allow asynchronous client-server interaction. The next REST constraint to meet is the cache constraint.

Cache
Returning to Fielding's REST dissertation, we find:
"Cache constraints require that the data within a response to a request be implicitly or explicitly labelled as cacheable or non-cacheable. If a response is cacheable, then a client cache is given the right to reuse that response data for later, equivalent requests."
In the BORED protocol there is an additional requirement, which follows from the stateless constraint: to label a response as cacheable or non-cacheable, the request must be uniquely identifiable. In BORED, the stateless request data is split into two parts: the location and the message data. To satisfy the cache constraint, a proxy server or client must treat the location and the message data together as a single object and match that object against the response data.

Because the request message data is binary, the simplest solution is for the client or proxy server to keep a hash of the message data and use it, together with the location, as the cache key. To improve performance, the hash value could be carried in the request itself, so that a cache does not need to recalculate it. It is important that the hash be based only on the message data and not on the location. This allows proxies to reroute messages to new locations without needing to update the hash value.
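As a rough illustration of the keying scheme described above, here is a minimal Python sketch. BORED does not yet specify a hash algorithm or a header field for the hash, so the use of SHA-256 and the names message_hash, cache_key and ResponseCache are assumptions made for this example only.

```python
import hashlib
from typing import Dict, Optional, Tuple


def message_hash(message_data: bytes) -> str:
    """Hash the binary message data only; the location is deliberately excluded."""
    # SHA-256 is an assumption; the protocol does not name an algorithm.
    return hashlib.sha256(message_data).hexdigest()


def cache_key(location: str, message_data: bytes,
              supplied_hash: Optional[str] = None) -> Tuple[str, str]:
    """Build a cache key from the location plus the message hash.

    If the request already carries a hash (supplied_hash), reuse it
    instead of recalculating it.
    """
    digest = supplied_hash if supplied_hash is not None else message_hash(message_data)
    return (location, digest)


class ResponseCache:
    """Hypothetical in-memory cache keyed on (location, message hash)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], bytes] = {}

    def put(self, location: str, message_data: bytes, response: bytes) -> None:
        self._store[cache_key(location, message_data)] = response

    def get(self, location: str, message_data: bytes) -> Optional[bytes]:
        return self._store.get(cache_key(location, message_data))
```

Because the hash covers only the message data, a proxy that reroutes a request changes the first element of the key but can pass the hash along untouched.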

To support the response aspect of the cache requirement, BORED includes cache information in the response header:


preamble - BORED
version
dictionary parts
available request slots
request identifier

response code
cache information


In the section of his dissertation on REST mismatches with HTTP, Fielding writes:

"Differentiating Non-authoritative Responses
One weakness that still exists in HTTP is that there is no consistent mechanism for differentiating between authoritative responses, which are generated by the origin server in response to the current request, and non-authoritative responses that are obtained from an intermediary or cache without accessing the origin server. The distinction can be important for applications that require authoritative responses, such as the safety-critical information appliances used within the health industry, and for those times when an error response is returned and the client is left wondering whether the error was due to the origin or to some intermediary. Attempts to solve this using additional status codes did not succeed, since the authoritative nature is usually orthogonal to the response status.

HTTP/1.1 did add a mechanism to control cache behaviour such that the desire for an authoritative response can be indicated. The ’no-cache’ directive on a request message requires any cache to forward the request toward the origin server even if it has a cached copy of what is being requested. This allows a client to refresh a cached copy, which is known to be corrupted or stale. However, using this field on a regular basis interferes with the performance benefits of caching. A more general solution would be to require that responses be marked as non-authoritative whenever an action does not result in contacting the origin server. A Warning response header field was defined in HTTP/1.1 for this purpose (and others), but it has not been widely implemented in practice."
When the request message headers are developed in detail it will be important to include the ability to define a 'no-cache' directive. The cache information returned in the response should also indicate if the response is non-authoritative.
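To make the two requirements above concrete, the following Python sketch shows one way a caching proxy might honour a 'no-cache' request directive and mark responses served from its cache as non-authoritative. The BORED header fields have not been defined in detail, so the Response fields (cacheable, authoritative), the no_cache parameter and the CachingProxy class are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Response:
    code: int
    body: bytes
    cacheable: bool             # assumed part of the cache information
    authoritative: bool = True  # assumed flag; True when the origin produced it


Key = Tuple[str, bytes]


class CachingProxy:
    """Hypothetical intermediary sitting between a client and the origin."""

    def __init__(self, origin: Callable[[str, bytes], Response]) -> None:
        self._origin = origin
        self._cache: Dict[Key, Response] = {}

    def handle(self, location: str, message: bytes,
               no_cache: bool = False) -> Response:
        key = (location, message)
        if not no_cache and key in self._cache:
            # Served without contacting the origin: mark as non-authoritative.
            return replace(self._cache[key], authoritative=False)
        # Either nothing is cached or the client asked for 'no-cache':
        # forward the request toward the origin server.
        response = self._origin(location, message)
        if response.cacheable:
            self._cache[key] = response
        return response
```

A client that needs an authoritative answer sets no_cache; every other client can check the authoritative flag to see whether the response came from the origin or from the intermediary.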

Location only constraint
At this point we add another new constraint to the system: the location only constraint. The location in each request should include only location-specific information. Request parameters must be supplied only in the message data. This constraint is designed to ensure the separation of the message data from the location data, which allows faster and easier routing of messages.

This constraint is the direct opposite of the common practice of encoding request parameters onto URIs in HTTP. For example:
http://www.livemedia.com.au/bookstore?author=ryan&page=1&list=10

In the BORED protocol the location must be separate from the message data.

(location bored://www.livemedia.com.au/bookstore)
(message author=ryan&page=1&list=10)

This constraint is designed to combine with the cache constraint to ensure that message parameters are not confused with location data in cache systems. It also ensures that the metadata required to decode the message travels with the message rather than with the location.
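The separation in the bookstore example can be sketched in a few lines of Python. This is only an illustration: the function name split_location is made up here, and the message is shown as a query-style string for readability, whereas real BORED message data is binary.

```python
from urllib.parse import urlsplit, urlunsplit
from typing import Tuple


def split_location(uri: str, scheme: str = "bored") -> Tuple[str, str]:
    """Split an HTTP-style URI into a parameter-free location and a message.

    The query string is moved out of the location and returned separately,
    so the location carries only location-specific information.
    """
    parts = urlsplit(uri)
    # Rebuild the location with no query string and no fragment.
    location = urlunsplit((scheme, parts.netloc, parts.path, "", ""))
    message = parts.query
    return location, message


location, message = split_location(
    "http://www.livemedia.com.au/bookstore?author=ryan&page=1&list=10"
)
print(location)  # bored://www.livemedia.com.au/bookstore
print(message)   # author=ryan&page=1&list=10
```

A router or cache can then work with the location alone, without having to parse or strip parameters first.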

It is interesting to note that the cache constraint requires the stateless constraint to function. A cache must be able to deal with a whole message uniquely to operate correctly.
