Argot, Colony and stuff about internet protocol stacks.

Wednesday, December 16, 2009

Trillions - Climbing the right mountain

If you haven't already seen it, watch this great clip by the MAYA Design group. It describes the problems we are about to face in integrating the Internet of Things in the most succinct way I've seen to date.

Trillions from MAYAnMAYA on Vimeo.

One section of it resonated with me greatly: we are not going to create a network of Trillions of devices using the technology we have today. We need to climb a new mountain of ideas that will enable Trillions of devices to work together.

Looking a little further past the clip, you can find the White Paper by Peter Lucas, a founder of the MAYA Design group, on the challenges faced in creating a Trillion node network. An interesting point he makes is that in terms of information exchange, there have only been two universally accepted units of data representation: the bit and the byte. It has been forty years and there has been no improvement on this. The White Paper also discusses some of the architectural requirements that any solution must meet to make the Trillion node network a reality; these include scalability, tractability and comprehensiveness.

Argot/XPL is our attempt at climbing the right mountain. Argot is designed around those architectural requirements identified by the MAYA Design group; however, at a more practical level I believe any solution must meet a number of other technical requirements. These are:

  1. Strict Internal Data Models – Each node/file in a Trillion node network must have an internally consistent data model of the types of data it contains or can communicate. This data model describes the structure of each element the node can communicate. Most distributed systems already have a strict data model; however, that model is usually referenced externally and is not specific to the individual node of the network. XML Schema, for instance, offers a shared data model, which makes it difficult to split schemas or implement only part of a schema. A strict internal data model describes exactly the data types that the individual node is capable of communicating.

  2. Element Versioning – The data model must be versioned at each individual element or structure. This is one area where XML Schema and most other data modelling tools currently fail to provide the functionality a Trillion node network needs. As functionality improves and changes, the data model must be updated. XML Schema allows only very limited changes that retain backward compatibility; sooner or later a schema requires changes that break it. When that happens, every node in the network must be updated to support the new schema. This already happens in large corporate systems that use XML Schema heavily.

  3. Partial Comparison – It must be possible to perform a partial comparison of data models. The data model of a Trillion node network is likely to be made up of many different schemas, and devices will implement different subsets of different shared data models. Two nodes must therefore be able to discover directly whether they can communicate, and by what method.
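
As a rough illustration of the partial comparison requirement, a node's internal data model can be pictured as a map from type names to versioned definitions; two nodes agree on exactly those types whose version and structure match. This is only a sketch: the names and the string-based definitions are assumptions for illustration, not Argot's actual representation.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PartialCompareSketch {
    // Hypothetical per-node data model entry: a version plus a definition.
    public record TypeEntry(String version, String definition) {}

    // Find the type names two nodes agree on: same version and the same
    // structural definition. Everything else is left unresolved, so two
    // devices that implement only parts of a shared schema can still
    // communicate over the subset they have in common.
    public static Set<String> commonTypes(Map<String, TypeEntry> a,
                                          Map<String, TypeEntry> b) {
        Set<String> shared = new HashSet<>();
        for (Map.Entry<String, TypeEntry> e : a.entrySet()) {
            TypeEntry other = b.get(e.getKey());
            if (other != null && other.equals(e.getValue())) {
                shared.add(e.getKey());
            }
        }
        return shared;
    }
}
```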

An important aspect of these requirements for building a Trillion node network of heterogeneous devices is that agreement on a single data format is not one of them. The format could be Argot, or any other suitable format supporting these requirements. What the Trillion node network requires of its data models and nodes is support for change through versioning, partial data model implementation, partial data model comparison and, finally, discovery mechanisms through which agreement on data formats can be established.

Peter Lucas finishes his white paper by suggesting that attribute/value pairs might be the next universally agreed unit of data representation. This is where I disagree: the next universal unit of data representation needs to be the data model. From this point of view, data modelling to support change is very immature. I've seen very few practical implementations that deal with change acceptably in a heterogeneous network.

In the end, I agree with the MAYA Design group. We need to start climbing the right mountain to create suitable solutions for the Internet of Things. I don't believe we will get there using REST and XML, as they don't support change, and change is one of the few requirements I can truly be certain of as the Internet of Things becomes a reality.

Wednesday, December 02, 2009

Argot submitted to IETF

A little while ago Esmond Pitt and I submitted the “Extensible Presentation Language (XPL) and Type Resolution Protocol (XPL/TRP)” to the IETF as part of the 6lowapp working group. XPL is the name we have selected; however, the technology is based on Argot, which has been developed over the past six years. I'm incredibly pleased with the result, as the document now provides the clearest description yet of what Argot is all about. The document draft-ryanpitt-6lowapp-xpl-00 is published on the IETF web site.

The following excerpts from the proposal provide a short, concise description of the XPL/Argot concept:

“XPL was created to combine (i) the extensibility provided by XML and XML Schema, (ii) the ability to describe and encode binary information like IDL+IOP and ASN.1, (iii) the ability to create remote procedure call services like CORBA, and (iv) to be inherently 'version-aware', to ensure that a change to one aspect of the system did not create cost across the whole network.

The XPL solution is to have each application in the network contain metadata about all the information it can communicate (send and receive). This allows each XPL application to negotiate directly with other peers the information they are able to exchange. It is this fundamental change of moving the metadata knowledge into the application or file, rather than keeping it externally, which is the most important aspect of XPL.

The XPL type system embedded (or notionally embedded) in each device is combined with methods for performing type comparisons with external systems. The type comparisons are designed to find the common definition type set shared by two XPL type systems.”

The proposal breaks the technology into two parts. The following descriptions are also from the proposal:

  1. XPL, a powerful Extensible Presentation Language which is used both at protocol design time and by applications at runtime via very simple and small code libraries to communicate both primitives and compound types. XPL creates an internally consistent directed graph, which may be cyclic, to define complete versioned type systems to be stored with (or notionally with) an application or device.
  2. XPL/TRP, a compact Type Resolution Protocol, in turn provides (i) dynamic negotiation of protocols and their constituent types between disparate devices; (ii) protocol versioning all the way down to the type level; and (iii) dynamic discovery of device capabilities and versions. An XPL device is able to describe and communicate its own application protocol. All these features are built-in to the protocol and intrinsic to its operation, rather than being extra-cost additions to it.

The XPL system is a departure from how most application protocols are designed. Protocols are normally designed first, and the implementation is then created from the design. XPL binds the design and implementation together so that they are interrelated. This has interesting consequences for versioning and detailed discovery. A device can be interrogated to discover the structure of all the data that can be sent to or received from it. Combined with scripting languages and other methods, this would allow clients to be built automatically with zero code. XPL makes it possible for the first time to embed a formal protocol description in any device or application, down to the smallest of devices.

There are a lot of other interesting proposals that have been delivered to the IETF. The application protocols area alone includes XPL, Binary HTTP (chopan), Binary XML (EXI), the ZigBee Alliance requirements and others. The area of M2M (machine to machine) communication is going to continue to hot up over the next few years. There's still plenty more work to do, and it will take a while before a consensus is reached over the various protocols. Some interesting times are ahead!

Tuesday, October 06, 2009

Argot Meets Contiki

I've spent the last week getting a demo Argot application communicating with a Contiki device. Thankfully, I was able to dust off the code developed in 2005 and get it working reasonably painlessly. The end result is a Java client and a Contiki host with a published method that can be called via UDP from a Java application.

The Contiki device has implemented a very simple interface as defined by the following Argot source file. It defines a “test” interface and a “doSomething” method which is attached to the “test” interface. The “doSomething” method takes a single 32-bit integer as a request and responds with a 32-bit integer. The method multiplies the input by three and sends back the result.

(library.definition "test" meta.version:"1.3")

(library.relation #test meta.version:"1.3" u8utf8:"doSomething")
(remote.method u8ascii:"doSomething"
  [ (remote.parameter #int32 u8ascii:"param" ) ]
  [ (remote.parameter #int32 u8ascii:"ret" ) ]
  [ ] )

This Argot source file includes types like “remote.method” and “remote.interface” that were not defined by the meta dictionary in the last post. These other types are defined in other Argot files; for brevity, I haven't included them here. I might cover them in a later post.

The Java client uses the meta information supplied by the Argot dictionary to allow a Java interface to be bound to the meta data. The result is that calling the method is simple:
public void testApp() throws Exception
{
    // Create the object reference.
    MetaObject objectReference = new MetaObject(
        new SimpleLocation( 2, "local" ),
        _client.getTypeLibrary().getTypeId( ITestClass.TYPENAME, "1.3" ) );

    // Get the proxy for the test interface.
    ITestClass test = (ITestClass) _client.getFront( objectReference );

    // Call the remote method and check the result.
    int result = test.doSomething( 10 );

    assertEquals( result, 30 );
}
The first line creates the object reference. It consists of a location and a type. The location is an abstract data type, meaning that different kinds of location specifier can be used. In this case the location specifier is a simple integer: the server contains a list of index-based objects. For small devices this is appropriate; however, other devices may wish to use URLs or other location specifiers. The second parameter is the identifier of the “test” interface.

The second line retrieves a proxy for the “test” interface. In Java this has been implemented as the ITestClass. During setup the ITestClass is bound to the “test” interface definition, and all its methods are checked to ensure they are consistent with the methods defined in the Argot dictionary.

Finally, the method “doSomething” is called with the value 10. The end of the test asserts that the final result is 30.

On the Contiki side, I created a static dictionary structure of all the data types required by the interface. This included the full meta dictionary as defined in the last post, the “test” interface and method definition, and everything in between. The end result is a dictionary with 60 entries that uses about 3kb of data. This can probably be halved with a few changes; I'll get to those later.

Argot uses a Type Resolution Protocol to create a tight data binding between client and server. The protocol uses only a few basic concepts. These are:
  • TYPE_META – The initial request sends a challenge to request the meta dictionary. As this is the base for all other types, the client and server must ensure these are the same on both sides.

  • TYPE_MAP – A type map request sends a data type dictionary location and definition and returns a type identifier.

  • TYPE_MAP_DEFAULT – Sends a type name and returns a location with version and definition. The client then checks its own dictionary for a match.

  • TYPE_RESERVE – In cases where a data type is self-referencing, the client must first obtain an identifier for the type in order to define it. This reserves an identifier before TYPE_MAP is called.

  • TYPE_REVERSE – In cases where the server sends a client an identifier that the client hasn't mapped, the client sends the identifier and is returned the location with version and definition of the data type.

  • TYPE_BASE – This is a kind of bootstrap discovery mechanism. The client is able to request the base type or interface to be found on the host.

  • TYPE_MSG – Finally, the last message is a user-defined message and is for the application protocol being used.
These basic messages allow a client to discover from first principles all the data types required to communicate with a host. The following demonstrates the messages being sent and received between the Java application and the Contiki host.

Note: The very first message should be a TYPE_META message. This is yet to be implemented on Contiki, as it needs to be changed from the previous approach. Previously the client would send the full meta dictionary so that the server could perform a binary compare. To reduce size this will be reversed: the client will issue a meta dictionary challenge, and the server will respond with either the full meta dictionary or just a meta dictionary version. Obviously, on small devices the ability to remove the 35 types of the meta dictionary will help size greatly. To reduce size further, the response to the meta dictionary challenge could be a URL of where to find the device dictionary. In this way the benefits of the Argot protocol could fit the smallest of devices.
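
As a reading aid for the traces, the message types can be collected into a sketch enum. The codes for TYPE_MAP (0x01), TYPE_MAP_DEFAULT (0x02) and TYPE_MSG (0x07) can be read off the conversation below; the remaining codes are placeholders I've assumed for illustration, not the actual protocol values.

```java
// Sketch of the Type Resolution Protocol message set described above.
public enum TrpMessage {
    TYPE_MAP(0x01),          // name + definition -> identifier
    TYPE_MAP_DEFAULT(0x02),  // name -> host's location, version and definition
    TYPE_META(0x03),         // placeholder code: meta dictionary challenge
    TYPE_RESERVE(0x04),      // placeholder code: reserve an id for a self-referencing type
    TYPE_REVERSE(0x05),      // placeholder code: identifier -> location, version, definition
    TYPE_BASE(0x06),         // placeholder code: bootstrap discovery of base types
    TYPE_MSG(0x07);          // user-defined application message

    public final int code;

    TrpMessage(int code) { this.code = code; }
}
```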

The following shows the full conversation between Java client and Contiki host:
-> 02 00 09 0b 00 06 r e m o t e
<- 02 00 00 00 23 0b 00 06 r e m o t e 00 01 05

-> 02 00 06 0b 23 03 r p c
<- 02 00 00 00 3d 0b 23 03 r p c 00 01 05

MAP DEFAULT: remote.rpc.request
-> 02 00 0a 0b 3d 07 r e q u e s t
<- 02 00 00 00 3e 0c 3d 07 r e q u e s t 01 03 00 23 0f 03 0e 08 l o c a t i o n 0d 25 0e 06 m e t h o d 0d 28 0e 04 d a t a 10 0d 01 0d B

MAP DEFAULT: remote.location
-> 02 00 0b 0b 23 08 l o c a t i o n
<- 02 00 00 00 25 0c 23 08 l o c a t i o n 01 03 00 02 07 00

-> 02 00 09 0b 00 06 u i n t 31 36
<- 02 00 00 00 28 0c 00 06 u i n t 31 36 01 03 00 09 13 10 10 04 16 10 17 18 19

MAP DEFAULT: meta.identified
-> 02 00 0d 0b 03 0a i d e n t i f i e d
<- 02 00 00 00 B 0c 03 0a i d e n t i f i e d 01 03 00 11 0f 01 0e 0b d e s c r i p t i o n 0d 08

MAP DEFAULT: remote.rpc.response
-> 02 00 0b 0b 3d 08 r e s p o n s e
<- 02 00 00 00 3f 0c 3d 08 r e s p o n s e 01 03 00 18 0f 02 0e 07 i n E r r o r 0d 3c 0e 04 d a t a 10 0d 01 0d B

-> 02 00 07 0b 00 04 b o o l
<- 02 00 00 00 3c 0c 00 04 b o o l 01 03 00 02 0d 01

MAP DEFAULT: index (definition)
-> 02 00 08 0b 00 05 i n d e x
<- 02 00 00 00 27 0c 00 05 i n d e x 01 03 00 04 0f 01 0d 28

MAP: index (relation)
-> 01 00 08 0d 25 05 i n d e x 00 02 06 27
<- 01 00 00 00 26

MAP DEFAULT: remote.method
-> 02 00 09 0b 23 06 m e t h o d
<- 02 00 00 00 2e 0c 23 06 m e t h o d 01 03 00 33 0f 04 0e 04 n a m e 0d 2f 0e 07 r e q u e s t 10 0d 01 0d 30 0e 08 r e s p o n s e 10 0d 01 0d 30 0e 05 e r r o r 10 0d 01 0d 28

MAP DEFAULT: u8ascii
-> 02 00 0a 0b 00 07 u 38 a s c i i
<- 02 00 00 00 2f 0c 00 07 u 38 a s c i i 01 03 00 10 12 10 0d 01 0d 01 09 I S O 36 34 36 2d U S

MAP DEFAULT: remote.parameter
-> 02 00 0c 0b 23 09 p a r a m e t e r
<- 02 00 00 00 30 0c 23 09 p a r a m e t e r 01 03 00 12 0f 02 0e 04 t y p e 0d 28 0e 04 n a m e 0d 2f

MAP: remote.method (relation)
-> 01 00 10 0d 0b 0d r e m o t e 2e m e t h o d 00 02 06 2e
<- 01 00 00 00 2d

MAP_DEFAULT: remote.interface
-> 02 00 0c 0b 23 09 i n t e r f a c e
<- 02 00 00 00 2b 0c 23 09 i n t e r f a c e 01 03 00 07 0f 01 10 0d 01 0d 28

MAP: remote_interface remote.interface (relation)
-> 01 00 13 0d 0b 10 r e m o t e 2e i n t e r f a c e 00 02 06 2b
<- 01 00 00 00 2a

MAP: test (interface)
-> 01 00 09 0c 00 04 t e s t 01 03 00 02 2b 00
<- 01 00 00 00 29

MAP: int32
-> 01 00 0a 0c 00 05 i n t 33 32 01 03 00 09 13 20 20 04 16 20 17 18 19
<- 01 00 00 00 33

MAP: test.doSomething (method definition relation)
-> 01 00 0e 0d 29 0b d o S o m e t h i n g 00 20 2e 0b d o S o m e t h i n g 01 00 33 05 p a r a m 01 00 33 03 r e t 01 00 33
<- 01 00 00 00 32
All the above is purely the data type agreement required before actually making the real method call. Obviously this is expensive and would not be performed every time; the client would cache the mappings already performed so that they can be reused later.

The most important aspect of the list of data types resolved above is that only the data types required to make the doSomething method call are resolved. Additional types may need to be resolved before calling another method. This allows the client to only resolve the parts of the interface it uses. There are numerous advantages to this which I'll cover in another post.

Finally the actual method call is made.

-> 07 27 00 02 00 32 01 00 33 00 00 00 0a
<- 07 00 01 00 33 00 00 00 1e

The final message request is defined by the remote.rpc.request type and breaks down as follows:

07 – Message request
27 – Location identifier type 27 (index)
00 02 – Location index 2. This is currently a uint16; it could be changed to uvint28 to save a byte.
00 32 – Method identifier. Could also be changed to uvint28.
01 – Number of parameters.
00 33 – First parameter is an int32.
00 00 00 0a – The input value 10.

The response is defined by remote.rpc.response and breaks down as follows:

07 – Message response
00 – In-error flag. Boolean value. False.
01 – One parameter returned.
00 33 – First parameter is an int32.
00 00 00 1e – The return value 30.
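
The breakdown above can be checked mechanically. The following sketch assembles the request bytes and extracts the return value from the response; the field layout follows the trace, while the method names and the fixed single-parameter layout are illustrative assumptions.

```java
public class RpcBytesSketch {
    // Build a single-parameter remote.rpc.request as seen in the trace:
    // message code, location type, uint16 location index, method id,
    // parameter count, parameter type id, then a big-endian int32 value.
    public static byte[] buildRequest(int locationTypeId, int locationIndex,
                                      int methodId, int paramTypeId, int value) {
        return new byte[] {
            0x07,                                               // message request
            (byte) locationTypeId,                              // location identifier type
            (byte) (locationIndex >> 8), (byte) locationIndex,  // uint16 location index
            (byte) (methodId >> 8), (byte) methodId,            // method identifier
            0x01,                                               // number of parameters
            (byte) (paramTypeId >> 8), (byte) paramTypeId,      // parameter type (int32)
            (byte) (value >> 24), (byte) (value >> 16),         // big-endian int32 value
            (byte) (value >> 8), (byte) value
        };
    }

    // Pull the trailing big-endian int32 out of a remote.rpc.response.
    public static int parseResponseValue(byte[] resp) {
        int off = resp.length - 4;
        return ((resp[off] & 0xFF) << 24) | ((resp[off + 1] & 0xFF) << 16)
             | ((resp[off + 2] & 0xFF) << 8) | (resp[off + 3] & 0xFF);
    }
}
```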
The remote.rpc protocol is very simple and is missing some elements that would be required for a UDP implementation. For instance, an identifier would be required so that requests can be matched up with the correct responses.

It should also be noted that remote.interface and remote.method define one style of RPC mechanism. Different meta data could be produced to define REST-like interfaces and protocols.

There's still plenty to be done to get the Internet Draft ready by the 19th of October. Hopefully the above has provided a small insight into how Argot works and how it can be implemented on even the smallest of devices.

Tuesday, September 29, 2009

IETF and Squeezing the Meta Dictionary

In the last few months I've struggled to find the right direction to take Argot. I've looked at reviving the Personal Browser concept, investigated SCTP and a few other things. These are all good research areas for Argot; however, they take the focus away from the core Argot idea. I've now returned to the core of Argot with a renewed focus, driven by the 6lowapp IETF working group.

The 6lowapp IETF working group is being formed to develop the application protocols that will form the basis for the “Internet of Things”. Argot was originally created with small embedded systems in mind; in fact, in October 2005 I blogged about reducing an Argot RPC server to 7kb. While Argot can solve problems in other domains, the “Internet of Things” is the best fit for the problems it does solve.

The current plan is to develop an IETF Internet Draft (I-D) which provides the rationale for Argot and the technical problems it solves, and provides a specification. In addition, I plan on developing an example service using Contiki. There's a lot of work to be done before the 19th of October. Of course, there's no guarantee that Argot will become an RFC; however, I should at the very least receive some good feedback and help Argot fit into the application stack being developed.

An important part of developing for embedded systems is size, so I've been squeezing the Argot meta data and developing ways to allow the Argot protocol to work on the smallest of devices. In doing so, I've also been improving the meta dictionary and removing a few niggling constraints.

The first change to help size was the introduction of the uvint28 data type. This is an unsigned variable length integer with up to 28 bits of integer data. It uses the high bit of each octet as a continuation bit. The integer can be between 1 and 4 octets. The 28 comes from the fact that the normal 32 bit integer loses 4 bits of precision to the continuation bits. This type has replaced the uint16 (unsigned 16 bit integer) in the meta dictionary and in doing so removes many zero bytes. It also removes the limitation of the 16 bit integer. The meta dictionary when encoded after this change is 985 bytes long and includes 29 data types.
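A uvint28 codec following that description might look like the sketch below. The most-significant-group-first octet ordering is an assumption made here to match the big-endian attribute in the meta dictionary; it is not taken from the Argot specification.

```java
public class UVInt28 {
    // Encode an unsigned value into one to four octets. Each octet holds
    // seven bits of data; the high bit is set while more octets follow.
    public static byte[] encode(int value) {
        if (value < 0 || value > 0x0FFFFFFF)
            throw new IllegalArgumentException("outside uvint28 range");
        int octets = 1;
        if (value > 0x1FFFFF) octets = 4;
        else if (value > 0x3FFF) octets = 3;
        else if (value > 0x7F) octets = 2;
        byte[] out = new byte[octets];
        for (int i = octets - 1; i >= 0; i--) {
            out[i] = (byte) (value & 0x7F);   // least-significant group last
            value >>>= 7;
        }
        for (int i = 0; i < octets - 1; i++)
            out[i] |= (byte) 0x80;            // continuation bit on all but the last
        return out;
    }

    // Decode: accumulate 7-bit groups until an octet with a clear high bit.
    public static int decode(byte[] data) {
        int value = 0;
        for (byte b : data) {
            value = (value << 7) | (b & 0x7F);
            if ((b & 0x80) == 0) break;
        }
        return value;
    }
}
```

Small values like the indexes in the meta dictionary fit in a single octet, which is where the saving over uint16 comes from.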

The next change was to introduce a meta.cluster type. This is a group definition that allows each name to refer to a cluster, so definitions can use cluster references instead of recording their full names. This change introduces a few new types; however, overall it removes a lot of duplicate information. The result is that the meta dictionary is now 888 bytes long and includes 35 data types. I'm not superstitious, but it's pretty cool that the end result is 888 bytes long; a very lucky number in Chinese.

[Edit: After doing some testing I discovered an off-by-one bug which was causing an extra byte of data in strings. The meta dictionary is now 859 bytes long. Not as cool as the 888 byte length, but it is even shorter, which is great!]

The end result of these changes should allow a full service description to use around 3kb of data and 3kb of code. In the coming weeks this will be confirmed and tested using the same mechanisms developed back in 2005: a Java client with the full Argot protocol stack, and a cut-down, purpose-built embedded Argot stack.

Depending on the application and the size of the device, 3kb of data may still be too large. To resolve this, I've been looking at the Argot type resolution protocol. Instead of storing the Argot meta data on the device, the device can simply report a URL or other host that contains the meta data. This should in effect allow the device to report the full message definition of its services in less than 1kb of code and data.

For those interested, here's the new meta dictionary data definitions.

(library.list [

/* BASE_ID 1 */


/* UINT8_ID 2 */

(library.definition"uint8" meta.version:"1.3")
(meta.atom uvint28:8 uvint28:8
[ (meta.attribute.size uvint28:8)
(meta.attribute.bigendian) ] ))

/* UVINT28_ID 3 */

(library.definition"uvint28" meta.version:"1.3")
(meta.atom uvint28:8 uvint28:32
[ (meta.attribute.size uvint28:28)
(meta.attribute.bigendian) ] ))


("meta" )

/* META_ID_ID 5 */

(library.definition"" meta.version:"1.3")
(meta.reference #uvint28))


(library.definition"meta.cluster" meta.version:"1.3")
(meta.sequence []))


(library.definition"meta.abstract_map" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference


(library.definition"meta.abstract" meta.version:"1.3")
(meta.sequence [
(meta.reference #uint8)
(meta.reference #meta.abstract_map))]))

/* U8UTF8_ID 9 */

(library.definition"u8utf8" meta.version:"1.3")
(meta.reference #uint8)
(meta.reference #uint8))

/* META_NAME_ID 10 */

(library.definition"" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"group" (meta.reference
(meta.tag u8utf8:"name" (meta.reference #u8utf8))


(library.definition"meta.version" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"major" (meta.reference #uint8))
(meta.tag u8utf8:"minor" (meta.reference #uint8))


(library.definition"meta.definition" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #meta.cluster)
(meta.abstract_map #meta.atom)
(meta.abstract_map #meta.abstract)
(meta.abstract_map #meta.abstract_map)
(meta.abstract_map #meta.expression)


(library.definition"meta.expression" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #meta.reference)
(meta.abstract_map #meta.tag)
(meta.abstract_map #meta.sequence)
(meta.abstract_map #meta.array)
(meta.abstract_map #meta.envelope)
(meta.abstract_map #meta.encoding)


(library.definition"meta.reference" meta.version:"1.3")
(meta.sequence [(meta.reference]))

/* META_TAG_ID 15 */

(library.definition"meta.tag" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"name"
(meta.reference #u8utf8))
(meta.tag u8utf8:"data"
(meta.reference #meta.expression))]))


(library.definition"meta.sequence" meta.version:"1.3")
(meta.reference #uint8)
(meta.reference #meta.expression)))

/* META_ARRAY_ID 17 */

(library.definition"meta.array" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"size" (meta.reference #meta.expression))
(meta.tag u8utf8:"data" (meta.reference #meta.expression))]))


(library.definition"meta.envelope" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"size" (meta.reference #meta.expression))
(meta.tag u8utf8:"type" (meta.reference #meta.expression)) ]))


(library.definition"meta.encoding" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"data" (meta.reference #meta.expression))
(meta.tag u8utf8:"encoding" (meta.reference #u8utf8))]))

/* META_ATOM_ID 20 */

(library.definition"meta.atom" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"min_bit_length" (meta.reference #uvint28))
(meta.tag u8utf8:"max_bit_length" (meta.reference #uvint28))
(meta.tag u8utf8:"attributes"
(meta.reference #uint8)
(meta.reference #meta.atom_attribute)))]))


(library.definition"meta.atom_attribute" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #meta.attribute.size)
(meta.abstract_map #meta.attribute.integer)
(meta.abstract_map #meta.attribute.unsigned)
(meta.abstract_map #meta.attribute.bigendian)


("meta.attribute" )


(library.definition"meta.attribute.size" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"size" (meta.reference #uvint28))


(library.definition"meta.attribute.integer" meta.version:"1.3")
(meta.sequence []))


(library.definition"meta.attribute.unsigned" meta.version:"1.3")
(meta.sequence []))


(library.definition"meta.attribute.bigendian" meta.version:"1.3")




(library.definition"dictionary.base" meta.version:"1.3")
(meta.sequence []))


(library.definition"" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"name" (meta.reference


(library.definition"dictionary.definition" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference
(meta.tag u8utf8:"version" (meta.reference #meta.version))

(library.definition"dictionary.relation" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference


(library.definition"dictionary.location" meta.version:"1.3")
(meta.abstract [
(meta.abstract_map #dictionary.base)
(meta.abstract_map #dictionary.definition)
(meta.abstract_map #dictionary.relation)


(library.definition"dictionary.definition_envelope" meta.version:"1.3")
(meta.reference #uvint28)
(meta.reference #meta.definition)))


(library.definition"dictionary.entry" meta.version:"1.3")
(meta.sequence [
(meta.tag u8utf8:"id" (meta.reference
(meta.tag u8utf8:"location" (meta.reference #dictionary.location))
(meta.tag u8utf8:"definition" (meta.reference #dictionary.definition_envelope))]))


(library.definition"dictionary.entry_list" meta.version:"1.3")
(meta.reference #uvint28)
(meta.reference #dictionary.entry )))


And just for fun, here's the meta dictionary encoded. The encoding is shown as a mixture of hex and ASCII; ASCII is only used for a-z characters.

23 01 1c 01 06 02 1e 01 05 u i n t 8 01 03 09 14 08 08 04 17 08 18 19 1a 03 1e 01 07 u
v i n t 2 8 01 03 09 14 08 1c 04 17 08 18 19 1a 04 1d 01 04 m e t a 01 06 05 1e 04
03 2e i d 01 03 02 0e 03 06 1e 04 08 2e c l u s t e r 01 03 02 10 00 07 1e 04 0d 2e
a b s t r a c t _ m a p 01 03 08 10 01 0f 02 i d 0e 05 08 1e 04 09 2e a b s
t r a c t 01 03 07 10 01 11 0e 02 0e 07 09 1e 01 06 u 8 u t f 8 01 03 0c 13 11 0e
02 0e 02 05 U T F 2d 8 0a 1e 04 05 2e n a m e 01 03 13 10 02 0f 05 g r o u p 0e
05 0f 04 n a m e 0e 09 0b 1e 04 08 2e v e r s i o n 01 03 14 10 02 0f 05 m a j
o r 0e 02 0f 05 m i n o r 0e 02 0c 1e 04 0b 2e d e f i n i t i o n 01 03 07
08 05 06 14 08 07 0d 0d 1e 04 0b 2e e x p r e s s i o n 01 03 08 08 06 0e 0f 10 11
12 13 0e 1e 04 0a 2e r e f e r e n c e 01 03 04 10 01 0e 05 0f 1e 04 04 2e t a g
01 03
n c e 01 03 07 10 01 11 0e 02 0e 0d 11 1e 04 06 2e a r r a y 01 03 12 10 02 0f 04 s
i z e 0e 0d 0f 04 t y p e 0e 0d 12 1e 04 09 2e e n v e l o p e 01 03 12 10 02
0f 04 s i z e 0e 0d 0f 04 t y p e 0e 0d 13 1e 04 09 2e e n c o d i n g 01 03
16 10 02 0f 04 d a t a 0e 0d 0f 08 e n c o d i n g 0e 09 14 1e 04 05 2e a t o
m 01 03 7 10 03 0f 0e m i n _ b i t _ l e n g t h 0e 03 0f 0e m a x _ b
i t _ l e n g t h 0e 03 0f 0a a t t r i b u t e s 11 0e 02 0e 15 15 1e 04
0f 2e a t o m _ a t t r i b u t e 01 03 06 08 04 17 18 19 1a 16 1d 04 0a 2e a
t t r i b u t e 01 06 17 1e 16 05 2e s i z e 01 03 0a 10 01 0f 04 s i z e 0e
03 18 1e 16 08 2e i n t e g e r 01 03 02 10 00 19 1e 16 09 2e u n s i g n e d
01 03 02 10 00 1a 1e 16 0a 2e b i g e n d i a n 01 03 02 10 00 1b 1d 01 0a d i c
t i o n a r y 01 06 1c 1e 1b 05 2e b a s e 01 03 02 10 00 1d 1e 1b 05 2e n a m
e 01 03 0a 10 01 0f 04 n a m e 0e 0a 1e 1e 1b 0b 2e d e f i n i t i o n 01 03
15 10 02 0f 04 n a m e 0e 0a 0f 07 v e r s i o n 0e 0b 1f 1e 1b 09 2e r e l a
t i o n 01 03 0f 10 02 0f 02 i d 0e 05 0f 03 t a g 0e 09 20 1e 1b 09 2e l o c a
t i o n 01 03 06 08 04 1c 1d 1e 1f 21 1e 1b 14 2e d e f i n i t i o n _ e n
v e l o p e 01 03 05 12 0e 03 0e 0c 22 1e 1b 06 2e e n t r y 01 03 1e 10 03 0f 02
i d 0e 03 0f 04 n a m e 0e 20 0f 0a d e f i n i t i o n 0e 21 23 1e 1b 0b 2e
e n t r y _ l i s t 01 03 07 10 01 11 0e 03 0e 22

Tuesday, July 07, 2009

Colony Personal Browser – Part 1 - Overview

After researching and implementing versioning for Argot (see recent posts), I've spent a lot of time looking for which direction to head in next. The problem I find with Argot is that there are so many different directions it can be taken. There's so much work that could be done, yet due to time and money constraints I've only got an hour or two a day (while travelling to and from work) in which to make progress.

For a while I had a lot of trouble finding what to focus on and jumped between a few projects. I updated Colony to work with Argot versioning. I also rebuilt the software that Argot was originally built for: a network monitoring solution built on a virtual network of nodes for processing and filtering log messages at high speed. I contemplated polishing Argot 1.3; fixing up documentation, adding more examples and improving the web site. I've also thought about the BORED protocol (see older posts) and the redesign of the Colony communication protocol. This has led to thinking about the model that Argot and Colony were based on. The original model dates back over 15 years and hasn't changed much since then.

After heading off on a lot of different tangents, I think I'm now heading in the right direction. I'm looking at the full protocol stack from the transport to the application layer. I'm returning to the original model that Colony was based on and re-establishing the fundamental concept of the personal browser. This will provide a more solid foundation from which to build Colony and the Argot remote type negotiation system. It will also take on some of what I learned from investigating REST with the BORED (Binary Object REst Distributed) protocol.

In the numerous posts regarding the BORED protocol, one of the most pertinent things I discovered is that designing protocols requires a very clear understanding of the vision for the protocol. The specific aims, tasks and model must be understood clearly from the start. For this reason it's worth revisiting the history and thoughts behind Colony and Argot and understanding the model they are based upon.

Colony History

As I said before, the original model dates back over 15 years to the early 1990's, when I had just started computer science at University. Before I started at Uni I already had a fascination with protocols and communication. This interest was born from using Bulletin Board Systems (BBS's) as a teenager. Before starting Uni I had designed protocols and developed software for drawing vector graphics so two people could both draw on each other's screens. What annoyed me from that early age about BBS's was the inability to do more than one thing. As I learnt more about communications I started to develop a model for distributed applications.

The model for Colony was based on a few simple constructs; zones, realms and nodes. These constructs still exist in Colony today, however, they don't integrate as cleanly into the design as I would like. I'll explain each of these constructs and their behaviour now.

A zone is a container with simple name/value pairs. A zone could be implemented as a HashMap, overlay a directory structure or proxy another zone from a remote system. It can contain simple values, objects or other zones. A zone provides the basis for naming and containment in the system.

A realm is an extension of a zone. A realm provides the security aspects of a zone or set of zones. The idea of a realm was that it be implemented as a process but include a security model to ensure applications only had specific access to the underlying system.

A node provides the processing aspect of the design. A node contains a queue for receiving messages and a thread pool for processing them. A distributed application would consist of multiple nodes passing messages. The design was suited to network applications that process a lot of data from different sources.
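The three constructs can be sketched in code. This is an illustrative sketch only: the class names and methods below are my assumptions based on the descriptions above, not Colony's actual API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

// A zone: a container of name/value pairs, providing naming and containment.
class Zone {
    private final Map<String, Object> entries = new HashMap<>();
    public void put(String name, Object value) { entries.put(name, value); }
    public Object get(String name) { return entries.get(name); }
}

// A realm extends a zone with a simple access check standing in for the
// security model described above.
class Realm extends Zone {
    private final Set<String> allowed = new HashSet<>();
    public void allow(String name) { allowed.add(name); }
    @Override
    public Object get(String name) {
        if (!allowed.contains(name)) throw new SecurityException("access denied: " + name);
        return super.get(name);
    }
}

// A node queues messages; a real implementation would drain the queue with a
// thread pool and forward results on to other nodes.
class Node {
    private final Queue<Object> inbox = new ConcurrentLinkedQueue<>();
    public void send(Object message) { inbox.add(message); }
    public Object process() { return inbox.poll(); }
}
```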

Applications would be designed with all their data contained in zones, realms and nodes which would allow real-time introspection of the data and how nodes were behaving.

The original design was based on a message processing architecture. Each node would be designed to perform a small and specific task and forward on the same or new message to another node. Applications could be configured by putting different nodes into a zone and wiring them up to perform a specific function. Messages could be passed between realms to create a distributed application. Since this original design a lot of things have changed about what is important in distributed applications.

Colony Architecture Revisited

Before launching into protocol design based on the original model of Colony, it's worth looking at the original design with over 15 years of experience. The original design was aimed at creating a BBS client/server system that allowed multiple applications to communicate concurrently. It was expected that each application would be downloaded and installed and use the naming and communication system of Colony. Since that time things have changed considerably.

TCP/IP and the browser have become the key building blocks for the vast majority of web communication. Many of the building blocks for distributed systems have been developed and are much better understood. Virtual machines and scripting languages, which enable code on demand, are now available. This all leads to a different architecture for Colony which mixes the old and new.

The revised aim and architecture of Colony is to build a “personal browser”. The Colony Personal Browser mixes concepts of Instant Messaging, Code on Demand, Distributed Computing, Security, Identity and Web Browsing to create a secure connected peer-to-peer browser environment. This environment should allow applications to create secure connections between both individuals and Colony servers offering services and/or web style documents. At a high level the components of the “personal browser” include:

  • Documents & Code on Demand
  • Security & Identity
  • Communications
  • Virtual Machine
  • User Interface

Other than Identity all of these components make up a modern browser; so before moving on I'll explain what is different about each of these components.

Documents & Code on Demand – In a paper I wrote a number of years ago, I outlined a method of using Argot as a common data format for virtual machine byte code, document formatting and as a scripting language. All of these formats are compiled to a common byte code which is then targeted to a more specific virtual machine. Creating a common format for all data creates a very flexible development environment where different styles of programming languages can be built on the same virtual machine. It should also allow mixing different programming styles closely together allowing the best method of development to be used for each problem. You can read the evolvable programming languages paper here.

Security & Identity – The web of trust security model has been well documented and around for many years. However, after the initial interest in PGP its use has dwindled and it has not been accepted into the mainstream. The aim of the personal browser is to wrap a set of user interfaces and processes around the web-of-trust model to make it easier and quicker for users to understand and use. You can read more about the web of trust at wikipedia.

Communications – The area of communications is obviously the most important aspect of the personal browser. Communicating between users and servers requires a hybrid peer-to-peer system. This style of communication requires a flexible transport layer that allows any peer to act as a client and initiate requests to other peers. This in turn requires a flexible naming system that allows peers to be found and communicated with. The naming and routing aspect of the personal browser is probably the most difficult, with the least defined solution. I'll discuss the protocol in more detail a little later.

Virtual Machine – The virtual machine and underlying execution environment must provide a sandbox that ensures the security of the underlying system. This is a well understood area, and selecting either the Google Chrome environment using V8 or the Java Virtual Machine as a basis would achieve the aims here.

User Interface – This is another area with very well understood and well developed code. Taking a browser layout engine that can be removed from an existing browser without being infected with HTML or CSS will provide a good clean user interface as a starting point.

Colony communication building blocks
This next section will examine some of the building blocks required as input into the next version of the Colony protocol.

Sand Boxes
The concept of realms can be compared to the sand boxes used by both Java and Google Chrome. Google Chrome is interesting in that it separates the browser into multiple processes to improve security. Colony can use this same mechanism to separate each realm and provide limited access to hosted applications. This can be achieved by using a gateway system which provides firewall and routing mechanisms for data being sent and received from applications.

The gateway is able to handle connections to multiple hosts and direct messages between multiple Realms running on the same computer. Between the Gateway and Realms a system pipe or other communication mechanism can be used. For external communications IP based transport will be used.

SCTP – The right transport
One of the big challenges that I've been faced with over the years is building the right type of protocol over TCP. As TCP is a stream based protocol it is faced with a number of issues. A friend recently pointed me towards SCTP as a potentially better transport protocol. After reviewing some of the various introductory web sites on SCTP, I'm convinced. SCTP creates a much better base to build higher level protocols on than either UDP or TCP. SCTP provides:
  • Multiple streams on a single connection
  • Multi-homing
  • Unordered delivery
  • Reliable transmission
  • etc

There's more information on SCTP available out there. There is also the issue that SCTP is only being introduced in Java 7, which is still in beta. Also, Java 7 when released will not support SCTP on Windows. There are third party implementations of SCTP for Windows; however, the protocol is not yet supported by Microsoft. Given I'm developing Colony as a longer term solution I don't believe this will be a problem.

I will publish some prototype code soon which demonstrates TLS on SCTP using Java 7 which I developed on Open Solaris using VMWare.

TLS – Web of Trust

Another big issue in distributed applications is security and identity. To provide security the obvious answer is TLS which is a proven secure protocol. As mentioned earlier, the solution for identity will be the Web-of-trust model.

Ideally the digital certificate data used in the web-of-trust model will be encoded using Argot instead of ASN.1 to ensure consistency across the system. The Web-of-trust model will be developed after the main protocol.

Naming & Location

The URI has proved itself one of the most flexible solutions to object naming and location. People understand it and find it easy to use. It is not perfect for this application, as it binds the name to a specific host and IP address; however, it initially provides the best choice for naming and location. I'll discuss why the URI is not perfect a little later in the design.

Another possible solution to naming and location would be based on a peer-to-peer naming system. This would fit better with the distributed nature of the system.

Protocol Stack Design

Sand boxes, SCTP, TLS/web-of-trust and URI naming are all the ingredients required to design the lower levels of the protocol stack. I'll now examine how these parts would best be glued together in a protocol to meet the design.

My initial thought process for the protocol stack was that it would be obvious that TLS would sit over SCTP to provide secure communications between systems. However, after delving further into the design I realised that using the model where a gateway is the external communications end point fails to provide true end to end security to the application. To provide end to end security the TLS connection must terminate at the realm and not at the gateway. To support this, a protocol gateway layer (Colony Routing Protocol) is required which acts as an application message router. The protocol stack then looks like:

  • Application / presentation protocols
  • Session (Discovery) layer
  • TLS (terminating at the realm)
  • Colony Routing Protocol
  • SCTP (or pipes/TCP where appropriate)

In effect what is being created is a high level transport protocol that allows messages to be delivered directly to objects with security. This change in model has a number of advantages. It allows the gateway to act as a load balancer or facade onto a group of larger systems. It can also allow the gateway to proxy requests to other gateways. The gateway can also use other transport mechanisms to reach the final Realm destination. On a local computer it can use pipes, while on a remote system it could use SCTP, TCP or other transport. The protocol should also be light weight enough to send through to embedded devices.

As mentioned before, using SCTP encourages a message based system. The gateway router mechanism also suggests that a message based system would also be appropriate for this layer of the protocol stack. A secondary restriction is that TLS requires that each message packet be no longer than 16kb. This puts a restriction on the size of the overall SCTP packet to 16kb plus any header information.

It is expected that the end point location for a message will be any URL. The router will find the realm that contains the target object and direct the message to that realm. For example, a target of crl://some.server/target/realm/object will be received by the realm container at crl://some.server/target/realm.

Another advantage to this model is that while it offers a lot of flexibility, it does so without putting any restriction on the actual protocol used by the Realm or target object. Different objects can use different protocols on the same transport.

The Colony Routing Protocol data packet is likely to contain the following parts:

Preamble – header signifying the colony routing protocol.
Version – version of the protocol. Major & minor version details.
Headers – Additional header data. Does message require response, etc.
Target Location – URI of the target location.
Data – data to be delivered to the realm. Max. size 16kb.
Digital Signature – Optional signature for message.
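The packet layout above can be sketched as an encoder. This is hypothetical: the marker bytes and field widths (length-prefixed target and data) are my own illustrative assumptions, and the headers and optional signature fields are omitted for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical encoder for the Colony Routing Protocol packet parts above.
class RoutingPacket {

    static byte[] encode(int major, int minor, String target, byte[] data) {
        if (data.length > 16 * 1024) {
            throw new IllegalArgumentException("data exceeds the 16kb TLS record limit");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write('C'); out.write('R');                               // preamble (assumed marker)
        out.write(major); out.write(minor);                           // version
        byte[] t = target.getBytes(StandardCharsets.US_ASCII);
        out.write(t.length >> 8); out.write(t.length & 0xFF);         // 16-bit target length
        out.write(t, 0, t.length);                                    // target location URI
        out.write(data.length >> 8); out.write(data.length & 0xFF);   // 16-bit data length
        out.write(data, 0, data.length);                              // payload for the realm
        return out.toByteArray();
    }
}
```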

An interesting element of this protocol design is that it does not dictate request/response semantics as used in REST/HTTP. A client can send a one way message to a target location. It is up to the semantics of the session layer protocol to decide if a response is required. The underlying system also puts no restriction on whether the client or server initiates the request.

Supporting server initiated requests requires that each client has a host and realm that will receive messages. As clients do not usually have a fully qualified host name, an alias must be given to the client connection on the server. This is where traditional HTTP browser paradigms do not provide enough flexibility and a peer based host naming system may be more appropriate.

Another requirement for the routing protocol is to allow the creation of routes between two realms/objects that are expected to last a long period. The advantage of this is that the target location and any digital signatures can be dropped, allowing less overhead per packet. This would be especially useful for voice or video communication protocols. It would also allow some of the benefits of SCTP to be exposed to the upper layer protocols. This requires that the communication layer keep session state. Keeping session state is not ideal at this layer of the protocol stack; however, it is unavoidable if this type of feature is to be supported. The state should be limited to link information, which is no different to the way a NAT router holds state.

The routing layer has many parallels with peer-to-peer naming and routing systems which have been documented previously. This area will need to be researched more thoroughly before locking in a specific solution.

Before finishing the subject of routing and naming it is important not to forget firewalls and NAT. This is the enemy of any peer to peer based system. There are various solutions to this problem which have been documented. A possible solution is to use Colony application servers which act as Internet routers and Proxies for clients behind firewalls. This is another area which will require further investigation.

Session (Discovery) Layer
The session layer will sit on the TLS layer to provide elements of a REST architectural style; specifically the client/server, stateless and uniform interface constraints. The aim of this layer is to provide a discovery, reflection and basic set of mechanisms to communicate with the target objects. The discovery aspect of the protocol layer will allow a client to discover what protocols can be used with the target object. The server could allow multiple presentation and application layer protocols to be provided, allowing a client to choose the most appropriate method to communicate.

The Uniform Interface would include at least the following types of messages:

GET - return the selected or default presentation of the target object. This may be a image or document file. It may also be code on demand.

META – return the associated protocols and presentations available for the target object. This could look something like:

(protocol:cache,stream data-type:image/jpg)
(protocol:dynamic-argot-dictionary application:colony-vm)

DATA – send data to the target object using a selected protocol.

The packet structure for the stream includes the following:

message type – GET, META or DATA. Most likely encoded as a byte.
Protocol type – the protocol type being sent.
Data – data to be passed to the target object.
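The packet structure above can be sketched as a small framing routine. The type codes below are assumptions for illustration; only the overall shape (type byte, length-prefixed protocol name, payload) comes from the text.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical framing for the session-layer packet structure above.
class SessionMessage {
    static final int GET = 1, META = 2, DATA = 3; // assumed byte codes

    static byte[] frame(int messageType, String protocolType, byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(messageType);                                  // message type byte
        byte[] p = protocolType.getBytes(StandardCharsets.US_ASCII);
        out.write(p.length);                                     // u8 length prefix
        out.write(p, 0, p.length);                               // protocol type name
        out.write(data, 0, data.length);                         // payload for the target object
        return out.toByteArray();
    }
}
```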

Depending on the protocol selected there may be multiple session, presentation and application layer options over this transport layer. These should be registered with the target object meta information. The client is able to select the most suitable method to communicate.

Application Level Protocols

The aim of this layered protocol design is to expose as many of the features of the underlying transport as possible to the upper layers of the protocol stack, and therefore to distributed applications. It is likely that a number of application level protocols would be developed for the most common situations; the most obvious being a stream based solution for serving static files. This would allow both client or server side cache mechanisms to be included. Message queue based protocols are another obvious possibility. Smart agent or mobile code, as already used by Colony, is another possible application layer protocol.

As the design exposes the underlying SCTP packet structure to the application a combination of these protocols are possible. Using SCTP out-of-order and non-guaranteed delivery options also allows video or voice data protocols to be established.

This is a very rough draft of the Colony Personal Browser model being developed, and of how the protocol would be designed to support it. It builds on concepts from REST, uses SCTP and includes TLS security based on the web of trust model.

The design provides the following features:

Peer to Peer – Application protocols can direct messages to any object using a URL style format that users can easily understand.
User to User – Applications & Documents can be developed which connect users to users in the same way instant messenger applications work.
User to Server – Applications can be developed which connect directly to server applications in the same way the traditional browser operates.
Code & Data on demand – Using the virtual machine for both data and code creates a highly flexible environment to build interactive documents or applications.
Flexible – Application protocols have full access to the flexibility of the SCTP protocol, allowing them to choose out of order or non guaranteed delivery. This allows video and voice applications to be built.
Packet Based – Application protocols must adhere to sending a maximum data packet size of 16kb to adhere to TLS. To send larger data packets higher layer protocols must be established.
Short Messages – Messages shorter than 16kb can be sent directly to another object.
Established Links – Long conversations between peers can be created over established links.
Reflection – Objects can publish meta data to describe the available protocols or data presentation formats available.
Simplicity – A user is able to still add a simple text URL into a browser address bar and have a default data representation returned. This ensures the simplicity of web browsers is retained.
Complexity – An application developer can select the best protocol design for the interaction of client and server.
Security – TLS provides Realm to Realm security on all links.
Identity – Where a user chooses, their identity can be made known to the server they are connecting to. This is perfect for peer to peer or applications where identity is required.

Obviously this is a rough outline and the details of all the components need to be fleshed out. There are numerous parts of the design which are not complete; hopefully I've detailed enough to provide a clear understanding of the direction I'm taking the Colony Personal Browser. If you've made it this far please leave a comment. Do you think the design is good, has gaping holes, or is just way too ambitious?

Thursday, April 23, 2009

Argot Versioning - Part 3 - Remote Type Negotiation

A key concept of Argot is that it allows a client and server to perform type agreement dynamically. Introducing meta data versioning creates a number of issues when performing type agreement. The following goes into the details of Argot remote type negotiation and investigates the issues and some possible solutions.

To understand the problems of dynamic type negotiation, the fundamental concept of Argot data encoding must be understood. Argot meta data definitions are a direct reflection of how the data is encoded for communications. This is the opposite of Abstract Syntax Notation (ASN.1), which defines the abstract meta data of a structure and then applies one of various encodings when the data is written.

For instance, I'll use the Address data type as an example:
(library.definition u8ascii:”address” u8ascii:”1.0”)
(sequence [
  (tag u8ascii:”street” (reference #u8ascii))
  (tag u8ascii:”suburb” (reference #u8ascii))
  (tag u8ascii:”state” (reference #u8ascii))
])
In the above data definition, the address structure has three fields; street, suburb and state, which are all defined as ASCII strings with a maximum length of 255 characters (u8ascii). Defining an instance of this in Argot would be:

(address street:”PO Box 4591” suburb:”Melbourne” state:”Victoria”)
If this instance was to be serialised for communications it would look as follows:
    0x0B “PO Box 4591” 0x09 “Melbourne” 0x08 “Victoria”
(note strings have not been changed to hex to ease readability)
Referring back to the address meta data you can see that this encoding is a sequence of three strings. The u8ascii type uses an unsigned 8-bit byte to specify the length of each string before the data. Other than these length bytes there is no other meta data embedded in the encoding. This format has a number of consequences for how Argot must read data from the stream. The most important requirement is that Argot must know exactly how many fields each structure contains and what data is coming next. The advantage is that Argot is able to use a very compact data format with little to no meta data in the data stream. The disadvantage is that the exact structure of the data must be known before it can be read. This requires that a client and server both agree on the data types being used for communication.
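The serialisation above can be reproduced in a few lines. The helper below writes each u8ascii value as an unsigned length byte followed by the ASCII bytes; the field order comes entirely from the meta data, not the stream. The class and method names are my own, for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Encodes the address example above using the u8ascii wire format.
class AddressEncoder {

    static void writeU8Ascii(ByteArrayOutputStream out, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.US_ASCII);
        if (bytes.length > 255) {
            throw new IllegalArgumentException("u8ascii limited to 255 bytes");
        }
        out.write(bytes.length);            // unsigned 8-bit length prefix
        out.write(bytes, 0, bytes.length);  // the string data, no other meta data
    }

    static byte[] encode(String street, String suburb, String state) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeU8Ascii(out, street);  // order must match the sequence definition
        writeU8Ascii(out, suburb);
        writeU8Ascii(out, state);
        return out.toByteArray();
    }
}
```

Encoding `("PO Box 4591", "Melbourne", "Victoria")` yields exactly the stream shown above: `0x0B` then the street bytes, `0x09` then the suburb bytes, `0x08` then the state bytes.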

Remote Dynamic Data Type Agreement

When communicating between client and server using Argot, information such as the Address instance above are identified using a 16bit identifier. This identifier is assigned dynamically to allow the communication channel to dynamically discover the data types it can communicate between client and server (as was discussed in the Versioning Part 1). In Argot's current form this negotiation is quite simple. It involves the following transactions.
  • Meta dictionary check – There is an initial call to the server which checks if the core meta dictionary data types are the same. These data types are assigned a common set of identifiers that both client and server must adhere to. This operates as a bootstrap mechanism for other types to be defined, allowing the meta data to be expanded to include new concepts for describing the data being transferred that were not previously part of Argot's core meta data.

  • Resolve Identifier – When a client needs to send a data type for the first time (one that is not in the meta dictionary list of types) it sends a “resolve” message to the server containing the type's name and structure. The server receives the message, finds the data type and decides if the data structures match. If they match, an identifier is assigned for that type on the channel; if they don't, an error is returned to the client and the message being sent must be aborted.

  • Reserve – In some circumstances a data type will have a cyclic reference to itself. Before the data type structure can be resolved using the above call, an identifier must be assigned so that the type's structure meta data can be written. The reserve call sends a message with a type name to the server. If the server has a data type with the same name it assigns an identifier for the client; if it does not, an error is returned to the client.

  • Resolve Reverse – When a server is responding to a request it may wish to send a client a data structure that has not been resolved by the client. Argot uses asynchronous request/response semantics for all calls. This means that the server is unable to initiate a request to the client to resolve the data structure. In this situation the server assigns an identifier and sends the data to the client. When the client reads an identifier it doesn't recognise, it makes a “resolve reverse” call to the server with the identifier. The server responds with the name and the data type meta data assigned to the identifier. If the client finds it has the same data structure it is able to continue reading the data. If the client does not find a match it must abort reading the data as it does not understand the data received.
These four calls work well in Argot without versioning. The client and server are able to check, for each and every type, whether the structures match. This includes the data type meta data.
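The resolve transaction above can be sketched as an in-memory server. This is a simplification for illustration: comparing structures as strings, and the identifier values, are my own assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the "resolve" call: the client submits a type name and
// structure, and an identifier is assigned only when the structures match.
class TypeNegotiationServer {
    private final Map<String, String> library = new HashMap<>();    // name -> structure
    private final Map<String, Integer> channelIds = new HashMap<>();
    private int nextId = 64; // ids below 64 assumed reserved for the meta dictionary

    void define(String name, String structure) { library.put(name, structure); }

    // Returns the assigned channel identifier, or -1 when structures differ
    // (the client must then abort the message being sent).
    int resolve(String name, String clientStructure) {
        String serverStructure = library.get(name);
        if (serverStructure == null || !serverStructure.equals(clientStructure)) {
            return -1;
        }
        return channelIds.computeIfAbsent(name, n -> nextId++);
    }
}
```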

Adding versioning into Argot requires a location to be used instead of a name for all of the above calls. To a large extent, this is all that is required. However, there is a problem at the protocol level centred around the concept of “resolve reverse”. As stated above, the server currently has one version of each data type. When responding with a previously unused data type it is able to respond with that single version of the data structure; the client either reads the data or doesn't. In a situation where the server has multiple versions of the data structure it needs to know which version to send to the client. In a protocol that uses asynchronous request/reply semantics, the server is unable to initiate a request to the client to ask which version to use.

Here's some ideas that were explored to solve this issue:
  • Place holder – The server could return a place holder for the data instead of the actual data. This would require that the client must find this place holder in the data stream and send a message to the server.

    This is not suitable as it puts a burden on the application to keep the instance data around to be encoded as required by the client. It also makes the Argot streaming interfaces very complex.

  • Second Channel – Require that the client hold open a second channel to the server. This allows the server to initiate requests such as this to the client. The client could close both channels when communications has completed.

    This is not suitable as TCP sessions are a scarce resource. Keeping open a communications channel for the chance of communications is not appropriate.

  • Pause Stream – Stop the current response and return a message in the stream asking the client to resolve the version required. The client would resolve the version and then return a message to the server asking it to continue with the selected version.

    This is not suitable as the client may be in the middle of reading any other data type. It may not be in a position to find the message in the stream.
A few possible workable solutions are:
  • Chunked Stream – Require that a response stream is broken up into chunks. The server fills up a chunk before sending the response to the client. If the server finds a data type that needs a version selected, the server can initiate a request to the client asking which version to use. As the stream is chunked the client is able to receive requests from the server interleaved with the response stream.

    The interesting part of this solution is it changes the underlying request/response semantics and opens up the stream to be a bidirectional group of channels. This aligns well with the asynchronous request/reply concept already stated in BORED (Binary Object REst Distributed system). Using a chunked stream is bringing the concepts of TCP up a layer to allow multiple communications to occur on the same channel. Allowing the server to initiate requests also creates a new set of opportunities and challenges. The chunked stream also has some similarities to SCTP, the protocol used in VOIP systems.

  • Pre-fetch – Require that the client know the type of data structures that will be returned by a request. The client must send a request to the server with all the data types that it could receive in the response.

    Ideally the client would send a group of data types to the server for data type and version negotiation. An issue with this is that some data types may need to be resolved before the group can be sent. To resolve this, the client can use a set group of reserved identifiers for the purpose of performing type resolution.
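The chunked-stream idea can be sketched with a small framing routine: every chunk carries a channel marker so a server-initiated version request can be interleaved with the response body. The marker values and 16-bit length field below are assumptions for illustration.

```java
import java.io.ByteArrayOutputStream;

// Sketch of chunked-stream framing allowing interleaved server requests.
class ChunkedStream {
    static final int RESPONSE = 0;        // part of the response body
    static final int SERVER_REQUEST = 1;  // server asking which version to use

    static byte[] chunk(int channel, byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(channel);               // channel marker byte
        out.write(body.length >> 8);      // 16-bit chunk length
        out.write(body.length & 0xFF);
        out.write(body, 0, body.length);
        return out.toByteArray();
    }
}
```

A client reading such a stream dispatches on the first byte of each chunk, so a `SERVER_REQUEST` can arrive between `RESPONSE` chunks without pausing the stream.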
Having two solutions only solves part of the problem. There are a few scenarios which should be catered for:
  • Negotiated Versioning – The discussion above revolves around the concept that the client and server need to negotiate the version of data they should use. This scenario would be most effective where a client and server have a communication which persists over a series of calls. This is especially true in environments that have multiple servers and clients with data that is evolving over time.

    In negotiated versioning the server has a set of versions for each data type. The client is able to select which version it would like to communicate with for each type.

  • Shared Versioning – Another possible scenario is that both the client and server agree to adhere to a version dictated by a separate server. In this scenario the server is able to send any data to the client as long as it adheres to the shared version. This may be appropriate for organisations wishing to centralise the data dictionary for better management purposes. It is also likely that the server will contain multiple versions of data types. The client would select a version of the dictionary, which would select the correct version of each data type; much like a version in traditional version control systems. This is required as clients and servers programmed for a particular version of a data type can not have the structure of that type changed unexpectedly.

    This scenario requires that the client or server select the server that will be used to select the data type version.

  • Server Dictated – In some cases the server may only have a single version of each data type. In this scenario there is no point attempting to negotiate the data types; the client must have the specific versions of data required by the server. This is the scenario that Argot currently uses for communications. It is most appropriate where deploying new servers is expensive and clients talk to many servers. The client must contain all data type versions required to communicate with each server.

    This case is also true for any form of message queueing or file based messages. The server will have no idea which version the client requires; if it has multiple versions available it will need to select the most appropriate version of each type for a given file. It is also true for embedded systems where the server will not have the resources or processing power to perform full negotiation of data types.

  • Client Pre-Selected – In some cases it is most appropriate for the client to pre-select the data type versions it expects to receive from the server. The client sends a group of data types and versions that the server should use for communications. This suits situations where the client is more expensive to deploy, or a server needs to communicate with multiple versions of a client.

    This method fits into the pre-fetch method above.

  • Version Controlled – In this scenario the stream includes a version selector for a group of data types. This is more in-line with how many systems currently operate. The disadvantage of this mechanism is that every data type must be pre-selected to be part of the group of a selected version.

If you can think of other methods of performing version agreement between systems, please let me know so that I can add them to the list.
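As a sketch, the scenarios above could be modeled as interchangeable strategies behind a single interface. The names below are hypothetical and are not part of Argot; they only illustrate how a connection might plug in a different version-selection policy per scenario.

```java
import java.util.Map;

public class VersionNegotiation {

    // A strategy answers one question: which version of a named
    // data type should this stream use?
    interface VersionStrategy {
        String selectVersion(String typeName);
    }

    // Server Dictated: the server holds exactly one version per type,
    // so the client simply accepts the server's choice.
    static class ServerDictated implements VersionStrategy {
        private final Map<String, String> serverVersions;
        ServerDictated(Map<String, String> serverVersions) {
            this.serverVersions = serverVersions;
        }
        public String selectVersion(String typeName) {
            return serverVersions.get(typeName);
        }
    }

    // Client Pre-Selected: the client fixes the versions up front and
    // sends that group to the server before communicating.
    static class ClientPreSelected implements VersionStrategy {
        private final Map<String, String> chosen;
        ClientPreSelected(Map<String, String> chosen) {
            this.chosen = chosen;
        }
        public String selectVersion(String typeName) {
            return chosen.get(typeName);
        }
    }

    public static void main(String[] args) {
        VersionStrategy s = new ServerDictated(Map.of("address", "1.2"));
        System.out.println(s.selectVersion("address")); // prints 1.2
    }
}
```

Shared Versioning and Version Controlled would slot in as further implementations of the same interface, differing only in where the version table comes from.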

Catering for the various types of version management in a single protocol is not a simple task. The protocol redesign should also align with the BORED protocol discussed in previous posts. The final requirement, in addition to those above, is to build security into the protocol. These issues will be explored in a later post.

Argot Programming Model

Another piece of the versioning puzzle is how the programmer's API changes for the most common functions of Argot. In the current system, Argot uses the concept of a TypeMap for mapping specific data types to a stream. Currently this does not include any version information. Some example code looks like:
    TypeMap map = new TypeMap( typeLibrary );
    map.map( 1, typeLibrary.getId( "u8" ));
    map.map( 2, typeLibrary.getId( "u8ascii" ));
In this example, the user is mapping local identifiers to data types in the type library. To support versioning, the developer would need to specify which version of u8 and u8ascii they wanted from the TypeLibrary.
    TypeMap map = new TypeMap( typeLibrary );
    map.map( 1, typeLibrary.getId( "u8", "1.0.0" ));
    map.map( 2, typeLibrary.getId( "u8ascii", "1.0.0" ));
The problem with this is that it re-introduces a specific version too early in the communications. The solution is the introduction of a TypeMapper interface which is passed into the TypeMap. The TypeMapper has the task of selecting which version of a data type is required at the time it is used. The user simply creates the TypeMap with the required TypeMapper. The TypeMap initialises the TypeMapper, which gives it a chance to map any required types. When a developer writes a data type that is not in the map, the TypeMapper is called to resolve which version of the type to use. Creating a type map now looks like:
    TypeMap map = new TypeMap( typeLibrary, new TypeMapperDynamic());
In this case a dynamic type mapper is being used to resolve the data types; it dynamically assigns identifiers to any types required by the type map. A stream is created and written using:
    typeStream = new TypeOutputStream( stream, map );
    typeStream.writeObject( "address", addressObject );
If we were to write a specific version, the API would change to:
    typeStream.writeObject( "address", "1.2", addressObject );
Once again this re-introduces a specific version too early. If the line above were on a server, any client or receiver of the data would be locked in to version 1.2 of the address type. For this reason the first example is how objects should be written to the stream; it requires that the TypeMapper select the correct version of the address type.

Different type mappers can be created to deal with the various styles of type negotiation listed above. After an id has been mapped, any use of that name will tie directly to the specified version.
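A minimal sketch of this lazy, first-use version resolution might look like the following. The class and method names are illustrative only and simplify the real TypeMap/TypeMapper API described above.

```java
import java.util.HashMap;
import java.util.Map;

public class LazyMapping {

    // Called the first time a name is used on a stream; decides which
    // version that name should be tied to from then on.
    interface TypeMapper {
        String resolveVersion(String typeName);
    }

    static class TypeMapSketch {
        private final Map<String, String> mapped = new HashMap<>();
        private final TypeMapper mapper;

        TypeMapSketch(TypeMapper mapper) { this.mapper = mapper; }

        // After the first lookup, the name is tied to one version for
        // the life of the map; later lookups return the same answer.
        String versionFor(String typeName) {
            return mapped.computeIfAbsent(typeName, mapper::resolveVersion);
        }
    }

    public static void main(String[] args) {
        // A "dynamic" mapper that simply picks the current version.
        TypeMapSketch map = new TypeMapSketch(name -> "1.3");
        System.out.println(map.versionFor("address")); // prints 1.3
        System.out.println(map.versionFor("address")); // still 1.3
    }
}
```

Each negotiation style listed earlier would become a different TypeMapper implementation, while the writing code stays unchanged.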

Meta Dictionary Update

Near the end of the last post I suggested a change to the meta dictionary which had the effect of only allowing a single version of any data type to be used on a stream. The requirement at the API level to use only the name, and not the version, confirms that this change matches the API. The meta dictionary has been updated to reflect this change; it changes very little in the actual meta dictionary.

A consequence of this change is that an additional request/response pair is required for the traditional method of performing type agreement. The first time a client wishes to use a type, it must send the server the name of the type without specifying the version. The server maps a specific version to the type and returns the mapped identifier, the location of the definition and the definition structure. The client is then able to check this against its local version. This method continues to use server-dictated versioning and is a temporary solution until a more advanced protocol can be devised.
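The extra round trip could be sketched as follows, with the server standing in as a simple in-memory dictionary. All names here are hypothetical rather than Argot's actual wire protocol; they only show the name-only request and the client-side check.

```java
import java.util.Map;

public class TypeAgreement {

    // What the server returns for a name-only request: the stream
    // identifier it mapped, the version it dictates, and the definition.
    static class Resolved {
        final int streamId;
        final String version;
        final String definition;
        Resolved(int streamId, String version, String definition) {
            this.streamId = streamId;
            this.version = version;
            this.definition = definition;
        }
    }

    // Stands in for the server side: one dictated version per name.
    static Resolved serverResolve(String name) {
        Map<String, Resolved> dictionary = Map.of(
            "empty", new Resolved(0, "1.3", "(meta.fixed_width size:0)"));
        return dictionary.get(name);
    }

    // Client-side check of the returned definition against its own copy.
    static boolean clientAccepts(Resolved r, String localDefinition) {
        return r != null && r.definition.equals(localDefinition);
    }

    public static void main(String[] args) {
        Resolved r = serverResolve("empty");
        System.out.println(clientAccepts(r, "(meta.fixed_width size:0)")); // prints true
    }
}
```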

Another small update to the meta dictionary is naming. Currently types in Argot are defined using a short ascii string (e.g. "meta.name"). In the meta dictionary this is defined as:
(library.definition u8ascii:"meta.name" meta.version:"1.3")
(meta.reference #u8ascii))
While the string implies that the name has groupings, each name is simply a unique string. As the number of objects in the TypeLibrary increases, it will become more difficult to find specific groups of types. The string approach also goes against a central concept of Argot: there's no need to define string-based expressions for encodings. The solution is to change the definition to:
(library.definition u8ascii:"meta.name_part" meta.version:"1.3")
(meta.reference #u8utf8))

(library.definition u8ascii:"meta.name" meta.version:"1.3")
(meta.array
(meta.reference #uint8)
(meta.reference #meta.name_part)))
This creates an array of name parts, which is a truer representation of the name and allows the TypeLibrary to build a hierarchy of types. For programmer simplicity a parser is still used for the text representation. However, it would still be possible to write one of the entries above as:
( [ u8utf8:"meta" u8utf8:"name_part" ] )
(meta.version major:1 minor:3))
(meta.reference #u8utf8))
Another small change is from u8ascii to u8utf8. This allows a wider variety of languages to be used for names. In the future I'll introduce a meta.alias type as an extension to the meta dictionary. This will allow different languages (e.g. Spanish, Japanese, etc.) to define their own names for data types while still keeping compatibility.
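Assuming names are split on the dot character, converting a dotted name into the array-of-parts form described above might be sketched as:

```java
import java.util.Arrays;
import java.util.List;

public class NameParts {

    // Split a dotted type name into its parts; the hierarchy falls out
    // of the parts, e.g. "meta.name_part" lives under the "meta" group.
    static List<String> toParts(String dottedName) {
        return Arrays.asList(dottedName.split("\\."));
    }

    public static void main(String[] args) {
        System.out.println(toParts("meta.name_part")); // prints [meta, name_part]
    }
}
```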

Object Relationships

Another issue to add to the list: how do you deal with the relationship between data types and data objects across multiple versions? The whole point of Argot is to provide a simple API that makes it easy to read and write data to and from data streams; in essence, to communicate knowledge between systems.

When binding a Java class to the TypeLibrary, is the name or the data type definition version used? If the same class can be used for all definitions, then binding to the name is appropriate. If different classes are required between versions, then binding to the definition is required, and the definitions should share a common interface or super class. If this is not the case then the developer must be very careful not to create data streams that intermix objects, as class cast exceptions become likely. This is another area which is not fully developed and will need to be explored. However, relative to the versioning-based meta data changes, this is a small task.
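The two binding choices can be sketched as two maps with different keys. The helper names are hypothetical; only the key type differs between the two options.

```java
import java.util.HashMap;
import java.util.Map;

public class ClassBinding {

    // Option 1: one class for every version of a named type.
    static final Map<String, Class<?>> byName = new HashMap<>();

    // Option 2: a class per (name, version) definition.
    static final Map<String, Class<?>> byDefinition = new HashMap<>();

    // A composite key for the per-definition binding.
    static String key(String name, String version) {
        return name + "#" + version;
    }

    public static void main(String[] args) {
        // Placeholder classes stand in for real bound Java types.
        byName.put("address", Object.class);
        byDefinition.put(key("address", "1.2"), Object.class);
        System.out.println(byDefinition.containsKey("address#1.2")); // prints true
    }
}
```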


The Argot library now supports versioning; however, there are still some loose ends to tidy up. Future posts will explore some of these loose ends. Version 1.3.0, which includes versioning, is currently being cleaned up and will be released in the next week or two.

Tuesday, April 14, 2009

Argot Versioning - Part 2 - Meta Data Naming

In this post I'll introduce the solution implemented for meta data versioning in Argot. It builds on the last post, which introduced some of the versioning issues. Some light reading to get the brain in gear after Easter.

During the development of the versioning feature, a very important aspect of the system has been modified. I found that every type definition needs more than a simple ascii string to define its name. Instead of a name, a location in the type library is defined. To explain further, it's best to cover some background and what this means for Argot.

To recap the last post: performing type negotiation between peers (client and server), or between application and file, previously required each data type definition to have a unique name. This caused issues, with various aspects of meta data requiring a name where one was not essential. This is because the basis of Argot is a single table which contains an identifier, a name and a definition.

Adding versioning into the meta data causes the single table to be broken up into multiple levels: each name in the table may have multiple definitions. The small table example given in the last post now expands to a much larger table, as shown in the table below.

Another example in Argot without versioning is that of abstract data types, which required multiple named definitions. A short example is:

meta.definition: meta.abstract();
meta.definition#basic: ( #meta.definition, #meta.basic );
meta.definition#map: ( #meta.definition, );

The three definitions are actually trying to represent the following:

This diagram represents three levels of the data type structure. The first entry defines the name (meta.definition). The second entry defines version 1.0 as being an abstract type. The third and fourth entries are relations to the version 1.0 definition and map the abstract type to other types.

Using the same naming mechanism to flatten this into a single table usable by Argot creates a group of ugly name strings:

id:10, name:"meta.definition" - ;
id:11, name:"meta.definition#v1.0" - meta.abstract;
id:12, name:"meta.definition#meta.basic#v1.0" - #meta.basic;
id:13, name: - ;

The solution to this is to replace each name with a location. The location is an abstract type that initially has three concrete location types. The first location type includes the name. The second is a version definition and includes the id of the name location and the version information. The third is a relation type and includes the id of a versioned definition (e.g. 11 in the list above) and a tag. The tag is a unique string used to uniquely identify the location. As in the flat table version of Argot, where every name must be unique, a location must also be unique; it must be possible to find any definition using just its location data.

The separation of location from definition is the key concept behind versioning in Argot. The location being an abstract type also means it can be extended to include any kind of location specifier. The location specifier replaces the name and provides a flexible method of specifying where a definition is placed in the meta data library.
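The three concrete location types described above can be sketched as a small class hierarchy. The class and field names below are illustrative, not Argot's actual types.

```java
public class Locations {

    // The abstract location type; further location types can extend it.
    abstract static class Location {}

    // 1. A bare name entry, e.g. "meta.definition".
    static class NameLocation extends Location {
        final String name;
        NameLocation(String name) { this.name = name; }
    }

    // 2. A versioned definition: the id of a name entry plus a version.
    static class DefinitionLocation extends Location {
        final int nameId;
        final String version;
        DefinitionLocation(int nameId, String version) {
            this.nameId = nameId;
            this.version = version;
        }
    }

    // 3. A relation: the id of a versioned definition plus a unique tag.
    static class RelationLocation extends Location {
        final int definitionId;
        final String tag;
        RelationLocation(int definitionId, String tag) {
            this.definitionId = definitionId;
            this.tag = tag;
        }
    }

    public static void main(String[] args) {
        // The meta.basic mapping from the table above: a relation on
        // the versioned definition with id 11.
        Location l = new RelationLocation(11, "meta.basic");
        System.out.println(((RelationLocation) l).tag); // prints meta.basic
    }
}
```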

An interesting aspect of the above is that there is often more information specifying where the data belongs than actual data. The abstract type "meta.definition" and its mapping definitions now look like:

// 1. define the name.
(meta.identity) )

// 2. define version 1.0 as abstract.
(dictionary.definition name:"meta.definition" version:"1.0")
(meta.abstract [])

// 3. map meta.basic to the abstract type.
(dictionary.definition name:"meta.definition" version:"1.0")
( #meta.basic))

// 4. map to the abstract type.
(dictionary.definition name:"meta.definition" version:"1.0")
Each entry in the above is in two parts: the location and the definition. This separation has also had other beneficial flow-on effects. In previous versions of Argot, information in the name string had to be replicated in the definition; in effect, the definition was being used to specify both location and definition information. By using a data structure in the location, this is no longer required. An example of this is the “” definition, which previously included both the abstract target and the mapping type; it now only includes the mapping.

The location information provides a mechanism that allows very flexible data structures to be defined in the data type library. This can be extended to define method signatures or other ways of defining protocol semantics. In effect it allows the type library to define a complex directed graph while still providing a flat, one-dimensional table structure so that each individual definition can be found.

Dictionary Text Format

An obvious change in the example above is that the syntax used to define a data type has also changed. The syntax is loosely based on LISP and provides a more flexible way of encoding the meta data in a text format.

Each parenthesised expression starts with the name of the data type; all subsequent elements are the data for that type. e.g.

(library.definition name:"empty" version:"1.3")

This is an instance of the “library.definition”(v1.3) data type. The library.definition is defined as follows:

(library.definition name:"library.definition" version:"1.3")
(meta.sequence [
(meta.tag name:"name" (meta.reference
(meta.tag name:"version" (meta.reference #meta.version))

This shows that each list shown in parentheses is actually a strict data structure.

The example also shows how to include simple type data. "name" and "version" are the names of the fields in the library.definition structure. Field names can be specified for both simple types and data structures. The value for each follows the colon. i.e.

"field name":"value" // not currently implemented
"field type":"value"
"field name":("data structure" … ) // not currently implemented

For all value types, the "field type" must provide a parser capable of parsing the value into an object used internally. In some cases a parser may be provided to parse a string into a complex internal structure; this is currently used for the meta.version type, which uses a MAJOR.MINOR string format.

The only other form is the array. Arrays are specified using square brackets. e.g.

[ element1 element2 element3 ]
"field name":[ element1 element2 element3 ] // not currently implemented
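As a sketch, the kind of parser the meta.version field type might register for its MAJOR.MINOR string form could look like the following. The class names are hypothetical; only the parsing rule comes from the text above.

```java
public class VersionParser {

    // The parsed internal form of a MAJOR.MINOR version string.
    static final class Version {
        final int major;
        final int minor;
        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }
    }

    // Parse "1.3" into major=1, minor=3; reject anything else.
    static Version parse(String text) {
        String[] parts = text.split("\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected MAJOR.MINOR: " + text);
        }
        return new Version(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    public static void main(String[] args) {
        Version v = parse("1.3");
        System.out.println(v.major + "." + v.minor); // prints 1.3
    }
}
```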

Meta Dictionary

The following is the full meta dictionary in its pre-compiled form. Every data type and structure used is defined in the meta dictionary; this provides the self-referencing base from which all elements are defined. It does not attempt to define all basic data types, only those required as part of the meta dictionary. The data structures in the meta dictionary are used later to define all other common data types in the common dictionary.

You might want to skip the meta dictionary definition unless you really want to give the brain a work out.

// 0. empty
(library.definition u8ascii:"empty" meta.version:"1.3")
(meta.fixed_width uint16:0
[ (meta.fixed_width.attribute.size uint16:0) ]))

// 1. uint8
(library.definition u8ascii:"uint8" meta.version:"1.3")
(meta.fixed_width uint16:8
[ (meta.fixed_width.attribute.size uint16:8)
(meta.fixed_width.attribute.bigendian) ] ))

// 2. uint16
(library.definition u8ascii:"uint16" meta.version:"1.3")
(meta.fixed_width uint16:16
[ (meta.fixed_width.attribute.size uint16:16)
(meta.fixed_width.attribute.bigendian) ] ))

// 3.
(library.definition u8ascii:"" meta.version:"1.3")
(meta.reference #uint16))

// 4.
(library.definition u8ascii:"" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"id" (meta.reference

// 5. meta.abstract
(library.definition u8ascii:"meta.abstract" meta.version:"1.3")
(meta.sequence [
(meta.reference #uint8)

// 6. u8ascii
(library.definition u8ascii:"u8ascii" meta.version:"1.3")
(meta.reference #uint8)
(meta.reference #uint8))

// 7.
(library.definition u8ascii:"" meta.version:"1.3")
(meta.reference #u8ascii))

// 8. meta.version
(library.definition u8ascii:"meta.version" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"major" (meta.reference #uint8))
(meta.tag u8ascii:"minor" (meta.reference #uint8))

// 9. meta.definition
(library.definition u8ascii:"meta.definition" meta.version:"1.3")
(meta.abstract [
( #meta.fixed_width)
( #meta.abstract)
( #meta.expression)
( #meta.identity)

// 10. meta.identity
(library.definition u8ascii:"meta.identity" meta.version:"1.3")
(meta.sequence [

// 11. meta.expression
(library.definition u8ascii:"meta.expression" meta.version:"1.3")
(meta.abstract [
( #meta.reference)
( #meta.tag)
( #meta.sequence)
( #meta.array)
( #meta.envelop)
( #meta.encoding)

// 12. meta.reference
(library.definition u8ascii:"meta.reference" meta.version:"1.3")
(meta.sequence [(meta.reference]))

// 13. meta.tag
(library.definition u8ascii:"meta.tag" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"name"
(meta.reference #u8ascii))
(meta.tag u8ascii:"data"
(meta.reference #meta.expression))]))

// 14. meta.sequence
(library.definition u8ascii:"meta.sequence" meta.version:"1.3")
(meta.reference #uint8)
(meta.reference #meta.expression)))

// 15. meta.array
(library.definition u8ascii:"meta.array" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"size" (meta.reference #meta.expression))
(meta.tag u8ascii:"data" (meta.reference #meta.expression))]))

// 16. meta.envelop
(library.definition u8ascii:"meta.envelop" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"size"
(meta.reference #meta.expression))
(meta.tag u8ascii:"type"
(meta.reference #meta.expression)) ]))

// 17. meta.encoding
(library.definition u8ascii:"meta.encoding" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"data" (meta.reference #meta.expression))
(meta.tag u8ascii:"encoding" (meta.reference #u8ascii))]))

// 18. meta.fixed_width
(library.definition u8ascii:"meta.fixed_width" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"size" (meta.reference #uint16))
(meta.tag u8ascii:"flags"
(meta.reference #uint8)
(meta.reference #meta.fixed_width.attribute)))]))

// 19. meta.fixed_width.attribute
(library.definition u8ascii:"meta.fixed_width.attribute" meta.version:"1.3")
(meta.abstract [
( #meta.fixed_width.attribute.size)
( #meta.fixed_width.attribute.integer)
( #meta.fixed_width.attribute.unsigned)
( #meta.fixed_width.attribute.bigendian)

// 20. meta.fixed_width.attribute.size
(library.definition u8ascii:"meta.fixed_width.attribute.size" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"size" (meta.reference #uint16))

// 21. meta.fixed_width.attribute.integer
(library.definition u8ascii:"meta.fixed_width.attribute.integer" meta.version:"1.3")
(meta.sequence []))

// 22. meta.fixed_width.attribute.unsigned
(library.definition u8ascii:"meta.fixed_width.attribute.unsigned" meta.version:"1.3")
(meta.sequence []))

// 23. meta.fixed_width.attribute.bigendian
(library.definition u8ascii:"meta.fixed_width.attribute.bigendian" meta.version:"1.3")
(meta.sequence []))
// 24.
(library.definition u8ascii:"" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"name" (meta.reference

// 25. dictionary.definition
(library.definition u8ascii:"dictionary.definition" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"id" (meta.reference
(meta.tag u8ascii:"version" (meta.reference #meta.version))

// 26. dictionary.relation
(library.definition u8ascii:"dictionary.relation" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"id" (meta.reference

// 27. dictionary.location
(library.definition u8ascii:"dictionary.location" meta.version:"1.3")
(meta.abstract [
( #dictionary.definition)
( #dictionary.relation)

// 28. dictionary.definition.envelop
(meta.reference #uint16)
(meta.reference #meta.definition)))

// 29. dictionary.entry
(library.definition u8ascii:"dictionary.entry" meta.version:"1.3")
(meta.sequence [
(meta.tag u8ascii:"id"
(meta.tag u8ascii:"location"
(meta.reference #dictionary.location))
(meta.tag u8ascii:"definition"
(meta.reference #meta.definition.envelop))]))

// 30. dictionary.entry.list
(library.definition u8ascii:"dictionary.entry.list" meta.version:"1.3")
(meta.reference #uint16)
(meta.reference #dictionary.entry )))

Library types

These types are only used for the pre-compiled definitions and are used by the compiler. They are kept separate from the meta dictionary. These are required so that a user does not need to define identifiers for each type and keep track of which entry is defined by which identifier.

// library.entry
(library.definition name:"library.entry" meta.version:"1.3")
(meta.sequence [
(meta.tag "location" (meta.reference #library.location)
(meta.tag "definition" (meta.reference #meta.definition)

// library.location
(library.definition name:"library.location" meta.version:"1.3")
(meta.abstract [
( #library.definition)

// library.definition
(library.definition name:"library.definition" meta.version:"1.3")
(meta.sequence [
(meta.tag "name" (meta.reference
(meta.tag "version" (meta.reference #meta.version))

Multiple Versions Per Stream

Each of the entries in the meta dictionary is compiled into two dictionary entries. For example the “empty” data type is defined by the following two dictionary entries:

( name:"empty")

(dictionary.definition #empty meta.version:"1.3")
(meta.fixed_width size:0
[ (meta.fixed_width.attribute.size size:0) ]))

This fits the versioning model of Argot. The first entry simply defines the name ("empty"), while the second entry defines version 1.3 of the "empty" type. This mimics the internal representation of the type library. A question I am yet to resolve: should this also be the external representation? Another solution for the external representation combines the two entries:

(dictionary.definition u8ascii:"empty" meta.version:"1.3")
(meta.fixed_width size:0
[ (meta.fixed_width.attribute.size size:0) ]))

The advantage of this is that it reduces the data size of the dictionary. A consequence is that the meta identifier is the same for both the name and the specific meta data version (in this case 1.3). Therefore only one version of a data type can be used in any individual communication or stream. This is possibly an advantage, as the constraint creates a communications environment that is easier to debug and program. It also allows a simpler API to be developed, which only needs to map each named type to a single version. The disadvantage is that it reduces the flexibility of the communications environment; there may be situations where multiple versions of the same type need to be communicated in the one stream.
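The one-version-per-name constraint that falls out of the combined entry form could be enforced with a map that rejects a conflicting second binding. This is a hypothetical sketch, not Argot's implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class SingleVersionMap {

    // name -> the single version allowed on this stream.
    private final Map<String, String> versions = new HashMap<>();

    // Bind a name to a version. Returns false if the name is already
    // bound to a different version on this stream.
    boolean bind(String name, String version) {
        String existing = versions.putIfAbsent(name, version);
        return existing == null || existing.equals(version);
    }

    public static void main(String[] args) {
        SingleVersionMap map = new SingleVersionMap();
        System.out.println(map.bind("empty", "1.3")); // prints true
        System.out.println(map.bind("empty", "1.2")); // prints false
    }
}
```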

Nearly Full Circle

Another adaptation of the above is to remove the version from the definition. e.g.

( u8ascii:"empty")
(meta.fixed_width size:0
[ (meta.fixed_width.attribute.size size:0) ]))

Removing the version information requires that each definition be used to create a unique signature; the signature becomes the version data used to match particular versions. This method is very close to the original method of defining data, however the location (instead of a name) is still required. The location type allows the relation location type to be used for abstract types and other types that are defined using multiple entries. The disadvantage of removing the version data is that it requires a more complex library and doesn't allow any form of ordering between versions. For this reason it won't be used.


The solution implemented for versioning meta data in Argot provides a new approach to this difficult problem. The concept of using a location in a directed graph allows any graph to be built and partially compared. In the next post I'll explore remote data type negotiation and show how versioning adds new complexities.