[infinispan-dev] HotRod: Digging further into the encoding of data

Galder Zamarreno galder at jboss.org
Thu Feb 18 10:26:43 EST 2010


Hi,

I've been looking again at the encoding of data in Hot Rod
(http://community.jboss.org/wiki/HotRodProtocol) and there are a few
things that I'm not too happy about or that are not totally clear:

1. The type field appears to be pretty wasteful because the majority of
types cannot be combined with others. For example, you cannot have a type
that is Boolean and Long at the same time, or Double and Character.
However, you can potentially have an Array of Serialized. So, I propose
instead separating between a meta type and a type.

Meta types would be: Array, Map, Primitive and Compressed. The meta type
would be encoded using bit ops, so you could combine them in different
ways, e.g. Array of primitives. I don't think we should support
combinations of different collections, e.g. Map of Array, or Array of
Maps. It would add complexity and I don't foresee an immediate
requirement for this.

Type would be: Byte, Boolean, Character, String, Date, Double, Float,
Integer, Long, Short, Serialized, StringBuilder, and Any. These would be
literals from 1 to N. Note that I've added Any to distinguish between two
different kinds of collections. For example, if you send an Array of
String, each individual element just follows together with its size.
However, if you send an Array of Any, each individual entry must define
its type.

For Maps, we've got two options: first, make no type assumptions and let
each key/value define its own type. Or, allow a Map meta type to be
followed by not one but two type fields (key type and value type). Even
if the map was of mixed types, you could have Any, Any. My preference is
for the latter.

Both type and metatype would be variable length integers.
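
As a rough illustration of what a variable length integer could look
like, the sketch below assumes the common scheme of 7 data bits per byte
with the high bit used as a continuation flag. This email does not say
which scheme is meant, so treat the layout and the VarInt class name as
assumptions.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: 7 bits of payload per byte, least significant group
// first, high bit set on every byte except the last one.
final class VarInt {
    private VarInt() {}

    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values not supported");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // more bytes follow
            value >>>= 7;
        }
        out.write(value);                     // last byte, high bit clear
        return out.toByteArray();
    }

    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated variable length integer");
    }
}
```

Under this scheme any type or meta type literal below 128 fits in a
single byte, so the small literals proposed above would cost one byte
each.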

2. Serialized will be stored as byte[] internally; no attempt at
unmarshalling will be done in Hot Rod. Clients decide how they want to
marshall these Serialized types. They just need to give us a byte[] and
its length.

3. To clarify something that Alex mentioned in the previous encoding data
email, Arrays and Maps are followed by the number of items in Hot Rod and
not the number of bytes. In the case of an Array of Any, each individual
field gives us its size, type and the data. In the case of an Array of
Boolean, each individual field comes with size and data. Size could have
been optional in each field of an Array of Boolean, but always including
it simplifies dealing with an Array of Serialized, where each individual
field is a byte[] of arbitrary length.
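
The array layout described above can be sketched as follows. Several
things here are assumptions: the numeric codes (types numbered in the
order they are listed earlier in this email), the header order of meta
type then type, and the use of variable length integers for the item
count and the sizes, which this email only states for type and meta type.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed array layout; all codes are made up.
final class ArrayEncoding {
    static final int META_ARRAY   = 1 << 1;
    static final int TYPE_BOOLEAN = 2;
    static final int TYPE_STRING  = 4;
    static final int TYPE_ANY     = 13;

    private ArrayEncoding() {}

    // Assumed scheme: 7 bits per byte, high bit means "more bytes follow".
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Array of String: header, item count, then [size + data] per element.
    static byte[] encodeStringArray(String[] items) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, META_ARRAY);
        writeVInt(out, TYPE_STRING);
        writeVInt(out, items.length);       // number of items, not bytes
        for (String item : items) {
            byte[] data = item.getBytes(StandardCharsets.UTF_8);
            writeVInt(out, data.length);
            out.write(data, 0, data.length);
        }
        return out.toByteArray();
    }

    // Array of Any: header, item count, then [size + type + data] per
    // element, in the order the paragraph above lists the fields.
    static byte[] encodeAnyArray(int[] types, byte[][] payloads) {
        if (types.length != payloads.length) {
            throw new IllegalArgumentException("one type per payload required");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, META_ARRAY);
        writeVInt(out, TYPE_ANY);
        writeVInt(out, payloads.length);
        for (int i = 0; i < payloads.length; i++) {
            writeVInt(out, payloads[i].length);
            writeVInt(out, types[i]);
            out.write(payloads[i], 0, payloads[i].length);
        }
        return out.toByteArray();
    }
}
```

The only per-element difference between the two methods is the extra
type field, which is the cost of using Any.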

For Maps, a similar thing happens. If we have a Map of String, Boolean, we
get key/value pairs like: k=[size+data],v=[size+data]. If it's a Map of
Any, Any, we get k=[type+size+data],v=[type+size+data].
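
For the typed case, a Map of String, Boolean could be written out roughly
like this. As before, the numeric codes, the header order (meta type
followed by the two type fields), the one-byte Boolean payload and the use
of variable length integers for counts and sizes are assumptions made for
the sake of the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical sketch of a Map of String, Boolean; all codes are made up.
final class MapEncoding {
    static final int META_MAP     = 1 << 2;
    static final int TYPE_BOOLEAN = 2;
    static final int TYPE_STRING  = 4;

    private MapEncoding() {}

    // Assumed scheme: 7 bits per byte, high bit means "more bytes follow".
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Writes one [size + data] field.
    static void writeField(ByteArrayOutputStream out, byte[] data) {
        writeVInt(out, data.length);
        out.write(data, 0, data.length);
    }

    static byte[] encode(Map<String, Boolean> map) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, META_MAP);
        writeVInt(out, TYPE_STRING);   // key type
        writeVInt(out, TYPE_BOOLEAN);  // value type
        writeVInt(out, map.size());    // number of entries, not bytes
        for (Map.Entry<String, Boolean> entry : map.entrySet()) {
            writeField(out, entry.getKey().getBytes(StandardCharsets.UTF_8));
            writeField(out, new byte[] { (byte) (entry.getValue() ? 1 : 0) });
        }
        return out.toByteArray();
    }
}
```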

Thoughts?
-- 
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



