[infinispan-dev] HotRod: Digging further into the encoding of data
Galder Zamarreno
galder at jboss.org
Thu Feb 18 10:26:43 EST 2010
Hi,
I've been looking again at the encoding of data in Hot Rod
(http://community.jboss.org/wiki/HotRodProtocol) and there are a few things
that I'm either not too happy about or that are not totally clear:
1. The type appears to be pretty wasteful because the majority of types
cannot be combined with others. For example, you cannot have a type that is
Boolean and Long at the same time, or Double and Character. However, you
can potentially have an Array of Serialized. So I propose instead
separating the encoding into a meta type and a type.
Meta types would be: Array, Map, Primitive and Compressed. The meta type
would be encoded using bit ops, so you could combine them in different ways,
e.g. Array of primitives. I don't think we should support combinations of
different collections, e.g. Map of Array, or Array of Maps. It would add
complexity and I don't foresee an immediate requirement for this.
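A minimal sketch of how the meta type could work as bit flags. The bit values and the `is_valid` helper are assumptions for illustration; nothing above assigns actual bit positions.

```python
from enum import IntFlag


class MetaType(IntFlag):
    # Bit values are hypothetical; the proposal only says "bit ops".
    ARRAY = 0x01
    MAP = 0x02
    PRIMITIVE = 0x04
    COMPRESSED = 0x08


def is_valid(meta: MetaType) -> bool:
    """Reject meta types that mix two collection kinds (Array and Map)."""
    both_collections = MetaType.ARRAY | MetaType.MAP
    return (meta & both_collections) != both_collections


# Array of primitives: two flags combined with a bitwise OR.
array_of_primitives = MetaType.ARRAY | MetaType.PRIMITIVE
```

With these hypothetical values, `array_of_primitives` is accepted by `is_valid`, while `MetaType.ARRAY | MetaType.MAP` is rejected.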
Type would be: Byte, Boolean, Character, String, Date, Double, Float,
Integer, Long, Short, Serialized, StringBuilder, and Any. These would be
literals from 1 to N. Note that I've added Any to distinguish between two
different kinds of collection. For example, if you send an Array of String, each
individual element just follows together with its size. However, if you
send an Array of Any, each individual entry must define its type.
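To make the role of Any concrete, here is a rough sketch. The numbering simply follows the order in which the types are listed above, which is an assumption, and `element_layout` is a hypothetical helper rather than part of any spec.

```python
from enum import IntEnum


class Type(IntEnum):
    # Literals 1..N, numbered in the order listed (an assumption).
    BYTE = 1
    BOOLEAN = 2
    CHARACTER = 3
    STRING = 4
    DATE = 5
    DOUBLE = 6
    FLOAT = 7
    INTEGER = 8
    LONG = 9
    SHORT = 10
    SERIALIZED = 11
    STRING_BUILDER = 12
    ANY = 13


def element_layout(declared: Type) -> list:
    """Fields carried by each element of an array declared with this type."""
    if declared is Type.ANY:
        # Mixed array: every element has to announce its own type.
        return ["type", "size", "data"]
    # Homogeneous array: the type is known from the header.
    return ["size", "data"]
```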
For Maps, we've got two options. First, make no type assumptions and let
each key/value define its own type. Second, allow map meta-type definitions
to be followed by not one but two type fields. Even if the map was of mixed
types, you could have Any, Any. My preference is for the latter.
Both type and metatype would be variable length integers.
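As an illustration of a variable length integer, here is a sketch that assumes the common unsigned scheme of 7 data bits per byte, least significant group first, with the high bit set on every byte except the last. The exact scheme is not stated above, so treat `encode_vint` and `decode_vint` as hypothetical.

```python
def encode_vint(value: int) -> bytes:
    """Encode a non-negative int, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("only unsigned values are supported")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)  # last byte: high bit clear
            return bytes(out)


def decode_vint(buf: bytes, offset: int = 0):
    """Return (value, next_offset) for the vint starting at offset."""
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7
```

Under this scheme any type or meta type below 128 costs a single byte on the wire.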
2. Serialized will be stored as byte[] internally; no attempt at
unmarshalling will be made in Hot Rod. Clients decide how they want to
marshall these Serialized types. They just need to give us a byte[] and
its length.
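A toy sketch of that split in responsibilities. `OpaqueStore` is a hypothetical stand-in for the server side, and JSON is just one marshalling choice a client might make; neither comes from Hot Rod itself.

```python
import json


class OpaqueStore:
    """Server-side stand-in: stores bytes and never inspects them."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[bytes(key)] = bytes(value)

    def get(self, key: bytes) -> bytes:
        return self._data[bytes(key)]


# The client picks its own marshalling; the store only ever sees bytes.
store = OpaqueStore()
payload = json.dumps({"count": 42}).encode("utf-8")
store.put(b"entry:1", payload)
restored = json.loads(store.get(b"entry:1").decode("utf-8"))
```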
3. To clarify something that Alex mentioned in the previous encoding data
email, Arrays and Maps are followed by the number of items in Hot Rod and
not the number of bytes. In case of Arrays of Any, each individual field
gives us its size, type and the data. In case of Array of Booleans, each
individual field comes with size and data. Size could have been optional
in each field of an Array of Boolean, but always including it simplifies
dealing with Array of Serialized, where each individual field is a byte[]
of arbitrary length.
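The array layout described here can be sketched roughly as follows. The 7-bit `vint` helper, the type ids and the field order type, size, data for Any elements are all assumptions made for the example, and the meta type/type header that would precede the array is left out.

```python
def vint(n: int) -> bytes:
    """Unsigned variable-length int, 7 bits per byte (assumed scheme)."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


# Hypothetical type ids, used only for this example.
TYPE_BOOLEAN = 2
TYPE_STRING = 4


def encode_typed_array(items) -> bytes:
    """items: list of bytes. Item count first, then size+data per element."""
    body = vint(len(items))  # number of items, not number of bytes
    for data in items:
        body += vint(len(data)) + data
    return body


def encode_any_array(items) -> bytes:
    """items: list of (type_id, bytes). Each element adds its own type."""
    body = vint(len(items))
    for type_id, data in items:
        body += vint(type_id) + vint(len(data)) + data
    return body
```

An Array of Boolean holding true and false would go through `encode_typed_array([b"\x01", b"\x00"])`, with a one-byte size in front of each one-byte value.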
For Maps, a similar thing happens. If we have a Map of String, Boolean, we
get key/value pairs like: k=[size+data],v=[size+data]. If it's a Map of Any,
Any, we get k=[type+size+data],v=[type+size+data].
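A rough sketch of the two map layouts. It assumes that in a Map of Any, Any both the key and the value carry a type tag, as the Array of Any rule suggests; the `vint` helper and the type ids in the usage are likewise assumptions, and the header with its two type fields is left out.

```python
def vint(n: int) -> bytes:
    """Unsigned variable-length int, 7 bits per byte (assumed scheme)."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def encode_typed_map(pairs) -> bytes:
    """pairs: list of (key_bytes, value_bytes); types come from the header."""
    body = vint(len(pairs))  # number of entries, not number of bytes
    for key, value in pairs:
        body += vint(len(key)) + key  # k=[size+data]
        body += vint(len(value)) + value  # v=[size+data]
    return body


def encode_any_map(pairs) -> bytes:
    """pairs: list of ((key_type, key_bytes), (value_type, value_bytes))."""
    body = vint(len(pairs))
    for (ktype, key), (vtype, value) in pairs:
        body += vint(ktype) + vint(len(key)) + key  # k=[type+size+data]
        body += vint(vtype) + vint(len(value)) + value  # v=[type+size+data]
    return body
```

For example, `encode_any_map([((4, b"k"), (2, b"\x01"))])` writes one entry whose key is tagged with hypothetical type id 4 and whose value is tagged with 2.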
Thoughts?
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache