[infinispan-dev] HotRod: Digging further into the encoding of data

Manik Surtani manik at jboss.org
Thu Feb 18 11:11:49 EST 2010


On 18 Feb 2010, at 15:51, Galder Zamarreno wrote:

> That's indeed an option, but then debugging from logs could be quite a
> nightmare if all we store in the cache is a byte[] and there is no readable
> representation of it. This would also be a problem if in the future we
> decided to allow some tool to inspect the contents of the cache.
> 
> The idea behind the encoding is to at least have a way to encode the
> simplest objects that can be passed: primitives, Strings and
> collections of these. Obviously, if users simply want to store a blob,
> e.g. an image, then that's a byte[] and nothing can be done about that.
> 
> I agree that maybe some stuff can be simplified. We could simply support  
> primitives and Strings and treat the rest as a byte[].

That makes more sense, although you still then have the issue of collections and arrays.  Personally, I'd be in favour of just using byte[]s, unless we have a really good argument against this.  No point in reimplementing a marshalling layer and object encoding for HotRod.

Cheers
Manik

> 
> Thoughts?
> 
> On Thu, 18 Feb 2010 16:40:32 +0100, Manik Surtani <manik at jboss.org> wrote:
> 
>> Not sure I understand.  Why does this need to be as complex as this?   
>> When you say encoding of data, surely all this data is, is a byte[]?
>> 
>> On 18 Feb 2010, at 15:26, Galder Zamarreno wrote:
>> 
>>> Hi,
>>> 
>>> I've been looking again at the encoding of data in Hot Rod
>>> (http://community.jboss.org/wiki/HotRodProtocol) and there are a few
>>> things I'm not too happy about, or that are not totally clear:
>>> 
>>> 1. The type appears to be pretty wasteful because the majority of types
>>> cannot be combined with others. For example, you cannot have a type that
>>> is Boolean and Long at the same time, or Double and Character. However,
>>> you can potentially have an Array of Serialized. So, I propose instead
>>> separating it into a meta type and a type.
>>> 
>>> Meta types would be: Array, Map, Primitive and Compressed. The meta type
>>> would be encoded using bit ops, so you could combine them in different
>>> ways, e.g. Array of primitives. I don't think we should support
>>> combinations of different collections, i.e. Map of Array, or Array of
>>> Maps. It would add complexity and I don't foresee an immediate
>>> requirement for this.
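
A minimal sketch of how bit-flag meta types could be combined, under assumptions: the bit values and the names MetaTypeSketch, combine and has are hypothetical, since the proposal does not fix any concrete assignment. It only illustrates the idea of OR-ing flags together while rejecting the Array-plus-Map combination that the text rules out.

```java
// Hypothetical bit assignments; the proposal above does not define concrete values.
final class MetaTypeSketch {
    static final int PRIMITIVE  = 1;      // 0001
    static final int ARRAY      = 1 << 1; // 0010
    static final int MAP        = 1 << 2; // 0100
    static final int COMPRESSED = 1 << 3; // 1000

    // OR the flags together, rejecting nested-collection combinations.
    static int combine(int... flags) {
        int meta = 0;
        for (int flag : flags) {
            meta |= flag;
        }
        if ((meta & ARRAY) != 0 && (meta & MAP) != 0) {
            throw new IllegalArgumentException("Array and Map cannot be combined");
        }
        return meta;
    }

    // True if the given flag is set in the meta type.
    static boolean has(int meta, int flag) {
        return (meta & flag) != 0;
    }

    public static void main(String[] args) {
        int arrayOfPrimitives = combine(ARRAY, PRIMITIVE);
        System.out.println(Integer.toBinaryString(arrayOfPrimitives)); // prints 11
    }
}
```

With four flags, the whole meta type fits in a single byte, which is one reason a bit-mask reads naturally here.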
>>> 
>>> Type would be: Byte, Boolean, Character, String, Date, Double, Float,
>>> Integer, Long, Short, Serialized, StringBuilder, and Any. These would be
>>> literals from 1 to N. Note that I've added Any to distinguish between two
>>> kinds of collections. For example, if you send an Array of String, each
>>> individual element just follows together with its size. However, if you
>>> send an Array of Any, each individual entry must define its type.
>>> 
>>> For Maps, we've got two options: first, make no type assumptions and let
>>> each key/value define its own type. Or, allow map meta-type definitions
>>> to be followed by not one but two type fields. Even if the map was of
>>> mixed types, you could have Any, Any. My preference is for the latter.
>>> 
>>> Both type and metatype would be variable length integers.
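
For illustration, a variable-length integer could be written as below. This is a sketch assuming the common scheme of 7 payload bits per byte with the high bit as a continuation marker, least significant group first; the thread itself does not spell out the format, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Sketch of a 7-bits-per-byte variable-length integer (assumed format).
final class VIntSketch {
    // Encode a non-negative int, least significant 7-bit group first.
    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values not handled in this sketch");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
            value >>>= 7;
        }
        out.write(value); // final byte, high bit clear
        return out.toByteArray();
    }

    // Decode a single variable-length integer from the start of the array.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of this integer
            }
            shift += 7;
        }
        return result;
    }
}
```

Small values such as type literals from 1 to N and typical meta-type masks fit in one byte under this scheme, so making both fields variable length costs nothing in the common case.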
>>> 
>>> 2. Serialized will be stored as byte[] internally; no attempt at
>>> unmarshalling will be made in Hot Rod. Clients decide how they want to
>>> marshal these Serialized types. They just need to give us a byte[] and
>>> its length.
>>> 
>>> 3. To clarify something that Alex mentioned in the previous encoding data
>>> email, Arrays and Maps are followed by the number of items in Hot Rod, and
>>> not the number of bytes. In the case of Arrays of Any, each individual
>>> field gives us its size, type and the data. In the case of Array of
>>> Booleans, each individual field comes with size and data. Size could have
>>> been optional in each field of an Array of Boolean, but keeping it
>>> simplifies dealing with Array of Serialized, where each individual field
>>> is a byte[] of arbitrary length.
>>> 
>>> For Maps, a similar thing happens. If we have a Map of String, Boolean, we
>>> get key/value pairs like: k=[size+data],v=[size+data]. If it's a Map of
>>> Any, Any, we get k=[type+size+data],v=[type+size+data]
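
The element layouts described in point 3 could look roughly like this for arrays. It is a sketch under stated assumptions: the type ids, the UTF-8 string encoding, the field order type, size, data (taken from the map example) and all names are hypothetical, and the leading meta type and type header is left out so that only the item count and the per-element fields are shown.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Sketch of the per-element layout; ids and names are illustrative only.
final class ArrayLayoutSketch {
    static final int TYPE_STRING = 4;      // hypothetical type id
    static final int TYPE_SERIALIZED = 11; // hypothetical type id

    // Assumed 7-bits-per-byte variable-length integer, for non-negative values.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Array of String: item count, then [size+data] per element.
    static byte[] arrayOfString(String... items) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, items.length); // number of items, not number of bytes
        for (String item : items) {
            byte[] data = item.getBytes(StandardCharsets.UTF_8);
            writeVInt(out, data.length);
            out.write(data, 0, data.length);
        }
        return out.toByteArray();
    }

    // Array of Any: item count, then [type+size+data] per element.
    static byte[] arrayOfAny(int[] types, byte[][] payloads) {
        if (types.length != payloads.length) {
            throw new IllegalArgumentException("one type per payload required");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, payloads.length);
        for (int i = 0; i < payloads.length; i++) {
            writeVInt(out, types[i]);
            writeVInt(out, payloads[i].length);
            out.write(payloads[i], 0, payloads[i].length);
        }
        return out.toByteArray();
    }
}
```

Because every element carries its own size, a reader can skip over a Serialized entry without understanding its contents, which is the simplification point 3 argues for.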
>> 
>> 
>> --
>> Manik Surtani
>> manik at jboss.org
>> Lead, Infinispan
>> Lead, JBoss Cache
>> http://www.infinispan.org
>> http://www.jbosscache.org
>> 
>> 
>> 
>> 
> 
> 
> -- 
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache
> 

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org







