[infinispan-dev] HotRod: Digging further into the encoding of data

Manik Surtani manik at jboss.org
Thu Feb 18 12:07:01 EST 2010


On 18 Feb 2010, at 16:39, Galder Zamarreno wrote:

> On Thu, 18 Feb 2010 17:11:49 +0100, Manik Surtani <manik at jboss.org> wrote:
> 
>> 
>> On 18 Feb 2010, at 15:51, Galder Zamarreno wrote:
>> 
>>> That's indeed an option but then debugging of logs could be quite a
>>> nightmare if all we store in the cache is a byte[] and there is no
>>> readable representation for it. This would also be a problem if in the
>>> future we decided to allow some tool to inspect the contents of the cache.
>>> 
>>> The idea behind the encoding is to at least have a way to encode the
>>> simplest objects that can be passed: primitives and Strings and
>>> collections of these. Obviously, if users want to simply store a blob,
>>> e.g. an image, then that's a byte[] and nothing can be done about that.
>>> 
>>> I agree that maybe some stuff can be simplified. We could simply support
>>> primitives and Strings and treat the rest as a byte[].
>> 
>> That makes more sense, although you still then have the issue of  
>> collections and arrays.
> 
> What issue exactly? Anyone sending collections or arrays needs to marshall  
> them into a byte[] and pass that. I don't see a problem. Mircea, maybe you  
> have thought of how you expect to present this to the clients?

Well, if you want to store strings and primitives as they are for debugging purposes, then you lose that benefit the moment someone passes in an array of strings or primitives.  :)
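
To illustrate, here is a minimal sketch, assuming the client marshals with plain Java serialization (nothing in this thread fixes a client marshaller, and the class and method names are made up for illustration). A lone String kept as UTF-8 stays readable in a log, while a String[] marshalled into a byte[] begins with the serialization stream header and can only be shown as raw bytes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class OpaqueArrayDemo {

    // Hypothetical helper: marshals a serializable value with standard Java serialization.
    static byte[] marshal(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Renders bytes as hex, roughly all a server log could show for an opaque byte[].
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] single = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] array = marshal(new String[] {"hello", "world"});

        // The single String decodes straight back to readable text.
        System.out.println(new String(single, StandardCharsets.UTF_8));
        // The marshalled array is only meaningful to whoever knows the marshaller.
        System.out.println(toHex(array));
    }
}
```

The hex dump starts with `aced0005`, the Java serialization stream magic and version, which is exactly the kind of output that is hard to debug from a server log.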

> 
>> Personally, I'd be in favour of just using byte[]s, unless we have a  
>> really good argument against this.  No point in reimplementing a  
>> marshalling layer and object encoding for HotRod.
> 
> Is debugging not enough of a reason to at least support the basic types?
> Imagine having to debug through an Infinispan Hot Rod server storing String
> k,v pairs but having to figure out each String's byte[] format to find the
> one where things fail. For anything other than primitive types we have no
> other option. We'd just have to make sure that if client logs are
> available, things are logged properly. At the very least, we should make
> it easy to pass Strings and maybe forget about the rest of the types.

You will also need to deal with collections of {Strings, primitives, byte[]s}.  See above.

> 
>> 
>> Cheers
>> Manik
>> 
>>> 
>>> Thoughts?
>>> 
>>> On Thu, 18 Feb 2010 16:40:32 +0100, Manik Surtani <manik at jboss.org>  
>>> wrote:
>>> 
>>>> Not sure I understand.  Why does this need to be as complex as this?
>>>> When you say encoding of data, surely all this data is, is a byte[]?
>>>> 
>>>> On 18 Feb 2010, at 15:26, Galder Zamarreno wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I've been looking again at the encoding of data in Hot Rod
>>>>> (http://community.jboss.org/wiki/HotRodProtocol) and there are a few
>>>>> things I'm not too happy about or that are not totally clear:
>>>>> 
>>>>> 1. The type appears to be pretty wasteful because the majority of
>>>>> types cannot be combined with others, for example: You cannot have a
>>>>> type that is Boolean and Long at the same time, or Double and
>>>>> Character. However, you can potentially have an Array of Serialized.
>>>>> So, I propose instead separating between a meta type and type.
>>>>> 
>>>>> Meta types would be: Array, Map, Primitive and Compressed. The meta
>>>>> type would be encoded using bit ops, so you could combine them in
>>>>> different ways, e.g. Array of primitives. I don't think we should
>>>>> support combinations of different collections, i.e. Map of Array, or
>>>>> Array of Maps. It would add complexity and I don't foresee an
>>>>> immediate requirement for this.
>>>>> 
>>>>> Type would be: Byte, Boolean, Character, String, Date, Double, Float,
>>>>> Integer, Long, Short, Serialized, StringBuilder, and Any. These would
>>>>> be literals from 1 to N. Note that I've added Any to distinguish
>>>>> between two kinds of collections. For example, if you send an Array
>>>>> of String, each individual element just follows together with its
>>>>> size. However, if you send an Array of Any, each individual entry
>>>>> must define its type.
>>>>> 
>>>>> For Maps, we've got two options: First, no type assumptions made and
>>>>> let each key/value define its own type. Or allow map meta-type
>>>>> definitions to be followed by not one but two type fields. Even if
>>>>> the map was of mixed types, you could have Any, Any. My preference is
>>>>> for the latter.
>>>>> 
>>>>> Both type and metatype would be variable length integers.
>>>>> 
>>>>> 2. Serialized will be stored as byte[] internally; no attempt at
>>>>> unmarshalling will be made in Hot Rod. Clients decide how they want to
>>>>> marshall these Serialized types. They just need to give us a byte[]
>>>>> and its length.
>>>>> 
>>>>> 3. To clarify something that Alex mentioned in the previous encoding
>>>>> data email, Arrays and Maps are followed by the number of items in Hot
>>>>> Rod and not the number of bytes. In case of Arrays of Any, each
>>>>> individual field gives us its size, type and the data. In case of
>>>>> Array of Booleans, each individual field comes with size and data.
>>>>> Size might have been optional in each field of Array of Boolean, but
>>>>> always including it simplifies dealing with Array of Serialized, where
>>>>> each individual field is a byte[] of arbitrary length.
>>>>> 
>>>>> For Maps, a similar thing happens. If we have a Map of String,
>>>>> Boolean, we get key/value pairs like: k=[size+data],v=[size+data]. If
>>>>> it's a Map of Any, Any, we get k=[type+size+data],v=[type+size+data].
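
The layout described in points 1 and 3 of the proposal can be sketched as follows. This is only an illustration under stated assumptions: the numeric values of the meta-type flags and type literals, the class and method names, and the 7-bits-per-byte variable-length integer format are all made up here, since the thread does not specify them. It encodes an Array of String as meta type, type, item count, then size + data for each element:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TypedEncodingSketch {

    // Assumed bit flags for meta types, so that they can be combined with bitwise OR.
    static final int META_ARRAY = 1;
    static final int META_MAP = 2;
    static final int META_PRIMITIVE = 4;
    static final int META_COMPRESSED = 8;

    // Assumed literal for the String type; the proposal only says "literals from 1 to N".
    static final int TYPE_STRING = 4;

    // Writes a non-negative int as a variable-length integer: 7 data bits per byte,
    // high bit set on every byte except the last (an assumed format).
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes an Array of String: meta type, type, number of items, then size + data per item.
    static byte[] encodeStringArray(String[] items) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, META_ARRAY);
        writeVInt(out, TYPE_STRING);
        writeVInt(out, items.length);        // number of items, not number of bytes
        for (String item : items) {
            byte[] data = item.getBytes(StandardCharsets.UTF_8);
            writeVInt(out, data.length);     // size of this element in bytes
            out.write(data, 0, data.length); // the element's data
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] encoded = encodeStringArray(new String[] {"a", "bc"});
        // prints [1, 4, 2, 1, 97, 2, 98, 99]
        System.out.println(Arrays.toString(encoded));
    }
}
```

Because the element type is declared once up front, each String carries only its size. An Array of Any would instead need a type field in front of every element, which is the extra cost the Any type makes explicit.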
>>>> 
>>>> 
>>>> --
>>>> Manik Surtani
>>>> manik at jboss.org
>>>> Lead, Infinispan
>>>> Lead, JBoss Cache
>>>> http://www.infinispan.org
>>>> http://www.jbosscache.org
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Galder Zamarreño
>>> Sr. Software Engineer
>>> Infinispan, JBoss Cache
>>> 
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
> 
> 

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org