[infinispan-dev] HotRod: Digging further into the encoding of data

Alex Kluge java_kluge at yahoo.com
Thu Feb 18 14:46:39 EST 2010


Hi,
  > That makes more sense, although you still then have the issue of collections and arrays.
  >  Personally, I'd be in favour of just using byte[]s, unless we have a really good argument
  >  against this. No point in reimplementing a marshalling layer and object encoding for HotRod.

  Well, we know that the user is going to do a put(key, value), and on doing a get(key) they
  will be expecting to get an equivalent value back.

  We also know that in order to transmit the key and value over the wire, they have to be converted
  into bytes.

  When they do a get, we have to transmit the bytes to the client, and the client has to be able to
  reassemble the object.

  This means that the byte representation has to contain the object type and, if it is a more complex
  type, its size. We can use a custom encoding or we can use serialization; there just isn't much else.
  Serialization struck me as a little heavyweight. Our application caches a lot of primitive values,
  arrays of primitive values, and a few maps, so this encoding scheme is working well for us.
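
  To make the "type plus size" point concrete, here is a minimal sketch of such a custom
  encoding. It is not the actual Hot Rod wire format: the tag values, the fixed-width
  big-endian layout, and the function names encode/decode are all assumptions made up
  for illustration.

```python
import struct

# Hypothetical one-byte type tags; the numbers are illustrative only.
T_LONG, T_DOUBLE, T_STRING, T_LONG_ARRAY = 1, 2, 3, 4


def _is_int(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


def encode(value):
    """Return a type tag followed by the payload (with a size where needed)."""
    if _is_int(value):
        return struct.pack(">Bq", T_LONG, value)            # tag + 8 bytes
    if isinstance(value, float):
        return struct.pack(">Bd", T_DOUBLE, value)          # tag + 8 bytes
    if isinstance(value, str):
        data = value.encode("utf-8")
        return struct.pack(">BI", T_STRING, len(data)) + data   # tag + size + bytes
    if isinstance(value, list) and all(_is_int(v) for v in value):
        header = struct.pack(">BI", T_LONG_ARRAY, len(value))   # tag + item count
        return header + struct.pack(">%dq" % len(value), *value)
    raise TypeError("type not covered by this sketch: %r" % type(value))


def decode(buf):
    """Rebuild the value from the tag and, for complex types, the size."""
    tag = buf[0]
    if tag == T_LONG:
        return struct.unpack_from(">q", buf, 1)[0]
    if tag == T_DOUBLE:
        return struct.unpack_from(">d", buf, 1)[0]
    if tag == T_STRING:
        (size,) = struct.unpack_from(">I", buf, 1)
        return buf[5:5 + size].decode("utf-8")
    if tag == T_LONG_ARRAY:
        (count,) = struct.unpack_from(">I", buf, 1)
        return list(struct.unpack_from(">%dq" % count, buf, 5))
    raise ValueError("unknown type tag: %d" % tag)
```

  With this, decode(encode(v)) gives back an equivalent value for the covered types, and
  a long costs 9 bytes on the wire, which is the kind of overhead that makes a small
  custom encoding attractive next to full serialization.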

--- On Thu, 2/18/10, Manik Surtani <manik at jboss.org> wrote:

> From: Manik Surtani <manik at jboss.org>
> Subject: Re: [infinispan-dev] HotRod: Digging further into the encoding of data
> To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>
> Date: Thursday, February 18, 2010, 10:11 AM
> 
> On 18 Feb 2010, at 15:51, Galder Zamarreno wrote:
> 
> > That's indeed an option, but then debugging of logs could be quite a
> > nightmare if all we store in the cache is a byte[] and there is no readable
> > representation for it. This would also be a problem if in the future we
> > decided to allow some tool to inspect the contents of the cache.
> > 
> > The idea behind the encoding is to at least have a way to encode the
> > simplest objects that can be passed: primitives and Strings and
> > collections of these. Obviously, if users want to simply store a blob,
> > e.g. an image, then that's a byte[] and nothing can be done about that.
> > 
> > I agree that maybe some stuff can be simplified. We could simply support
> > primitives and Strings and treat the rest as a byte[].
> 
> That makes more sense, although you still then have the
> issue of collections and arrays.  Personally, I'd be in
> favour of just using byte[]s, unless we have a really good
> argument against this.  No point in reimplementing a
> marshalling layer and object encoding for HotRod.
> 
> Cheers
> Manik
> 
> > 
> > Thoughts?
> > 
> > On Thu, 18 Feb 2010 16:40:32 +0100, Manik Surtani <manik at jboss.org> wrote:
> > 
> >> Not sure I understand.  Why does this need to be as complex as this?
> >> When you say encoding of data, surely all this data is, is a byte[]?
> >> 
> >> On 18 Feb 2010, at 15:26, Galder Zamarreno wrote:
> >> 
> >>> Hi,
> >>> 
> >>> I've been looking again at the encoding of data in Hot Rod
> >>> (http://community.jboss.org/wiki/HotRodProtocol) and there are a few
> >>> things I'm not too happy about or that are not totally clear:
> >>> 
> >>> 1. The type appears to be pretty wasteful because the majority of types
> >>> cannot be combined with others. For example, you cannot have a type that is
> >>> Boolean and Long at the same time, or Double and Character. However, you
> >>> can potentially have an Array of Serialized. So, I propose instead
> >>> separating between a meta type and a type.
> >>> 
> >>> Meta types would be: Array, Map, Primitive and Compressed. The meta type
> >>> would be encoded using bit ops, so you could combine them in different ways,
> >>> e.g. Array of primitives. I don't think we should support combinations
> >>> of different collections, i.e. Map of Arrays, or Array of Maps. It would
> >>> add complexity and I don't foresee an immediate requirement for this.
> >>> 
> >>> Types would be: Byte, Boolean, Character, String, Date, Double, Float,
> >>> Integer, Long, Short, Serialized, StringBuilder, and Any. These would be
> >>> literals from 1 to N. Note that I've added Any to distinguish between two
> >>> different kinds of collection. For example, if you send an Array of String,
> >>> each individual element just follows together with its size. However, if you
> >>> send an Array of Any, each individual entry must define its type.
> >>> 
> >>> For Maps, we've got two options. First, make no type assumptions and let
> >>> each key/value define its own type. Second, allow map meta-type definitions
> >>> to be followed by not one but two type fields. Even if the map was of mixed
> >>> types, you could have Any, Any. My preference is for the latter.
> >>> 
> >>> Both type and meta type would be variable-length integers.
> >>> 
> >>> 2. Serialized will be stored as a byte[] internally; no attempt at
> >>> unmarshalling will be made in Hot Rod. Clients decide how they want to
> >>> marshall these Serialized types. They just need to give us a byte[] and
> >>> its length.
> >>> 
> >>> 3. To clarify something that Alex mentioned in the previous encoding
> >>> data email, Arrays and Maps are followed by the number of items in Hot Rod
> >>> and not the number of bytes. In the case of an Array of Any, each individual
> >>> field gives us its size, type and the data. In the case of an Array of
> >>> Boolean, each individual field comes with size and data. Size could have been
> >>> optional in each field of an Array of Boolean, but including it simplifies
> >>> dealing with an Array of Serialized, where each individual field is a byte[]
> >>> of arbitrary length.
> >>> 
> >>> For Maps, a similar thing happens. If we have a Map of String, Boolean,
> >>> we get a key/value pair like: k=[size+data],v=[size+data]. If it's a Map of
> >>> Any, Any, we get k=[type+size+data],v=[type+size+data]
> >> 
> >> 
> >> --
> >> Manik Surtani
> >> manik at jboss.org
> >> Lead, Infinispan
> >> Lead, JBoss Cache
> >> http://www.infinispan.org
> >> http://www.jbosscache.org
> >> 
> >> 
> >> 
> >> 
> > 
> > 
> > -- 
> > Galder Zamarreño
> > Sr. Software Engineer
> > Infinispan, JBoss Cache
> > 
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
> 
> 
> 
> 
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
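  For reference, the layout proposed in the quoted message (meta type as combinable bit
  flags, type as a literal, both as variable-length integers, then an item count, then
  size plus data per element) could be sketched roughly as follows. The flag and type
  values are hypothetical, since the thread assigns no numbers, and the 7-bits-per-byte
  varint format is an assumption, as the thread only says "variable length integers".

```python
# Hypothetical values; the thread does not assign concrete numbers.
META_ARRAY, META_MAP, META_PRIMITIVE, META_COMPRESSED = 1, 2, 4, 8
TYPE_STRING = 4


def write_varint(n):
    """Encode a non-negative int, 7 bits per byte, high bit = 'more follows'."""
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)
        else:
            out.append(low)
            return bytes(out)


def read_varint(buf, pos):
    """Decode a varint starting at pos; return (value, next position)."""
    result = shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7


def encode_string_array(items):
    """meta type, type, item count, then size + data for each element."""
    out = bytearray()
    out += write_varint(META_ARRAY)
    out += write_varint(TYPE_STRING)
    out += write_varint(len(items))        # number of items, not bytes
    for item in items:
        data = item.encode("utf-8")
        out += write_varint(len(data))     # per-element size
        out += data
    return bytes(out)


def decode_string_array(buf):
    meta, pos = read_varint(buf, 0)
    type_id, pos = read_varint(buf, pos)
    if not meta & META_ARRAY or type_id != TYPE_STRING:
        raise ValueError("not an Array of String")
    count, pos = read_varint(buf, pos)
    items = []
    for _ in range(count):
        size, pos = read_varint(buf, pos)
        items.append(buf[pos:pos + size].decode("utf-8"))
        pos += size
    return items
```

  An Array of Any would differ only in that each element carries its own type field
  next to the size, which is why the per-element size is kept even for fixed-width types.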


      



