[infinispan-dev] Encoding of data WAS Re: Hot Rod - pt3

Galder Zamarreno galder at redhat.com
Thu Feb 18 07:12:35 EST 2010


Alex,

Thanks for the clarifications. One last question: is the implementation
you did open source or closed source?

Cheers,

On Mon, 15 Feb 2010 21:01:38 +0100, Alex Kluge <java_kluge at yahoo.com> wrote:

> Hi,
>
>   There were a few questions, and it will be easier to understand the
>  answers with a little background on the implementation. The key and
>  value objects are mapped to bytes for transmission over the network
>  using an encoder object, an instance of
>      org.jboss.cache.tcpcache.messages.encoder.BaseEncoder,
>  and then decoded by a decoder object, which is an instance of
>      org.jboss.cache.tcpcache.messages.decoder.BaseDecoder.
>  One of the goals of these objects is to, where possible, map the data
>  into a representation that can be interpreted on any platform. As long
>  as there is no loss of information, these classes can be adjusted as
>  needed to encode and decode data fields efficiently.
>
>   > Wrt the discussion below about encoding data, I wanted to know what
>   > exactly COMPRESSED type meant in your wiki, or what exactly it
>   > represents.
>
>  If a value encodes to a byte array longer than a threshold value
>  (compressionThreshold in the encoder), an instance of
>  java.util.zip.Deflater is used to compress the bytes before
>  transmission over the wire, and the COMPRESSED flag is set. On receipt,
>  the decoder sees the COMPRESSED flag and decompresses the bytes, then
>  interprets them in the normal way. So, compression happens after
>  converting the data to bytes, and decompression happens before
>  converting from bytes.
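
The compress-after-encode, decompress-before-decode ordering described
above could be sketched roughly as follows. This is only an
illustration, not the BaseEncoder/BaseDecoder code: the class name, the
COMPRESSED bit position and the threshold of 64 bytes are all
assumptions made up for the example. Only java.util.zip.Deflater comes
from the message itself.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class CompressionSketch {
    // Hypothetical values: the real flag bit and threshold are not given in this thread.
    static final int COMPRESSED = 1 << 30;
    static final int COMPRESSION_THRESHOLD = 64;

    /** Compresses bytes that have already been produced by the encoder. */
    static byte[] deflate(byte[] encoded) {
        Deflater deflater = new Deflater();
        deflater.setInput(encoded);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the encoded bytes so they can be interpreted in the normal way. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalArgumentException("truncated compressed data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Adds the COMPRESSED bit to the type flags only above the threshold. */
    static int flagsFor(byte[] encoded, int typeFlags) {
        return encoded.length > COMPRESSION_THRESHOLD ? typeFlags | COMPRESSED : typeFlags;
    }

    /** Bytes to put on the wire: compressed only above the threshold. */
    static byte[] payloadFor(byte[] encoded) {
        return encoded.length > COMPRESSION_THRESHOLD ? deflate(encoded) : encoded;
    }

    /** Receiver side: undo compression first, then hand the bytes to the decoder. */
    static byte[] decodePayload(int flags, byte[] payload) {
        return (flags & COMPRESSED) != 0 ? inflate(payload) : payload;
    }
}
```

A value below the threshold passes through untouched, so the receiver
only pays for decompression when the flag says it has to.
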
>
>  > Also, I wanted to know what bits you send when the object sent across
>  > is an instance of java.lang.Long. Apart from marking LONG, do you also
>  > mark it as SERIALIZED? Or do you just use SERIALIZED for user-specific
>  > classes?
>
>  A java.lang.Long would only have the long bit set. A long (not an
>  object) would have the long and primitive bits set. An object that the
>  encoder does not understand would be serialized using Java object
>  serialization, and the serialized bit would be set. The long and Long
>  values will be encoded identically; however, the decoder will return
>  the appropriate type as determined by the flags.
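
As a rough illustration of the flag combinations just described, here
is a minimal sketch. The bit positions, the class name and the
big-endian byte order are assumptions for the example; the thread only
states which bits are set for Long, long and unrecognised objects.

```java
import java.nio.ByteBuffer;

final class TypeFlagsSketch {
    // Hypothetical bit assignments; the wiki's actual values are not quoted here.
    static final int LONG = 1 << 0;
    static final int PRIMITIVE = 1 << 1;
    static final int SERIALIZED = 1 << 2;

    /** Flags the encoder would set for a value of the given type. */
    static int flagsFor(Class<?> type) {
        if (type == long.class) return LONG | PRIMITIVE; // primitive long
        if (type == Long.class) return LONG;             // boxed java.lang.Long
        return SERIALIZED;                               // fall back to Java serialization
    }

    /** long and Long share the same data bytes (big-endian is assumed here). */
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** The decoder picks the type to return from the flags alone. */
    static Class<?> typeFor(int flags) {
        if ((flags & SERIALIZED) != 0) return Object.class;
        if ((flags & LONG) != 0) {
            return (flags & PRIMITIVE) != 0 ? long.class : Long.class;
        }
        throw new IllegalArgumentException("unrecognised flags: " + flags);
    }
}
```
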
>
>  > With regards to individual fields in Map/Array, according to your
>  > wiki, each field then only contains the size of the field and the
>  > field. So, in a map of booleans, each field would be represented as:
>  > 0x01 and 0x01/0x00 with 1st being the size and 2nd being the actual
>  > value. Correct?
>
>  Arrays and maps are treated a bit differently. Arrays are generally of
>  a single known type, whereas maps tend to be between arbitrary types of
>  objects. The encoding can encode the single type of an array, but not
>  the multiple types of a map.
>
>  As a result, arrays are encoded with a single size field, followed by
>  all of the data bytes. The current, suboptimal, implementation places a
>  byte count for all of the data, followed by the type flags, followed by
>  the data. Thus an array of two Integers would be encoded as
>
>    a byte count of 12,
>    four bytes for flags (INTEGER and ARRAY),
>    four bytes for integer 1,
>    four bytes for integer 2.
>
>  This requires the whole array of encoded integers to be loaded at once,
>  which should not be necessary. However, I was a bit pressed for time.
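
The two-Integer example can be reproduced with a small sketch. Several
details are assumptions not stated in the message: the byte count is
taken to be a four-byte field that covers the flags plus the data
(which matches the 12 above), integers are written big-endian, and the
INTEGER and ARRAY bit positions are invented.

```java
import java.nio.ByteBuffer;

final class ArrayEncodingSketch {
    // Hypothetical bit assignments for the two type flags.
    static final int INTEGER = 1 << 3;
    static final int ARRAY = 1 << 4;

    /** Layout: [byte count][flags][value 0][value 1]... */
    static byte[] encode(Integer[] values) {
        int byteCount = Integer.BYTES + values.length * Integer.BYTES; // flags + data
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + byteCount);
        buf.putInt(byteCount);
        buf.putInt(INTEGER | ARRAY);
        for (Integer v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    /** Reads the count and flags, then derives the element count from the byte count. */
    static Integer[] decode(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        int byteCount = buf.getInt();
        int flags = buf.getInt();
        if (flags != (INTEGER | ARRAY)) {
            throw new IllegalArgumentException("not an Integer array: " + flags);
        }
        Integer[] values = new Integer[(byteCount - Integer.BYTES) / Integer.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }
}
```

Because only a byte count is sent, the element count has to be derived
from it, which works here only because every element has a fixed width.
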
>
>  A map is of necessity treated differently. The map also starts with the
>  total byte count, which is then followed by the MAP flag. Each key and
>  value in the map is then encoded as an individual object with its own
>  size, type and data fields.
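
A map encoded this way might look like the sketch below, restricted to
String keys and Integer values to keep it short. The flag bits, the
four-byte size fields, the choice to count the flags in each size, and
UTF-8 for strings are all assumptions for the example; the message only
gives the ordering of count, MAP flag, and per-object size, type and
data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class MapEncodingSketch {
    // Hypothetical bit assignments.
    static final int INTEGER = 1 << 3;
    static final int STRING = 1 << 5;
    static final int MAP = 1 << 6;

    /** One object on the wire: [size of flags + data][flags][data]. */
    static byte[] encodeObject(int flags, byte[] data) {
        return ByteBuffer.allocate(2 * Integer.BYTES + data.length)
                .putInt(Integer.BYTES + data.length)
                .putInt(flags)
                .put(data)
                .array();
    }

    static byte[] encodeString(String s) {
        return encodeObject(STRING, s.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] encodeInteger(int v) {
        return encodeObject(INTEGER, ByteBuffer.allocate(Integer.BYTES).putInt(v).array());
    }

    /** Total byte count, MAP flag, then key, value, key, value... each self-describing. */
    static byte[] encodeMap(Map<String, Integer> map) {
        ByteArrayOutputStream entries = new ByteArrayOutputStream();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            byte[] key = encodeString(e.getKey());
            byte[] value = encodeInteger(e.getValue());
            entries.write(key, 0, key.length);
            entries.write(value, 0, value.length);
        }
        return encodeObject(MAP, entries.toByteArray());
    }
}
```

Since every key and value carries its own type flags, keys and values
of different types can share one map, at the cost of eight extra bytes
per object in this sketch.
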
>
>  I would like to find a way of writing only the number of fields at the
>  beginning of these; however, time pressure and the need to properly
>  handle compressed data forced me to place the total number of bytes as
>  the first field.
>                                                                  Alex
>
> --- On Tue, 1/12/10, Galder Zamarreno <galder at redhat.com> wrote:
>
>> From: Galder Zamarreno <galder at redhat.com>
>> Subject: Re: [infinispan-dev] Encoding of data WAS Re:  Hot Rod - pt3
>> To: infinispan-dev at lists.jboss.org
>> Date: Tuesday, January 12, 2010, 10:48 AM
>>
>>
>> On 01/12/2010 05:46 PM, Galder Zamarreno wrote:
>> > And another question:
>> >
>> > With regards to individual fields in Map/Array, according to your wiki,
>> > each field then only contains the size of the field and the field. So,
>> > in a map of booleans, each field would be represented as: 0x01 and
>> > 0x01/0x00 with 1st being the size and 2nd being the actual value. Correct?
>>
>> Well, this would be more like an array actually.
>>
>> By the way, in a Map, how do you send key/value pairs? I suppose
>> [key][value][key][value]? And how do you provide type of key vs type of
>> value?
>>
>> >
>> > On 01/12/2010 04:46 PM, Galder Zamarreno wrote:
>> >> Hi Alex,
>> >>
>> >> Wrt the discussion below about encoding data, I wanted to know what
>> >> exactly COMPRESSED type meant in your wiki, or what exactly it represents.
>> >>
>> >> Also, I wanted to know what bits you send when the object sent across
>> >> is an instance of java.lang.Long. Apart from marking LONG, do you also
>> >> mark it as SERIALIZED? Or do you just use SERIALIZED for user-specific
>> >> classes?
>> >>
>> >> Cheers,
>> >>
>> >> On 01/05/2010 10:52 AM, Galder Zamarreno wrote:
>> >>>
>> >>>
>> >>> On 01/04/2010 10:44 PM, Alex Kluge wrote:
>> >>>>>>      </snip>
>> >>>
>> >>>>
>> >>>>>>      - What happens if the key or the value is not text? I have a way of
>> >>>>>>      representing the data to allow for a wide variety of data types,
>> >>>>>>      even allowing for arrays or maps. This will make the protocol more
>> >>>>>>      complex, but the assumption that the data is a string is rather
>> >>>>>>      limiting. This is already sketched out in the wiki.
>> >>>
>> >>> Hmmmmm, I don't think I've made any assumptions in the wiki that keys or
>> >>> values are Strings unless I've made a mistake somewhere (maybe in the
>> >>> example where I've used a particular encoding for Strings?). My thoughts
>> >>> around this was that I was gonna treat them both as byte[]...
>> >>>
>> >>>>>
>> >>>>> </snip>
>> >>>>
>> >>>>      The idea is to prefix each data block with a data type. This is a
>> >>>>      lightweight binary protocol, so a full fledged mime type would be
>> >>>>      overkill. There is a discussion and example in the Encoding Data
>> >>>>      section of this page:
>> >>>>
>> >>>>        http://community.jboss.org/wiki/RemoteCacheInteractions
>> >>>>
>> >>>>      Data types are limited to things like integer, byte, string, boolean, etc.
>> >>>>      Or, if it isn't a recognised type, the native platform serialisation is
>> >>>>      used. There can be arrays of these types, or maps as well.
>> >>>>
>> >>>>      Each data type is represented by a bit, and they can be used in
>> >>>>      combinations. An array of bytes would have the array, byte and
>> >>>>      primitive bits set. The set of recognised data types can of course be
>> >>>>      expanded.
>> >>>
>> >>> ... but that looks rather useful (at least based on the cached version
>> >>> of that wiki that Google shows ;) - no wiki access right now :( ),
>> >>> particularly the way you can combine different bytes to create composed
>> >>> types. I don't see a problem with including this in Hot Rod.
>> >>>
>> >>> I suppose you included type length into length field so that in the
>> >>> future you can support other types and possibly longer type fields?
>> >>>
>> >
>>
>> --
>> Galder Zamarreño
>> Sr. Software Engineer
>> Infinispan, JBoss Cache
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>


-- 
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache