Alex,
Thanks for the clarifications. One last question: is the implementation
you wrote open source or closed source?
Cheers,
On Mon, 15 Feb 2010 21:01:38 +0100, Alex Kluge <java_kluge(a)yahoo.com>
wrote:
Hi,
There were a few questions, and it will be easier to understand the
answers with a little background on the implementation. The key and
value objects are mapped to bytes for transmission over the network
using an encoder object, an instance of
org.jboss.cache.tcpcache.messages.encoder.BaseEncoder, and then decoded
by a decoder object, which is an instance of
org.jboss.cache.tcpcache.messages.decoder.BaseDecoder.

One of the goals of these objects is to map the data, where possible,
into a representation that can be interpreted on any platform. As long
as there is no loss of information, these classes can be adjusted as
needed to encode and decode data fields efficiently.
> Wrt the discussion below about encoding data, I wanted to know what
> exactly COMPRESSED type meant in your wiki, or what exactly it
> represents.
If a value encodes to a byte array longer than a threshold value,
compressionThreshold in the encoder, an instance of
java.util.zip.Deflater is used to compress the bytes before
transmission over the wire, and the COMPRESSED flag is set. On receipt,
the decoder sees the COMPRESSED flag and decompresses the bytes, then
interprets them in the normal way. So, compression happens after
converting the data to bytes, and decompression happens before
converting from bytes.
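
For illustration, the threshold-based compression described above could be
sketched in Java roughly as follows. This is not the actual BaseEncoder or
BaseDecoder code: the class name, the 64-byte threshold, and the bit chosen
for COMPRESSED are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class CompressionSketch {
    // Hypothetical values; the real threshold and flag bit are not given here.
    static final int COMPRESSION_THRESHOLD = 64;
    static final int COMPRESSED = 1 << 30;

    /** Type flags plus the (possibly compressed) payload bytes. */
    static final class Encoded {
        final int flags;
        final byte[] data;
        Encoded(int flags, byte[] data) { this.flags = flags; this.data = data; }
    }

    /** Compresses the already-encoded bytes only when they exceed the threshold. */
    static Encoded encode(int typeFlags, byte[] raw) {
        if (raw.length <= COMPRESSION_THRESHOLD) {
            return new Encoded(typeFlags, raw);
        }
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return new Encoded(typeFlags | COMPRESSED, out.toByteArray());
    }

    /** Decompresses first if COMPRESSED is set; the result is then decoded as usual. */
    static byte[] decode(Encoded encoded) {
        if ((encoded.flags & COMPRESSED) == 0) {
            return encoded.data;
        }
        Inflater inflater = new Inflater();
        inflater.setInput(encoded.data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    throw new IllegalStateException("truncated compressed data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

The type flags pass through unchanged apart from the extra COMPRESSED bit,
so the decoder can interpret the decompressed bytes exactly as it would
uncompressed ones.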
> Also, I wanted to know what bits you send when the object sent across
> is an instance of java.lang.Long. Apart from marking LONG, do you also
> mark it as SERIALIZED? Or do you just use SERIALIZED for user-specific
> classes?
A java.lang.Long would only have the long bit set. A long (not an
object) would have the long and primitive bits set. An object that the
encoder does not understand would be serialized using Java object
serialization, and the serialized bit would be set. The long and Long
values are encoded identically; however, the decoder will return the
appropriate type as determined by the flags.
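
A small Java sketch of how the flag bits described above might be chosen.
The bit positions and the class and method names are hypothetical; only the
combinations (LONG alone, LONG plus PRIMITIVE, SERIALIZED for unknown types)
come from the description above.

```java
import java.nio.ByteBuffer;

final class TypeFlagsSketch {
    // Hypothetical bit assignments, for illustration only.
    static final int PRIMITIVE  = 1;
    static final int LONG       = 1 << 1;
    static final int SERIALIZED = 1 << 2;

    /** A primitive long carries both the LONG and PRIMITIVE bits. */
    static int flagsFor(long value) {
        return LONG | PRIMITIVE;
    }

    /** A java.lang.Long carries only the LONG bit. */
    static int flagsFor(Long value) {
        return LONG;
    }

    /** Types this sketch does not understand fall back to Java serialization. */
    static int flagsFor(Object value) {
        if (value instanceof Long) {
            return LONG;
        }
        return SERIALIZED;
    }

    /** The payload bytes are the same for long and Long: eight bytes, big-endian. */
    static byte[] payload(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }
}
```

Java's overload resolution selects the long, Long, or Object variant from the
static type of the argument, which is one way an encoder could tell the
primitive from the boxed value.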
> With regards to individual fields in Map/Array, according to your
> wiki, each field then only contains the size of the field and the
> field. So, in a map of booleans, each field would be represented as:
> 0x01 and 0x01/0x00 with 1st being the size and 2nd being the actual
> value. Correct?
Arrays and maps are treated a bit differently. Arrays are generally of
a single known type, whereas maps tend to map between arbitrary types
of objects. The encoding can encode the single type of an array, but
not the multiple types of a map.

As a result, arrays are encoded with a single size field, followed by
all of the data bytes. The current, suboptimal implementation places a
byte count for all of the data, followed by the type flags, followed by
the data. Thus an array of two Integers would be encoded as

  a byte count of 12,
  four bytes for flags (INTEGER and ARRAY bits set),
  four bytes for integer 1,
  four bytes for integer 2.

This requires the whole array of encoded integers to be loaded at once,
which should not be necessary. However, I was a bit pressed for time.
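
The two-Integer layout above can be sketched in Java as follows. The flag
bit values, the four-byte width of the byte count field, and big-endian byte
order are assumptions made for this example; the description above only
fixes the order of the fields and the count of 12.

```java
import java.nio.ByteBuffer;

final class ArrayEncodingSketch {
    // Hypothetical bit assignments, for illustration only.
    static final int INTEGER = 1 << 3;
    static final int ARRAY   = 1 << 4;

    /** Layout: [byte count][flags][int 1][int 2]..., all four-byte big-endian. */
    static byte[] encode(Integer[] values) {
        int byteCount = 4 + 4 * values.length;   // flags plus data: 12 for two Integers
        ByteBuffer buf = ByteBuffer.allocate(4 + byteCount);
        buf.putInt(byteCount);
        buf.putInt(INTEGER | ARRAY);
        for (Integer v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    /** Reads the byte count and flags, then every element in one pass. */
    static Integer[] decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int byteCount = buf.getInt();
        int flags = buf.getInt();
        if ((flags & (INTEGER | ARRAY)) != (INTEGER | ARRAY)) {
            throw new IllegalArgumentException("not an Integer array");
        }
        Integer[] values = new Integer[(byteCount - 4) / 4];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }
}
```

Under these assumptions, two Integers occupy 16 bytes on the wire: the
four-byte count field followed by the 12 bytes it describes.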
A map is of necessity treated differently. The map also starts with the
total byte count, which is then followed by the MAP flag. Each key and
value in the map is then encoded as an individual object with its own
size, type and data fields.

I would like to find a way of writing only the number of fields at the
beginning of these; however, time pressure and the need to properly
handle compressed data forced me to place the total number of bytes as
the first field.
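
A rough Java sketch of the map layout just described, handling only Integer
and String entries. The flag bit values, the four-byte field widths, UTF-8
for strings, and the choice that each size field counts the flags plus the
data are all assumptions made for this example, not details of the actual
encoder.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class MapEncodingSketch {
    // Hypothetical bit assignments, for illustration only.
    static final int INTEGER = 1 << 3;
    static final int STRING  = 1 << 5;
    static final int MAP     = 1 << 6;

    /** Writes one key or value as [size][type flags][data]. */
    static void writeObject(DataOutputStream out, Object o) throws IOException {
        int flags;
        byte[] data;
        if (o instanceof Integer) {
            flags = INTEGER;
            data = ByteBuffer.allocate(4).putInt((Integer) o).array();
        } else if (o instanceof String) {
            flags = STRING;
            data = ((String) o).getBytes(StandardCharsets.UTF_8);
        } else {
            throw new IllegalArgumentException("type not handled in this sketch");
        }
        out.writeInt(4 + data.length);   // assumed: size covers flags plus data
        out.writeInt(flags);
        out.write(data);
    }

    /** Layout: [total byte count][MAP flag][key][value][key][value]... */
    static byte[] encode(Map<?, ?> map) {
        try {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(body);
            out.writeInt(MAP);
            for (Map.Entry<?, ?> entry : map.entrySet()) {
                writeObject(out, entry.getKey());
                writeObject(out, entry.getValue());
            }
            out.flush();
            byte[] bytes = body.toByteArray();
            return ByteBuffer.allocate(4 + bytes.length)
                    .putInt(bytes.length)
                    .put(bytes)
                    .array();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because every key and value carries its own type flags, keys and values of
different types can be mixed freely, at the cost of repeating the size and
type fields for each entry.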
Alex
--- On Tue, 1/12/10, Galder Zamarreno <galder(a)redhat.com> wrote:
> From: Galder Zamarreno <galder(a)redhat.com>
> Subject: Re: [infinispan-dev] Enconding of data WAS Re: Hot Rod - pt3
> To: infinispan-dev(a)lists.jboss.org
> Date: Tuesday, January 12, 2010, 10:48 AM
>
>
> On 01/12/2010 05:46 PM, Galder Zamarreno wrote:
> > And another question:
> >
> > With regards to individual fields in Map/Array, according to your wiki,
> > each field then only contains the size of the field and the field. So,
> > in a map of booleans, each field would be represented as: 0x01 and
> > 0x01/0x00 with 1st being the size and 2nd being the actual value. Correct?
>
> Well, this would be more like an array actually.
>
> By the way, in a Map, how do you send key/value pairs? I suppose
> [key][value][key][value]? And how do you provide type of key vs type of
> value?
>
> >
> > On 01/12/2010 04:46 PM, Galder Zamarreno wrote:
> >> Hi Alex,
> >>
> >> Wrt the discussion below about encoding data, I wanted to know what
> >> exactly COMPRESSED type meant in your wiki, or what exactly it represents.
> >>
> >> Also, I wanted to know what bits you send when the object sent across
> >> is an instance of java.lang.Long. Apart from marking LONG, do you also
> >> mark it as SERIALIZED? Or do you just use SERIALIZED for user-specific
> >> classes?
> >>
> >> Cheers,
> >>
> >> On 01/05/2010 10:52 AM, Galder Zamarreno wrote:
> >>>
> >>>
> >>> On 01/04/2010 10:44 PM, Alex Kluge wrote:
> >>>>>> </snip>
> >>>
> >>>>
> >>>>>> - What happens if the key or the value is not text? I have a way of
> >>>>>> representing the data to allow for a wide variety of data types,
> >>>>>> even allowing for arrays or maps. This will make the protocol more
> >>>>>> complex, but the assumption that the data is a string is rather
> >>>>>> limiting. This is already sketched out in the wiki.
> >>>
> >>> Hmmmmm, I don't think I've made any assumptions in the wiki that keys or
> >>> values are Strings unless I've made a mistake somewhere (maybe in the
> >>> example where I've used a particular encoding for Strings?). My thoughts
> >>> around this was that I was gonna treat them both as byte[]...
> >>>
> >>>>>
> >>>>> </snip>
> >>>>
> >>>> The idea is to prefix each data block with a data type. This is a
> >>>> lightweight binary protocol, so a full fledged mime type would be
> >>>> overkill. There is a discussion and example in the Encoding Data
> >>>> section of this page:
> >>>>
> >>>> http://community.jboss.org/wiki/RemoteCacheInteractions
> >>>>
> >>>> Data types are limited to things like integer, byte, string, boolean, etc.
> >>>> Or, if it isn't a recognised type, the native platform serialisation is
> >>>> used. There can be arrays of these types, or maps as well.
> >>>>
> >>>> Each data type is represented by a bit, and they can be used in
> >>>> combinations. An array of bytes would have the array, byte and
> >>>> primitive bits set. The set of recognised data types can of course be
> >>>> expanded.
> >>>
> >>> ... but that looks rather useful (at least based on the cached version
> >>> of that wiki that Google shows ;) - no wiki access right now :( ),
> >>> particularly the way you can combine different bytes to create composed
> >>> types. I don't see a problem with including this in Hot Rod.
> >>>
> >>> I suppose you included type length into length field so that in the
> >>> future you can support other types and possibly longer type fields?
> >>>
> >
>
> --
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache