Hi,
There were a few questions, and it will be easier to understand the answers with a
little background on the implementation. The key and value objects are mapped to
bytes for transmission over the network using an encoder object, an instance of
org.jboss.cache.tcpcache.messages.encoder.BaseEncoder,
and then decoded by a decoder object, which is an instance of
org.jboss.cache.tcpcache.messages.decoder.BaseDecoder.
One of the goals of these objects is to, where possible, map the data into a
representation that can be interpreted on any platform. As long as there is no loss of
information, these classes can be adjusted as needed to encode and decode data fields
efficiently.

> Wrt the discussion below about encoding data, I wanted to know what
> exactly COMPRESSED type meant in your wiki, or what exactly it represents.

If a value encodes to a byte array longer than a threshold (compressionThreshold in
the encoder), an instance of java.util.zip.Deflater is used to compress the bytes
before transmission over the wire, and the COMPRESSED flag is set. On receipt, the
decoder sees the COMPRESSED flag, decompresses the bytes, and then interprets them in
the normal way. So compression happens after converting the data to bytes, and
decompression happens before converting from bytes.
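
As a rough illustration of the compression step just described, here is a minimal
sketch. It is not the BaseEncoder/BaseDecoder code: the class name, the Payload
holder, and the threshold value of 64 are assumptions made for the example. Only the
use of java.util.zip.Deflater and the ordering (compress after encoding, decompress
before decoding) come from the description above.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class CompressionSketch {
    // Hypothetical value; the real compressionThreshold is not stated in this thread.
    static final int COMPRESSION_THRESHOLD = 64;

    /** Bytes to put on the wire, plus whether the COMPRESSED flag should be set. */
    static final class Payload {
        final byte[] bytes;
        final boolean compressed;

        Payload(byte[] bytes, boolean compressed) {
            this.bytes = bytes;
            this.compressed = compressed;
        }
    }

    /** Runs after the value has already been converted to bytes. */
    static Payload maybeCompress(byte[] encoded) {
        if (encoded.length <= COMPRESSION_THRESHOLD) {
            return new Payload(encoded, false);
        }
        Deflater deflater = new Deflater();
        deflater.setInput(encoded);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return new Payload(out.toByteArray(), true);
    }

    /** Runs before the bytes are interpreted in the normal way. */
    static byte[] maybeDecompress(Payload payload) {
        if (!payload.compressed) {
            return payload.bytes;
        }
        Inflater inflater = new Inflater();
        inflater.setInput(payload.bytes);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported compressed data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

A value at or below the threshold passes through untouched with the flag clear, so
the decoder can always decide what to do from the flag alone.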

> Also, I wanted to know what bits you send when the object sent across
> is an instance of java.lang.Long. Apart from marking LONG, do you also
> mark it as SERIALIZED? Or do you just use SERIALIZED for user-specific
> classes?

A java.lang.Long would only have the LONG bit set. A long (not an object) would have
the LONG and PRIMITIVE bits set. An object that the encoder does not understand is
serialized using Java object serialization, and the SERIALIZED bit is set.
The long and Long values are encoded identically; the decoder returns the appropriate
type as determined by the flags.
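
To make the flag combinations concrete, here is a small sketch. The bit values, the
class name, and the helper methods are hypothetical (the real values live in the
encoder and on the wiki, not in this thread), and the eight-byte big-endian layout
for the long is an assumption. The point it shows is that the data bytes are the
same for long and Long, and only the flags differ.

```java
import java.nio.ByteBuffer;

final class TypeFlagsSketch {
    // Hypothetical bit assignments, chosen only for this example.
    static final int LONG = 1 << 0;
    static final int PRIMITIVE = 1 << 1;
    static final int SERIALIZED = 1 << 2;

    /** Flags for a long value: LONG alone for java.lang.Long, LONG plus PRIMITIVE for long. */
    static int flagsForLong(boolean primitive) {
        return primitive ? (LONG | PRIMITIVE) : LONG;
    }

    /** In this sketch only Long is a recognised type; anything else falls back to SERIALIZED. */
    static int flagsFor(Object value) {
        return (value instanceof Long) ? LONG : SERIALIZED;
    }

    /** The data bytes are the same whether the source was a long or a Long. */
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** The decoder picks the type to hand back by looking at the flags. */
    static String describe(int flags, byte[] data) {
        long value = ByteBuffer.wrap(data).getLong();
        boolean primitive = (flags & PRIMITIVE) != 0;
        return (primitive ? "long " : "java.lang.Long ") + value;
    }
}
```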

> With regards to individual fields in Map/Array, according to your wiki,
> each field then only contains the size of the field and the field. So,
> in a map of booleans, each field would be represented as: 0x01 and
> 0x01/0x00 with 1st being the size and 2nd being the actual value. Correct?

Arrays and maps are treated a bit differently. Arrays are generally of a single known
type, whereas maps tend to map between arbitrary types of objects. The type flags can
describe the single element type of an array, but not the multiple types in a map.
As a result, arrays are encoded with a single size field, followed by all of the data
bytes.
The current, suboptimal, implementation writes a byte count covering everything that
follows it, then the type flags, then the data. Thus an array of two Integers would be
encoded as
  a byte count of 12,
  four bytes for flags (INTEGER and ARRAY),
  four bytes for integer 1,
  four bytes for integer 2.
This requires the whole array of encoded integers to be loaded at once, which should
not be necessary. However, I was a bit pressed for time.
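
The two-Integer example above can be sketched as follows. The flag bit values, the
class and method names, the four-byte width of the byte count field, and big-endian
byte order are all assumptions for illustration. The only facts taken from the
description are the field order and that the count of 12 covers the flags plus the
two integers.

```java
import java.nio.ByteBuffer;

final class ArrayEncodingSketch {
    // Hypothetical bit assignments, chosen only for this example.
    static final int INTEGER = 1 << 0;
    static final int ARRAY = 1 << 1;

    /** Layout: [byte count][flags][int 0][int 1]... with the count covering flags and data. */
    static byte[] encodeIntegerArray(int[] values) {
        int byteCount = Integer.BYTES + values.length * Integer.BYTES;
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + byteCount);
        buf.putInt(byteCount);        // assumed to be a 4-byte field
        buf.putInt(INTEGER | ARRAY);  // one set of flags for the whole array
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    /** Reads the count first, then derives the element count from it. */
    static int[] decodeIntegerArray(byte[] encoded) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        int byteCount = buf.getInt();
        int flags = buf.getInt();
        if (flags != (INTEGER | ARRAY)) {
            throw new IllegalArgumentException("not an Integer array: flags=" + flags);
        }
        int[] values = new int[(byteCount - Integer.BYTES) / Integer.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getInt();
        }
        return values;
    }
}
```

For new int[] {1, 2} this produces a leading count of 12 and 16 bytes in total under
the assumptions above.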
A map is of necessity treated differently. The map also starts with the total byte
count, followed by the MAP flag. Each key and each value in the map is then encoded
as an individual object with its own size, type and data fields.
I would like to find a way of writing only the number of fields at the beginning of
these. However, time pressure and the need to properly handle compressed data forced
me to place the total number of bytes as the first field.
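
A sketch of the map layout, under the same caveats as before: the flag values, the
names, the four-byte size fields, UTF-8 for strings, and the choice that each size
covers the type flags plus the data are assumptions, not the actual implementation.
It shows the map's total byte count, the MAP flag, and then keys and values
alternating, each carrying its own size, type and data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class MapEncodingSketch {
    // Hypothetical bit assignments, chosen only for this example.
    static final int INTEGER = 1 << 0;
    static final int STRING = 1 << 1;
    static final int MAP = 1 << 2;

    /** One object on the wire: [size][flags][data], with size covering flags and data. */
    static byte[] encodeObject(int flags, byte[] data) {
        return ByteBuffer.allocate(2 * Integer.BYTES + data.length)
                .putInt(Integer.BYTES + data.length)
                .putInt(flags)
                .put(data)
                .array();
    }

    /** Only Integer and String are handled here; a real encoder knows more types. */
    static byte[] encodeValue(Object value) {
        if (value instanceof Integer) {
            byte[] data = ByteBuffer.allocate(Integer.BYTES).putInt((Integer) value).array();
            return encodeObject(INTEGER, data);
        }
        if (value instanceof String) {
            return encodeObject(STRING, ((String) value).getBytes(StandardCharsets.UTF_8));
        }
        throw new IllegalArgumentException("unsupported type in this sketch: " + value.getClass());
    }

    /** Layout: [total byte count][MAP][key][value][key][value]... */
    static byte[] encodeMap(Map<?, ?> map) {
        ByteArrayOutputStream entries = new ByteArrayOutputStream();
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            byte[] key = encodeValue(entry.getKey());
            byte[] value = encodeValue(entry.getValue());
            entries.write(key, 0, key.length);
            entries.write(value, 0, value.length);
        }
        return encodeObject(MAP, entries.toByteArray());
    }
}
```

Because every key and value carries its own type flags, the key type and the value
type can differ from entry to entry, which is what a single array-style type field
could not express.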
Alex
--- On Tue, 1/12/10, Galder Zamarreno <galder(a)redhat.com> wrote:
From: Galder Zamarreno <galder(a)redhat.com>
Subject: Re: [infinispan-dev] Enconding of data WAS Re: Hot Rod - pt3
To: infinispan-dev(a)lists.jboss.org
Date: Tuesday, January 12, 2010, 10:48 AM
On 01/12/2010 05:46 PM, Galder Zamarreno wrote:
> And another question:
>
> With regards to individual fields in Map/Array, according to your wiki,
> each field then only contains the size of the field and the field. So,
> in a map of booleans, each field would be represented as: 0x01 and
> 0x01/0x00 with 1st being the size and 2nd being the actual value. Correct?

Well, this would be more like an array actually.

By the way, in a Map, how do you send key/value pairs? I suppose
[key][value][key][value]? And how do you provide type of key vs type of
value?
>
> On 01/12/2010 04:46 PM, Galder Zamarreno wrote:
>> Hi Alex,
>>
>> Wrt the discussion below about encoding data, I wanted to know what
>> exactly COMPRESSED type meant in your wiki, or what exactly it represents.
>>
>> Also, I wanted to know what bits you send when the object sent across
>> is an instance of java.lang.Long. Apart from marking LONG, do you also
>> mark it as SERIALIZED? Or do you just use SERIALIZED for user-specific
>> classes?
>>
>> Cheers,
>>
>> On 01/05/2010 10:52 AM, Galder Zamarreno wrote:
>>>
>>>
>>> On 01/04/2010 10:44 PM, Alex Kluge wrote:
>>>>>> </snip>
>>>
>>>>
>>>>>> - What happens if the key or the value is not text? I have a way of
>>>>>> representing the data to allow for a wide variety of data types,
>>>>>> even allowing for arrays or maps. This will make the protocol more
>>>>>> complex, but the assumption that the data is a string is rather
>>>>>> limiting. This is already sketched out in the wiki.
>>>
>>> Hmmmmm, I don't think I've made any assumptions in the wiki that keys or
>>> values are Strings unless I've made a mistake somewhere (maybe in the
>>> example where I've used a particular encoding for Strings?). My thoughts
>>> around this was that I was gonna treat them both as byte[]...
>>>
>>>>>
>>>>> </snip>
>>>>
>>>> The idea is to prefix each data block with a data type. This is a
>>>> lightweight binary protocol, so a full fledged mime type would be
>>>> overkill. There is a discussion and example in the Encoding Data
>>>> section of this page:
>>>>
>>>> http://community.jboss.org/wiki/RemoteCacheInteractions
>>>>
>>>> Data types are limited to things like integer, byte, string, boolean, etc.
>>>> Or, if it isn't a recognised type, the native platform serialisation is
>>>> used. There can be arrays of these types, or maps as well.
>>>>
>>>> Each data type is represented by a bit, and they can be used in
>>>> combinations. An array of bytes would have the array, byte and
>>>> primitive bits set. The set of recognised data types can of course be
>>>> expanded.
>>>
>>> ... but that looks rather useful (at least based on the cached version
>>> of that wiki that Google shows ;) - no wiki access right now :( ),
>>> particularly the way you can combine different bytes to create composed
>>> types. I don't see a problem with including this in Hot Rod.
>>>
>>> I suppose you included type length into length field so that in the
>>> future you can support other types and possibly longer type fields?
>>>
>
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev