1. The type appears to be pretty wasteful because the majority of types
cannot be combined with others. For example, you cannot have a type that is
Boolean and Long at the same time, or Double and Character. However, you
can potentially have an Array of Serialized. So, I propose instead
separating meta type from type.
Well, in a case like that, simply set the array bit but set no type bit. The
encoder would then encode each object separately, and the decoder would decode
each object separately. However, when the elements are all ints or all
characters, providing that information at the beginning of the stream allows
the data to be written without a per-element type.
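To make the "all ints, or all characters" case concrete, here is a minimal sketch of how an encoder might decide whether a type can be written once up front. The class and method names are hypothetical and not part of Hot Rod; it only illustrates the check, not the wire format.

```java
// Hypothetical helper, not part of Hot Rod: decides whether an array is
// homogeneous, so the type could be written once at the start of the stream.
final class ArrayTypeProbe {
    // Returns the class shared by every element, or null when the elements
    // differ, when one is null, or when the array is empty. A null result
    // means each element would have to carry its own type.
    static Class<?> commonType(Object[] items) {
        if (items.length == 0 || items[0] == null) {
            return null;
        }
        Class<?> first = items[0].getClass();
        for (Object item : items) {
            if (item == null || item.getClass() != first) {
                return null;
            }
        }
        return first;
    }
}
```

A null result corresponds to the "array bit set, no type bit" case, where every object is encoded separately.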
Meta types would be: Array, Map, Primitive and Compressed. The meta type
would be encoded using bit ops, so you could combine them in different ways,
e.g. Array of primitives. I don't think we should support combinations of
different collections, e.g. Map of Array, or Array of Maps. It would add
complexity and I don't foresee an immediate requirement for this.
This follows pretty naturally from my above comments, including arrays of
maps or maps, etc.
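As a rough illustration of meta types combined with bit ops, the sketch below assigns one bit per meta type. The bit positions and names are assumptions made for this example; nothing in this thread fixes the actual values.

```java
// Hypothetical bit assignments, for illustration only.
final class MetaTypes {
    static final int ARRAY      = 1;       // bit 0
    static final int MAP        = 1 << 1;  // bit 1
    static final int PRIMITIVE  = 1 << 2;  // bit 2
    static final int COMPRESSED = 1 << 3;  // bit 3

    // True when the given flag is set in the combined meta type.
    static boolean has(int metaType, int flag) {
        return (metaType & flag) != 0;
    }

    public static void main(String[] args) {
        // "Array of primitives", additionally marked as compressed.
        int metaType = ARRAY | PRIMITIVE | COMPRESSED;
        System.out.println(has(metaType, ARRAY));
        System.out.println(has(metaType, MAP));
    }
}
```

With one bit per meta type, all four fit in a single byte and any combination can be tested with a mask.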
For Maps, we've got two options: First, no type assumptions made and let
each key/value define its own type.
Which is what I did.
Or allow map meta-type definitions to be followed by not one but two type
fields. Even if the map was of mixed types, you could have Any, Any. My
preference is for the latter.
Now that we have more time, this becomes a viable alternative.
2. Serialized will be stored as byte[] internally; no attempt at
unmarshalling will be done in Hot Rod. Clients decide how they want to
marshall these Serialized types. They just need to give us a byte[] and
its length.
If they pass a byte array, then store it as a byte array. However, if they
pass an object that doesn't match any known type, then serialize it. But you
do have to be careful not to attempt to deserialize these objects within the
cache. It is more than likely that the cache has no definition for the object.
3. To clarify something that Alex mentioned in the previous encoding data
email, Arrays and Maps are followed by the number of items in Hot Rod and
not the number of bytes. In the case of Arrays of Any, each individual field
gives us its size, type and the data. In the case of Array of Booleans, each
individual field comes with size and data. Size could have been optional
in each field of Array of Boolean, but keeping it simplifies dealing with
Array of Serialized, where each individual field is a byte[] of arbitrary
length.
My original plan was to use the count of objects, and in most cases this
works. However, when the data is compressed, you need to read the whole
compressed block (at least as I had implemented the code). We have the time
now, perhaps, to come up with a more elegant approach.
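For the uncompressed case, the "count of items, then size and data per item" layout for an Array of Serialized could look roughly like the sketch below. It assumes fixed 4-byte big-endian ints for the count and sizes purely to keep the example short; the class name is hypothetical and this is not the actual Hot Rod wire format.

```java
import java.nio.ByteBuffer;

// Hypothetical codec, for illustration only: [count][size][data][size][data]...
final class SerializedArrayCodec {
    // Writes the number of items (not the number of bytes), then each item
    // as its size followed by its raw bytes.
    static byte[] encode(byte[][] items) {
        int total = Integer.BYTES;
        for (byte[] item : items) {
            total += Integer.BYTES + item.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        buf.putInt(items.length);
        for (byte[] item : items) {
            buf.putInt(item.length);
            buf.put(item);
        }
        return buf.array();
    }

    // Reads the count first, then exactly that many size-prefixed items.
    // The payload bytes are never interpreted, only copied.
    static byte[][] decode(byte[] encoded) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        byte[][] items = new byte[buf.getInt()][];
        for (int i = 0; i < items.length; i++) {
            items[i] = new byte[buf.getInt()];
            buf.get(items[i]);
        }
        return items;
    }
}
```

Because every item carries its own size, the decoder can walk the stream item by item without knowing how the client marshalled each byte[].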
--- On Thu, 2/18/10, Manik Surtani <manik(a)jboss.org> wrote:
From: Manik Surtani <manik(a)jboss.org>
Subject: Re: [infinispan-dev] HotRod: Digging further into the encoding of data
To: "infinispan -Dev List" <infinispan-dev(a)lists.jboss.org>
Date: Thursday, February 18, 2010, 9:40 AM
Not sure I understand. Why does this need to be as complex as this?
When you say encoding of data, surely all this data is, is a byte[]?
On 18 Feb 2010, at 15:26, Galder Zamarreno wrote:
Hi,
I've been looking again at the encoding of data in Hot Rod
(http://community.jboss.org/wiki/HotRodProtocol) and there are a few things
I'm not too happy about or that are not totally clear:
1. The type appears to be pretty wasteful because the majority of types
cannot be combined with others. For example, you cannot have a type that is
Boolean and Long at the same time, or Double and Character. However, you
can potentially have an Array of Serialized. So, I propose instead
separating meta type from type.
Meta types would be: Array, Map, Primitive and Compressed. The meta type
would be encoded using bit ops, so you could combine them in different ways,
e.g. Array of primitives. I don't think we should support combinations of
different collections, e.g. Map of Array, or Array of Maps. It would add
complexity and I don't foresee an immediate requirement for this.
Type would be: Byte, Boolean, Character, String, Date, Double, Float,
Integer, Long, Short, Serialized, StringBuilder, and Any. These would be
literals from 1 to N. Note that I've added Any to distinguish between two
different kinds of collection. For example, if you send an Array of String,
each individual element just follows together with its size. However, if you
send an Array of Any, each individual entry must define its type.
For Maps, we've got two options: First, no type assumptions made and let
each key/value define its own type. Or allow map meta-type definitions to
be followed by not one but two type fields. Even if the map was of mixed
types, you could have Any, Any. My preference is for the latter.
Both type and metatype would be variable length integers.
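The message does not say which variable-length scheme is meant. As a minimal sketch, assuming the common layout of 7 payload bits per byte with the high bit as a continuation flag (an assumption, not something stated in this thread), encoding and decoding might look like this. The class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical variable-length int codec: 7 data bits per byte, least
// significant group first, high bit set when more bytes follow.
final class VarInt {
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
            value >>>= 7;
        }
        out.write(value); // final byte, continuation flag clear
        return out.toByteArray();
    }

    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of this integer
            }
            shift += 7;
        }
        return result;
    }
}
```

Under this scheme any value below 128 takes a single byte, which would suit small literals such as type and meta-type codes.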
2. Serialized will be stored as byte[] internally; no attempt at
unmarshalling will be done in Hot Rod. Clients decide how they want to
marshall these Serialized types. They just need to give us a byte[] and
its length.
3. To clarify something that Alex mentioned in the previous encoding data
email, Arrays and Maps are followed by the number of items in Hot Rod and
not the number of bytes. In the case of Arrays of Any, each individual field
gives us its size, type and the data. In the case of Array of Booleans, each
individual field comes with size and data. Size could have been optional
in each field of Array of Boolean, but keeping it simplifies dealing with
Array of Serialized, where each individual field is a byte[] of arbitrary
length.
For Maps, a similar thing happens. If we have a Map of String, Boolean, we
get key/value pairs like: k=[size+data],v=[size+data]. If it's a Map of Any,
Any, we get k=[type+size+data],v=[type+size+data].
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev