[hibernate-dev] HSEARCH Serialization protocol ready for trunk

Emmanuel Bernard emmanuel at hibernate.org
Sat Aug 6 15:07:47 EDT 2011


On 6 Aug 2011, at 19:22, Sanne Grinovero wrote:

> Hi Emmanuel,
> 
> 2011/8/6 Emmanuel Bernard <emmanuel at hibernate.org>:
>> I forgot to say that
>> 
>> Our Avro serializer is slower (1.6) than Java serialization, esp. when the VM is not warm (small loop value like 1000).
>> It evens up on longer loops like 100000.
>> 
>> Our Avro serializer is slower (2.5) than Java serialization, esp. when the VM is not warm (small loop value like 1000).
>> It evens up or beats the Java serialization on longer loops like 100000.
> 
> What is the meaning of the number in parenthesis (the number which
> makes the two statements different) ?

On my machine, with a loop size of 1000, the Avro serializer takes 1.6x the time of the raw Java serializer (not the model implementation, mind you) and the Avro deserializer takes 2.5x the Java deserialization time. I noticed it because I had to increase the JGroups network wait period in the tests.
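To make the cold versus warm comparison concrete, here is a minimal timing sketch. It only times plain Java serialization; the class `SerializationTiming` and its method are hypothetical names for illustration, not the benchmark used for the numbers above, and a naive loop like this is only a rough indicator.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of Hibernate Search.
final class SerializationTiming {

    /** Serializes the payload `loops` times and returns the average nanoseconds per call. */
    static long averageNanos(Serializable payload, int loops) {
        try {
            long start = System.nanoTime();
            for (int i = 0; i < loops; i++) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                ObjectOutputStream out = new ObjectOutputStream(bytes);
                out.writeObject(payload);
                out.close();
            }
            return (System.nanoTime() - start) / loops;
        } catch (IOException e) {
            // ByteArrayOutputStream does not fail in practice; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `averageNanos(payload, 1000)` in a fresh VM and then `averageNanos(payload, 100000)` gives the two loop sizes mentioned above; timing the Avro path the same way would give the ratio.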

> 
>> However, Avro's message is half the size and there is some room for improvement.
> 
> that sounds pretty good and well suited for our purpose.

Right, and the following simple optimizations will make it even smaller:

- write a class name once and use a small reference to represent it (one byte allows 256 different classes per message; maybe two bytes to be safe)
- serialize simple id types with their Avro native equivalent (i.e. integer, long and string)
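A rough sketch of the first optimization, assuming a one-byte reference. It uses plain `java.io` streams rather than the Avro API, and `ClassNameTable` with its layout (table first, then one byte per entry) is a hypothetical illustration, not the format in the pull request.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hibernate Search or Avro.
final class ClassNameTable {
    // Insertion order matches the assigned reference numbers (0, 1, 2, ...).
    private final Map<String, Integer> refs = new LinkedHashMap<String, Integer>();

    /** Returns the small reference for a class name, assigning one on first use. */
    int refFor(String className) {
        Integer existing = refs.get(className);
        if (existing != null) {
            return existing;
        }
        if (refs.size() >= 256) {
            throw new IllegalStateException("more than 256 classes in one message");
        }
        int ref = refs.size();
        refs.put(className, ref);
        return ref;
    }

    /** Writes each distinct class name once, then a single byte per entry. */
    static byte[] encode(List<String> classNamePerEntry) {
        ClassNameTable table = new ClassNameTable();
        int[] entryRefs = new int[classNamePerEntry.size()];
        for (int i = 0; i < entryRefs.length; i++) {
            entryRefs[i] = table.refFor(classNamePerEntry.get(i));
        }
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeShort(table.refs.size());          // number of distinct names
            for (String name : table.refs.keySet()) {
                out.writeUTF(name);                     // each name written once
            }
            out.writeInt(entryRefs.length);             // number of entries
            for (int ref : entryRefs) {
                out.writeByte(ref);                     // one byte per entry
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With many operations on the same few entity types, each repeated class name costs one byte instead of the full string.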

>>> 
>>> I have one minor question
>>> Should hibernate.search.jms.indexNameProperty be part of the protocol or left separate like today?
> 
> You likely noticed I had it included in the JGroups backend, since it
> has no such properties. I've no strong opinions about it, only thing I
> can think of is that having it in the protocol would likely be better
> since it means it will be included in what we consider "binary format"
> and test for in backwards compatibility; for example I think it should
> be "optional": defined only if the receiving side handles more than a
> single index.

Can you open a JIRA to track that?
One drawback I see with this approach is that without parsing the message byte[] we can't know which index is targeted. I'm not sure that's a problem in practice.
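One way to limit that drawback, sketched under the assumption that the optional index name sits at the very start of the message, is to read only that prefix and leave the rest unparsed. `IndexNameHeader` and its layout are hypothetical, not the format in the pull request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical envelope for illustration; not part of Hibernate Search.
final class IndexNameHeader {

    /** Prefixes the payload with a presence flag and, if present, the index name. */
    static byte[] wrap(String indexName, byte[] payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeBoolean(indexName != null);   // the name is optional
            if (indexName != null) {
                out.writeUTF(indexName);
            }
            out.write(payload);                    // serialized work, left opaque
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads only the header; returns null when no index name was written. */
    static String peekIndexName(byte[] message) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(message));
            return in.readBoolean() ? in.readUTF() : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A receiver handling several indexes could call `peekIndexName` to route the message and only then deserialize the payload.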

> 
>>> I'll send a pull request in the next few minutes
> 
> I'll review it as soon as you do.

I sent it yesterday :)
https://github.com/hibernate/hibernate-search/pull/108


