[hibernate-dev] Re: Pushing indexes through JGroups

Wed Jun 10 15:22:30 EDT 2009

Hi Lukasz,
I've been looking into your code. I have some comments, but please
forgive me as I don't have any real experience with JGroups, so I'll
only tell you how well I think this code fits into Hibernate Search.

1) The Maven dependency on JGroups should probably be marked
"optional", so please make sure Search keeps working fine for
people who are not interested in this work and don't have
jgroups.jar around, even if others will love it.

2) Look out for style, especially whitespace; for example in
BatchedQueueingProcessor look at how the formatting of line 91
differs from lines 82, 85 or 88.
Which IDE are you using? We have some template settings ready to help
with this; you can find them on the hibernate.org wiki.

3) JGroupsBackendQueueProcessor is a Runnable that sets two fields in
its constructor; they should be final to make sure they are safely
set before it is run in another thread.
Just add the "final" modifier to lines 19 and 20.
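
To illustrate what I mean, here is a minimal sketch. The class, field
names and types are made up for the example and are not taken from the
patch; only the "final" modifiers matter.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for JGroupsBackendQueueProcessor; names and types are assumptions.
class QueueProcessorSketch implements Runnable {
    // final: assigned exactly once in the constructor, and the Java memory
    // model guarantees other threads see the assigned values
    private final List<String> queue;
    private final String channelName;

    QueueProcessorSketch(List<String> queue, String channelName) {
        this.queue = queue;
        this.channelName = channelName;
    }

    // Builds the message that run() prints; kept separate so it is easy to check
    String describe() {
        return "sending " + queue.size() + " works to " + channelName;
    }

    @Override
    public void run() {
        System.out.println(describe());
    }

    public static void main(String[] args) throws InterruptedException {
        // Run the processor in another thread, as the backend would
        Thread worker = new Thread(
                new QueueProcessorSketch(Arrays.asList("add", "delete"), "hsearch"));
        worker.start();
        worker.join();
    }
}
```

With final fields the compiler also rejects any constructor path that
forgets to assign one of them.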

4) JGroupsBackendQueueProcessorFactory's design:
It looks like "listenerClassName" is meant to be mandatory, with no
default implementation, yet the factory actually falls back to a Lucene
implementation when this option is missing. IMHO this adds some
extra complexity to the class which you don't really
need. Is there a good reason for that? Someone could forget an
option in the configuration; it would be better to throw an exception
to notify the user about the configuration inconsistency, or to rely
on a good default, than to silently do something different.
Maybe I'm wrong, but then some comments could help me out. Is the user
supposed to configure both message producers and receivers with the
same kind of BackendQueueProcessorFactory? That's probably not needed,
and not consistent with the way the JMS backend is configured.
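
As a rough sketch of the "fail fast" option: the property key below
just reuses the "listenerClassName" name from above, and the class and
method names are invented for the example. In Search itself you would
probably throw a SearchException; I use IllegalArgumentException here
only to keep the snippet free of dependencies.

```java
import java.util.Properties;

// Hypothetical sketch; not the patch's actual code.
class ListenerConfigSketch {
    // Assumed property key; the real key may be different
    static final String LISTENER_KEY = "listenerClassName";

    // Returns the configured class name, or fails loudly if it is missing
    static String requireListenerClass(Properties props) {
        String name = props.getProperty(LISTENER_KEY);
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException(
                    "Missing mandatory option '" + LISTENER_KEY + "'");
        }
        return name.trim();
    }

    public static void main(String[] args) {
        Properties props = new Properties(); // option deliberately left out
        try {
            requireListenerClass(props);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This way a forgotten option surfaces at startup instead of silently
switching to a different backend.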

5) JGroupsAbstractMessageReceiver's design:
This is very similar to the JMS abstract receiver, but with JMS
I'd expect to have to annotate something in my EJB classes so that
it gets deployed by the container and associated with the queue; so
with JMS it's mandatory for the user to write some class.
Your solution is fine, but wouldn't it be possible to have a "no-code"
solution? The user could just configure this deployment to say
something like "this is the jgroups configuration, this is the
hibernate configuration, you know where to find the entity classes...
please listen to the channels and do your job".
It would be very cool to just package the Search jar with some
configuration lines (and the entities of course, to read some more
Search configuration) and be ready to start listening for messages.
Actually some future version could avoid the entities and receive the
serialized configuration... just a thought, but that would enable us to
prepackage a whole server ready as a Search backend without even
needing to deploy any user code.

6) Testing: I couldn't start the tests as JGroups was failing to bind to
ports on my machine. I'm sure I am doing something wrong; I will try
again after reading some docs about it.
But anyway I got a bit confused about the notion of "Master"s and
"Receivers"; with JMS I'm used to seeing the master as the one taking
care of the index, so receiving the docs, not sending them.

Generally speaking, add some comments and debug log statements (using
{} placeholders instead of string concatenation).
I'll try to test it on remote staging servers this weekend; it looks promising!
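
About the {} remark: the point is that the message is only built when
debug is actually enabled. The tiny logger below is a made-up stand-in
for an SLF4J-style logger so the sketch runs without dependencies; it
is not the real logging API.

```java
// Hypothetical stand-in for an SLF4J-style logger; all names are assumptions.
class PlaceholderLogSketch {
    static boolean debugEnabled = false;
    static int formatCount = 0; // counts how often a message was actually built

    // Replaces each "{}" in the template with the next argument, in order
    static String format(String template, Object... args) {
        formatCount++;
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break; // more arguments than placeholders: ignore the rest
            }
            sb.append(template, from, at).append(arg);
            from = at + 2;
        }
        return sb.append(template.substring(from)).toString();
    }

    // The message is only formatted when debug output is switched on
    static void debug(String template, Object... args) {
        if (debugEnabled) {
            System.out.println(format(template, args));
        }
    }

    public static void main(String[] args) {
        debug("sent {} works to {}", 3, "master"); // skipped: debug is off
        debugEnabled = true;
        debug("sent {} works to {}", 3, "master"); // formatted and printed
        System.out.println("messages built: " + formatCount);
    }
}
```

With plain string concatenation the message would be built on every
call, even when debug logging is disabled.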

Sanne

2009/6/10 Łukasz Moreń <lukasz.moren at gmail.com>:
> Hi,
>
> I've finished the task of replacing JMS with JGroups. The patch is
> attached. The general idea of pushing indexes through JG works, however
> there are issues to improve (e.g. flexible JG protocol stack configuration).
> Any review or advice would be welcome to make sure that I am not heading into
> a blind alley.
>
> Thanks,
> Lukasz
>
> 2009/5/27 Emmanuel Bernard <emmanuel at hibernate.org>
>>
>> Lukasz,
>> I have been discussing #3 with Manik and we think that JBoss Cache /
>> Infinispan are probably a better fit than plain JGroups for that, as all the
>> plumbing will be configured for you.
>> When you reach this problem, let's revive this discussion.
>>
>> On  May 25, 2009, at 11:07, Hardy Ferentschik wrote:
>>
>>> Hi,
>>>
>>> I talked with Łukasz about this last week. Definitely #1 and #3.
>>> I don't like #2 either.
>>>
>>> The benefit of #3 would also be that one could drop the requirement of
>>> having a shared file system (NFS, NAS, ...). #3 should be quite easy to
>>> implement. Maybe easy to get started with.
>>>
>>> --Hardy
>>>
>>> On Mon, 25 May 2009 10:55:52 +0200, Emmanuel Bernard
>>> <emmanuel at hibernate.org> wrote:
>>>
>>>> Hello
>>>> I am not sure this is where we should go, or at least, it depends. Here
>>>> are three scenarios:
>>>>
>>>>
>>>> #1 JMS replacement
>>>> If you want to use JGroups as a replacement for the JMS backend, then I
>>>> think you should write a JGroups backend. Check
>>>> org.hibernate.search.backend.impl.jms
>>>> In this case all changes are sent via JGroups to a "master". The master
>>>> could be elected by the cluster, possibly dynamically, but that's not necessary
>>>> for the first version.
>>>>
>>>> #2 apply indexing on all nodes
>>>> JGroups could send the work queue to all nodes and each node could apply
>>>> the change.
>>>> For various reasons I am not a fan of this solution, as it creates overhead
>>>> in CPU / memory usage and does not scale very well from a theoretical PoV.
>>>>
>>>> #3 Index copy
>>>> This is what you are describing: copying the index using JGroups instead
>>>> of my file system approach. This might have merits, especially as we could reduce
>>>> network traffic using multicast, but it also requires rethinking the master /
>>>> slave modus operandi.
>>>> Today the master copies a clean index to a shared directory on a regular
>>>> basis.
>>>> On a regular basis, the slaves go and copy the clean index from the
>>>> shared directory.
>>>> In your approach, the master would send changes to the slaves and slaves
>>>> would have to apply them "right away" (on their passive version).
>>>>
>>>> I think #1 is more interesting than #3, we probably should start with
>>>> that. #3 might be interesting too, thoughts?
>>>>
>>>> Emmanuel
>>>>
>>>> PS: refactoring is a fact of life, so feel free to do so. Just don't
>>>> break public contracts.
>>>>
>>>> On  May 21, 2009, at 22:14, Łukasz Moreń wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have a few questions concerning the use of JGroups to copy index files. I
>>>>> am thinking of creating sender (for master) and receiver (for slave) directory providers.
>>>>> The sender class would be mainly based on the existing FSMasterDirectoryProvider: first
>>>>> create a local index copy and send it later to the slave nodes
>>>>> (or send without copying, but that may cause lower performance?).
>>>>> To avoid code redundancy it would be good to refactor the
>>>>> FSMasterDirectoryProvider class a little, so that I can use the copying functionality in
>>>>> the new DirectoryProvider and add the sending part; or should I rather work around
>>>>> it?
>>>>>
>>>>> I do not completely understand how multithreaded access to the
>>>>> index files works. Does the FileChannel class guarantee safe access when the index is
>>>>> being copied and new Lucene works are pushed?
>>>
>>>
>>>
>>
>
>