[hibernate-dev] Re: Pushing indexes through JGroups

Fri Jun 12 19:08:54 EDT 2009

Hi,

#2. I am using IntelliJ 8.1. I downloaded the IntelliJ code style XML file
from the wiki, but it still differs slightly from the style currently used
in HS (e.g. import order or line length).

#4. The "listenerClassName" option of JGroupsBackendQueueProcessorFactory
is mandatory, but only for the master node, where it specifies the class
responsible for receiving messages from the slaves. I assumed that if the
option exists the node is the master, otherwise it is a slave.
Falling back to a Lucene implementation was intended to avoid a problem
seen with the JMS backend: setting up the master node to also act as a
slave node led to message duplication, because the MDB received a message,
recreated it and put it back on the queue.
For the master node, JGroupsBackendQueueProcessorFactory also creates a
LuceneFactory, and the Lucene works received from the slaves are processed
by a LuceneProcessor, so message duplication is avoided. That is the only
reason I used it.
The user is supposed to create both producers and receivers with the same
JGroupsBackendQueueProcessorFactory.

//receiver sample configuration
<property name="hibernate.search.worker.backend" value="jgroups"/>
<property name="hibernate.search.worker.jgroups.channel_name"
value="hs_jg_channel"/>
<property name="hibernate.search.worker.jgroups.listener_class"
value="pl.lmoren.master.JGroupsMessageReceiverImpl"/>

//producers sample configuration
<property name="hibernate.search.worker.backend" value="jgroups"/>
<property name="hibernate.search.worker.jgroups.channel_name"
value="hs_jg_channel"/>

However, in the receiver's case the JGroups factory creates Lucene
processors because of the problem described above.
For the master node the JGroups factory just initializes the communication
channel; all later work is done by the Lucene backend, the same as in JMS.
This solution is not fully consistent with the JMS backend; I think that
was forced by the different nature of JMS and JGroups.
In JMS the receiver was configured in the app server config file,
in JGroups I placed it in the Hibernate configuration.
What do you think about such a design?
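
To make the rule concrete, here is a minimal sketch of the role selection
described above. It is not the code from the patch: the class and method
names (JGroupsRoleSketch, resolveRole, describeBackend) are made up for
illustration, and only the property key is taken from the sample
configuration.

```java
import java.util.Properties;

/** Hypothetical sketch: derive the node role from the configuration. */
final class JGroupsRoleSketch {

    enum Role { MASTER, SLAVE }

    // Property key as used in the sample configuration in this mail.
    static final String LISTENER_CLASS =
            "hibernate.search.worker.jgroups.listener_class";

    /** A node is treated as master only when a listener class is configured. */
    static Role resolveRole(Properties props) {
        String listener = props.getProperty(LISTENER_CLASS);
        boolean missing = listener == null || listener.trim().isEmpty();
        return missing ? Role.SLAVE : Role.MASTER;
    }

    /** Describes where index work ends up for each role. */
    static String describeBackend(Role role) {
        switch (role) {
            case MASTER:
                // Work received from slaves is applied locally, never re-sent.
                return "apply received work through the Lucene backend";
            default:
                return "send work to the master over the JGroups channel";
        }
    }

    public static void main(String[] args) {
        Properties slave = new Properties();
        Properties master = new Properties();
        master.setProperty(LISTENER_CLASS,
                "pl.lmoren.master.JGroupsMessageReceiverImpl");
        System.out.println(describeBackend(resolveRole(slave)));
        System.out.println(describeBackend(resolveRole(master)));
    }
}
```

The downside Sanne pointed out is visible here: a master whose listener
option was forgotten silently becomes a slave. An explicit role option
that throws on inconsistency would avoid that.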

#5. That is a question I was also thinking about, to make clustering more
transparent.
However, I haven't found a good idea for that. The purpose of having the
user extend JGroupsAbstractMessageReceiver was to implement the getSession
method, where the session used by the backend could be created.
I suppose that without this method I would not know whether the
Hibernate SessionFactory should come from e.g. a persistence context, a
JNDI lookup or some helper class.
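
Roughly, the idea looks like the sketch below. It is self-contained and
does not use the real Hibernate or JGroups types: FakeSession stands in
for a Hibernate Session, and ReceiverSketch, HelperBackedReceiver and
cleanSessionIfNeeded are hypothetical names, not the classes from the
patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a Hibernate Session; records the work applied through it. */
final class FakeSession {
    final List<String> applied = new ArrayList<>();
    boolean closed;
}

/** Hypothetical abstract receiver: the user only supplies the session. */
abstract class ReceiverSketch {

    /** The user decides the source: persistence context, JNDI, helper class. */
    protected abstract FakeSession getSession();

    /** Hook for releasing the session after the work has been applied. */
    protected void cleanSessionIfNeeded(FakeSession session) {
        session.closed = true;
    }

    /** Called for each batch of work received from a slave node. */
    public final FakeSession receive(List<String> works) {
        FakeSession session = getSession();
        try {
            session.applied.addAll(works);
        } finally {
            cleanSessionIfNeeded(session);
        }
        return session;
    }
}

/** Example user subclass that gets its session from a helper. */
final class HelperBackedReceiver extends ReceiverSketch {
    @Override
    protected FakeSession getSession() {
        return new FakeSession();
    }
}
```

A "no-code" variant would need a default getSession that builds the
SessionFactory from the Hibernate configuration shipped with the
receiver, which is the part I do not have a good answer for yet.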

#6. The problem with JGroups occurs when it tries to multicast traffic
towards the ISP rather than the internal network; I think the packets are
then dropped. I will look into that over the weekend.
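
As a quick diagnostic, something like the following lists the interfaces
that are up and support multicast, using only java.net. The class name
MulticastInterfaces is made up for this mail, and this only shows the
candidates; it does not tell which interface JGroups actually picked.

```java
import java.io.UncheckedIOException;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

/** Lists non-loopback interfaces that are up and support multicast. */
final class MulticastInterfaces {

    static List<String> list() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> all =
                    NetworkInterface.getNetworkInterfaces();
            if (all == null) {
                return result; // no interfaces reported on this machine
            }
            for (NetworkInterface nif : Collections.list(all)) {
                if (nif.isUp() && nif.supportsMulticast() && !nif.isLoopback()) {
                    // Interface name followed by its bound addresses.
                    result.add(nif.getName() + " "
                            + Collections.list(nif.getInetAddresses()));
                }
            }
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : list()) {
            System.out.println(line);
        }
    }
}
```

If the internal interface shows up here, binding JGroups to its address
explicitly (as far as I know via the jgroups.bind_addr system property)
may be enough to keep the traffic off the external network.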

Lukasz

2009/6/10 Sanne Grinovero <sanne.grinovero at gmail.com>

> Hi Lukasz,
> I've been looking into your code; I have some comments but please
> forgive me as I don't have any real experience about JGroups, so I'll
> only tell you how much I see this code fit into Hibernate Search.
>
> 1) The maven dependency upon JGroups should probably be of type
> "optional", so please make sure search is also going to work fine for
> people who are not really interested in this work and having the
> jgroups.jar around, even if others will love it.
>
> 2) Look out for style, especially white spacing; for example in
> BatchedQueueingProcessor look at the formatting difference of line 91
> compared to 82,85 or 88.
> Which IDE are you using? We have some template settings ready to help
> with this, you can find them on the hibernate.org wiki.
>
> 3) JGroupsBackendQueueProcessor is a Runnable setting two arguments in
> the constructor, they should be final to make sure they are set before
> they are run in another thread;
> just add "final" modifier to lines 19 and 20.
>
> 4)JGroupsBackendQueueProcessorFactory's design:
> It looks like a "listenerClassName" is mandatory and not providing a
> default implementation; it actually falls back to a Lucene
> implementation when this option is missing. This looks like IMHO
> adding some extra complexity into the class which you don't really
> need. Is there a good reason for that? Someone could forget some
> option in the configuration, it would be better to throw an exception
> to notify the user about the configuration inconsistency than to do
> something differently, or rely on a good default.
> Maybe I'm wrong, but then some comments could help me out. Is the user
> supposed to configure both message producers and receivers with the
> same kind of BackendQueueProcessorFactory? That's probably not needed,
> and not consistent with the way the JMS backend is configured.
>
> 5)JGroupsAbstractMessageReceiver's design:
> This is very similar to the JMS abstract receiver, but in case of JMS
> I'd expect to have to "annotate" something in my ejb classes, so that
> it gets deployed by the container and associated to the queue, so in
> case of JMS it's mandatory for the user to write some class.
> Your solution is fine, but wouldn't it be possible to have a "no-code"
> solution? The user could just configure this deployment to say
> something like "this is the jgroups configuration, this is the
> hibernate configuration, you know where to find the entity classes...
> please listen to the channels and do your job".
> It would be very cool to have just to package the search jar with some
> configuration lines (and the entities of course to read some more
> Search configuration) and be ready to start listening for messages.
> Actually some future version could avoid the entities and receive the
> serialized configuration.. just a thought, but that would enable us to
> prepackage a whole server ready as a Search backend without even
> needing to deploy any user code.
>
> 6) testing... I couldn't start them as JGroups was failing to bind to
> ports on my machine, I'm sure I am doing something wrong, will try
> again after reading some docs about it.
> But anyway I got a bit confused about the notion of "Master"s and
> "Receivers"; in JMS I'm used to seeing the master as the one taking
> care of the index, so receiving the docs, not sending them.
>
> Generally speaking, add some comments and debug log statements (using
> {} placeholders instead of string concatenation);
> I'll try it this weekend on remote staging servers, it looks
> promising!
>
> Sanne
>
> 2009/6/10 Łukasz Moreń <lukasz.moren at gmail.com>:
> > Hi,
> >
> > I've finished the task of replacing JMS with JGroups. The patch is
> > attached. The general idea of pushing indexes through JG is in place;
> > however, there are issues to improve (e.g. flexible JG protocol stack
> > configuration).
> > Any review or advice would be welcome to make sure that I am not going
> > down a blind alley.
> >
> > Thanks,
> > Lukasz
> >
> > 2009/5/27 Emmanuel Bernard <emmanuel at hibernate.org>
> >>
> >> Lukasz,
> >> I have been discussing #3 with Manik and we think that JBoss Cache /
> >> Infinispan are probably a better fit than plain JGroups for that, as all
> >> the plumbing will be configured for you.
> >> When you reach this problem, let's revive this discussion.
> >>
> >> On  May 25, 2009, at 11:07, Hardy Ferentschik wrote:
> >>
> >>> Hi,
> >>>
> >>> I talked with Łukasz about this last week. Definitely, #1 and #3.
> >>> #2 I don't like either.
> >>>
> >>> The benefit of #3 would also be that one could drop the requirement of
> >>> having a shared file system (NFS, NAS, ...). #3 should be quite easy to
> >>> implement. Maybe easy to get started with.
> >>>
> >>> --Hardy
> >>>
> >>> On Mon, 25 May 2009 10:55:52 +0200, Emmanuel Bernard
> >>> <emmanuel at hibernate.org> wrote:
> >>>
> >>>> Hello
> >>>> I am not sure this is where we should go, or at least, it depends.
> >>>> Here are three scenarios:
> >>>>
> >>>>
> >>>> #1 JMS replacement
> >>>> If you want to use JGroups as a replacement for the JMS backend, then I
> >>>> think you should write a jgroups backend. Check
> >>>> org.hibernate.search.backend.impl.jms
> >>>> In this case all changes are sent via JGroups to a "master". The master
> >>>> could be elected by the cluster, possibly dynamically, but that's not
> >>>> necessary for the first version.
> >>>>
> >>>> #2 apply indexing on all nodes
> >>>> JGroups could send the work queue to all nodes and each node could apply
> >>>> the change.
> >>>> For various reasons I am not a fan of this solution, as it creates
> >>>> overhead in CPU / memory usage and does not scale very well from a
> >>>> theoretical PoV.
> >>>>
> >>>> #3 Index copy
> >>>> This is what you are describing: copying the index using JGroups instead
> >>>> of my file system approach. This might have merits, esp. as we could
> >>>> diminish network traffic using multicast, but it also requires
> >>>> rethinking the master / slave modus operandi.
> >>>> Today the master copies a clean index to a shared directory on a
> >>>> regular basis.
> >>>> On a regular basis, the slaves go and copy the clean index from the
> >>>> shared directory.
> >>>> In your approach, the master would send changes to the slaves, and the
> >>>> slaves would have to apply them "right away" (on their passive version).
> >>>>
> >>>> I think #1 is more interesting than #3, we probably should start with
> >>>> that. #3 might be interesting too, thoughts?
> >>>>
> >>>> Emmanuel
> >>>>
> >>>> PS: refactoring is a fact of life, so feel free to do so. Just don't
> >>>> break public contracts.
> >>>>
> >>>> On  May 21, 2009, at 22:14, Łukasz Moreń wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I have a few questions about using JGroups to copy index files. I am
> >>>>> thinking of creating sender (for master) and receiver (for slave)
> >>>>> directory providers.
> >>>>> The sender class would be based mainly on the existing
> >>>>> FSMasterDirectoryProvider: first create a local index copy and later
> >>>>> send it to the slave nodes
> >>>>> (or send without copying, but that may cause lower performance?).
> >>>>> To avoid code redundancy it would be good to refactor the
> >>>>> FSMasterDirectoryProvider class a little, so that I can reuse the
> >>>>> copying functionality in the new DirectoryProvider and add the sending
> >>>>> part; or should I rather work around it?
> >>>>>
> >>>>> I do not completely understand how multithreaded access to the
> >>>>> index files works. Does the FileChannel class ensure it is safe when
> >>>>> the index is being copied while new Lucene works are pushed?