[hibernate-dev] [Hibernate Search] Database back end worker

Wed Aug 5 17:38:43 EDT 2015

Hi Flemming,
welcome to this list!

I waited a bit to reply myself, as you already know I like the
proposal. Unfortunately many others are on holidays, so other feedback
might be slow.

Still, I wouldn't let that slow you down: please start the work for
merging it. I already anticipated over chat that this would come, and
we all agree that the concept is useful!
I don't think others have looked at the details yet, but if concerns
come up at that level, we can address smaller issues incrementally.
(I also didn't look at micro-details, as it's easier to comment on
those on a pull request).

I had the same question as Martin regarding clustering: with the
current implementation you expect something like the master/slave
configuration, or Infinispan to be used as storage, correct?
I also think it would be interesting to explore the approach further
to also - optionally - serve as a replacement for these, but that's
another feature which is easier to experiment with after the core
concept is merged.
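
To make the clustering question concrete, here is a rough sketch of the
"only one node processes the workers at a time, with a floating master"
behaviour described further down the thread. It is a toy model, not the
code from the proposed backend: SharedWorkTable, tryAcquire, release and
drainAs are names invented for illustration, and an in-memory lock stands
in for whatever Quartz (or an HA-singleton) would coordinate through the
shared database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Toy stand-in for the shared work table (hypothetical class, in-memory only).
 * Any node may become the processor, but only one node at a time.
 */
final class SharedWorkTable {

	private final Queue<String> pendingWork = new ConcurrentLinkedQueue<String>();

	// Name of the node currently allowed to process work, or null if none.
	private final AtomicReference<String> lockOwner = new AtomicReference<String>();

	void enqueue(String work) {
		pendingWork.add( work );
	}

	/** Returns true only if no other node currently holds the lock. */
	boolean tryAcquire(String nodeName) {
		return lockOwner.compareAndSet( null, nodeName );
	}

	/** Releases the lock, but only if this node is the current owner. */
	void release(String nodeName) {
		lockOwner.compareAndSet( nodeName, null );
	}

	/**
	 * Processes all pending work as the given node. Returns an empty list
	 * when another node holds the lock, leaving the work in place.
	 */
	List<String> drainAs(String nodeName) {
		List<String> processed = new ArrayList<String>();
		if ( !tryAcquire( nodeName ) ) {
			return processed;
		}
		try {
			String work;
			while ( ( work = pendingWork.poll() ) != null ) {
				// A real backend would apply the Lucene work to the index here.
				processed.add( work );
			}
		}
		finally {
			release( nodeName );
		}
		return processed;
	}
}
```

In this model each node would call drainAs from its own scheduled job;
whichever node wins the lock acts as the master for that run, and the
others simply find nothing to do until it is released.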

In short, I would simply merge your backend as a new module in
Hibernate Search! Fork our repository, and send a pull request.

# Code layout / Modules

In terms of code structure, you might have noticed that the module
'hibernate-search-engine' (/engine in the source code) does not depend
on JPA nor Hibernate ORM; the reason is that other projects reuse the
core indexing strategy and the backends. Since it would be nice to
allow them to optionally use your backend, without mandating a
dependency on ORM for those who don't use it, I think this should be a
new Maven module.

We already have
 /backends/jgroups
 /backends/jms

So we could add (name to be refined?) :
 /backends/relationaldb

Also, your integration tests probably should be moved together with
our other integration tests. They are currently running WildFly
10.0.0.Alpha6, but that shouldn't be a problem.

# Code Style

We use tabs ;-)
We also have various other "exotic" conventions regarding white-space
usage, the right file headers, etc.
We use CheckStyle to keep it tidy. It will give you lots of errors, and
when there are many it's not very helpful, so I would suggest taking the
formatting templates attached at the following link and using your IDE's
formatting capabilities, resorting to CheckStyle just for the final
validation:
 - http://hibernate.org/search/contribute/

# JDK

It looks like your extension requires Java 8; if you could convert it
to Java 7 that would be nice.

# Rebasing to latest

I'm afraid we're now aiming at Hibernate ORM 5, so some details might
need to be updated; probably just in the configuration area. We're
also in the process of upgrading to Apache Lucene 5, but that
shouldn't affect you at all.

# Some improvement ideas

We should support the case in which Hibernate Search is not being run
as an extension of Hibernate ORM, but running it with ORM is likely the
most common case.
In that scenario I think it would be nice to be able to look up the
existing ORM services so that users don't need to repeat, for example,
the datasource configuration.
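
As a minimal sketch of that idea, the backend could prefer its own
setting and otherwise fall back to what ORM was already given. This
only illustrates the lookup order using plain property maps; it does not
use the actual ORM service registry. The key
"hibernate.connection.datasource" is the existing ORM setting, while
BackendDataSourceResolver and the backend-specific key are hypothetical
names made up for this example.

```java
import java.util.Map;

/**
 * Sketch: pick the datasource name for the database backend, preferring a
 * backend-specific setting and falling back to the one configured for ORM.
 */
final class BackendDataSourceResolver {

	// Existing Hibernate ORM setting naming the datasource.
	static final String ORM_DATASOURCE = "hibernate.connection.datasource";

	// Hypothetical key, invented for this sketch only.
	static final String BACKEND_DATASOURCE = "hibernate.search.worker.backend.datasource";

	private BackendDataSourceResolver() {
	}

	/** Returns the datasource name to use, or null if neither side defines one. */
	static String resolve(Map<String, String> backendProperties, Map<String, String> ormProperties) {
		String explicit = trimToNull( backendProperties.get( BACKEND_DATASOURCE ) );
		if ( explicit != null ) {
			return explicit;
		}
		return trimToNull( ormProperties.get( ORM_DATASOURCE ) );
	}

	// Treats missing and blank values the same way.
	private static String trimToNull(String value) {
		if ( value == null ) {
			return null;
		}
		String trimmed = value.trim();
		return trimmed.isEmpty() ? null : trimmed;
	}
}
```

With this order, a user running with ORM would configure the datasource
once, and the explicit backend property stays available as an override
or for setups without ORM.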

We might also be able to reuse the whole SessionFactory, but I'm not
sure how to include your model without it potentially interfering with
the end user's model. I'd say let's start by sharing some services
from ORM and then see what kind of improvements we can build into ORM
for this use case; for example, this might simplify some of the
TransactionManager configuration code I'm seeing in your repository.

Of course your existing configuration properties are useful too,
especially for the non-ORM case, as there we won't be able to reuse the
ORM services.

Also, you might have noticed we are now able to optionally include the
backend operations in the same transaction. That's not the default, as
commonly people don't want that, but it would be very interesting to
evolve this backend to support that option too: you wouldn't even
require XA when storing the entity in the same database!
 - http://in.relation.to/2015/07/09/hibernate-search-jms-transaction/
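
The reason XA would not be needed can be shown with a toy model: when
the entity row and the index work row live in the same database, one
local transaction covers both, so they commit or roll back together.
The sketch below is an in-memory illustration only; SingleDatabase and
LocalTransaction are invented names and do not correspond to Hibernate
or JTA types.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy stand-in for a single relational database holding both the user's
 * entities and the queued index work (hypothetical, in-memory only).
 */
final class SingleDatabase {

	final List<String> entityTable = new ArrayList<String>();
	final List<String> indexWorkTable = new ArrayList<String>();

	LocalTransaction begin() {
		return new LocalTransaction( this );
	}
}

/** One local transaction covering both tables; no XA coordinator involved. */
final class LocalTransaction {

	private final SingleDatabase database;
	private final List<String> pendingEntities = new ArrayList<String>();
	private final List<String> pendingIndexWork = new ArrayList<String>();
	private boolean finished;

	LocalTransaction(SingleDatabase database) {
		this.database = database;
	}

	/** Buffers the entity row together with its matching index work row. */
	void persist(String entity) {
		checkActive();
		pendingEntities.add( entity );
		pendingIndexWork.add( "ADD:" + entity );
	}

	/** Makes both buffered writes visible in one step. */
	void commit() {
		checkActive();
		database.entityTable.addAll( pendingEntities );
		database.indexWorkTable.addAll( pendingIndexWork );
		finished = true;
	}

	/** Discards both buffered writes. */
	void rollback() {
		checkActive();
		pendingEntities.clear();
		pendingIndexWork.clear();
		finished = true;
	}

	private void checkActive() {
		if ( finished ) {
			throw new IllegalStateException( "Transaction already finished" );
		}
	}
}
```

In this model a rolled-back transaction never leaves index work behind
for an entity that was not stored, which is the property the
same-transaction option is after.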

I'd be happy to help with this; feel free to share non-working and/or
intermediate experimental branches when you have questions or are just
stuck.
Please start by creating a JIRA issue; you can leave the target version
undefined: we'll merge it when it's ready.

Thanks,
Sanne

On 5 August 2015 at 20:05, Flemming Harms <flemming.harms at gmail.com> wrote:
> Hi Martin
>
> For this version the AbstractDatabaseHibernateSearchController is not able
> to process Lucene workers simultaneously. When we built it, our initial
> requirement was that only one node should process the workers at a time, but
> the “master” was floating. We use Quartz to get this type of functionality,
> and it synchronizes the execution between the nodes. But you could also
> use an HA-singleton to dedicate a specific node to processing the workers.
>
> We have been playing with an idea where we stamp the LuceneDatabaseWork with
> the known cluster nodes, and then the last node removes it from the
> database, or a scheduled job can take care of it. The advantage of this
> solution is that it makes Infinispan optional, and it can store the indexes
> on each node instead of in a shared cache.
>
> Your idea and work look very nice. Pretty awesome feature to support
> different JPA providers.
>
> --
> cheers
> Flemming
>
>
> 2015-08-05 11:57 GMT+02:00 Martin Braun <martinbraun123 at aol.com>:
>
>> Hi,
>>
>>
>> Note: I am not a core developer of Hibernate Search, but I am currently
>> working on something that looks quite similar to what you are doing :).
>> One part of it is an updating mechanism based on triggers that uses the
>> database as an event storage as well. It's not the exact same thing, but
>> related.
>>
>>
>> https://github.com/Hotware/Hibernate-Search-JPA
>>
>>
>>
>> The idea is quite nice, but after looking at the source code I am wondering
>> how the different nodes are able to work together, because in
>> AbstractDatabaseHibernateSearchController you remove the entity
>> from the persistence context and I wasn't able to find code that would
>> make up for that.
>>
>>
>> Doesn't that mean that the other workers will not be able to read that
>> entity?
>> Or will users of this need to implement their own synchronization
>> mechanism between
>> the different nodes?
>>
>>
>> Martin Braun
>> martinbraun123 at aol.com
>> www.github.com/s4ke
>>
>>
>>
>>
>> -----Original Message-----
>> From: Flemming Harms <flemming.harms at gmail.com>
>> To: Hibernate.org <hibernate-dev at lists.jboss.org>
>> Sent: Tue, Aug 4, 2015 6:40 pm
>> Subject: [hibernate-dev] [Hibernate Search] Database back end worker
>>
>>
>> Hey guys
>>
>> I want to introduce myself and a new database back-end worker that another
>> guy and I have built for Hibernate Search. I already had some initial talks
>> with Sanne about whether this could be of interest to the Hibernate Search
>> project.
>>
>> I have been working with Hibernate Search for some time and have actually
>> made various small custom modifications to Search since 3.x, especially
>> around running in a cluster and indexing. To make a long story short, when
>> we upgraded Hibernate Search we thought it would be ideal to use a SQL
>> database as storage for Lucene workers, for 3 main reasons:
>>
>> - The database was shared between the nodes
>> - The workers were persistent in case of a node crash
>> - No master/slave
>>
>>
>> In some ways it’s very similar to the JMS back-end worker, where the user
>> also has to implement an MDB that processes the workers. In our case they
>> will have to implement a job using something like Quartz or a timer
>> service.
>>
>> We are using JPA as the persistence layer for the database. Even though
>> it’s a fairly simple entity we persist, it makes sense for supporting
>> various databases and schema updates out of the box. We have tried to make
>> it as easy as possible to set up by minimizing the number of properties,
>> and it’s all configurable from the persistence.xml.
>>
>> The actual work can be found here:
>> https://github.com/umbrew/org.umbrew.hibernate.database.worker.backend
>>
>> So based on this introduction and the code, is this something you could
>> use? (Of course with the modifications required to follow the design,
>> style, docs etc. for Search.)
>>
>> --
>>
>> Kind regards
>> Flemming Harms
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
>
>
>
> --
>
> Kind regards / Med Venlig Hilsen
> Flemming Harms
>
> https://twitter.com/fnharms
> https://dk.linkedin.com/in/fharms