[hibernate-dev] development sprint start: Hibernate Search

Sun Jun 28 19:43:57 EDT 2015

Hello,
welcome to Hibernate Search time!

[for those unaware: some of us are now experimenting with working in
2-3 week sprints fully focused on a single Hibernate project, rotating
the subject. We decided this privately as it's a matter of
time-management for us, but I'm now opening the conversation up to all
developers and contributors as it affects the project evolution and
technical discussion; essentially it means we'll be focused on
Hibernate Search more than other projects in the next few weeks, and
aim to get some significant stuff done]

My first and foremost goal for the next couple of weeks would be to
make progress on a pain point which:
 - has attracted active interest from several power-contributors [1,2,3]
 - is in high demand from a product perspective
 - has had lots of people *begging* for better solutions in the past

You might have guessed: I'm talking about the backend configuration
complexity in a clustered environment: both the JGroups and the JMS
solutions expose the user to various complex system settings.
I've had some brief conversations about it with Emmanuel and Hardy,
but to properly start on this subject I'm proposing a meeting to
discuss it; we can try and make it open to everyone, I might even
make a couple of slides.
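
For readers who have not set up the JMS backend, here is a minimal
sketch of the settings a single "slave" node carries today. The
property names are the Hibernate Search 5.x ones as I recall them from
the reference documentation; the class name, paths, JNDI names and
refresh period are illustrative assumptions, not recommended values.

```java
import java.util.Properties;

public class JmsSlaveSettings {

    /** Builds the settings one "slave" node needs for the JMS backend. */
    public static Properties jmsSlaveSettings() {
        Properties p = new Properties();
        // Index storage: a local copy, periodically refreshed from the
        // copy published by the master (paths are made up for this sketch).
        p.setProperty("hibernate.search.default.directory_provider", "filesystem-slave");
        p.setProperty("hibernate.search.default.sourceBase", "/mnt/shared/indexes");
        p.setProperty("hibernate.search.default.indexBase", "/var/lib/app/indexes");
        p.setProperty("hibernate.search.default.refresh", "1800"); // seconds
        // Index writes are not applied locally but sent to the master
        // through a JMS queue (JNDI names are made up for this sketch).
        p.setProperty("hibernate.search.default.worker.backend", "jms");
        p.setProperty("hibernate.search.default.worker.jms.connection_factory", "/ConnectionFactory");
        p.setProperty("hibernate.search.default.worker.jms.queue", "queue/hibernatesearch");
        return p;
    }

    public static void main(String[] args) {
        // Print the settings in "key = value" form.
        jmsSlaveSettings().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

On top of this, the master node needs its own directory provider
setting and a message-driven bean consuming the queue, and the JMS
queue and shared filesystem have to be provisioned separately; that
combination is the kind of complexity being discussed here.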

 # What do we want
During our last meeting, a scary point was to hear that Emmanuel was
considering the priority to be free form. It never was for me, and
while we didn't dig into it during that call, we'd better clarify this
soon.
Let's please find a moment on IRC to discuss the goals, especially as
I need to update the project roadmap.

 # How do we want it
I've been hoping for a clear/formal set of requirements to be provided
by some users, as there are many ways to look at the problem.
But this never came, and I'm concluding that:
 A) if a paying customer or other kind of sponsor wants to discuss
these requirements, I'd better fly to them and talk face to face.
 B) I'm being lazy and selfish in expecting externals to clarify all
details; I shouldn't try to deflect this hard problem.

I've been thinking of several possible ways, and there are lots of
options, and some tradeoffs to choose from.
One of these options is to use distributed consensus: since we
already use JGroups in various projects, JGroups RAFT [4] seems a
natural candidate, but while I'd love the excuse to play with it, it's
a very new codebase.
Another option would be the more mature Apache Kafka (great for log
based replication, so it might even be complementary to the JGroups
RAFT implementation), or we could just improve JMS (via the standard or
via Apache Camel) to have it integrate with Transactions [5: just got a
contribution!] and provide better failover options.
Not least, I just heard that WildFly 10 is going to provide some form
of automatic HA/JMS singleton consumer; I will need to find out more
about it.

While it's tempting to implement our own custom super clever backend,
we should prioritize an off-the-shelf method with high return on
investment to solve the pain point.
Also, as suggested by Hardy some months ago, it would be awesome to
have the so-called "Hibernate Search master node" not need any entity
classes nor depend at all on the deployed application (nor its
extensions like analyzers), so that if the solution still needs a
"master role", we could simply provide a master app which doesn't need
changes on application changes. This would necessarily be a change for
6.0, but let's either prepare for that, or get rid of the "master
node" concept altogether.

We have been sitting and thinking about the problem for a while; I'd
now love to see some empirical progress: merge some of these options
as experimental, and let some natural selection happen, which would
also help us to refine the requirements.

## Other topics
Of course this isn't the only thing we'll be working on. The primary
goal of the current branch is still to deliver a Hibernate ORM 5
compatible version, but we're in a horrible position with that since
WildFly 10 just released another alpha tag which still doesn't use
Hibernate ORM 5. Since it's [currently and temporarily] hard to run
WildFly overriding the Hibernate ORM version, we won't be able to
close in on a CR or Final release until at least the next WildFly tag.
In the meantime we can do some of the research needed for the above
topic, and make progress with the many issues open for 5.4 [6].

Another subject which we really should work on in this sprint is
avoiding transaction timeouts on the MassIndexer within a container [7].

So, for tomorrow: to get started, JIRA is updated and you have all
tasks assigned already. Let's start from there, and then schedule a
meeting to discuss the above.

Thanks!
Sanne

References:
1 - https://github.com/umbrew/org.umbrew.hibernate.database.worker.backend
2 - https://github.com/mrobson/hibernate-search-infinispan-jms
3 - https://forum.hibernate.org/viewtopic.php?f=9&t=1040179
4 - https://github.com/belaban/jgroups-raft
5 - https://hibernate.atlassian.net/browse/HSEARCH-668
6 - https://hibernate.atlassian.net/issues/?filter=12266
7 - https://hibernate.atlassian.net/browse/HSEARCH-1474