[hibernate-dev] The current Hibernate Search sprint: lots of topics!

Sanne Grinovero sanne at hibernate.org
Tue Mar 31 19:50:16 EDT 2015


All,
let me clarify the general goal of this sprint. I don't expect to
celebrate with a 5.2.0.Final this time, but I'd aim at getting some of
the long-standing big tasks done, and at finishing these three weeks
with a 5.2.Beta1. We need to organize ourselves around several
significant parallel themes.

There are some "big" themes going on which you need to be aware of
beyond the granularity of JIRA.
Your help in properly inspecting these with experiments and then
breaking them down into smaller tasks is what I need the most right
now. I'd highly appreciate it if each of you could take on leadership
of one of these themes, and get at least one other team member as
primary reviewer and brainstorming mate.

These are the primary themes:
 - the Faceting refactoring - led by Hardy
 - the dynamic types work - led by me
 - Hibernate ORM 5 compatibility and testing - almost done
 - getting rid of the Infinispan module - led by Gustavo
 - a discussion with the WildFly team about how to share the module
structure / build / definitions (more on this soon)
 - Lucene 5
 - R&D: explore better clustering strategies, better master election
(or no-master architectures)
 - Better integration with ORM's Multi-tenancy - requested quite often
recently - Davide?

If we really could upgrade both ORM and Lucene to 5, then we could
promote this to a new major release. Of course I'm dreaming and that's
not going to happen in practice, not least because that would require
an ORM 5.0.0.Final.

So what I'm expecting is that we explore the needs for these, and that
you help me identify which steps are needed to get both of them
upgraded in the near future. That means we might be raising more
issues than we solve, but that's good as it clarifies which atomic,
self-contained and consistent steps we then need to perform to get there.

I'm currently working on ORM 5 tasks and will soon share some PRs for
things which could already be merged, but of course the final step
won't be applied as we're not really going to upgrade yet - unless we
agree we're only releasing betas until ORM is final too.

For Lucene 5: the work which Hardy is doing is essential:
 - update the Faceting code
 - move our code to use the new FieldDocs
After that, the upgrade won't be that bad (not as hard as the one to
Lucene 4).

I just created some JIRAs as "containers" for these larger themes;
please keep in mind that I'm not setting the version to "5.2" as they
will probably span multiple releases. The goal should not be to
resolve them, but to start them and split them up into subtasks.
I'm pretty sure that several of the resulting subtasks can be merged
already.

There is a new label in JIRA: "current_sprint", so we can identify
them all even though they are not marked to be fixed for version 5.2.

The "R&D" tasks are not in JIRA at all as I'm still gathering
requirements; still, we'll need to dedicate some time to
experimentation and brainstorming.

I realize these are many parallel paths to work on; we're many
experienced devs though, and these should be workable in parallel.
If each of you can take some leadership on an area I hope we can close
them all by the next iteration (except probably the R&D task).

===

So much for the larger themes; there is of course also a list of
traditional tasks which will shape the 5.2 improvements.
These are marked "5.2" in JIRA; some look trivial, like missing javadoc
or a paragraph of documentation, but need some figuring out to craft
the right docs.

Let me comment on these briefly to see if any piques your interest.

# HSEARCH-1848 Replace the Infinispan Directory provider with the one
distributed by the Infinispan project
As discussed: we'll remove the module, but need to make sure we can
plug in the one distributed by Infinispan. Needs Infinispan to release
it first.

# HSEARCH-1214 Review SearchFactory initialization
For our own peace of mind... the boot process is hard to understand. I
have some ideas, and there are many things to keep in mind, so I'll
probably try to take this myself, but otherwise I'll transfer my brain
dump... best over voice.

# HSEARCH-1472 Broaden collection of built in IndexManager
implementations to simplify choice of sensible configurations
As discussed at the team meeting. The goal is to simplify
configuration and documentation, and to prevent unhealthy
configuration choices.

# HSEARCH-1474 MassIndexer needs to avoid being timed out by the
TransactionManager
This is high-value and long-standing, but complex. Gunnar started
working on a test.

# HSEARCH-1536 Improve the test suite around MoreLikeThis
(association, custom fieldbridge, class bridges)
There are several open tasks around MLT. This is the warm-up towards
finalizing MLT... I didn't schedule the other tasks for this sprint.

# HSEARCH-1589 ServiceManager closes services too aggressively
A sensible optimisation, probably easy. Beware: concurrency and
bootstrap related.

# HSEARCH-1654 Disable merge policy during Massindexing
A great performance optimisation for people doing mass indexing. I
think it's trivial, but to verify it you'll need to set up a relatively
long run - we have a repository with instructions to reindex
Wikipedia.

# HSEARCH-1681 Index optimisation should commit to publish the
performed optimisation
Trivial to do - a one-liner - but not so trivial to test for.

# HSEARCH-1684 ResultTransformer ignores transformList on tuples
No idea, needs to be looked at to make Marc S. happy.

# HSEARCH-1708 Using DistanceSortField does not verify the field
parameter passed to the constructor

# HSEARCH-1711 EntityIndexingInterceptor executes on different part of
the hierarchy

# HSEARCH-1729 Document the Infinispan configuration property
`metadata_writes_async`
This was not documented as it's a highly experimental property. I was
hoping we could run some more tests, but I won't have the time for
that at the moment, so either someone volunteers for the tests, or we
keep it a secret, or we decide to document it with warnings.

# HSEARCH-1762 Improve javadocs of builtin bridges

# HSEARCH-1773 org.hibernate.search.backend.impl.WorkVisitor not
exported by engine osgi bundle
Or find some alternative way... but whatever the solution, we need to
get OSGi to "done" status.

# HSEARCH-1783 Reproduce transaction timeouts during mass indexing
Gunnar is already on it.

# HSEARCH-1793 CriteriaObjectInitializer causes too many object loads
in cross hierarchy queries
This one is nasty, we should get rid of it.

# HSEARCH-1803 Infinispan integration test search in the wrong node
Since we're removing the code, we need to apply this as
https://issues.jboss.org/browse/ISPN-5339

# HSEARCH-1804 Boost on IndexedEmbedded properties
This really should just work as the user requests.

# HSEARCH-1811 Wildcard with multiple fields
Another sensible usability improvement.

# HSEARCH-1812 Documentation doesn't clearly explain how one obtains
the existing SearchIntegrator
Start a documentation section "integrators and framework developers"?

# HSEARCH-1815 Clarify the need to depend on an implementation of
SerializationProvider
Apparently we don't state one will be needed ;)

# HSEARCH-1816 Explicitly validate the version of Hibernate ORM
A usability improvement, as proposed on the mailing list. +1 for Gunnar's ideas.

# HSEARCH-1826 Make it possible to test Hibernate Search with preview
builds of Hibernate ORM 5
I'm working on this one.

# HSEARCH-1828 Clarify documentation about ways to disable Hibernate Search

# HSEARCH-1839 FieldBridge instance initialization might use reference
access to the booting framework
This is needed by the jBPM / Drools teams. At least the programmatic
configuration should be trivial.

# HSEARCH-1844 Review which components should no longer be tagged as
experimental

# HSEARCH-1847 Create a FSDirectory extension which doesn't ever sync to disk
Requested by Infinispan - might become an urgent requirement soon, so
we'd better have this ready.

