welcome to Hibernate Search time!
[for those unaware: some of us are now experimenting with working in 2-3 week
sprints fully focused on a single Hibernate project, rotating the
subject. We decided this privately as it's a matter of time-management
for us, but I'm now opening the conversation up to all developers and
contributors as it affects the project evolution and technical
discussion; essentially it means we'll be focused on Hibernate Search
more than other projects in the next few weeks, and aim to get some
significant stuff done]
My first and foremost goal for the next couple of weeks is to
drive forward a pain point which:
- has attracted active interest from several power-contributors [1,2,3]
- is in high demand from a product perspective
- has had lots of people *begging* for better solutions in the past
You might have guessed: I'm talking about the backend configuration
complexity in a clustered environment: both the JGroups and the JMS
solutions expose the user to various complex system settings.
With Emmanuel and Hardy I've had some preliminary conversations about
it, but to really get started on this subject I'm proposing a meeting
to discuss the options; we can try to make it open to everyone, and I
might even make a couple of slides.
# What do we want
During our last meeting, a scary point was hearing that Emmanuel
considered the priority to be free form. It never was for me, and
while we didn't dig into it during that call, we'd better clarify this
soon.
Let's please find a moment on IRC to discuss the goals, especially as
I need to update the project roadmap.
# How do we want it
I've been hoping for a clear/formal set of requirements to be provided
by some users, as there are many ways to look at the problem.
But this never came, and I'm concluding that:
A) if a paying customer or other kind of sponsor wants to discuss
these requirements, I'd better fly to them and talk face to face.
B) I'm being lazy and selfish in expecting externals to clarify all
the details... I shouldn't try to deflect this hard problem.
I've been thinking about several possible approaches; there are lots
of options, and some tradeoffs to choose among.
One of these options is to use distributed consensus - since we
already use JGroups in various projects, JGroups RAFT [4] seems a
natural candidate, but while I'd love the excuse to play with it, it's
a very new codebase.
Another option would be the more mature Apache Kafka - great for log
based replication so might even be complementary to the JGroups RAFT
implementation - or just improve JMS (via the standard or via Apache
Camel) to have it integrate with Transactions [5: just got a
contribution!] and provide better failover options.
Not least, I just heard that WildFly 10 is going to provide some form
of automatic HA/JMS singleton consumer; I'll need to find out more
about it.
While it's tempting to implement our own custom super clever backend,
we should prioritize for an off-the-shelf method with high return on
investment to solve the pain point.
Also, as suggested by Hardy some months ago, it would be awesome to
have the so-called "Hibernate Search master node" not need any entity
classes nor depend at all on the deployed application (nor its
extensions, such as analyzers), so that if the solution still needs a
"master role", we could simply provide a master app which doesn't need
to be updated whenever the application changes. This would necessarily
be a change for 6.0, but let's either prepare for that, or get rid of
the "master node" concept altogether.
We have been sitting and thinking about the problem for a while now;
I'd love to see some empirical progress: merge candidate solutions as
experimental, and let some natural selection happen while they also
help us refine the requirements.
## Other topics
Of course this isn't the only thing we'll be working on. The primary
goal of the current branch is still to deliver a Hibernate ORM 5
compatible version, but we're in a horrible position with that, since
WildFly 10 just released another alpha tag which still doesn't use
Hibernate ORM v5. Since it's [currently and temporarily] hard to run
WildFly with an overridden Hibernate ORM version, we won't be able to
close in on a CR or Final release until at least the next WildFly tag.
In the meantime we can do some of the research needed for the above
topic, and make progress on the many issues open for 5.4.
Another subject we really should work on in this sprint is avoiding
transaction timeouts in the MassIndexer when running within a
container.
So, for tomorrow: to get started, JIRA is updated and you have all
tasks assigned already. Let's start from there, and then schedule a
meeting to discuss the above.
1 - https://github.com/umbrew/org.umbrew.hibernate.database.worker.backend
2 - https://github.com/mrobson/hibernate-search-infinispan-jms
3 - https://forum.hibernate.org/viewtopic.php?f=9&t=1040179
4 - https://github.com/belaban/jgroups-raft
5 - https://hibernate.atlassian.net/browse/HSEARCH-668
6 - https://hibernate.atlassian.net/issues/?filter=12266
7 - https://hibernate.atlassian.net/browse/HSEARCH-1474
Just wanted to point out a new repo in our GitHub org: https://github.com/hibernate/hibernate-test-case-templates
Many users have asked to have templates to use when creating reproducer/regression tests for bug reports. As a starting point, I included both a standalone example, as well as one that uses our unit-test framework's BaseCoreFunctionalTest.
Feel free to make modifications to these, upstream, as necessary. ORM is currently the only project with templates, but I assume this might be helpful for Search, Validator, and OGM as well.
I don't think you'll be able to convince the Infinispan team to do that:
one of the major burdens of using Infinispan for users is that it is
composed of so many jars, so we'll actually try to make the number of
dependencies smaller. Apparently a lot of people do not know how to
use dependency managers like Maven and struggle with such things.. :-(
I'd also question the usefulness of doing this: even if the platform /
container you're using does provide some JTA capability, it's not
necessarily the same one your JPA implementor is configured to use
(some people really like to just use JDBC transactions).
Hibernate ORM has a similar Service to abstract "how" exactly to
interact with transactions; I understand you don't want to rely on the
Hibernate specific one but other JPA implementors would probably have
a similar facility?
Maybe we should just make a Hibernate Search "Service" for this and
let users plug their custom one? You could provide a couple of
implementations to satisfy the needs of the main JPA implementations.
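To make the idea a bit more concrete, here's a minimal sketch of what such a pluggable contract could look like; the names and shape are purely hypothetical, not an existing Hibernate Search API:

```java
// Hypothetical sketch of a pluggable transaction-lookup Service for
// Hibernate Search. All names here are illustrative assumptions.
interface TransactionContextService<T> {
    // Resolve the platform's transaction coordinator (e.g. a JTA
    // TransactionManager), however the environment provides it.
    T resolveTransactionManager();
}

// One trivial implementation: an explicitly configured instance,
// useful in environments with no JNDI or container lookup.
final class ConfiguredTransactionContextService<T>
        implements TransactionContextService<T> {

    private final T transactionManager;

    ConfiguredTransactionContextService(T transactionManager) {
        this.transactionManager = transactionManager;
    }

    @Override
    public T resolveTransactionManager() {
        return transactionManager;
    }
}
```

Implementations targeting the main JPA providers could then be selected via configuration, with users still free to plug in a custom one.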
On 20 June 2015 at 17:33, Martin Braun <martinbraun123(a)aol.de> wrote:
> I just stumbled upon the JTA TransactionManager lookup mechanism of
> Infinispan and am now using this when a JTA transaction is needed.
> This means I don't have hacky lookups of UserTransactions. Now I've been
> wondering if it was possible to make the lookup mechanism
> a separate module in Infinispan so I don't have to import the whole thing.
> I am talking about
> org.infinispan.transaction.lookup.GenericTransactionManagerLookup in
> particular. Do you think that I can convince the Infinispan team
> (which includes you :D) to keep that in a different place than
> Martin Braun
As most of you know already, we are planning to redesign the current
Antlr-based HQL/JPQL parser in ORM for a variety of reasons.
The current approach in the translator (Antlr 2 based, although Antlr 3
supports the same model) is that we actually define multiple
grammars/parsers which progressively re-write the tree adding more and more
semantic information; think of this as multiple passes or phases. The
current code has 3 phases:
1) parsing - we simply parse the HQL/JPQL query into an AST, although we do
do one interesting (and uber-important!) re-write here where we "hoist" the
from clause in front of all other clauses.
2) rough semantic analysis - the current code, to be honest, sucks here.
The end result of this phase is a tree that mixes normalized semantic
information with lots of SQL fragments. It is extremely fugly
3) rendering to SQL
The idea of phases is still the best way to attack this translation imo. I
just think we did not implement the phases very well before; we were just
learning Antlr at the time. So part of the redesign here is to leverage
our better understanding of Antlr and design some better trees. The other
big reason is to centralize the generation of SQL into one place rather
than the 3 different places we do it today (not to mention the many, many
places we render SQL fragments).
Part of the process here is to decide which parser to use. Antlr 2 is
ancient :) I used Antlr 3 in the initial prototyping of this redesign
because it was the most recent release at that time. In the interim Antlr
4 has been released.
I have been evaluating whether Antlr 4 is appropriate for our needs there.
Antlr 4 is a pretty big conceptual deviation from Antlr 2/3 in quite a few
ways. Generally speaking, Antlr 4 is geared more towards interpreting
rather than translating/transforming. It can handle "transformation" if
the transformation is the final step in the process. Transformation is
where tree re-writing comes in handy.
First let's step back and look at the "conceptual model" of Antlr 4. The
grammar is used to produce:
1) the parser - takes the input and builds a "parse tree" based on the
rules of the lexer and grammar.
2) listener/visitor for parse-tree traversal - Antlr can optionally
generate listeners or visitors (or both) for traversing the parse tree
(the output from #1).
There are 2 highly-related changes that negatively impact us:
1) no tree grammars/parsers
2) no tree re-writing
Our existing translator is fundamentally built on the concepts of tree
parsers and tree re-writing. Even the initial prototypes for the redesign
(and the current state of hql-parser which Sanne and Gunnar picked up from
there) are built on those concepts. So moving to Antlr 4 in that regard
does represent a risk. How big of a risk, and whether that risk is worth
it, is what we need to determine.
What does all this mean in simple, practical terms? Let's look at a simple
query: "select c.headquarters.state.code from Company c". Simple syntactic
analysis will produce a tree something like:
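An illustrative sketch of such a purely syntactic tree (node names are approximations, not the actual grammar's):

```
[QUERY]
 +- [SELECT]
 |   +- [DOT]
 |       +- [DOT]
 |       |   +- [DOT]
 |       |   |   +- [IDENT : "c"]
 |       |   |   +- [IDENT : "headquarters"]
 |       |   +- [IDENT : "state"]
 |       +- [IDENT : "code"]
 +- [FROM]
     +- [SPACE]
         +- [IDENT : "Company"]
         +- [ALIAS : "c"]
```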
There is not a lot of semantic (meaning) information here. A more semantic
representation of the query would look something like:
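Again purely illustrative (the rule names are made up), a more semantic tree might be shaped like:

```
[QUERY]
 +- [SELECT]
 |   +- [ATTRIBUTE_REF : "code"]
 |       +- [FROM_ELEMENT_REF : implicit join for Company.headquarters.state]
 +- [FROM]
     +- [PERSISTER_REF : entity=Company, alias=c]
     +- [JOIN : c.headquarters]
     +- [JOIN : headquarters.state]
```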
Notice especially the difference in the tree rules. This is tree
re-writing, and is the major difference affecting us. Consider a specific
thing like the "c.headquarters.state.code" DOT-IDENT sequence. Essentially
Antlr 4 would make us deal with that as a DOT-IDENT sequence through all
the phases - even SQL generation. Quite fugly. The intent of Antlr 4 in
cases like this is to build up an external state table (external to the
tree itself) or what Antlr folks typically refer to as "iterative tree
decoration". So with Antlr 4, in generating the SQL, we would still be
handling calls in terms of "c.headquarters.state.code" in the SELECT clause
and resolving that through the external symbol tables. Again, with Antlr 4
we would always be walking that initial (non-semantic) tree. Unless I am
missing something. I would be happy to be corrected, if anyone knows Antlr
4 better. I have also asked as part of the antlr-discussion group.
In my opinion though, if it comes down to us needing to walk the tree in
that first form across all phases I just do not see the benefit to moving
to Antlr 4.
P.S. When I say SQL above I really just mean the target query language for
the back-end data store whether that be SQL targeting a RDBMS for ORM or a
NoSQL store for OGM.
I still have not fully grokked this paradigm, so I may be missing
something, but... AFAICT even in this paradigm the listener/visitor
rules are defined in terms of the initial parse tree rules rather than
more semantic ones.
I am not so sure that manually building a tree that would work with
listeners/visitors generated from a second grammar is going to be an
option. I have asked on SO and on the Antlr discussion group and basically
got no responses as to how that might be possible. See
So the question is whether generating a semantic tree that is not
Antlr-specific is a viable alternative. I think it is. And we can still
provide hand-written listeners and/or visitors for processing it.
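As a rough sketch of that alternative - a hand-written semantic tree with its own visitor, independent of any Antlr-generated classes; all type names here are hypothetical:

```java
// Hypothetical semantic-tree nodes, hand-written rather than generated.
interface SemanticNode {
    <R> R accept(SemanticVisitor<R> visitor);
}

interface SemanticVisitor<R> {
    R visitAttributeRef(AttributeRef ref);
    R visitFromElement(FromElement from);
}

// A normalized attribute reference, i.e. what a DOT-IDENT sequence
// resolves to after semantic analysis.
final class AttributeRef implements SemanticNode {
    final String sourceAlias;
    final String attributeName;

    AttributeRef(String sourceAlias, String attributeName) {
        this.sourceAlias = sourceAlias;
        this.attributeName = attributeName;
    }

    public <R> R accept(SemanticVisitor<R> visitor) {
        return visitor.visitAttributeRef(this);
    }
}

final class FromElement implements SemanticNode {
    final String entityName;
    final String alias;

    FromElement(String entityName, String alias) {
        this.entityName = entityName;
        this.alias = alias;
    }

    public <R> R accept(SemanticVisitor<R> visitor) {
        return visitor.visitFromElement(this);
    }
}

// A trivial visitor rendering each node to a target-language fragment.
final class FragmentRenderer implements SemanticVisitor<String> {
    public String visitAttributeRef(AttributeRef ref) {
        return ref.sourceAlias + "." + ref.attributeName;
    }
    public String visitFromElement(FromElement from) {
        return from.entityName + " " + from.alias;
    }
}
```

The point is that the visitor is defined against semantic rules (ATTRIBUTE_REF-like nodes) rather than against the raw DOT-IDENT parse-tree rules.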
JPA 2.1 shows examples of using multiple downcasts in a restriction:
SELECT e FROM Employee e
WHERE TREAT(e AS Exempt).vacationDays > 10
OR TREAT(e AS Contractor).hours > 100
CriteriaQuery<Employee> q = cb.createQuery(Employee.class);
Root<Employee> e = q.from(Employee.class);
q.where(cb.or(cb.gt(cb.treat(e, Exempt.class).<Integer>get("vacationDays"), 10),
        cb.gt(cb.treat(e, Contractor.class).<Integer>get("hours"), 100)));
These don't work in Hibernate for joined inheritance because Hibernate uses an inner join for the downcasts.
I've added a FailureExpected test case for this: https://github.com/hibernate/hibernate-orm/commit/1ec76887825bebda4c02ea2...
IIUC, inner join is correct when TREAT is used in a JOIN clause. If TREAT is only used for restrictions in the WHERE clause, I *think* it should be an outer join. Is that correct?
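To illustrate why (with a hypothetical joined-inheritance schema where Exempt and Contractor each map to a secondary table joined by id): an inner join to the subclass table filters out every Employee row without a matching subclass row, which is wrong when TREAT only appears in the WHERE clause. Something like this seems needed instead:

```sql
-- Hypothetical SQL; table and column names assumed from the entity model.
SELECT e.*
  FROM Employee e
  LEFT OUTER JOIN Exempt ex ON ex.id = e.id
  LEFT OUTER JOIN Contractor c ON c.id = e.id
 WHERE ex.vacationDays > 10
    OR c.hours > 100
```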
HHH-9862 also mentions that Hibernate doesn't work properly when there are multiple select expressions using different downcasts, as in:
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Object> query = cb.createQuery(Object.class);
Root<Pet> root = query.from(Pet.class);
I don't think this should work, at least not with implicit joins. Is this valid?
Let's keep this on list ok? Thanks.
First, have you seen the new hibernate-java8 module that is part of 5.0?
Some replies in-line...
> But I am curious what would be your opinion on this extended API idea and
approach that I took and whether you would for instance consider taking it
under the hibernate umbrella as a kind of official extension in future.
Hard to say. More than likely the next major release will drop support for
Java 6. I personally see no benefit in supporting "just" 7, so I'd likely
go right to 8.
> At the moment I would like to focus on three things:
> - Introduce typed (generic) queries, because at this moment streams of
Object's are not quite useful.
Yes, we should certainly do this. I just made some Session methods
generic, so this certainly fits. But this is not Java 8 specific in any way.
> - Enable registration of LocalDate, LocalTime as query params.
You can do that now:
* org.hibernate.Query#setParameter(java.lang.String, java.lang.Object)
* org.hibernate.Query#setParameter(java.lang.String, java.lang.Object, org.hibernate.type.Type)
I assume you mean adding method signatures accepting those specific types?
> - Custom type handlers for LocalDate, LocalTime
> - Custom type handlers for Optional<?>
No idea what you mean here. What do you mean by "type handler"?
And as far as using LocalDate, etc as query parameters... you can do that
now. Do you mean specific methods accepting them?
> If you could help me with one thing regarding the Optional
attributes mapping. If my understanding is correct, the relation/entity
mappings are handled by the OneToOneType and ManyToOneType classes. To
your knowledge, would it be possible to provide an overridden version
of them, or would it rather require altering them directly in Hibernate
core?
As I understand it (I have not looked overly deeply yet), we would
essentially need a Type representing Optional that wraps an "underlying
Type". It is very similar to the idea of AttributeConverters. We could
handle it at a lower level here too, like we do for AttributeConverter.
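As a rough illustration of that wrapping idea (the class name and shape are invented for this sketch; this is not an existing Hibernate SPI):

```java
import java.util.Optional;

// Hypothetical sketch: adapts between the Optional-typed attribute value
// and the plain value an "underlying Type" would bind and read, mirroring
// the AttributeConverter concept described above.
final class OptionalValueAdapter<T> {

    // underlying (database-side) value -> attribute value
    Optional<T> toAttribute(T underlyingValue) {
        return Optional.ofNullable(underlyingValue);
    }

    // attribute value -> underlying (database-side) value
    T toUnderlying(Optional<T> attributeValue) {
        return attributeValue == null ? null : attributeValue.orElse(null);
    }
}
```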
it seems that the workflow for WEBSITE on JIRA does not include the state
"PULL REQUEST SENT" (like in Search and OGM, for example)
I'd like to have it so that I can have a quick overview from JIRA of the
issues that are "almost" done.
Would it be ok to add it?