Picking automatic module names for Hibernate Search isn't going to be
straight-forward as our two main jars (hibernate-search-engine &
hibernate-search-orm) suffer from split package among them.
We can't really fix the split package problem without breaking all
users, so if we want to consider that, we can debate it but that will
need to happen at another round as we're doing a minor release, so
let's focus on:
# Which names to pick
# Should we pick the names at all
# Which modules should have a name
For a great background on the possible strategies and pitfalls I
recommend reading Stephen Colebourne's blog on this subject .
He persuaded me there are good reasons to use the "reverse DNS, the
top level package", however since we have the split package problem we
can't simply go with that.
Still, we can respect the principles he recommends with a small
variation. It's fair to assume that the `org.hibernate.search` prefix
is "ours"; since the nature of the suggestion is focused on making
sure there are no misunderstandings in the community about which names
you can choose - as there is no central authority making sure module
names aren't clashing - we should be fine within Hibernate projects
with any `org.hibernate.X` prefix, as long as we coordinate and reach
an agreement on this list.
So, I propose we use:
ORM integration module:
JGroups, JMS backends:
[ no automatic module name ? Excepting some "guidelines" in the JMS
module, these are not public API so nobody would benefit from it -
also we think we might want to phase out the name "backend" in the
Elasticsearch integration module [hibernate-search-elasticsearch.jar]:
Elasticsearch / AWS security integration:
[ no automatic module name: no public API ]
Serialization / Avro
[ no automatic module name: no public API ]
We could also pick names for the ones which I've listed as "no public
API" but I see no point: as we're only assigning an "Automatic Module
Name" we won't be able to explicitly state that the other modules
depend on these. So nobody will use them, and things are a bit in flux
anyway in this area because of Hibernate Search 6 plans.
Another optional altogether: since we have split packages which we'll
have to resolve before we can actually transform these into fully
fledged modules, I think an acceptable position is also to say we
won't be publishing any automatic module name yet. Personally I'm
inclined to go with the names suggested above, at least some others
can start making baby steps, even if it's not all there.
# What I don't like:
For one, that the typical application will need to import both
`org.hibernate.search.engine` and `org.hibernate.search.orm`, and
likely more as well (e.g. Elasticsearch API, Lucene API module is
Maybe similar to BOM's today we could publish a module which
statically imports multiple of these, that could be nicer to use but
we risk needing to publish (and document) one for each of a selection
of combinations. So let's not start with such things yet.
on HipChat you've asked why hibernate-jcache + Infinispan's
implementation of JCache is not enough. It ultimately boils down to
where (2) can be fine with some providers but then (1) suffers.
Infinispan offers transactional mode, but it's not serializable (gosh,
sometimes it's even read_uncommitted) and has its quirks. The
performance can't be as good as with non-tx mode, too. That's why the
native transactional caches support will be dropped in 5.3 and we'll
just emit a warning to update their configuration (and continue with
As a demonstration of this we can use the putFromLoad. If you implement
this as a (ideal serializable) transactional cache putIfAbsent, the
provider must either
a) lock the key (pessimistic approach) - but we don't want to block
other nodes removing data from the cache (on write) or putFromLoading
b) resolve conflicts when the transaction is committing: you figure out
that there are two concurrent updates and rollback one of the
transactions - that's not acceptable to us either, as any failure in
cache should not affect DB transaction. And there's a risk of blocking
between the 2 phases of commit, too.
Theoretically you could just wipe any modified data on conflict - I
don't know if anyone does that, 'drop everything and proceed with
commit' is not something I'd expect from a general-purpose (NoSQL) DB. I
recall Alex's JCache implementation (for 5.2) storing some 'lock'
objects in the cache, and you probably don't want to wipe those.
Interaction with evictAll/removeAll could be also problematic: not sure
about the other providers but Infinispan's clear() operation is
non-transactional even on tx cache (since Infinispan 7 or so) because
it's impractical to resolve all conflicts. I don't know details how
others provide that operation but there may be a hidden problem.
Last but not least, you assume that the provider is transactional and it
provides JCache interface. JCache does not define interaction with JTA,
because it was hard to get agreement on non-tx behaviour (why did it
take 13 years to complete the JSR?) and it would be even harder for JTA.
So what you expect is just your extrapolation or wishful thinking, and
it's up to integrators to verify that the unwritten contract is
fulfilled within the scope of hibernate-jcache module use. Not that SPI
implementors would be in a better position, but at least we are aware
that (for us) it's not enough to implement those 3 classes and job's done.
Of course the correctness aspect may be ignored with 'it's just a cache'
implying 'users expect stale/uncommitted data' as Sanne (who is much
closer to the customers than me) keeps repeating. However this is not
what 2LC promises as I understand it: the same results as DB would do.
I am really grateful that in 5.3 you've provided the
CacheTransactionSynchronization that will help us boost (1) even further
by allowing us to execute all operations in parallel. And it's good that
you've made the SPI more expressive with the intent; there'll be a bunch
of TODOs in the 5.3 implementation to cover use cases that were not
handled in previous versions but now are obvious.
Radim Vansa <rvansa(a)redhat.com>
JBoss Performance Team
Today, we released the 6.0.9.Final version of Hibernate Validator.
It comes with some nice performance improvements, some changes to the new
constraint validator payload feature and a couple bugfixes.
Time to upgrade!
More info here:
Have a nice day.
There were lots of differences in the compatibility report, so as a first
step, I've excluded packages/classes that I considered SPI, internal, or
"grey area". This reduced the the differences to a more manageable amount.
You can see a summary of the incompatibilities along with suggested
mitigation at .
The report is attached to , along with a zip with instructions for
running the report.
I believe there are some "false positives" in the report, and I have
documented them in the section, "False Positives?".
Feel free to comment on the article.
this one is a very desirable feature, yet tricky as there's high
chances of ambiguity and confusion for end users.
# Infinispan Remote indexing
Infinispan embeds the Hibernate Search engine, and uses it to index
data being inserted in any cache having indexing enabled. As you know
Infinispan can be used to store Java POJOs, which get serialized using
JBoss Marshalling - or encoded into Protobuf entries using Infinispan
Protostream as helper layer.
Hibernate OGM supports both modes, one meant for "Infinispan Embedded"
and one for "Infinispan Remote" as that's what each encoding strategy
is suited for.
# Protobuf & indexing
Protobuf is a well defined format with plenty of documentation which
focuses on a "schema" of the encoding; Hibernate OGM is able to
generate such schemas dynamically and will generate encoders and
decoders which follow the encoding guidelines for Java objects.
The meta schema of protobuf is not super flexible, yet there's the
option of annotating the Protobuf schema elements using "annotations"
Protostream allows inserting Hibernate Search annotations directly in
these comments and will use them to generate the server side indexing
configuration, implicitly also allowing such properties to be queried
For example you might have this string literally within the comments:
"@Field(store = Store.YES, analyze = Analyze.YES)"
A full example of schema can be found here .
(The Infinispan documentation is a bit sparse on this as they
encourage people to use another code gen tool, best refer to tests as
examples when working for OGM)
# What should OGM users experience?
A naive solution would be to allow people to use the Hibernate Search
annotations on their JPA entities, and we have OGM copy these into the
generated schema; there's a number of problems with that:
- not all such annotations "translate" equally well 
- there's a mismatch between JPA properties and underlying encoding fields
- if I run a FullTextQuery do I expect it to work remotely?
- what if I want to use Hibernate Search locally as well or instead?
- references to local classes obviously won't work (custom
fieldbridges, analyzers, etc..)
An alternative is to look at these as "indexes" of the underlying
store, so we'd use them to hint the Infinispan server about user
provided hints such as those generated by `javax.persistence.Index`.
I do think this is the cleaner approach, yet has two drawbacks:
A- I guess ORM might implicitly generate some indexes in its metadata
which the user might not have explicitly asked; e.g. accelerate unique
constraints and foreign keys; it's possible these might not be as
useful as expected in the Infinispan case.
B- we won't be able to leverage the awesome full-text capabilities :-(
I believe A is something we could ignore for now and revisit if
there's actual demand.
B is also not urgent, yet disappointing limitation as this capability
is a distinguishing feature of this NoSQL. Would we agree that
exposing such full-text capabilities would best be let to an ad-hoc
backend in Hibernate Search 6?
1 - http://blog.infinispan.org/2018/02/restful-queries-coming-to-infinispan-9...
2 - https://github.com/infinispan/infinispan/blob/master/remote-query/remote-...
While talking to a few bloggers from the Java ecosphere at JavaLand last
week, the question came up why we have the date in the URL of blog posts.
Arguably, it doesn't add value there (we show the date on the actual posts
themselves), and makes the URLs slightly worse to read. In particular, we
don't allow for browsing posts by year or month (e.g.
http://in.relation.to/2018/), so it's even a bit misleading. Omitting the
date would also make the original idea of the URL fly again ("in relation
Anyone with thoughts whether we should change the scheme (keeping existing
ones of course)?
That all said, I've no idea whether the date in there is good to have or
not in terms of SEO. I suppose it doesn't matter.
would it be possible to tag a CR2 as soon as the Caching SPI changes
That would help the Infinispan team so they can get started
immediately on their side of things.
Thoughts on allowing certain services to be defaulted to the single
implementor registered with the `StrategySelector` service when there is
For example, when resolving the RegionFactory unless both (a)
`hibernate.cache.region.factory_class` and (b) one of
`hibernate.cache.use_query_cache` are defined caching support will be
disabled. What I am proposing would kick in in this case and check to see
if there is just a single RegionFactory registered with the
StrategySelector and if so use that one.
It would allow Hibernate to more seamlessly operate as a JPA provider too,
as currently to use caching with JPA and Hibernate users have to do the
normal JPA stuff and then also define these Hibernate settings. It would
be nicer if they could just do the JPA stuff.