[hibernate-dev] [Hibernate Search] Donating some of our source code to Infinispan

Gunnar Morling gunnar at hibernate.org
Mon Jan 26 06:42:29 EST 2015

2015-01-26 11:53 GMT+01:00 Sanne Grinovero <sanne at hibernate.org>:

> On 26 January 2015 at 10:00, Gunnar Morling <gunnar at hibernate.org> wrote:
> > Hi,
> >
> > +1 for addressing this, the cyclic dependency has been a pain point for a
> > long time.
> >
> > Moving the directory provider code out of HS is the right thing to do. But
> > moving it into ISPN creates the release dependency you describe: After a HS
> > release there will be a time where people potentially cannot use it with the
> > ISPN directory provider until there has been a new ISPN release which adapts
> > to the HS (and, transitively, Lucene) changes. So it kind of ties the
> > release cycle to HS, although ISPN itself might be fine using the previous
> > HSEARCH version for its own querying purposes.
> No, it "unties". Currently Infinispan is strictly required to follow
> up with a Search upgrade on each minor change, and vice-versa.
> Having flexibility such as the option to not upgrade is a huge benefit
> we don't have today.

But how realistic is the option not to upgrade?

It means a new HS (which e.g. upgrades Lucene) cannot be used with the ISPN
directory unless there has been a new release of ISPN altogether. So there
seems to be quite a high motivation to do an ISPN release right after HS
has been released. But if that's alright for ISPN, then go for it.

To me it'd appear beneficial though to narrow down the pieces which would
have to be released in such a timely manner to the minimum. A separate
component may be easier and quicker to update and release than the entire
project, which may e.g. contain other work in progress that is not in a
releasable state at the point in time when HS is out and just needs a
one-line fix in the directory implementation.

> Let's assume for simplicity that, while planning a component upgrade:
>  - upgrading the Directory (in Infinispan) takes 5 days of work
>  - upgrading the Infinispan Query to latest Search APIs takes 5 days of work
>  - upgrading the Search engine to latest Lucene takes 5 days of work
> == total is 15 days of work
> It doesn't matter how you shuffle these tasks around, it's still 15
> days of work to update all the code. Currently - in a perfect world in
> which each project would release exactly whenever I need it (like
> never has regressions or trouble releasing), and in which a release
> does take ZERO seconds (as you'd need many releases to get there)
> performing a release upgrading these would take you 15 days.
> So that's 15 days a user would need to wait to get benefit of any new
> Lucene feature, or 15 days before he can download our stuff and start
> evaluating an upgrade.
> In the new model, he has the choice to:
>  - download the Search release after 5 days, as he doesn't need/use
> Infinispan
>  - download the same complete integration in 15 days
> So: really the user isn't waiting more. Even in the worst case, our
> "time to market" would be faster because of the missing intermediate
> releases as in reality it doesn't take us zero seconds to publish one.

Sure, I'm not arguing about the required working days. Only that it may be
easier to release a separate component after a HS release than the entire
ISPN, which may not be in a good position to do the release, e.g. due to
other ongoing changes.

> > To me the problem appears to be that ISPN a) uses HS (for querying) but b)
> > also extends it (as index storage) at the same time. For a) ISPN could live
> > with older HS versions AFAIU, whereas for b) following HS releases closely
> > is needed in order to keep that integration usable as HS (and Lucene)
> > advances.
> Right
> > So alternatively the ISPN Lucene directory as well as the HS directory (and
> > related integration tests) could be moved into a separate repository, with
> > its own release lifecycle. Then when a new HS release comes out, only that
> > separate integration project needs to be adapted and released (which of
> > course could be prepared during Alpha/Beta phases, so it could be released
> > on the same day as HS), whereas no new version of ISPN itself would be
> > needed. Also the dependency cycle would be nicely untangled:
> We already discussed this. It doesn't work as nicely, as the new module
> would have strict coupling to the versions of two different projects.

Yes, that may be a challenge. How big it is depends on the stability of the
used APIs/SPIs. I'd hope the integration could work e.g. compatibly across
the ISPNs of one major release family (say 7.x)?

> > * ISPN depends on HS
> > * "Lucene directory + provider" depends on HS and ISPN
> > * Users wishing to store their indexes in ISPN would add the "Lucene
> > directory + provider" dependency
> >
> > One issue I can see though is that the "Lucene directory + provider"
> > component could not be easily part of an ISPN/JDG distribution as the
> > dependency would be the other way around. Not sure whether that's actually a
> > problem.
> That's a serious problem, and not open to debate: we need to have
> these integrations distributed by the Infinispan download.

Out of interest, why is that?

As a HS user working with the ISPN directory, I'd find it appealing if
updating HS would only require me to get an update to such an integration
module rather than updating the entire ISPN as well (maybe only the
integration bits within ISPN have actually changed, but I couldn't easily
know). Fewer moving pieces.
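
To compare the layouts discussed in this thread, here is a rough sketch that
models each one as a "depends on" graph and checks it for cycles. The module
names are simplified stand-ins at whole-project granularity (not real Maven
coordinates), and the class is hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependencyCycleCheck {
    /** Returns true if the "depends on" graph contains a cycle (depth-first search). */
    static boolean hasCycle(Map<String, List<String>> dependsOn) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String module : dependsOn.keySet()) {
            if (visit(module, dependsOn, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String module, Map<String, List<String>> dependsOn,
                                 Set<String> done, Set<String> onPath) {
        if (done.contains(module)) {
            return false; // already fully explored, no cycle through here
        }
        if (!onPath.add(module)) {
            return true; // module is already on the current path: cycle
        }
        for (String dep : dependsOn.getOrDefault(module, List.of())) {
            if (visit(dep, dependsOn, done, onPath)) {
                return true;
            }
        }
        onPath.remove(module);
        done.add(module);
        return false;
    }

    /** Today: HS ships the directory provider (needs ISPN), ISPN uses HS for querying. */
    static Map<String, List<String>> current() {
        return Map.of(
                "hibernate-search", List.of("infinispan"),
                "infinispan", List.of("hibernate-search"));
    }

    /** Sanne's proposal: the provider moves into ISPN, HS drops its ISPN dependency. */
    static Map<String, List<String>> providerInsideInfinispan() {
        return Map.of(
                "hibernate-search", List.of(),
                "infinispan", List.of("hibernate-search"));
    }

    /** The alternative: a separate "Lucene directory + provider" module. */
    static Map<String, List<String>> separateModule() {
        return Map.of(
                "hibernate-search", List.of(),
                "infinispan", List.of("hibernate-search"),
                "lucene-directory-provider", List.of("hibernate-search", "infinispan"));
    }

    public static void main(String[] args) {
        System.out.println("current: " + hasCycle(current()));
        System.out.println("provider inside ISPN: " + hasCycle(providerInsideInfinispan()));
        System.out.println("separate module: " + hasCycle(separateModule()));
    }
}
```

Both alternatives remove the cycle; they differ only in which artifact has to
be re-released after a HS release.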

Btw., a combined download could still be assembled, although admittedly
with more difficulty.

Anyways, just my 2ct. Both approaches are better than what we have now.


> Which is why I think it's essential to follow the Infinispan versioning
> scheme.
> >
> >> - It will be easier to have new Infinispan code take advantage of
> >> features exposed on the Infinispan Lucene Directory
> >
> > This makes me curious; There is ISPN code which directly depends on that
> > directory implementation? What is this about, I assumed no such dependency
> > to exist. If so, having the directory and provider in the ISPN repo itself
> > indeed seems the only way, with the consequence of ISPN having to follow HS
> > releases in a timely manner.
> We recently had several improvements applied to the Infinispan
> Directory implementation, which then needed a one-liner change in the
> Search module to "activate" such a feature or pass on a configuration,
> essentially translating from the properties in Search to the DSL of
> Infinispan.
> Each of these had to be backported, to allow an older version of
> Search to be able to be combined with a newer version of Infinispan
> without silently deactivating the feature - which is ultimately an
> Infinispan feature.
> In some cases Gustavo had to resort to reflection to invoke the right
> methods, as we're not upgrading Infinispan for branches 4.4 and 4.5 of
> Search, so some of the methods which need to be invoked do not exist
> there, and are not testable.
> An example of what this is leading to:
> https://github.com/hibernate/hibernate-search/blob/4.5/infinispan/src/main/java/org/hibernate/search/infinispan/impl/InfinispanDirectoryProvider.java#L139
> It turns out that's not an isolated case, I'm now tinkering with some
> more awesome possibilities, but having to adjust that module is making
> it more convenient to use my own copy in Infinispan rather than reuse
> it. Ultimately this all just doesn't make much sense, and is resulting
> in missing improvements in both projects. For example, what I'm
> evolving in Infinispan is no longer Search compatible, while I'd wish
> both could benefit from it.
> -- Sanne
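
For readers unfamiliar with the reflection workaround mentioned above, this
is a generic sketch of the pattern, not the code behind the link. The helper
and the method name "someNewerOption" are hypothetical; StringBuilder merely
stands in for a configuration builder whose newer methods may be missing at
runtime.

```java
import java.lang.reflect.Method;

class OptionalFeatureInvoker {
    /**
     * Calls target.methodName(arg) if such a public method exists.
     * Returns false when the method is missing, e.g. because an older
     * version of the library is on the classpath.
     */
    static boolean invokeIfPresent(Object target, String methodName,
                                   Class<?> paramType, Object arg) {
        try {
            Method method = target.getClass().getMethod(methodName, paramType);
            method.invoke(target, arg);
            return true;
        } catch (NoSuchMethodException e) {
            // The feature is skipped without any compile-time warning.
            return false;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not invoke " + methodName, e);
        }
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        // Exists in every JDK, so it is invoked.
        System.out.println(invokeIfPresent(builder, "append", String.class, "x"));
        // Hypothetical method that does not exist, so it is skipped.
        System.out.println(invokeIfPresent(builder, "someNewerOption", boolean.class, true));
    }
}
```

The compiler cannot check such calls, and they cannot be tested against a
library version that lacks the method, which matches the "not testable"
concern in the quote.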
> >
> > --Gunnar
> >
> >
> > 2015-01-25 22:19 GMT+01:00 Sanne Grinovero <sanne at hibernate.org>:
> >>
> >> All,
> >> we've discussed several times the issues we have because of the
> >> circular dependency between Hibernate Search and Infinispan.
> >> Normally the pain point we aim to address with such discussions is the
> >> need to carefully coordinate between releases [of Search and
> >> Infinispan] in our quest for a final stable release, which brings some
> >> release pressure.
> >>
> >> More recently it has also become a compatibility problem, as the two
> >> projects target different platforms/environments, and have different
> >> life-cycle expectations - creating more maintenance work such as lots
> >> of back-porting of patches, distracting from our goals.
> >>
> >> We already discussed some solutions, but none too convincing. I have a
> >> new fresh proposal which I feel is more interesting:
> >>
> >> # New plan
> >>
> >> 1) the module "/infinispan" from the Hibernate Search source tree is
> >> moved to the Infinispan project into some "hibernate-search-directory"
> >> Maven module.
> >>
> >> 2) Hibernate Search drops any dependency on Infinispan
> >>
> >> Reminder: this "/infinispan" module we have contains just a couple of
> >> classes, and represents the "DirectoryProvider" implementation which
> >> integrates with the Hibernate Search autodiscovery and creates a
> >> Directory instance (whose implementation always lived in Infinispan).
> >> Most of the code is about applying configuration properties, plus
> >> integration tests.
> >>
> >> It's a very simple plan, but has some valuable consequences:
> >>  - Search can move on without ever needing to wait for Infinispan -
> >> and vice versa.
> >>  - There is no longer a circular dependency
> >>  - Search would be in control of its Lucene dependency, and can
> >> upgrade as needed. We could experiment with different Lucene versions
> >> without necessarily waiting for Infinispan to solve compatibility issues
> >> first - which is often more complex, as it means Infinispan must be
> >> able to guarantee compatibility with a *range* of Lucene versions,
> >> to include both the target Lucene and the version currently consumed
> >> via Infinispan Query / Hibernate Search Engine.
> >>  - Infinispan can *opt* to stick with an older version of Search - or
> >> update - provided it can satisfy both a) integration with possible
> >> changes to the Search SPI b) an update for possible new requirements
> >> of the new Lucene version (the Lucene Directory might need
> >> compatibility fixes)
> >>  - The dependency structure would better match the one as provided by
> >> our Enterprise Products. For example, it's the JDG distribution - not
> >> Hibernate Search - to provide these integration bits, so it makes more
> >> sense to build the Directory against the specific version of
> >> Infinispan than against the specific version of Search.
> >>  - It will be easier to have new Infinispan code take advantage of
> >> features exposed on the Infinispan Lucene Directory - if all changed
> >> parts are in the same [Infinispan] repository. This is actually a
> >> problem right now, as it's holding back a POC meant to deliver great
> >> improvements with the Directory implementation.
> >>
> >> And not least we'll have faster builds ;-)
> >>
> >> # Drawbacks
> >>
> >> First one is that our users will probably need to change some details
> >> of how they build.
> >> Infinispan might need to fix some SPI related code before being able
> >> to upgrade, but historically our SPI has been extremely stable.
> >> The real problem is that when such a thing happens, after we've
> >> released a version of Search there might be some time before a
> >> compatible version of Infinispan is made available as well.
> >>
> >> In practice this means the gap of time in which we have to catch up on
> >> API changes is "exposed" to end users who want to use our latest release
> >> and are possibly blocked. But while they would then see the tip of the
> >> iceberg of our integration, I believe it would still take the same
> >> amount of calendar time until the working duo is available to them, as
> >> with the current model in such a situation we need to wait for the same
> >> Infinispan release to happen before we can release ours.
> >> So: same time, but we'd have a leaner process, and possibly quicker
> >> releases for all users not interested in that - or just benefits in
> >> all those scenarios in which we don't break APIs which is very common.
> >> I've not identified other problems, so my opinion is that these are
> >> well worth the benefits.
> >>
> >> # Consequences for our users
> >>
> >> Not much. Even today we expect our users to depend on several jar
> >> files provided by the Infinispan team; this would be just one more.
> >> Opens some questions though:
> >>
> >>  A) Should the Maven group id be changed? I'd expect it to be
> >> transferred to "org.infinispan" group at least, and probably need a
> >> better artifact id too.
> >>
> >>  B) License. Our code is LGPL, most of Infinispan is ASL - but not all
> >> of it. So I expect it would be possible to keep the existing license
> >> at least for now, and defer eventual license changes as a separate
> >> step (if people feel the need for any change at all).
> >>
> >>  C) Documentation. Besides the needed updates in Maven coordinates /
> >> download sources, I don't expect much of a difference: we'd still
> >> explain how to set this integration up.
> >>
> >>  D) Distribution. Today we distribute this module, and its
> >> dependencies, in our release bundles. Which implies we distribute a
> >> copy of various Infinispan jars.
> >> I think we should drop these from our distribution - even though it
> >> might seem counter-intuitive:
> >> while it might seem convenient to have these included, the whole point
> >> of the change would be that there would be more flexibility in which
> >> versions of Infinispan would work with Search. And actually the
> >> integration tests and this specific knowledge would be the
> >> responsibility of Infinispan.
> >>
> >> Am I failing to see a more critical issue?
> >> How would you all feel about our code being transferred to the
> >> different project?
> >>
> >> Sanne
> >
> >
