Re: [hibernate-dev] [Hibernate Search] Donating some of our source code to Infinispan

Monday, 26 January 2015

2015-01-26 11:53 GMT+01:00 Sanne Grinovero <sanne(a)hibernate.org>:

...
 On 26 January 2015 at 10:00, Gunnar Morling <gunnar(a)hibernate.org> wrote:
 > Hi,
 >
 > +1 for addressing this, the cyclic dependency has been a pain point for a
 > long time.
 >
 > Moving the directory provider code out of HS is the right thing to do. But
 > moving it into ISPN creates the release dependency you describe: After a HS
 > release there will be a time where people potentially cannot use it with the
 > ISPN directory provider until there has been a new ISPN release which adapts
 > to the HS (and, transitively, Lucene) changes. So it kind of ties the ISPN
 > release cycle to HS, although ISPN itself might be fine using the previous
 > HSEARCH version for its own querying purposes.

 No, it "unties". Currently Infinispan is strictly required to follow
 up with a Search upgrade on each minor change, and vice-versa.
 Having flexibility such as the option to not upgrade is a huge benefit
 we don't have today.

But how realistic is the option not to upgrade?

It means a new HS (which e.g. upgrades Lucene) cannot be used with the ISPN
directory unless there has been a new release of ISPN altogether. So there
seems to be quite a high motivation to do an ISPN release right after HS
has been released. But if that's alright for ISPN, then go for it.

To me it'd appear beneficial though to narrow down the pieces which would
have to be released in such a timely manner to the minimum. A separate
component may be updated and released more easily and quickly than the
entire project, which may e.g. contain other work in progress that is not in
a releasable state at the point in time when HS is out, while all that is
needed is a one-line fix in the directory implementation.

...
 Let's assume for simplicity that, while planning a component upgrade:
  - upgrading the Directory (in Infinispan) takes 5 days of work
  - upgrading the Infinispan Query to latest Search APIs takes 5 days of work
  - upgrading the Search engine to latest Lucene takes 5 days of work
 == total is 15 days of work

 It doesn't matter how you shuffle these tasks around, it's still 15
 days of work to update all the code. Currently - in a perfect world in
 which each project would release exactly whenever I need it (like
 never has regressions or trouble releasing), and in which a release
 does take ZERO seconds (as you'd need many releases to get there)
 performing a release upgrading these would take you 15 days.
 So that's 15 days a user would need to wait to get benefit of any new
 Lucene feature, or 15 days before he can download our stuff and start
 evaluating an upgrade.

 In the new model, he has the choice to:
  - download the Search release after 5 days, as he doesn't need/use
 Infinispan
  - download the same complete integration in 15 days

 So: really the user isn't waiting more. Even in the worst case, our
 "time to market" would be faster because of the missing intermediate
 releases as in reality it doesn't take us zero seconds to publish one. 
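
The arithmetic in the quoted passage can be written down as a toy model. This is only a sketch under the simplifying assumptions stated there (three tasks of 5 working days each, zero-cost releases); the class and method names are illustrative and do not come from either project.

```java
/** Toy model of the release arithmetic quoted above; all names are illustrative. */
final class ReleaseWait {
    // Figures assumed in the quoted text: 5 working days per upgrade task.
    static final int SEARCH_TO_NEW_LUCENE = 5;
    static final int DIRECTORY_UPGRADE = 5;
    static final int QUERY_UPGRADE = 5;

    static int totalWork() {
        return SEARCH_TO_NEW_LUCENE + DIRECTORY_UPGRADE + QUERY_UPGRADE;
    }

    /** Current model: nothing can be released until all three tasks are done. */
    static int currentModelWait(boolean usesInfinispan) {
        return totalWork();
    }

    /** Proposed model: Search ships after its own task; the full integration needs all three. */
    static int newModelWait(boolean usesInfinispan) {
        return usesInfinispan ? totalWork() : SEARCH_TO_NEW_LUCENE;
    }

    public static void main(String[] args) {
        System.out.println("current, Search only:     " + currentModelWait(false));
        System.out.println("new, Search only:         " + newModelWait(false));
        System.out.println("new, Search + Infinispan: " + newModelWait(true));
    }
}
```

Under these assumptions nobody waits longer than today, and users who only need Search wait less.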

Sure, I'm not arguing about the required working days. Only that it may be
easier to release a separate component after a HS release than the entire
ISPN, which may not be in a good position to do the release, e.g. due to
other ongoing changes.

...

 > To me the problem appears to be that ISPN a) uses HS (for querying) but b)
 > also extends it (as index storage) at the same time. For a) ISPN could live
 > with older HS versions AFAIU, whereas for b) following HS releases closely
 > is needed in order to keep that integration usable as HS (and Lucene)
 > advances.

 Right

 > So alternatively the ISPN Lucene directory as well as the HS directory (and
 > related integration tests) could be moved into a separate repository, with
 > its own release lifecycle. Then when a new HS release comes out, only that
 > separate integration project needs to be adapted and released (which of
 > course could be prepared during Alpha/Beta phases, so it could be released
 > on the same day as HS), whereas no new version of ISPN itself would be
 > needed. Also the dependency cycle would be nicely untangled:

 We already discussed this. It doesn't work as nicely, as the new module
 would have strict coupling to the versions of two different projects.

Yes, that may be a challenge. How big it is depends on the stability of the
used APIs/SPIs. I'd hope the integration could work e.g. compatibly across
the ISPNs of one major release family (say 7.x)?

...

 > * ISPN depends on HS
 > * "Lucene directory + provider" depends on HS and ISPN
 > * Users wishing to store their indexes in ISPN would add the "Lucene
 > directory + provider" dependency
 >
 > One issue I can see though is that the "Lucene directory + provider"
 > component could not be easily part of an ISPN/JDG distribution as the
 > dependency would be the other way around. Not sure whether that's actually a
 > problem.
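
The dependency layout in the quoted list can be checked mechanically. The following is a minimal sketch, assuming project-level edges only; the node names are labels for the components discussed in this thread, not real Maven coordinates, and the "current" layout is a simplification of the circular dependency described in the original proposal.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Depth-first cycle check over a small dependency graph; names are illustrative labels. */
final class DependencyCycles {

    /** Today: Search ships the Infinispan directory provider, and Infinispan uses Search for querying. */
    static Map<String, List<String>> currentLayout() {
        return Map.of(
            "hibernate-search", List.of("infinispan"),
            "infinispan", List.of("hibernate-search"));
    }

    /** The alternative from the quoted list: a separate "Lucene directory + provider" component. */
    static Map<String, List<String>> separateRepoLayout() {
        return Map.of(
            "hibernate-search", List.of(),
            "infinispan", List.of("hibernate-search"),
            "lucene-directory-provider", List.of("hibernate-search", "infinispan"));
    }

    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : deps.keySet()) {
            if (visit(node, deps, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> deps,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;   // reached a node that is still being explored: cycle
        }
        if (done.contains(node)) {
            return false;  // already fully explored, no cycle through here
        }
        onPath.add(node);
        for (String dep : deps.getOrDefault(node, List.of())) {
            if (visit(dep, deps, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        System.out.println("current layout has cycle:       " + hasCycle(currentLayout()));
        System.out.println("separate-repo layout has cycle: " + hasCycle(separateRepoLayout()));
    }
}
```

The sketch also makes the distribution concern visible: in the separate-repo layout nothing that "infinispan" depends on contains the provider, so an Infinispan download would not pull it in by itself.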

 That's a serious problem, and not open to debate: we need to have
 these integrations distributed by the Infinispan download.

Out of interest, why is that?

As a HS user working with the ISPN directory, I'd find it appealing if
updating HS would only require me to get an update to such an integration
module rather than updating the entire ISPN as well (maybe only the
integration bits within ISPN have actually changed, but I couldn't easily
know). Fewer moving pieces.

Btw., a combined download could still be assembled, admittedly with more
difficulty.

Anyways, just my 2ct. Both approaches are better than what we have now.

--Gunnar

...
 Which is why I think it's essential to follow the Infinispan versioning
 scheme.

 >
 >> - It will be easier to have new Infinispan code take advantage of
 >> features exposed on the Infinispan Lucene Directory
 >
 > This makes me curious; There is ISPN code which directly depends on that
 > directory implementation? What is this about, I assumed no such dependency
 > to exist. If so, having the directory and provider in the ISPN repo itself
 > indeed seems the only way, with the consequence of ISPN having to follow HS
 > releases in a timely manner.

 We recently had several improvements applied to the Infinispan
 Directory implementation, which then needed a one-liner change in the
 Search module to "activate" such a feature or pass on a configuration,
 essentially translating from the properties in Search to the DSL of
 Infinispan.
 Each of these had to be backported, to allow an older version of
 Search to be combined with a newer version of Infinispan
 without silently deactivating the feature - which is ultimately an
 Infinispan feature.
 In some cases Gustavo had to resort to reflection to invoke the right
 methods, as we're not upgrading Infinispan for branches 4.4 and 4.5 of
 Search, so some of the methods which need to be invoked do not
 exist, and are not testable.
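
The kind of reflection workaround described above might look roughly like the sketch below. This is not the actual Hibernate Search or Infinispan code: `OldBuilder`, `NewBuilder` and `enableFastWrites` are hypothetical stand-ins for an older and a newer version of a configuration builder, used only to show the pattern and its main weakness.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Sketch of invoking an optional method via reflection; all types here are hypothetical. */
final class OptionalFeature {

    /** Hypothetical stand-in for an older builder version that lacks the option. */
    static final class OldBuilder {
        boolean fastWrites = false;
    }

    /** Hypothetical stand-in for a newer builder version that exposes the option. */
    static final class NewBuilder {
        boolean fastWrites = false;

        public void enableFastWrites(boolean on) {
            this.fastWrites = on;
        }
    }

    /**
     * Calls the named boolean setter if the builder's class has it.
     * Returns false when the method is missing, i.e. the feature is skipped.
     */
    static boolean applyIfSupported(Object builder, String methodName, boolean value) {
        try {
            Method method = builder.getClass().getMethod(methodName, boolean.class);
            method.invoke(builder, value);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // older version on the classpath: nothing to call
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("Could not apply " + methodName, e);
        }
    }

    public static void main(String[] args) {
        NewBuilder newer = new NewBuilder();
        System.out.println(applyIfSupported(newer, "enableFastWrites", true));            // true
        System.out.println(applyIfSupported(new OldBuilder(), "enableFastWrites", true)); // false
    }
}
```

Because the method name is only a string, the compiler cannot verify it, and the branch is only exercised when the newer version is actually on the classpath, which matches the "not testable" complaint above.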

 An example of what this is leading to:

https://github.com/hibernate/hibernate-search/blob/4.5/infinispan/src/mai...

 It turns out that's not an isolated case. I'm now tinkering with some
 more awesome possibilities, but having to adjust that module is making
 it more convenient to use my own copy in Infinispan rather than reuse
 it. Ultimately this all just doesn't make much sense, and is resulting
 in missing improvements in both projects. For example, what I'm
 evolving in Infinispan is no longer Search compatible, while I'd wish
 both could benefit from it.

 -- Sanne

 >
 > --Gunnar
 >
 >
 > 2015-01-25 22:19 GMT+01:00 Sanne Grinovero <sanne(a)hibernate.org>:
 >>
 >> All,
 >> we've discussed several times the issues we have because of the
 >> circular dependency between Hibernate Search and Infinispan.
 >> Normally the pain point we aim to address with such discussions is the
 >> need to carefully coordinate between releases [of Search and
 >> Infinispan] in our quest for a final stable release, which brings some
 >> release pressure.
 >>
 >> More recently it has also become a compatibility problem, as the two
 >> projects target different platforms/environments, and have different
 >> life-cycle expectations - creating more maintenance work such as lots
 >> of back-porting of patches.. distracting from our goals.
 >>
 >> We already discussed some solutions, but none too convincing. I have a
 >> new fresh proposal which I feel is more interesting:
 >>
 >> # New plan
 >>
 >> 1) the module "/infinispan" from the Hibernate Search source tree is
 >> moved to the Infinispan project into some "hibernate-search-directory"
 >> Maven module.
 >>
 >> 2) Hibernate Search drops any dependency to Infinispan
 >>
 >> Reminder: this "/infinispan" module we have contains just a couple of
 >> classes, and represents the "DirectoryProvider" implementation which
 >> integrates with the Hibernate Search autodiscovery and creates a
 >> Directory instance (whose implementation always lived in Infinispan).
 >> Most of the code is about applying configuration properties,
 >> integration tests.
 >>
 >> It's a very simple plan, but has some valuable consequences:
 >>  - Search can move on without ever needing to wait for Infinispan -
 >> and vice versa.
 >>  - There is no longer a circular dependency
 >>  - Search would be in control of its Lucene dependency, and can
 >> upgrade as needed. We could experiment with different Lucene versions
 >> without necessarily waiting for Infinispan to solve compatibility issues
 >> first - which is often more complex, as it means Infinispan must be
 >> able to guarantee compatibility with a *range* of Lucene versions,
 >> to include both the target Lucene and the version currently consumed
 >> via Infinispan Query / Hibernate Search Engine.
 >>  - Infinispan can *opt* to stick with an older version of Search - or
 >> update - provided it can satisfy both a) integration with possible
 >> changes to the Search SPI b) an update for possible new requirements
 >> of the new Lucene version (the Lucene Directory might need
 >> compatibility fixes)
 >>  - The dependency structure would better match the one as provided by
 >> our Enterprise Products. For example, it's the JDG distribution - not
 >> Hibernate Search - to provide these integration bits, so it makes more
 >> sense to build the Directory against the specific version of
 >> Infinispan than against the specific version of Search.
 >>  - It will be easier to have new Infinispan code take advantage of
 >> features exposed on the Infinispan Lucene Directory - if all changed
 >> parts are in the same [Infinispan] repository. This is actually
 >> a problem right now, as it's holding back a POC meant to deliver great
 >> improvements with the Directory implementation.
 >>
 >> And not least we'll have faster builds ;-)
 >>
 >> # Drawbacks
 >>
 >> First one is we'll probably have our users need to change some details
 >> of how they build.
 >> Infinispan might need to fix some SPI related code before being able
 >> to upgrade, but historically our SPI has been extremely stable.
 >> The real problem is that when such a thing happens, after we've
 >> released a version of Search there might be some time before a
 >> compatible version of Infinispan is made available as well.
 >>
 >> In practice this means the gap of time in which we have to catch up on
 >> API changes is "exposed" to end users wanting to use our latest and
 >> possibly blocked - but while they would then see the tip of the
 >> iceberg of our integration, I believe it would still take the same
 >> amount of waiting time in terms of calendar dates in which the working
 >> duo is available to them - as with the current model in such a
 >> situation we need to wait for the same Infinispan release to happen
 >> before we can release ours.
 >> So: same time, but we'd have a leaner process, and possibly quicker
 >> releases for all users not interested in that - or just benefits in
 >> all those scenarios in which we don't break APIs which is very common.
 >> I've not identified other problems, so my opinion is that these are
 >> well worth the benefits.
 >>
 >> # Consequences for our users
 >>
 >> Not much. Even today we expect our users to depend on several jar
 >> files provided by the Infinispan team; this would be just one more.
 >> Opens some questions though:
 >>
 >>  A) Should the Maven group id be changed? I'd expect it to be
 >> transferred to "org.infinispan" group at least, and probably need a
 >> better artifact id too.
 >>
 >>  B) License. Our code is LGPL, most of Infinispan is ASL - but not all
 >> of it. So I expect it would be possible to keep the existing license
 >> at least for now, and defer eventual license changes as a separate
 >> step (if people feel need for any change at all).
 >>
 >>  C) Documentation. Besides the needed updates in Maven coordinates /
 >> download sources, I don't expect much of a difference: we'd still
 >> explain how to set this integration up.
 >>
 >>  D) Distribution. Today we distribute this module, and its
 >> dependencies, in our release bundles. Which implies we distribute a
 >> copy of various Infinispan jars.
 >> I think we should drop these from our distribution - even though it
 >> might seem counter-intuitive:
 >> while it might seem convenient to have these included, the whole point
 >> of the change would be that there would be more flexibility in which
 >> versions of Infinispan would work with Search. And actually the
 >> integration tests and this specific knowledge would be responsibility
 >> of Infinispan.
 >>
 >> Am I failing to see a more critical issue?
 >> How would you all feel about our code being transferred to the
 >> different project?
 >>
 >> Sanne
 >> _______________________________________________
 >> hibernate-dev mailing list
 >> hibernate-dev(a)lists.jboss.org
 >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
 >
 >
