[hibernate-dev] [HSEARCH] Implementing mass indexer with JSR 352 batch application
Emmanuel Bernard
emmanuel at hibernate.org
Wed Feb 24 06:01:35 EST 2016
I’ve seen customers essentially rewriting the mass indexer logic in Spring Batch because it offered retries, was more aligned with what they used, and let them apply a tiny query twist that the current mass indexer code does not expose.
So I am not too concerned about a performance gap.
We would hopefully be able to reuse bits of the mass indexer. Unless your massive perf improvements make it a big monolithic thing. But private branches are as good as proprietary software, they are not relevant ;)
Emmanuel
> On 24 Feb 2016, at 11:45, Sanne Grinovero <sanne at hibernate.org> wrote:
>
> It sounds interesting, but bear with me: I'm not familiar with it, so
> I'll throw out some doubts.
>
> Will it work without a JEE container?
> I guess some implementations might be embeddable, but still how simple
> would that be for the user?
>
> Note that we already allow rebuilding the index via JMX commands,
> that's quite standard too and it's hooked up by other integrators such
> as Infinispan or the WildFly admin console, and both expose a CLI too.
> I guess other stacks will hook up their favourite approach easily too.
>
> The idea sounds good, but I'm not convinced it has a compelling
> benefit/effort ratio, especially not if it turns out to be slower
> than our current implementation (heck, it's not as fast as I'd want
> it yet, but it's OK, and I know how to make it much better as long
> as we control the details).
>
> Performance of this component is important. On the same database I've
> had MassIndexer POCs which would take 6 months to complete, a week to
> complete, or 3 minutes. Compared to those tests, the current
> implementation is the one which takes approximately 4 hours: I could
> never make the last optimisations generic enough for general-purpose
> use, but getting there would just be a matter of fixing open JIRAs
> for a couple of very concrete points.
> My point being that people often need to be able to reindex in a
> couple of hours, no matter the size. 2 hours is of course an
> arbitrary, very human figure, but it seems to be the generally
> accepted threshold for such an operation: beyond being a developer's
> tool, it's also a tool for recovery from critical failures, so it's
> absolutely unacceptable that a system potentially needs weeks to
> recover from an issue.
>
>
> On 24 February 2016 at 09:56, Vlad Mihalcea <mihalcea.vlad at gmail.com> wrote:
>> +1
>>
>> Sounds like a good idea.
>>
>> On Wed, Feb 24, 2016 at 11:39 AM, Gunnar Morling <gunnar at hibernate.org>
>> wrote:
>>
>>> Hi,
>>>
>>> I've been contemplating the idea of creating a JSR-352-style batch
>>> application for re-indexing one or more entity types in Hibernate
>>> Search.
>>>
>>> Functionally, it'd be the same as the current mass indexer, but using
>>> JSR 352 would provide some nice benefits:
>>>
>>> * Operation through standard batch interfaces (e.g. CLI, web console
>>> or whatever servers provide)
>>> * Standardized monitoring and logging
>>> * Standardized error handling, restartability after failures
>>>
>>> I thought this might be an interesting GSoC idea.
>>>
>>> It's very isolated and also should not be too complex to do. If the
>>> student is quick, one further idea could be to provide some UI
>>> functionality for controlling this (I'm not sure what's already
>>> available in the WF web console).
>>>
>>> Any thoughts?
>>>
>>> Thanks,
>>>
>>> --Gunnar
>>> _______________________________________________
>>> hibernate-dev mailing list
>>> hibernate-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>>