Improving the performance of index writers

Friday, 17 October 2014

Hi all,
 we have been breaking down the problem of index-writing latency into
smaller, manageable tasks. The umbrella JIRA issue with the general
overview is here:

 - https://issues.jboss.org/browse/ISPN-4847

As you can see, some minor improvements have been merged already.
Most of them provide only a 10% to 30% gain each, some provide more,
and the combined effect is getting interesting.

While these minor fixes (even combined) won't give us the many orders
of magnitude of performance improvement that we'd like to see, they
are important because they pave the way for the more significant
efficiency improvements.

I documented the main idea in the Hibernate Search tracker, as it
belongs in the Hibernate Search engine:

 https://hibernate.atlassian.net/browse/HSEARCH-1699

I don't expect that to be implemented overnight, but Gustavo already
sent a PR for the ASYNC case, which is based on the same principle of
avoiding the commits but is simpler to implement:

  https://hibernate.atlassian.net/browse/HSEARCH-1693

We expect this one to be a proof of concept for the performance that
we'll get from HSEARCH-1699, and I think it's also very useful on its
own. Previously, users of ASYNC indexing were forced into a "very
async" architecture, which might have been a bit too hard to manage.
Being able to set a maximum delay for the async operation should now
make it an acceptable compromise for a much wider range of use cases.
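
To make the "avoid the commits, but bound the delay" idea a bit more
concrete, here is a minimal sketch. It is not the code from the
HSEARCH-1693 PR: the class DelayedCommitWriter and all of its methods
are hypothetical names made up for illustration, and time is passed
in explicitly (instead of using a timer thread) only to keep the
example deterministic. Writes are buffered without committing, and a
periodic tick commits the whole batch once the oldest pending write
has waited for the configured maximum delay.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch, not Hibernate Search or Infinispan code:
 * buffers index writes and commits them in one batch once the
 * oldest pending write has waited maxDelayMillis.
 */
public class DelayedCommitWriter {
    private final long maxDelayMillis;
    private final List<String> pending = new ArrayList<>();
    private long oldestPendingAt = -1; // -1 means nothing is pending
    private int commitCount = 0;
    private int committedDocs = 0;

    public DelayedCommitWriter(long maxDelayMillis) {
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Buffers a write; no commit happens on the caller's path. */
    public void add(String doc, long nowMillis) {
        if (pending.isEmpty()) {
            oldestPendingAt = nowMillis;
        }
        pending.add(doc);
    }

    /**
     * Meant to be called periodically, e.g. by a timer thread.
     * Commits the whole batch if the oldest pending write is at
     * least maxDelayMillis old; otherwise does nothing.
     */
    public void tick(long nowMillis) {
        if (!pending.isEmpty() && nowMillis - oldestPendingAt >= maxDelayMillis) {
            // A real writer would perform the expensive commit here.
            commitCount++;
            committedDocs += pending.size();
            pending.clear();
            oldestPendingAt = -1;
        }
    }

    public int getCommitCount() { return commitCount; }
    public int getCommittedDocs() { return committedDocs; }
    public int getPendingCount() { return pending.size(); }
}
```

In this sketch the number of commits depends on how much time passes,
not on how many writes arrive, while no write stays uncommitted for
much longer than the maximum delay (plus the tick interval).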

Essentially this will decouple the achievable throughput of indexed
caches from the RPC latency. Obviously this latency will still be the
limiting factor in some dimensions: in particular, the response time
of a single synchronous indexed write will still depend primarily on
Infinispan's ability to reduce the number of blocking RPCs needed for
a single write.

Feedback very welcome!

Sanne
