[infinispan-dev] Improving the performance of index writers

Sanne Grinovero sanne at infinispan.org
Mon Oct 20 08:10:47 EDT 2014


On 20 October 2014 12:59, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
> HSEARCH-1699 looks good. A few comments.
>
> Maybe from a user point of view we want to expose the number of ms the user is OK to delay a commit due to indexing. That would mean you can wait up to that number before calling it a day and emptying the queue. The big question I have, which you allude to, is whether this mechanism should also have some kind of back pressure by capping the queue size.

Gustavo is implementing that for the ASYNC backend, but the SYNC
backend will always block the user thread until the commit is done
(and some commit is going to be done ASAP).
About to write a mail to hibernate-dev to discuss the ASYNC backend
property name and exact semantics.
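
To make the two knobs concrete (a maximum delay before a commit, and a
capped queue for back pressure), here is a minimal sketch. It is an
illustration only: the class name DelayedCommitQueue, its constructor
parameters and the applyAndCommit callback are hypothetical, and it is
not the HSEARCH-1693 implementation nor any Hibernate Search API. Error
handling in the flusher is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch: async indexing queue with a max delay and a size cap. */
public class DelayedCommitQueue<T> implements AutoCloseable {
    private final List<T> pending = new ArrayList<>();
    private final int maxQueueSize;
    private final long maxDelayMs;
    private final Consumer<List<T>> applyAndCommit; // stands in for "apply changes, then commit"
    private final Thread flusher;
    private boolean closed;

    public DelayedCommitQueue(int maxQueueSize, long maxDelayMs, Consumer<List<T>> applyAndCommit) {
        this.maxQueueSize = maxQueueSize;
        this.maxDelayMs = maxDelayMs;
        this.applyAndCommit = applyAndCommit;
        this.flusher = new Thread(this::flushLoop, "index-flusher");
        this.flusher.start();
    }

    /** Does not wait for a commit; blocks only while the queue is full (back pressure). */
    public synchronized void submit(T change) {
        try {
            while (pending.size() >= maxQueueSize && !closed) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while the queue was full", e);
        }
        if (closed) {
            throw new IllegalStateException("queue is closed");
        }
        pending.add(change);
        notifyAll(); // wake the flusher
    }

    private void flushLoop() {
        try {
            while (true) {
                List<T> batch;
                synchronized (this) {
                    while (pending.isEmpty() && !closed) {
                        wait();
                    }
                    if (pending.isEmpty()) {
                        return; // closed and fully drained
                    }
                    // The delay is measured from when the flusher sees the first change.
                    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxDelayMs);
                    while (!closed && pending.size() < maxQueueSize) {
                        long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                        if (remainingMs <= 0) {
                            break;
                        }
                        wait(remainingMs);
                    }
                    batch = new ArrayList<>(pending);
                    pending.clear();
                    notifyAll(); // wake producers blocked on a full queue
                }
                applyAndCommit.accept(batch); // one commit for the whole batch
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Flushes whatever is still queued, then stops the flusher thread. */
    @Override
    public void close() {
        synchronized (this) {
            closed = true;
            notifyAll();
        }
        try {
            flusher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch a commit happens when the delay expires or the queue
fills up, whichever comes first, so no batch is larger than the cap.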

>
> BTW, in the following paragraph, either you lost me or you are talking nonsense:
>
>> Systems with a high degree of parallelism will benefit from this, and the performance should converge to the performance you would have without ever doing a commit; however, if the frequency of commits approaches zero, it also means that the average latency of each operation will get significantly higher. Still, in such situations, assuming we are for example stacking up a million changesets between each commit, this solution would be approximately a million times faster than the existing design (a million would not be realistic of course, as it implies a million parallel requests).
>
> I think you can only converge to an average of 1/2 * (commit + configured delay time) latency-wise. I am assuming latency is what people are interested in, not the average CPU / memory load of indexing.

I'm sorry, I'm confused. There is no configured delay time for the SYNC
backend discussed on HSEARCH-1699; are you talking about the ASYNC
one? My paragraph above refers strictly to the strategy meant to be
applied to the SYNC one.
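
As a rough illustration of that SYNC strategy (each writer blocks until
a commit covering its change is done, and a single commit can cover all
the changesets queued in the meantime), here is a minimal sketch.
HSEARCH-1699 describes the actual design; the names GroupCommitSketch,
writeSync and commitAction below are hypothetical and do not come from
Hibernate Search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: many blocked writers share one commit. */
public class GroupCommitSketch implements AutoCloseable {
    private static final class Pending {
        final Runnable change;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(Runnable change) { this.change = change; }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final AtomicInteger commits = new AtomicInteger();
    private final Runnable commitAction; // stands in for the expensive index commit
    private final Thread committer;

    public GroupCommitSketch(Runnable commitAction) {
        this.commitAction = commitAction;
        this.committer = new Thread(this::commitLoop, "committer");
        this.committer.setDaemon(true);
        this.committer.start();
    }

    /** Blocks the calling thread until a commit that includes its change has completed. */
    public void writeSync(Runnable change) {
        Pending p = new Pending(change);
        queue.add(p);
        p.done.join();
    }

    private void commitLoop() {
        List<Pending> batch = new ArrayList<>();
        while (true) {
            try {
                batch.add(queue.take()); // wait for at least one change
            } catch (InterruptedException e) {
                return; // close() was called
            }
            queue.drainTo(batch); // plus everything queued while we were busy
            try {
                for (Pending p : batch) {
                    p.change.run();
                }
                commitAction.run(); // one commit for the whole batch
                commits.incrementAndGet();
                for (Pending p : batch) {
                    p.done.complete(null);
                }
            } catch (RuntimeException e) {
                for (Pending p : batch) {
                    p.done.completeExceptionally(e);
                }
            }
            batch.clear();
        }
    }

    public int commitCount() {
        return commits.get();
    }

    /** Stops the committer; writers still queued at this point are not released. */
    @Override
    public void close() {
        committer.interrupt();
    }
}
```

There is no configured delay here: the committer starts a new commit as
soon as the previous one finishes, so the batch size simply follows the
number of writers that arrived during the previous commit.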

Thanks for the feedback! BTW I didn't cross-post to hibernate-dev as
this was meant as a heads up for the Infinispan team, who would
otherwise have no visibility on what we're planning, but I should
really start a discussion thread for the details on hibernate-dev.

Infinispan developers: if you're interested in following this subject,
please comment on the JIRAs or join the hibernate-dev mailing list.

Sanne

>
> Emmanuel
>
> On 17 Oct 2014, at 20:15, Sanne Grinovero <sanne at infinispan.org> wrote:
>
>> Hi all,
>> we have been breaking down the problem of latency during index
>> writing into smaller, manageable tasks; you can find the general
>> overview JIRA here:
>>
>> - https://issues.jboss.org/browse/ISPN-4847
>>
>> As you can see, some minor improvements have been implemented already,
>> and while most of them provide only a 10% to 30% improvement each, some
>> provide more, and combined the composite ratio is getting interesting.
>>
>> While these minor issues (even combined) won't give us the many orders
>> of magnitude performance improvements that we'd like to see, they are
>> important as they are paving the road to the more significant
>> efficiency improvements.
>>
>> I documented the main idea here, as it belongs into the Hibernate Search engine:
>>
>> https://hibernate.atlassian.net/browse/HSEARCH-1699
>>
>> I don't expect that to be implemented overnight, but Gustavo already
>> sent a PR for the ASYNC case, which is based on the same principle of
>> avoiding the commits but is simpler to implement:
>>
>>  https://hibernate.atlassian.net/browse/HSEARCH-1693
>>
>> We expect this one to be a proof of concept for the performance that
>> we'll get from HSEARCH-1699, and also I think it's very useful on its
>> own: previously users of ASYNC indexing were forced into a "very
>> async" architecture which might have been a bit too hard to manage;
>> now that they can set a maximum delay for the async operation, I
>> expect it to be an acceptable compromise for a much wider range
>> of use cases.
>>
>> Essentially this will decouple the achievable throughput of indexed
>> caches from the RPC latency, although obviously this latency will
>> still be the limiting factor for some dimensions: in particular, the
>> response time for a single synchronous indexed write will still
>> depend primarily on Infinispan's ability to reduce the number
>> of blocking RPCs needed for a single write.
>>
>> Feedback very welcome!
>>
>> Sanne
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev


