[infinispan-dev] Improving the performance of index writers

Mon Oct 20 09:55:58 EDT 2014

On 20 Oct 2014, at 14:10, Sanne Grinovero <sanne at infinispan.org> wrote:

> On 20 October 2014 12:59, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>> HSEARCH-1699 looks good. A few comments.
>> 
>> Maybe from a user point of view we want to expose the number of ms the user is OK to delay a commit due to indexing. That would mean you can wait up to that number before calling it a day and emptying the queue. The big question I have, which you allude to, is whether this mechanism should have some kind of back pressure mechanism by also capping the queue size.
> 
> Gustavo is implementing that for the ASYNC backend, but the SYNC
> backend will always block the user thread until the commit is done
> (and some commit is going to be done ASAP).
> About to write a mail to hibernate-dev to discuss the ASYNC backend
> property name and exact semantics.

I understand that the sync mode will block until the commit is done. What I am saying is that for HSEARCH-1699 (SYNC) (and probably also for the ASYNC mode), you can ask the user “how much more” they are willing to wait for the index to be committed compared to “as fast as possible”. That becomes your window of aggregation. Does that make sense?
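
To make the idea concrete, here is a rough sketch of such a window of aggregation. It is not the HSEARCH-1699 code: the class name, the method collectBatch and the parameters maxExtraDelayMs and maxBatchSize are all hypothetical, and changesets are plain strings for brevity. The bounded queue stands in for the back pressure mentioned earlier, since producers calling put() block once it is full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class AggregationWindowSketch {

    /**
     * Hypothetical helper: blocks until one changeset is available, then keeps
     * collecting for at most maxExtraDelayMs (the "window of aggregation") or
     * until maxBatchSize changesets have been gathered, whichever comes first.
     */
    public static List<String> collectBatch(BlockingQueue<String> pending,
                                            long maxExtraDelayMs,
                                            int maxBatchSize) throws InterruptedException {
        List<String> batch = new ArrayList<>();
        batch.add(pending.take()); // wait for the first changeset
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxExtraDelayMs);
        while (batch.size() < maxBatchSize) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                break; // the user's extra delay budget is spent
            }
            String next = pending.poll(remainingNanos, TimeUnit.NANOSECONDS);
            if (next == null) {
                break; // window elapsed without more work arriving
            }
            batch.add(next);
        }
        return batch; // the caller would apply these and issue a single commit
    }

    public static void main(String[] args) throws InterruptedException {
        // Bounded capacity: put() blocks when full, which caps the queue size.
        BlockingQueue<String> pending = new LinkedBlockingQueue<>(1000);
        pending.put("changeset-1");
        pending.put("changeset-2");
        List<String> batch = collectBatch(pending, 50, 100);
        System.out.println("One commit would cover " + batch.size() + " changesets");
    }
}
```

With maxExtraDelayMs set to 0 this degenerates to one changeset per commit, which is the "as fast as possible" end of the trade-off.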

> 
>> 
>> BTW, in the following paragraph, either you lost me or you are talking nonsense:
>> 
>>> Systems with a high degree of parallelism will benefit from this, and the performance should converge to the performance you would have without ever doing a commit; however, if the frequency of commits is approaching zero, it also means that the average latency of each operation will get significantly higher. Still, in such situations, assuming we are for example stacking up a million changesets between each commit, that implies this solution would be approximately a million times faster than the existing design (a million would not be realistic, of course, as it implies a million parallel requests).
>> 
>> I think you can only converge to an average of 1/2 * (commit + configured delay time) latency-wise. I am assuming latency is what people are interested in, not the average CPU / memory load of indexing.
> 
> I'm sorry I'm confused. There is no configured delay time for the SYNC
> backend discussed on HSEARCH-1699, are you talking about the Async
> one? But my paragraph above is strictly referring to the strategy
> meant to be applied for the Sync one.

There is a delay. It is what you call the "target frequency of commits". And the alternative I proposed is not so much a frequency as how much longer you delay a flush in the hope of getting more work in.

In your model of a fixed frequency, then the average delay is 1/2 * 1/frequency + commit time.
Or do you have something different in mind?
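
To spell out the arithmetic behind that figure: the symbols f (commit frequency), T (commit period) and c (commit time) are my own shorthand, and I am assuming requests arrive uniformly at random within a commit period. The numbers in the example are made up for illustration only.

```latex
% Fixed commit frequency f gives a commit period T = 1/f.
% A request arriving uniformly at random in [0, T) waits T/2 on average
% for the next commit to start, and then the commit itself takes c.
\[
  \mathbb{E}[\text{delay}] \;=\; \frac{T}{2} + c \;=\; \frac{1}{2f} + c
\]
% Illustrative values: f = 10 commits/s (T = 100 ms), c = 20 ms
\[
  \mathbb{E}[\text{delay}] \;=\; \frac{100\,\text{ms}}{2} + 20\,\text{ms} \;=\; 70\,\text{ms}
\]
```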