[infinispan-dev] Improving the performance of index writers

Sanne Grinovero sanne at infinispan.org
Mon Oct 20 13:09:57 EDT 2014


On 20 October 2014 17:49, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
> So assuming an idle index loop, the first commit would lead to the execution of the indexing work and flush. If two or more commits come during that flush time, then the queue would be > 1 and “batching” would occur. Correct?

Yes that's the idea.

As soon as the IndexWriter thread is done with that first commit, it
gets back to see if more work was queued up and takes *all* of them
for processing.

The interesting part is why taking *all* of them is not a scary
concept here:

Memory-wise it's not a problem, as all those changesets are already on
the stack, and they still prevent further work from being created (as
producers are blocked), so there is no additional memory cost compared
to the previous approach.

Time-wise, applying one or "all" makes no significant difference, as
we've seen that applying each changeset is very efficient, while the
commit is what introduces the significant delay (orders of magnitude
difference).

Only if we had millions of changesets in a batch would there be a
noticeable change in latency, as the first threads to have enqueued
something would wait slightly longer than normal; but even then, the
average latency across all producers would be lower than with the
current approach, as the other producers are not waiting in a long
line.
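To make the "takes *all* of them" step concrete, here is a minimal sketch (hypothetical names, not the actual Hibernate Search code) of a consumer that blocks for the first changeset and then drains everything queued up during the previous commit into a single batch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Hypothetical sketch of the "take all" step: block until at least one
 * changeset exists, then drain whatever else was queued during the
 * previous commit into one batch. Names are illustrative, not the real API.
 */
public class ChangesetBatcher {
    // Bounded queue: producers block in put() when it is full, which is
    // the natural back pressure described above.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(1024);

    public void enqueue(String changeset) {
        try {
            queue.put(changeset);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** One iteration of the writer loop: returns all pending changesets. */
    public List<String> takeBatch() {
        List<String> batch = new ArrayList<>();
        try {
            batch.add(queue.take());   // wait without burning CPU
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        queue.drainTo(batch);          // grab everything queued meanwhile
        return batch;                  // caller would then apply(batch); commit();
    }
}
```

The batch size is never chosen explicitly: it is simply however many changesets arrived while the previous commit was in flight.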

Sanne


>
>
>
> On 20 Oct 2014, at 16:57, Sanne Grinovero <sanne at infinispan.org> wrote:
>
>> On 20 October 2014 14:55, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>>>
>>> On 20 Oct 2014, at 14:10, Sanne Grinovero <sanne at infinispan.org> wrote:
>>>
>>> On 20 October 2014 12:59, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>>>
>>> HSEARCH-1699 looks good. A few comments.
>>>
>>> Maybe from a user point of view we want to expose the number of ms the user
>>> is ok to delay a commit by due to indexing. Which would mean that you can wait
>>> up to that number before calling it a day and emptying the queue. The big
>>> question I have, which you allude to, is whether this mechanism should have
>>> some kind of back pressure mechanism by also capping the queue size.
>>>
>>>
>>> Gustavo is implementing that for the ASYNC backend, but the SYNC
>>> backend will always block the user thread until the commit is done
>>> (and some commit is going to be done ASAP).
>>> About to write a mail to hibernate-dev to discuss the ASYNC backend
>>> property name and exact semantics.
>>>
>>>
>>> I understand that the sync mode will block until the commit is done. What I
>>> am saying is that for HSEARCH-1699 (SYNC) (and probably also for the ASYNC
>>> mode), you can ask the user “how much more” he is willing to wait for the
>>> index to be committed compared to “as fast as possible”. That becomes your
>>> window of aggregation. Does that make sense?
>>>
>>>
>>>
>>>
>>> BTW, in the following paragraph, either you lost me or you are talking
>>> nonsense:
>>>
>>> Systems with a high degree of parallelism will benefit from this, and the
>>> performance should converge to the performance you would have without ever
>>> doing a commit; however, if the frequency of commits is approaching zero,
>>> it also means that the average latency of each operation will get
>>> significantly higher. Still, in such situations, assuming we are for example
>>> stacking up a million changesets between each commit, this solution
>>> would be approximately a million times faster than the existing
>>> design (a million is not realistic of course, as it implies a million
>>> parallel requests).
>>>
>>>
>>> I think you can only converge, latency-wise, to an average of 1/2 * (commit +
>>> configured delay time). I am assuming latency is what people are
>>> interested in, not the average CPU / memory load of indexing.
>>>
>>>
>>> I'm sorry, I'm confused. There is no configured delay time for the SYNC
>>> backend discussed in HSEARCH-1699; are you talking about the Async
>>> one? But my paragraph above is strictly referring to the strategy
>>> meant to be applied to the Sync one.
>>>
>>>
>>> There is a delay: it is what you call the “target frequency of commits”. And
>>> the alternative I proposed is not so much a frequency as a measure of how
>>> much more you delay a flush in the hope of getting more work in.
>>
>> No, there is no delay: if there is a constant flow of incoming
>> write operations, the write loop degenerates into something like
>> (pseudo-code, overly simplified):
>>
>> while (true) {
>>   apply(getNextChangeset());
>>   commit();
>> }
>>
>> So it's a busy loop with no waits: the "target frequency of commits"
>> will naturally match the maximum frequency of commits which the
>> storage can handle. As we've said, applying the changes is not a
>> significant cost, so it's essentially the same as
>>
>> while (true) {
>>   commit();
>> }
>>
>> That code will loop faster if the commits are quick. The point is
>> that the number of changes we can apply in a period T does not
>> depend on the time it takes to perform commit operations on the
>> underlying storage.
>>
>> The real code will need to be a bit more complex, for example to
>> handle this case:
>>
>> while (true) {
>>   changeset = getNextChangeset();
>>   if (changeset.isEmpty) {
>>     waitWithoutBurningCPU();
>>   }
>>   else {
>>     apply(all pending changes);
>>     commit();
>>   }
>> }
>>
>>>
>>> In your model of a fixed frequency, the average delay is 1/2 *
>>> 1/frequency + commit time.
>>> Or do you have something different in mind?
>>
>> I hope the above example clarifies things. It's not a fixed frequency,
>> it's "as fast as it can", but with latency no better than that of a
>> single commit. What I'm attempting to explain when
>> comparing "frequency" is that this is the optimal speed for each
>> situation, especially compared to the current solution, and regardless
>> of queueing up.
>> There is an inherent form of back pressure: it's limited by the cost
>> of the single commit, which will delay further changesets in the
>> queue... but the queue depth doesn't get larger than 1 and we don't
>> risk running out of space, because it blocks producers, and therefore
>> the application.
>>
>> Sanne
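The back pressure described above can be sketched as follows (hypothetical names, not the actual backend code, and the apply/commit work is elided): each producer enqueues its changeset and then blocks until the single commit covering it has completed, so pending work is bounded by whatever arrives during one commit:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Hypothetical sketch of the SYNC backend's inherent back pressure:
 * producers block until the commit that includes their changeset is done.
 */
public class SyncBackend {
    private static final class Work {
        final String changeset;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Work(String changeset) { this.changeset = changeset; }
    }

    private final BlockingQueue<Work> queue = new LinkedBlockingQueue<>();

    /** Producer side: enqueue and wait for the covering commit. */
    public void write(String changeset) {
        Work w = new Work(changeset);
        try {
            queue.put(w);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        w.done.join(); // blocks the producer until the batch is committed
    }

    /** Consumer side: one iteration of the writer loop. */
    public int processOneBatch() {
        List<Work> batch = new ArrayList<>();
        try {
            batch.add(queue.take());   // wait without burning CPU
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        queue.drainTo(batch);
        // apply(all pending changes); commit();  -- elided in this sketch
        for (Work w : batch) {
            w.done.complete(null);     // unblock every producer in the batch
        }
        return batch.size();
    }
}
```

Because every producer in a batch is released by the same commit, a slow commit throttles the whole application rather than letting the queue grow without bound.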
>>
>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>


