[infinispan-dev] CR3: async stores, again

Sanne Grinovero sanne.grinovero at gmail.com
Tue Aug 31 06:57:57 EDT 2010


2010/8/31 Mircea Markus <mircea.markus at jboss.com>:
>
> On 27 Aug 2010, at 19:37, Sanne Grinovero wrote:
>
>> Some more news about this:
>> Mircea attached a patch he was working on to JIRA, but he hadn't time
>> to fix all tests, so I'm continuing.
>>
>> There are two more issues:
>>
>> 1)keeping an ordered list of transactions works fine for transactions
>> only, but modifications done in a transaction are wrongly reordered
>> with operations not being done in a transaction.
> ah I see. The operations within a transaction do not get reordered though, right?

That's right, if I only use transactions it works as a charm, ignoring
the corner-case of the clear() operation as explained in next
paragraph.
The problem with Lucene (to show a use case) is that it sometimes
starts a batch for performance reasons; other one-shot operations are
missing the batch.

>
>>
>> 2) Even in the committed current code, I see that a clear() operation
>> is sent async together with all other operations. Then, when the state
>> changes are performed, the clear() causes the previously found
>> operations to be canceled, but "previous" applies here on the iterator
>> on the state map, not on the order of application of those state
>> Modifications, so the clear() is removing random Modifications.
> yes
>>
>> 3) Also my attempt to reuse some code from AbstractCacheStore though
>> inheritance is not going to make it as this implementation is doing
>> too much for the needs of AsyncStore: it cares for async Purge, and
>> all methods I wanted to commit are slightly different because of that.
>>
>> Proposal:
>> 1- Out-of-transaction modifications are pushed into a BlockingDeque
>> instead of a Map.
> do we need a deque  for this? Also do we want it blocking? Perhaps we can make this configurable and operate on an interface directly?

any kind of order-mantaining will do, I did it with an unbounded
blocking deque as that's nice to wait on for new work to appear, to
avoid starvation and provide a clean road for shutdown.

>>
>> 2- A prepare(tx, list) sets the list aside in the "transactions" Map.
>>
>> 3- A commit(tx) takes the list of operations from the "transactions"
>> and pushes the list on the same BlockingDeque as all other operations,
>> guaranteeing same order of application.
> looks good
>>
>> 4- a clear operation clears the deque and pushes a Clear() operation on it
>>
>> 5- a purge operation is sent though the same deque: only risk I see is
>> being removed by a clear, but I bet there won't be much to purge after
>> that :) Also we don't care if the purge is going to be handled async
>> in the other store, that's a store implementation detail from what I
>> see in AbstractCacheStore.
Not sure we need to process purge in an async way: it is already
called asynchronously by the eviction thread, so we can just delegate
the call to the underlaying call.
Right, I got inspired by previous version. I have it done now by the
AsyncStoreCoordinator, not spawning additional AsyncProcessors, as
it's assumed to be async and return quickly; this way this kind of
operation gets a sort of "priority" towards other modifications, but
sill async from the point of view of the caller.

>>
>> 6- A single processor of AsyncStore drains blocks of a fixed
>> configurable amount, puts them in a local map, applies the coalescing
>> algorithm, and puts the results on a new state map.
>> It seems to me that this worker needs to be exactly one, or reordering
>> problems arise again in the order of the blocks taken from the deque.
> +1

Actually that was the initial idea, but then I had to remove the
"blocks of fixed configurable amount" to keep this one as simple as
possible, we can add that later on: it seems to make sense with the
write-behind delay of Galder's wiki link.

>>
>> 6/a - When receiving a clear() - exceptional case - we clear the map
>> of operations, and issue the clear() synchronously: not adding
>> anything to the map, and aquiring a write lock "clearop" to avoid race
>> conditions with the workers which might already be busy but happen to
>> finish after the clear().
> aren't we using only one worker as per the previous paragraph?
One AsyncStoreCoordinator, N AsyncProcessors. The one manages the
queue, guarantees ordering and performs the clear() and purge(), when
it finds Store or Remove operations it pushes these to the
AsyncProcessors to deal with the (slow?) Store in async way.

>>
>> 6/b - When receiving a purge() this bypasses other logic and is
>> directly passed to the decorated Store.
> +1
>>
>> 7- Now here it becomes tricky, we removed the transactions problem,
>> the clear operation and the purge from the table. But there still is a
>> map of Modifications to be applied, by N threads, and to which another
>> thread might write. This looks like the original problem in simplified
>> form, but reorderings are still possible as worker A might get some
>> work on key K1, then worker B kicks in, gets new work about K1 and
>> happens to perform it before A gets priority again.
>> Clearly this must be performed either by one thread or using very fine
>> grained locking: as we're only interested on a per-key lock, we don't
>> care on the order of operations on other keys: we could use a second
>> map to store the keys which "are being processed": when a worker takes
>> some work it uses atomic operations to check if somebody is working on
>> the same stuff, if it is it could either wait or put it back (if it
>> wasn't changed, otherwise discard) and pick another work to do.
>>
> It might be good to use the same approach as with async marshalling: use one thread per default which would guarantee ordering. Allow a user to specify more that one thread, and clearly explain that reordering might happen: there might be situations where one would not care about ordering after all.

I think we solved it nicely, guaranteeing correct ordering and still
doing most of the work in parallel.
If problems arise, we'll advise to use a single thread.

> One more thing I've just noticed: AsyncStore processes Commit but doesn't do the same with Rollback. On rollback it should remove the tx from the transaction(as described at 2) ) map to avoid memory leaks.

Right, doing that.

[...more pages were here...]

Cheers,
Sanne



More information about the infinispan-dev mailing list