[infinispan-dev] CR3: async stores, again

Mircea Markus mircea.markus at jboss.com
Tue Aug 31 06:24:14 EDT 2010


On 27 Aug 2010, at 19:37, Sanne Grinovero wrote:

> Some more news about this:
> Mircea attached a patch he was working on to JIRA, but he didn't have
> time to fix all the tests, so I'm continuing.
> 
> There are three more issues:
> 
> 1) Keeping an ordered list of transactions works fine for transactions
> only, but modifications done in a transaction get wrongly reordered
> relative to operations not done in a transaction.
Ah, I see. The operations within a transaction do not get reordered though, right?

> 
> 2) Even in the current committed code, I see that a clear() operation
> is sent async together with all other operations. Then, when the state
> changes are performed, the clear() causes the previously found
> operations to be cancelled, but "previous" here refers to the iterator
> over the state map, not to the order in which those Modifications are
> applied, so the clear() ends up removing random Modifications.
Yes.
> 
> 3) Also, my attempt to reuse some code from AbstractCacheStore through
> inheritance is not going to work out, as that implementation does too
> much for the needs of AsyncStore: it handles purge asynchronously, and
> all the methods I wanted to reuse are slightly different because of that.
> 
> Proposal:
> 1- Out-of-transaction modifications are pushed into a BlockingDeque
> instead of a Map.
Do we need a deque for this? Also, do we want it blocking? Perhaps we can make this configurable and operate on an interface directly?
> 
> 2- A prepare(tx, list) sets the list aside in the "transactions" Map.
> 
> 3- A commit(tx) takes the list of operations from the "transactions"
> and pushes the list on the same BlockingDeque as all other operations,
> guaranteeing same order of application.
Looks good.
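Something along these lines, perhaps? Just a rough sketch with made-up
names - Object stands in for the Modification and transaction types:

  import java.util.List;
  import java.util.concurrent.BlockingDeque;
  import java.util.concurrent.ConcurrentHashMap;
  import java.util.concurrent.ConcurrentMap;
  import java.util.concurrent.LinkedBlockingDeque;

  class AsyncQueueSketch {
     // single queue: preserves the order in which changes become visible
     private final BlockingDeque<Object> changeQueue =
           new LinkedBlockingDeque<Object>();
     // prepared but not yet committed transactions, keyed by tx
     private final ConcurrentMap<Object, List<Object>> transactions =
           new ConcurrentHashMap<Object, List<Object>>();

     void store(Object modification) {
        // 1- out-of-transaction modifications go straight onto the queue
        changeQueue.add(modification);
     }

     void prepare(Object tx, List<Object> mods) {
        // 2- set the list aside until commit
        transactions.put(tx, mods);
     }

     void commit(Object tx) {
        // 3- push the whole list onto the same queue as everything else,
        // so tx and non-tx changes share a single order of application
        List<Object> mods = transactions.remove(tx);
        if (mods != null) changeQueue.addAll(mods);
     }
  }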
> 
> 4- A clear operation clears the deque and pushes a Clear() operation onto it
> 
> 5- A purge operation is sent through the same deque: the only risk I
> see is it being removed by a clear, but I bet there won't be much to
> purge after that :) Also, we don't care if the purge is going to be
> handled async in the other store; that's a store implementation detail
> from what I can see in AbstractCacheStore.
I'm not sure we need to process purge in an async way: it is already called asynchronously by the eviction thread, so we can just delegate the call to the underlying store.
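In other words, roughly this (sketch; Store here is just a stand-in for
the decorated CacheStore interface):

  interface Store { void purgeExpired(); }

  class PurgeDelegationSketch {
     private final Store delegate;
     PurgeDelegationSketch(Store delegate) { this.delegate = delegate; }

     void purgeExpired() {
        // no extra queueing: the eviction thread already calls this
        // asynchronously, so just forward to the underlying store
        delegate.purgeExpired();
     }
  }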
> 
> 6- A single AsyncStore processor drains blocks of a fixed,
> configurable size, puts them in a local map, applies the coalescing
> algorithm, and puts the results on a new state map.
> It seems to me that this worker needs to be exactly one, or reordering
> problems arise again in the order of the blocks taken from the deque.
+1 
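To illustrate what such a single drain-and-coalesce worker could look
like, a rough sketch (names and the key extraction are illustrative
only, not the real AsyncStore code):

  import java.util.ArrayList;
  import java.util.LinkedHashMap;
  import java.util.List;
  import java.util.Map;
  import java.util.concurrent.BlockingDeque;

  class CoalescingWorkerSketch implements Runnable {
     // Mod is a stand-in for a Modification that exposes its key
     static class Mod {
        final Object key;
        Mod(Object key) { this.key = key; }
     }

     private final BlockingDeque<Mod> changeQueue;
     private final int batchSize;

     CoalescingWorkerSketch(BlockingDeque<Mod> changeQueue, int batchSize) {
        this.changeQueue = changeQueue;
        this.batchSize = batchSize;
     }

     public void run() {
        List<Mod> batch = new ArrayList<Mod>(batchSize);
        while (!Thread.currentThread().isInterrupted()) {
           batch.clear();
           try {
              // block for at least one element, then drain up to batchSize
              batch.add(changeQueue.take());
           } catch (InterruptedException e) {
              return;
           }
           changeQueue.drainTo(batch, batchSize - 1);
           // coalesce: later writes to the same key win; first-appearance
           // order is kept by the LinkedHashMap
           Map<Object, Mod> coalesced = new LinkedHashMap<Object, Mod>();
           for (Mod m : batch) coalesced.put(m.key, m);
           applyToDecoratedStore(coalesced);
        }
     }

     void applyToDecoratedStore(Map<Object, Mod> coalesced) {
        // would write the surviving modifications to the underlying store
     }
  }

With a single worker the drained blocks keep their order; with more than
one, the per-key concerns from point 7 below kick in.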
> 
> 6/a - When receiving a clear() - exceptional case - we clear the map
> of operations and issue the clear() synchronously: nothing is added to
> the map, and we acquire a "clearop" write lock to avoid race
> conditions with workers which might already be busy but happen to
> finish after the clear().
Aren't we using only one worker, as per the previous paragraph?
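If we did end up keeping more than one worker, the "clearop" lock could
look roughly like this (sketch, illustrative names only):

  import java.util.concurrent.BlockingDeque;
  import java.util.concurrent.LinkedBlockingDeque;
  import java.util.concurrent.locks.ReadWriteLock;
  import java.util.concurrent.locks.ReentrantReadWriteLock;

  class ClearGuardSketch {
     private final ReadWriteLock clearLock = new ReentrantReadWriteLock();
     private final BlockingDeque<Object> changeQueue =
           new LinkedBlockingDeque<Object>();

     void clear() {
        clearLock.writeLock().lock();   // exclude all in-flight workers
        try {
           changeQueue.clear();         // drop everything still queued
           clearDecoratedStore();       // issue the clear() synchronously
        } finally {
           clearLock.writeLock().unlock();
        }
     }

     void applyBatch(Runnable writeBatchToStore) {
        clearLock.readLock().lock();    // workers only share the read lock
        try {
           writeBatchToStore.run();
        } finally {
           clearLock.readLock().unlock();
        }
     }

     void clearDecoratedStore() {
        // would call clear() on the decorated store
     }
  }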
> 
> 6/b - When receiving a purge(), it bypasses the other logic and is
> passed directly to the decorated store.
+1
> 
> 7- Now here it becomes tricky: we've removed the transactions problem,
> the clear operation and the purge from the table. But there is still a
> map of Modifications to be applied by N threads, and to which another
> thread might write. This looks like the original problem in simplified
> form, but reorderings are still possible, as worker A might get some
> work on key K1, then worker B kicks in, gets new work on K1 and
> happens to perform it before A gets priority again.
> Clearly this must be performed either by one thread or with very
> fine-grained locking: since we're only interested in a per-key lock, we
> don't care about the order of operations on other keys. We could use a
> second map to store the keys which "are being processed": when a worker
> takes some work it uses atomic operations to check whether somebody is
> already working on the same key; if so, it could either wait or put the
> work back (if it wasn't changed, otherwise discard it) and pick other
> work to do.
> 
It might be good to use the same approach as with async marshalling: use one thread by default, which would guarantee ordering. Allow a user to specify more than one thread, and clearly explain that reordering might happen: there might be situations where one does not care about ordering after all.
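For reference, the per-key claim described in 7 could be sketched with
an "in flight" set (illustrative only):

  import java.util.Collections;
  import java.util.Set;
  import java.util.concurrent.ConcurrentHashMap;

  class PerKeyClaimSketch {
     // keys currently being written by some worker
     private final Set<Object> inFlight =
           Collections.newSetFromMap(new ConcurrentHashMap<Object, Boolean>());

     boolean tryClaim(Object key) {
        // atomic: only one worker gets 'true' for a given key at a time
        return inFlight.add(key);
     }

     void release(Object key) {
        inFlight.remove(key);
     }
  }

A worker that fails tryClaim() would either wait or put the work back and
pick something else, as described above.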
One more thing I've just noticed: AsyncStore processes Commit but doesn't do the same with Rollback. On rollback it should remove the tx from the "transactions" map (as described in 2)) to avoid memory leaks.
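In terms of the sketch above, something like:

  void rollback(Object tx) {
     // just discard the prepared list so the "transactions" map doesn't leak
     transactions.remove(tx);
  }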

> Comments?
> 
> AFAIR the issue I mention in point 7 is a known limitation of Async,
> so I could implement that in the future as an improvement? I would be
> fine with having just one worker.
> 
> I wish I had time to do it all soon :|
> Sanne
> 


> 
> 2010/8/26 Sanne Grinovero <sanne.grinovero at gmail.com>:
>> 2010/8/26 Manik Surtani <manik at jboss.org>:
>>> 
>>> On 26 Aug 2010, at 12:22, Sanne Grinovero wrote:
>>> 
>>>> Hi manik, thanks for reading through my long mail.
>>>> 
>>>> though I didn't understand what you think I should do now :)
>>>> Regarding "Like I said we should still support commit on a separate thread.",
>>>> do you mean my approach is not correct?
>>>> But then you say "Definitely" in reply to my question of whether this is an improvement.
>>> 
>>> I'm not sure how your approach prevents a commit on a separate thread.  :)
>> 
>> Because the caller of AsyncStore.prepare(..) - T1 - will have the
>> modifications list added to the "transactions" juc.ConcurrentHashMap,
>> using the TX as the key, and these same values will be pushed to
>> "state" by AsyncStore.commit(..); nothing else is done. The writing to
>> the decorated store is done by T2 using the current logic, which takes
>> modifications from "state", totally independent of and unaware of the
>> existing transaction used by T1.
>> 
>>> 
>>>> BTW why should you support "commit" on a separate thread, if all
>>>> the commit does is move a single value from one concurrent map to
>>>> another one? The heavy work of sending the data down to the real
>>>> store is performed in the separate thread.
>>> 
>>> Ah sorry - I think I have misunderstood you.  I thought what you meant was that T1 may call prepare() and T2 may call commit(), imposing restrictions on how your impl may make use of, say, a ThreadLocal.
>> 
>> That's what is currently possible, as the Executor in AsyncStore might
>> have >1 threads, each in turn taking some of the operations, so some
>> prepares() could go to thread X and some commits() to thread Y. So it
>> seems to me that thread X might perform the prepare for TX1 and
>> thread Y receive the commit for the same TX1 (they might even try to
>> perform them in inverted order).
>> 
>>> 
>>> You are absolutely correct in that the work done in commit() does not need to be taken offline/performed asynchronously.
>> 
>> Ok
>> 
>>> 
>>>> If you think this is not going in a good direction, I'll leave this
>>>> alone, otherwise I could try cleaning
>>>> up my current experiments and propose a patch.
>>> 
>>> No, I think this is definitely a good direction.  :)
>> 
>> Ok, I'll try cleaning up and send you a patch. I assume this should go
>> into 4.1.0.Final, so we should handle it with care... please continue
>> supporting me, as this code is all new to me and I'm just making
>> assumptions about some parts of it.
>> 
>> Sanne
>> 
>>>> 
>>>> 2010/8/26 Manik Surtani <manik at jboss.org>:
>>>>> Hi Sanne
>>>>> 
>>>>> Thanks for the detailed analysis!
>>>>> 
>>>>> On 26 Aug 2010, at 09:05, Sanne Grinovero wrote:
>>>>> 
>>>>>> After (finally!) being able to isolate the issue with a core-only unit
>>>>>> test I changed the name of the issue to
>>>>>> "AsyncStore fails to save all values when batching"
>>>>>> As:
>>>>>> * it's not related to the JDBC stores but the Async wrapper
>>>>>> * I can only reproduce it while batching
>>>>>> 
>>>>>> I hacked a fix for me, which appears to work fine both with the unit
>>>>>> tests and fix the staging issues in my project, but this changed quite
>>>>>> a bit so we should talk about it:
>>>>>> 
>>>>>> I see the current AsyncStore implementation is "coalescing" all
>>>>>> changes being part of the same transaction during the prepare-commit
>>>>>> phase, and then async-delegating the same prepare-commit to the
>>>>>> decorated store. After that, it might also async-delegate a commit or
>>>>>> a rollback.
>>>>>> So if I understood it well enough, this means that:
>>>>>> 
>>>>>>   a) it's not merging changes from different transactions
>>>>>> 
>>>>>>   b) a commit (or rollback) could be received by the decorated
>>>>>> store implementation with a noticeable delay (a full cycle of
>>>>>> collected changes), and most importantly it seems to me that it might
>>>>>> receive a commit in a different thread than the prepare (is that
>>>>>> legal?).
>>>>> 
>>>>> I believe some TMs do allow prepare-and-commit on different threads.  Definitely prepare-and-rollback.
>>>>> 
>>>>>> So I copied the transaction-managing code from AbstractCacheStore (as
>>>>>> AsyncStore doesn't inherit from it) and replaced the prepare, commit
>>>>>> and rollback methods of AsyncStore with the same implementations as in
>>>>>> AbstractCacheStore:
>>>>>> this way uncommitted changes are never inserted in the "state" map
>>>>>> (nor are commit and rollback operations), and it seems that I end up
>>>>>> sending only committed changes to the delegate store.
>>>>>> The best part is that I'm aggregating into a single "coalesced" state
>>>>>> all changes from all transactions (only committed ones); this seems to
>>>>>> me a good improvement?
>>>>> 
>>>>> Definitely.
>>>>> 
>>>>>> The code I have is not that clean, as it copies several lines of
>>>>>> code from AbstractCacheStore; to clean it up I think that
>>>>>> AbstractDelegatingStore should inherit from AbstractCacheStore, and
>>>>>> avoid delegating the methods relating to transactions (in effect
>>>>>> implementing what I did for AsyncStore for all decorated stores).
>>>>> 
>>>>> Yes, much cleaner that way.
>>>>> 
>>>>>> Also some unit tests will need fixing, as there are explicit tests
>>>>>> asserting that the commit is done in another thread: could I delete
>>>>>> these?
>>>>> 
>>>>> Like I said we should still support commit on a separate thread.
>>>>> 
>>>>>> Finally, as the delegated store is not being used transactionally
>>>>>> anymore (it would never receive a prepare or commit), wouldn't it make
>>>>>> sense to wrap all blocks of changes in a single transaction? Rollback
>>>>>> from a failure in the async store is, as far as I understood, not
>>>>>> supported even with the current code.
>>>>> 
>>>>> Yes, true.
>>>>> 
>>>>>> I'm eager to hear your opinions, especially about what I would break with this design.
>>>>>> 
>>>>>> Cheers,
>>>>>> Sanne
>>>>>> 
>>>>>> 
>>>>>> 2010/8/25 Sanne Grinovero <sanne.grinovero at gmail.com>:
>>>>>>> 2010/8/25 Manik Surtani <manik at jboss.org>:
>>>>>>>> 
>>>>>>>> On 25 Aug 2010, at 14:48, Sanne Grinovero wrote:
>>>>>>>> 
>>>>>>>>> Hello,
>>>>>>>>> I'm testing CR3; I've gotten far enough that I'm now testing the
>>>>>>>>> stop-restart application cycle. I'm using the JdbcStringBasedCacheStore
>>>>>>>>> in async mode, and every time I stop the application - which issues a
>>>>>>>>> clean stop() on the CacheManager - it appears that some values are lost.
>>>>>>>>> This is fixed if I remove the async.
>>>>>>>>> 
>>>>>>>>> Is this expected? I hope not, as this way I can't shut down the
>>>>>>>>> cluster when using async.
>>>>>>>>> This time I can reproduce it with a UT.
>>>>>>>> 
>>>>>>>> Cool, a unit test would help.  Is this only with the JDBC store?
>>>>>>> 
>>>>>>> I understand this is not expected and created ISPN-618; you can find
>>>>>>> my failing unit test attached to the issue.
>>>>>>> 
>>>>>>> I will try to reproduce the same issue depending on core only to
>>>>>>> understand if it's related to JDBC.
>>>>>>> 
>>>>>>> Sanne
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 