[infinispan-dev] ISPN-12 Optimize transactions spanning caches from the same cache manager
Mircea Markus
mircea.markus at jboss.com
Tue Jun 9 08:59:36 EDT 2009
Manik Surtani wrote:
>
> On 9 Jun 2009, at 11:18, Mircea Markus wrote:
>
>> Manik Surtani wrote:
>>>
>>> On 8 Jun 2009, at 21:02, Mircea Markus wrote:
>>>
>>>> Hi *,
>>>>
>>>> Some thoughts I have about above JIRA.
>>>>
>>>> Background:
>>>>
>>>> CacheManager cm;
>>>> Cache c1 = cm.getCache("replicatedCacheConfig1");
>>>> Cache c2 = cm.getCache("replicatedCacheConfig2");
>>>>
>>>> transactionManager.begin();
>>>> c1.put(a, b);
>>>> c2.put(b, c);
>>>> transactionManager.commit(); // at this point we replicate two
>>>> PrepareCommands and two CommitCommands; the JIRA is about
>>>> aggregating all prepares into one (and likewise for commits) =>
>>>> fewer roundtrips
>>>>
>>>>
>>>> Considerations:
>>>> REPL: this functionality only makes sense if we have at least two
>>>> active replicated caches.
>>>> DIST: with DIST things can still be optimized by grouping the
>>>> Prepare/Commit/RollbackCommands about to be replicated to the same
>>>> nodes.
>>>
>>> I assume any optimisation with DIST is determined at the time of
>>> prepare?
>> yes.
>>> If so, how can you tell, e.g., cache1's asked to prepare. Cache2's
>>> prepare call hasn't come in yet (since most TMs do this
>>> sequentially). How do you know whether to optimize cache1's prepare?
>> At this moment I already know the number of caches that participate
>> in the transaction (see below: GlobalTxRepository.transactionRegistered).
>> Basically I'll wait until prepare is called on all participating
>> caches, and only the last one will trigger the replication.
>
> Yes, you know which named caches, but not which remote instances. You
> only know that you can optimize cache1's prepare if cache2's prepare
> is going to the same recipients in the cluster.
>
>>>
>>>> If we have a tx spanning an ASYNC and a SYNC cache, the
>>>> optimization is not supported for now.
>>>>
>>>> Design notes:
>>>> GlobalTxRepository is (basically) a Map<Transaction, AtomicInt>:
>>>> for each Transaction it keeps a count (AtomicInt) of the caches
>>>> affected by that Transaction.
>>>>
>>>> Operations:
>>>> 1. GlobalTxRepository.transactionRegistered(Transaction) will be
>>>> called whenever a TransactionXaAdapter is created. This will
>>>> increment the Transaction's associated participant count. Call made
>>>> from TransactionTable.
>>>> 2. GlobalTxRepository.aboutToPrepare(Transaction) will be called
>>>> when a transaction is about to prepare, and will decrement the
>>>> Transaction's associated participant count. Call made from
>>>> TxInterceptor
>>>> 3. GlobalTxRepository.aboutToFinish(Transaction) will be called
>>>> before a transaction commits or rolls back. Call made from
>>>> TxInterceptor.
>>>>
>>>> ReplicationInterceptor and DistributionInterceptor:
>>>> In visitPrepare/visitCommit/visitRollback they will check whether
>>>> the participant count associated with the given tx is 0; if so,
>>>> they will trigger replication. This makes sure that only the last
>>>> participant does the actual replication.
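
The counting scheme above could be sketched roughly like this (a minimal, standalone sketch under some assumptions of mine: the Transaction key is simplified to Object, and the real TransactionTable/TxInterceptor wiring is omitted):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of GlobalTxRepository; names and wiring are not final.
public class GlobalTxRepository {
    // participant count per transaction (Transaction simplified to Object here)
    private final Map<Object, AtomicInteger> participants = new ConcurrentHashMap<>();

    // called from TransactionTable whenever a TransactionXaAdapter is created
    public void transactionRegistered(Object tx) {
        participants.computeIfAbsent(tx, k -> new AtomicInteger()).incrementAndGet();
    }

    // called from TxInterceptor before each cache's prepare; returns true
    // only for the last participating cache, which then triggers the
    // single aggregated replication
    public boolean aboutToPrepare(Object tx) {
        return participants.get(tx).decrementAndGet() == 0;
    }
}
```

aboutToFinish would mirror aboutToPrepare for the commit/rollback phase.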
>>>
>>> But this may break based on the TM's implementation. E.g., if c1
>>> and c2 both participate in the same tx, and for whatever reason c1's
>>> remote prepare() would fail, we would only see this when c2 attempts
>>> to prepare. Don't know if this will cause problems with the TM
>>> identifying the wrong resource as having failed.
>> That's true, the TM will be informed that c2's prepare failed, and
>> will initiate a rollback on all resources. The outcome of the
>> transaction should be the same though.
>>>
>>>> Performance costs:
>>>> - there is a set of operations this functionality requires (map
>>>> lookups, atomic increments/decrements) which might be totally
>>>> useless if the user never runs such transactions. There are some
>>>> ways to reduce/avoid the performance costs:
>>>> a) only expose this functionality through Flags (extra Flag
>>>> required; Flag.OPTIMIZE_TX_REPLICATION?)
>>>> b) allow the user to enable it statically through a configuration
>>>> option
>>>> c) use a sort of ergonomics: for every 10th/n-th finishing
>>>> transaction, check whether it spread over more than one cache; if
>>>> so, enable the functionality and keep it enabled for good
>>>>
>>>> My personal vote is for b) for simplicity.
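
For illustration, option (b) might look something like this in the cache configuration (the attribute name here is purely hypothetical; nothing like it exists yet):

```xml
<!-- hypothetical configuration attribute for option (b) -->
<transaction optimizeTxReplication="true"/>
```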
>>>
>>> Well, this could differ based on access pattern: some transactions
>>> may only touch one cache and others > 1 cache, so the only
>>> sensible approach would be a combination of (a) and (b).
>>>
>>> In general though, here's my thought: while I think such an
>>> optimization to minimize roundtrips is a good thing (heck, I raised
>>> the JIRA!), the more I think about it, the more I think it will be
>>> unnecessarily complex and only applicable to a few edge cases: REPL
>>> where all caches share the same SYNC/ASYNC setting (keep in mind
>>> that we expect DIST to be the default or most popular cache mode),
>>> or DIST where the participants all match - very unlikely. So in
>>> general I suggest we don't bother with this - certainly not for
>>> now. :)
>> Unless I'm missing something, this is still applicable to DIST as
>> well.
>
> With DIST, I think it is pretty unlikely that a set of keys involved
> in the same transaction on 2 different caches would map to the same
> set of servers. Unless your cluster is pretty small (in which case
> you may as well use REPL), or your keys have a really poor
> hashCode() impl, in which case you have much bigger bottlenecks
> anyway. :)
Fair enough. I've closed it as Won't Fix.
>
>> Agreed that this is rather complex, and might get even trickier when
>> it comes to implementing it, so if DIST is not applicable I also
>> think this should be dropped.
>
>
>
>>>
>>> Cheers.
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> Lead, Infinispan
>>> Lead, JBoss Cache
>>> http://www.infinispan.org
>>> http://www.jbosscache.org