Re: [infinispan-dev] ISPN-12 Optimize transactions spanning caches from the same cache manager

Tuesday, 9 June 2009

On 8 Jun 2009, at 21:02, Mircea Markus wrote:

...
 Hi *,

 Some thoughts I have about the above JIRA.

 Background:

 CacheManager cm;
 Cache c1 = cm.getCache("replicatedCacheConfig1");
 Cache c2 = cm.getCache("replicatedCacheConfig2");

 transactionManager.begin();
 c1.put(a,b);
 c2.put(b,c);
 transactionManager.commit();
 // At this point we replicate two PrepareCommands and two
 // CommitCommands. The JIRA is about aggregating all prepares into
 // one (same for commits) => fewer roundtrips.

 Considerations:
 REPL: This functionality only makes sense if we have at least two  
 active replicated caches.
 DIST: with DIST things can still be optimized by grouping the
 Prepare/Commit/RollbackCommands that are about to be replicated to
 the same nodes.
I assume any optimisation with DIST is determined at the time of
prepare?  If so, how can you tell?  E.g., cache1 is asked to prepare,
but cache2's prepare call hasn't come in yet (since most TMs do this
sequentially).  How do you know whether to optimize cache1's prepare?

...
 If we have a tx spanning an ASYNC and a SYNC cache, the
 optimization is not supported for now.

 Design notes:
 GlobalTxRepository is (basically) a Map<Transaction, AtomicInteger>:
 for each Transaction it keeps a count (AtomicInteger) of the caches
 that were affected by that Transaction.

 Operations:
 1. GlobalTxRepository.transactionRegistered(Transaction) will be  
 called whenever a TransactionXaAdapter is created. This will  
 increment Transaction's associated participant count. Call made from  
 TransactionTable.
 2. GlobalTxRepository.aboutToPrepare(Transaction) will be called  
 when a transaction is about to prepare, and will decrement the  
 Transaction's associated participant count. Call made from  
 TxInterceptor
 3. GlobalTxRepository.aboutToFinish(Transaction) will be called
 before a transaction commits or rolls back. Call made from
 TxInterceptor
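
As a rough illustration of operations 1 and 2 above, here is a minimal sketch of the counter. It is not the proposed implementation: the class name GlobalTxRepositorySketch, the participantCount helper, and the boolean return value of aboutToPrepare are assumptions made for the example, and a plain Object stands in for javax.transaction.Transaction so the code has no JTA dependency. Operation 3 is left out because the notes do not say what bookkeeping it performs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Sketch of the per-transaction participant counter.
 * Assumption: Object is used as the key type in place of
 * javax.transaction.Transaction to keep the example self-contained.
 */
class GlobalTxRepositorySketch {

    // One counter per transaction: caches registered but not yet prepared.
    private final ConcurrentMap<Object, AtomicInteger> pending =
            new ConcurrentHashMap<>();

    /** Operation 1: a cache has enlisted in the transaction. */
    void transactionRegistered(Object tx) {
        pending.computeIfAbsent(tx, k -> new AtomicInteger()).incrementAndGet();
    }

    /**
     * Operation 2: a cache is about to prepare.
     * Returns true when the count reaches 0, i.e. the caller is the last
     * participant and would be the one to trigger replication.
     * (The boolean return is a convenience of this sketch.)
     */
    boolean aboutToPrepare(Object tx) {
        AtomicInteger count = pending.get(tx);
        if (count == null) {
            throw new IllegalStateException("transaction was never registered");
        }
        return count.decrementAndGet() == 0;
    }

    /** Hypothetical helper: current count, or 0 for an unknown transaction. */
    int participantCount(Object tx) {
        AtomicInteger count = pending.get(tx);
        return count == null ? 0 : count.get();
    }
}
```

With two caches registered for the same transaction, the first aboutToPrepare call returns false and the second returns true, which is the "only the last participant replicates" behaviour the interceptors would rely on.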

 ReplicationInterceptor and DistributionInterceptor:
 In visitPrepare/visitCommit/visitRollback they will check whether the
 participant count associated with the given tx is 0, and if so will
 trigger replication. This will make sure that only the last
 participant does the actual replication.
But this may break based on the TM's implementation.  E.g., if c1 and  
c2 both participate in the same tx, and for whatever reason c1's  
remote prepare() would fail, we would only see this when c2 attempts  
to prepare.  Don't know if this will cause problems with the TM  
identifying the wrong resource as having failed.
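
To make the concern concrete, here is a small simulation. It is only a sketch: Participant, remotePrepareFails and prepareAll are made-up names for illustration, not Infinispan or JTA types. It models a TM that prepares resources sequentially while all remote work is deferred to the last participant.

```java
import java.util.ArrayList;
import java.util.List;

class DeferredPrepareSketch {

    /** Hypothetical stand-in for one cache's XA resource. */
    static final class Participant {
        final String name;
        final boolean remotePrepareFails;

        Participant(String name, boolean remotePrepareFails) {
            this.name = name;
            this.remotePrepareFails = remotePrepareFails;
        }
    }

    /**
     * Prepares participants in order, deferring remote work to the last one.
     * Returns the name of the participant whose prepare() call observed a
     * failure, or null if no failure occurred.
     */
    static String prepareAll(List<Participant> participants) {
        List<Participant> batched = new ArrayList<>();
        for (int i = 0; i < participants.size(); i++) {
            Participant current = participants.get(i);
            batched.add(current);
            boolean last = (i == participants.size() - 1);
            if (!last) {
                continue; // nothing sent yet, so this prepare() appears to succeed
            }
            // Last participant: the aggregated prepare goes out now.
            for (Participant b : batched) {
                if (b.remotePrepareFails) {
                    // The failure surfaces in the last caller's prepare(),
                    // regardless of which cache actually failed remotely.
                    return current.name;
                }
            }
        }
        return null;
    }
}
```

If c1's remote prepare is the one that fails, prepareAll still reports c2, because c2's prepare() is the call in which the aggregated message is sent.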

...
 Performance costs:
 - this functionality requires a set of operations (map lookups,
 atomic increments/decrements) which might be totally useless if the
 user doesn't use such transactions. There are some ways to
 reduce/avoid the performance costs:
 a) only expose this functionality through Flags (extra Flag  
 required. Flag.OPTIMIZE_TX_REPLICATION ?)
 b) allow the user to enable it statically through a configuration  
 option
 c) use a sort of ergonomics: for each 10th/n-th finishing
 transaction, check whether it spread over more than one cache; if
 so, enable the functionality and keep it enabled for good

 My personal vote is for b) for simplicity. 
Well, this could differ based on access pattern.  E.g., some
transactions may only touch one cache while others touch > 1 cache, so
the only sensible approach would be a combination of (a) and (b).
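
One possible reading of "a combination of (a) and (b)" is that the static option enables the capability and a per-invocation flag opts individual transactions in. The sketch below assumes that reading; the Flag enum here is a local stand-in, OPTIMIZE_TX_REPLICATION is only the name floated in the quoted text, and enabledInConfiguration is a hypothetical setting.

```java
import java.util.Set;

class TxOptimizationSwitchSketch {

    /** Local stand-in enum; the flag name is only a suggestion in the thread. */
    enum Flag { OPTIMIZE_TX_REPLICATION }

    /** Hypothetical static configuration option, as in (b). */
    private final boolean enabledInConfiguration;

    TxOptimizationSwitchSketch(boolean enabledInConfiguration) {
        this.enabledInConfiguration = enabledInConfiguration;
    }

    /**
     * Aggregation is attempted only when the configuration allows it AND
     * the caller asked for it on this invocation, as in (a).
     */
    boolean shouldAggregate(Set<Flag> invocationFlags) {
        return enabledInConfiguration
                && invocationFlags.contains(Flag.OPTIMIZE_TX_REPLICATION);
    }
}
```

Under this reading, transactions that touch a single cache never set the flag and so never pay for the extra map lookups and counter updates.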

In general though, here's my thought: while I think such an  
optimization to minimize roundtrips is a good thing (heck, I raised  
the JIRA!) the more I think about it, the more I think it will be  
unnecessarily complex, and only applicable to a few edge cases (REPL  
where all caches share the same SYNC/ASYNC setting - keep in mind that  
we expect DIST to be the default or most popular cache mode.  Or with  
DIST where the participants all match - very unlikely).  So in general  
I suggest we don't bother with this - certainly not for now.  :)

Cheers.
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
