Re: [infinispan-dev] API behaviour when using DIST

Tuesday, 21 April 2009

On 21 Apr 2009, at 16:54, Mircea Markus wrote:

...
 Manik Surtani wrote:
> Ok, after discussing this on IRC and the clustering conf call, here  
> is what we will do:
>
> 1.  Non-transactional
>
> * Only support DIST in SYNC mode.  Throw a configuration exception  
> if DIST and ASYNC are detected.
> * Allow DIST_ASYNC if an unsafe flag is set to true  
> (breakApi=true?) in the config, so that the user is explicitly  
> aware that API return values will be unreliable.
> * Make sure this is documented in the Javadocs as well as the FAQs
 I like that the user needs to explicitly enable that, i.e. to ack
 that he/she won't rely on return values.
 One more thing (and I think this might be applied in 2 and 3):
 what if the user previously retrieved the values, in some sort of
 batch operation? I.e.:
 cache.retrieve("key1", "key2", "keyn");
 // then operate on these keys only
 oldVal = cache.replace("key1", "newVal"); // at this point the data would
 // already be in the context/L1 and the API is not broken.
 If the user tries to use a "keyX" that was not fetched, then fail with
 an exception (unless breakApi is true).
The breakApi workaround (I'm calling this unreliableReturnValues) is a
configuration sanity check flag.  It is only checked when the cache
starts up, not while it is running.
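
A minimal sketch of the startup-only sanity check described above, in plain Java. The names CacheMode, ConfigurationException and validate() are assumptions made for illustration and are not the actual Infinispan API; only the flag name unreliableReturnValues comes from this thread.

```java
import java.util.Objects;

public class DistConfigCheck {
    // Hypothetical enum; the real configuration classes may look different.
    enum CacheMode { LOCAL, REPL_SYNC, REPL_ASYNC, DIST_SYNC, DIST_ASYNC }

    // Hypothetical exception type standing in for a configuration exception.
    static class ConfigurationException extends RuntimeException {
        ConfigurationException(String msg) { super(msg); }
    }

    // Intended to run once at cache startup; nothing consults it afterwards.
    static void validate(CacheMode mode, boolean unreliableReturnValues) {
        Objects.requireNonNull(mode, "mode");
        if (mode == CacheMode.DIST_ASYNC && !unreliableReturnValues) {
            throw new ConfigurationException(
                "DIST with ASYNC requires unreliableReturnValues=true");
        }
    }

    public static void main(String[] args) {
        validate(CacheMode.DIST_SYNC, false);  // accepted
        validate(CacheMode.DIST_ASYNC, true);  // accepted, user opted in
        try {
            validate(CacheMode.DIST_ASYNC, false);
        } catch (ConfigurationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the check lives in startup validation, a running cache never pays for it on the write path.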

...
 the advantage of the retrieve operation would be that it minimizes
 network round trips, i.e. "key1" and "key2" might be on the same
 node, so only one round trip.
 (we can use cache.startBatch() and cache.endBatch() instead of
 cache.retrieve(), though I think the latter is more descriptive).
Not so sure I understand.  Return values need to be provided on a
per-invocation basis.  How would you preempt which keys would be
needed?  :-)  Or are you suggesting we make this an explicit API call
that the user would invoke before calling the write methods?  Sounds
pretty clunky though ...
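
To make the round-trip argument for an explicit retrieve call concrete, here is a rough sketch that groups requested keys by the node owning them, so each node is contacted once. The key-to-owner map is a made-up stand-in for whatever the consistent hash would return, and groupByOwner() is a hypothetical helper, not an existing API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BatchRetrieve {
    // Groups keys by owning node; one remote call per map entry would suffice.
    static Map<String, List<String>> groupByOwner(Map<String, String> ownerOf,
                                                  List<String> keys) {
        Map<String, List<String>> byOwner = new TreeMap<>();
        for (String key : keys) {
            String owner = ownerOf.get(key);
            if (owner == null) {
                throw new IllegalArgumentException("no owner known for " + key);
            }
            byOwner.computeIfAbsent(owner, o -> new ArrayList<>()).add(key);
        }
        return byOwner;
    }

    public static void main(String[] args) {
        // Assumed placement: key1 and key2 on node A, keyn on node B.
        Map<String, String> ownerOf = new HashMap<>();
        ownerOf.put("key1", "A");
        ownerOf.put("key2", "A");
        ownerOf.put("keyn", "B");

        Map<String, List<String>> plan =
            groupByOwner(ownerOf, Arrays.asList("key1", "key2", "keyn"));
        System.out.println(plan.size() + " round trips for 3 keys: " + plan);
    }
}
```

This only illustrates the saving; it does not answer the objection above that the caller has to know the keys in advance.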

...
> 2.  Transactional - Async
>
> * As above, do not support unless breakApi = true.  In which case  
> this is simple - no return values, only broadcast modifications in  
> prepare as we do with REPL_SYNC.
>
> 3.  Transactional - Sync
>
> * Offer 2 options here.  Option 1 is simple and we already have  
> this.  Option 2 will need to be written.
>
> Option 1:  break API.  This will behave the same as async, in that  
> we don't care about return values, but we still use a 2-phase  
> commit protocol for the transaction boundary.  Users will still  
> need to specify breakApi = true so this is explicit.
>
> Option 2: Before any write RPC call is broadcast, a remote get is  
> performed.  This pulls back the necessary values so that a reliable  
> return value is available.
 I guess this will also use 2PC? Also, are we locking progressively
 or at commit time only, when broadcasting the modifications?
Yes, 2PC.  And locks are acquired at commit time only, when the
prepare command is broadcast.

Although now that I think about this, this could result in  
inconsistent return values again!  E.g.,

Cluster = {A, B, C, D}
K maps to A and B

on C
----
1.  tx.begin()
2.  C.remove(K) // does a remote get from A or B, no locks yet on A or B.
                // Return value is computed locally from the fetched value.
3.  tx.commit() // broadcast, acquire locks, etc

on D
----
1.  tx.begin()
2.  D.remove(K) // this happens *before* C commits, so we still see the
                // old value, before C's remove is applied
3.  tx.commit()

I guess this is a classic read-committed case which is permitted, or a  
write-skew case in repeatable_read, which could even happen in local  
mode.
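
The interleaving above can be reproduced with a toy, single-threaded simulation. This is only a sketch: the shared map stands in for the owners A and B, and the Tx class is a hypothetical model of "remote get without locks, apply changes at commit", not real Infinispan transaction code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LateLockingRace {
    // Stand-in for the state held by the owners (A and B) of key K.
    static final Map<String, String> owners = new HashMap<>();

    // Models a transaction that reads without locking and defers writes.
    static class Tx {
        private final Set<String> pendingRemoves = new HashSet<>();

        String remove(String key) {
            String old = owners.get(key);  // "remote get", no lock taken
            pendingRemoves.add(key);       // modification queued until commit
            return old;                    // return value from the unlocked read
        }

        void commit() {
            // Locks would be acquired here, when prepare is broadcast.
            for (String key : pendingRemoves) {
                owners.remove(key);
            }
        }
    }

    static String[] run() {
        owners.clear();
        owners.put("K", "v1");
        Tx onC = new Tx();
        Tx onD = new Tx();
        String seenByC = onC.remove("K");
        String seenByD = onD.remove("K");  // happens before C commits
        onC.commit();
        onD.commit();
        return new String[] { seenByC, seenByD };
    }

    public static void main(String[] args) {
        String[] seen = run();
        // Both transactions report that they removed the same old value.
        System.out.println("C saw " + seen[0] + ", D saw " + seen[1]);
    }
}
```

Both callers are told they removed "v1", which is the inconsistency described above.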

The other option is eager command RPC, but that is inefficient for a
number of reasons (locks held longer than necessary, plus extra work in
fitting this into our existing transaction processing sequences).

Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
