[infinispan-dev] API behaviour when using DIST
Mircea Markus
mircea.markus at jboss.com
Tue Apr 21 13:01:24 EDT 2009
Manik Surtani wrote:
>
> On 21 Apr 2009, at 16:54, Mircea Markus wrote:
>
>> Manik Surtani wrote:
>>> Ok, after discussing this on IRC and the clustering conf call, here
>>> is what we will do:
>>>
>>> 1. Non-transactional
>>>
>>> * Only support DIST in SYNC mode. Throw a configuration exception
>>> if DIST and ASYNC are detected.
>>> * Allow DIST_ASYNC if an unsafe flag is set to true (breakApi=true?)
>>> in the config, so that the user is explicitly aware that API return
>>> values will be unreliable.
>>> * Make sure this is documented in the Javadocs as well as the FAQs
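For point 1, the check could be as simple as a startup-time validation. A minimal sketch, assuming hypothetical names (CacheMode, unreliableReturnValues, Configuration) that are illustrative only, not the actual Infinispan API:

```java
// Hypothetical startup-time sanity check for point 1 above.
// All names here are illustrative, not the real Infinispan configuration API.
enum CacheMode { DIST_SYNC, DIST_ASYNC, REPL_SYNC, REPL_ASYNC }

class Configuration {
    final CacheMode mode;
    final boolean unreliableReturnValues; // the proposed "breakApi" flag

    Configuration(CacheMode mode, boolean unreliableReturnValues) {
        this.mode = mode;
        this.unreliableReturnValues = unreliableReturnValues;
    }

    // Called once when the cache starts, never while it is running.
    void validate() {
        if (mode == CacheMode.DIST_ASYNC && !unreliableReturnValues) {
            throw new IllegalStateException(
                "DIST requires SYNC mode unless unreliableReturnValues=true");
        }
    }
}
```

Since the flag is only consulted in validate(), it is a pure configuration sanity check and has no runtime cost.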
>> I like that the user needs to explicitly enable that, i.e. to
>> acknowledge that he/she won't rely on return values.
>> One more thing (and I think this might apply to 2 and 3 as well):
>> what if the user previously retrieved the values in some sort of
>> batch operation? I.e.:
>>
>> cache.retrieve("key1", "key2", "keyn");
>> // then operate on these keys only
>> oldVal = cache.replace("key1", "newVal");
>> // at this point the data would already be in the context/L1
>> // and the API is not broken.
>>
>> If the user tries to use a "keyX" that was not fetched, then fail
>> with an exception (unless breakApi is true).
>
> The breakApi workaround (calling this unreliableReturnValues) is a
> configuration sanity check flag. It only applies when the cache
> starts up, and not while it is running.
>
>> The advantage of the retrieve operation would be that it minimizes
>> network round trips, i.e. "key1" and "key2" might be on the same
>> node, so only one round trip is needed.
>> (We could use cache.startBatch() and cache.endBatch() instead of
>> cache.retrieve(), though I think the latter is more descriptive.)
>
> Not so sure I understand. Return values need to be provided on a
> per-invocation basis. How would you know in advance which keys would
> be needed? :-) Or are you suggesting making this an explicit API call
> that the user would invoke before calling the write methods?
Yes, indeed. That's what cache.retrieve("key1", "key2", "keyn"...);
would do: fetch all the remote values at once (multiple keys mapped to
one node would result in one aggregated get).
> Sounds pretty clunky though ...
Might be. I think it is easy to grasp though, and it can have
significant benefits for clients that know the full key set they will
manipulate in one session.
>
>>> 2. Transactional - Async
>>>
>>> * As above, do not support unless breakApi = true. In which case
>>> this is simple - no return values, only broadcast modifications in
>>> prepare as we do with REPL_SYNC.
>>>
>>> 3. Transactional - Sync
>>>
>>> * Offer 2 options here. Option 1 is simple and we already have
>>> this. Option 2 will need to be written.
>>>
>>> Option 1: break API. This will behave the same as async, in that
>>> we don't care about return values, but we still use a 2-phase commit
>>> protocol for the transaction boundary. Users will still need to
>>> specify breakApi = true so this is explicit.
>>>
>>> Option 2: Before any write RPC call is broadcast, a remote get is
>>> performed. This pulls back the necessary values so that a reliable
>>> return value is available.
>> I guess this will also use 2PC? Also, are we locking progressively
>> or only at commit time, when broadcasting the modifications?
>
> Yes, 2PC. And yes, when the prepare command is broadcast.
>
> Although now that I think about this, this could result in
> inconsistent return values again! E.g.,
>
> Cluster = {A, B, C, D}
> K maps to A and B
>
> on C
> ----
> 1. tx.begin()
> 2. C.remove(K) // does a remote get from A or B, no locks yet on A or
> B. Return value accordingly based on local operation.
> 3. tx.commit() // broadcast, acquire locks, etc
>
> on D
> ----
> 1. tx.begin()
> 2. D.remove(K) // this happens *before* C commits. So we still see
> the old value before C's remove is applied to D
> 3. tx.commit()
>
> I guess this is a classic read-committed case which is permitted, or a
> write-skew case in repeatable_read, which could even happen in local
> mode.
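The interleaving above can be written down as a toy sequence, with a plain map standing in for the cluster state (not real Infinispan API). Both transactions do their remote get before either commits, so both remove() calls report the old value, even though only one of them actually removes the entry:

```java
import java.util.*;

// Toy replay of the example above: C and D each perform the remote get
// for remove(K) before either commits; locks are only taken at
// prepare/commit time. A HashMap stands in for the shared cluster state.
class WriteSkewSketch {
    static List<Object> run() {
        Map<String, String> store = new HashMap<>();
        store.put("K", "old");

        // on C: tx.begin(); C.remove(K) does a remote get, no locks yet
        String seenByC = store.get("K");          // observes "old"
        // on D: D.remove(K) happens *before* C commits
        String seenByD = store.get("K");          // also observes "old"

        // commits: locks acquired, modifications applied, C then D
        store.remove("K");                        // C's commit removes K
        String removedByD = store.remove("K");    // D's commit: already gone

        return Arrays.asList(seenByC, seenByD, removedByD);
    }
}
```

Both return values were "old", yet D's commit found nothing to remove, which is exactly the read-committed / write-skew behaviour described above.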
>
> The other option is eager command RPC but that is inefficient for a
> number of reasons (locks held unnecessarily longer, plus extra work in
> fitting this with our existing transaction processing sequences)
>
> Cheers
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
>
>
>
>