[infinispan-dev] Doubts about TxDistributionInterceptor and possible break in transaction isolation

William Burns mudokonman at gmail.com
Mon Jun 17 11:35:27 EDT 2013


On Mon, Jun 17, 2013 at 11:11 AM, Dan Berindei <dan.berindei at gmail.com>wrote:

>
>
>
> On Mon, Jun 17, 2013 at 3:58 PM, Pedro Ruivo <pedro at infinispan.org> wrote:
>
>>
>>
>> On 06/17/2013 12:56 PM, Mircea Markus wrote:
>> >
>> > On 17 Jun 2013, at 11:52, Pedro Ruivo <pedro at infinispan.org> wrote:
>> >
>> >> I've been looking at TxDistributionInterceptor and I have a couple of
>> >> questions (assuming REPEATABLE_READ isolation level):
>> >>
>> >> #1. why are we doing a remote get each time we write on a key? (huge
>> >> perform impact if the key was previously read)
>> > indeed this is suboptimal for transactions that write the same key
>> repeatedly and repeatable read. Can you please create a JIRA for this?
>>
>> created: https://issues.jboss.org/browse/ISPN-3235
>>
>>
> Oops... when I fixed https://issues.jboss.org/browse/ISPN-3124 I removed
> the SKIP_REMOTE_LOOKUP, thinking that the map is already in the invocation
> context so there shouldn't be any perf penalty. I can't put the
> SKIP_REMOTE_LOOKUP flag back, otherwise delta writes won't have the
> previous value during state transfer, so +1 to fixing ISPN-3235.
>
>
>
>> >>
>> >> #2. why are we doing a dataContainer.get() if the remote get returns a
>> >> null value? Shouldn't the interactions with data container be performed
>> >> only in the (Versioned)EntryWrappingInterceptor?
>> > This was added in the scope of ISPN-2688 and covers the scenario in
>> which a state transfer is in progress, the remote get returns null as the
>> remote value was dropped (no longer owner) and this node has become the
>> owner in between.
>> >
>>
>> ok :)
>>
>>
> Yeah, this should be correct as long as we check if we already have the
> key in the invocation context before doing the remote + local get.
>
>
>
>> >>
>> >> #3. (I didn't verify this) why are we acquire the lock is the remote
>> get
>> >> is performed for a write? This looks correct for pessimistic locking
>> but
>> >> not for optimistic...
>> > I think that, given that the local node is not owner, the lock
>> acquisition is redundant even for pessimistic caches.
>> > Mind creating a test to check if dropping that lock acquisition doesn't
>> break things?
>>
>> I created a JIRA with low priority since it does not affect the
>> transaction outcome/isolation and I believe the performance impact
>> should be lower (you can increase the priority if you want).
>>
>> https://issues.jboss.org/browse/ISPN-3237
>>
>
> If we don't lock the L1 entry, I think something like this could happen:
>
> tx1 at A: remote get(k1) from B - stores k1=v1 in invocation context
> tx2 at A: write(k1, v2)
> tx2 at A: commit - writes k1=v2 in L1
> tx1 at A: commit - overwrites k1=v1 in L1
>
This one is just like here: referenced in
https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12779780

And even locking doesn't help in this case since it doesn't lock the key
for a remote get only a remote get in the context of a write - which means
the L1 could be updated concurrently in either order - causing possibly an
inconsistency.  This will be solved when I port the same fix I have for
https://issues.jboss.org/browse/ISPN-3197 for tx caches.

>
>
> >>
>> >> After this analysis, it is possible to break the isolation between
>> >> transaction if I do a get on the key that does not exist:
>> >>
>> >> tm.begin()
>> >> cache.get(k) //returns null
>> >> //in the meanwhile a transaction writes on k and commits
>> >> cache.get(k) //return the new value. IMO, this is not valid for
>> >> REPEATABLE_READ isolation level!
>> >
>> > Indeed sounds like a bug, well spotted.
>> > Can you please add a UT to confirm it and raise a JIRA?
>>
>> created: https://issues.jboss.org/browse/ISPN-3236
>>
>> IMO, this should be the correct behaviour (I'm going to add the test
>> cases later):
>>
>> tm.begin()
>> cache.get(k) //returns null (op#1)
>> //in the meanwhile a transaction writes on k and commits
>> write operation performed:
>> * put: must return the same value as op#1
>> * conditional put //if op#1 returns null the operation should be always
>> successful (i.e. the key is updated, return true). Otherwise, the key
>> remains unchanged (return false)
>> * replace: must return the same value as op#1
>> * conditional replace: replace should be successful if checked with the
>> op#1 return value (return true). Otherwise, the key must remain
>> unchanged (return false).
>> * remote: must return the same value as op#1
>> * conditional remove: the key should be removed if checked with the op#1
>> return value (return true). Otherwise, the key must remain unchanged
>> (return false)
>>
>> Also, the description above should be valid after a removal of a key.
>>
>> >
>> > Cheers,
>> >
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20130617/3735e6fa/attachment.html 


More information about the infinispan-dev mailing list