Doubts about TxDistributionInterceptor and possible break in transaction isolation


Pedro Ruivo

Monday, 17 June 2013

5:52 a.m.

Hi guys,

I've been looking at TxDistributionInterceptor and I have a couple of questions (assuming the REPEATABLE_READ isolation level):

#1. Why are we doing a remote get each time we write to a key? (This has a huge performance impact if the key was previously read.)

#2. Why are we doing a dataContainer.get() if the remote get returns a null value? Shouldn't interactions with the data container be performed only in the (Versioned)EntryWrappingInterceptor?

#3. (I didn't verify this.) Why do we acquire the lock if the remote get is performed for a write? This looks correct for pessimistic locking but not for optimistic...

After this analysis, it looks possible to break the isolation between transactions if I do a get on a key that does not exist:

tm.begin()
cache.get(k) //returns null
//in the meanwhile a transaction writes on k and commits
cache.get(k) //returns the new value

IMO, this is not valid for the REPEATABLE_READ isolation level!

wdyt?

Thanks.

Cheers,
Pedro Ruivo
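The isolation break described above can be reproduced in a toy model. The sketch below is not Infinispan code: `RepeatableReadSketch`, `TxContext` and the `rememberMisses` switch are hypothetical names, and a plain `HashMap` stands in for the committed cluster state. It only illustrates the idea that a transaction which does not record a null read will see a concurrently committed value on its second read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model only, not Infinispan code; every name here is hypothetical. */
final class RepeatableReadSketch {

    /** Stands in for the committed cluster state. */
    static final Map<String, String> store = new HashMap<>();

    /** Per-transaction view that remembers the first result read for each key. */
    static final class TxContext {
        private final Map<String, Optional<String>> readSet = new HashMap<>();
        private final boolean rememberMisses;

        TxContext(boolean rememberMisses) {
            this.rememberMisses = rememberMisses;
        }

        String get(String key) {
            Optional<String> seen = readSet.get(key);
            if (seen != null) {
                return seen.orElse(null);      // repeat the earlier answer
            }
            String value = store.get(key);     // stands in for the remote get
            if (value != null || rememberMisses) {
                readSet.put(key, Optional.ofNullable(value));
            }
            return value;
        }
    }

    /** Runs get / concurrent commit / get and returns the result of the second get. */
    static String secondRead(boolean rememberMisses) {
        store.clear();
        TxContext tx = new TxContext(rememberMisses);
        tx.get("k");                // returns null
        store.put("k", "v");        // another transaction writes k and commits
        return tx.get("k");
    }

    public static void main(String[] args) {
        System.out.println("misses not remembered: " + secondRead(false));
        System.out.println("misses remembered:     " + secondRead(true));
    }
}
```

With `rememberMisses` set to false the second read returns the newly committed value, which is the behaviour questioned above; with it set to true the second read repeats the null.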


Mircea Markus

Monday, 17 June

6:56 a.m.

On 17 Jun 2013, at 11:52, Pedro Ruivo <pedro(a)infinispan.org> wrote:

...

I've been looking at TxDistributionInterceptor and I have a couple of questions (assuming REPEATABLE_READ isolation level): #1. why are we doing a remote get each time we write on a key? (huge perform impact if the key was previously read)

Indeed, this is suboptimal for transactions that write the same key repeatedly under repeatable read. Can you please create a JIRA for this?

...

#2. why are we doing a dataContainer.get() if the remote get returns a null value? Shouldn't the interactions with data container be performed only in the (Versioned)EntryWrappingInterceptor?

This was added in the scope of ISPN-2688 and covers the scenario in which a state transfer is in progress, the remote get returns null as the remote value was dropped (no longer owner) and this node has become the owner in between.
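The fallback described here can be pictured with a small stand-alone sketch. It is not the interceptor code: `RemoteGetFallbackSketch` and `lookup` are hypothetical names, and plain maps stand in for the former owner and the local data container. It assumes only what the paragraph above says: the remote get may return null because the old owner dropped the entry, while the local node has meanwhile received it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy model only, not Infinispan code; every name here is hypothetical. */
final class RemoteGetFallbackSketch {

    /**
     * Asks the (possibly former) owner first. If that returns null, falls back
     * to the local container, which may have received the entry via state transfer.
     */
    static String lookup(String key,
                         Function<String, String> remoteGet,
                         Map<String, String> localDataContainer) {
        String remoteValue = remoteGet.apply(key);
        if (remoteValue != null) {
            return remoteValue;
        }
        return localDataContainer.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> formerOwner = new HashMap<>();  // has already dropped "k"
        Map<String, String> local = new HashMap<>();
        local.put("k", "v");                                // arrived via state transfer
        System.out.println(lookup("k", formerOwner::get, local));
    }
}
```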

...

#3. (I didn't verify this) why are we acquire the lock is the remote get is performed for a write? This looks correct for pessimistic locking but not for optimistic...

I think that, given that the local node is not owner, the lock acquisition is redundant even for pessimistic caches. Mind creating a test to check if dropping that lock acquisition doesn't break things?

...

After this analysis, it is possible to break the isolation between transaction if I do a get on the key that does not exist:

tm.begin()
cache.get(k) //returns null
//in the meanwhile a transaction writes on k and commits
cache.get(k) //return the new value.

IMO, this is not valid for REPEATABLE_READ isolation level!

Indeed, sounds like a bug, well spotted. Can you please add a UT to confirm it and raise a JIRA?

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

Pedro Ruivo

7:58 a.m.

On 06/17/2013 12:56 PM, Mircea Markus wrote:

...

On 17 Jun 2013, at 11:52, Pedro Ruivo <pedro(a)infinispan.org> wrote:
> I've been looking at TxDistributionInterceptor and I have a couple of
> questions (assuming REPEATABLE_READ isolation level):
>
> #1. why are we doing a remote get each time we write on a key? (huge
> perform impact if the key was previously read)

indeed this is suboptimal for transactions that write the same key repeatedly and repeatable read. Can you please create a JIRA for this?

created: https://issues.jboss.org/browse/ISPN-3235

...

> #2. why are we doing a dataContainer.get() if the remote get returns a
> null value? Shouldn't the interactions with data container be performed
> only in the (Versioned)EntryWrappingInterceptor?

This was added in the scope of ISPN-2688 and covers the scenario in which a state transfer is in progress, the remote get returns null as the remote value was dropped (no longer owner) and this node has become the owner in between.

ok :)

...

> #3. (I didn't verify this) why are we acquire the lock is the remote get
> is performed for a write? This looks correct for pessimistic locking but
> not for optimistic...

I think that, given that the local node is not owner, the lock acquisition is redundant even for pessimistic caches. Mind creating a test to check if dropping that lock acquisition doesn't break things?

I created a JIRA with low priority since it does not affect the transaction outcome/isolation and I believe the performance impact should be lower (you can increase the priority if you want). https://issues.jboss.org/browse/ISPN-3237

...

> After this analysis, it is possible to break the isolation between
> transaction if I do a get on the key that does not exist:
>
> tm.begin()
> cache.get(k) //returns null
> //in the meanwhile a transaction writes on k and commits
> cache.get(k) //return the new value. IMO, this is not valid for
> REPEATABLE_READ isolation level!

Indeed sounds like a bug, well spotted. Can you please add a UT to confirm it and raise a JIRA?

created: https://issues.jboss.org/browse/ISPN-3236

IMO, this should be the correct behaviour (I'm going to add the test cases later):

tm.begin()
cache.get(k) //returns null (op#1)
//in the meanwhile a transaction writes on k and commits
write operation performed:
* put: must return the same value as op#1
* conditional put: if op#1 returns null the operation should be always successful (i.e. the key is updated, return true). Otherwise, the key remains unchanged (return false)
* replace: must return the same value as op#1
* conditional replace: replace should be successful if checked with the op#1 return value (return true). Otherwise, the key must remain unchanged (return false).
* remote: must return the same value as op#1
* conditional remove: the key should be removed if checked with the op#1 return value (return true). Otherwise, the key must remain unchanged (return false)

Also, the description above should be valid after a removal of a key.

...

Cheers,

Mircea Markus

8:46 a.m.

On 17 Jun 2013, at 13:58, Pedro Ruivo <pedro(a)infinispan.org> wrote:

...

>> After this analysis, it is possible to break the isolation between
>> transaction if I do a get on the key that does not exist:
>>
>> tm.begin()
>> cache.get(k) //returns null
>> //in the meanwhile a transaction writes on k and commits
>> cache.get(k) //return the new value. IMO, this is not valid for
>> REPEATABLE_READ isolation level!
>
> Indeed sounds like a bug, well spotted.
> Can you please add a UT to confirm it and raise a JIRA?

created: https://issues.jboss.org/browse/ISPN-3236

IMO, this should be the correct behaviour (I'm going to add the test cases later):

tm.begin()
cache.get(k) //returns null (op#1)
//in the meanwhile a transaction writes on k and commits
write operation performed:
* put: must return the same value as op#1
* conditional put //if op#1 returns null the operation should be always successful (i.e. the key is updated, return true). Otherwise, the key remains unchanged (return false)
* replace: must return the same value as op#1
* conditional replace: replace should be successful if checked with the op#1 return value (return true). Otherwise, the key must remain unchanged (return false).

All the conditional operations should treat as the existing value whatever was previously read (op#1) or, more correctly, whatever is in the context. E.g.:

//start: k = null
tx.begin()
cache.put(k, v1);
cache.replace(k, v1, v2) -> should succeed, as the context sees v1 associated with k
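To make the "whatever is in the context" rule concrete, here is a minimal stand-alone sketch. It is not Infinispan code: `ContextConditionalOpsSketch` and its methods are hypothetical, a `HashMap` stands in for the committed state, and the rule that a null expected value never matches is an assumption made for this sketch. Every operation first resolves the key through the transaction's own context (first read or latest own write) and only then evaluates its condition.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Toy model only, not Infinispan code; every name here is hypothetical. */
final class ContextConditionalOpsSketch {

    private final Map<String, String> committed;                          // shared committed state
    private final Map<String, Optional<String>> context = new HashMap<>(); // this transaction's view

    ContextConditionalOpsSketch(Map<String, String> committed) {
        this.committed = committed;
    }

    /** Value as seen by this transaction: own writes first, then the first read. */
    private String current(String key) {
        return context
                .computeIfAbsent(key, k -> Optional.ofNullable(committed.get(k)))
                .orElse(null);
    }

    String get(String key) {
        return current(key);
    }

    String put(String key, String value) {
        String previous = current(key);
        context.put(key, Optional.of(value));
        return previous;
    }

    boolean replace(String key, String expected, String value) {
        if (expected == null || !Objects.equals(current(key), expected)) {
            return false;   // condition is checked against the context, not the store
        }
        context.put(key, Optional.of(value));
        return true;
    }

    boolean remove(String key, String expected) {
        if (expected == null || !Objects.equals(current(key), expected)) {
            return false;
        }
        context.put(key, Optional.empty());   // remembered as removed
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> committed = new HashMap<>();
        ContextConditionalOpsSketch tx = new ContextConditionalOpsSketch(committed);
        System.out.println(tx.put("k", "v1"));            // previous value seen by the tx
        committed.put("k", "other");                      // concurrent commit
        System.out.println(tx.replace("k", "v1", "v2"));  // checked against the context
        System.out.println(tx.get("k"));
    }
}
```

In this model the concurrent commit of "other" never influences the transaction's conditional operations, because the key was already resolved in the context.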

...

* remote: must return the same value as op#1

You mean remove? Remove should use whatever is in the context.

...

* conditional remove: the key should be removed if checked with the op#1 return value (return true). Otherwise, the key must remain unchanged (return false)

Same here.

...

Also, the description above should be valid after a removal of a key.

Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)

Dan Berindei

10:11 a.m.

On Mon, Jun 17, 2013 at 3:58 PM, Pedro Ruivo <pedro(a)infinispan.org> wrote:

...

On 06/17/2013 12:56 PM, Mircea Markus wrote:
>
> On 17 Jun 2013, at 11:52, Pedro Ruivo <pedro(a)infinispan.org> wrote:
>
>> I've been looking at TxDistributionInterceptor and I have a couple of
>> questions (assuming REPEATABLE_READ isolation level):
>>
>> #1. why are we doing a remote get each time we write on a key? (huge
>> perform impact if the key was previously read)
> indeed this is suboptimal for transactions that write the same key repeatedly and repeatable read. Can you please create a JIRA for this?

created: https://issues.jboss.org/browse/ISPN-3235

Oops... when I fixed https://issues.jboss.org/browse/ISPN-3124 I removed the SKIP_REMOTE_LOOKUP, thinking that the map is already in the invocation context so there shouldn't be any perf penalty. I can't put the SKIP_REMOTE_LOOKUP flag back, otherwise delta writes won't have the previous value during state transfer, so +1 to fixing ISPN-3235.
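The cost discussed under ISPN-3235 can be illustrated by counting remote gets in a toy transaction. This is only a sketch of the general idea (consult the invocation context before going remote), not the actual change made for that issue: `RemoteLookupSketch`, `reuseContext` and `remoteGetsFor` are hypothetical names, and a `HashMap` stands in for the remote owner.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model only, not Infinispan code; every name here is hypothetical. */
final class RemoteLookupSketch {

    private final Map<String, String> remoteOwner = new HashMap<>();
    private final Map<String, Optional<String>> context = new HashMap<>();
    private final boolean reuseContext;
    int remoteGets = 0;

    RemoteLookupSketch(boolean reuseContext) {
        this.reuseContext = reuseContext;
    }

    /** Writes a key and returns the previous value as seen by this transaction. */
    String write(String key, String value) {
        if (!reuseContext || !context.containsKey(key)) {
            remoteGets++;   // one network round trip in a real cluster
            context.putIfAbsent(key, Optional.ofNullable(remoteOwner.get(key)));
        }
        String previous = context.get(key).orElse(null);
        context.put(key, Optional.of(value));
        return previous;
    }

    /** Number of remote gets issued by one transaction writing the same key repeatedly. */
    static int remoteGetsFor(int writes, boolean reuseContext) {
        RemoteLookupSketch tx = new RemoteLookupSketch(reuseContext);
        for (int i = 0; i < writes; i++) {
            tx.write("k", "v" + i);
        }
        return tx.remoteGets;
    }

    public static void main(String[] args) {
        System.out.println("remote get on every write: " + remoteGetsFor(3, false));
        System.out.println("context checked first:     " + remoteGetsFor(3, true));
    }
}
```

In this model the previous value returned to the caller is the same either way; only the number of round trips differs.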

...

>> #2. why are we doing a dataContainer.get() if the remote get returns a
>> null value? Shouldn't the interactions with data container be performed
>> only in the (Versioned)EntryWrappingInterceptor?
> This was added in the scope of ISPN-2688 and covers the scenario in which a state transfer is in progress, the remote get returns null as the remote value was dropped (no longer owner) and this node has become the owner in between.

ok :)

Yeah, this should be correct as long as we check if we already have the key in the invocation context before doing the remote + local get.

...

>> #3. (I didn't verify this) why are we acquire the lock is the remote get
>> is performed for a write? This looks correct for pessimistic locking but
>> not for optimistic...
> I think that, given that the local node is not owner, the lock acquisition is redundant even for pessimistic caches.
> Mind creating a test to check if dropping that lock acquisition doesn't break things?

I created a JIRA with low priority since it does not affect the transaction outcome/isolation and I believe the performance impact should be lower (you can increase the priority if you want).

https://issues.jboss.org/browse/ISPN-3237

If we don't lock the L1 entry, I think something like this could happen:

tx1@A: remote get(k1) from B - stores k1=v1 in invocation context
tx2@A: write(k1, v2)
tx2@A: commit - writes k1=v2 in L1
tx1@A: commit - overwrites k1=v1 in L1
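The interleaving above can be replayed step by step in a single thread. The sketch is a toy model, not Infinispan code: `L1StaleOverwriteSketch` is a hypothetical name and plain maps stand in for owner B, the L1 cache on node A, and tx1's invocation context. It only shows where the stale value ends up if nothing prevents tx1 from committing its earlier read into the L1.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only, not Infinispan code; every name here is hypothetical. */
final class L1StaleOverwriteSketch {

    /** Replays the four steps and returns {value in A's L1, value on owner B}. */
    static String[] run() {
        Map<String, String> ownerB = new HashMap<>();
        Map<String, String> l1OnA = new HashMap<>();
        ownerB.put("k1", "v1");

        // tx1@A: remote get(k1) from B, stores k1=v1 in its invocation context
        Map<String, String> tx1Context = new HashMap<>();
        tx1Context.put("k1", ownerB.get("k1"));

        // tx2@A: write(k1, v2) and commit, which writes k1=v2 on the owner and in L1
        ownerB.put("k1", "v2");
        l1OnA.put("k1", "v2");

        // tx1@A: commit, which writes its context entry k1=v1 into L1
        l1OnA.putAll(tx1Context);

        return new String[] { l1OnA.get("k1"), ownerB.get("k1") };
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("L1 on A: " + result[0] + ", owner B: " + result[1]);
    }
}
```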


William Burns

10:35 a.m.

On Mon, Jun 17, 2013 at 11:11 AM, Dan Berindei <dan.berindei(a)gmail.com> wrote:

...

If we don't lock the L1 entry, I think something like this could happen:

tx1@A: remote get(k1) from B - stores k1=v1 in invocation context
tx2@A: write(k1, v2)
tx2@A: commit - writes k1=v2 in L1
tx1@A: commit - overwrites k1=v1 in L1

This one is the same as the case referenced in https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&p...

And even locking doesn't help in this case, since it doesn't lock the key for a plain remote get, only for a remote get in the context of a write. That means the L1 could be updated concurrently in either order, possibly causing an inconsistency. This will be solved when I port the fix I have for https://issues.jboss.org/browse/ISPN-3197 to tx caches.


Dan Berindei

Tuesday, 18 June

8:23 a.m.

On Mon, Jun 17, 2013 at 6:35 PM, William Burns <mudokonman(a)gmail.com> wrote:

...

This one is just like here: referenced in https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&p...

Yep, it's the same thing.

...

And even locking doesn't help in this case since it doesn't lock the key for a remote get only a remote get in the context of a write - which means the L1 could be updated concurrently in either order - causing possibly an inconsistency. This will be solved when I port the same fix I have for https://issues.jboss.org/browse/ISPN-3197 for tx caches.

I thought the locking happened for all remote gets, and that's how I think it should work. We don't have to keep the lock for the entire duration of the transaction, though. If we write the L1 entry to the data container during the remote get, like you suggested in your comment, then we could release the L1 lock immediately and remote invalidation commands would be free to remove the entry.


William Burns

8:41 a.m.

On Tue, Jun 18, 2013 at 9:23 AM, Dan Berindei <dan.berindei(a)gmail.com> wrote:

...

I thought the locking happened for all remote gets, and that's how I think it should work.

When I was talking about locking, I was actually referring to the remote lock. We do acquire local L1 locks for all remote gets. As far as I have seen, the problem with only acquiring the local L1 lock without additional checks is that updates can reach the L1 in the wrong order, such as an invalidation for your current get being applied before the get itself, which is what the Jira is about. I will be sending out a dev list email soon about the changes I was thinking of for this.

...

We don't have to keep the lock for the entire duration of the transaction, though. If we write the L1 entry to the data container during the remote get, like you suggested in your comment, then we could release the L1 lock immediately and remote invalidation commands would be free to remove the entry.

Unfortunately, the fix I proposed in the Jira still leaves some possible inconsistencies, since you could still get an L1 cache invalidation/update between the remote get and the commit into the L1 (we don't want to lock the L1 cache key for the duration of the remote get, only while updating). The simple change would improve throughput and reduce the chance of seeing an inconsistency.


Dan Berindei

Wednesday, 19 June

2:04 a.m.

On Tue, Jun 18, 2013 at 4:41 PM, William Burns <mudokonman(a)gmail.com> wrote:

...

On Tue, Jun 18, 2013 at 9:23 AM, Dan Berindei <dan.berindei(a)gmail.com>wrote: > > > > On Mon, Jun 17, 2013 at 6:35 PM, William Burns <mudokonman(a)gmail.com>wrote: > >> >> >> >> On Mon, Jun 17, 2013 at 11:11 AM, Dan Berindei <dan.berindei(a)gmail.com>wrote: >> >>> >>> >>> >>> On Mon, Jun 17, 2013 at 3:58 PM, Pedro Ruivo <pedro(a)infinispan.org>wrote: >>> >>>> >>>> >>>> On 06/17/2013 12:56 PM, Mircea Markus wrote: >>>> > >>>> > On 17 Jun 2013, at 11:52, Pedro Ruivo <pedro(a)infinispan.org> wrote: >>>> > >>>> >> I've been looking at TxDistributionInterceptor and I have a couple >>>> of >>>> >> questions (assuming REPEATABLE_READ isolation level): >>>> >> >>>> >> #1. why are we doing a remote get each time we write on a key? (huge >>>> >> perform impact if the key was previously read) >>>> > indeed this is suboptimal for transactions that write the same key >>>> repeatedly and repeatable read. Can you please create a JIRA for this? >>>> >>>> created: https://issues.jboss.org/browse/ISPN-3235 >>>> >>>> >>> Oops... when I fixed https://issues.jboss.org/browse/ISPN-3124 I >>> removed the SKIP_REMOTE_LOOKUP, thinking that the map is already in the >>> invocation context so there shouldn't be any perf penalty. I can't put the >>> SKIP_REMOTE_LOOKUP flag back, otherwise delta writes won't have the >>> previous value during state transfer, so +1 to fixing ISPN-3235. >>> >>> >>> >>>> >> >>>> >> #2. why are we doing a dataContainer.get() if the remote get >>>> returns a >>>> >> null value? Shouldn't the interactions with data container be >>>> performed >>>> >> only in the (Versioned)EntryWrappingInterceptor? >>>> > This was added in the scope of ISPN-2688 and covers the scenario in >>>> which a state transfer is in progress, the remote get returns null as the >>>> remote value was dropped (no longer owner) and this node has become the >>>> owner in between. 
>>>> > >>>> >>>> ok :) >>>> >>>> >>> Yeah, this should be correct as long as we check if we already have the >>> key in the invocation context before doing the remote + local get. >>> >>> >>> >>>> >> >>>> >> #3. (I didn't verify this) why are we acquire the lock is the >>>> remote get >>>> >> is performed for a write? This looks correct for pessimistic >>>> locking but >>>> >> not for optimistic... >>>> > I think that, given that the local node is not owner, the lock >>>> acquisition is redundant even for pessimistic caches. >>>> > Mind creating a test to check if dropping that lock acquisition >>>> doesn't break things? >>>> >>>> I created a JIRA with low priority since it does not affect the >>>> transaction outcome/isolation and I believe the performance impact >>>> should be lower (you can increase the priority if you want). >>>> >>>> https://issues.jboss.org/browse/ISPN-3237 >>>> >>> >>> If we don't lock the L1 entry, I think something like this could happen: >>> >>> tx1@A: remote get(k1) from B - stores k1=v1 in invocation context >>> tx2@A: write(k1, v2) >>> tx2@A: commit - writes k1=v2 in L1 >>> tx1@A: commit - overwrites k1=v1 in L1 >>> >> This one is just like here: referenced in >> https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&p... >> >> > Yep, it's the same thing. > > >> And even locking doesn't help in this case since it doesn't lock the >> key for a remote get only a remote get in the context of a write - which >> means the L1 could be updated concurrently in either order - causing >> possibly an inconsistency. This will be solved when I port the same fix I >> have for https://issues.jboss.org/browse/ISPN-3197 for tx caches. >> > > I thought the locking happened for all remote gets, and that's how I > think it should work. > When I was talking about locking, I was actually referring to the remote lock. 
We do acquire local L1 locks for all remote gets. As far as I have seen, the problem with acquiring only the local L1 lock, without additional checks, is that updates can reach the L1 in the wrong order, such as an invalidation for your current get being applied before the get itself, which is what the Jira is about. I will be sending out a dev list email soon about the changes I have in mind for this.

If the get command acquired the L1 lock before issuing the remote call, then any invalidation command would be blocked and could only delete the L1 entry after the get command wrote the L1 entry and released the lock.

...

> We don't have to keep the lock for the entire duration of the
> transaction, though. If we write the L1 entry to the data container during
> the remote get, like you suggested in your comment, then we could release
> the L1 lock immediately and remote invalidation commands would be free to
> remove the entry.

Unfortunately the fix I proposed in the Jira still has some possible inconsistencies, since you could still get an L1 cache invalidation/update in between the remote get and the commit into the L1 (since we don't want to lock the L1 cache key for the duration of the remote get, only while updating). The simple change would improve throughput and reduce the chance of seeing an inconsistency.

...

>>>> >> After this analysis, it is possible to break the isolation between
>>>> >> transaction if I do a get on the key that does not exist:
>>>> >>
>>>> >> tm.begin()
>>>> >> cache.get(k) //returns null
>>>> >> //in the meanwhile a transaction writes on k and commits
>>>> >> cache.get(k) //return the new value. IMO, this is not valid for
>>>> >> REPEATABLE_READ isolation level!
>>>> >
>>>> > Indeed sounds like a bug, well spotted.
>>>> > Can you please add a UT to confirm it and raise a JIRA?
>>>>
>>>> created: https://issues.jboss.org/browse/ISPN-3236
>>>>
>>>> IMO, this should be the correct behaviour (I'm going to add the test
>>>> cases later):
>>>>
>>>> tm.begin()
>>>> cache.get(k) //returns null (op#1)
>>>> //in the meanwhile a transaction writes on k and commits
>>>> write operation performed:
>>>> * put: must return the same value as op#1
>>>> * conditional put //if op#1 returns null the operation should be always
>>>> successful (i.e. the key is updated, return true). Otherwise, the key
>>>> remains unchanged (return false)
>>>> * replace: must return the same value as op#1
>>>> * conditional replace: replace should be successful if checked with the
>>>> op#1 return value (return true). Otherwise, the key must remain
>>>> unchanged (return false).
>>>> * remove: must return the same value as op#1
>>>> * conditional remove: the key should be removed if checked with the op#1
>>>> return value (return true). Otherwise, the key must remain unchanged
>>>> (return false)
>>>>
>>>> Also, the description above should be valid after a removal of a key.
>>>>
>>>> > Cheers,
>>>> >
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev(a)lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
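
The behaviour Pedro describes for ISPN-3236 can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not Infinispan code: `RepeatableReadSketch`, `TxContext`, `store` and `seen` are hypothetical names, and the context simply pins the first read of each key, including a read that found nothing, so that later reads and conditional writes in the same transaction agree with op#1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: none of these types are Infinispan classes. */
public class RepeatableReadSketch {

    /** Stand-in for the committed cache contents shared by all transactions. */
    static final Map<String, String> store = new ConcurrentHashMap<>();

    /** Hypothetical per-transaction context that pins the first read of each key. */
    static final class TxContext {
        // Optional.empty() means "this transaction saw the key as absent".
        private final Map<String, Optional<String>> seen = new HashMap<>();

        /** Only the first read of a key consults the store; a null result is pinned too. */
        String get(String key) {
            return seen.computeIfAbsent(key, k -> Optional.ofNullable(store.get(k)))
                       .orElse(null);
        }

        /** put: returns the value this transaction saw, not the latest committed one. */
        String put(String key, String value) {
            String previous = get(key);
            seen.put(key, Optional.of(value));
            return previous;
        }

        /** Conditional put: succeeds exactly when this transaction saw the key as absent. */
        boolean putIfAbsent(String key, String value) {
            if (get(key) != null) {
                return false;
            }
            seen.put(key, Optional.of(value));
            return true;
        }
    }

    public static void main(String[] args) {
        TxContext tx = new TxContext();
        System.out.println(tx.get("k"));               // op#1: key does not exist
        store.put("k", "committed-by-other-tx");       // concurrent transaction commits
        System.out.println(tx.get("k"));               // same answer as op#1
        System.out.println(tx.putIfAbsent("k", "v1")); // succeeds because op#1 saw null
    }
}
```

Running `main` prints `null`, `null`, `true`. The sketch leaves out commit-time handling entirely (for example write-skew checks), so it only shows what the transaction should observe, not how conflicts would be resolved.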

Mircea Markus

Monday, 17 June

11 a.m.

On 17 Jun 2013, at 16:11, Dan Berindei <dan.berindei(a)gmail.com> wrote:

...

> I think that, given that the local node is not owner, the lock acquisition is redundant even for pessimistic caches.
> Mind creating a test to check if dropping that lock acquisition doesn't break things?

I created a JIRA with low priority since it does not affect the transaction outcome/isolation and I believe the performance impact should be lower (you can increase the priority if you want).

https://issues.jboss.org/browse/ISPN-3237

If we don't lock the L1 entry, I think something like this could happen:

There is a lock happening *without* L1 enabled.

...

tx1@A: remote get(k1) from B - stores k1=v1 in invocation context
tx2@A: write(k1, v2)
tx2@A: commit - writes k1=v2 in L1
tx1@A: commit - overwrites k1=v1 in L1

Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org)

Dan Berindei

Tuesday, 18 June

8:16 a.m.

On Mon, Jun 17, 2013 at 7:00 PM, Mircea Markus <mmarkus(a)redhat.com> wrote:

...

On 17 Jun 2013, at 16:11, Dan Berindei <dan.berindei(a)gmail.com> wrote:

> > I think that, given that the local node is not owner, the lock acquisition is redundant even for pessimistic caches.
> > Mind creating a test to check if dropping that lock acquisition doesn't break things?
>
> I created a JIRA with low priority since it does not affect the
> transaction outcome/isolation and I believe the performance impact
> should be lower (you can increase the priority if you want).
>
> https://issues.jboss.org/browse/ISPN-3237
>
> If we don't lock the L1 entry, I think something like this could happen:

There is a lock happening *without* L1 enabled.

Nope, tx1 doesn't lock k1 on B because it doesn't do a put(k1, v3) - it only reads the value from B. So even if tx2 does lock k1 on B, it doesn't add any synchronization between tx1 and tx2. But tx1 does write the entry to L1 on A, so it should acquire an "L1 lock" on A - and tx2 should also acquire the same lock.

...

> tx1@A: remote get(k1) from B - stores k1=v1 in invocation context
> tx2@A: write(k1, v2)
> tx2@A: commit - writes k1=v2 in L1
> tx1@A: commit - overwrites k1=v1 in L1
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)
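
The four-step interleaving quoted above can be replayed deterministically in a single thread. The sketch below is a toy model, not Infinispan code: `L1RaceSketch`, `replayWithoutLock`, `replayWithLock` and the `Semaphore` standing in for a per-key "L1 lock" on A are all assumptions made for illustration. It also assumes tx1 holds that lock from its remote get until its commit, which is simpler than the variants discussed in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Single-threaded replay of the interleaving; a toy model, not Infinispan code. */
public class L1RaceSketch {

    /** Replays the four steps with no L1 lock: tx1's stale value wins. */
    static String replayWithoutLock() {
        Map<String, String> l1 = new HashMap<>();
        String tx1Context = "v1";      // tx1@A: remote get(k1) from B, kept in its context
        l1.put("k1", "v2");            // tx2@A: write(k1, v2) and commit
        l1.put("k1", tx1Context);      // tx1@A: commit overwrites the newer value
        return l1.get("k1");
    }

    /** Same steps, but both transactions must hold a hypothetical per-key L1 lock. */
    static String replayWithLock() {
        Map<String, String> l1 = new HashMap<>();
        Semaphore l1Lock = new Semaphore(1);          // stand-in for the "L1 lock" on A

        l1Lock.acquireUninterruptibly();              // tx1 locks k1 before the remote get
        String tx1Context = "v1";
        boolean tx2GotLock = l1Lock.tryAcquire();     // tx2 cannot proceed yet
        if (tx2GotLock) {
            throw new IllegalStateException("tx2 should have been blocked");
        }
        l1.put("k1", tx1Context);                     // tx1 commits first
        l1Lock.release();

        l1Lock.acquireUninterruptibly();              // now tx2 gets the lock
        l1.put("k1", "v2");                           // tx2 commits last
        l1Lock.release();
        return l1.get("k1");
    }

    public static void main(String[] args) {
        System.out.println(replayWithoutLock());
        System.out.println(replayWithLock());
    }
}
```

Running `main` prints `v1` for the unlocked replay (the stale value survives in L1) and `v2` for the locked one. A non-reentrant `Semaphore` is used instead of a `ReentrantLock` only so that a single thread can play both transactions.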

Mircea Markus

Wednesday, 19 June

5:40 a.m.

On 18 Jun 2013, at 14:16, Dan Berindei <dan.berindei(a)gmail.com> wrote:

...

On Mon, Jun 17, 2013 at 7:00 PM, Mircea Markus <mmarkus(a)redhat.com> wrote:

On 17 Jun 2013, at 16:11, Dan Berindei <dan.berindei(a)gmail.com> wrote:

> > I think that, given that the local node is not owner, the lock acquisition is redundant even for pessimistic caches.
> > Mind creating a test to check if dropping that lock acquisition doesn't break things?
>
> I created a JIRA with low priority since it does not affect the
> transaction outcome/isolation and I believe the performance impact
> should be lower (you can increase the priority if you want).
>
> https://issues.jboss.org/browse/ISPN-3237
>
> If we don't lock the L1 entry, I think something like this could happen:

There is a lock happening *without* L1 enabled.

Nope, tx1 doesn't lock k1 on B because it doesn't do a put(k1, v3) - it only reads the value from B. So even if tx2 does lock k1 on B, it doesn't add any synchronization between tx1 and tx2.

A lock is being acquired even without L1 enabled on A: https://github.com/an1310/infinispan/blob/master/core/src/main/java/org/i...

...

But tx1 does write the entry to L1 on A, so it should acquire an "L1 lock" on A - and tx2 should also acquire the same lock.

Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org)


participants (4)

Dan Berindei
Mircea Markus
Pedro Ruivo
William Burns
