[infinispan-dev] SingleJoinTest#testTransactional failure

Mircea Markus mircea.markus at jboss.com
Tue Nov 30 07:34:10 EST 2010


On 24 Nov 2010, at 15:36, Manik Surtani wrote:

> Right, I've spotted it.  The test failure itself is intermittent due to the way addresses are organised in the hash wheel, so you are correct that it is a timing issue.  Anyway, it still is a very real problem.  Just to re-iterate and to make sure we are talking about the same thing:
> 
> 1.  View is {A, B, C}
> 2.  K is mapped to {A, B}
> 3.  A tx starts to update K, and is prepared.  Locks now held for K on {A, B}
> 4.  D joins.  D is placed on the hash wheel between A and B.  So the new view is {A, D, B, C}
> 5.  As per the test (artificial, I know, but could still happen), the tx waits for a long time before committing.  In the case of the test, artificially waits until D has finished joining before committing, by use of a latch.
> 6.  D never finishes joining: even though it receives the prepare for the tx and could potentially commit it itself (as a new owner), the join fails because it is unable to invalidate K on B.
> 
> There are a few solutions here:
> 
> 1)  This is pretty easy to detect.  Attempt to acquire the lock with a smaller lock acquisition timeout and if the transaction is still stuck, abort the transaction and proceed with the join.
> 2)  If the blocking node is *not* the transaction originator (as in this case: the tx was started on A), then just force lock removal and tx rollback on B *only*.  Let the tx complete on A, since the new joiner will receive the transactional event and will be able to apply it as a new owner.
What I'm saying might be very wrong, but I'll try anyway :)
Isn't it possible to make the invalidation of K on B part of the transaction commit? I.e. the invalidation on B sees that K is locked by a tx that is not yet committed, and it skips the invalidation. When the tx commits (second phase), it multicasts the commit message to all the lock owners _including_ B (all the owners == {A, B, D}). When the CommitCommand is received, B checks whether or not it is still a data owner for each key: if it is, then it applies the change; otherwise it removes the key.
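
To make the idea concrete, here is a rough sketch of the commit-time ownership check. It is not Infinispan code: the Node class, its store map, and the onCommit method are hypothetical names made up for illustration, and the owner sets are passed in by hand rather than computed from a consistent hash.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CommitOwnershipSketch {

    /** Hypothetical stand-in for one cluster member and its local data. */
    static final class Node {
        final String name;
        final Map<String, String> store = new HashMap<>();

        Node(String name) {
            this.name = name;
        }

        /**
         * Called when the commit (second phase) arrives. For each modified key,
         * apply the write if this node is still an owner under the new view;
         * otherwise drop the key, which stands in for the skipped invalidation.
         */
        void onCommit(Map<String, String> modifications,
                      Map<String, Set<String>> ownersByKey) {
            for (Map.Entry<String, String> e : modifications.entrySet()) {
                Set<String> owners =
                        ownersByKey.getOrDefault(e.getKey(), Collections.emptySet());
                if (owners.contains(name)) {
                    store.put(e.getKey(), e.getValue());
                } else {
                    store.remove(e.getKey());
                }
            }
        }
    }

    public static void main(String[] args) {
        // Assumed outcome of the rehash once D has joined: K is owned by {A, D}.
        Map<String, Set<String>> owners = new HashMap<>();
        owners.put("K", new HashSet<>(Arrays.asList("A", "D")));

        Node a = new Node("A");
        Node b = new Node("B");
        Node d = new Node("D");
        a.store.put("K", "old");
        b.store.put("K", "old");

        // The commit is sent to the old and new owners alike: {A, B, D}.
        Map<String, String> mods = Collections.singletonMap("K", "new");
        for (Node n : Arrays.asList(a, b, d)) {
            n.onCommit(mods, owners);
            System.out.println(n.name + " -> " + n.store);
        }
    }
}
```

The point of the sketch is only that B never needs a separate invalidation while K is locked: the commit itself carries enough information for B to either apply or discard the entry.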

> My vote is to go for solution 1 - a bit more crude, but 2 would be very complex to implement.  And even then, 2 would only solve the case of the invalidation being blocked on a node that did not originate the transaction.  E.g., the tx originated on A but the lock issue was on B.  If, however, the tx originated on B, *and* B no longer owns the entry in question, then 2 is no longer a solution and the only solution would be 1.
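
For comparison, solution 1 could be sketched roughly as follows. Again this is not Infinispan code: KeyLock, abortHolder and invalidateForJoin are hypothetical names, the lock is modelled with a plain java.util.concurrent.Semaphore, and the abort path is deliberately simplified (it is not safe against concurrent aborts).

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class ShortTimeoutJoinSketch {

    /** Hypothetical per-key lock that remembers which tx holds it. */
    static final class KeyLock {
        private final Semaphore permit = new Semaphore(1);
        private volatile String holderTx;

        /** Try to take the lock, giving up after timeoutMillis. */
        boolean acquire(String tx, long timeoutMillis) {
            try {
                if (permit.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    holderTx = tx;
                    return true;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return false;
        }

        /** Roll back whoever holds the lock; returns the aborted tx, or null. */
        String abortHolder() {
            String tx = holderTx;
            if (tx != null) {
                holderTx = null;
                permit.release();
            }
            return tx;
        }
    }

    /**
     * Invalidation during a join: try with a short timeout; if the key is
     * still locked, abort the blocking tx and carry on with the join.
     * Returns the id of the aborted tx, or null if nothing had to be aborted.
     */
    static String invalidateForJoin(KeyLock lock, long shortTimeoutMillis) {
        if (lock.acquire("join", shortTimeoutMillis)) {
            return null;
        }
        String aborted = lock.abortHolder();
        if (!lock.acquire("join", shortTimeoutMillis)) {
            throw new IllegalStateException("key still locked after abort");
        }
        return aborted;
    }

    public static void main(String[] args) {
        KeyLock lockOnK = new KeyLock();
        lockOnK.acquire("tx1", 10);   // prepared tx holds K and never commits
        String aborted = invalidateForJoin(lockOnK, 50);
        System.out.println("aborted: " + aborted);
    }
}
```

The short timeout is what makes the stuck transaction cheap to detect; the cost, as noted above, is that a transaction that was merely slow gets rolled back.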
> 
> Thoughts?
> 
> Cheers
> Manik
> 
> PS: Do we have a JIRA for this?
> 
> On 23 Nov 2010, at 13:13, Vladimir Blagojevic wrote:
> 
>> On 10-11-23 8:11 AM, Manik Surtani wrote:
>>> Is this related?
>>> 
>>> https://jira.jboss.org/browse/ISPN-595
>>> 
>>> Let's have a chat when you're online later today...
>>> 
>> 
>> No, this is different I believe. I do not see from his stack trace that InvalidateCommand is involved.
> 
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev



