[infinispan-issues] [JBoss JIRA] Commented: (ISPN-860) Rehashing into a running cluster causes lock timeouts and lock cleanup errors

Monday, 10 January 2011

[ https://issues.jboss.org/browse/ISPN-860?page=com.atlassian.jira.plugin.s... ]

Manik Surtani commented on ISPN-860:
------------------------------------

I think this may be the culprit -
https://github.com/infinispan/infinispan/blob/master/core/src/main/java/o...

This happens when a prepare() occurs on, say, nodes {A, B, C} but the commit is sent to
{A, B, D}, because D joined between the prepare and the commit and took ownership of the key.

Do you see this exception as a signature of this failure, prior to seeing timeout
exceptions?

java.lang.IllegalStateException: Can not commit since DldGlobalTransaction{coinToss=NNNNN,
isMarkedForRollback=false, lockIntention=null, affectedKeys=[], locksAtOrigin=[K]}
GlobalTransaction:<address>:port:local was prepared on [C1, C2, C3] nodes while it
is being committed to [C1, C2, C4]
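
To make the suspected race concrete, here is a minimal sketch in plain Java. It does not use Infinispan classes; the class and method names (OwnerMismatchSketch, checkCommitTargets, orphaned) are hypothetical and only model the comparison between the prepare recipients and the commit recipients described above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class OwnerMismatchSketch {

    // Hypothetical stand-in for the check behind the exception quoted above:
    // refuse to commit when the commit recipients differ from the prepare recipients.
    static void checkCommitTargets(Set<String> preparedOn, Set<String> commitTo) {
        if (!preparedOn.equals(commitTo)) {
            throw new IllegalStateException("Can not commit since the transaction was prepared on "
                    + preparedOn + " nodes while it is being committed to " + commitTo);
        }
    }

    // Nodes that received the prepare but are no longer commit recipients.
    // Under the hypothesis above, these are the nodes that could be left holding locks.
    static Set<String> orphaned(Set<String> preparedOn, Set<String> commitTo) {
        Set<String> left = new LinkedHashSet<>(preparedOn);
        left.removeAll(commitTo);
        return left;
    }

    public static void main(String[] args) {
        // Owners of the key at prepare time.
        Set<String> preparedOn = new LinkedHashSet<>(Arrays.asList("A", "B", "C"));
        // Owners at commit time, after D has joined and taken ownership from C.
        Set<String> commitTo = new LinkedHashSet<>(Arrays.asList("A", "B", "D"));

        System.out.println("Prepared but not committed: " + orphaned(preparedOn, commitTo));
        try {
            checkCommitTargets(preparedOn, commitTo);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, C has seen the prepare but is never sent the commit, which would be consistent with the locks that are reported as never being released.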

...
 Rehashing into a running cluster causes lock timeouts and lock cleanup errors
 -----------------------------------------------------------------------------

                 Key: ISPN-860
                 URL: https://issues.jboss.org/browse/ISPN-860
             Project: Infinispan
          Issue Type: Bug
    Affects Versions: 4.2.0.Final
            Reporter: Erik Salter
            Assignee: Manik Surtani
             Fix For: 4.2.1.Final

         Attachments: multinode-rehash.zip

 We are seeing severe issues when a new node joins a cluster that is running transactions.
Specifically, when a new node is added to the system, some transactions running against the
existing nodes fail because locks are never released.  There are also many lock
timeouts.
 All of our caches are in DIST mode, with the number of owners set to 3.  We also make
liberal use of the new "eagerLockSingleNode" flag.
 The attached test case illustrates the lock timeout problem.
-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
