[
https://issues.jboss.org/browse/ISPN-860?page=com.atlassian.jira.plugin.s...
]
Manik Surtani commented on ISPN-860:
------------------------------------
I think this may be the culprit -
https://github.com/infinispan/infinispan/blob/master/core/src/main/java/o...
This happens when a prepare() occurs on, say, nodes {A, B, C} but the commit is sent to
{A, B, D}, because D joins between the prepare and the commit and takes ownership of the key.
Do you see this exception as a signature of this failure, prior to seeing timeout
exceptions?
java.lang.IllegalStateException: Can not commit since DldGlobalTransaction{coinToss=NNNNN,
isMarkedForRollback=false, lockIntention=null, affectedKeys=[], locksAtOrigin=[K]}
GlobalTransaction:<address>:port:local was prepared on [C1, C2, C3] nodes while it
is being committed to [C1, C2, C4]
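
To make the suspected sequence concrete, here is a minimal, self-contained Java sketch of the mismatch. It is not Infinispan code: the class PrepareCommitMismatch, the helper ownersOf, and the "first N members of the view" owner rule are all hypothetical stand-ins for the real consistent hash. It only models a joiner displacing an owner between prepare and commit.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PrepareCommitMismatch {

    // Hypothetical owner lookup: the first numOwners members of the view own the key.
    // The real consistent hash is more involved; this only models ownership changing on a join.
    static Set<String> ownersOf(List<String> view, int numOwners) {
        return new LinkedHashSet<>(view.subList(0, Math.min(numOwners, view.size())));
    }

    // Rejects a commit whose targets differ from the nodes that received the prepare.
    static void verifyCommitTargets(Set<String> preparedOn, Set<String> commitTargets) {
        if (!preparedOn.equals(commitTargets)) {
            throw new IllegalStateException(
                "prepared on " + preparedOn + " but committing to " + commitTargets);
        }
    }

    public static void main(String[] args) {
        int numOwners = 3;

        // Prepare is sent while the view is {A, B, C}.
        Set<String> preparedOn = ownersOf(List.of("A", "B", "C"), numOwners);

        // D joins before the commit and, in this toy ordering, takes ownership from C.
        Set<String> commitTargets = ownersOf(List.of("A", "B", "D", "C"), numOwners);

        try {
            verifyCommitTargets(preparedOn, commitTargets);
        } catch (IllegalStateException e) {
            // prints: prepared on [A, B, C] but committing to [A, B, D]
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch C received the prepare but never sees the commit, which would be consistent with locks on C never being released; whether that is what actually happens in the failing runs is still an assumption.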
Rehashing into a running cluster causes lock timeouts and lock
cleanup errors
-----------------------------------------------------------------------------
Key: ISPN-860
URL: https://issues.jboss.org/browse/ISPN-860
Project: Infinispan
Issue Type: Bug
Affects Versions: 4.2.0.Final
Reporter: Erik Salter
Assignee: Manik Surtani
Fix For: 4.2.1.Final
Attachments: multinode-rehash.zip
We are seeing some severe issues with a new node joining a cluster running transactions.
Specifically, when a new node is added to the system, some transactions running against the
previous nodes will fail due to locks never being released. There will be a lot of lock
timeouts as well.
All of our caches are in DIST mode. The number of owners is 3. We are also making
liberal use of the new "eagerLockSingleNode" flag.
The attached test case illustrates the lock timeout problem.