[
https://issues.jboss.org/browse/ISPN-1602?page=com.atlassian.jira.plugin....
]
Dan Berindei commented on ISPN-1602:
------------------------------------
I got a similar situation in the test suite, but with 2 view changes:
{noformat}
Initial cluster view is A2 [A, B]
tx_A: sends CommitCommand cc to B
C: sends REQUEST_JOIN
A: sends PREPARE_VIEW(3, [A, B, C]) to B, C
A: blocks write commands for cache view 3
B: blocks write commands for cache view 3
tx_B: rejects cc with a StateTransferInProgressException
tx_A: waits for state transfer 3 to end
A: sends COMMIT_VIEW(3) to B, C
A: unblocks transactions
tx_A: resends cc to B
tx_B: rejects cc again, because state transfer hasn't ended on B yet
tx_A: waits for state transfer 3 to end
but since state transfer 3 has already ended on A, it doesn't actually wait at all
tx_A: resends cc to B
tx_B: rejects cc the 3rd time
B: finally receives the COMMIT_VIEW(3) command and unblocks transactions, but it's too late for cc
{noformat}
My pull request for ISPN-1581 will fix this, because it retries commands for much
longer (indefinitely for commit commands). There will be some extra
StateTransferInProgressExceptions in the log, but that's about it.
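To make the difference concrete, here is a minimal sketch of the trace above as a toy simulation. It is not Infinispan code: the class and method names (CommitRetrySketch, RemoteNode, sendCommit) and the exception type are hypothetical, and the limit of 3 attempts is an assumption read off the trace rather than taken from the actual implementation.

```java
public class CommitRetrySketch {

    /** Hypothetical stand-in for the rejection B sends back; not the Infinispan class. */
    static class StateTransferInProgress extends RuntimeException {}

    /** Simulates node B: it rejects the commit until COMMIT_VIEW has arrived. */
    static class RemoteNode {
        private final int rejectionsBeforeCommitView;
        private int attemptsSeen = 0;

        RemoteNode(int rejectionsBeforeCommitView) {
            this.rejectionsBeforeCommitView = rejectionsBeforeCommitView;
        }

        void handleCommit() {
            attemptsSeen++;
            if (attemptsSeen <= rejectionsBeforeCommitView) {
                // B still has write commands blocked for the new cache view.
                throw new StateTransferInProgress();
            }
        }
    }

    /**
     * Sends the commit, retrying on rejection.
     * A negative maxAttempts means "retry indefinitely" (assumed convention for this sketch).
     * Returns true if the commit was eventually accepted.
     */
    static boolean sendCommit(RemoteNode target, int maxAttempts) {
        for (int attempt = 1; maxAttempts < 0 || attempt <= maxAttempts; attempt++) {
            try {
                target.handleCommit();
                return true;
            } catch (StateTransferInProgress e) {
                // In the trace, tx_A waits here for state transfer to end locally.
                // That wait returns immediately because A has already finished,
                // so the sketch simply loops and resends.
            }
        }
        return false; // gave up: the remote lock would be left stale
    }

    public static void main(String[] args) {
        // B rejects 3 times before it sees COMMIT_VIEW(3), as in the trace.
        System.out.println(sendCommit(new RemoteNode(3), 3));  // prints "false"
        System.out.println(sendCommit(new RemoteNode(3), -1)); // prints "true"
    }
}
```

With a bounded retry count the sender gives up just before B unblocks; with unbounded retries for commits the fourth attempt goes through, at the cost of a few extra rejections being logged.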
Single view change causes stale locks
-------------------------------------
Key: ISPN-1602
URL: https://issues.jboss.org/browse/ISPN-1602
Project: Infinispan
Issue Type: Bug
Components: Core API
Affects Versions: 5.1.0.CR1
Reporter: Erik Salter
Assignee: Dan Berindei
Priority: Critical
Fix For: 5.1.0.CR2
Attachments: erm_tcp.xml, session_udp.xml
During load testing of 5.1.0.CR1, we're encountering JGroups 3.x dropping views. We
know from ISPN-1581 that if there are more than 3 view changes, a failed commit can
leave a stale lock. However, we're seeing stale locks occur on a single view change.
In the following logs, the affected cluster is erm-cluster-xxxx.
(We also don't know why JGroups 3.x is unstable. We suspected FLUSH and incorrect FD
settings, but we removed them, and we're still getting dropped messages.)
The trace logs (the issue occurs very early in them) are at:
http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht10/server.l...
http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht11/server.l...
http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht12/server.l...
http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht13/server.l...
http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht14/server.l...
http://dl.dropbox.com/u/50401510/5.1.0.CR1/dec08viewchange/dht15/server.l...