[
https://issues.jboss.org/browse/ISPN-2081?page=com.atlassian.jira.plugin....
]
Mircea Markus commented on ISPN-2081:
-------------------------------------
My initial suggestion of waiting for all responses before sending the rollback is
already in place, as the ResponseMode is set to GET_ALL for prepare commands.
The only situation I can think of in which the problem can appear is when we get a
replication timeout before the prepare message reaches all the participants. Then
RpcManager throws a timeout exception and the rollback is sent async. Now this rollback
might still get processed *before* the prepare on some nodes.
I'll solve this as per a recommendation from Sebastiano Peluso:
- the rollback creates the RemoteTransaction and marks it for rollback
- when the prepare eventually arrives, it checks whether the transaction is already
marked for rollback. If so, it removes it.
- if the prepare never arrives, it means that the originator has crashed and the
transaction should be cleaned up by the remote lookup service
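
The three bullets above can be sketched roughly as follows. This is only an illustration of the idea, not the actual Infinispan code: RemoteTxTableSketch, its State enum and the onRollback/onPrepare methods are hypothetical names, and a transaction is reduced to a string id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a node's table of remote transactions.
// None of these names come from Infinispan itself.
final class RemoteTxTableSketch {
    enum State { PREPARED, MARKED_FOR_ROLLBACK }

    private final Map<String, State> remoteTxs = new ConcurrentHashMap<>();

    // Rollback handler: a known prepared transaction is removed; an unknown
    // one gets a marker so that a late prepare can be recognised.
    void onRollback(String txId) {
        remoteTxs.compute(txId, (id, state) ->
                state == State.PREPARED ? null : State.MARKED_FOR_ROLLBACK);
    }

    // Prepare handler: returns true if the prepare was applied, false if it
    // was discarded because the rollback had already been processed.
    boolean onPrepare(String txId) {
        boolean[] applied = {false};
        remoteTxs.compute(txId, (id, state) -> {
            if (state == State.MARKED_FOR_ROLLBACK) {
                return null; // late prepare: drop the marker, keep nothing
            }
            applied[0] = true;
            return State.PREPARED;
        });
        return applied[0];
    }

    int size() {
        return remoteTxs.size();
    }
}
```

In this sketch a marker whose prepare never arrives stays in the map; that is the case the third bullet hands over to a separate cleanup.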
Transaction leak caused by reordering between prepare and rollback
------------------------------------------------------------------
Key: ISPN-2081
URL:
https://issues.jboss.org/browse/ISPN-2081
Project: Infinispan
Issue Type: Bug
Components: Transactions
Affects Versions: 5.1.5.FINAL
Reporter: Mircea Markus
Assignee: Mircea Markus
Priority: Critical
Labels: jdg, jdg6
Fix For: 5.2.0.CR1, 5.2.0.Final
Attachments: DistL1WriteSkewTest.txt, RollbackNotSentBeforePrepareTest.java
There's no ordering between the prepare and commit/rollback messages, as the latter
are sent OOB.
With this in mind, the following transaction leak might happen:
Tx1 sends prepare to nodes {A,B}
1. The message reaches A and times out, but hasn't yet been processed on B
2. The transaction originator reacts immediately to the timeout received from A, without
waiting for the response from B, and sends a rollback request
3. The rollback request is processed on A and B
4. The initial prepare is then processed on B
At this point we have an orphaned prepared transaction on B.
Whilst this is not causing any inconsistencies, it keeps keys locked indefinitely and is
a memory leak.
The solution would be to wait at step 2 for the responses to all the prepare messages
*before* sending the rollback.
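
A minimal sketch of that "wait for everything, then decide" step, assuming each prepare response is modelled as a CompletableFuture<Boolean> that completes exceptionally on a timeout. PrepareBarrierSketch and awaitAllPrepares are hypothetical names and do not reflect how RpcManager is actually implemented.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper, not part of Infinispan.
final class PrepareBarrierSketch {
    // Blocks until every prepare response has completed, successfully or
    // not, and only then reports whether all participants prepared.
    // A false result is the point at which the rollback would be sent.
    static boolean awaitAllPrepares(List<CompletableFuture<Boolean>> responses) {
        boolean allOk = true;
        for (CompletableFuture<Boolean> response : responses) {
            // handle() turns a failure (e.g. a timeout) into false, so
            // join() never throws and the loop never exits early.
            boolean ok = response
                    .handle((vote, err) -> err == null && Boolean.TRUE.equals(vote))
                    .join();
            allOk &= ok;
        }
        return allOk;
    }
}
```

The point of the sketch is that a failure from A does not cut the wait on B short; the decision is taken only after the last response is in.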
Attached is a unit test to reproduce the issue.
Related mailing list thread:
http://infinispan.markmail.org/search/#query:%20list%3Aorg.jboss.lists.in...