[infinispan-issues] [JBoss JIRA] (ISPN-8195) Transaction fails to commit when a node crashes

Dan Berindei (JIRA) issues at jboss.org
Fri Aug 25 07:21:00 EDT 2017


    [ https://issues.jboss.org/browse/ISPN-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454030#comment-13454030 ] 

Dan Berindei commented on ISPN-8195:
------------------------------------

I have also seen this in {{ConcurrentNonOverlappingLeaveTest}}. The problem is that {{TxDistributionInterceptor}} doesn't throw an {{OutdatedTopologyException}} when the current topology differs from the command topology set in {{StateTransferInterceptor}}. That makes it possible for {{A}} to process the commit command without waiting to receive the transaction data (or even to become an owner, but that would be much harder to reproduce).
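
To illustrate the missing check, here is a minimal, self-contained sketch. It is not Infinispan code: {{NodeSketch}}, {{OutdatedTopologySketchException}}, {{commitUnchecked}}, {{commitChecked}} and the topology ids are hypothetical names and values chosen for illustration. It assumes the node has already moved to a newer topology than the one stamped on the commit command and has not yet received the transaction data.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for OutdatedTopologyException (not the real Infinispan class).
class OutdatedTopologySketchException extends RuntimeException {
    OutdatedTopologySketchException(String message) {
        super(message);
    }
}

// Hypothetical model of a node (such as A) that receives a remote commit.
class NodeSketch {
    private final int currentTopologyId;
    private final Set<String> preparedTxs = new HashSet<>();

    NodeSketch(int currentTopologyId) {
        this.currentTopologyId = currentTopologyId;
    }

    // Models the arrival of prepared transactions via state transfer.
    void receiveTransactionData(String globalTx) {
        preparedTxs.add(globalTx);
    }

    // Behaviour described in the issue: no topology comparison, so a commit
    // that arrives before the transaction data fails hard.
    void commitUnchecked(String globalTx, int commandTopologyId) {
        if (!preparedTxs.remove(globalTx)) {
            throw new IllegalStateException("Remote transaction not found: " + globalTx);
        }
    }

    // Assumed fix: compare topologies first and signal a retry on mismatch.
    void commitChecked(String globalTx, int commandTopologyId) {
        if (commandTopologyId != currentTopologyId) {
            throw new OutdatedTopologySketchException(
                "command topology " + commandTopologyId
                    + " != current topology " + currentTopologyId);
        }
        commitUnchecked(globalTx, commandTopologyId);
    }
}

class TopologyCheckSketch {
    public static void main(String[] args) {
        String tx = "GlobalTx:test-NodeC-45028:1";
        NodeSketch a = new NodeSketch(3); // A is already in topology 3

        try {
            a.commitUnchecked(tx, 2);     // command was stamped with topology 2
        } catch (IllegalStateException e) {
            System.out.println("unchecked: " + e.getMessage());
        }

        try {
            a.commitChecked(tx, 2);
        } catch (OutdatedTopologySketchException e) {
            System.out.println("checked: retry requested (" + e.getMessage() + ")");
        }

        // The retry is stamped with the new topology, after the data has arrived.
        a.receiveTransactionData(tx);
        a.commitChecked(tx, 3);
        System.out.println("retry committed");
    }
}
```

With the check in place, the originator sees a retryable exception instead of a fatal {{IllegalStateException}}, so the commit can be re-sent once the receiver has caught up.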

> Transaction fails to commit when a node crashes
> -----------------------------------------------
>
>                 Key: ISPN-8195
>                 URL: https://issues.jboss.org/browse/ISPN-8195
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Core, Transactions
>    Affects Versions: 9.1.0.Final
>            Reporter: Radim Vansa
>            Assignee: Dan Berindei
>         Attachments: log.zip
>
>
> Nodes ABC, key is owned by BC:
> 1. C prepares a transaction modifying only one key [BC], prepare succeeds on both
> 2. B crashes
> 3. C tries to send CommitCommand to B and gets {{CacheNotFoundResponse}}
> 4. C throws {{OutdatedTopologyException}} (OTE), which gets handled by {{StateTransferInterceptor}} (STI) and retried
> 5. A becomes an owner of key in the next topology
> 6. C sends CommitCommand to all owners, including A
> 7. A does not find the transaction prepared and throws {{IllegalStateException: Remote transaction not found: GlobalTx:test-NodeC-45028:1}}
> 8. C fails the transaction because of the {{IllegalStateException}}
> Usually A should request transactions during state transfer, but the CommitCommand is sent in the first topology with a higher id. In this case that is the "Hey we've lost B!" topology, which does not start a rebalance yet.


