[infinispan-issues] [JBoss JIRA] (ISPN-4137) Transaction executed multiple times due to forwarded CommitCommand
Dan Berindei (JIRA)
issues at jboss.org
Tue Mar 25 06:52:13 EDT 2014
[ https://issues.jboss.org/browse/ISPN-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12955943#comment-12955943 ]
Dan Berindei commented on ISPN-4137:
------------------------------------
While trying to write a test for the issue, I realized that the issue isn't actually related to state transfer. The only link with state transfer is that it may be more likely for a commit command to time out if it's waiting for a node to install a new topology, or forwarding the commit to a new node which is itself waiting for a new topology.
Let's say we have a transaction with a put(k, v) command, the originator is A, and the key owners are B (primary) and C (backup).
Let's also assume the local commit on either node can't fail; the only possible failure is a replication timeout.
If the commit command sent from A to C times out, A will send a rollback command to B and C, and there are two cases:
1. C applies the commit before receiving the rollback command, and writes {{k=v}} in the cache without B holding the lock on {{k}}, so the write can overwrite another transaction's update.
2. C receives the rollback command and skips the commit command, leaving {{k=v}} on B and {{k=null}} on C.
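As a rough illustration of the two cases, here is a toy model. It is not Infinispan code: the {{Node}} class, its methods, and the second transaction {{tx2}} are made up for the sketch, and it assumes that a rollback arriving after a commit is a no-op and that a commit arriving after a rollback is skipped.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RollbackRaceSketch {

    /** Hypothetical key owner; only models commit/rollback ordering. */
    static final class Node {
        private final Map<String, String> data = new HashMap<>();
        private final Set<String> rolledBack = new HashSet<>();

        void commit(String tx, String key, String value) {
            if (rolledBack.contains(tx)) {
                return; // rollback was seen first: skip the commit
            }
            data.put(key, value);
        }

        void rollback(String tx) {
            rolledBack.add(tx); // assumption: does not undo an applied commit
        }

        String get(String key) {
            return data.get(key);
        }
    }

    /** Case 1: C applies the delayed commit of tx1 before the rollback. */
    static String[] lateCommitCase() {
        Node b = new Node();
        Node c = new Node();
        b.commit("tx1", "k", "v");   // commit on B succeeds, lock on k is gone
        b.commit("tx2", "k", "v2");  // another transaction writes k
        c.commit("tx2", "k", "v2");
        c.commit("tx1", "k", "v");   // delayed tx1 commit overwrites tx2 on C
        b.rollback("tx1");
        c.rollback("tx1");
        return new String[] {b.get("k"), c.get("k")};
    }

    /** Case 2: C receives the rollback first and skips the commit. */
    static String[] rollbackFirstCase() {
        Node b = new Node();
        Node c = new Node();
        b.commit("tx1", "k", "v");
        c.rollback("tx1");
        c.commit("tx1", "k", "v");   // skipped
        b.rollback("tx1");
        return new String[] {b.get("k"), c.get("k")};
    }

    public static void main(String[] args) {
        String[] first = lateCommitCase();
        String[] second = rollbackFirstCase();
        System.out.println("case 1: B=" + first[0] + " C=" + first[1]);
        System.out.println("case 2: B=" + second[0] + " C=" + second[1]);
    }
}
```

With this interleaving, case 1 ends with B holding {{v2}} and C holding {{v}}, and case 2 ends with B holding {{v}} and C holding nothing for {{k}}. Either way the two owners disagree.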
The only way out of this is to not send the rollback command at all, and use recovery to force the commit or rollback on A - but blocking any transactions that want to write to {{k}} in the meantime. When recovery is enabled, this is what my fix does, but I'm not sure if holding the lock on {{k}} for in-doubt transactions is ok. [~mircea.markus], WDYT?
There is a slightly different problem that my PR does fix: if the commit succeeds on both B and C, but A sees a topology change, A will re-send the commit command to both B and C. Without my change, B and C will each replay the prepare and then the commit, allowing for inconsistencies. With the change, the transaction is seen as already completed, and B and C do nothing.
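The "already completed" check can be pictured roughly like this. It is a minimal sketch of the general idea, not the code in the PR: {{Participant}}, {{completedTx}}, and the {{commit}} signature are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CompletedTxSketch {

    /** Hypothetical participant node; not an Infinispan class. */
    static final class Participant {
        private final Map<String, String> data = new HashMap<>();
        private final Set<String> completedTx = new HashSet<>();

        /** Returns true if the commit was applied, false if it was a duplicate. */
        boolean commit(String txId, String key, String value) {
            if (!completedTx.add(txId)) {
                return false; // transaction already completed: ignore the re-sent commit
            }
            data.put(key, value);
            return true;
        }

        String get(String key) {
            return data.get(key);
        }
    }

    public static void main(String[] args) {
        Participant b = new Participant();
        b.commit("tx1", "k", "v1");                    // original commit
        b.commit("tx2", "k", "v2");                    // later transaction on the same key
        boolean applied = b.commit("tx1", "k", "v1");  // commit re-sent after a topology change
        System.out.println("re-sent commit applied: " + applied + ", k=" + b.get("k"));
    }
}
```

Here the re-sent commit for {{tx1}} is dropped, so the later write from {{tx2}} is not overwritten. A real implementation would also need to bound how long completed transaction ids are remembered, which the sketch ignores.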
> Transaction executed multiple times due to forwarded CommitCommand
> ------------------------------------------------------------------
>
> Key: ISPN-4137
> URL: https://issues.jboss.org/browse/ISPN-4137
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer, Transactions
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
>
> When the {{StateTransferInterceptor}} forwards a CommitCommand for the new topology, multiple CommitCommands may be broadcast across the cluster. If the command (forwarded already from originator) times out, the transaction may be correctly finished by the first one and the application considers TX as succeeded (useSynchronizations=true), although one more Rollback is sent as well.
> Then, again in STI, when the CommitCommand arrives with higher topologyId than the one used for the first TX execution, another artificial Prepare (followed by the commit) is executed - see {{STI.visitCommitCommand}}.
> However, this execution may be delayed a lot and originator may have already executed another TX on the same entries. Then, this forwarded Commit will overwrite the already updated entries, causing inconsistency of data.