[
https://issues.jboss.org/browse/ISPN-5158?page=com.atlassian.jira.plugin....
]
Dan Berindei edited comment on ISPN-5158 at 1/19/15 12:39 PM:
--------------------------------------------------------------
The solution would be to throw an exception in
{{TxInterceptor.invokeNextInterceptorAndVerifyTransaction()}} instead of returning null,
when the originator is missing.
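A minimal sketch of the proposed change, not the actual Infinispan code: the class, the method names, and the exception type below are hypothetical stand-ins. It assumes, based on the log in the issue description, that a {{null}} response to a prepare is read as success by the originator, and contrasts that with throwing an exception when the originator is not in the view.

```java
import java.util.Set;

// Hypothetical model of the remote prepare path; NOT Infinispan's TxInterceptor.
final class RemotePrepareSketch {

    /** Hypothetical exception type signalling that the remote transaction was rolled back. */
    static final class OriginatorMissingException extends RuntimeException {
        OriginatorMissingException(String message) {
            super(message);
        }
    }

    /** Behaviour described in the issue: roll back, then return null anyway. */
    static Object prepareReturningNull(String originator, Set<String> view) {
        if (!view.contains(originator)) {
            // The remote transaction would be rolled back here.
            return null; // looks the same as a successful prepare to the caller
        }
        return null; // assumed response of a successful prepare
    }

    /** Proposed behaviour: surface the rollback as an exception. */
    static Object prepareThrowing(String originator, Set<String> view) {
        if (!view.contains(originator)) {
            // The remote transaction would be rolled back here.
            throw new OriginatorMissingException(
                    "Originator " + originator + " is not in the view, transaction rolled back");
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> view = Set.of("edg-perf01", "edg-perf03");
        System.out.println("null variant: " + prepareReturningNull("edg-perf02", view));
        try {
            prepareThrowing("edg-perf02", view);
        } catch (OriginatorMissingException e) {
            System.out.println("throwing variant: " + e.getMessage());
        }
    }
}
```

With the throwing variant the originator receives a failure instead of a response it cannot tell apart from a successful prepare.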
However, while implementing this change I noticed another problem: {{ClusteredGetCommand}}
sometimes needs to acquire a remote lock, and it uses a {{LockControlCommand}} to do
that. The {{LockControlCommand}} is not properly initialized, so it also goes
through the "originator is missing" branch, effectively undoing the lock. The
same problem appears when the replication queue is enabled in a DIST_ASYNC/REPL_ASYNC
cache: the commands are replicated inside a {{MultipleRpcCommand}}, and their origin is
never set.
Correction: Async replication uses 1-phase prepare commands, so the rollback doesn't
actually break anything there. The only remaining problem is with
{{cache.withFlags(FORCE_LOCK).get(k)}}.
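The nested-command problem can be sketched in the same hedged way. The model below assumes that the "originator is missing" check treats a command whose origin was never set the same as one whose originator has left the view. The {{Command}} class and both helper methods are hypothetical, not Infinispan types.

```java
import java.util.Set;

// Hypothetical model of the origin check; NOT Infinispan code.
final class OriginCheckSketch {

    /** Minimal stand-in for a replicable command that carries its origin. */
    static final class Command {
        final String name;
        String origin; // stays null unless someone sets it

        Command(String name) {
            this.name = name;
        }
    }

    /** Assumed check: a null origin is indistinguishable from a departed originator. */
    static boolean originatorMissing(Command command, Set<String> view) {
        return command.origin == null || !view.contains(command.origin);
    }

    /** Builds the nested lock command, optionally copying the outer command's origin. */
    static Command nestedLockCommand(Command outer, boolean propagateOrigin) {
        Command lock = new Command("LockControlCommand");
        if (propagateOrigin) {
            lock.origin = outer.origin;
        }
        return lock;
    }

    public static void main(String[] args) {
        Set<String> view = Set.of("edg-perf01", "edg-perf02", "edg-perf03");
        Command get = new Command("ClusteredGetCommand");
        get.origin = "edg-perf02";

        Command uninitialized = nestedLockCommand(get, false);
        Command initialized = nestedLockCommand(get, true);

        System.out.println("origin not set, treated as missing: "
                + originatorMissing(uninitialized, view));
        System.out.println("origin propagated, treated as missing: "
                + originatorMissing(initialized, view));
    }
}
```

In this model the lock command without an origin falls into the rollback branch even though its real originator is still a member of the view.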
Transaction rolled back but returns successful response
-------------------------------------------------------
Key: ISPN-5158
URL: https://issues.jboss.org/browse/ISPN-5158
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 7.1.0.Beta1
Reporter: Radim Vansa
Assignee: Dan Berindei
Priority: Critical
Attachments: tx.txt, views.txt
When the cluster is merging, it is possible that a node is removed from the view although
it is still responsive. Eventually the cluster is merged correctly, but since the node is
reported as missing from the view, a transaction originating from this node is rolled back.
{code}
10:01:36,116 TRACE [org.infinispan.interceptors.TxInterceptor] (remote-thread-151)
Rolling back remote transaction GlobalTransaction:<edg-perf02-39415>:28106:remote
because either already completed(false) or originator no longer in the cluster(true).
{code}
However, even after the rollback, a successful response is sent to the originator:
{code}
10:01:36,119 TRACE [org.infinispan.remoting.InboundInvocationHandlerImpl]
(remote-thread-151) About to send back response null for command PrepareCommand
{modifications=[PutKeyValueCommand{key=key_0000000000001318, value=[19 #1: 1195, ],
flags=[SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP], putIfAbsent=false, valueMatcher=MATCH_ALWAYS,
metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false,
gtx=GlobalTransaction:<edg-perf02-39415>:28106:remote,
cacheName='testCache', topologyId=47}
{code}
The originator then assumes that the transaction was successfully prepared and commits it:
{code}
10:01:36,124 TRACE [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher]
(DefaultStressor-9) Responses: [sender=edg-perf01-36235, received=true, suspected=false]
[sender=edg-perf03-24110, received=true, suspected=false]
10:01:36,135 TRACE [org.infinispan.transaction.TransactionCoordinator]
(DefaultStressor-9) Committing transaction
GlobalTransaction:<edg-perf02-39415>:28106:local
{code}