[infinispan-issues] [JBoss JIRA] (ISPN-2410) A PrepareCommand forwarded back to the originator can time out waiting on a key already locked by itself
Mircea Markus (JIRA)
jira-events at lists.jboss.org
Wed Oct 31 07:37:01 EDT 2012
[ https://issues.jboss.org/browse/ISPN-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730357#comment-12730357 ]
Mircea Markus commented on ISPN-2410:
-------------------------------------
[~dan.berindei] re: renaming, they already inherit from AbstractCacheTransaction, so I think any logic that is common to both remote and local transactions can be moved there. If you implement it as I suggested instead, you might not need to do this.
> A PrepareCommand forwarded back to the originator can time out waiting on a key already locked by itself
> --------------------------------------------------------------------------------------------------------
>
> Key: ISPN-2410
> URL: https://issues.jboss.org/browse/ISPN-2410
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.0.Beta2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.0.CR1
>
>
> If a rebalance happens while a prepare command is executing on a remote node, and the originator has become an owner, it makes sense to forward the command back to the originator to lock the keys (or just add them to the backup locks list).
> However, we don't keep the old consistent hashes around, so we don't know if the originator became an owner after invoking the remote command or was already an owner. So if the topology changed, we always forward the prepare back to the originator.
> Back on the originator, minTxTopologyId < currentTopologyId, so the prepare command has to wait for all the backup locks from pending transactions to be released. The problem is that we wait for the current transaction as well, causing a deadlock.
> Seen in OnePhaseXATest:
> {noformat}
> 18:07:46,873 TRACE (testng-OnePhaseXATest:TestCache) [RpcManagerImpl] NodeA-46125 broadcasting call PrepareCommand {modifications=[PutKeyValueCommand{key=key0, value=value, flags=null, putIfAbsent=false, lifespanMillis=-1, maxIdleTimeMillis=-1}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-46125>:4353:local, cacheName='TestCache', topologyId=-1} to recipient list null
> 18:07:46,873 DEBUG (transport-thread-2,NodeA:TestCache) [LocalTopologyManagerImpl] Updating local consistent hash(es) for cache TestCache: new topology = CacheTopology{id=2, currentCH=ReplicatedConsistentHash{members=[NodeA-46125, NodeB-49450]}, pendingCH=null}
> 18:07:46,894 TRACE (OOB-1,ISPN,NodeB-49450:TestCache) [StateTransferManagerImpl] Forwarding command PrepareCommand {modifications=[PutKeyValueCommand{key=key0, value=value, flags=null, putIfAbsent=false, lifespanMillis=-1, maxIdleTimeMillis=-1}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-46125>:4353:remote, cacheName='TestCache', topologyId=2} to new targets [NodeA-46125]
> 18:07:46,935 TRACE (OOB-3,ISPN,NodeA-46125:TestCache) [StateTransferInterceptor] handleTopologyAffectedCommand for command PrepareCommand {modifications=[PutKeyValueCommand{key=key0, value=value, flags=null, putIfAbsent=false, lifespanMillis=-1, maxIdleTimeMillis=-1}], onePhaseCommit=false, gtx=GlobalTransaction:<NodeA-46125>:4353:remote, cacheName='TestCache', topologyId=2}, originLocal=false
> 18:07:46,935 TRACE (OOB-3,ISPN,NodeA-46125:TestCache) [AbstractCacheTransaction] Transaction gtx=GlobalTransaction:<NodeA-46125>:4353:local potentially locks key key0? true
> 18:08:16,874 TRACE (testng-OnePhaseXATest:TestCache) [RpcManagerImpl] replication exception:
> org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to NodeB-49450
> {noformat}
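
The self-wait described in the issue can be illustrated with a minimal Java sketch. It is not Infinispan's actual code and not necessarily the fix that went into ISPN-2410: the Tx class, the transactionsToWaitFor method, and the string transaction ids are all hypothetical stand-ins. The only idea shown is that, when a forwarded prepare collects the pending transactions whose backup locks it must wait for, it should skip the entry that represents the same global transaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are Infinispan's real API.
final class PendingTxFilterSketch {

    // Stand-in for a transaction tracked on a node (assumption, not a real class).
    static final class Tx {
        final String globalTxId;        // same id for the :local and :remote views
        final int topologyId;           // topology in which the tx was started
        final Set<?> backupLockedKeys;  // keys the tx potentially locks

        Tx(String globalTxId, int topologyId, Set<?> backupLockedKeys) {
            this.globalTxId = globalTxId;
            this.topologyId = topologyId;
            this.backupLockedKeys = backupLockedKeys;
        }
    }

    // Returns the pending transactions a prepare for currentTxId must wait on:
    // started in an older topology, holding a backup lock on the key,
    // and not the current transaction itself.
    static List<Tx> transactionsToWaitFor(List<Tx> pending, String currentTxId,
                                          int currentTopologyId, Object key) {
        List<Tx> result = new ArrayList<>();
        for (Tx tx : pending) {
            if (tx.globalTxId.equals(currentTxId)) {
                continue; // waiting on our own backup lock would never finish
            }
            if (tx.topologyId < currentTopologyId && tx.backupLockedKeys.contains(key)) {
                result.add(tx);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tx> pending = Arrays.asList(
                new Tx("NodeA:4353", 1, Collections.singleton("key0")),
                new Tx("NodeA:4352", 1, Collections.singleton("key0")));
        // The prepare for NodeA:4353 was forwarded back after the topology moved to 2.
        for (Tx tx : transactionsToWaitFor(pending, "NodeA:4353", 2, "key0")) {
            System.out.println("must wait for " + tx.globalTxId);
        }
    }
}
```

Without the first check in the loop, the list for the forwarded prepare would contain its own transaction, which matches the "potentially locks key key0? true" line in the log above followed by the timeout.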
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira