[infinispan-issues] [JBoss JIRA] (ISPN-2778) When a cache is restarted, the LEAVE and JOIN commands are not ordered

Dan Berindei (JIRA) jira-events at lists.jboss.org
Thu Jan 31 07:04:51 EST 2013


     [ https://issues.jboss.org/browse/ISPN-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Berindei updated ISPN-2778:
-------------------------------

              Status: Pull Request Sent  (was: Open)
    Git Pull Request: https://github.com/infinispan/infinispan/pull/1630


Make the LEAVE command synchronous.
                
> When a cache is restarted, the LEAVE and JOIN commands are not ordered
> ----------------------------------------------------------------------
>
>                 Key: ISPN-2778
>                 URL: https://issues.jboss.org/browse/ISPN-2778
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State transfer
>    Affects Versions: 5.2.0.CR3
>            Reporter: Dan Berindei
>            Assignee: Dan Berindei
>             Fix For: 5.2.0.Final
>
>
> The LEAVE command is sent asynchronously, so if the cache is restarted it is possible for the new JOIN command to be processed before the LEAVE command on the coordinator.
> This breaks the restart: because the joining node is still present in the consistent hash when its JOIN is processed, it does not perform any state transfer. Afterwards, it receives a topology update in which it has been removed from the consistent hash.
> I have seen one failure because of this in {{StateTransferFunctionalTest.testInitialStateTransferAfterRestart}}:
> {noformat}
> 03:25:36,749 TRACE (testng-StateTransferFunctionalTest:) [JGroupsTransport] dests=[NodeG-42396], command=CacheTopologyControlCommand{cache=nbst, type=LEAVE, sender=NodeH-44562, joinInfo=null, topologyId=0, currentCH=null, pendingCH=null, throwable=null, viewId=1}, mode=ASYNCHRONOUS_WITH_SYNC_MARSHALLING, timeout=0
> 03:25:36,770 TRACE (testng-StateTransferFunctionalTest:) [JGroupsTransport] dests=[NodeG-42396], command=CacheTopologyControlCommand{cache=nbst, type=JOIN, sender=NodeH-44562, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.ReplicatedConsistentHashFactory@335703e5, hashFunction=org.infinispan.commons.hash.MurmurHash3@64b6f0a5, numSegments=60, numOwners=2, timeout=240000}, topologyId=0, currentCH=null, pendingCH=null, throwable=null, viewId=1}, mode=SYNCHRONOUS, timeout=240000
> 03:25:36,771 TRACE (OOB-1,ISPN,NodeG-42396:) [CommandAwareRpcDispatcher] Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=nbst, type=JOIN, sender=NodeH-44562, joinInfo=CacheJoinInfo{consistentHashFactory=org.infinispan.distribution.ch.ReplicatedConsistentHashFactory@3aea6b42, hashFunction=org.infinispan.commons.hash.MurmurHash3@7427d845, numSegments=60, numOwners=2, timeout=240000}, topologyId=0, currentCH=null, pendingCH=null, throwable=null, viewId=1} [sender=NodeH-44562]
> 03:25:36,771 TRACE (testng-StateTransferFunctionalTest:) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=2, currentCH=ReplicatedConsistentHash{members=[NodeG-42396, NodeH-44562]}, pendingCH=null} on cache nbst
> 03:25:36,782 TRACE (OOB-2,ISPN,NodeG-42396:) [CommandAwareRpcDispatcher] Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=nbst, type=LEAVE, sender=NodeH-44562, joinInfo=null, topologyId=0, currentCH=null, pendingCH=null, throwable=null, viewId=1} [sender=NodeH-44562]
> 03:25:36,840 TRACE (OOB-2,ISPN,NodeG-42396:nbst nbst) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=3, currentCH=ReplicatedConsistentHash{members=[NodeG-42396]}, pendingCH=null} on cache nbst
> 03:25:36,852 TRACE (OOB-2,ISPN,NodeH-44562:nbst) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=3, currentCH=ReplicatedConsistentHash{members=[NodeG-42396]}, pendingCH=null} on cache nbst
> {noformat}
> The solution is to make the LEAVE command synchronous.
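
To illustrate why the ordering matters, here is a minimal toy model of the coordinator's membership handling. It is only a sketch: the `Coordinator` class and its `handleJoin`/`handleLeave` methods are hypothetical stand-ins, not Infinispan APIs, and the node names are shortened from the log above. It shows that processing JOIN before LEAVE leaves the restarted node out of the membership, while the intended order keeps it in.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these classes are hypothetical and are not Infinispan APIs.
class LeaveJoinOrdering {

    /** Minimal stand-in for the coordinator's view of one cache's members. */
    static final class Coordinator {
        private final Set<String> members = new LinkedHashSet<>();

        Coordinator(String... initialMembers) {
            for (String m : initialMembers) {
                members.add(m);
            }
        }

        /**
         * Returns true if the node was not yet a member, which in this model
         * stands for "a state transfer to the joiner would be started".
         */
        boolean handleJoin(String node) {
            return members.add(node);
        }

        /** Removes the node from the membership if it is present. */
        void handleLeave(String node) {
            members.remove(node);
        }

        List<String> members() {
            return new ArrayList<>(members);
        }
    }

    /** Restart where the asynchronous LEAVE is overtaken by the new JOIN. */
    static List<String> restartWithReorderedCommands(Coordinator c, String node) {
        c.handleJoin(node);   // no effect: the node is still listed as a member
        c.handleLeave(node);  // the late LEAVE now removes the restarted node
        return c.members();
    }

    /** Restart where LEAVE is fully processed before JOIN is sent. */
    static List<String> restartWithOrderedCommands(Coordinator c, String node) {
        c.handleLeave(node);
        c.handleJoin(node);   // the node is added again as a new member
        return c.members();
    }

    public static void main(String[] args) {
        // prints [NodeG]
        System.out.println(restartWithReorderedCommands(new Coordinator("NodeG", "NodeH"), "NodeH"));
        // prints [NodeG, NodeH]
        System.out.println(restartWithOrderedCommands(new Coordinator("NodeG", "NodeH"), "NodeH"));
    }
}
```

In this model, a synchronous LEAVE corresponds to `restartWithOrderedCommands`: the leaver waits for the coordinator to process LEAVE before the restarted cache sends JOIN.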
