[infinispan-issues] [JBoss JIRA] (ISPN-4575) Map/Reduce incorrect results with a non-shared non-tx intermediate cache

Thu Jul 31 02:44:30 EDT 2014

    [ https://issues.jboss.org/browse/ISPN-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989366#comment-12989366 ] 

Dan Berindei commented on ISPN-4575:
------------------------------------

We don't have a good way to do that; the closest we have in the code base is {{TestingUtil.waitForRehashToComplete}}.

We don't have to do everything that method does, though. We just have to wait for the cache to have the expected number of members and for the pending CH to be null (meaning the rebalance is over and the cluster is stable).

{code}
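// Assume StateTransferManager is injected as stm.
// Keep polling while either condition is unmet: wrong member count or a rebalance still pending.
while (stm.getCacheTopology().getMembers().size() != expectedSize || stm.getCacheTopology().getPendingCH() != null) {
   Thread.sleep(50);
}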
{code}
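
A polling loop like this has no upper bound, so a member that never joins would block the caller forever. A minimal sketch of a bounded variant follows, assuming a hypothetical helper class {{TopologyWait}} (not part of Infinispan) that takes the stability check as a {{BooleanSupplier}}. The topology check itself appears only in a comment, since it depends on the injected {{stm}}.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not an Infinispan class: polls a condition with a timeout.
public class TopologyWait {

    /**
     * Polls until the condition reports true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean waitFor(BooleanSupplier stable, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!stable.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the condition held
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // With an injected StateTransferManager stm, the condition could be written as:
        //   () -> stm.getCacheTopology().getMembers().size() == expectedSize
        //         && stm.getCacheTopology().getPendingCH() == null
        boolean ok = waitFor(() -> true, 10_000, 50);
        System.out.println("stable: " + ok);
    }
}
```

The caller decides what to do on a {{false}} result, for example failing the task instead of hanging.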

> Map/Reduce incorrect results with a non-shared non-tx intermediate cache
> ------------------------------------------------------------------------
>
>                 Key: ISPN-4575
>                 URL: https://issues.jboss.org/browse/ISPN-4575
>             Project: Infinispan
>          Issue Type: Bug
>      Security Level: Public(Everyone can see) 
>          Components: Core, Distributed Execution and Map/Reduce
>    Affects Versions: 7.0.0.Alpha5
>            Reporter: Dan Berindei
>            Assignee: Vladimir Blagojevic
>            Priority: Blocker
>              Labels: testsuite_stability
>             Fix For: 7.0.0.Beta1
>
>
> In a non-tx cache, if a command is started with topology id {{T}}, and when it is replicated on another node the distribution interceptor sees topology {{T+1}}, it throws an {{OutdatedTopologyException}}. The originator of the command will then retry the command, setting topology {{T+1}}.
> When this happens with a {{PutKeyValueCommand(k, MapReduceManagerImpl.DeltaAwareList)}}, it can lead to duplicate intermediate values.
> Say _A_ is the primary owner of {{k}} in {{T}}, _B_ is a backup owner both in {{T}} and {{T+1}}, and _C_ is the backup owner in {{T}} and the primary owner in {{T+1}} (i.e. _C_ just joined and a rebalance is in progress during {{T}} - see {{NonTxBackupOwnerBecomingPrimaryOwnerTest}}).
> _A_ starts the {{PutKeyValueCommand}} and replicates it to _B_ and _C_. _C_ applies the command, but _B_ already has topology {{T+1}} and throws an {{OutdatedTopologyException}}. _A_ installs topology {{T+1}}, sends the command to _C_ (as the new primary owner), which replicates it to _B_ and then applies it locally a second time.
> This scenario can happen during a M/R task even without nodes joining or leaving. That's because {{CreateCacheCommand}} only calls {{getCache()}} on each member, it doesn't wait for the cache to have a certain number of members or for state transfer to be complete for all the members. The last member to join the intermediate cache is guaranteed to have topology {{T+1}}, but the others may have topology {{T}} by the time the combine phase starts inserting values in the intermediate cache.
> I have seen the {{OutdatedTopologyException}} happen pretty often during the test suite, especially after I removed the duplicate {{invokeRemotely}} call in {{MapReduceTask.executeTaskInit()}}. Most of them were harmless, but there was one failure in CI: http://ci.infinispan.org/viewLog.html?buildId=9811&tab=buildResultsDiv&buildTypeId=bt8
> A short-term fix would be to wait for all the members to finish joining in {{CreateCacheCommand}}. Long-term, M/R tasks should be resilient to topology changes, so we should investigate making {{PutKeyValue(k, DeltaAwareList)}} handle an {{OutdatedTopologyException}}.
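
The duplicate described in the issue can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the {{Node}} and {{OutdatedTopology}} classes are hypothetical stand-ins, not Infinispan code, and the write is modelled as a plain list append, which is what makes the retry non-idempotent.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the retry scenario; none of these classes exist in Infinispan.
public class DuplicateAppendDemo {

    static class OutdatedTopology extends RuntimeException {}

    static class Node {
        int topologyId;
        final List<String> values = new ArrayList<>();

        Node(int topologyId) {
            this.topologyId = topologyId;
        }

        // Rejects commands from an older topology; otherwise appends the value.
        void append(String value, int commandTopologyId) {
            if (commandTopologyId < topologyId) {
                throw new OutdatedTopology();
            }
            topologyId = commandTopologyId; // install the newer topology if needed
            values.add(value);
        }
    }

    // Returns {copies on B, copies on C} after one logical write of "v".
    static int[] simulate() {
        int t = 1;
        Node b = new Node(t + 1); // B already installed T+1
        Node c = new Node(t);     // C is still on T

        try {
            // A (primary in T) replicates to its backups C and B.
            c.append("v", t);     // C applies the command
            b.append("v", t);     // B rejects it: outdated topology
        } catch (OutdatedTopology e) {
            // A retries in T+1, where C is the primary owner.
            b.append("v", t + 1); // C replicates to B
            c.append("v", t + 1); // C applies locally a second time
        }
        return new int[] { b.values.size(), c.values.size() };
    }

    public static void main(String[] args) {
        int[] counts = simulate();
        System.out.println("B copies: " + counts[0] + ", C copies: " + counts[1]);
    }
}
```

In this model B ends up with one copy of the value and C with two, which matches the duplicate intermediate value described above.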

--
This message was sent by Atlassian JIRA
(v6.2.6#6264)