Dan Berindei commented on ISPN-4575:
------------------------------------
We don't have a good way to do that; the closest we have in the code base is
{{TestingUtil.waitForRehashToComplete}}.
We don't need all of that here, though: we just have to wait for the cache to have
the expected number of members and for the pending CH to be null (meaning the rebalance is
over and the cluster is stable).
{code}
// assume StateTransferManager is injected as stm
// keep polling while the member list is wrong OR a rebalance is still pending
while (stm.getCacheTopology().getMembers().size() != expectedSize ||
       stm.getCacheTopology().getPendingCH() != null) {
   Thread.sleep(50);
}
{code}
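In a test it would also make sense to bound the wait, so a stuck rebalance fails the test
instead of hanging it. A minimal sketch of such a helper (the name {{waitForStableTopology}}
and the timeout handling are made up here, not existing API):
{code}
// Hypothetical test helper: poll until the cache has the expected number of
// members and no rebalance is pending, or fail after a timeout.
public static void waitForStableTopology(StateTransferManager stm, int expectedSize,
                                         long timeoutMillis) throws InterruptedException {
   long deadline = System.currentTimeMillis() + timeoutMillis;
   while (stm.getCacheTopology().getMembers().size() != expectedSize ||
          stm.getCacheTopology().getPendingCH() != null) {
      if (System.currentTimeMillis() > deadline) {
         throw new IllegalStateException("Topology did not stabilize within " + timeoutMillis + " ms");
      }
      Thread.sleep(50);
   }
}
{code}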
Map/Reduce incorrect results with a non-shared non-tx intermediate cache
------------------------------------------------------------------------
Key: ISPN-4575
URL: https://issues.jboss.org/browse/ISPN-4575
Project: Infinispan
Issue Type: Bug
Security Level: Public(Everyone can see)
Components: Core, Distributed Execution and Map/Reduce
Affects Versions: 7.0.0.Alpha5
Reporter: Dan Berindei
Assignee: Vladimir Blagojevic
Priority: Blocker
Labels: testsuite_stability
Fix For: 7.0.0.Beta1
In a non-tx cache, if a command is started with topology id {{T}} and the distribution
interceptor on another node sees topology {{T+1}} when the command is replicated there, it
throws an {{OutdatedTopologyException}}. The originator of the command then retries the
command with topology {{T+1}}.
When this happens with a {{PutKeyValueCommand(k, MapReduceManagerImpl.DeltaAwareList)}},
it can lead to duplicate intermediate values.
Say _A_ is the primary owner of {{k}} in {{T}}, _B_ is a backup owner both in {{T}} and
{{T+1}}, and _C_ is the backup owner in {{T}} and the primary owner in {{T+1}} (i.e. _C_
just joined and a rebalance is in progress during {{T}} - see
{{NonTxBackupOwnerBecomingPrimaryOwnerTest}}).
_A_ starts the {{PutKeyValueCommand}} and replicates it to _B_ and _C_. _C_ applies the
command, but _B_ already has topology {{T+1}} and throws an {{OutdatedTopologyException}}.
_A_ installs topology {{T+1}}, sends the command to _C_ (as the new primary owner), which
replicates it to _B_ and then applies it locally a second time.
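The retry is harmful because the delta put is not idempotent: each application appends the
delta's values to the stored list, so applying it twice on _C_ duplicates the intermediate
values. A rough illustration of that merge behaviour (a simplified stand-in, not the actual
{{MapReduceManagerImpl.DeltaAwareList}} code):
{code}
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the intermediate-value delta: merging it into the
// stored list appends its values, so a retried put adds the same values twice.
class AppendingListDelta<V> {
   private final List<V> newValues;

   AppendingListDelta(List<V> newValues) {
      this.newValues = newValues;
   }

   List<V> mergeInto(List<V> stored) {
      List<V> merged = stored == null ? new ArrayList<V>() : new ArrayList<V>(stored);
      merged.addAll(newValues); // not idempotent: re-applying duplicates the values
      return merged;
   }
}
{code}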
This scenario can happen during an M/R task even without nodes joining or leaving.
That's because {{CreateCacheCommand}} only calls {{getCache()}} on each member; it
doesn't wait for the cache to have a certain number of members or for state transfer
to complete on all the members. The last member to join the intermediate cache is
guaranteed to have topology {{T+1}}, but the others may still have topology {{T}} by the
time the combine phase starts inserting values in the intermediate cache.
I have seen the {{OutdatedTopologyException}} happen pretty often during the test suite,
especially after I removed the duplicate {{invokeRemotely}} call in
{{MapReduceTask.executeTaskInit()}}. Most of them were harmless, but there was one failure
in CI:
http://ci.infinispan.org/viewLog.html?buildId=9811&tab=buildResultsDi...
A short-term fix would be to wait for all the members to finish joining in
{{CreateCacheCommand}}. Long-term, M/R tasks should be resilient to topology changes, so
we should investigate making {{PutKeyValueCommand(k, DeltaAwareList)}} handle
{{OutdatedTopologyException}}s.
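For the short-term fix, the waiting could reuse the same condition as the test snippet
above, roughly along these lines inside {{CreateCacheCommand}} (a sketch only:
{{expectedMembers}} would have to be carried by the command, and {{cacheManager}},
{{intermediateCacheName}}, the timeout and the exception handling are illustrative):
{code}
// Sketch: after starting the intermediate cache, wait until it has all the
// expected members and no rebalance is pending before the command returns.
// cacheManager, intermediateCacheName and expectedMembers are assumed to be
// available on the command.
Cache<?, ?> cache = cacheManager.getCache(intermediateCacheName);
StateTransferManager stm = cache.getAdvancedCache().getComponentRegistry()
      .getComponent(StateTransferManager.class);
long deadline = System.currentTimeMillis() + TimeUnit.MINUTES.toMillis(1);
while (stm.getCacheTopology().getMembers().size() != expectedMembers ||
       stm.getCacheTopology().getPendingCH() != null) {
   if (System.currentTimeMillis() > deadline) {
      throw new CacheException("Intermediate cache did not reach " + expectedMembers + " members");
   }
   Thread.sleep(50);
}
{code}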