[JBoss JIRA] (ISPN-4460) Map-Reduce: Mapper sometimes receives null value
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4460?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4460:
-----------------------------------------------
Dave Stahl <dstahl(a)redhat.com> changed the Status of [bug 1116979|https://bugzilla.redhat.com/show_bug.cgi?id=1116979] from MODIFIED to ON_QA
> Map-Reduce: Mapper sometimes receives null value
> ------------------------------------------------
>
> Key: ISPN-4460
> URL: https://issues.jboss.org/browse/ISPN-4460
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Affects Versions: 6.0.2.Final
> Reporter: Rich DiCroce
> Assignee: Vladimir Blagojevic
> Fix For: 7.0.0.Alpha5
>
>
> I have a Mapper with the following map method:
> {code}
> public void map(EndpointAddress key, EndpointInfo value, Collector<Address, Integer> collector) {
>     // TODO debugging, remove this
>     if (value == null) {
>         System.out.println("value is null! WTF");
>     }
>     if (collector == null) {
>         System.out.println("collector is null! OMGWTFBBQ");
>     }
>     collector.emit(value.getConnectedGP(), 1);
> }
> {code}
> Null checks were added because I am sometimes seeing a NullPointerException on the last line. Console output is below. I cannot reliably reproduce this problem. It's clearly a race condition of some kind. The cache that is being queried has keys being added/removed all the time.
> {noformat}
> 14:05:30,020 INFO [stdout] (transport-thread-18) value is null! WTF
> 14:05:30,022 ERROR [com.sg.song.nms.ispn.DataGatherer] (EJB default - 3) GP table column query failed: java.util.concurrent.ExecutionException: org.infinispan.commons.CacheException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapReduceTaskFuture.get(MapReduceTask.java:762) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at com.sg.song.nms.ispn.DataGatherer.queryCurrentGPStatistics(DataGatherer.java:116) [classes:]
> at sun.reflect.GeneratedMethodAccessor41.invoke(Unknown Source) [:1.7.0_45]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_45]
> at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_45]
> at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53)
> at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:61)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:407)
> at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:82) [wildfly-weld-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:95) [wildfly-weld-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:61)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53)
> at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:61)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ejb3.tx.EjbBMTInterceptor.handleInvocation(EjbBMTInterceptor.java:104) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.as.ejb3.tx.BMTInterceptor.processInvocation(BMTInterceptor.java:56) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:407)
> at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:55) [weld-core-impl-2.1.2.Final.jar:2014-01-09 09:23]
> at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:83) [wildfly-weld-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
> at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:53)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:52) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:95) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.as.ejb3.component.interceptors.AdditionalSetupInterceptor.processInvocation(AdditionalSetupInterceptor.java:55) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:64)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:326)
> at org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:448)
> at org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:61)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:326)
> at org.jboss.invocation.PrivilegedWithCombinerInterceptor.processInvocation(PrivilegedWithCombinerInterceptor.java:80)
> at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:309)
> at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
> at org.jboss.as.ejb3.timerservice.TimedObjectInvokerImpl.callTimeout(TimedObjectInvokerImpl.java:104) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.as.ejb3.timerservice.task.CalendarTimerTask.callTimeout(CalendarTimerTask.java:61) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at org.jboss.as.ejb3.timerservice.task.TimerTask.run(TimerTask.java:168) [wildfly-ejb3-8.1.0.Final.jar:8.1.0.Final]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_45]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_45]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_45]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_45]
> at org.jboss.threads.JBossThread.run(JBossThread.java:122)
> Caused by: org.infinispan.commons.CacheException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:348) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:634) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$3.call(MapReduceTask.java:652) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapReduceTaskFuture.get(MapReduceTask.java:760) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> ... 62 more
> Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) [rt.jar:1.7.0_45]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) [rt.jar:1.7.0_45]
> at org.infinispan.distexec.mapreduce.MapReduceTask$TaskPart.get(MapReduceTask.java:845) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhase(MapReduceTask.java:439) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:342) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> ... 65 more
> Caused by: java.lang.NullPointerException
> at com.sgi.song.gp.protocol.SONGv1.cluster.query.RegistrationsByGPMapper.map(RegistrationsByGPMapper.java:26) [gp-ispn-shared-1.0.0-SNAPSHOT.jar:]
> at com.sgi.song.gp.protocol.SONGv1.cluster.query.RegistrationsByGPMapper.map(RegistrationsByGPMapper.java:1) [gp-ispn-shared-1.0.0-SNAPSHOT.jar:]
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.map(MapReduceManagerImpl.java:181) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForDistributedReduction(MapReduceManagerImpl.java:96) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.invokeMapCombineLocally(MapReduceTask.java:967) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.access$200(MapReduceTask.java:894) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$1.call(MapReduceTask.java:916) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$1.call(MapReduceTask.java:912) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_45]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_45]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_45]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_45]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_45]
> {noformat}
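A hedged workaround sketch, not a fix for the underlying race: the mapper can simply skip entries whose value arrives as null. The class name RegistrationsByGPMapperSafe is hypothetical; EndpointAddress, EndpointInfo, Address and getConnectedGP() are taken from the reporter's code above, and their imports are omitted here.
{code}
import org.infinispan.distexec.mapreduce.Collector;
import org.infinispan.distexec.mapreduce.Mapper;

// Hypothetical defensive variant of the reporter's mapper: it tolerates the
// spurious null value instead of throwing a NullPointerException.
// EndpointAddress, EndpointInfo and Address are the reporter's own types.
public class RegistrationsByGPMapperSafe implements Mapper<EndpointAddress, EndpointInfo, Address, Integer> {

    @Override
    public void map(EndpointAddress key, EndpointInfo value, Collector<Address, Integer> collector) {
        if (value == null) {
            // Skip the entry; a non-null value should have been supplied by the framework.
            return;
        }
        collector.emit(value.getConnectedGP(), 1);
    }
}
{code}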
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4453) MapReduceTask#executeAsynchronously() isn't asynchronous
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4453?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4453:
-----------------------------------------------
Dave Stahl <dstahl(a)redhat.com> changed the Status of [bug 1116979|https://bugzilla.redhat.com/show_bug.cgi?id=1116979] from MODIFIED to ON_QA
> MapReduceTask#executeAsynchronously() isn't asynchronous
> --------------------------------------------------------
>
> Key: ISPN-4453
> URL: https://issues.jboss.org/browse/ISPN-4453
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Affects Versions: 6.0.2.Final
> Reporter: Rich DiCroce
> Assignee: Vladimir Blagojevic
> Fix For: 7.0.0.Alpha5
>
>
> Quote from the linked forum thread:
> {quote}
> MapReduceTask#executeAsynchronously() doesn't actually do anything asynchronously. It just returns a MapReduceTaskFuture containing a Callable that calls execute(). The only place I see that Callable being called is in MapReduceTaskFuture#get().
>
> In other words, the task isn't actually started until you call Future#get(), which isn't asynchronous at all! On top of that, MapReduceTaskFuture extends AbstractInProcessFuture, which "implements" the get(long, TimeUnit) method by just calling get(), so MapReduceTaskFuture doesn't respect the timeout parameters and will just run until it completes.
> {quote}
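A hedged workaround sketch, assuming the caller owns an ExecutorService: running the blocking MapReduceTask.execute() on a separate thread yields a genuinely asynchronous Future whose get(long, TimeUnit) honours the timeout. The helper name executeInBackground is hypothetical; execute() is the blocking method shown in the stack traces above.
{code}
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

import org.infinispan.distexec.mapreduce.MapReduceTask;

public class AsyncMapReduceWorkaround {

    // Hypothetical helper: submit the blocking execute() to a caller-managed pool
    // so the returned Future actually runs in the background and respects timeouts.
    public static <KIn, VIn, KOut, VOut> Future<Map<KOut, VOut>> executeInBackground(
            final MapReduceTask<KIn, VIn, KOut, VOut> task, ExecutorService pool) {
        return pool.submit(new Callable<Map<KOut, VOut>>() {
            @Override
            public Map<KOut, VOut> call() {
                return task.execute();
            }
        });
    }

    // Usage sketch: waits at most 30 seconds instead of blocking indefinitely.
    // Map<Address, Integer> result =
    //     executeInBackground(task, pool).get(30, TimeUnit.SECONDS);
}
{code}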
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4154) Cancelled segment transfer causes future entry transfer to be ignored
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4154?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4154:
-----------------------------------------------
Dave Stahl <dstahl(a)redhat.com> changed the Status of [bug 1104045|https://bugzilla.redhat.com/show_bug.cgi?id=1104045] from MODIFIED to ON_QA
> Cancelled segment transfer causes future entry transfer to be ignored
> ---------------------------------------------------------------------
>
> Key: ISPN-4154
> URL: https://issues.jboss.org/browse/ISPN-4154
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: State Transfer
> Affects Versions: 7.0.0.Alpha1
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.0.0.Alpha5
>
>
> Distributed transactional cache.
> 1) Coordinator is gracefully leaving the cluster, sends a REBALANCE_START with topologyId 14, ST begins.
> 2) Node receives chunk from segment X, writes entry K=V to the container.
> 3) New coordinator jumps in with CH_UPDATE topology 16
> 4) Node receives CANCEL_STATE_TRANSFER and cancels transfer of segment X, invalidating K. In CommitManager, this operation is tracked and DiscardPolicy is set to DISCARD_STATE_TRANSFER for key K.
> 5) New coordinator starts rebalance with topology 17
> 6) Node starts new ST for segment X
> 7) Node receives the X: K=V, but in CommitManager it finds out that the policy is set to DISCARD_STATE_TRANSFER and ignores this update.
> Result: entry value is lost on some node.
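A toy model of the race described above, using hypothetical names rather than Infinispan's actual CommitManager API: once the cancelled transfer marks the key with DISCARD_STATE_TRANSFER, the write from the new state transfer in step 7 is silently dropped.
{code}
import java.util.HashMap;
import java.util.Map;

// Toy model with hypothetical names (not Infinispan's CommitManager) of the
// discard tracking described in steps 4-7 above.
public class DiscardTrackingModel {

    enum DiscardPolicy { NONE, DISCARD_STATE_TRANSFER }

    private final Map<String, DiscardPolicy> policyByKey = new HashMap<String, DiscardPolicy>();
    private final Map<String, String> container = new HashMap<String, String>();

    // Step 4: cancelling the transfer invalidates K and records the discard policy for it.
    public void cancelTransfer(String key) {
        container.remove(key);
        policyByKey.put(key, DiscardPolicy.DISCARD_STATE_TRANSFER);
    }

    // Step 7: the write from the new state transfer is ignored because the stale
    // policy recorded in step 4 was never cleared, so the entry value is lost.
    public void applyStateTransferWrite(String key, String value) {
        if (policyByKey.get(key) == DiscardPolicy.DISCARD_STATE_TRANSFER) {
            return;
        }
        container.put(key, value);
    }
}
{code}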
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4484) Outbound transfers can be cancelled by old CANCEL_STATE_TRANSFER command
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4484?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4484:
-----------------------------------------------
William Burns <wburns(a)redhat.com> changed the Status of [bug 1104045|https://bugzilla.redhat.com/show_bug.cgi?id=1104045] from ASSIGNED to MODIFIED
> Outbound transfers can be cancelled by old CANCEL_STATE_TRANSFER command
> ------------------------------------------------------------------------
>
> Key: ISPN-4484
> URL: https://issues.jboss.org/browse/ISPN-4484
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core, State Transfer
> Affects Versions: 6.0.2.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.0.0.Alpha5
>
>
> This appeared during the 32-node elasticity test in the Hyperion environment.
> Just as apex947 left, it started a rebalance, which apex948 dutifully cancelled as it became the new coordinator. apex949 had already requested segments from apex959, so it sent a StateRequestCommand(CANCEL_STATE_TRANSFER) asynchronously to apex959. Then apex948 started a new rebalance, and apex949 asked apex959 for the same segments. When apex959 finally received the cancel request, it didn't check the topology id and it incorrectly cancelled the outbound transfer to apex949.
> The solution would be to verify the topology id in the CANCEL_STATE_TRANSFER command before cancelling the transfer. I also think we can avoid sending the cancel command completely in this case, and only send it as we are about to stop.
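A minimal sketch of the proposed check, with hypothetical names rather than the actual Infinispan fix: the receiver compares the topology id carried by the CANCEL_STATE_TRANSFER command against the topology id of its current outbound transfer and ignores stale cancels.
{code}
// Hypothetical model of the proposed guard: a cancel command from an older
// rebalance (lower topology id) must not cancel a transfer that belongs to a
// newer rebalance.
public class OutboundTransferCancelGuard {

    private final int transferTopologyId;

    public OutboundTransferCancelGuard(int transferTopologyId) {
        this.transferTopologyId = transferTopologyId;
    }

    public boolean shouldCancel(int cancelCommandTopologyId) {
        return cancelCommandTopologyId >= transferTopologyId;
    }
}
{code}
In the scenario above, the delayed cancel from apex949 would carry the older topology id and would therefore be ignored by apex959.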
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4154) Cancelled segment transfer causes future entry transfer to be ignored
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4154?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4154:
-----------------------------------------------
William Burns <wburns(a)redhat.com> changed the Status of [bug 1104045|https://bugzilla.redhat.com/show_bug.cgi?id=1104045] from ASSIGNED to MODIFIED
> Cancelled segment transfer causes future entry transfer to be ignored
> ---------------------------------------------------------------------
>
> Key: ISPN-4154
> URL: https://issues.jboss.org/browse/ISPN-4154
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: State Transfer
> Affects Versions: 7.0.0.Alpha1
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.0.0.Alpha5
>
>
> Distributed transactional cache.
> 1) Coordinator is gracefully leaving the cluster, sends a REBALANCE_START with topologyId 14, ST begins.
> 2) Node receives chunk from segment X, writes entry K=V to the container.
> 3) New coordinator jumps in with CH_UPDATE topology 16
> 4) Node receives CANCEL_STATE_TRANSFER and cancels transfer of segment X, invalidating K. In CommitManager, this operation is tracked and DiscardPolicy is set to DISCARD_STATE_TRANSFER for key K.
> 5) New coordinator starts rebalance with topology 17
> 6) Node starts new ST for segment X
> 7) Node receives the X: K=V, but in CommitManager it finds out that the policy is set to DISCARD_STATE_TRANSFER and ignores this update.
> Result: entry value is lost on some node.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4480) Messages sent to leavers can clog the JGroups bundler thread
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4480?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4480:
-----------------------------------------------
William Burns <wburns(a)redhat.com> changed the Status of [bug 1104045|https://bugzilla.redhat.com/show_bug.cgi?id=1104045] from ASSIGNED to MODIFIED
> Messages sent to leavers can clog the JGroups bundler thread
> ------------------------------------------------------------
>
> Key: ISPN-4480
> URL: https://issues.jboss.org/browse/ISPN-4480
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core
> Affects Versions: 6.0.2.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
>
> In a stress test that repeatedly kills nodes while performing read/write operations, the TransferQueueBundler thread seems to spend a lot of time waiting for physical addresses:
> {noformat}
> 06:40:10,316 WARN [org.radargun.utils.Utils] (pool-5-thread-1) Stack for thread TransferQueueBundler,default,apex953-14666:
> java.lang.Thread.sleep(Native Method)
> org.jgroups.util.Util.sleep(Util.java:1504)
> org.jgroups.util.Util.sleepRandom(Util.java:1574)
> org.jgroups.protocols.TP.sendToSingleMember(TP.java:1685)
> org.jgroups.protocols.TP.doSend(TP.java:1670)
> org.jgroups.protocols.TP$TransferQueueBundler.sendBundledMessages(TP.java:2476)
> org.jgroups.protocols.TP$TransferQueueBundler.sendMessages(TP.java:2392)
> org.jgroups.protocols.TP$TransferQueueBundler.run(TP.java:2383)
> java.lang.Thread.run(Thread.java:744)
> {noformat}
> There are two related bugs already fixed in JGroups 3.5.0.Beta2+: JGRP-1814 and JGRP-1815.
> There is also a special case where the physical address can be removed from the cache too soon, exacerbating the effect of JGRP-1815: JGRP-1858.
> We can work around the problem by changing the JGroups configuration (see the sketch after this list):
> * TP.logical_addr_cache_expiration=86400000
> ** Only expire addresses after 1 day
> * TP.physical_addr_max_fetch_attempts=1
> ** Sleep for only 20ms waiting for the physical address (default 3 - 1500ms)
> * UNICAST3_conn_close_timeout=10000
> ** Drop the pending messages to leavers sooner
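An illustrative JGroups configuration fragment, not taken from the report: the three attribute values come from the workaround listed above, while the choice of TCP as the transport and the omission of the rest of the protocol stack are assumptions.
{code:xml}
<!-- Fragment only: all other protocols and attributes are omitted. -->
<config xmlns="urn:org:jgroups">
    <TCP logical_addr_cache_expiration="86400000"
         physical_addr_max_fetch_attempts="1"/>
    <UNICAST3 conn_close_timeout="10000"/>
</config>
{code}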
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4490) Members can miss the rebalance cancellation on coordinator change
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4490?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-4490:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/2702
> Members can miss the rebalance cancellation on coordinator change
> -----------------------------------------------------------------
>
> Key: ISPN-4490
> URL: https://issues.jboss.org/browse/ISPN-4490
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core, State Transfer
> Affects Versions: 7.0.0.Alpha4
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 7.0.0.Alpha5
>
>
> The new coordinator first sends a CH_UPDATE command to cancel the existing rebalance, and then a REBALANCE_START command to start a new rebalance. But the CH_UPDATE command is sent asynchronously, so it's possible for some members to receive it after the REBALANCE_START command.
> If that happens, that node will assume that it will receive the segments it requested for the previous rebalance. But with the ISPN-4484 fix, the provider node cancels the outbound transfer tasks when receiving a CH_UPDATE without a pendingCH, so the state requestor will never receive its segments.
> Even without the ISPN-4484 fix this is a problem, although less obvious. Between the provider node receiving the CH_UPDATE and the REBALANCE_START commands, it won't have the requestor in its write CH, so the requestor can miss transactions.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4490) Members can miss the rebalance cancellation on coordinator change
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4490?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-4490:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1117948
> Members can miss the rebalance cancellation on coordinator change
> -----------------------------------------------------------------
>
> Key: ISPN-4490
> URL: https://issues.jboss.org/browse/ISPN-4490
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core, State Transfer
> Affects Versions: 7.0.0.Alpha4
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 7.0.0.Alpha5
>
>
> The new coordinator first sends a CH_UPDATE command to cancel the existing rebalance, and then a REBALANCE_START command to start a new rebalance. But the CH_UPDATE command is sent asynchronously, so it's possible for some members to receive it after the REBALANCE_START command.
> If that happens, that node will assume that it will receive the segments it requested for the previous rebalance. But with the ISPN-4484 fix, the provider node cancels the outbound transfer tasks when receiving a CH_UPDATE without a pendingCH, so the state requestor will never receive its segments.
> Even without the ISPN-4484 fix this is a problem, although less obvious. Between the provider node receiving the CH_UPDATE and the REBALANCE_START commands, it won't have the requestor in its write CH, so the requestor can miss transactions.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)