[JBoss JIRA] (ISPN-5030) NPE during node rebalance after a leave
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5030?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5030:
-----------------------------------------------
Radim Vansa <rvansa(a)redhat.com> changed the Status of [bug 1161214|https://bugzilla.redhat.com/show_bug.cgi?id=1161214] from ON_QA to VERIFIED
> NPE during node rebalance after a leave
> ---------------------------------------
>
> Key: ISPN-5030
> URL: https://issues.jboss.org/browse/ISPN-5030
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.1.0.Alpha1
> Reporter: Sanne Grinovero
> Assignee: Dan Berindei
> Priority: Minor
> Fix For: 7.1.0.Beta1
>
>
> I'm seeing stacktraces dumped to standard out during the normal test run of the infinispan-core module:
> {noformat}
> 2014-11-28 21:39:52,652 WARN [CacheTopologyControlCommand] (remote-thread-DistributedEntryRetrieverTest-NodeA-p610-t6) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=org.infinispan.iteration.DistributedEntryRetrieverTest, type=LEAVE, sender=DistributedEntryRetrieverTest-NodeC-12939, joinInfo=null, topologyId=0, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, actualMembers=null, throwable=null, viewId=2}
> java.lang.NullPointerException
> at org.infinispan.topology.ClusterCacheStatus.updateRebalanceMembers(ClusterCacheStatus.java:307)
> at org.infinispan.topology.ClusterCacheStatus.doLeave(ClusterCacheStatus.java:578)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleLeave(ClusterTopologyManagerImpl.java:148)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:164)
> at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:144)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher$4.run(CommandAwareRpcDispatcher.java:278)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This is the unmodified testsuite, in current master {{33c5f85}}. It doesn't seem to cause actual test failures.
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
11 years, 4 months
[JBoss JIRA] (ISPN-4949) Split brain: inconsistent data after merge
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4949?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-4949:
------------------------------------
Yes, exactly.
> Split brain: inconsistent data after merge
> ------------------------------------------
>
> Key: ISPN-4949
> URL: https://issues.jboss.org/browse/ISPN-4949
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer
> Affects Versions: 7.0.0.Final
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.1.0.Alpha1
>
>
> 1) cluster A, B, C, D splits into 2 parts:
> A, B (coord A) finds this out immediately and enters degraded mode with CH [A, B, C, D]
> C, D (coord D) first detects that B is lost, gets view A, C, D and starts rebalance with CH [A, C, D]. Segment X is primary owned by C (it had backup on B but this got lost)
> 2) D detects that A was lost as well, therefore enters degraded mode with CH [A, C, D]
> 3) C inserts entry into X: all owners (only C) is present, therefore the modification is allowed
> 4) cluster is merged and coordinator finds out that the max stable topology has CH [A, B, C, D] (it is the older of the two partitions' topologies, got from A, B) - logs 'No active or unavailable partitions, so all the partitions must be in degraded mode' (yes, all partitions are in degraded mode, but write has happened in the meantime)
> 5) The old CH is broadcast in newest topology, no rebalance happens
> 6) Inconsistency: read in X may miss the update
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
11 years, 4 months
[JBoss JIRA] (ISPN-3743) Silence "IllegalStateException: Default cache is in 'STOPPING' state" exceptions
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-3743?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-3743:
------------------------------------
I think the HotRod server is now handling the error and the client retries the operation automatically (ISPN-4717), but embedded users can still see it.
> Silence "IllegalStateException: Default cache is in 'STOPPING' state" exceptions
> --------------------------------------------------------------------------------
>
> Key: ISPN-3743
> URL: https://issues.jboss.org/browse/ISPN-3743
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 6.0.0.Final
> Reporter: Dan Berindei
> Labels: testsuite_stability
> Fix For: 7.1.0.Beta1
>
>
> When a cache in STOPPING state receives a command, the exception is propagated all the way to the caller:
> {noformat}
> 09:33:02,005 ERROR (testng-NonTxPutIfAbsentDuringLeaveStressTest:) [UnitTestTestNGListener] Test testNodeLeavingDuringPutIfAbsent(org.infinispan.distribution.rehash.NonTxPutIfAbsentDuringLeaveStressTest) failed.
> java.util.concurrent.ExecutionException: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from NodeC-25987, see cause for remote stack trace
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> Caused by: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from NodeC-25987, see cause for remote stack trace
> at org.infinispan.remoting.transport.AbstractTransport.checkResponse(AbstractTransport.java:41)
> Caused by: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from NodeD-57301, see cause for remote stack trace
> at org.infinispan.remoting.transport.AbstractTransport.checkResponse(AbstractTransport.java:41)
> Caused by: java.lang.IllegalStateException: Default cache is in 'STOPPING' state and this is an invocation not belonging to an on-going transaction, so it does not accept new invocations. Either restart it or recreate the cache container.
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:93)
> {noformat}
> This causes random failures in NonTxPutIfAbsentDuringLeaveStressTest.testNodeLeavingDuringPutIfAbsent.
> The originator should either ignore the IllegalStateException or wait for a new cache topology and retry the operation, just like it should for SuspectExceptions (ISPN-2577).
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
11 years, 4 months
[JBoss JIRA] (ISPN-5072) MapExternalizer no longer needs to extend InstanceReusingAdvancedExternalizer
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5072?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5072:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> MapExternalizer no longer needs to extend InstanceReusingAdvancedExternalizer
> -----------------------------------------------------------------------------
>
> Key: ISPN-5072
> URL: https://issues.jboss.org/browse/ISPN-5072
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 7.1.0.Alpha1
> Reporter: William Burns
> Assignee: William Burns
> Fix For: 7.1.0.Beta1, 7.1.0.Final
>
>
> The MapExternalizer originally extended InstanceReusingAdvancedExternalizer to allow for the CacheStatusResponse class to properly reuse values. However with ISPN-3561 we no longer use just a Map but also a ManagerStatusResponse so we only need the Externalizer for that class to extend InstanceReusingAdvancedExternalizer.
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
11 years, 4 months