[JBoss JIRA] (ISPN-8976) 2 subclusters failed to merge to 1 cluster - IllegalLifecycleStateException
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8976?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-8976:
-------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request: https://github.com/infinispan/infinispan/pull/6066
> 2 subclusters failed to merge to 1 cluster - IllegalLifecycleStateException
> ---------------------------------------------------------------------------
>
> Key: ISPN-8976
> URL: https://issues.jboss.org/browse/ISPN-8976
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.1.4.Final
> Reporter: Robert Cernak
> Assignee: Ryan Emerson
> Fix For: 9.3.0.Final
>
> Attachments: logs.zip
>
>
> Initially I had a main cluster consisting of 8 nodes.
> Then I disconnected the main switch to which these nodes were connected.
> This split the main cluster into 2 subclusters, the first with 2 nodes and the second with 6 nodes. This was expected.
> After that I rebooted the nodes. After the reboot, the nodes again correctly formed 2 subclusters with 2 and 6 members.
> After a long time, when all nodes were stable with low CPU load, I reconnected the main switch, which should have led to the main cluster re-forming with 8 controllers.
> However, the main cluster did not recover:
> subcluster2 did not change: it still had 6 nodes connected and no new members
> subcluster1: its nodes did not connect with subcluster2, and after about 30 minutes they left the cluster.
> When I checked the Infinispan logs of node1 from the 1st subcluster, I found an IllegalLifecycleStateException for every created cache (see the attached logs.zip):
> [transport-thread-744a974a-2811-4f79-ac63-f32daf005d7f-p4-t6] (ClusterCacheStatus.java:599) - ISPN000228: Failed to recover cache XXX state after the current node became the coordinator
> org.infinispan.IllegalLifecycleStateException: Cache container has been stopped and cannot be reused. Recreate the cache container.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-8734) RemoteMultimapCacheAPITest always fails with trace logging enabled
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-8734?page=com.atlassian.jira.plugin.... ]
Radim Vansa closed ISPN-8734.
-----------------------------
Resolution: Duplicate Issue
> RemoteMultimapCacheAPITest always fails with trace logging enabled
> ------------------------------------------------------------------
>
> Key: ISPN-8734
> URL: https://issues.jboss.org/browse/ISPN-8734
> Project: Infinispan
> Issue Type: Bug
> Reporter: Dan Berindei
> Assignee: Radim Vansa
> Labels: testsuite_stability
>
> {noformat}
> [ERROR] testSize(org.infinispan.client.hotrod.RemoteMultimapCacheAPITest) Time elapsed: 0.009 s <<< FAILURE!
> java.util.concurrent.CompletionException: java.lang.ArrayIndexOutOfBoundsException: 113
> at java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:375)
> at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1934)
> at org.infinispan.client.hotrod.RemoteMultimapCacheAPITest.testSize(RemoteMultimapCacheAPITest.java:120)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
> at org.testng.internal.MethodInvocationHelper$1.runTestMethod(MethodInvocationHelper.java:200)
> at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
> at sun.reflect.GeneratedMethodAccessor140.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.testng.internal.MethodInvocationHelper.invokeHookable(MethodInvocationHelper.java:212)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:707)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 113
> at org.infinispan.client.hotrod.impl.protocol.HotRodConstants$Names.of(HotRodConstants.java:230)
> at org.infinispan.client.hotrod.impl.protocol.Codec20.writeHeader(Codec20.java:134)
> at org.infinispan.client.hotrod.impl.protocol.Codec27.writeHeader(Codec27.java:12)
> at org.infinispan.client.hotrod.impl.operations.HotRodOperation.sendHeader(HotRodOperation.java:78)
> at org.infinispan.client.hotrod.impl.operations.HotRodOperation.sendHeaderAndRead(HotRodOperation.java:69)
> at org.infinispan.client.hotrod.impl.multimap.operations.SizeMultimapOperation.executeOperation(SizeMultimapOperation.java:38)
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.invoke(RetryOnFailureOperation.java:67)
> at org.infinispan.client.hotrod.impl.transport.netty.ChannelPool.lambda$createAndInvoke$0(ChannelPool.java:135)
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.client.hotrod.impl.transport.netty.ChannelInitializer$ActivationFuture.accept(ChannelInitializer.java:177)
> at org.infinispan.client.hotrod.impl.transport.netty.ChannelInitializer$ActivationFuture.accept(ChannelInitializer.java:161)
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.infinispan.client.hotrod.impl.transport.netty.ChannelRecord.complete(ChannelRecord.java:51)
> at org.infinispan.client.hotrod.impl.transport.netty.ActivationHandler.channelActive(ActivationHandler.java:28)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:213)
> {noformat}
--
[JBoss JIRA] (ISPN-9173) Availability mode should be updated atomically with the actual members
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9173?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-9173:
------------------------------------
Unfortunately there are still other random test failures in the partition handling tests related to ISPN-9291. This is the kind of test failure the non-atomic availability update causes, because the availability mode changed to DEGRADED but the actual members are still a majority:
{noformat}
Test failed: org.infinispan.partitionhandling.TwoWaySplitAndMergeTest.testSplitAndMerge1[DIST_SYNC]
java.lang.AssertionError: Should have thrown an org.infinispan.partitionhandling.AvailabilityException
at org.infinispan.test.Exceptions.assertException(Exceptions.java:18)
at org.infinispan.test.Exceptions.expectException(Exceptions.java:99)
at org.infinispan.partitionhandling.BasePartitionHandlingTest.assertKeyNotAvailableForRead(BasePartitionHandlingTest.java:400)
at org.infinispan.partitionhandling.BasePartitionHandlingTest$Partition.assertKeyNotAvailableForRead(BasePartitionHandlingTest.java:346)
at org.infinispan.partitionhandling.BasePartitionHandlingTest$Partition.assertKeysNotAvailableForRead(BasePartitionHandlingTest.java:341)
at org.infinispan.partitionhandling.TwoWaySplitAndMergeTest.testSplitAndMerge(TwoWaySplitAndMergeTest.java:92)
at org.infinispan.partitionhandling.TwoWaySplitAndMergeTest.testSplitAndMerge1(TwoWaySplitAndMergeTest.java:31)
{noformat}
> Availability mode should be updated atomically with the actual members
> ----------------------------------------------------------------------
>
> Key: ISPN-9173
> URL: https://issues.jboss.org/browse/ISPN-9173
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.3.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.3.0.Final
>
>
> This is a follow-up on ISPN-7682, which asks for the topology itself to be updated atomically.
> {{LocalTopologyManagerImpl}} has additional logic to update the availability mode first when the cache becomes degraded and to update it last when the cache becomes available, which means any delay between the updates cannot cause data inconsistencies.
> But that logic doesn't really belong in {{LocalTopologyManagerImpl}}, and it's easy to forget it's there (and in fact we had a bug there related to the new rebalance phases).
> In addition, tests that want to check the cache behaviour in degraded mode and wait only for the availability mode change will fail if there's a big delay between the availability mode change and the topology update. I actually hit this while testing my ISPN-8731/ISPN-7682 changes, when I had added a random delay in {{StateConsumerImpl}} before {{distributionManager.setCacheTopology()}}.
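The ordering rule described above (availability mode first when becoming degraded, last when becoming available) can be sketched in plain Java. This is only an illustration under assumptions: the class, enum, and method names below are hypothetical and are not taken from {{LocalTopologyManagerImpl}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from Infinispan.
class AvailabilityOrderingSketch {
    enum Mode { AVAILABLE, DEGRADED }

    private Mode mode = Mode.AVAILABLE;
    private List<String> members = new ArrayList<>();
    // Records the order in which the two updates were applied.
    final List<String> steps = new ArrayList<>();

    void apply(Mode newMode, List<String> newMembers) {
        if (newMode == Mode.DEGRADED) {
            // Becoming degraded: restrict access first, then change the members.
            setMode(newMode);
            setMembers(newMembers);
        } else {
            // Becoming available: install the members first, lift the restriction last.
            setMembers(newMembers);
            setMode(newMode);
        }
    }

    private void setMode(Mode newMode) {
        mode = newMode;
        steps.add("mode");
    }

    private void setMembers(List<String> newMembers) {
        members = new ArrayList<>(newMembers);
        steps.add("members");
    }

    Mode mode() { return mode; }
    List<String> members() { return members; }

    public static void main(String[] args) {
        AvailabilityOrderingSketch cache = new AvailabilityOrderingSketch();
        cache.apply(Mode.DEGRADED, Arrays.asList("A", "B"));
        cache.apply(Mode.AVAILABLE, Arrays.asList("A", "B", "C"));
        System.out.println(cache.steps);
    }
}
```

Because the two setters are still separate steps, an observer that waits only for the mode change can see the new mode with the old members, which is the window the failing test falls into.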
--
[JBoss JIRA] (ISPN-9061) X-site replication with functional commands throws NullPointerException
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-9061?page=com.atlassian.jira.plugin.... ]
Radim Vansa updated ISPN-9061:
------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> X-site replication with functional commands throws NullPointerException
> -----------------------------------------------------------------------
>
> Key: ISPN-9061
> URL: https://issues.jboss.org/browse/ISPN-9061
> Project: Infinispan
> Issue Type: Bug
> Components: Cross-Site Replication
> Affects Versions: 9.2.1.Final
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Fix For: 9.3.0.Final
>
>
> {{CacheOperationsTest.testFunctional()}} checks that some keys do not exist in the cache by calling {{evalMany}} on a read-write map, but with a read-only lambda.
> This creates a {{VersionedRepeatableReadEntry(value=null)}} in the tx invocation context, and {{BackupSenderImpl.filterModifications()}} sends that to the remote site as a {{PutKeyValueCommand(value=null)}}. On the remote site this is translated as {{cache.put(key, null)}}, which finally throws a {{NullPointerException}}:
> {noformat}
> 15:18:54,543 WARN (remote-thread-CacheOperationsTest[REPL_SYNC, tx=true, lockingMode=OPTIMISTIC, 2PC]-NodeD-p40433-t6:[]) [GlobalInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleXSiteRpcCommand{command=PrepareCommand {modifications=[PutKeyValueCommand{key=MagicKey#k2{1910/3D34DA4D/67@CacheOperationsTest[REPL_SYNC, tx=true, lockingMode=OPTIMISTIC, 2PC]-NodeB-4295}, value=null, flags=[], commandInvocationId=CommandInvocation:local:0, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=SimpleClusteredVersion{topologyId=0, version=0}}, successful=true, topologyId=-1}, PutKeyValueCommand{key=MagicKey#k0{190E/360DCEC7/18@CacheOperationsTest[REPL_SYNC, tx=true, lockingMode=OPTIMISTIC, 2PC]-NodeA-60870}, value=null, flags=[], commandInvocationId=CommandInvocation:local:0, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=SimpleClusteredVersion{topologyId=0, version=0}}, successful=true, topologyId=-1}, PutKeyValueCommand{key=MagicKey#k1{190F/71AF2073/6@CacheOperationsTest[REPL_SYNC, tx=true, lockingMode=OPTIMISTIC, 2PC]-NodeB-4295}, value=null, flags=[], commandInvocationId=CommandInvocation:local:0, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=SimpleClusteredVersion{topologyId=0, version=0}}, successful=true, topologyId=-1}], onePhaseCommit=false, retried=false, gtx=GlobalTx:CacheOperationsTest[REPL_SYNC, tx=true, lockingMode=OPTIMISTIC, 2PC]-NodeA-60870:26069, cacheName='___defaultcache', topologyId=-1}}
> java.lang.NullPointerException: Null values are not supported!
> at java.util.Objects.requireNonNull(Objects.java:228) ~[?:1.8.0_152]
> at org.infinispan.cache.impl.CacheImpl.assertValueNotNull(CacheImpl.java:199) ~[classes/:?]
> at org.infinispan.cache.impl.CacheImpl.assertKeyValueNotNull(CacheImpl.java:204) ~[classes/:?]
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1331) ~[classes/:?]
> at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:654) ~[classes/:?]
> at org.infinispan.cache.impl.AbstractDelegatingAdvancedCache.put(AbstractDelegatingAdvancedCache.java:355) ~[classes/:?]
> at org.infinispan.cache.impl.EncoderCache.put(EncoderCache.java:425) ~[classes/:?]
> at org.infinispan.xsite.BaseBackupReceiver$BackupCacheUpdater.visitPutKeyValueCommand(BaseBackupReceiver.java:110) ~[classes/:?]
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:67) ~[classes/:?]
> at org.infinispan.xsite.BaseBackupReceiver$BackupCacheUpdater.replayModifications(BaseBackupReceiver.java:259) ~[classes/:?]
> at org.infinispan.xsite.BaseBackupReceiver$BackupCacheUpdater.visitPrepareCommand(BaseBackupReceiver.java:155) ~[classes/:?]
> at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:185) ~[classes/:?]
> at org.infinispan.xsite.BaseBackupReceiver.handleRemoteCommand(BaseBackupReceiver.java:76) ~[classes/:?]
> at org.infinispan.xsite.SingleXSiteRpcCommand.performInLocalSite(SingleXSiteRpcCommand.java:37) ~[classes/:?]
> at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler.runXSiteReplicableCommand(GlobalInboundInvocationHandler.java:126) ~[classes/:?]
> at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler.lambda$handleFromRemoteSite$0(GlobalInboundInvocationHandler.java:95) ~[classes/:?]
> {noformat}
> There's no exception on the local node, maybe because entries with null values are not committed regardless of what their flags say.
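The failure mode can be reproduced outside Infinispan with a small plain-Java model: a list of puts in which a read-only access left a null value, replayed against a map that rejects nulls. The sketch below is hypothetical throughout ({{Put}}, {{replay}}, and {{dropNullValues}} are made-up names), and dropping null-valued puts on the sending side is just one conceivable guard, not necessarily the fix applied for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: these are illustrative names, not Infinispan classes.
class NullValueReplaySketch {
    static final class Put {
        final String key;
        final String value; // null models an entry that was only read

        Put(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Receiving side: rejects null values, like the check in the stack trace.
    static void replay(Map<String, String> cache, List<Put> mods) {
        for (Put p : mods) {
            Objects.requireNonNull(p.value, "Null values are not supported!");
            cache.put(p.key, p.value);
        }
    }

    // Sending side: keep only the modifications that actually carry a value.
    static List<Put> dropNullValues(List<Put> mods) {
        List<Put> out = new ArrayList<>();
        for (Put p : mods) {
            if (p.value != null) {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Put> mods = Arrays.asList(new Put("k0", null), new Put("k1", "v1"));
        Map<String, String> backup = new HashMap<>();
        // Calling replay(backup, mods) directly would throw a NullPointerException.
        replay(backup, dropNullValues(mods));
        System.out.println(backup);
    }
}
```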
--
[JBoss JIRA] (ISPN-8616) DistAsyncFuncTest.testMergeFromNonOwner random failures
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-8616?page=com.atlassian.jira.plugin.... ]
Radim Vansa updated ISPN-8616:
------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Duplicate Issue
> DistAsyncFuncTest.testMergeFromNonOwner random failures
> -------------------------------------------------------
>
> Key: ISPN-8616
> URL: https://issues.jboss.org/browse/ISPN-8616
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.2.0.Beta1
> Reporter: Dan Berindei
> Assignee: Radim Vansa
> Labels: testsuite_stability
>
> {noformat}
> java.lang.AssertionError: Fail on owner cache DistAsyncFuncTest[DIST_ASYNC, tx=false]-NodeA-42474: dc.get(k1) returned null
> at org.infinispan.distribution.BaseDistFunctionalTest.assertOwnershipAndNonOwnership(BaseDistFunctionalTest.java:191)
> at org.infinispan.distribution.BaseDistFunctionalTest.assertOnAllCachesAndOwnership(BaseDistFunctionalTest.java:162)
> at org.infinispan.distribution.BaseDistFunctionalTest.initAndTest(BaseDistFunctionalTest.java:142)
> at org.infinispan.distribution.DistSyncFuncTest.testMergeFromNonOwner(DistSyncFuncTest.java:387)
> {noformat}
> The test method is new and the interceptors have changed, but it's probably the same problem with {{asyncWait}} signaled in ISPN-3741. I commented out the entire implementation of {{asyncWait}} in {{DistAsyncFuncTest}} and it still passed when run by itself, so we need a way to delay the commands and check that the test is not sensitive to such delays.
--
[JBoss JIRA] (ISPN-8204) Remove should be conditional
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-8204?page=com.atlassian.jira.plugin.... ]
Radim Vansa closed ISPN-8204.
-----------------------------
Resolution: Rejected
> Remove should be conditional
> ----------------------------
>
> Key: ISPN-8204
> URL: https://issues.jboss.org/browse/ISPN-8204
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.1.0.Final
> Reporter: Radim Vansa
> Assignee: Radim Vansa
>
> If {{cache.remove(k)}} is called on a non-existent key, it should become a no-op: the command is marked as unsuccessful, nothing is written to the cache store, and the change is not replicated to backup owners. That makes the command effectively conditional (as it checks the previous value), in the same way as {{cache.replace(k, newValue)}} is.
> While I think that this is the correct behaviour, it's a breaking change for transactions. Some transactions may become read-only and there are multiple tests in the testsuite that would be broken by this.
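The proposed semantics can be illustrated with a minimal plain-Java model. It is a sketch of the behaviour described above, not Infinispan code; the class and its {{storeWrites}} and {{replicated}} lists are hypothetical stand-ins for the cache store and the backup owners.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed "conditional remove" behaviour.
class ConditionalRemoveSketch {
    final Map<String, String> data = new HashMap<>();
    final List<String> storeWrites = new ArrayList<>(); // stand-in for the cache store
    final List<String> replicated = new ArrayList<>();  // stand-in for backup owners

    // Returns true only if the key existed and the removal was applied.
    boolean remove(String key) {
        if (!data.containsKey(key)) {
            // Non-existent key: no-op, unsuccessful, nothing stored or replicated.
            return false;
        }
        data.remove(key);
        storeWrites.add(key);
        replicated.add(key);
        return true;
    }

    public static void main(String[] args) {
        ConditionalRemoveSketch cache = new ConditionalRemoveSketch();
        cache.data.put("k", "v");
        System.out.println(cache.remove("k"));       // key existed
        System.out.println(cache.remove("missing")); // key never existed
        System.out.println(cache.replicated);
    }
}
```

In this model a transaction consisting only of removals of missing keys writes nothing, which corresponds to the concern that some transactions may become read-only.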
--
[JBoss JIRA] (ISPN-9276) FunctionalEncodingTypeTest.testDistReturnViewFromReadWriteEvalOnNonOwner[tx=true] always fails
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-9276?page=com.atlassian.jira.plugin.... ]
Radim Vansa updated ISPN-9276:
------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/6018
> FunctionalEncodingTypeTest.testDistReturnViewFromReadWriteEvalOnNonOwner[tx=true] always fails
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-9276
> URL: https://issues.jboss.org/browse/ISPN-9276
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Radim Vansa
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 9.3.0.Final
>
>
> The test is not currently running during the build because of ISPN-9149, but fails when run manually:
> {noformat}
> java.lang.Error: java.util.concurrent.ExecutionException: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from FunctionalEncodingTypeTest[tx=true]-NodeB-35039, see cause for remote stack trace
> at org.infinispan.functional.FunctionalTestUtils.await(FunctionalTestUtils.java:47)
> at org.infinispan.functional.FunctionalMapTest.doReturnViewFromReadWriteEval(FunctionalMapTest.java:595)
> at org.infinispan.functional.FunctionalMapTest.testDistReturnViewFromReadWriteEvalOnNonOwner(FunctionalMapTest.java:584)
> Caused by: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from FunctionalEncodingTypeTest[tx=true]-NodeB-35039, see cause for remote stack trace
> at org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:27)
> at org.infinispan.remoting.transport.RemoteGetResponseCollector.addResponse(RemoteGetResponseCollector.java:26)
> at org.infinispan.remoting.transport.RemoteGetResponseCollector.addResponse(RemoteGetResponseCollector.java:17)
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:91)
> at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1364)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1267)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:125)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1412)
> at org.jgroups.JChannel.up(JChannel.java:816)
> Caused by: org.infinispan.commons.marshall.NotSerializableException: org.infinispan.functional.impl.EntryViews$EntryBackedReadWriteView
> Caused by: an exception which occurred:
> in object org.infinispan.functional.impl.EntryViews$EntryBackedReadWriteView@72d6a840
> -> toString = EntryBackedReadWriteView{entry=VersionedRepeatableReadEntry(39ed5af7){key=TestKey#MagicKey{778/3FB8CBF3/178@FunctionalEncodingTypeTest[tx=true]-NodeB-35039}, value=TestValue#one, isCreated=true, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=MetaParamsInternalMetadata{params=MetaParams{length=1, metas=[MetaEntryVersion=SimpleClusteredVersion{topologyId=0, version=0}]}}}}
> {noformat}
--