[JBoss JIRA] (ISPN-10939) PessimisticTxPartitionAndMergeDuringRollbackTest random failures
by Pedro Zapata Fernandez (Jira)
[ https://issues.redhat.com/browse/ISPN-10939?page=com.atlassian.jira.plugi... ]
Pedro Zapata Fernandez updated ISPN-10939:
------------------------------------------
Sprint: DataGrid Sprint #36, DataGrid Sprint #37, DataGrid Sprint #38 (was: DataGrid Sprint #36, DataGrid Sprint #37)
> PessimisticTxPartitionAndMergeDuringRollbackTest random failures
> ----------------------------------------------------------------
>
> Key: ISPN-10939
> URL: https://issues.redhat.com/browse/ISPN-10939
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite
> Affects Versions: 10.0.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.1.0.Final
>
> Attachments: test.log.gz
>
>
> {{PessimisticTxPartitionAndMergeDuringRollbackTest.testPrimaryOwnerIsolatedPartitionWithDiscard[DIST_SYNC, DENY_READ_WRITES]}} sometimes fails because {{NodeC}} never rolls back {{NodeA}}'s transaction. It's normal to not roll back the transaction on {{NodeC}} while it's in {{DEGRADED_MODE}}, but it should roll back the transaction after the merge.
> {noformat}
> 16:31:28,895 TRACE (ForkThread-1,PessimisticTxPartitionAndMergeDuringRollbackTest:[]) [TransactionCoordinator] rollback transaction GlobalTx:Test-NodeA-24422:22
> 16:31:28,897 DEBUG (jgroups-7,Test-NodeC-35037:[]) [BaseTxPartitionAndMergeTest] Ignoring command RollbackCommand {gtx=GlobalTx:Test-NodeA-24422:22, cacheName='pes-cache', topologyId=13}
> 16:31:28,898 DEBUG (testng-Test:[]) [GMS] Test-NodeC-35037: installing view [Test-NodeC-35037|31] (1) [Test-NodeC-35037]
> 16:31:28,904 DEBUG (testng-Test:[]) [GMS] Test-NodeA-24422: installing view [Test-NodeA-24422|32] (3) [Test-NodeA-24422, Test-NodeB-15428, Test-NodeD-40706]
> 16:31:28,968 TRACE (transport-thread-Test-NodeC-p181-t1:[Topology-pes-cache]) [TransactionTable] Checking for transactions originated on leavers. Current cache members are [Test-NodeC-35037], remote transactions: 1
> 16:31:28,971 TRACE (transport-thread-Test-NodeC-p181-t1:[Topology-pes-cache]) [TransactionTable] Checking transaction GlobalTx:Test-NodeA-24422:22
> 16:31:28,971 TRACE (transport-thread-Test-NodeC-p181-t1:[Topology-pes-cache]) [PartitionHandlingManagerImpl] Can rollback transaction? false
> 16:31:29,113 DEBUG (testng-Test:[]) [GMS] Test-NodeC-35037: installing view MergeView::[Test-NodeC-35037|35] (4) [Test-NodeC-35037, Test-NodeA-24422, Test-NodeB-15428, Test-NodeD-40706], 2 subgroups: [Test-NodeC-35037|33] (1) [Test-NodeC-35037], [Test-NodeA-24422|34] (3) [Test-NodeA-24422, Test-NodeB-15428, Test-NodeD-40706]
> 16:31:29,291 TRACE (transport-thread-Test-NodeC-p181-t4:[Topology-pes-cache]) [TransactionTable] Checking transaction GlobalTx:Test-NodeA-24422:22
> 16:31:29,291 TRACE (transport-thread-Test-NodeC-p181-t4:[Topology-pes-cache]) [TransactionTable] No remote transactions pertain to originator(s) who have left the cluster.
> 16:31:29,435 DEBUG (testng-Test:[]) [PessimisticTxPartitionAndMergeDuringRollbackTest] Cluster merged
> 16:31:29,436 TRACE (testng-Test:[]) [PessimisticTxPartitionAndMergeDuringRollbackTest] Local tx=[], remote tx=[GlobalTx:Test-NodeA-24422:22], for cache Test-NodeC-35037
> ...
> 16:31:39,446 TRACE (testng-Test:[]) [PessimisticTxPartitionAndMergeDuringRollbackTest] Local tx=[], remote tx=[GlobalTx:Test-NodeA-24422:22], for cache Test-NodeC-35037
> 16:31:39,446 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringRollbackTest.testPrimaryOwnerIsolatedPartitionWithDiscard[DIST_SYNC, DENY_READ_WRITES]
> java.lang.AssertionError: There are pending transactions!
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.14.3.jar:?]
> at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24) ~[testng-6.14.3.jar:?]
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:250) ~[test-classes/:?]
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:390) ~[test-classes/:?]
> at org.infinispan.test.MultipleCacheManagersTest.assertNoTransactions(MultipleCacheManagersTest.java:947) ~[test-classes/:?]
> at org.infinispan.partitionhandling.BaseTxPartitionAndMergeTest.finalAsserts(BaseTxPartitionAndMergeTest.java:101) ~[test-classes/:?]
> at org.infinispan.partitionhandling.BasePessimisticTxPartitionAndMergeTest.doTest(BasePessimisticTxPartitionAndMergeTest.java:83) ~[test-classes/:?]
> at org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringRollbackTest.testPrimaryOwnerIsolatedPartitionWithDiscard(PessimisticTxPartitionAndMergeDuringRollbackTest.java:43) ~[test-classes/:?]
> {noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
6 years, 4 months
[JBoss JIRA] (ISPN-10922) StateTransferLinkFailuresTest random failures
by Pedro Zapata Fernandez (Jira)
[ https://issues.redhat.com/browse/ISPN-10922?page=com.atlassian.jira.plugi... ]
Pedro Zapata Fernandez updated ISPN-10922:
------------------------------------------
Sprint: DataGrid Sprint #36, DataGrid Sprint #37, DataGrid Sprint #38 (was: DataGrid Sprint #36, DataGrid Sprint #37)
> StateTransferLinkFailuresTest random failures
> ---------------------------------------------
>
> Key: ISPN-10922
> URL: https://issues.redhat.com/browse/ISPN-10922
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite
> Affects Versions: 10.0.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.1.0.Beta1
>
>
> {{StateTransferLinkFailuresTest.testLinkBrokenDuringStateTransfer()}} tests that xsite state transfer eventually finishes when the link between sites is down. It simulates an exception in {{backupRemotely()}} and waits for 1 minute for state transfer to finish.
> The problem is that the default xsite state transfer retry configuration is 30 retries with a 2-second wait between them, which adds up to 60 seconds, the same as the test's 1-minute wait. It is therefore quite possible for the test to give up waiting just before state transfer finishes.
> {noformat}
> 21:43:02,524 DEBUG (transport-thread-Test-NodeA-p4484-t1:[]) [XSiteStateProviderImpl] [X-Site State Transfer - NYC-2] start DataContainer iteration
> 21:43:02,564 DEBUG (transport-thread-Test-NodeA-p4484-t1:[]) [XSiteStateProviderImpl] Sending chunk to site 'NYC-2'. Chunk contains [XSiteState{key=k_2, value=v_2, metadata=EmbeddedMetadata{version=null}}, XSiteState{key=k_4, value=v_4, metadata=EmbeddedMetadata{version=null}}]
> 21:44:02,579 TRACE (transport-thread-Test-NodeA-p4484-t1:[]) [RetryOnFailureXSiteCommand] Exception Response received. Exception is org.infinispan.util.concurrent.TimeoutException: induced timeout!
> 21:44:02,611 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.xsite.statetransfer.failures.StateTransferLinkFailuresTest.testLinkBrokenDuringStateTransfer[null, tx=false]
> java.lang.AssertionError:
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.14.3.jar:?]
> at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24) ~[testng-6.14.3.jar:?]
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:250) ~[test-classes/:?]
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:232) ~[test-classes/:?]
> at org.infinispan.test.AbstractInfinispanTest.eventually(AbstractInfinispanTest.java:208) ~[test-classes/:?]
> at org.infinispan.xsite.AbstractXSiteTest.assertEventuallyInSite(AbstractXSiteTest.java:193) ~[test-classes/:?]
> at org.infinispan.xsite.statetransfer.failures.StateTransferLinkFailuresTest.testLinkBrokenDuringStateTransfer(StateTransferLinkFailuresTest.java:89) ~[test-classes/:?]
> {noformat}
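The timing argument above can be checked with a little arithmetic. The sketch below is only an illustration: the class and method names (`RetryBudget`, `worstCaseRetryWindow`, `hasHeadroom`) and the 5-second slack are made up for this example, and only the figures 30 retries, 2 seconds and 1 minute come from the issue.

```java
import java.time.Duration;

final class RetryBudget {
    /** Worst-case time spent waiting between retries: maxRetries x wait. */
    static Duration worstCaseRetryWindow(int maxRetries, Duration waitBetweenRetries) {
        return waitBetweenRetries.multipliedBy(maxRetries);
    }

    /** True when the test's wait covers the retry window plus some slack. */
    static boolean hasHeadroom(Duration testTimeout, Duration retryWindow, Duration slack) {
        return testTimeout.compareTo(retryWindow.plus(slack)) >= 0;
    }

    public static void main(String[] args) {
        // Figures from the issue: 30 retries, 2 seconds apart, 1 minute test wait.
        Duration window = worstCaseRetryWindow(30, Duration.ofSeconds(2));
        Duration testTimeout = Duration.ofMinutes(1);
        // The 5 second slack is an arbitrary value chosen for illustration.
        boolean safe = hasHeadroom(testTimeout, window, Duration.ofSeconds(5));
        System.out.println("retry window: " + window.getSeconds() + "s, headroom: " + safe);
    }
}
```

With these numbers the retry window equals the test's wait, so there is no headroom. Either lowering the retry count or wait in the test configuration, or raising the test's wait, would restore a margin.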
--
[JBoss JIRA] (ISPN-10906) JGroupsTransport instance is reused in tests
by Pedro Zapata Fernandez (Jira)
[ https://issues.redhat.com/browse/ISPN-10906?page=com.atlassian.jira.plugi... ]
Pedro Zapata Fernandez updated ISPN-10906:
------------------------------------------
Sprint: DataGrid Sprint #36, DataGrid Sprint #37, DataGrid Sprint #38 (was: DataGrid Sprint #36, DataGrid Sprint #37)
> JGroupsTransport instance is reused in tests
> --------------------------------------------
>
> Key: ISPN-10906
> URL: https://issues.redhat.com/browse/ISPN-10906
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 10.0.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.1.0.CR1
>
>
> The {{TRANSPORT}} {{AttributeDefinition}} uses {{IdentityAttributeCopier}}, which means that when a test uses {{GlobalConfigurationBuilder.read()}} to make a clone of the global configuration it keeps using the same {{JGroupsTransport}} instance. Both cache managers sort of work, but usually not as intended.
> We should detect when {{JGroupsTransport}}'s dependencies are injected twice and throw an exception. We should also consider changing the {{TRANSPORT}} copier to {{SimpleInstanceAttributeCopier}}.
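To make the difference between the two copy strategies concrete, here is a minimal sketch that does not use Infinispan at all. `Transport`, `IDENTITY_COPIER` and `NEW_INSTANCE_COPIER` are hypothetical stand-ins for `JGroupsTransport` and the two copiers named above, and the double-injection check only illustrates the proposed detection, not how it is actually implemented.

```java
import java.util.function.UnaryOperator;

final class CopierSketch {
    /** Hypothetical stand-in for a transport that must be wired into one manager only. */
    static final class Transport {
        private Object owner;

        /** Rejects a second injection instead of silently sharing state. */
        void inject(Object newOwner) {
            if (owner != null) {
                throw new IllegalStateException("Transport already injected into " + owner);
            }
            owner = newOwner;
        }
    }

    // Identity copy: the cloned configuration shares the original instance.
    static final UnaryOperator<Transport> IDENTITY_COPIER = t -> t;
    // Instance copy: the cloned configuration gets a fresh instance.
    static final UnaryOperator<Transport> NEW_INSTANCE_COPIER = t -> new Transport();

    public static void main(String[] args) {
        Transport original = new Transport();
        original.inject("managerA");

        Transport shared = IDENTITY_COPIER.apply(original);
        try {
            shared.inject("managerB"); // same instance, so the check fires
        } catch (IllegalStateException e) {
            System.out.println("detected reuse: " + e.getMessage());
        }

        Transport fresh = NEW_INSTANCE_COPIER.apply(original);
        fresh.inject("managerB"); // separate instance, so this succeeds
    }
}
```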
--
[JBoss JIRA] (ISPN-10891) JGroupsTransport registers the channel in JMX ignoring the cacheManagerName
by Pedro Zapata Fernandez (Jira)
[ https://issues.redhat.com/browse/ISPN-10891?page=com.atlassian.jira.plugi... ]
Pedro Zapata Fernandez updated ISPN-10891:
------------------------------------------
Sprint: DataGrid Sprint #36, DataGrid Sprint #37, DataGrid Sprint #38 (was: DataGrid Sprint #36, DataGrid Sprint #37)
> JGroupsTransport registers the channel in JMX ignoring the cacheManagerName
> ---------------------------------------------------------------------------
>
> Key: ISPN-10891
> URL: https://issues.redhat.com/browse/ISPN-10891
> Project: Infinispan
> Issue Type: Bug
> Components: Core, JMX, reporting and management
> Affects Versions: 9.4.16.Final, 10.0.1.Final
> Reporter: Dan Berindei
> Priority: Major
>
> {{JGroupsTransport}} registers the JGroups channel in JMX with a name like {{<jmx-domain>:type=channel,cluster=<cluster-name>}}.
> If two managers have different {{cacheManagerName}} values, all the cache manager and cache components can be registered alongside each other in the same JMX domain. The channel object name, however, doesn't include the manager name, so the second cache manager fails to register its channel, and because of JGRP-2393 the cause of the error is hidden.
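The name collision can be reproduced with the standard `javax.management.ObjectName` class alone. In the sketch below, the `manager` key, the helper names and the quoting of values are assumptions made for illustration; the issue does not say how a fix should name the channel.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

final class ChannelObjectNames {
    /** Name shape described in the issue: domain, type and cluster only. */
    static ObjectName withoutManager(String domain, String cluster) {
        return parse(domain + ":type=channel,cluster=" + ObjectName.quote(cluster));
    }

    /** Hypothetical alternative: an extra "manager" key keeps the names distinct. */
    static ObjectName withManager(String domain, String manager, String cluster) {
        return parse(domain + ":type=channel,manager=" + ObjectName.quote(manager)
                + ",cluster=" + ObjectName.quote(cluster));
    }

    private static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(name, e);
        }
    }

    public static void main(String[] args) {
        // Two managers in the same JMX domain and the same cluster.
        ObjectName a = withoutManager("org.infinispan", "cluster");
        ObjectName b = withoutManager("org.infinispan", "cluster");
        System.out.println("without manager name, equal: " + a.equals(b));

        ObjectName c = withManager("org.infinispan", "managerA", "cluster");
        ObjectName d = withManager("org.infinispan", "managerB", "cluster");
        System.out.println("with manager name, equal: " + c.equals(d));
    }
}
```

An `MBeanServer` rejects a second registration under an equal `ObjectName` with `InstanceAlreadyExistsException`, which matches the failure described for the second cache manager.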
--
[JBoss JIRA] (ISPN-10911) HotRodMultiMapOperations random failures
by Pedro Zapata Fernandez (Jira)
[ https://issues.redhat.com/browse/ISPN-10911?page=com.atlassian.jira.plugi... ]
Pedro Zapata Fernandez updated ISPN-10911:
------------------------------------------
Sprint: DataGrid Sprint #36, DataGrid Sprint #37, DataGrid Sprint #38 (was: DataGrid Sprint #36, DataGrid Sprint #37)
> HotRodMultiMapOperations random failures
> ----------------------------------------
>
> Key: ISPN-10911
> URL: https://issues.redhat.com/browse/ISPN-10911
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite
> Affects Versions: 10.0.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.1.0.Beta1
>
>
> Multimap operations are asynchronous, and the test doesn't wait for the writes to finish before starting a get. Multimap and Infinispan in general do not guarantee that asynchronous operations started from the same thread run in any particular order.
> {noformat}
> 09:07:49,325 ERROR (testng-HotRodMultiMapOperations:[]) [TestSuiteProgress] Test failed: HotRodMultiMapOperations.testMultiMap
> java.lang.AssertionError: expected:<2> but was:<0>
> at org.junit.Assert.fail(Assert.java:88) ~[junit-4.12.jar:4.12]
> at org.junit.Assert.failNotEquals(Assert.java:834) ~[junit-4.12.jar:4.12]
> at org.junit.Assert.assertEquals(Assert.java:645) ~[junit-4.12.jar:4.12]
> at org.junit.Assert.assertEquals(Assert.java:631) ~[junit-4.12.jar:4.12]
> at org.infinispan.server.functional.HotRodMultiMapOperations.testMultiMap(HotRodMultiMapOperations.java:41) ~[test-classes/:?]
> {noformat}
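A minimal sketch of the ordering problem and the obvious remedy, using only `java.util.concurrent`. `AsyncMultimapSketch` is a hypothetical in-memory stand-in with asynchronous `put` and `get`, not the real multimap API; the point is only that the read is started after the write futures have completed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

final class AsyncMultimapSketch {
    private final Map<String, List<String>> data = new ConcurrentHashMap<>();

    /** Asynchronous write: completes at some later point on another thread. */
    CompletableFuture<Void> put(String key, String value) {
        return CompletableFuture.runAsync(() ->
                data.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(value));
    }

    /** Asynchronous read: sees only the writes that have already been applied. */
    CompletableFuture<List<String>> get(String key) {
        return CompletableFuture.supplyAsync(() ->
                new ArrayList<>(data.getOrDefault(key, Collections.emptyList())));
    }

    public static void main(String[] args) {
        AsyncMultimapSketch multimap = new AsyncMultimapSketch();
        CompletableFuture<Void> first = multimap.put("k", "v1");
        CompletableFuture<Void> second = multimap.put("k", "v2");

        // Wait for both writes before starting the read. Without this,
        // the read may run before either write has been applied.
        CompletableFuture.allOf(first, second).join();

        int size = multimap.get("k").join().size();
        System.out.println("values for k: " + size);
    }
}
```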
--
[JBoss JIRA] (ISPN-10912) HotRod server retries CheckAddressTask indefinitely during shutdown
by Pedro Zapata Fernandez (Jira)
[ https://issues.redhat.com/browse/ISPN-10912?page=com.atlassian.jira.plugi... ]
Pedro Zapata Fernandez updated ISPN-10912:
------------------------------------------
Sprint: DataGrid Sprint #36, DataGrid Sprint #37, DataGrid Sprint #38 (was: DataGrid Sprint #36, DataGrid Sprint #37)
> HotRod server retries CheckAddressTask indefinitely during shutdown
> -------------------------------------------------------------------
>
> Key: ISPN-10912
> URL: https://issues.redhat.com/browse/ISPN-10912
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 10.0.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.1.0.Beta1
>
>
> Normally, retrying to add the local address to the topology cache is a good idea, but {{IllegalLifecycleStateException}} should be handled differently: once the cache container has been stopped, a retry cannot succeed.
> {noformat}
> 09:09:03,471 DEBUG (remote-thread--p11-t2:[]) [HotRodServer] Error re-adding address to topology cache, retrying
> org.infinispan.commons.CacheException: org.infinispan.IllegalLifecycleStateException: Cache container has been stopped and cannot be reused. Recreate the cache container.
> at org.infinispan.server.hotrod.HotRodServer$ReAddMyAddressListener.lambda$recursionTopologyChanged$0(HotRodServer.java:678) ~[infinispan-server-hotrod-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.manager.impl.LocalClusterExecutor.lambda$submitConsumer$3(LocalClusterExecutor.java:78) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
> at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
> at org.infinispan.manager.impl.LocalClusterExecutor.lambda$localInvocation$6(LocalClusterExecutor.java:97) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.util.concurrent.BlockingTaskAwareExecutorServiceImpl$RunnableWrapper.run(BlockingTaskAwareExecutorServiceImpl.java:215) [infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: org.infinispan.IllegalLifecycleStateException: Cache container has been stopped and cannot be reused. Recreate the cache container.
> at org.infinispan.manager.DefaultCacheManager.assertIsNotTerminated(DefaultCacheManager.java:1070) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.manager.DefaultCacheManager.internalGetCache(DefaultCacheManager.java:502) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:498) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:491) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.manager.impl.AbstractDelegatingEmbeddedCacheManager.getCache(AbstractDelegatingEmbeddedCacheManager.java:196) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.manager.impl.UnwrappingEmbeddedCacheManager.getCache(UnwrappingEmbeddedCacheManager.java:25) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.server.hotrod.CheckAddressTask.apply(HotRodServer.java:725) ~[infinispan-server-hotrod-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.server.hotrod.CheckAddressTask.apply(HotRodServer.java:712) ~[infinispan-server-hotrod-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> at org.infinispan.manager.impl.LocalClusterExecutor.lambda$localInvocation$6(LocalClusterExecutor.java:94) ~[infinispan-core-10.1.0-SNAPSHOT.jar:10.1.0-SNAPSHOT]
> ... 4 more
> {noformat}
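One way to treat a lifecycle failure differently from a transient one is sketched below. Everything here is hypothetical: `LifecycleStoppedException` stands in for `IllegalLifecycleStateException`, and `runWithRetry` is an illustrative bounded loop, not the server's retry mechanism. The cause chain is walked because the log above shows the lifecycle exception wrapped in a `CacheException`.

```java
final class RetryUnlessStopped {
    /** Hypothetical stand-in for org.infinispan.IllegalLifecycleStateException. */
    static final class LifecycleStoppedException extends RuntimeException {
        LifecycleStoppedException(String message) {
            super(message);
        }
    }

    /** Checks the exception and all of its causes for a shutdown signal. */
    static boolean causedByShutdown(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof LifecycleStoppedException) {
                return true;
            }
        }
        return false;
    }

    /** Runs the task until it succeeds, shutdown is detected, or maxAttempts is reached. Returns the attempts made. */
    static int runWithRetry(Runnable task, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                task.run();
                return attempts;
            } catch (RuntimeException e) {
                if (causedByShutdown(e)) {
                    return attempts; // the container is stopped, so give up at once
                }
                // any other failure is treated as transient and retried
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        int attempts = runWithRetry(() -> {
            throw new RuntimeException(new LifecycleStoppedException("Cache container has been stopped"));
        }, 5);
        System.out.println("attempts after shutdown: " + attempts);
    }
}
```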
--
[JBoss JIRA] (ISPN-10880) JCacheConfigurationTest leaks cache manager
by Pedro Zapata Fernandez (Jira)
[ https://issues.redhat.com/browse/ISPN-10880?page=com.atlassian.jira.plugi... ]
Pedro Zapata Fernandez updated ISPN-10880:
------------------------------------------
Sprint: DataGrid Sprint #36, DataGrid Sprint #37, DataGrid Sprint #38 (was: DataGrid Sprint #36, DataGrid Sprint #37)
> JCacheConfigurationTest leaks cache manager
> -------------------------------------------
>
> Key: ISPN-10880
> URL: https://issues.redhat.com/browse/ISPN-10880
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite
> Affects Versions: 10.0.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.1.0.Beta1
>
>
> {noformat}
> ThreadLeakCheckerorg.infinispan.commons.test.ThreadLeakChecker$LeakException: Leaked thread: expiration-thread--p446-t1 << testng-JCacheConfigurationTest << org.infinispan.jcache.JCacheConfigurationTest
> ...
> Caused by: org.infinispan.commons.test.ThreadLeakChecker$LeakException: testng-JCacheConfigurationTest << org.infinispan.jcache.JCacheConfigurationTest
> at org.infinispan.commons.test.ThreadLeakChecker$ThreadInfoLocal.childValue(ThreadLeakChecker.java:107)
> ...
> at org.infinispan.manager.DefaultCacheManager.start(DefaultCacheManager.java:713)
> at org.infinispan.manager.DefaultCacheManager.<init>(DefaultCacheManager.java:391)
> at org.infinispan.jcache.embedded.JCacheManager.<init>(JCacheManager.java:75)
> at org.infinispan.jcache.JCacheConfigurationTest.lambda$testJCacheManagerWithRealJarFileSchema$1(JCacheConfigurationTest.java:107)
> at org.infinispan.jcache.util.JCacheTestingUtil.withCachingProvider(JCacheTestingUtil.java:36)
> at org.infinispan.jcache.JCacheConfigurationTest.testJCacheManagerWithRealJarFileSchema(JCacheConfigurationTest.java:104)
> {noformat}
> The leak is only reported some of the time because {{AbstractJCacheManager}} has a {{finalize()}} method that stops the underlying cache manager.
> The threads created by {{DefaultCacheManager}} ensure it's still referenced during finalization, allowing it to stop cleanly.
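Stopping the manager explicitly removes the dependence on finalization timing. The sketch below assumes the manager can be closed by the test and uses a hypothetical `FakeManager` that owns one background thread, in place of the real cache manager. It shows the pattern only, not the actual change made to the test.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ManagerLifecycleSketch {
    /** Hypothetical stand-in for a cache manager that owns background threads. */
    static final class FakeManager implements AutoCloseable {
        private final ExecutorService expirationThread = Executors.newSingleThreadExecutor();

        FakeManager() {
            expirationThread.submit(() -> { }); // forces the worker thread to start
        }

        boolean isStopped() {
            return expirationThread.isShutdown();
        }

        @Override
        public void close() {
            expirationThread.shutdownNow(); // stops the thread deterministically
        }
    }

    /** Runs a test body and always closes the manager, even if the body throws. */
    static FakeManager useAndClose(Runnable testBody) {
        FakeManager manager = new FakeManager();
        try (FakeManager m = manager) {
            testBody.run();
        }
        return manager;
    }

    public static void main(String[] args) {
        FakeManager manager = useAndClose(() -> System.out.println("running test body"));
        System.out.println("stopped: " + manager.isStopped());
    }
}
```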
--