[JBoss JIRA] (ISPN-10093) PersistenceManagerImpl stop deadlock with topology update
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10093?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10093:
--------------------------------
Attachment: threaddump.txt
> PersistenceManagerImpl stop deadlock with topology update
> ---------------------------------------------------------
>
> Key: ISPN-10093
> URL: https://issues.jboss.org/browse/ISPN-10093
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
> Attachments: threaddump.txt
>
>
> {{DistSyncStoreNotSharedTest.clearContent}} hung in CI recently:
> {noformat}
> "testng-DistSyncStoreNotSharedTest" #16 prio=5 os_prio=0 cpu=11511.26ms elapsed=435.14s tid=0x00007fdb710b6000 nid=0x3222 waiting on condition [0x00007fdb352d3000]
> java.lang.Thread.State: WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
> - parking to wait for <0x00000000c8a22450> (a java.util.concurrent.Semaphore$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11/AbstractQueuedSynchronizer.java:885)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11/AbstractQueuedSynchronizer.java:1009)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11/AbstractQueuedSynchronizer.java:1324)
> at java.util.concurrent.Semaphore.acquireUninterruptibly(java.base@11/Semaphore.java:504)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.stop(PersistenceManagerImpl.java:222)
> at jdk.internal.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
> at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11/DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(java.base@11/Method.java:566)
> at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:79)
> at org.infinispan.commons.util.SecurityActions$$Lambda$237/0x0000000100661c40.run(Unknown Source)
> at org.infinispan.commons.util.SecurityActions.doPrivileged(SecurityActions.java:71)
> at org.infinispan.commons.util.SecurityActions.invokeAccessibly(SecurityActions.java:76)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:181)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.performStop(BasicComponentRegistryImpl.java:601)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.stopWrapper(BasicComponentRegistryImpl.java:590)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.stop(BasicComponentRegistryImpl.java:461)
> at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:431)
> at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:366)
> at org.infinispan.cache.impl.CacheImpl.performImmediateShutdown(CacheImpl.java:1160)
> at org.infinispan.cache.impl.CacheImpl.stop(CacheImpl.java:1125)
> at org.infinispan.cache.impl.AbstractDelegatingCache.stop(AbstractDelegatingCache.java:521)
> at org.infinispan.manager.DefaultCacheManager.terminate(DefaultCacheManager.java:747)
> at org.infinispan.manager.DefaultCacheManager.stopCaches(DefaultCacheManager.java:799)
> at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:775)
> at org.infinispan.test.TestingUtil.killCacheManagers(TestingUtil.java:846)
> at org.infinispan.test.MultipleCacheManagersTest.clearContent(MultipleCacheManagersTest.java:158)
> "persistence-thread-DistSyncStoreNotSharedTest-NodeB-p16432-t1" #53654 daemon prio=5 os_prio=0 cpu=1.26ms elapsed=301.93s tid=0x00007fdb3c3d8000 nid=0x8ef waiting on condition [0x00007fdb00055000]
> java.lang.Thread.State: WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
> - parking to wait for <0x00000000c8b1fb88> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11/AbstractQueuedSynchronizer.java:885)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11/AbstractQueuedSynchronizer.java:1009)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11/AbstractQueuedSynchronizer.java:1324)
> at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(java.base@11/ReentrantReadWriteLock.java:738)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.pollStoreAvailability(PersistenceManagerImpl.java:196)
> at org.infinispan.persistence.manager.PersistenceManagerImpl$$Lambda$492/0x00000001007fb440.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11/Executors.java:515)
> at java.util.concurrent.FutureTask.runAndReset(java.base@11/FutureTask.java:305)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11/ScheduledThreadPoolExecutor.java:305)
> "transport-thread-DistSyncStoreNotSharedTest-NodeB-p16424-t5" #53646 daemon prio=5 os_prio=0 cpu=3.15ms elapsed=301.94s tid=0x00007fdb2007a000 nid=0x8e8 waiting on condition [0x00007fdb0b406000]
> java.lang.Thread.State: WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
> - parking to wait for <0x00000000c8d2abb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11/AbstractQueuedSynchronizer.java:2081)
> at io.reactivex.internal.operators.flowable.BlockingFlowableIterable$BlockingFlowableIterator.hasNext(BlockingFlowableIterable.java:94)
> at io.reactivex.Flowable.blockingForEach(Flowable.java:5682)
> at org.infinispan.statetransfer.StateConsumerImpl.removeStaleData(StateConsumerImpl.java:1011)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:453)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:202)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:58)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.updateConsistentHash(StateTransferManagerImpl.java:114)
> at org.infinispan.topology.LocalTopologyManagerImpl.resetLocalTopologyBeforeRebalance(LocalTopologyManagerImpl.java:437)
> at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:519)
> - locked <0x00000000c8b30b30> (a org.infinispan.topology.LocalCacheStatus)
> at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:484)
> at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$574/0x000000010089a040.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175){noformat}
> [Full thread dump|https://ci.infinispan.org/job/Infinispan/job/master/1133/artifact/core/]
> Somehow the producer thread for the transport-thread iteration is blocked, but without waiting for the persistence mutex. Maybe it's waiting for a topology? Not sure if it's relevant, but the last test to run was {{testClearWithFlag}}, so the data container was empty and the store had 5 entries.
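The first stack trace shows {{stop()}} parked in {{Semaphore.acquireUninterruptibly}}. One plausible reading, which the report does not confirm, is that {{stop()}} waits for every permit while an in-flight iteration still holds one. The sketch below only illustrates that pattern; {{PublisherGate}}, {{MAX_PERMITS}} and the method names are hypothetical and are not taken from {{PersistenceManagerImpl}}.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a "drain all permits on stop" gate; not Infinispan code.
class PublisherGate {
    // Assumed upper bound on concurrent iterations.
    static final int MAX_PERMITS = 8;
    private final Semaphore permits = new Semaphore(MAX_PERMITS);

    // An iteration (for example a store scan) holds one permit while it runs.
    void beginIteration() {
        permits.acquireUninterruptibly();
    }

    void endIteration() {
        permits.release();
    }

    // Stopping needs every permit, so it cannot succeed while an iteration is in flight.
    // A timeout is used so the sketch reports the stall instead of hanging forever.
    boolean tryStop(long timeoutMillis) {
        try {
            return permits.tryAcquire(MAX_PERMITS, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        PublisherGate gate = new PublisherGate();
        gate.beginIteration();                 // an iteration starts and does not finish
        System.out.println(gate.tryStop(100)); // stop cannot drain the permits yet
        gate.endIteration();                   // the iteration finally completes
        System.out.println(gate.tryStop(100)); // now stop can proceed
    }
}
```

With {{acquireUninterruptibly}} instead of a timed {{tryAcquire}}, the first stop attempt would park indefinitely, which matches the shape of the first trace but not necessarily its cause.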
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
5 years, 8 months
[JBoss JIRA] (ISPN-10108) In the query project we cannot mix Junit with TestNG
by Diego Lovison (Jira)
[ https://issues.jboss.org/browse/ISPN-10108?page=com.atlassian.jira.plugin... ]
Diego Lovison updated ISPN-10108:
---------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/6826
> In the query project we cannot mix Junit with TestNG
> ----------------------------------------------------
>
> Key: ISPN-10108
> URL: https://issues.jboss.org/browse/ISPN-10108
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 10.0.0.Final, 9.4.11.Final
> Reporter: Diego Lovison
> Priority: Major
>
> Running the command {{mvn verify -pl 'query' -Dtest=MassIndexerAsyncBackendTest}} from the root folder produces two XML files:
> TEST-infinispan-query.xml
> TEST-org.infinispan.query.distributed.MassIndexerAsyncBackendTest.xml
> One is the JUnit report and the other is the Polarion report.
> The query project uses TestNG instead of JUnit.
> I would like to fix the following:
> {code:xml}
> <testcase name="query.distributed.MassIndexerAsyncBackendTest" classname="org.infinispan.query.distributed.MassIndexerAsyncBackendTest" time="3.469"/>
> {code}
5 years, 9 months
[JBoss JIRA] (ISPN-10108) In the query project we cannot mix Junit with TestNG
by Diego Lovison (Jira)
[ https://issues.jboss.org/browse/ISPN-10108?page=com.atlassian.jira.plugin... ]
Diego Lovison updated ISPN-10108:
---------------------------------
Status: Open (was: New)
5 years, 9 months
[JBoss JIRA] (ISPN-10108) In the query project we cannot mix Junit with TestNG
by Diego Lovison (Jira)
[ https://issues.jboss.org/browse/ISPN-10108?page=com.atlassian.jira.plugin... ]
Diego Lovison updated ISPN-10108:
---------------------------------
Description:
Running the command {{mvn verify -pl 'query' -Dtest=MassIndexerAsyncBackendTest}} from the root folder produces two XML files:
TEST-infinispan-query.xml
TEST-org.infinispan.query.distributed.MassIndexerAsyncBackendTest.xml
One is the JUnit report and the other is the Polarion report.
The query project uses TestNG instead of JUnit.
I would like to fix the following:
{code:xml}
<testcase name="query.distributed.MassIndexerAsyncBackendTest" classname="org.infinispan.query.distributed.MassIndexerAsyncBackendTest" time="3.469"/>
{code}
was:
Running the command {{mvn verify -pl 'query' -Dtest=MassIndexerAsyncBackendTest}} from the root folder produces two XML files:
TEST-infinispan-query.xml
TEST-org.infinispan.query.distributed.MassIndexerAsyncBackendTest.xml
One is the JUnit report and the other is the Polarion report.
The query project uses TestNG instead of JUnit.
5 years, 9 months
[JBoss JIRA] (ISPN-10108) In the query project we cannot mix Junit with TestNG
by Diego Lovison (Jira)
Diego Lovison created ISPN-10108:
------------------------------------
Summary: In the query project we cannot mix Junit with TestNG
Key: ISPN-10108
URL: https://issues.jboss.org/browse/ISPN-10108
Project: Infinispan
Issue Type: Bug
Affects Versions: 9.4.11.Final, 10.0.0.Final
Reporter: Diego Lovison
Running the command {{mvn verify -pl 'query' -Dtest=MassIndexerAsyncBackendTest}} from the root folder produces two XML files:
TEST-infinispan-query.xml
TEST-org.infinispan.query.distributed.MassIndexerAsyncBackendTest.xml
One is the JUnit report and the other is the Polarion report.
The query project uses TestNG instead of JUnit.
5 years, 9 months
[JBoss JIRA] (ISPN-10107) ClientSocketReadTimeoutTest.testPutTimeout fails
by Diego Lovison (Jira)
Diego Lovison created ISPN-10107:
------------------------------------
Summary: ClientSocketReadTimeoutTest.testPutTimeout fails
Key: ISPN-10107
URL: https://issues.jboss.org/browse/ISPN-10107
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Server
Affects Versions: 10.0.0.Beta2, 9.4.10.Final
Environment: CI
https://ci.infinispan.org/job/Infinispan/job/9.4.x/158/testReport/org.inf...
Reporter: Diego Lovison
Assignee: Tristan Tarrant
Fix For: 10.0.0.Beta4, 9.4.12.Final
Error Message
Expected exception of type class java.net.SocketTimeoutException but got io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: /127.0.0.1:11222
Stacktrace
org.testng.TestException:
Expected exception of type class java.net.SocketTimeoutException but got io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: /127.0.0.1:11222
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: syscall:getsockopt(..) failed: Connection refused: /127.0.0.1:11222
at io.netty.channel.unix.Socket.finishConnect(..)(Unknown Source)
Caused by: io.netty.channel.unix.Errors$NativeConnectException: syscall:getsockopt(..) failed: Connection refused
... 1 more
... Removed 13 stack frames
5 years, 9 months
[JBoss JIRA] (ISPN-10051) Cluster stats error after node leaves
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10051?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10051:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Cluster stats error after node leaves
> -------------------------------------
>
> Key: ISPN-10051
> URL: https://issues.jboss.org/browse/ISPN-10051
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.9.Final
> Reporter: Dan Berindei
> Assignee: Will Burns
> Priority: Major
> Fix For: 9.4.12.Final
>
>
> Up to 9.4.x, {{ClusterCacheStatsImpl}} uses the distributed executor service (DES) to collect statistics from all the members of the cache. However, the DES fails with a {{SuspectException}} if one of the cache members is no longer in the cluster view, which is very common (a crashed node is always removed from the cluster view first and from the cache topology later):
> {noformat}
> 23:40:57,029 ERROR [org.infinispan.stats.impl.ClusterCacheStatsImpl] (pool-1-thread-1) Could not execute cluster wide cache stats operation : java.util.concurrent.ExecutionException:
> org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node rhdg73-server-11-9pl9h was suspected
> {noformat}
> In 10.0.x the distributed executor was removed and stats use the cluster executor, which only works with the cluster view.
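The gap between cache topology and cluster view described above can be illustrated with a small sketch: before dispatching a stats request, keep only the cache members that are still in the cluster view. This is an illustration under assumptions, not the actual fix; {{StatsTargets}} and {{reachable}} are hypothetical names, and plain strings stand in for node addresses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Infinispan.
class StatsTargets {
    // Returns the cache members that are still present in the cluster view,
    // preserving the order of the cache member list.
    static List<String> reachable(List<String> cacheMembers, List<String> clusterView) {
        List<String> result = new ArrayList<>(cacheMembers);
        result.retainAll(clusterView);
        return result;
    }

    public static void main(String[] args) {
        // NodeB crashed: it has left the cluster view but is still in the cache topology.
        List<String> cacheMembers = List.of("NodeA", "NodeB", "NodeC");
        List<String> clusterView = List.of("NodeA", "NodeC");
        System.out.println(reachable(cacheMembers, clusterView));
    }
}
```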
5 years, 9 months
[JBoss JIRA] (ISPN-10040) Embedded and server thread pool defaults should be the same
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10040?page=com.atlassian.jira.plugin... ]
Dan Berindei reopened ISPN-10040:
---------------------------------
Reverted the 9.4.x commit for now
> Embedded and server thread pool defaults should be the same
> -----------------------------------------------------------
>
> Key: ISPN-10040
> URL: https://issues.jboss.org/browse/ISPN-10040
> Project: Infinispan
> Issue Type: Bug
> Components: Configuration
> Affects Versions: 8.2.11.Final, 10.0.0.Beta2, 9.4.9.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.0.0.Beta4, 9.4.12.Final
>
>
> Embedded:
> {code:java}
> DEFAULT_THREAD_COUNT.put(REMOTE_COMMAND_EXECUTOR, 200);
> DEFAULT_QUEUE_SIZE.put(REMOTE_COMMAND_EXECUTOR, 0);
> {code}
> Server:
> {code:java}
> REMOTE_COMMAND("remote-command", 25, 25, 100000, 60000),
> {code}
> Using a huge queue is not ok for the remote thread pool, but I missed it before because I assumed the server was using the {{KnownComponentNames}} defaults. We should unify the thread pool defaults in another class with a more topical name and remove {{KnownComponentNames}}.
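The difference between the two defaults can be reproduced with a plain {{java.util.concurrent.ThreadPoolExecutor}}. The sketch below uses scaled-down numbers and assumes the server tuple means (core threads, max threads, queue length, keep-alive), which the description does not state; {{PoolQueueDemo}} and {{saturate}} are hypothetical names.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo; not Infinispan code.
class PoolQueueDemo {
    // Submits `tasks` blocking tasks and returns {poolSize, queuedTasks, rejectedTasks}.
    // A queue size of 0 is modelled with a SynchronousQueue (direct hand-off).
    static int[] saturate(int core, int max, int queueSize, int tasks) {
        BlockingQueue<Runnable> queue = queueSize == 0
                ? new SynchronousQueue<>()
                : new LinkedBlockingQueue<>(queueSize);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task blocks until the latch opens, keeping its thread busy.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // default AbortPolicy: no free thread and no queue space
                }
            }
            return new int[] {pool.getPoolSize(), pool.getQueue().size(), rejected};
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Large queue, core == max: extra tasks wait in the queue.
        System.out.println(Arrays.toString(saturate(2, 2, 100, 10)));
        // No queue: the pool grows to max, then rejects.
        System.out.println(Arrays.toString(saturate(2, 4, 0, 10)));
    }
}
```

With the large queue, the pool stays at 2 threads and 8 tasks sit in the queue. With no queue, the pool grows to 4 threads and the remaining 6 tasks are rejected immediately.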
5 years, 9 months
[JBoss JIRA] (ISPN-10106) Fix thread leaks in JUnit modules
by Dan Berindei (Jira)
Dan Berindei created ISPN-10106:
-----------------------------------
Summary: Fix thread leaks in JUnit modules
Key: ISPN-10106
URL: https://issues.jboss.org/browse/ISPN-10106
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Server
Affects Versions: 9.4.11.Final, 10.0.0.Beta3
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 10.0.0.Beta4
ISPN-9863 added a thread leak checker, but even with all the recent improvements, leaks in JUnit modules are not reported as test failures.
Because Surefire ignores failures in JUnit configuration methods and listeners, the only sign of a leak is an error message in the console output:
{noformat}
[2019-03-28T17:33:54.119Z] org.apache.maven.surefire.booter.SurefireBooterForkException: ExecutionException There was an error in the forked process
[2019-03-28T17:33:54.119Z] Test mechanism :: Leaked threads:
[2019-03-28T17:33:54.119Z] {pool-7-thread-1: possible sources [UNKNOWN]},
[2019-03-28T17:33:54.119Z] {management-client-thread 1-1: possible sources [UNKNOWN]}
{noformat}
And in a dump file:
{noformat}
[2019-03-28T18:23:39.501Z] ./integrationtests/security-it/target/failsafe-reports/2019-03-28T17-28-10_213-jvmRun1.dump
[2019-03-28T18:23:39.501Z] # Created at 2019-03-28T17:29:25.623
[2019-03-28T18:23:39.501Z] org.apache.maven.surefire.testset.TestSetFailedException: Test mechanism :: Leaked threads:
[2019-03-28T18:23:39.501Z] {pool-7-thread-1: possible sources [UNKNOWN]},
[2019-03-28T18:23:39.501Z] {management-client-thread 1-1: possible sources [UNKNOWN]}
[2019-03-28T18:23:39.501Z] at org.apache.maven.surefire.common.junit4.JUnit4RunListener.rethrowAnyTestMechanismFailures(JUnit4RunListener.java:223)
{noformat}
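For readers unfamiliar with leak checking, the basic idea can be sketched as a snapshot of live threads before a test and a diff afterwards. This is only a minimal illustration; it does not reproduce the checker added in ISPN-9863, and {{ThreadLeakCheck}}, {{liveThreads}} and {{leakedSince}} are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a snapshot-and-diff leak check; not the Infinispan checker.
class ThreadLeakCheck {
    // Snapshot of the threads that are currently live in this JVM.
    static Set<Thread> liveThreads() {
        return new HashSet<>(Thread.getAllStackTraces().keySet());
    }

    // Threads that are alive now but were not part of the earlier snapshot.
    static Set<Thread> leakedSince(Set<Thread> before) {
        Set<Thread> leaked = liveThreads();
        leaked.removeAll(before);
        leaked.removeIf(t -> !t.isAlive());
        return leaked;
    }

    public static void main(String[] args) {
        Set<Thread> before = liveThreads();
        // Simulate a test that starts a thread and never stops it.
        Thread leaky = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "leaky-thread");
        leaky.setDaemon(true);
        leaky.start();
        for (Thread t : leakedSince(before)) {
            System.out.println("Leaked thread: " + t.getName());
        }
        leaky.interrupt();
    }
}
```

To turn such a diff into a visible failure, the check would have to run somewhere the test framework reports errors, which is the gap this issue describes for the JUnit modules.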
5 years, 9 months