[JBoss JIRA] (ISPN-10346) BlockingTaskAwareExecutorServiceImpl may leak controller thread
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10346?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-10346:
--------------------------------
Status: Open (was: New)
> BlockingTaskAwareExecutorServiceImpl may leak controller thread
> ---------------------------------------------------------------
>
> Key: ISPN-10346
> URL: https://issues.jboss.org/browse/ISPN-10346
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Priority: Major
> Fix For: 10.0.0.Beta4
>
>
> I got a thread leak when running the test suite with trace logging enabled and {{taskset -c 1-2}}:
> {noformat}
> 17:06:05,307 DEBUG (testng-Test:[]) [TestSuiteProgress] Test configuration finished: org.infinispan.remoting.transport.InitialClusterSizeTest.testClassFinished
> 17:15:17,284 ERROR [TestSuiteProgress] Test failed: InitialClusterSizeTest.ThreadLeakChecker
> java.lang.RuntimeException: Leaked thread Controller-remote-thread-InitialClusterSizeTest-NodeB
> at jdk.internal.misc.Unsafe.park(Native Method) ~[?:?]
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) ~[?:?]
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885) ~[?:?]
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039) ~[?:?]
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345) ~[?:?]
> at java.util.concurrent.Semaphore.acquire(Semaphore.java:318) ~[?:?]
> at org.infinispan.util.concurrent.BlockingTaskAwareExecutorServiceImpl$ControllerThread.run(BlockingTaskAwareExecutorServiceImpl.java:159) ~[classes/:?]
> {noformat}
> I believe it may be caused by {{doExecute()}} adding a task to the blocked tasks queue without calling {{checkForReadyTasks()}} (after the executor rejected the task). {{ControllerThread.run()}} needs a semaphore permit to start polling for tasks, so it needs a {{semaphore.release()}} after each {{blockedTasks.add().
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months
[JBoss JIRA] (ISPN-10346) BlockingTaskAwareExecutorServiceImpl may leak controller thread
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-10346?page=com.atlassian.jira.plugin... ]
Dan Berindei reassigned ISPN-10346:
-----------------------------------
Assignee: Dan Berindei
> BlockingTaskAwareExecutorServiceImpl may leak controller thread
> ---------------------------------------------------------------
>
> Key: ISPN-10346
> URL: https://issues.jboss.org/browse/ISPN-10346
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.0.0.Beta4
>
>
> I got a thread leak when running the test suite with trace logging enabled and {{taskset -c 1-2}}:
> {noformat}
> 17:06:05,307 DEBUG (testng-Test:[]) [TestSuiteProgress] Test configuration finished: org.infinispan.remoting.transport.InitialClusterSizeTest.testClassFinished
> 17:15:17,284 ERROR [TestSuiteProgress] Test failed: InitialClusterSizeTest.ThreadLeakChecker
> java.lang.RuntimeException: Leaked thread Controller-remote-thread-InitialClusterSizeTest-NodeB
> at jdk.internal.misc.Unsafe.park(Native Method) ~[?:?]
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) ~[?:?]
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885) ~[?:?]
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039) ~[?:?]
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345) ~[?:?]
> at java.util.concurrent.Semaphore.acquire(Semaphore.java:318) ~[?:?]
> at org.infinispan.util.concurrent.BlockingTaskAwareExecutorServiceImpl$ControllerThread.run(BlockingTaskAwareExecutorServiceImpl.java:159) ~[classes/:?]
> {noformat}
> I believe it may be caused by {{doExecute()}} adding a task to the blocked tasks queue without calling {{checkForReadyTasks()}} (after the executor rejected the task). {{ControllerThread.run()}} needs a semaphore permit to start polling for tasks, so it needs a {{semaphore.release()}} after each {{blockedTasks.add().
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months
[JBoss JIRA] (ISPN-10269) ProtobufMetadataManagerImpl.registerProtofiles always times out
by Will Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-10269?page=com.atlassian.jira.plugin... ]
Will Burns updated ISPN-10269:
------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> ProtobufMetadataManagerImpl.registerProtofiles always times out
> ---------------------------------------------------------------
>
> Key: ISPN-10269
> URL: https://issues.jboss.org/browse/ISPN-10269
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Querying, Test Suite - Query
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
>
> {{ProtobufMetadataManagerImpl.registerProtofiles()}} uses {{protobufSchemaCache.putAll(files)}} under the covers, and because of a bug {{ProtobufMetadataManagerInterceptor.visitPutMapCommand()}} it tries to lock the {{.errors}} key both on the originator and on the remote nodes (during remote prepare). The {{LockControlCommand}} goes through {{TransactionSynchronizerInterceptor}} and tries to "lock" the transaction with {{enterSynchronizationAsync()}}, but it hangs because the {{PrepareCommand}} invocation can't continue and complete its {{releaseFuture}}.
> {noformat}
> 00:59:56,177 ERROR [org.infinispan.transaction.impl.TransactionCoordinator] (pool-8-thread-1) ISPN000097: Error while processing a prepare in a single-phase transaction: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:263)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:159)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionTable.afterCompletion(TransactionTable.java:870)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.synchronization.SynchronizationAdapter.afterCompletion(SynchronizationAdapter.java:33)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction.performConsumer(AbstractTransaction.java:223)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction$AssociatingSynchronization.afterCompletion(AbstractTransaction.java:306)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.afterCompletion(SynchronizationImple.java:96)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.afterCompletion(TwoPhaseCoordinator.java:545)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:101)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1288)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
> at org.jboss.jts.integration@5.9.0.Final//com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:89)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.LocalTransaction.commitAndDissociate(LocalTransaction.java:77)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.ContextTransactionManager.commit(ContextTransactionManager.java:71)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.tryCommit(CacheImpl.java:1981)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandWithInjectedTx(CacheImpl.java:1901)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1874)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:1458)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:2005)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:469)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.AbstractDelegatingCache.putAll(AbstractDelegatingCache.java:459)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.EncoderCache.putAll(EncoderCache.java:693)
> at org.infinispan.remote-query.server:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.query.remote.impl.ProtobufMetadataManagerImpl.registerProtofiles(ProtobufMetadataManagerImpl.java:166)
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> ... 3 more
> {noformat}
> {{enterSynchronizationAsync()}} doesn't have any timeout. When the prepare times out on the originator and it sends a {{RollbackCommand}} to all the affected nodes, the {{RollbackCommand}} will hang as well. However, because {{ProtobufMetadataManagerImpl}} configures the protobuf metadata cache with {{invocationBatching().enable()}}, it uses {{Synchronization}}, and all the exceptions are hidden from the user.
> The failure can still causes failures later in {{RemoteQueryDescriptorIT}}:
> {noformat}
> 00:57:59.423 [ERROR] testDescriptorPropagation(org.infinispan.server.test.query.RemoteQueryDescriptorIT) Time elapsed: 38.848 s <<< FAILURE!
> java.lang.AssertionError
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.registerProtoOnServer1(RemoteQueryDescriptorIT.java:83)
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.testDescriptorPropagation(RemoteQueryDescriptorIT.java:59)
> {noformat}
> Interesting enough, it never fails when running all the tests in {{server/integration/testsuite}} on master, but it fails when running only {{RemoteQueryDescriptorIT}} with {{mvn verify -pl server/integration/testsuite -Dit.test=RemoteQueryDescriptorIT -Psuite.queries}}, and it also fails when running all the tests after I removed {{StringBasedStoreMultinodeIT.testSingleton}}. Maybe it's because multiple ITs share the same global state persistence location?
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months
[JBoss JIRA] (ISPN-10346) BlockingTaskAwareExecutorServiceImpl may leak controller thread
by Dan Berindei (Jira)
Dan Berindei created ISPN-10346:
-----------------------------------
Summary: BlockingTaskAwareExecutorServiceImpl may leak controller thread
Key: ISPN-10346
URL: https://issues.jboss.org/browse/ISPN-10346
Project: Infinispan
Issue Type: Bug
Components: Core, Test Suite - Core
Affects Versions: 10.0.0.Beta3
Reporter: Dan Berindei
Fix For: 10.0.0.Beta4
I got a thread leak when running the test suite with trace logging enabled and {{taskset -c 1-2}}:
{noformat}
17:06:05,307 DEBUG (testng-Test:[]) [TestSuiteProgress] Test configuration finished: org.infinispan.remoting.transport.InitialClusterSizeTest.testClassFinished
17:15:17,284 ERROR [TestSuiteProgress] Test failed: InitialClusterSizeTest.ThreadLeakChecker
java.lang.RuntimeException: Leaked thread Controller-remote-thread-InitialClusterSizeTest-NodeB
at jdk.internal.misc.Unsafe.park(Native Method) ~[?:?]
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:194) ~[?:?]
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885) ~[?:?]
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039) ~[?:?]
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345) ~[?:?]
at java.util.concurrent.Semaphore.acquire(Semaphore.java:318) ~[?:?]
at org.infinispan.util.concurrent.BlockingTaskAwareExecutorServiceImpl$ControllerThread.run(BlockingTaskAwareExecutorServiceImpl.java:159) ~[classes/:?]
{noformat}
I believe it may be caused by {{doExecute()}} adding a task to the blocked tasks queue without calling {{checkForReadyTasks()}} (after the executor rejected the task). {{ControllerThread.run()}} needs a semaphore permit to start polling for tasks, so it needs a {{semaphore.release()}} after each {{blockedTasks.add().
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months
[JBoss JIRA] (ISPN-10269) ProtobufMetadataManagerImpl.registerProtofiles always times out
by Gustavo Fernandes (Jira)
[ https://issues.jboss.org/browse/ISPN-10269?page=com.atlassian.jira.plugin... ]
Gustavo Fernandes updated ISPN-10269:
-------------------------------------
Git Pull Request: https://github.com/infinispan/infinispan/pull/7003, https://github.com/infinispan/infinispan/pull/7088 (was: https://github.com/infinispan/infinispan/pull/7003)
> ProtobufMetadataManagerImpl.registerProtofiles always times out
> ---------------------------------------------------------------
>
> Key: ISPN-10269
> URL: https://issues.jboss.org/browse/ISPN-10269
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Querying, Test Suite - Query
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
>
> {{ProtobufMetadataManagerImpl.registerProtofiles()}} uses {{protobufSchemaCache.putAll(files)}} under the covers, and because of a bug {{ProtobufMetadataManagerInterceptor.visitPutMapCommand()}} it tries to lock the {{.errors}} key both on the originator and on the remote nodes (during remote prepare). The {{LockControlCommand}} goes through {{TransactionSynchronizerInterceptor}} and tries to "lock" the transaction with {{enterSynchronizationAsync()}}, but it hangs because the {{PrepareCommand}} invocation can't continue and complete its {{releaseFuture}}.
> {noformat}
> 00:59:56,177 ERROR [org.infinispan.transaction.impl.TransactionCoordinator] (pool-8-thread-1) ISPN000097: Error while processing a prepare in a single-phase transaction: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:263)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:159)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionTable.afterCompletion(TransactionTable.java:870)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.synchronization.SynchronizationAdapter.afterCompletion(SynchronizationAdapter.java:33)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction.performConsumer(AbstractTransaction.java:223)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction$AssociatingSynchronization.afterCompletion(AbstractTransaction.java:306)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.afterCompletion(SynchronizationImple.java:96)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.afterCompletion(TwoPhaseCoordinator.java:545)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:101)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1288)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
> at org.jboss.jts.integration@5.9.0.Final//com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:89)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.LocalTransaction.commitAndDissociate(LocalTransaction.java:77)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.ContextTransactionManager.commit(ContextTransactionManager.java:71)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.tryCommit(CacheImpl.java:1981)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandWithInjectedTx(CacheImpl.java:1901)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1874)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:1458)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:2005)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:469)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.AbstractDelegatingCache.putAll(AbstractDelegatingCache.java:459)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.EncoderCache.putAll(EncoderCache.java:693)
> at org.infinispan.remote-query.server:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.query.remote.impl.ProtobufMetadataManagerImpl.registerProtofiles(ProtobufMetadataManagerImpl.java:166)
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> ... 3 more
> {noformat}
> {{enterSynchronizationAsync()}} doesn't have any timeout. When the prepare times out on the originator and it sends a {{RollbackCommand}} to all the affected nodes, the {{RollbackCommand}} will hang as well. However, because {{ProtobufMetadataManagerImpl}} configures the protobuf metadata cache with {{invocationBatching().enable()}}, it uses {{Synchronization}}, and all the exceptions are hidden from the user.
> The failure can still causes failures later in {{RemoteQueryDescriptorIT}}:
> {noformat}
> 00:57:59.423 [ERROR] testDescriptorPropagation(org.infinispan.server.test.query.RemoteQueryDescriptorIT) Time elapsed: 38.848 s <<< FAILURE!
> java.lang.AssertionError
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.registerProtoOnServer1(RemoteQueryDescriptorIT.java:83)
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.testDescriptorPropagation(RemoteQueryDescriptorIT.java:59)
> {noformat}
> Interesting enough, it never fails when running all the tests in {{server/integration/testsuite}} on master, but it fails when running only {{RemoteQueryDescriptorIT}} with {{mvn verify -pl server/integration/testsuite -Dit.test=RemoteQueryDescriptorIT -Psuite.queries}}, and it also fails when running all the tests after I removed {{StringBasedStoreMultinodeIT.testSingleton}}. Maybe it's because multiple ITs share the same global state persistence location?
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months
[JBoss JIRA] (ISPN-10269) ProtobufMetadataManagerImpl.registerProtofiles always times out
by Gustavo Fernandes (Jira)
[ https://issues.jboss.org/browse/ISPN-10269?page=com.atlassian.jira.plugin... ]
Gustavo Fernandes updated ISPN-10269:
-------------------------------------
Fix Version/s: 9.4.16.Final
> ProtobufMetadataManagerImpl.registerProtofiles always times out
> ---------------------------------------------------------------
>
> Key: ISPN-10269
> URL: https://issues.jboss.org/browse/ISPN-10269
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Querying, Test Suite - Query
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4, 9.4.16.Final
>
>
> {{ProtobufMetadataManagerImpl.registerProtofiles()}} uses {{protobufSchemaCache.putAll(files)}} under the covers, and because of a bug {{ProtobufMetadataManagerInterceptor.visitPutMapCommand()}} it tries to lock the {{.errors}} key both on the originator and on the remote nodes (during remote prepare). The {{LockControlCommand}} goes through {{TransactionSynchronizerInterceptor}} and tries to "lock" the transaction with {{enterSynchronizationAsync()}}, but it hangs because the {{PrepareCommand}} invocation can't continue and complete its {{releaseFuture}}.
> {noformat}
> 00:59:56,177 ERROR [org.infinispan.transaction.impl.TransactionCoordinator] (pool-8-thread-1) ISPN000097: Error while processing a prepare in a single-phase transaction: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:263)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:159)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.impl.TransactionTable.afterCompletion(TransactionTable.java:870)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.transaction.synchronization.SynchronizationAdapter.afterCompletion(SynchronizationAdapter.java:33)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction.performConsumer(AbstractTransaction.java:223)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.AbstractTransaction$AssociatingSynchronization.afterCompletion(AbstractTransaction.java:306)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.resources.arjunacore.SynchronizationImple.afterCompletion(SynchronizationImple.java:96)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.afterCompletion(TwoPhaseCoordinator.java:545)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:101)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1288)
> at org.jboss.jts@5.9.0.Final//com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
> at org.jboss.jts.integration@5.9.0.Final//com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:89)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.LocalTransaction.commitAndDissociate(LocalTransaction.java:77)
> at org.wildfly.transaction.client@1.1.2.Final//org.wildfly.transaction.client.ContextTransactionManager.commit(ContextTransactionManager.java:71)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.tryCommit(CacheImpl.java:1981)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandWithInjectedTx(CacheImpl.java:1901)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1874)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:1458)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:2005)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.CacheImpl.putAll(CacheImpl.java:469)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.AbstractDelegatingCache.putAll(AbstractDelegatingCache.java:459)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.cache.impl.EncoderCache.putAll(EncoderCache.java:693)
> at org.infinispan.remote-query.server:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.query.remote.impl.ProtobufMetadataManagerImpl.registerProtofiles(ProtobufMetadataManagerImpl.java:166)
> Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 3411 from node1
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
> at org.infinispan.core:ispn-10.0@10.0.0-SNAPSHOT//org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
> ... 3 more
> {noformat}
> {{enterSynchronizationAsync()}} doesn't have any timeout. When the prepare times out on the originator and it sends a {{RollbackCommand}} to all the affected nodes, the {{RollbackCommand}} will hang as well. However, because {{ProtobufMetadataManagerImpl}} configures the protobuf metadata cache with {{invocationBatching().enable()}}, it uses {{Synchronization}}, and all the exceptions are hidden from the user.
> The failure can still causes failures later in {{RemoteQueryDescriptorIT}}:
> {noformat}
> 00:57:59.423 [ERROR] testDescriptorPropagation(org.infinispan.server.test.query.RemoteQueryDescriptorIT) Time elapsed: 38.848 s <<< FAILURE!
> java.lang.AssertionError
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.registerProtoOnServer1(RemoteQueryDescriptorIT.java:83)
> at org.infinispan.server.test.query.RemoteQueryDescriptorIT.testDescriptorPropagation(RemoteQueryDescriptorIT.java:59)
> {noformat}
> Interesting enough, it never fails when running all the tests in {{server/integration/testsuite}} on master, but it fails when running only {{RemoteQueryDescriptorIT}} with {{mvn verify -pl server/integration/testsuite -Dit.test=RemoteQueryDescriptorIT -Psuite.queries}}, and it also fails when running all the tests after I removed {{StringBasedStoreMultinodeIT.testSingleton}}. Maybe it's because multiple ITs share the same global state persistence location?
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
6 years, 9 months