[infinispan-issues] [JBoss JIRA] (ISPN-10093) PersistenceManagerImpl stop deadlock with topology update
Will Burns (Jira)
issues at jboss.org
Thu May 23 10:40:00 EDT 2019
[ https://issues.jboss.org/browse/ISPN-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737822#comment-13737822 ]
Will Burns edited comment on ISPN-10093 at 5/23/19 10:39 AM:
-------------------------------------------------------------
Oh, and also the semaphore acquire only takes permits when it can get the entire batch at once; the key is just that it is holding the writeLock while it waits, which in turn blocks the store availability poll.
This brings up the point that pollStoreAvailability should probably use tryLock rather than lock, as it shouldn't matter if it can't get the lock.
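A minimal sketch of the tryLock idea, not the actual PersistenceManagerImpl code: the class name, the {{storesMutex}} field and the {{completedPolls}} counter are hypothetical stand-ins, and the real availability check is replaced by a counter increment. It only shows that a poll which uses tryLock on the read lock returns immediately while another thread holds the write lock, instead of parking as in the thread dump.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class TryLockPollSketch {
    // Hypothetical stand-in for the read/write lock that guards the stores.
    private final ReentrantReadWriteLock storesMutex = new ReentrantReadWriteLock();
    // Counts polls that actually ran; a placeholder for the real availability check.
    private final AtomicInteger completedPolls = new AtomicInteger();

    /** Returns true if the poll ran, false if it was skipped because the lock was busy. */
    public boolean pollStoreAvailability() {
        // tryLock() never parks: if a writer holds the lock, skip this round.
        if (!storesMutex.readLock().tryLock()) {
            return false;
        }
        try {
            completedPolls.incrementAndGet();
            return true;
        } finally {
            storesMutex.readLock().unlock();
        }
    }

    public int completedPolls() {
        return completedPolls.get();
    }

    /** Holds the write lock on the calling thread and runs one poll on another thread. */
    public boolean pollWhileWriteLocked() {
        storesMutex.writeLock().lock();
        try {
            // A different thread is required: the write-lock holder itself could
            // reentrantly take the read lock, which would hide the contention.
            return CompletableFuture.supplyAsync(this::pollStoreAvailability).join();
        } finally {
            storesMutex.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        TryLockPollSketch sketch = new TryLockPollSketch();
        System.out.println("poll while write-locked ran: " + sketch.pollWhileWriteLocked());
        System.out.println("poll when uncontended ran: " + sketch.pollStoreAvailability());
    }
}
```

With a plain lock() call the first poll would block until the writer released the lock; with tryLock it is simply skipped and the next scheduled run can try again.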
was (Author: william.burns):
Oh and also the semaphore acquire only acquires permits when it can get the entire batch at once, the key is just that it is holding the writeLock which in turn blocks the storeAvailability.
This brings up the point that pollStoreAvailability should probably use a tryLock and not a lock call as it shouldn't matter if it can't get it. Same thing for expiration.
> PersistenceManagerImpl stop deadlock with topology update
> ---------------------------------------------------------
>
> Key: ISPN-10093
> URL: https://issues.jboss.org/browse/ISPN-10093
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 10.0.0.Beta3
> Reporter: Dan Berindei
> Assignee: Will Burns
> Priority: Major
> Labels: testsuite_stability
> Fix For: 10.0.0.Beta4
>
> Attachments: threaddump.txt
>
>
> {{DistSyncStoreNotSharedTest.clearContent}} hung in CI recently:
> {noformat}
> "testng-DistSyncStoreNotSharedTest" #16 prio=5 os_prio=0 cpu=11511.26ms elapsed=435.14s tid=0x00007fdb710b6000 nid=0x3222 waiting on condition [0x00007fdb352d3000]
> java.lang.Thread.State: WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
> - parking to wait for <0x00000000c8a22450> (a java.util.concurrent.Semaphore$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11/AbstractQueuedSynchronizer.java:885)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11/AbstractQueuedSynchronizer.java:1009)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11/AbstractQueuedSynchronizer.java:1324)
> at java.util.concurrent.Semaphore.acquireUninterruptibly(java.base@11/Semaphore.java:504)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.stop(PersistenceManagerImpl.java:222)
> at jdk.internal.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
> at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11/DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(java.base@11/Method.java:566)
> at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:79)
> at org.infinispan.commons.util.SecurityActions$$Lambda$237/0x0000000100661c40.run(Unknown Source)
> at org.infinispan.commons.util.SecurityActions.doPrivileged(SecurityActions.java:71)
> at org.infinispan.commons.util.SecurityActions.invokeAccessibly(SecurityActions.java:76)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:181)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.performStop(BasicComponentRegistryImpl.java:601)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.stopWrapper(BasicComponentRegistryImpl.java:590)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.stop(BasicComponentRegistryImpl.java:461)
> at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:431)
> at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:366)
> at org.infinispan.cache.impl.CacheImpl.performImmediateShutdown(CacheImpl.java:1160)
> at org.infinispan.cache.impl.CacheImpl.stop(CacheImpl.java:1125)
> at org.infinispan.cache.impl.AbstractDelegatingCache.stop(AbstractDelegatingCache.java:521)
> at org.infinispan.manager.DefaultCacheManager.terminate(DefaultCacheManager.java:747)
> at org.infinispan.manager.DefaultCacheManager.stopCaches(DefaultCacheManager.java:799)
> at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:775)
> at org.infinispan.test.TestingUtil.killCacheManagers(TestingUtil.java:846)
> at org.infinispan.test.MultipleCacheManagersTest.clearContent(MultipleCacheManagersTest.java:158)
> "persistence-thread-DistSyncStoreNotSharedTest-NodeB-p16432-t1" #53654 daemon prio=5 os_prio=0 cpu=1.26ms elapsed=301.93s tid=0x00007fdb3c3d8000 nid=0x8ef waiting on condition [0x00007fdb00055000]
> java.lang.Thread.State: WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
> - parking to wait for <0x00000000c8b1fb88> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11/AbstractQueuedSynchronizer.java:885)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11/AbstractQueuedSynchronizer.java:1009)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11/AbstractQueuedSynchronizer.java:1324)
> at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(java.base@11/ReentrantReadWriteLock.java:738)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.pollStoreAvailability(PersistenceManagerImpl.java:196)
> at org.infinispan.persistence.manager.PersistenceManagerImpl$$Lambda$492/0x00000001007fb440.run(Unknown Source)
> at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11/Executors.java:515)
> at java.util.concurrent.FutureTask.runAndReset(java.base@11/FutureTask.java:305)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11/ScheduledThreadPoolExecutor.java:305)
> "transport-thread-DistSyncStoreNotSharedTest-NodeB-p16424-t5" #53646 daemon prio=5 os_prio=0 cpu=3.15ms elapsed=301.94s tid=0x00007fdb2007a000 nid=0x8e8 waiting on condition [0x00007fdb0b406000]
> java.lang.Thread.State: WAITING (parking)
> at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
> - parking to wait for <0x00000000c8d2abb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11/AbstractQueuedSynchronizer.java:2081)
> at io.reactivex.internal.operators.flowable.BlockingFlowableIterable$BlockingFlowableIterator.hasNext(BlockingFlowableIterable.java:94)
> at io.reactivex.Flowable.blockingForEach(Flowable.java:5682)
> at org.infinispan.statetransfer.StateConsumerImpl.removeStaleData(StateConsumerImpl.java:1011)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:453)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:202)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:58)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.updateConsistentHash(StateTransferManagerImpl.java:114)
> at org.infinispan.topology.LocalTopologyManagerImpl.resetLocalTopologyBeforeRebalance(LocalTopologyManagerImpl.java:437)
> at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:519)
> - locked <0x00000000c8b30b30> (a org.infinispan.topology.LocalCacheStatus)
> at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:484)
> at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$574/0x000000010089a040.run(Unknown Source)
> at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175){noformat}
> [Full thread dump|https://ci.infinispan.org/job/Infinispan/job/master/1133/artifact/core/]
> Somehow the producer thread for the transport-thread iteration is blocked, but without waiting for the persistence mutex. Maybe it's waiting for a topology? Not sure if it's relevant, but the last test to run was {{testClearWithFlag}}, so the data container was empty and the store had 5 entries.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)