[JBoss JIRA] (ISPN-11101) Purge on JDBC shared stores can cause deadlocks
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11101?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11101:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Purge on JDBC shared stores can cause deadlocks
> -----------------------------------------------
>
> Key: ISPN-11101
> URL: https://issues.redhat.com/browse/ISPN-11101
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 9.4.17.Final, 10.1.0.CR1
> Reporter: Ryan Emerson
> Assignee: Ryan Emerson
> Priority: Major
> Fix For: 10.1.0.Final, 9.4.18.Final
>
>
> ISPN-10337 ensured that the JdbcStringBasedStore correctly acquires the locks of expired rows while purging store entries; however, it exposed an issue with shared stores. When the JDBC store is shared, the coordinator locks the rows of the affected entries and only releases them once they have all been removed as part of the purge transaction. However, the call to {{ExpirationManager::handleInStoreExpiration}} also sends a {{RemoveExpiredCommand}} to ensure that the entries are removed from memory. The store's delete method then throws {{java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction}} because the expired rows are still locked by the in-progress purge, while the purge itself cannot complete until the {{RemoveExpiredCommand}} has executed.
> In summary:
> # purge locks all the rows
> # purge sends a RemoveExpiredCommand and waits for it to complete
> # RemoveExpiredCommand tries to remove the row, but cannot because the row is still locked
> Solution: send the {{RemoveExpiredCommand}} with the {{SKIP_CACHE_STORE}} flag when the store is shared, so that it only removes entries from memory (see the sketch below).
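> For illustration, memory-only removal corresponds to the public {{AdvancedCache}} flag API. The sketch below is only a client-side analogue of the fix (the class, method, cache and key names are placeholders), not the actual patch, which sets the flag on the internally generated {{RemoveExpiredCommand}}:
> {code:java}
> import org.infinispan.AdvancedCache;
> import org.infinispan.Cache;
> import org.infinispan.context.Flag;
>
> final class SkipStoreRemovalSketch {
>
>    // Remove an expired entry from memory only, leaving the shared JDBC store
>    // untouched so the purge transaction that already holds the row lock can
>    // delete the row itself.
>    static void removeExpiredFromMemoryOnly(Cache<String, String> cache, String key) {
>       AdvancedCache<String, String> memoryOnly =
>             cache.getAdvancedCache().withFlags(Flag.SKIP_CACHE_STORE);
>       memoryOnly.remove(key);
>    }
> }
> {code}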
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-10882) Remove the warn message ISPN000026: Caught exception purging data container
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-10882?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-10882:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 10.1.0.Final
Resolution: Done
> Remove the warn message ISPN000026: Caught exception purging data container
> ---------------------------------------------------------------------------
>
> Key: ISPN-10882
> URL: https://issues.redhat.com/browse/ISPN-10882
> Project: Infinispan
> Issue Type: Bug
> Reporter: Diego Lovison
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.1.0.Final
>
>
> When node1 is leaving the cluster, the following warning message may appear:
> {noformat}
> 18:43:28,387 WARN [org.infinispan.expiration.impl.ClusterExpirationManager] (expiration-thread--p7-t1) ISPN000026: Caught exception purging data container!: java.lang.IllegalArgumentException: Node edg-perf01-2556 is not a member
> at org.infinispan.distribution.ch.impl.DefaultConsistentHash.getPrimarySegmentsForOwner(DefaultConsistentHash.java:128)
> at org.infinispan.distribution.group.impl.PartitionerConsistentHash.getPrimarySegmentsForOwner(PartitionerConsistentHash.java:76)
> at org.infinispan.expiration.impl.ClusterExpirationManager.purgeInMemoryContents(ClusterExpirationManager.java:123)
> at org.infinispan.expiration.impl.ClusterExpirationManager.processExpiration(ClusterExpirationManager.java:98)
> at org.infinispan.expiration.impl.ExpirationManagerImpl$ScheduledTask.run(ExpirationManagerImpl.java:282)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> Dan added the following comments:
> {noformat}
> Dan: but I think it can actually happen during start and/or after a merge
> Dan: when joining, the first cache topology usually doesn't have the joiner as a member
> Dan: after a merge as well, nodes that are not in the majority partition will receive a cache topology in which they are not members
> Dan: luckily StateConsumerImpl clears the data container and private stores after receiving that cache topology anyway, so there's nothing to expire
> Dan: and after the node becomes a full member again expiration will work
> {noformat}
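> One way to read these comments is that the purge can simply be skipped when the local node is not a member of the current consistent hash, instead of logging the warning. A minimal, hypothetical sketch of such a guard follows (class and method names are illustrative; the actual change may instead just drop or lower the log statement):
> {code:java}
> import java.util.Set;
>
> import org.infinispan.distribution.ch.ConsistentHash;
> import org.infinispan.remoting.transport.Address;
> import org.infinispan.topology.CacheTopology;
>
> final class MembershipGuardSketch {
>
>    // Returns the primary segments this node should purge, or null when the
>    // node is not (yet) a member of the current consistent hash, in which case
>    // the caller can skip the purge run instead of logging ISPN000026.
>    static Set<Integer> primarySegmentsOrNull(CacheTopology cacheTopology, Address localAddress) {
>       ConsistentHash ch = cacheTopology.getCurrentCH();
>       if (!ch.getMembers().contains(localAddress)) {
>          // Joining node or minority partition after a merge: StateConsumerImpl
>          // clears the data container anyway, so there is nothing to expire.
>          return null;
>       }
>       return ch.getPrimarySegmentsForOwner(localAddress);
>    }
> }
> {code}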
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11108) Move eviction components to impl package
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-11108?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-11108:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 10.1.0.Final
Resolution: Done
> Move eviction components to impl package
> ----------------------------------------
>
> Key: ISPN-11108
> URL: https://issues.redhat.com/browse/ISPN-11108
> Project: Infinispan
> Issue Type: Task
> Components: Core
> Affects Versions: 10.1.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.1.0.Final
>
>
> {{ActivationManager}} and {{PassivationManager}} are in a public package, probably because they were supposed to be used by custom {{DataContainer}} implementations. We no longer support custom {{DataContainer}} implementations, so we should move them to the impl package.
> {{EvictionManager}} is currently accessible through {{AdvancedCache.getEvictionManager()}}, so we can only deprecate it now and remove it in 11.0.
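> What the deprecation amounts to, sketched on a stand-in interface (the interface name and javadoc wording are illustrative, not the actual {{AdvancedCache}} change):
> {code:java}
> import org.infinispan.eviction.EvictionManager;
>
> // Stand-in for the relevant slice of AdvancedCache; only the accessor that
> // will be deprecated in 10.1 and removed in 11.0 is shown.
> interface AdvancedCacheSketch<K, V> {
>
>    /**
>     * @deprecated Since 10.1, with no public replacement: custom DataContainer
>     * implementations are no longer supported, so the eviction internals move
>     * to an impl package and this accessor goes away in 11.0.
>     */
>    @Deprecated
>    EvictionManager<K, V> getEvictionManager();
> }
> {code}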
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-10310) State Transfer needs to be made non blocking
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-10310?page=com.atlassian.jira.plugi... ]
Tristan Tarrant updated ISPN-10310:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> State Transfer needs to be made non blocking
> --------------------------------------------
>
> Key: ISPN-10310
> URL: https://issues.redhat.com/browse/ISPN-10310
> Project: Infinispan
> Issue Type: Sub-task
> Components: Core, State Transfer
> Reporter: Will Burns
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.1.0.Final
>
>
> State transfer currently invokes many methods that are already non-blocking, but then blocks to wait for them to complete. We need to convert all of these usages to be truly non-blocking and, where that is absolutely not possible, offload them to a separate thread pool. The final goal is to eventually eliminate the state transfer thread pool altogether.
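> The general shape of such a conversion, illustrated with plain {{CompletionStage}} composition (method names are illustrative, not taken from the state transfer code):
> {code:java}
> import java.util.concurrent.CompletionStage;
>
> final class NonBlockingConversionSketch {
>
>    // Blocking style: the caller's thread is parked until the (already
>    // non-blocking) remote invocation completes.
>    static void applyStateBlocking(CompletionStage<Void> remoteInvocation) {
>       remoteInvocation.toCompletableFuture().join();
>    }
>
>    // Non-blocking style: chain the continuation and hand the resulting stage
>    // back to the caller, so no thread waits anywhere in the chain.
>    static CompletionStage<Void> applyStateNonBlocking(CompletionStage<Void> remoteInvocation,
>                                                       Runnable afterStateApplied) {
>       return remoteInvocation.thenRun(afterStateApplied);
>    }
> }
> {code}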
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-9988) ScatteredStateConsumerImpl can leak the exclusive topology lock
by Tristan Tarrant (Jira)
[ https://issues.redhat.com/browse/ISPN-9988?page=com.atlassian.jira.plugin... ]
Tristan Tarrant updated ISPN-9988:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 10.1.0.Final
Resolution: Done
> ScatteredStateConsumerImpl can leak the exclusive topology lock
> ---------------------------------------------------------------
>
> Key: ISPN-9988
> URL: https://issues.redhat.com/browse/ISPN-9988
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.7.Final, 10.0.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.1.0.Final
>
>
> When an exception happens in {{ScatteredStateConsumerImpl.beforeTopologyInstalled}}, the exclusive topology lock is not released in {{StateConsumerImpl.onTopologyUpdate}}:
> {noformat}
> 15:21:54,783 ERROR (transport-thread-FunctionalScatteredInMemoryTest-NodeA-p43135-t5:[Topology-scattered]) [LocalTopologyManagerImpl] ISPN000230: Failed to start rebalance for cache scattered
> java.lang.IllegalArgumentException: The task is already cancelled.
> at org.infinispan.statetransfer.InboundTransferTask.cancelSegments(InboundTransferTask.java:172) ~[classes/:?]
> at org.infinispan.statetransfer.StateConsumerImpl.cancelTransfers(StateConsumerImpl.java:959) ~[classes/:?]
> at org.infinispan.scattered.impl.ScatteredStateConsumerImpl.beforeTopologyInstalled(ScatteredStateConsumerImpl.java:115) ~[classes/:?]
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:292) ~[classes/:?]
> at org.infinispan.scattered.impl.ScatteredStateConsumerImpl.onTopologyUpdate(ScatteredStateConsumerImpl.java:102) ~[classes/:?]
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:200) ~[classes/:?]
> {noformat}
> Because the exclusive topology lock is not released, threads that try to apply a new topology update block forever. This causes random failures with the ISPN-9863 thread leak checker:
> {noformat}
> 15:26:25,922 WARN (testng-RehashClusterPublisherManagerTest:[]) [ThreadLeakChecker] Possible leaked thread:
> "transport-thread-FunctionalScatteredInMemoryTest-NodeA-p43135-t3" daemon prio=5 tid=0x236fd nid=NA waiting
> java.lang.Thread.State: WAITING
> java.base@11/jdk.internal.misc.Unsafe.park(Native Method)
> java.base@11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> java.base@11/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
> java.base@11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917)
> java.base@11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240)
> java.base@11/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:959)
> app//org.infinispan.statetransfer.StateTransferLockImpl.acquireExclusiveTopologyLock(StateTransferLockImpl.java:42)
> app//org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:291)
> app//org.infinispan.scattered.impl.ScatteredStateConsumerImpl.onTopologyUpdate(ScatteredStateConsumerImpl.java:102)
> app//org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:200)
> app//org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:57)
> app//org.infinispan.statetransfer.StateTransferManagerImpl$1.updateConsistentHash(StateTransferManagerImpl.java:113)
> app//org.infinispan.topology.LocalTopologyManagerImpl.doHandleTopologyUpdate(LocalTopologyManagerImpl.java:353)
> app//org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleTopologyUpdate$1(LocalTopologyManagerImpl.java:275)
> 15:26:25,923 ERROR (testng-RehashClusterPublisherManagerTest:[]) [TestSuiteProgress] Test configuration failed: org.infinispan.reactive.publisher.impl.RehashClusterPublisherManagerTest.testClassFinished
> java.lang.AssertionError: Leaked threads:
> {transport-thread-FunctionalScatteredInMemoryTest-NodeA-p43135-t3: possible sources [org.infinispan.functional.FunctionalScatteredInMemoryTest[bias=ON_WRITE], org.infinispan.statetransfer.ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false], org.infinispan.functional.FunctionalCachestoreTest[passivation=true], org.infinispan.functional.distribution.rehash.FunctionalNonTxBackupOwnerBecomingPrimaryOwnerTest, org.infinispan.functional.distribution.rehash.FunctionalNonTxJoinerBecomingBackupOwnerTest, org.infinispan.api.mvcc.PutForExternalReadTest[REPL_SYNC, tx=false], org.infinispan.functional.distribution.rehash.FunctionalTxTest, org.infinispan.functional.FunctionalEncodingTypeTest[tx=true]]}
> at org.infinispan.commons.test.ThreadLeakChecker.performCheck(ThreadLeakChecker.java:148) ~[infinispan-commons-test-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT]
> at org.infinispan.commons.test.ThreadLeakChecker.testFinished(ThreadLeakChecker.java:109) ~[infinispan-commons-test-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT]
> at org.infinispan.test.fwk.TestResourceTracker.testFinished(TestResourceTracker.java:112) ~[test-classes/:?]
> at org.infinispan.test.AbstractInfinispanTest.testClassFinished(AbstractInfinispanTest.java:142) ~[test-classes/:?]
> {noformat}
> The fix should address both the exclusive topology lock itself, by releasing it in a finally block, and the {{IllegalArgumentException}}, either by ignoring already cancelled transfers or by only cancelling transfers while holding {{transferMapsLock}}.
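> A minimal sketch of the first part of that fix, assuming the {{StateTransferLock}} API visible in the thread dump above (the surrounding method and the {{Runnable}} stand in for the real {{StateConsumerImpl.onTopologyUpdate}} logic):
> {code:java}
> import org.infinispan.statetransfer.StateTransferLock;
>
> final class TopologyLockReleaseSketch {
>
>    // Whatever beforeTopologyInstalled does (including throwing), the
>    // exclusive topology lock is released again, so later topology updates
>    // cannot block forever on acquireExclusiveTopologyLock().
>    static void installTopology(StateTransferLock stateTransferLock, Runnable beforeTopologyInstalled) {
>       stateTransferLock.acquireExclusiveTopologyLock();
>       try {
>          beforeTopologyInstalled.run();
>       } finally {
>          stateTransferLock.releaseExclusiveTopologyLock();
>       }
>    }
> }
> {code}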
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-10882) Remove the warn message ISPN000026: Caught exception purging data container
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-10882?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-10882:
--------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/7669
> Remove the warn message ISPN000026: Caught exception purging data container
> ---------------------------------------------------------------------------
>
> Key: ISPN-10882
> URL: https://issues.redhat.com/browse/ISPN-10882
> Project: Infinispan
> Issue Type: Bug
> Reporter: Diego Lovison
> Assignee: Dan Berindei
> Priority: Major
>
> When node1 is leaving the cluster, the following warning message may appear:
> {noformat}
> 18:43:28,387 WARN [org.infinispan.expiration.impl.ClusterExpirationManager] (expiration-thread--p7-t1) ISPN000026: Caught exception purging data container!: java.lang.IllegalArgumentException: Node edg-perf01-2556 is not a member
> at org.infinispan.distribution.ch.impl.DefaultConsistentHash.getPrimarySegmentsForOwner(DefaultConsistentHash.java:128)
> at org.infinispan.distribution.group.impl.PartitionerConsistentHash.getPrimarySegmentsForOwner(PartitionerConsistentHash.java:76)
> at org.infinispan.expiration.impl.ClusterExpirationManager.purgeInMemoryContents(ClusterExpirationManager.java:123)
> at org.infinispan.expiration.impl.ClusterExpirationManager.processExpiration(ClusterExpirationManager.java:98)
> at org.infinispan.expiration.impl.ExpirationManagerImpl$ScheduledTask.run(ExpirationManagerImpl.java:282)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> Dan added the following comments:
> {noformat}
> Dan: but I think it can actually happen during start and/or after a merge
> Dan: when joining, the first cache topology usually doesn't have the joiner as a member
> Dan: after a merge as well, nodes that are not in the majority partition will receive a cache topology in which they are not members
> Dan: luckily StateConsumerImpl clears the data container and private stores after receiving that cache topology anyway, so there's nothing to expire
> Dan: and after the node becomes a full member again expiration will work
> {noformat}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-9988) ScatteredStateConsumerImpl can leak the exclusive topology lock
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-9988?page=com.atlassian.jira.plugin... ]
Dan Berindei updated ISPN-9988:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/7669
> ScatteredStateConsumerImpl can leak the exclusive topology lock
> ---------------------------------------------------------------
>
> Key: ISPN-9988
> URL: https://issues.redhat.com/browse/ISPN-9988
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.7.Final, 10.0.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
>
> When an exception happens in {{ScatteredStateConsumerImpl.beforeTopologyInstalled}}, the exclusive topology lock is not released in {{StateConsumerImpl.onTopologyUpdate}}:
> {noformat}
> 15:21:54,783 ERROR (transport-thread-FunctionalScatteredInMemoryTest-NodeA-p43135-t5:[Topology-scattered]) [LocalTopologyManagerImpl] ISPN000230: Failed to start rebalance for cache scattered
> java.lang.IllegalArgumentException: The task is already cancelled.
> at org.infinispan.statetransfer.InboundTransferTask.cancelSegments(InboundTransferTask.java:172) ~[classes/:?]
> at org.infinispan.statetransfer.StateConsumerImpl.cancelTransfers(StateConsumerImpl.java:959) ~[classes/:?]
> at org.infinispan.scattered.impl.ScatteredStateConsumerImpl.beforeTopologyInstalled(ScatteredStateConsumerImpl.java:115) ~[classes/:?]
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:292) ~[classes/:?]
> at org.infinispan.scattered.impl.ScatteredStateConsumerImpl.onTopologyUpdate(ScatteredStateConsumerImpl.java:102) ~[classes/:?]
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:200) ~[classes/:?]
> {noformat}
> Because the exclusive topology lock is not released, threads that try to apply a new topology update block forever. This causes random failures with the ISPN-9863 thread leak checker:
> {noformat}
> 15:26:25,922 WARN (testng-RehashClusterPublisherManagerTest:[]) [ThreadLeakChecker] Possible leaked thread:
> "transport-thread-FunctionalScatteredInMemoryTest-NodeA-p43135-t3" daemon prio=5 tid=0x236fd nid=NA waiting
> java.lang.Thread.State: WAITING
> java.base@11/jdk.internal.misc.Unsafe.park(Native Method)
> java.base@11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
> java.base@11/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
> java.base@11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:917)
> java.base@11/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1240)
> java.base@11/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:959)
> app//org.infinispan.statetransfer.StateTransferLockImpl.acquireExclusiveTopologyLock(StateTransferLockImpl.java:42)
> app//org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:291)
> app//org.infinispan.scattered.impl.ScatteredStateConsumerImpl.onTopologyUpdate(ScatteredStateConsumerImpl.java:102)
> app//org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:200)
> app//org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:57)
> app//org.infinispan.statetransfer.StateTransferManagerImpl$1.updateConsistentHash(StateTransferManagerImpl.java:113)
> app//org.infinispan.topology.LocalTopologyManagerImpl.doHandleTopologyUpdate(LocalTopologyManagerImpl.java:353)
> app//org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleTopologyUpdate$1(LocalTopologyManagerImpl.java:275)
> 15:26:25,923 ERROR (testng-RehashClusterPublisherManagerTest:[]) [TestSuiteProgress] Test configuration failed: org.infinispan.reactive.publisher.impl.RehashClusterPublisherManagerTest.testClassFinished
> java.lang.AssertionError: Leaked threads:
> {transport-thread-FunctionalScatteredInMemoryTest-NodeA-p43135-t3: possible sources [org.infinispan.functional.FunctionalScatteredInMemoryTest[bias=ON_WRITE], org.infinispan.statetransfer.ClusterTopologyManagerTest[SCATTERED_SYNC, tx=false], org.infinispan.functional.FunctionalCachestoreTest[passivation=true], org.infinispan.functional.distribution.rehash.FunctionalNonTxBackupOwnerBecomingPrimaryOwnerTest, org.infinispan.functional.distribution.rehash.FunctionalNonTxJoinerBecomingBackupOwnerTest, org.infinispan.api.mvcc.PutForExternalReadTest[REPL_SYNC, tx=false], org.infinispan.functional.distribution.rehash.FunctionalTxTest, org.infinispan.functional.FunctionalEncodingTypeTest[tx=true]]}
> at org.infinispan.commons.test.ThreadLeakChecker.performCheck(ThreadLeakChecker.java:148) ~[infinispan-commons-test-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT]
> at org.infinispan.commons.test.ThreadLeakChecker.testFinished(ThreadLeakChecker.java:109) ~[infinispan-commons-test-10.0.0-SNAPSHOT.jar:10.0.0-SNAPSHOT]
> at org.infinispan.test.fwk.TestResourceTracker.testFinished(TestResourceTracker.java:112) ~[test-classes/:?]
> at org.infinispan.test.AbstractInfinispanTest.testClassFinished(AbstractInfinispanTest.java:142) ~[test-classes/:?]
> {noformat}
> The fix should address both the exclusive topology lock itself, by releasing it in a finally block, and the {{IllegalArgumentException}}, either by ignoring already cancelled transfers or by only cancelling transfers while holding {{transferMapsLock}}.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11101) Purge on JDBC shared stores can cause deadlocks
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11101?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11101:
--------------------------------
Git Pull Request: https://github.com/infinispan/infinispan/pull/7692, https://github.com/infinispan/infinispan/pull/7701, https://github.com/infinispan/infinispan/pull/7704 (was: https://github.com/infinispan/infinispan/pull/7692, https://github.com/infinispan/infinispan/pull/7701)
> Purge on JDBC shared stores can cause deadlocks
> -----------------------------------------------
>
> Key: ISPN-11101
> URL: https://issues.redhat.com/browse/ISPN-11101
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 9.4.17.Final, 10.1.0.CR1
> Reporter: Ryan Emerson
> Assignee: Ryan Emerson
> Priority: Major
> Fix For: 10.1.0.Final, 9.4.18.Final
>
>
> ISPN-10337 ensured that the JdbcStringBasedStore correctly acquires the locks of expired rows while purging store entries; however, it exposed an issue with shared stores. When the JDBC store is shared, the coordinator locks the rows of the affected entries and only releases them once they have all been removed as part of the purge transaction. However, the call to {{ExpirationManager::handleInStoreExpiration}} also sends a {{RemoveExpiredCommand}} to ensure that the entries are removed from memory. The store's delete method then throws {{java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction}} because the expired rows are still locked by the in-progress purge, while the purge itself cannot complete until the {{RemoveExpiredCommand}} has executed.
> In summary:
> # purge locks all the rows
> # purge sends a RemoveExpiredCommand and waits for it to complete
> # RemoveExpiredCommand tries to remove the row, but cannot because the row is still locked
> Solution: send the {{RemoveExpiredCommand}} with the {{SKIP_CACHE_STORE}} flag when the store is shared, so that it only removes entries from memory.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)