[JBoss JIRA] (ISPN-9291) BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9291?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9291:
-------------------------------
Status: Open (was: New)
> BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-9291
> URL: https://issues.jboss.org/browse/ISPN-9291
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.3.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.3.0.Final
>
>
> The partition handling tests use {{BasePartitionHandlingTest.Partition.installMergeView(view1, view2)}} to install the merge view without waiting for {{MERGE3}} to run, making them much faster. Unfortunately, the implementation is incorrect: {{GMS.installView(view)}} only works for regular views, merge views need to be installed with {{GMS.installView(mergeView, digest)}}.
> The result is that the nodes that got isolated from the coordinator request the retransmission of all the {{NAKACK2}} messages (including view updates) since the cluster first started. The isolated nodes cannot install the merge view until they deliver all the older messages (even without knowing whether they're OOB or not). But if {{STABLE}} ran and cleared a range of messages already, the retransmission request cannot be satisfied, so the view updates will never be delivered.
> This is easily reproducible in {{CrashedNodeDuringConflictResolutionTest}} if we add a delay before updating the topology in {{StateConsumerImpl}}. The test installs the merge view manually, but then kills NodeC and expects the cluster to install the new view automatically. NodeD can't install the new view because it's waiting for earlier messages from NodeA:
> {noformat}
> 18:27:13,054 INFO (testng-test:[]) [TestSuiteProgress] Test starting: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> 18:27:13,640 DEBUG (testng-test:[]) [GMS] test-NodeA-39513: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,674 DEBUG (testng-test:[]) [GMS] test-NodeD-59078: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
> 18:27:13,828 TRACE (jgroups-7,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((1): {50}) to test-NodeA-39513
> 18:27:13,966 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((49): {1-49}) to test-NodeA-39513
> 18:27:14,067 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:14,504 DEBUG (testng-test:[]) [DefaultCacheManager] Stopping cache manager ISPN on test-NodeC-43706
> 18:27:18,642 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: joiners=[], suspected=[test-NodeC-43706], leaving=[], new view: [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,643 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: mcasting view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,646 DEBUG (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: installing view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
> 18:27:18,652 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: sending msg to null, src=test-NodeA-39513, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=63], TP: [cluster_name=ISPN]
> 18:27:18,656 TRACE (jgroups-20,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: received [dst: test-NodeA-39513, src: test-NodeB-9439 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=100, TP: [cluster_name=ISPN]
> 18:27:20,554 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,653 WARN (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: failed to collect all ACKs (expected=2) for view [test-NodeA-39513|11] after 2000ms, missing 1 ACKs from (1) test-NodeD-59078
> 18:27:20,656 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:27:20,756 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> ...
> 18:28:14,412 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,513 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
> 18:28:14,589 ERROR (testng-test:[]) [TestSuiteProgress] Test failed: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
> java.lang.RuntimeException: Cache ___defaultcache timed out waiting for rebalancing to complete on node test-NodeA-39513, current topology is CacheTopology{id=21, phase=CONFLICT_RESOLUTION, rebalanceId=7, currentCH=PartitionerConsistentHash:DefaultConsistentHash{ns=256, owners = (3)[test-NodeD-59078: 256+0, test-NodeA-39513: 0+256, test-NodeB-9439: 0+256]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeD-59078, test-NodeA-39513, test-NodeB-9439], persistentUUIDs=[828108c4-4251-49fc-9481-ff6392bea9fb, 1d4b6f07-b71b-41a1-adfb-abbe68944a9f, 3a1ece05-c282-433e-9eb5-7b3e0f1932aa]}. rebalanceInProgress=true, currentChIsBalanced=true
> at org.infinispan.test.TestingUtil.waitForNoRebalance(TestingUtil.java:392) ~[test-classes/:?]
> at org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.performMerge(CrashedNodeDuringConflictResolutionTest.java:113) ~[test-classes/:?]
> at org.infinispan.conflict.impl.BaseMergePolicyTest.testPartitionMergePolicy(BaseMergePolicyTest.java:137) ~[test-classes/:?]
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 10 months
[JBoss JIRA] (ISPN-9291) BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
by Dan Berindei (JIRA)
Dan Berindei created ISPN-9291:
----------------------------------
Summary: BasePartitionHandlingTest.Partition.installMergeView() doesn't compute the merge digest
Key: ISPN-9291
URL: https://issues.jboss.org/browse/ISPN-9291
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Core
Affects Versions: 9.3.0.CR1
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 9.3.0.Final
The partition handling tests use {{BasePartitionHandlingTest.Partition.installMergeView(view1, view2)}} to install the merge view without waiting for {{MERGE3}} to run, making them much faster. Unfortunately, the implementation is incorrect: {{GMS.installView(view)}} only works for regular views, merge views need to be installed with {{GMS.installView(mergeView, digest)}}.
The result is that the nodes that got isolated from the coordinator request the retransmission of all the {{NAKACK2}} messages (including view updates) since the cluster first started. The isolated nodes cannot install the merge view until they deliver all the older messages (even without knowing whether they're OOB or not). But if {{STABLE}} ran and cleared a range of messages already, the retransmission request cannot be satisfied, so the view updates will never be delivered.
This is easily reproducible in {{CrashedNodeDuringConflictResolutionTest}} if we add a delay before updating the topology in {{StateConsumerImpl}}. The test installs the merge view manually, but then kills NodeC and expects the cluster to install the new view automatically. NodeD can't install the new view because it's waiting for earlier messages from NodeA:
{noformat}
18:27:13,054 INFO (testng-test:[]) [TestSuiteProgress] Test starting: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
18:27:13,640 DEBUG (testng-test:[]) [GMS] test-NodeA-39513: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
18:27:13,674 DEBUG (testng-test:[]) [GMS] test-NodeD-59078: installing view MergeView::[test-NodeA-39513|10] (4) [test-NodeA-39513, test-NodeB-9439, test-NodeC-43706, test-NodeD-59078], 2 subgroups: [test-NodeA-39513|8] (2) [test-NodeA-39513, test-NodeB-9439], [test-NodeC-43706|9] (2) [test-NodeC-43706, test-NodeD-59078]
18:27:13,828 TRACE (jgroups-7,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((1): {50}) to test-NodeA-39513
18:27:13,966 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((49): {1-49}) to test-NodeA-39513
18:27:14,067 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
18:27:14,504 DEBUG (testng-test:[]) [DefaultCacheManager] Stopping cache manager ISPN on test-NodeC-43706
18:27:18,642 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: joiners=[], suspected=[test-NodeC-43706], leaving=[], new view: [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
18:27:18,643 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: mcasting view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
18:27:18,646 DEBUG (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: installing view [test-NodeA-39513|11] (3) [test-NodeA-39513, test-NodeB-9439, test-NodeD-59078]
18:27:18,652 TRACE (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: sending msg to null, src=test-NodeA-39513, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=63], TP: [cluster_name=ISPN]
18:27:18,656 TRACE (jgroups-20,test-NodeA-39513:[]) [TCP_NIO2] test-NodeA-39513: received [dst: test-NodeA-39513, src: test-NodeB-9439 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=100, TP: [cluster_name=ISPN]
18:27:20,554 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
18:27:20,653 WARN (VERIFY_SUSPECT.TimerThread-89,test-NodeA-39513:[]) [GMS] test-NodeA-39513: failed to collect all ACKs (expected=2) for view [test-NodeA-39513|11] after 2000ms, missing 1 ACKs from (1) test-NodeD-59078
18:27:20,656 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
18:27:20,756 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
...
18:28:14,412 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
18:28:14,513 TRACE (Timer runner-1,test-NodeD-59078:[]) [NAKACK2] test-NodeD-59078: sending XMIT_REQ ((45): {1-45}) to test-NodeA-39513
18:28:14,589 ERROR (testng-test:[]) [TestSuiteProgress] Test failed: org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.testPartitionMergePolicy[DIST_SYNC]
java.lang.RuntimeException: Cache ___defaultcache timed out waiting for rebalancing to complete on node test-NodeA-39513, current topology is CacheTopology{id=21, phase=CONFLICT_RESOLUTION, rebalanceId=7, currentCH=PartitionerConsistentHash:DefaultConsistentHash{ns=256, owners = (3)[test-NodeD-59078: 256+0, test-NodeA-39513: 0+256, test-NodeB-9439: 0+256]}, pendingCH=null, unionCH=null, actualMembers=[test-NodeD-59078, test-NodeA-39513, test-NodeB-9439], persistentUUIDs=[828108c4-4251-49fc-9481-ff6392bea9fb, 1d4b6f07-b71b-41a1-adfb-abbe68944a9f, 3a1ece05-c282-433e-9eb5-7b3e0f1932aa]}. rebalanceInProgress=true, currentChIsBalanced=true
at org.infinispan.test.TestingUtil.waitForNoRebalance(TestingUtil.java:392) ~[test-classes/:?]
at org.infinispan.conflict.impl.CrashedNodeDuringConflictResolutionTest.performMerge(CrashedNodeDuringConflictResolutionTest.java:113) ~[test-classes/:?]
at org.infinispan.conflict.impl.BaseMergePolicyTest.testPartitionMergePolicy(BaseMergePolicyTest.java:137) ~[test-classes/:?]
{noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 10 months
[JBoss JIRA] (ISPN-8976) 2 subclusters failed to merge to 1 cluster - IllegalLifecycleStateException
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8976?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-8976:
-------------------------------
Component/s: Core
> 2 subclusters failed to merge to 1 cluster - IllegalLifecycleStateException
> ---------------------------------------------------------------------------
>
> Key: ISPN-8976
> URL: https://issues.jboss.org/browse/ISPN-8976
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.1.4.Final
> Reporter: Robert Cernak
> Assignee: Ryan Emerson
> Fix For: 9.3.0.Final
>
> Attachments: logs.zip
>
>
> At the beginning I have main cluster consisted of 8 nodes.
> Then I disconnected main switch on which these nodes were connected.
> This leaded to separating main cluster to 2 subclusters - first with 2 nodes and second with 6 nodes. This was expected.
> After that I rebooted the nodes. After reboot, nodes again correctly formed 2 subclusters with 2 and 6 members.
> After a long time when all nodes were stable with low cpu load, I connected the main switch back which should lead to recreation of main cluster with 8 controllers.
> However main cluster did not recovered:
> subcluster2 did not change - still had 6 nodes connected - no new members
> subcluster1 - nodes did not connect with subcluster2 and after cca 30min they left the cluster.
> When I checked infinispan logs of node1 from 1st subcluster I had IllegalLifecycleStateException for every created cache (see included logs.zip):
> [transport-thread-744a974a-2811-4f79-ac63-f32daf005d7f-p4-t6] (ClusterCacheStatus.java:599) - ISPN000228: Failed to recover cache XXX state after the current node became the coordinator
> org.infinispan.IllegalLifecycleStateException: Cache container has been stopped and cannot be reused. Recreate the cache container.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 10 months
[JBoss JIRA] (ISPN-9211) CacheListener must be annotated with org.infinispan.notifications.Listener
by Diego Lovison (JIRA)
[ https://issues.jboss.org/browse/ISPN-9211?page=com.atlassian.jira.plugin.... ]
Diego Lovison closed ISPN-9211.
-------------------------------
Resolution: Rejected
We need to register via RemoteCache.addClientListener and not Cache.addListener
I will perform more test in this issue and reopen if needed.
> CacheListener must be annotated with org.infinispan.notifications.Listener
> --------------------------------------------------------------------------
>
> Key: ISPN-9211
> URL: https://issues.jboss.org/browse/ISPN-9211
> Project: Infinispan
> Issue Type: Bug
> Reporter: Diego Lovison
>
> Why I need to add @Listener in the following java class?
> @ClientListener is a listener.
> It seems redundant
> {code:java}
> @ClientListener
> public class CacheListener {
> @CacheEntryCreated
> public void entryCreated(CacheEntryCreatedEvent<String, LocationWeather> event) {
> if (!event.isOriginLocal()) {
> System.out.printf("-- Entry for %s modified by another node in the cluster\n", event.getKey());
> }
> }
> }
> {code}
> Without the @Listener annotation it will fail with the following error:
> {noformat}
> Exception in thread "main" org.infinispan.notifications.IncorrectListenerException: Cache listener class com.github.diegolovison.example.infinispan.listener.CacheListener must be annotated with org.infinispan.notifications.Listener
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 10 months
[JBoss JIRA] (ISPN-9286) WriteBehindFaultToleranceTest Failures
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-9286?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-9286:
-------------------------------
Status: Open (was: New)
> WriteBehindFaultToleranceTest Failures
> --------------------------------------
>
> Key: ISPN-9286
> URL: https://issues.jboss.org/browse/ISPN-9286
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Loaders and Stores
> Affects Versions: 9.3.0.CR1
> Reporter: Ryan Emerson
> Assignee: Ryan Emerson
> Fix For: 9.3.0.Final
>
>
> #testBlockingOnStoreAvailabilityChange failure due to the async write not completing before store.size is called.
> {code:java}
> java.lang.AssertionError: expected:<1> but was:<0>
> at org.infinispan.persistence.WriteBehindFaultToleranceTest.testBlockingOnStoreAvailabilityChange(WriteBehindFaultToleranceTest.java:55)
> at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ... Removed 23 stack frames
> {code}
> #testWritesFailSilentlyWhenConfigured failure due to the async write not being executed on the store before the entry is loaded from the store.
> {code:java}
> java.lang.NullPointerException
> at org.infinispan.persistence.WriteBehindFaultToleranceTest.testWritesFailSilentlyWhenConfigured(WriteBehindFaultToleranceTest.java:110)
> at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ... Removed 18 stack frames
> {code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 10 months
[JBoss JIRA] (ISPN-9286) WriteBehindFaultToleranceTest Failures
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-9286?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-9286:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/6061
> WriteBehindFaultToleranceTest Failures
> --------------------------------------
>
> Key: ISPN-9286
> URL: https://issues.jboss.org/browse/ISPN-9286
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Loaders and Stores
> Affects Versions: 9.3.0.CR1
> Reporter: Ryan Emerson
> Assignee: Ryan Emerson
> Fix For: 9.3.0.Final
>
>
> #testBlockingOnStoreAvailabilityChange failure due to the async write not completing before store.size is called.
> {code:java}
> java.lang.AssertionError: expected:<1> but was:<0>
> at org.infinispan.persistence.WriteBehindFaultToleranceTest.testBlockingOnStoreAvailabilityChange(WriteBehindFaultToleranceTest.java:55)
> at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ... Removed 23 stack frames
> {code}
> #testWritesFailSilentlyWhenConfigured failure due to the async write not being executed on the store before the entry is loaded from the store.
> {code:java}
> java.lang.NullPointerException
> at org.infinispan.persistence.WriteBehindFaultToleranceTest.testWritesFailSilentlyWhenConfigured(WriteBehindFaultToleranceTest.java:110)
> at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ... Removed 18 stack frames
> {code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 10 months
[JBoss JIRA] (ISPN-8533) Deadlock in pessimistic transaction
by Pedro Ruivo (JIRA)
[ https://issues.jboss.org/browse/ISPN-8533?page=com.atlassian.jira.plugin.... ]
Pedro Ruivo updated ISPN-8533:
------------------------------
Sprint: Sprint 9.3.0.Final
> Deadlock in pessimistic transaction
> -----------------------------------
>
> Key: ISPN-8533
> URL: https://issues.jboss.org/browse/ISPN-8533
> Project: Infinispan
> Issue Type: Bug
> Reporter: Pedro Ruivo
> Assignee: Pedro Ruivo
> Fix For: 9.3.0.Final
>
>
> Deadlock can happen (not sure how) during a topology change and pessimistic transaction.
> 2 transaction can end up in the pending lock manager and they will wait for each other to finish.
> {noformat}
> Caused by: org.infinispan.util.concurrent.TimeoutException: Could not acquire lock on xxxx in behalf of transaction GlobalTx:node-a:3746. Current owner GlobalTx:node-b:3629.
> at org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.timeout(DefaultPendingLockManager.java:262) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.awaitOn(DefaultPendingLockManager.java:345) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.awaitPendingTransactionsForAllKeys(DefaultPendingLockManager.java:147) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.checkPendingAndLockAllKeys(AbstractTxLockingInterceptor.java:166) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAllOrRegisterBackupLock(AbstractTxLockingInterceptor.java:134) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.localLockCommandWork(PessimisticLockingInterceptor.java:234) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.lambda$new$0(PessimisticLockingInterceptor.java:41) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> {noformat}
> {noformat}
> Caused by: org.infinispan.util.concurrent.TimeoutException: Could not acquire lock on xxxx in behalf of transaction GlobalTx:node-b:3629. Current
> owner GlobalTx:node-a:3746.
> at org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.timeout(DefaultPendingLockManager.java:262) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.awaitOn(DefaultPendingLockManager.java:345) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.util.concurrent.locks.impl.DefaultPendingLockManager.awaitPendingTransactionsForAllKeys(DefaultPendingLockManager.java:147) ~[infinispan-embedded-9.1.2.Final.
> jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.checkPendingAndLockAllKeys(AbstractTxLockingInterceptor.java:166) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAllOrRegisterBackupLock(AbstractTxLockingInterceptor.java:134) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.localLockCommandWork(PessimisticLockingInterceptor.java:234) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.lambda$new$0(PessimisticLockingInterceptor.java:41) ~[infinispan-embedded-9.1.2.Final.jar:9.1.2.Final]
> {noformat}
> notes: confirm the pending lock manager is cleanup!
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
7 years, 10 months