[JBoss JIRA] (ISPN-5884) ClusteredListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent random failures
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5884?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5884:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/3783
> ClusteredListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent random failures
> ----------------------------------------------------------------------------------------
>
> Key: ISPN-5884
> URL: https://issues.jboss.org/browse/ISPN-5884
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Fix For: 8.1.0.Beta1, 8.0.2.Final
>
>
> {{ClusteredListenerReplTest}} uses a {{BlockingInterceptor}} to ensure a write command is blocked on the backup owners while the primary owner is killed.
> The {{BlockingInterceptor}} is removed after the primary owner was killed, but only on one of the backups. This is enough if the node 0 becomes the primary owner of the key, because the command is retried before the previous invocation is unblocked, and {{BlockingInterceptor}} only blocks one command at a time. But if node 2 becomes the primary owner, the retry happens after the initial command was unblocked, and it blocks in {{BlockingInterceptor}}.
> The fix would be to remove the interceptor from both nodes.
> I also believe the "block one command at a time" logic should be removed, because it makes tests too dependent on timing.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5884) ClusteredListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent random failures
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5884?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5884:
-------------------------------
Fix Version/s: 8.0.2.Final
> ClusteredListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent random failures
> ----------------------------------------------------------------------------------------
>
> Key: ISPN-5884
> URL: https://issues.jboss.org/browse/ISPN-5884
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Fix For: 8.1.0.Beta1, 8.0.2.Final
>
>
> {{ClusteredListenerReplTest}} uses a {{BlockingInterceptor}} to ensure a write command is blocked on the backup owners while the primary owner is killed.
> The {{BlockingInterceptor}} is removed after the primary owner was killed, but only on one of the backups. This is enough if the node 0 becomes the primary owner of the key, because the command is retried before the previous invocation is unblocked, and {{BlockingInterceptor}} only blocks one command at a time. But if node 2 becomes the primary owner, the retry happens after the initial command was unblocked, and it blocks in {{BlockingInterceptor}}.
> The fix would be to remove the interceptor from both nodes.
> I also believe the "block one command at a time" logic should be removed, because it makes tests too dependent on timing.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5885) NonTotalOrderPerCacheInvocationHandlerImpl should release locks
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5885?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5885:
-------------------------------
Description:
Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its lock is never released.
{noformat}
12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
{noformat}
Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by -topology changes or- programming errors, so a solution that handles any exception would be preferable.
was:
Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its lock is never released.
{noformat}
12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
{noformat}
Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
> NonTotalOrderPerCacheInvocationHandlerImpl should release locks
> ---------------------------------------------------------------
>
> Key: ISPN-5885
> URL: https://issues.jboss.org/browse/ISPN-5885
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.0.1.Final, 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Labels: testsuite_stability
> Fix For: 8.1.0.Beta1, 8.0.2.Final
>
> Attachments: ThreeNodesReplicatedSplitAndMergeTest.log.zip
>
>
> Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
> There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its lock is never released.
> {noformat}
> 12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
> 12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
> 12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
> org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
> 12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
> {noformat}
> Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by -topology changes or- programming errors, so a solution that handles any exception would be preferable.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5885) NonTotalOrderPerCacheInvocationHandlerImpl should release locks
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5885?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5885:
-------------------------------
Description:
Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its lock is never released.
{noformat}
12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
{noformat}
Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
was:
Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its locks are never released.
{noformat}
12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
{noformat}
Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
> NonTotalOrderPerCacheInvocationHandlerImpl should release locks
> ---------------------------------------------------------------
>
> Key: ISPN-5885
> URL: https://issues.jboss.org/browse/ISPN-5885
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.0.1.Final, 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Labels: testsuite_stability
> Fix For: 8.1.0.Beta1, 8.0.2.Final
>
> Attachments: ThreeNodesReplicatedSplitAndMergeTest.log.zip
>
>
> Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
> There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its lock is never released.
> {noformat}
> 12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
> 12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
> 12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
> org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
> 12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
> {noformat}
> Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5885) NonTotalOrderPerCacheInvocationHandlerImpl should release locks
by Dan Berindei (JIRA)
Dan Berindei created ISPN-5885:
----------------------------------
Summary: NonTotalOrderPerCacheInvocationHandlerImpl should release locks
Key: ISPN-5885
URL: https://issues.jboss.org/browse/ISPN-5885
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 8.1.0.Alpha2, 8.0.1.Final
Reporter: Dan Berindei
Fix For: 8.1.0.Beta1, 8.0.2.Final
Attachments: ThreeNodesReplicatedSplitAndMergeTest.log.zip
Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its locks are never released.
{noformat}
12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
{noformat}
Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5885) NonTotalOrderPerCacheInvocationHandlerImpl should release locks
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5885?page=com.atlassian.jira.plugin.... ]
Dan Berindei reassigned ISPN-5885:
----------------------------------
Assignee: Pedro Ruivo
> NonTotalOrderPerCacheInvocationHandlerImpl should release locks
> ---------------------------------------------------------------
>
> Key: ISPN-5885
> URL: https://issues.jboss.org/browse/ISPN-5885
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.0.1.Final, 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Labels: testsuite_stability
> Fix For: 8.1.0.Beta1, 8.0.2.Final
>
> Attachments: ThreeNodesReplicatedSplitAndMergeTest.log.zip
>
>
> Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
> There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its locks are never released.
> {noformat}
> 12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
> 12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
> 12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
> org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
> 12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
> {noformat}
> Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5885) NonTotalOrderPerCacheInvocationHandlerImpl should release locks
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5885?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5885:
-------------------------------
Status: Open (was: New)
> NonTotalOrderPerCacheInvocationHandlerImpl should release locks
> ---------------------------------------------------------------
>
> Key: ISPN-5885
> URL: https://issues.jboss.org/browse/ISPN-5885
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.0.1.Final, 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Labels: testsuite_stability
> Fix For: 8.1.0.Beta1, 8.0.2.Final
>
> Attachments: ThreeNodesReplicatedSplitAndMergeTest.log.zip
>
>
> Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.
> There is currently one situation where I have reproduced this, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split in a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays the put command broadcasted in the {{AB}} partition on {{C}}. {{C}}'s cache is still in degraded mode, so the write fails, but its locks are never released.
> {noformat}
> 12:53:41,088 INFO (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
> 12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
> 12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
> org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
> 12:53:42,144 WARN (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
> {noformat}
> Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5613) Replication timeouts should show more information of the affected entries
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5613?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5613:
-------------------------------
Status: Open (was: New)
> Replication timeouts should show more information of the affected entries
> -------------------------------------------------------------------------
>
> Key: ISPN-5613
> URL: https://issues.jboss.org/browse/ISPN-5613
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Reporter: Wolf-Dieter Fink
> Assignee: Dan Berindei
>
> Typical timeouts for replication are
> ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (remote-thread-595) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: Replication timeout for jboss####-#####
> and
> ERROR ....
> Caused by: org.infinispan.util.concurrent.TimeoutException: Node jboss####-##### timed out
> The issue is that it is not possible to check what key/entry is affected.
> The cache can become inconsistent between nodes.
> There should be an information which key is affected.
> For some operations it is not possible at all as there is no information about the entries (i.e. commit) or the key is binary or custom without a valid toString() method.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-5884) ClusteredListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent random failures
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-5884?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-5884:
-------------------------------
Status: Open (was: New)
> ClusteredListenerReplTest.testPrimaryOwnerGoesDownAfterBackupRaisesEvent random failures
> ----------------------------------------------------------------------------------------
>
> Key: ISPN-5884
> URL: https://issues.jboss.org/browse/ISPN-5884
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 8.1.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Fix For: 8.1.0.Beta1
>
>
> {{ClusteredListenerReplTest}} uses a {{BlockingInterceptor}} to ensure a write command is blocked on the backup owners while the primary owner is killed.
> The {{BlockingInterceptor}} is removed after the primary owner was killed, but only on one of the backups. This is enough if the node 0 becomes the primary owner of the key, because the command is retried before the previous invocation is unblocked, and {{BlockingInterceptor}} only blocks one command at a time. But if node 2 becomes the primary owner, the retry happens after the initial command was unblocked, and it blocks in {{BlockingInterceptor}}.
> The fix would be to remove the interceptor from both nodes.
> I also believe the "block one command at a time" logic should be removed, because it makes tests too dependent on timing.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months