[infinispan-issues] [JBoss JIRA] (ISPN-5885) NonTotalOrderPerCacheInvocationHandlerImpl should release locks

Mon Oct 26 09:03:00 EDT 2015

     [ https://issues.jboss.org/browse/ISPN-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Berindei reassigned ISPN-5885:
----------------------------------

    Assignee: Pedro Ruivo


> NonTotalOrderPerCacheInvocationHandlerImpl should release locks
> ---------------------------------------------------------------
>
>                 Key: ISPN-5885
>                 URL: https://issues.jboss.org/browse/ISPN-5885
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 8.0.1.Final, 8.1.0.Alpha2
>            Reporter: Dan Berindei
>            Assignee: Pedro Ruivo
>              Labels: testsuite_stability
>             Fix For: 8.1.0.Beta1, 8.0.2.Final
>
>         Attachments: ThreeNodesReplicatedSplitAndMergeTest.log.zip
>
>
> Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by {{NonTotalOrderPerCacheInvocationHandlerImpl}} before the interceptor chain even starts executing. If an exception is thrown after the lock acquisition but before the command reaches the locking interceptor, the locks are never released.
> So far I have reproduced this in one situation, in {{ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0}}. Node {{C}} is split into a partition by itself, and then it is merged with the {{AB}} partition. After the merge, JGroups sometimes replays on {{C}} the put command that was broadcast in the {{AB}} partition. {{C}}'s cache is still in degraded mode, so the write fails, but its locks are never released.
> {noformat}
> 12:53:41,088 INFO  (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
> 12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2 at NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2 at NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
> 12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2 at NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext at 67bb1b50]
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2 at NodeC-3920/9}, status=DEGRADED_MODE
> 12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2 at NodeC-3920/9}
> 12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
> org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2 at NodeC-3920/9}' is not available. Not all owners are in this partition
> 12:53:42,144 WARN  (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2 at NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
> {noformat}
> Of course, knowing that {{C}} is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.
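
The "handles any exception" option described above can be sketched as a try/finally around the chain invocation. This is a minimal, self-contained illustration and not Infinispan code: OwnerLocks, Chain, and handle are hypothetical names, and the lock table is a plain map keyed by owner so that the release does not depend on which thread acquired the lock. It assumes the release is idempotent, so it would be harmless if a locking interceptor had already released the key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LockReleaseSketch {

    /** Hypothetical owner-based lock table; NOT Infinispan's LockManager. */
    static final class OwnerLocks {
        private final ConcurrentMap<Object, Object> owners = new ConcurrentHashMap<>();

        /** Acquires the key for the owner if nobody holds it. */
        boolean tryLock(Object key, Object owner) {
            return owners.putIfAbsent(key, owner) == null;
        }

        /** Idempotent: does nothing if this owner no longer holds the key. */
        void unlock(Object key, Object owner) {
            owners.remove(key, owner);
        }

        boolean isLocked(Object key) {
            return owners.containsKey(key);
        }
    }

    /** Hypothetical stand-in for the interceptor chain; may throw. */
    interface Chain {
        Object invoke(Object key) throws Exception;
    }

    /** Acquires the lock before the chain runs and releases it on any outcome. */
    static Object handle(OwnerLocks locks, Object key, Object owner, Chain chain) throws Exception {
        if (!locks.tryLock(key, owner)) {
            throw new IllegalStateException("Key already locked: " + key);
        }
        try {
            return chain.invoke(key);
        } finally {
            // Runs for availability errors, topology changes, and programming errors alike.
            locks.unlock(key, owner);
        }
    }

    public static void main(String[] args) {
        OwnerLocks locks = new OwnerLocks();
        try {
            // Simulates a write rejected before any locking interceptor is reached.
            handle(locks, "k", "cmd-1", key -> {
                throw new IllegalStateException("degraded mode");
            });
        } catch (Exception expected) {
            // The write fails, as in the report.
        }
        System.out.println("locked after failure: " + locks.isLocked("k")); // prints "locked after failure: false"
    }
}
```

The release sits in the handler that acquired the lock, so it does not matter where in the chain the exception originates. Whether the real handler should release unconditionally or only on failure is a design choice this sketch does not settle.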

