[jbosscache-dev] LockParentForChildInsertRemove and PessimisticLocking

Manik Surtani manik at jboss.org
Wed Aug 5 12:39:03 EDT 2009


On 5 Aug 2009, at 17:22, Brian Stansberry wrote:

> Another detail that may or may not be relevant:
>
> Note that the failed lock attempt msg is for a "upgrade lock" not a  
> "write lock". That indicates IdentityLock already had the caller  
> thread listed as a RL owner on the node. Not sure why the thread  
> would have a RL on the node.

Any of this running within a tx?

>
> Brian Stansberry wrote:
>> Sorry, nope. :(
>> I might (30% chance) be able to create one later today.
>> Manik Surtani wrote:
>>> You wouldn't have this encapsulated as a unit test, would you?
>>>
>>> On 5 Aug 2009, at 14:42, Brian Stansberry wrote:
>>>
>>>> Paul and I have been poking through code trying to diagnose the  
>>>> issue
>>>> that led to this thread for quite a while yesterday. Time to  
>>>> throw it
>>>> out there and see if anyone hopefully spots something:
>>>>
>>>> This is web session replication with buddy replication. The
>>>> "LockParentForChildInsertRemove" property isn't configured on the  
>>>> cache,
>>>> so the default "false" should be in effect. A thread trying to  
>>>> execute a
>>>> remote DataGravitationCleanupCommand on a session node fails  
>>>> upgrading
>>>> to a write lock on fqn /BUDDY_MANAGER.[1] We can't understand why  
>>>> it
>>>> would want a WL.
>>>>
>>>> Looking at the PessimisticNodeBasedLockManager code, the only  
>>>> reasons I
>>>> see why a WL would be requested here are:
>>>>
>>>> 1) The invocation created the previously non-existent / 
>>>> BUDDY_MANAGER
>>>> node in order to execute the remove of the
>>>> great-great-great-great-grandchild. Highly unlikely in general and
>>>> logging shows another thread successfully doing a put on a  
>>>> different
>>>> great-great-great-great-grandchild a millisecond earlier. (Also  
>>>> other
>>>> test runs w/ less verbose logging showed failures getting a WL on  
>>>> fqn /
>>>> which of course exists).
>>>>
>>>> 2) In call to PNBLM.writeLockNeeded(...)
>>>> currentNode.isLockForChildInsertRemove() is returning "true". Which
>>>> AFAICT it shouldn't be.
>>>>
>>>> [1] The logging from the thread:
>>>>
>>>> [JBoss] 20:29:27,488 TRACE
>>>> [org.jboss.cache.marshall.CommandAwareRpcDispatcher]
>>>> (Incoming-5,10.34.32.154:46064) Executing command:
>>>> DataGravitationCleanupCommand{fqn=/JSESSION/st_localhost/ 
>>>> wUaU4FhUzWcuiDEjkJzDiQ__,
>>>> backup=/_BUDDY_BACKUP_/10.34.32.155_41754:DEAD/1/JSESSION/ 
>>>> st_localhost/wUaU4FhUzWcuiDEjkJzDiQ__}
>>>> [sender=10.34.32.153:24595]
>>>> [JBoss] 20:29:27,489 TRACE
>>>> [org.jboss.cache.marshall.CommandAwareRpcDispatcher]
>>>> (Incoming-5,10.34.32.154:46064) This is a non-visitable command -  
>>>> so
>>>> performing directly and not via the invoker.
>>>> [JBoss] 20:29:42,497 ERROR
>>>> [org.jboss.cache.lock.ReadWriteLockWithUpgrade]
>>>> (Incoming-5,10.34.32.154:46064) upgradeLock(): failed
>>>> [JBoss] 20:29:42,500 TRACE
>>>> [org.jboss.cache.marshall.CommandAwareRpcDispatcher]
>>>> (Incoming-5,10.34.32.154:46064) Problems invoking command.
>>>> [JBoss] org.jboss.cache.lock.UpgradeException: upgrade lock for
>>>> /_BUDDY_BACKUP_ could not be acquired after 15000 ms. Lock map  
>>>> ownership
>>>> Read lock owners: [GlobalTransaction:<10.34.32.153:24595>:70,
>>>> GlobalTransaction:<10.34.32.153:24595>:69]
>>>> [JBoss] Write lock owner: null
>>>> [JBoss]  (caller=Thread[Incoming-5,10.34.32.154:46064,5,Thread  
>>>> Pools],
>>>> lock info: read owners=[GlobalTransaction:<10.34.32.153:24595>: 
>>>> 70,
>>>> GlobalTransaction:<10.34.32.153:24595>:69] (activeReaders=2,
>>>> activeWriter=null, waitingReaders=1, waitingWriters=0,  
>>>> waitingUpgrader=0))
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache.lock.IdentityLock.acquireWriteLock0(IdentityLock.java:227)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss.cache.lock.IdentityLock.acquireWriteLock(IdentityLock.java: 
>>>> 176)
>>>> [JBoss]     at  
>>>> org.jboss.cache.lock.IdentityLock.acquire(IdentityLock.java:503)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .lock 
>>>> .PessimisticNodeBasedLockManager 
>>>> .acquireNodeLock(PessimisticNodeBasedLockManager.java:260)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .lock 
>>>> .PessimisticNodeBasedLockManager 
>>>> .lock(PessimisticNodeBasedLockManager.java:169)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .lock 
>>>> .PessimisticNodeBasedLockManager 
>>>> .lockPessimistically(PessimisticNodeBasedLockManager.java:97)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .PessimisticLockInterceptor 
>>>> .handleRemoveNodeCommand(PessimisticLockInterceptor.java:265)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base 
>>>> .PrePostProcessingCommandInterceptor 
>>>> .visitRemoveNodeCommand(PrePostProcessingCommandInterceptor.java: 
>>>> 126)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base 
>>>> .CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java: 
>>>> 116)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .ReplicationInterceptor 
>>>> .handleCrudMethod(ReplicationInterceptor.java:148)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .ReplicationInterceptor 
>>>> .visitRemoveNodeCommand(ReplicationInterceptor.java:125)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base 
>>>> .CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java: 
>>>> 116)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base.CommandInterceptor.handleDefault(CommandInterceptor.java:131)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base 
>>>> .CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java: 
>>>> 116)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .TxInterceptor.attachGtxAndPassUpChain(TxInterceptor.java:301)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors.TxInterceptor.handleDefault(TxInterceptor.java:283)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base 
>>>> .CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java: 
>>>> 116)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base.CommandInterceptor.handleDefault(CommandInterceptor.java:131)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .base 
>>>> .CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java: 
>>>> 116)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .InvocationContextInterceptor 
>>>> .handleAll(InvocationContextInterceptor.java:178)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .interceptors 
>>>> .InvocationContextInterceptor 
>>>> .visitRemoveNodeCommand(InvocationContextInterceptor.java:88)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache.interceptors.InterceptorChain.invoke(InterceptorChain.java: 
>>>> 287)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .remote 
>>>> .DataGravitationCleanupCommand 
>>>> .executeRemove(DataGravitationCleanupCommand.java:151)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .commands 
>>>> .remote 
>>>> .DataGravitationCleanupCommand 
>>>> .perform(DataGravitationCleanupCommand.java:105)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .marshall 
>>>> .CommandAwareRpcDispatcher 
>>>> .executeCommand(CommandAwareRpcDispatcher.java:305)
>>>> [JBoss]     at
>>>> org 
>>>> .jboss 
>>>> .cache 
>>>> .marshall 
>>>> .CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java: 
>>>> 242)
>>>> [JBoss]     at
>>>> org 
>>>> .jgroups 
>>>> .blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:637)
>>>> [JBoss]     at
>>>> org 
>>>> .jgroups 
>>>> .blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java: 
>>>> 545)
>>>> [JBoss]     at
>>>> org 
>>>> .jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java: 
>>>> 368)
>>>> [JBoss]     at
>>>> org.jgroups.blocks.MessageDispatcher 
>>>> $ProtocolAdapter.up(MessageDispatcher.java:775)
>>>> [JBoss]     at org.jgroups.JChannel.up(JChannel.java:1339)
>>>> [JBoss]     at  
>>>> org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:462)
>>>> [JBoss]     at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java: 
>>>> 432)
>>>> [JBoss]     at
>>>> org 
>>>> .jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java: 
>>>> 144)
>>>> [JBoss]     at org.jgroups.protocols.FRAG2.up(FRAG2.java:188)
>>>> [JBoss]     at org.jgroups.protocols.FC.up(FC.java:473)
>>>> [JBoss]     at org.jgroups.protocols.pbcast.GMS.up(GMS.java:824)
>>>> [JBoss]     at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java: 
>>>> 233)
>>>> [JBoss]     at
>>>> org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:616)
>>>> [JBoss]     at org.jgroups.protocols.UNICAST.up(UNICAST.java:282)
>>>> [JBoss]     at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java: 
>>>> 758)
>>>> [JBoss]     at  
>>>> org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
>>>> [JBoss]     at org.jgroups.protocols.FD.up(FD.java:284)
>>>> [JBoss]     at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
>>>> [JBoss]     at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
>>>> [JBoss]     at org.jgroups.protocols.Discovery.up(Discovery.java: 
>>>> 264)
>>>> [JBoss]     at org.jgroups.protocols.PING.up(PING.java:273)
>>>> [JBoss]     at org.jgroups.protocols.TP 
>>>> $ProtocolAdapter.up(TP.java:2327)
>>>> [JBoss]     at org.jgroups.protocols.TP.passMessageUp(TP.java:1261)
>>>> [JBoss]     at org.jgroups.protocols.TP.access$100(TP.java:49)
>>>> [JBoss]     at
>>>> org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java: 
>>>> 1838)
>>>> [JBoss]     at org.jgroups.protocols.TP 
>>>> $IncomingPacket.run(TP.java:1817)
>>>> [JBoss]     at
>>>> java.util.concurrent.ThreadPoolExecutor 
>>>> $Worker.runTask(ThreadPoolExecutor.java:886)
>>>> [JBoss]     at
>>>> java.util.concurrent.ThreadPoolExecutor 
>>>> $Worker.run(ThreadPoolExecutor.java:908)
>>>> [JBoss]     at java.lang.Thread.run(Thread.java:619)
>>>>
>>>> Manik Surtani wrote:
>>>>> On 4 Aug 2009, at 21:28, Brian Stansberry wrote:
>>>>>
>>>>>> Continuing dialogue with myself. :)
>>>>>>
>>>>>> I've found the logic in PessimisticNodeBasedLockManager where the
>>>>>> LockParentForChildInsertRemove configuration is meant to be  
>>>>>> coming
>>>>>> through via the Node.
>>>>> Yes, this is because you can override this on a per-node basis.
>>>>>
>>>>>> In our testing we're seeing cases where write
>>>>>> locks are being acquired on parent nodes, which isn't the  
>>>>>> intent. I'll
>>>>>> dig further.
>>>>>>
>>>>>> Brian Stansberry wrote:
>>>>>>> https://jira.jboss.org/jira/browse/JBCACHE-1527
>>>>>>>
>>>>>>> Brian Stansberry wrote:
>>>>>>>> From looking at the JBC 3 code, it seems the
>>>>>>>> LockParentForChildInsertRemove configuration is no longer  
>>>>>>>> respected for
>>>>>>>> pessimistic locking.  I can't trace any path from the  
>>>>>>>> property in
>>>>>>>> Configuration to code that uses it.
>>>>>>>> PessimisticLockInterceptor.handlePutCommand,  
>>>>>>>> handleMoveCommand and
>>>>>>>> handleRemoveNodeCommand all always tell the lock manager to  
>>>>>>>> lock
>>>>>>>> parents. handleEvictFqnCommand always tells the lock manager  
>>>>>>>> not to
>>>>>>>> lock
>>>>>>>> parents.
>>>>>>>>
>>>>>>>> This is causing failures in buddy replication testing when  
>>>>>>>> nodes
>>>>>>>> join/leave clusters under load. There's a lot of data  
>>>>>>>> gravitation plus
>>>>>>>> stuff like migrating defunct backup trees to "DEAD" regions.  
>>>>>>>> Too much
>>>>>>>> contention for parent level locks.
>>>>>>>>
>>>>>>>> Plus locking on the structural parent to add/remove session  
>>>>>>>> nodes will
>>>>>>>> suck for the session caching use case.
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Brian Stansberry
>>>>>> Lead, AS Clustering
>>>>>> JBoss by Red Hat
>>>>>> _______________________________________________
>>>>>> jbosscache-dev mailing list
>>>>>> jbosscache-dev at lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>>>> -- 
>>>>> Manik Surtani
>>>>> manik at jboss.org
>>>>> Lead, Infinispan
>>>>> Lead, JBoss Cache
>>>>> http://www.infinispan.org
>>>>> http://www.jbosscache.org
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> -- 
>>>> Brian Stansberry
>>>> Lead, AS Clustering
>>>> JBoss by Red Hat
>>>> _______________________________________________
>>>> jbosscache-dev mailing list
>>>> jbosscache-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>> -- 
>>> Manik Surtani
>>> manik at jboss.org
>>> Lead, Infinispan
>>> Lead, JBoss Cache
>>> http://www.infinispan.org
>>> http://www.jbosscache.org
>>>
>>>
>>>
>>>
>
>
> -- 
> Brian Stansberry
> Lead, AS Clustering
> JBoss by Red Hat

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org






More information about the jbosscache-dev mailing list