[jbosscache-dev] LockParentForChildInsertRemove and PessimisticLocking

Brian Stansberry brian.stansberry at redhat.com
Wed Aug 5 12:54:00 EDT 2009


Manik Surtani wrote:
> 
> On 5 Aug 2009, at 17:22, Brian Stansberry wrote:
> 
>> Another detail that may or may not be relevant:
>>
>> Note that the failed lock attempt msg is for a "upgrade lock" not a 
>> "write lock". That indicates IdentityLock already had the caller 
>> thread listed as a RL owner on the node. Not sure why the thread would 
>> have a RL on the node.
> 
> Any of this running within a tx?
> 

Good question. Most requests involve a GTX; I'll need to figure out why 
this one doesn't.

>>
>> Brian Stansberry wrote:
>>> Sorry, nope. :(
>>> I might (30% chance) be able to create one later today.
>>> Manik Surtani wrote:
>>>> You wouldn't have this encapsulated as a unit test, would you?
>>>>
>>>> On 5 Aug 2009, at 14:42, Brian Stansberry wrote:
>>>>
>>>>> Paul and I have been poking through code trying to diagnose the issue
>>>>> that led to this thread for quite a while yesterday. Time to throw it
>>>>> out there and see if anyone hopefully spots something:
>>>>>
>>>>> This is web session replication with buddy replication. The
>>>>> "LockParentForChildInsertRemove" property isn't configured on the 
>>>>> cache,
>>>>> so the default "false" should be in effect. A thread trying to 
>>>>> execute a
>>>>> remote DataGravitationCleanupCommand on a session node fails upgrading
>>>>> to a write lock on fqn /BUDDY_MANAGER.[1] We can't understand why it
>>>>> would want a WL.
>>>>>
>>>>> Looking at the PessimisticNodeBasedLockManager code, the only 
>>>>> reasons I
>>>>> see why a WL would be requested here are:
>>>>>
>>>>> 1) The invocation created the previously non-existent /BUDDY_MANAGER
>>>>> node in order to execute the remove of the
>>>>> great-great-great-great-grandchild. Highly unlikely in general and
>>>>> logging shows another thread successfully doing a put on a different
>>>>> great-great-great-great-grandchild a millisecond earlier. (Also other
>>>>> test runs w/ less verbose logging showed failures getting a WL on 
>>>>> fqn /
>>>>> which of course exists).
>>>>>
>>>>> 2) In call to PNBLM.writeLockNeeded(...)
>>>>> currentNode.isLockForChildInsertRemove() is returning "true". Which
>>>>> AFAICT it shouldn't be.
>>>>>
>>>>> [1] The logging from the thread:
>>>>>
>>>>> [JBoss] 20:29:27,488 TRACE
>>>>> [org.jboss.cache.marshall.CommandAwareRpcDispatcher]
>>>>> (Incoming-5,10.34.32.154:46064) Executing command:
>>>>> DataGravitationCleanupCommand{fqn=/JSESSION/st_localhost/wUaU4FhUzWcuiDEjkJzDiQ__, 
>>>>>
>>>>> backup=/_BUDDY_BACKUP_/10.34.32.155_41754:DEAD/1/JSESSION/st_localhost/wUaU4FhUzWcuiDEjkJzDiQ__} 
>>>>>
>>>>> [sender=10.34.32.153:24595]
>>>>> [JBoss] 20:29:27,489 TRACE
>>>>> [org.jboss.cache.marshall.CommandAwareRpcDispatcher]
>>>>> (Incoming-5,10.34.32.154:46064) This is a non-visitable command - so
>>>>> performing directly and not via the invoker.
>>>>> [JBoss] 20:29:42,497 ERROR
>>>>> [org.jboss.cache.lock.ReadWriteLockWithUpgrade]
>>>>> (Incoming-5,10.34.32.154:46064) upgradeLock(): failed
>>>>> [JBoss] 20:29:42,500 TRACE
>>>>> [org.jboss.cache.marshall.CommandAwareRpcDispatcher]
>>>>> (Incoming-5,10.34.32.154:46064) Problems invoking command.
>>>>> [JBoss] org.jboss.cache.lock.UpgradeException: upgrade lock for
>>>>> /_BUDDY_BACKUP_ could not be acquired after 15000 ms. Lock map 
>>>>> ownership
>>>>> Read lock owners: [GlobalTransaction:<10.34.32.153:24595>:70,
>>>>> GlobalTransaction:<10.34.32.153:24595>:69]
>>>>> [JBoss] Write lock owner: null
>>>>> [JBoss]  (caller=Thread[Incoming-5,10.34.32.154:46064,5,Thread Pools],
>>>>> lock info: read owners=[GlobalTransaction:<10.34.32.153:24595>:70,
>>>>> GlobalTransaction:<10.34.32.153:24595>:69] (activeReaders=2,
>>>>> activeWriter=null, waitingReaders=1, waitingWriters=0, 
>>>>> waitingUpgrader=0))
>>>>> [JBoss]     at
>>>>> org.jboss.cache.lock.IdentityLock.acquireWriteLock0(IdentityLock.java:227) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.lock.IdentityLock.acquireWriteLock(IdentityLock.java:176) 
>>>>>
>>>>> [JBoss]     at 
>>>>> org.jboss.cache.lock.IdentityLock.acquire(IdentityLock.java:503)
>>>>> [JBoss]     at
>>>>> org.jboss.cache.lock.PessimisticNodeBasedLockManager.acquireNodeLock(PessimisticNodeBasedLockManager.java:260) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.lock.PessimisticNodeBasedLockManager.lock(PessimisticNodeBasedLockManager.java:169) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.lock.PessimisticNodeBasedLockManager.lockPessimistically(PessimisticNodeBasedLockManager.java:97) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.PessimisticLockInterceptor.handleRemoveNodeCommand(PessimisticLockInterceptor.java:265) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.PrePostProcessingCommandInterceptor.visitRemoveNodeCommand(PrePostProcessingCommandInterceptor.java:126) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.ReplicationInterceptor.handleCrudMethod(ReplicationInterceptor.java:148) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.ReplicationInterceptor.visitRemoveNodeCommand(ReplicationInterceptor.java:125) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:131) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.TxInterceptor.attachGtxAndPassUpChain(TxInterceptor.java:301) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.TxInterceptor.handleDefault(TxInterceptor.java:283) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:131) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:178) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.InvocationContextInterceptor.visitRemoveNodeCommand(InvocationContextInterceptor.java:88) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.interceptors.InterceptorChain.invoke(InterceptorChain.java:287) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.remote.DataGravitationCleanupCommand.executeRemove(DataGravitationCleanupCommand.java:151) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.commands.remote.DataGravitationCleanupCommand.perform(DataGravitationCleanupCommand.java:105) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.marshall.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:305) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jboss.cache.marshall.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:242) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:637) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:545) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:368) 
>>>>>
>>>>> [JBoss]     at
>>>>> org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:775) 
>>>>>
>>>>> [JBoss]     at org.jgroups.JChannel.up(JChannel.java:1339)
>>>>> [JBoss]     at 
>>>>> org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:462)
>>>>> [JBoss]     at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:432)
>>>>> [JBoss]     at
>>>>> org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:144) 
>>>>>
>>>>> [JBoss]     at org.jgroups.protocols.FRAG2.up(FRAG2.java:188)
>>>>> [JBoss]     at org.jgroups.protocols.FC.up(FC.java:473)
>>>>> [JBoss]     at org.jgroups.protocols.pbcast.GMS.up(GMS.java:824)
>>>>> [JBoss]     at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:233)
>>>>> [JBoss]     at
>>>>> org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:616)
>>>>> [JBoss]     at org.jgroups.protocols.UNICAST.up(UNICAST.java:282)
>>>>> [JBoss]     at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:758)
>>>>> [JBoss]     at 
>>>>> org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
>>>>> [JBoss]     at org.jgroups.protocols.FD.up(FD.java:284)
>>>>> [JBoss]     at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
>>>>> [JBoss]     at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
>>>>> [JBoss]     at org.jgroups.protocols.Discovery.up(Discovery.java:264)
>>>>> [JBoss]     at org.jgroups.protocols.PING.up(PING.java:273)
>>>>> [JBoss]     at 
>>>>> org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2327)
>>>>> [JBoss]     at org.jgroups.protocols.TP.passMessageUp(TP.java:1261)
>>>>> [JBoss]     at org.jgroups.protocols.TP.access$100(TP.java:49)
>>>>> [JBoss]     at
>>>>> org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1838)
>>>>> [JBoss]     at 
>>>>> org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1817)
>>>>> [JBoss]     at
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) 
>>>>>
>>>>> [JBoss]     at
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) 
>>>>>
>>>>> [JBoss]     at java.lang.Thread.run(Thread.java:619)
>>>>>
>>>>> Manik Surtani wrote:
>>>>>> On 4 Aug 2009, at 21:28, Brian Stansberry wrote:
>>>>>>
>>>>>>> Continuing dialogue with myself. :)
>>>>>>>
>>>>>>> I've found the logic in PessimisticNodeBasedLockManager where the
>>>>>>> LockParentForChildInsertRemove configuration is meant to be coming
>>>>>>> through via the Node.
>>>>>> Yes, this is because you can override this on a per-node basis.
>>>>>>
>>>>>>> In our testing we're seeing cases where write
>>>>>>> locks are being acquired on parent nodes, which isn't the intent. 
>>>>>>> I'll
>>>>>>> dig further.
>>>>>>>
>>>>>>> Brian Stansberry wrote:
>>>>>>>> https://jira.jboss.org/jira/browse/JBCACHE-1527
>>>>>>>>
>>>>>>>> Brian Stansberry wrote:
>>>>>>>>> From looking at the JBC 3 code, it seems the
>>>>>>>>> LockParentForChildInsertRemove configuration is no longer 
>>>>>>>>> respected for
>>>>>>>>> pessimistic locking.  I can't trace any path from the property in
>>>>>>>>> Configuration to code that uses it.
>>>>>>>>> PessimisticLockInterceptor.handlePutCommand, handleMoveCommand and
>>>>>>>>> handleRemoveNodeCommand all always tell the lock manager to lock
>>>>>>>>> parents. handleEvictFqnCommand always tells the lock manager 
>>>>>>>>> not to
>>>>>>>>> lock
>>>>>>>>> parents.
>>>>>>>>>
>>>>>>>>> This is causing failures in buddy replication testing when nodes
>>>>>>>>> join/leave clusters under load. There's a lot of data 
>>>>>>>>> gravitation plus
>>>>>>>>> stuff like migrating defunct backup trees to "DEAD" regions. 
>>>>>>>>> Too much
>>>>>>>>> contention for parent level locks.
>>>>>>>>>
>>>>>>>>> Plus locking on the structural parent to add/remove session 
>>>>>>>>> nodes will
>>>>>>>>> suck for the session caching use case.
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Brian Stansberry
>>>>>>> Lead, AS Clustering
>>>>>>> JBoss by Red Hat
>>>>>>> _______________________________________________
>>>>>>> jbosscache-dev mailing list
>>>>>>> jbosscache-dev at lists.jboss.org
>>>>>>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>>>>> -- 
>>>>>> Manik Surtani
>>>>>> manik at jboss.org
>>>>>> Lead, Infinispan
>>>>>> Lead, JBoss Cache
>>>>>> http://www.infinispan.org
>>>>>> http://www.jbosscache.org
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> -- 
>>>>> Brian Stansberry
>>>>> Lead, AS Clustering
>>>>> JBoss by Red Hat
>>>>> _______________________________________________
>>>>> jbosscache-dev mailing list
>>>>> jbosscache-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>>>> -- 
>>>> Manik Surtani
>>>> manik at jboss.org
>>>> Lead, Infinispan
>>>> Lead, JBoss Cache
>>>> http://www.infinispan.org
>>>> http://www.jbosscache.org
>>>>
>>>>
>>>>
>>>>
>>
>>
>> -- 
>> Brian Stansberry
>> Lead, AS Clustering
>> JBoss by Red Hat
> 
> -- 
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
> 
> 
> 
> 


-- 
Brian Stansberry
Lead, AS Clustering
JBoss by Red Hat


More information about the jbosscache-dev mailing list