You wouldn't have this encapsulated as a unit test, would you?
On 5 Aug 2009, at 14:42, Brian Stansberry wrote:
Paul and I have been poking through code trying to diagnose the
issue
that led to this thread for quite a while yesterday. Time to throw it
out there and see if anyone hopefully spots something:
This is web session replication with buddy replication. The
"LockParentForChildInsertRemove" property isn't configured on the
cache,
so the default "false" should be in effect. A thread trying to
execute a
remote DataGravitationCleanupCommand on a session node fails upgrading
to a write lock on fqn /BUDDY_MANAGER.[1] We can't understand why it
would want a WL.
Looking at the PessimisticNodeBasedLockManager code, the only
reasons I
see why a WL would be requested here are:
1) The invocation created the previously non-existent /BUDDY_MANAGER
node in order to execute the remove of the
great-great-great-great-grandchild. Highly unlikely in general and
logging shows another thread successfully doing a put on a different
great-great-great-great-grandchild a millisecond earlier. (Also other
test runs w/ less verbose logging showed failures getting a WL on
fqn /
which of course exists).
2) In call to PNBLM.writeLockNeeded(...)
currentNode.isLockForChildInsertRemove() is returning "true". Which
AFAICT it shouldn't be.
[1] The logging from the thread:
[JBoss] 20:29:27,488 TRACE
[org.jboss.cache.marshall.CommandAwareRpcDispatcher]
(Incoming-5,10.34.32.154:46064) Executing command:
DataGravitationCleanupCommand{fqn=/JSESSION/st_localhost/
wUaU4FhUzWcuiDEjkJzDiQ__,
backup=/_BUDDY_BACKUP_/10.34.32.155_41754:DEAD/1/JSESSION/
st_localhost/wUaU4FhUzWcuiDEjkJzDiQ__}
[sender=10.34.32.153:24595]
[JBoss] 20:29:27,489 TRACE
[org.jboss.cache.marshall.CommandAwareRpcDispatcher]
(Incoming-5,10.34.32.154:46064) This is a non-visitable command - so
performing directly and not via the invoker.
[JBoss] 20:29:42,497 ERROR
[org.jboss.cache.lock.ReadWriteLockWithUpgrade]
(Incoming-5,10.34.32.154:46064) upgradeLock(): failed
[JBoss] 20:29:42,500 TRACE
[org.jboss.cache.marshall.CommandAwareRpcDispatcher]
(Incoming-5,10.34.32.154:46064) Problems invoking command.
[JBoss] org.jboss.cache.lock.UpgradeException: upgrade lock for
/_BUDDY_BACKUP_ could not be acquired after 15000 ms. Lock map
ownership
Read lock owners: [GlobalTransaction:<10.34.32.153:24595>:70,
GlobalTransaction:<10.34.32.153:24595>:69]
[JBoss] Write lock owner: null
[JBoss] (caller=Thread[Incoming-5,10.34.32.154:46064,5,Thread Pools],
lock info: read owners=[GlobalTransaction:<10.34.32.153:24595>:70,
GlobalTransaction:<10.34.32.153:24595>:69] (activeReaders=2,
activeWriter=null, waitingReaders=1, waitingWriters=0,
waitingUpgrader=0))
[JBoss] at
org
.jboss.cache.lock.IdentityLock.acquireWriteLock0(IdentityLock.java:
227)
[JBoss] at
org.jboss.cache.lock.IdentityLock.acquireWriteLock(IdentityLock.java:
176)
[JBoss] at
org.jboss.cache.lock.IdentityLock.acquire(IdentityLock.java:503)
[JBoss] at
org
.jboss
.cache
.lock
.PessimisticNodeBasedLockManager
.acquireNodeLock(PessimisticNodeBasedLockManager.java:260)
[JBoss] at
org
.jboss
.cache
.lock
.PessimisticNodeBasedLockManager
.lock(PessimisticNodeBasedLockManager.java:169)
[JBoss] at
org
.jboss
.cache
.lock
.PessimisticNodeBasedLockManager
.lockPessimistically(PessimisticNodeBasedLockManager.java:97)
[JBoss] at
org
.jboss
.cache
.interceptors
.PessimisticLockInterceptor
.handleRemoveNodeCommand(PessimisticLockInterceptor.java:265)
[JBoss] at
org
.jboss
.cache
.interceptors
.base
.PrePostProcessingCommandInterceptor
.visitRemoveNodeCommand(PrePostProcessingCommandInterceptor.java:126)
[JBoss] at
org
.jboss
.cache
.commands
.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
[JBoss] at
org
.jboss
.cache
.interceptors
.base
.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
[JBoss] at
org
.jboss
.cache
.interceptors
.ReplicationInterceptor.handleCrudMethod(ReplicationInterceptor.java:
148)
[JBoss] at
org
.jboss
.cache
.interceptors
.ReplicationInterceptor
.visitRemoveNodeCommand(ReplicationInterceptor.java:125)
[JBoss] at
org
.jboss
.cache
.commands
.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
[JBoss] at
org
.jboss
.cache
.interceptors
.base
.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
[JBoss] at
org
.jboss
.cache
.interceptors
.base.CommandInterceptor.handleDefault(CommandInterceptor.java:131)
[JBoss] at
org
.jboss
.cache
.commands
.AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75)
[JBoss] at
org
.jboss
.cache
.commands
.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
[JBoss] at
org
.jboss
.cache
.interceptors
.base
.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
[JBoss] at
org
.jboss
.cache
.interceptors
.TxInterceptor.attachGtxAndPassUpChain(TxInterceptor.java:301)
[JBoss] at
org
.jboss
.cache.interceptors.TxInterceptor.handleDefault(TxInterceptor.java:
283)
[JBoss] at
org
.jboss
.cache
.commands
.AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75)
[JBoss] at
org
.jboss
.cache
.commands
.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
[JBoss] at
org
.jboss
.cache
.interceptors
.base
.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
[JBoss] at
org
.jboss
.cache
.interceptors
.base.CommandInterceptor.handleDefault(CommandInterceptor.java:131)
[JBoss] at
org
.jboss
.cache
.commands
.AbstractVisitor.visitRemoveNodeCommand(AbstractVisitor.java:75)
[JBoss] at
org
.jboss
.cache
.commands
.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
[JBoss] at
org
.jboss
.cache
.interceptors
.base
.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
[JBoss] at
org
.jboss
.cache
.interceptors
.InvocationContextInterceptor
.handleAll(InvocationContextInterceptor.java:178)
[JBoss] at
org
.jboss
.cache
.interceptors
.InvocationContextInterceptor
.visitRemoveNodeCommand(InvocationContextInterceptor.java:88)
[JBoss] at
org
.jboss
.cache
.commands
.write.RemoveNodeCommand.acceptVisitor(RemoveNodeCommand.java:125)
[JBoss] at
org
.jboss
.cache.interceptors.InterceptorChain.invoke(InterceptorChain.java:287)
[JBoss] at
org
.jboss
.cache
.commands
.remote
.DataGravitationCleanupCommand
.executeRemove(DataGravitationCleanupCommand.java:151)
[JBoss] at
org
.jboss
.cache
.commands
.remote
.DataGravitationCleanupCommand
.perform(DataGravitationCleanupCommand.java:105)
[JBoss] at
org
.jboss
.cache
.marshall
.CommandAwareRpcDispatcher
.executeCommand(CommandAwareRpcDispatcher.java:305)
[JBoss] at
org
.jboss
.cache
.marshall
.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:242)
[JBoss] at
org
.jgroups
.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:637)
[JBoss] at
org
.jgroups
.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:545)
[JBoss] at
org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:
368)
[JBoss] at
org.jgroups.blocks.MessageDispatcher
$ProtocolAdapter.up(MessageDispatcher.java:775)
[JBoss] at org.jgroups.JChannel.up(JChannel.java:1339)
[JBoss] at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:462)
[JBoss] at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:432)
[JBoss] at
org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:
144)
[JBoss] at org.jgroups.protocols.FRAG2.up(FRAG2.java:188)
[JBoss] at org.jgroups.protocols.FC.up(FC.java:473)
[JBoss] at org.jgroups.protocols.pbcast.GMS.up(GMS.java:824)
[JBoss] at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:233)
[JBoss] at
org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:616)
[JBoss] at org.jgroups.protocols.UNICAST.up(UNICAST.java:282)
[JBoss] at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:758)
[JBoss] at
org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
[JBoss] at org.jgroups.protocols.FD.up(FD.java:284)
[JBoss] at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
[JBoss] at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
[JBoss] at org.jgroups.protocols.Discovery.up(Discovery.java:264)
[JBoss] at org.jgroups.protocols.PING.up(PING.java:273)
[JBoss] at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2327)
[JBoss] at org.jgroups.protocols.TP.passMessageUp(TP.java:1261)
[JBoss] at org.jgroups.protocols.TP.access$100(TP.java:49)
[JBoss] at
org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1838)
[JBoss] at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1817)
[JBoss] at
java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
[JBoss] at
java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
[JBoss] at java.lang.Thread.run(Thread.java:619)
Manik Surtani wrote:
>
> On 4 Aug 2009, at 21:28, Brian Stansberry wrote:
>
>> Continuing dialogue with myself. :)
>>
>> I've found the logic in PessimisticNodeBasedLockManager where the
>> LockParentForChildInsertRemove configuration is meant to be coming
>> through via the Node.
>
> Yes, this is because you can override this on a per-node basis.
>
>> In our testing we're seeing cases where write
>> locks are being acquired on parent nodes, which isn't the intent.
>> I'll
>> dig further.
>>
>> Brian Stansberry wrote:
>>>
https://jira.jboss.org/jira/browse/JBCACHE-1527
>>>
>>> Brian Stansberry wrote:
>>>> From looking at the JBC 3 code, it seems the
>>>> LockParentForChildInsertRemove configuration is no longer
>>>> respected for
>>>> pessimistic locking. I can't trace any path from the property in
>>>> Configuration to code that uses it.
>>>> PessimisticLockInterceptor.handlePutCommand, handleMoveCommand and
>>>> handleRemoveNodeCommand all always tell the lock manager to lock
>>>> parents. handleEvictFqnCommand always tells the lock manager not
>>>> to
>>>> lock
>>>> parents.
>>>>
>>>> This is causing failures in buddy replication testing when nodes
>>>> join/leave clusters under load. There's a lot of data
>>>> gravitation plus
>>>> stuff like migrating defunct backup trees to "DEAD" regions.
Too
>>>> much
>>>> contention for parent level locks.
>>>>
>>>> Plus locking on the structural parent to add/remove session
>>>> nodes will
>>>> suck for the session caching use case.
>>>>
>>>
>>>
>>
>>
>> --
>> Brian Stansberry
>> Lead, AS Clustering
>> JBoss by Red Hat
>> _______________________________________________
>> jbosscache-dev mailing list
>> jbosscache-dev(a)lists.jboss.org
>>
https://lists.jboss.org/mailman/listinfo/jbosscache-dev
>
> --
> Manik Surtani
> manik(a)jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
>
http://www.infinispan.org
>
http://www.jbosscache.org
>
>
>
>
--
Brian Stansberry
Lead, AS Clustering
JBoss by Red Hat
_______________________________________________
jbosscache-dev mailing list
jbosscache-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/jbosscache-dev