[jboss-jira] [JBoss JIRA] Commented: (JGRP-465) Deadlock in FC if RPC response blocks

Brian Stansberry (JIRA) jira-events at lists.jboss.org
Mon Apr 16 15:38:01 EDT 2007


    [ http://jira.jboss.com/jira/browse/JGRP-465?page=comments#action_12359478 ] 
            
Brian Stansberry commented on JGRP-465:
---------------------------------------

Bela, on our call we discussed using a ThreadLocal to flag the thread, and decided to maintain a Set of up threads instead to ensure J2ME support.

As I think about it more, in 2.4 there's at most one thread that carries messages through FC.up(), correct?  So we can just store a ref to that thread in FC, and in handleDownMessage() the check is just "if (up_thread == Thread.currentThread()) ...".
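
Roughly what I have in mind, as a sketch only. CreditGate is a simplified stand-in, not the real FC class; the ignoreUpThread flag and all method names here are made up for illustration, and it uses java.util.concurrent for brevity where 2.4 would use the oswego util.concurrent classes:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Simplified stand-in for FC's credit handling (hypothetical, not the real class). */
public class CreditGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition creditsAvailable = lock.newCondition();
    private final boolean ignoreUpThread; // hypothetical config flag for the workaround
    private long credits;
    private volatile Thread upThread;     // the single thread currently inside up()

    public CreditGate(long maxCredits, boolean ignoreUpThread) {
        this.credits = maxCredits;
        this.ignoreUpThread = ignoreUpThread;
    }

    /** Up path: remember the carrier thread while the message travels up the stack. */
    public void up(Runnable deliver) {
        Thread previous = upThread;
        upThread = Thread.currentThread();
        try {
            deliver.run();
        } finally {
            upThread = previous;
        }
    }

    /** Down path: returns true if the message was sent, false if we gave up waiting. */
    public boolean down(long length, long timeoutMs) {
        lock.lock();
        try {
            if (ignoreUpThread && upThread == Thread.currentThread()) {
                credits -= length; // let it through; credits may go negative
                return true;
            }
            long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (credits < length) {
                if (nanos <= 0) {
                    return false;
                }
                nanos = creditsAvailable.awaitNanos(nanos);
            }
            credits -= length;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    /** Called when a credit replenishment arrives. */
    public void replenish(long amount) {
        lock.lock();
        try {
            credits += amount;
            creditsAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public long credits() {
        lock.lock();
        try {
            return credits;
        } finally {
            lock.unlock();
        }
    }
}
```

With the flag on, a down() call made from inside up() never waits, so the up thread can't block on credits that only it could deliver. With the flag off, the same call waits as before.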

> Deadlock in FC if RPC response blocks
> -------------------------------------
>
>                 Key: JGRP-465
>                 URL: http://jira.jboss.com/jira/browse/JGRP-465
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.4.1 SP1
>            Reporter: Brian Stansberry
>         Assigned To: Brian Stansberry
>             Fix For: 2.4.1 SP2
>
>
> In 2.4.1.SP1 (and probably earlier) FC can deadlock if all up/down threads are set to false at and above FC and an incoming RPC loops back down the channel with its response.
> The following stack trace shows a deadlock situation (note the line numbers in FC are off from the cvs code; this occurred with a patched FC version, but the patch is not relevant to this error):
> "IncomingPacketHandler (channel=Tomcat-Cluster)" daemon prio=1 tid=0xc8a68b60 nid=0x1ece in Object.wait() [0xc94d1000..0xc94d1f30]
> 	at java.lang.Object.wait(Native Method)
> 	at EDU.oswego.cs.dl.util.concurrent.CondVar.timedwait(CondVar.java:222)
> 	- locked <0xd187e398> (a EDU.oswego.cs.dl.util.concurrent.CondVar)
> 	at org.jgroups.protocols.FC.handleDownMessage(FC.java:394)
> 	at org.jgroups.protocols.FC.down(FC.java:336)
> 	at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
> 	at org.jgroups.protocols.FC.receiveDownEvent(FC.java:330)
> 	at org.jgroups.stack.Protocol.passDown(Protocol.java:551)
> 	at org.jgroups.protocols.FRAG2.down(FRAG2.java:167)
> 	at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
> 	at org.jgroups.stack.Protocol.passDown(Protocol.java:551)
> 	at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:294)
> 	at org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
> 	at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:385)
> 	at org.jgroups.JChannel.down(JChannel.java:1231)
> 	at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:790)
> 	at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passDown(MessageDispatcher.java:767)
> 	at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:693)
> 	at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:544)
> 	at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:367)
> 	at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:777)
> 	at org.jgroups.JChannel.up(JChannel.java:1091)
> 	at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:377)
> 	at org.jgroups.stack.ProtocolStack.receiveUpEvent(ProtocolStack.java:393)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.pbcast.STATE_TRANSFER.up(STATE_TRANSFER.java:158)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.FRAG2.up(FRAG2.java:197)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.FC.up(FC.java:377)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.pbcast.GMS.up(GMS.java:768)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.protocols.pbcast.GMS.receiveUpEvent(GMS.java:788)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:260)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:476)
> 	- locked <0xd1eec6d8> (a org.jgroups.protocols.UNICAST$Entry)
> 	at org.jgroups.protocols.UNICAST.up(UNICAST.java:206)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:569)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:170)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.FD.up(FD.java:300)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:301)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.MERGE2.up(MERGE2.java:162)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.Discovery.up(Discovery.java:225)
> 	at org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
> 	at org.jgroups.stack.Protocol.passUp(Protocol.java:538)
> 	at org.jgroups.protocols.TP.handleIncomingMessage(TP.java:908)
> 	at org.jgroups.protocols.TP.handleIncomingPacket(TP.java:850)
> 	at org.jgroups.protocols.TP.access$400(TP.java:45)
> 	at org.jgroups.protocols.TP$IncomingPacketHandler.run(TP.java:1296)
> 	at java.lang.Thread.run(Thread.java:595)
> The thread carried an RPC up to RpcDispatcher and is waiting in FC.handleDownMessage() for credits to become available to send the RPC response.  Those credits will never arrive, as the thread that is blocking is the one that would need to deliver the credits.
> This is less of an issue in 2.5, because 2.5 uses the concurrent stack and credit replenishments are sent as OOB messages on a separate thread. (It would still be an issue in 2.5 if the concurrent stack were disabled via configuration.)
> A hacky solution is to flag the up thread and, in handleDownMessage(), check for the flag before blocking the thread. If the flag is set, don't block the thread -- just let it through, i.e. let it exceed max_credits.
> In cases like recent JBC releases, where the vast majority of RPC responses are lightweight "null" responses, this is pretty safe. We need to add a config flag to disable the workaround, though, for use in applications where RPC responses frequently return large amounts of data.
