[ http://jira.jboss.com/jira/browse/JGRP-532?page=comments#action_12365526 ]
Bela Ban commented on JGRP-532:
-------------------------------
So, the basic issue is this:
- We have 1 thread per node that receives the messages from that node (the
Connection.Receiver thread)
- When all threads in the pool are used up, the rejection policy of "run" makes that
receiver thread deliver the message itself
- If that's the case and the receiver thread blocks on NAKACK's NakReceiverWindow (NRW),
then it will not be able to pull further data off the socket. After the socket's
receive window fills up, the sender will block in write(). Alternatively, if
use_send_queues is true, the receiver won't take messages off the queue, so put() will
block (as the queue is bounded).
- How can the receiver thread block? If another receiver thread ahead of it acquires the
lock for the NakReceiverWindow in NAKACK, calls down() on the same thread while still
holding that lock, and happens to block on write()!
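
To illustrate the "run" rejection policy described above, here is a minimal sketch using the plain JDK ThreadPoolExecutor with CallerRunsPolicy. It is not JGroups code: the class name, the pool size of 1 and the SynchronousQueue are assumptions chosen only to exhaust the pool quickly. It shows that once the pool is full, the task runs on the submitting thread, which is the position the Connection.Receiver ends up in.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of JGroups.
public class CallerRunsDemo {

    /** Returns the name of the thread that executed the task submitted to a full pool. */
    public static String threadRunningOverflowTask() throws InterruptedException {
        // One worker and no queueing: a second task cannot be accepted while the first runs.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());

        CountDownLatch workerBusy = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<String> ranOn = new AtomicReference<>();
        try {
            // Occupy the only pool thread until we release it.
            pool.execute(() -> {
                workerBusy.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workerBusy.await();

            // The pool is exhausted, so CallerRunsPolicy runs this on the submitting thread.
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
            return ranOn.get();
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Name the submitting thread after the role it plays in the description above.
        Thread.currentThread().setName("Connection.Receiver");
        System.out.println("overflow task ran on: " + threadRunningOverflowTask());
    }
}
```

If the overflow task blocks, the submitting thread is blocked with it and stops reading from its socket, which is the precondition for the hang.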
Example:
- Threads 1, 2 and 3 have submitted messages to the thread pool
- Now the thread pool is exhausted, so "run" makes sure the Connection.Receiver
calls up() directly (on its own thread)
- The Connection.Receiver blocks until it can acquire the NRW lock, therefore it
doesn't pull data off the socket
- Threads 1-3 send data; at some point, the TCP write() might block, say for
Thread 3.
- Thread 3 now blocks on write() while holding the NRW lock
- The Connection.Receiver thread waits for the NRW lock, but meanwhile doesn't cause
the write() to unblock, as it doesn't pull data off the socket: ===> DEADLOCK!
SOLUTION: threads 1-3 need to unlock the NRW *before* they send messages down! This is
JIRA 535
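
The unlock-before-send idea can be sketched as follows. This is only an illustration under stated assumptions, not the actual NAKACK change: the class UnlockBeforeSend, its methods, and the use of a ReentrantLock and a list of strings in place of the NakReceiverWindow are all hypothetical. The point is that messages are removed while the lock is held, but passed down only after it has been released, so a blocking write() can no longer hold up threads waiting for the lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical stand-in for the receiver window; not JGroups code.
public class UnlockBeforeSend {
    private final ReentrantLock windowLock = new ReentrantLock(); // plays the role of the NRW lock
    private final List<String> window = new ArrayList<>();        // plays the role of buffered messages

    /** True if the calling thread currently holds the window lock. */
    public boolean lockHeldByCurrentThread() {
        return windowLock.isHeldByCurrentThread();
    }

    /** Adds a message to the window under the lock. */
    public void add(String msg) {
        windowLock.lock();
        try {
            window.add(msg);
        } finally {
            windowLock.unlock();
        }
    }

    /** Removes buffered messages under the lock, then passes them on with the lock released. */
    public void deliver(Consumer<String> down) {
        List<String> batch;
        windowLock.lock();
        try {
            batch = new ArrayList<>(window);
            window.clear();
        } finally {
            windowLock.unlock();
        }
        // The lock is no longer held here, so a blocking send cannot stall other threads on it.
        for (String msg : batch) {
            down.accept(msg);
        }
    }
}
```

This sketch does not address delivery order between threads that call deliver() concurrently; a real fix would have to consider that separately.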
TCP: hangs without FC
---------------------
Key: JGRP-532
URL: http://jira.jboss.com/jira/browse/JGRP-532
Project: JGroups
Issue Type: Bug
Reporter: Bela Ban
Assigned To: Bela Ban
Fix For: 2.5
Attachments: config.txt, tcp.xml, tmp.txt
Usually, FC is enabled, but when FC is not present, perf.Test with 2 senders and the
attached tcp.xml/config.txt hangs. Stack trace attached too.