[jboss-jira] [JBoss JIRA] (JGRP-1675) CreditRequest in FlowControl is not OOB
Bela Ban (JIRA)
jira-events at lists.jboss.org
Mon Sep 23 07:19:03 EDT 2013
[ https://issues.jboss.org/browse/JGRP-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806457#comment-12806457 ]
Bela Ban commented on JGRP-1675:
--------------------------------
No, I don't want to run the entire RadarGun test to find the needle in the haystack... :-)
Thinking about this logically: is the problem you see caused by retransmission requests not getting served because the OOB thread pool is exhausted? I can think of a few things to try out:
* Increasing OOB.max_threads: this could potentially create a lot of threads, but what's the current max, and have you tried setting it to a large value for your test?
* It seems the application issues a lot of OOB (RPC?) calls. Are these blocking? I thought we bypass flow control for blocking (sync) RPCs?
* We could send retransmission requests as INTERNAL, bypassing the OOB thread pool.
** This would of course require the internal thread pool to be enabled, but it is by default anyway...
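To illustrate why a separate pool helps, here is a minimal sketch of the exhaustion scenario. It uses plain java.util.concurrent rather than JGroups classes; the class name, the run() helper and the latch standing in for "credits" are all hypothetical and only model the idea of handling the unblocking message outside the exhausted pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionSketch {

    /**
     * Fills a fixed pool (standing in for the OOB pool) with workers that block
     * until "credits" arrive, then submits the task that delivers the credits
     * either to the same pool or to a separate (hypothetical "internal") pool.
     * Returns true if all blocked workers finished within timeoutMs.
     */
    static boolean run(int oobThreads, boolean useInternalPool, long timeoutMs) {
        ExecutorService oob = Executors.newFixedThreadPool(oobThreads);
        ExecutorService internal = Executors.newSingleThreadExecutor();
        CountDownLatch credits = new CountDownLatch(1);        // opened by the replenish task
        CountDownLatch done = new CountDownLatch(oobThreads);  // counts workers that got through
        try {
            for (int i = 0; i < oobThreads; i++) {
                oob.execute(() -> {
                    try {
                        credits.await();   // worker is stuck here, like a sender without credits
                        done.countDown();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The task that would unblock everyone. On the shared pool it is only
            // queued, because every thread is already blocked in credits.await().
            Runnable replenish = credits::countDown;
            (useInternalPool ? internal : oob).execute(replenish);
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            oob.shutdownNow();       // interrupts workers that are still blocked
            internal.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool finished:   " + run(4, false, 300));
        System.out.println("separate pool finished: " + run(4, true, 2000));
    }
}
```

In this model the shared-pool run should time out while the separate-pool run completes, which is the effect the INTERNAL suggestion is after. It says nothing about how JGroups sizes or schedules its real pools.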
Can you briefly describe what the RadarGun test does? Blocking RPCs with mode=REPL?
WDYT?
> CreditRequest in FlowControl is not OOB
> ---------------------------------------
>
> Key: JGRP-1675
> URL: https://issues.jboss.org/browse/JGRP-1675
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.4
> Reporter: Radim Vansa
> Assignee: Bela Ban
> Fix For: 3.4
>
>
> I have recently and repeatedly observed a situation in which many (or all) threads were stuck waiting for credits in the FlowControl protocol.
> The credit request was not handled on the other node because it is a non-OOB message and some of the messages sent before it (actually many of them, cause unknown) had been lost; the request was therefore waiting for them to be re-sent.
> However, these were not re-sent properly because the retransmission request was not received: all OOB threads were stuck in the FlowControl protocol, as each had handled some other request and tried to send a response, but the response could not be sent until FlowControl got the credits.
> The probability of such a situation could be lowered by tagging the credit request as OOB, so that it would be handled immediately. If the credit replenish message were then processed in the regular OOB pool, that pool could already be depleted by many requests, but setting up the internal thread pool would solve the problem.
> Another option would be to release a thread from FlowControl (let it send the message even without credits) if it has waited there for too long.
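The last idea in the description (stop waiting for credits after a while and send anyway) could be sketched roughly as follows. This is not the JGroups FlowControl implementation; the TimedCredits class, its method names and the byte-based accounting are assumptions made purely for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical credit counter with a bounded wait; not a JGroups class. */
public class TimedCredits {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition replenished = lock.newCondition();
    private long available;

    public TimedCredits(long initialCredits) {
        this.available = initialCredits;
    }

    /**
     * Tries to take 'bytes' credits, waiting at most maxWaitMs.
     * Returns true if the credits were taken, false if the wait timed out
     * (or was interrupted); in that case no credits are deducted and the
     * caller may decide to send the message anyway.
     */
    public boolean acquire(long bytes, long maxWaitMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMs);
        lock.lock();
        try {
            while (available < bytes) {
                if (nanos <= 0)
                    return false;                        // waited too long: release the thread
                try {
                    nanos = replenished.awaitNanos(nanos); // remaining wait time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            available -= bytes;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Adds credits and wakes up all waiting senders. */
    public void replenish(long bytes) {
        lock.lock();
        try {
            available += bytes;
            replenished.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public long available() {
        lock.lock();
        try {
            return available;
        } finally {
            lock.unlock();
        }
    }
}
```

A sender would call acquire(length, maxWait) and, on false, send without credits. Whether that is acceptable depends on how the receiver accounts for bytes it never granted credits for, which this sketch does not model.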