[jboss-jira] [JBoss JIRA] (JGRP-1659) deadlock in MFC with default configuration
Bela Ban (JIRA)
jira-events at lists.jboss.org
Thu Jul 18 11:39:26 EDT 2013
[ https://issues.jboss.org/browse/JGRP-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790778#comment-12790778 ]
Bela Ban commented on JGRP-1659:
--------------------------------
I think I know what the problem is: a misconfiguration!
{noformat}
<MFC max_credits="200k" min_threshold="0.20"/>
<FRAG2 frag_size="60k"/>
{noformat}
With MFC, a receiver sends new credits to the sender(s) once the remaining credits drop to min_threshold * max_credits, here 20% of 200k = 40k.
However, when FRAG2 sends a 60k message and the sender has fewer than 60k credits available (but still more than 40k), the receivers won't automatically send new credits. Instead, it is the sender that has to ask the receivers for more credits, and it does so at most every max_block_time ms (default: 5000).
So this would slow things down dramatically!
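To make the gap concrete, here is a minimal Java sketch of the condition described above. It is not JGroups code: the class and method names are hypothetical, "k" is assumed to mean 1000 bytes, and the replenish comparison (<=) is an assumption about the exact boundary.

```java
// Hypothetical model of the stall window; not part of the JGroups API.
final class CreditStallSketch {

    /**
     * Returns true when the sender cannot send msgSize bytes, yet the
     * receiver has no reason to replenish on its own because the remaining
     * credits are still above minCredits.
     */
    static boolean senderStalls(long remaining, long msgSize, long minCredits) {
        boolean senderBlocks = remaining < msgSize;             // not enough credits for this message
        boolean receiverReplenishes = remaining <= minCredits;  // assumed boundary condition
        return senderBlocks && !receiverReplenishes;
    }

    public static void main(String[] args) {
        // min_threshold * max_credits = 0.20 * 200k = 40k
        long minCredits = Math.round(200_000 * 0.20);

        // 50k left, 60k fragment: sender blocks, receiver stays silent
        System.out.println(senderStalls(50_000, 60_000, minCredits));
        // 30k left, 60k fragment: sender blocks, but receiver replenishes
        System.out.println(senderStalls(30_000, 60_000, minCredits));
        // 50k left, 20k fragment: sender can simply send
        System.out.println(senderStalls(50_000, 20_000, minCredits));
    }
}
```

In this model, a stall is only possible when the remaining credits lie between 40k and 60k, which is the window in which the sender has to fall back on explicit credit requests.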
SOLUTION:
- Either set FRAG2.frag_size to be less than MFC.min_threshold * MFC.max_credits (=40k), e.g. to 20k, or
- Increase max_credits or min_threshold, such that their product is greater than FRAG2.frag_size
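The rule behind both options can be written as a small check. This is only a sketch: the class and method are hypothetical helpers, not JGroups API, "k" is assumed to be 1000 bytes, and the values 400k and 0.40 used for the second option are illustrative choices rather than recommendations from this issue.

```java
// Hypothetical configuration check; not part of the JGroups API.
final class FlowControlConfigCheck {

    /** True if frag_size stays below min_threshold * max_credits. */
    static boolean fragSizeIsSafe(long maxCredits, double minThreshold, long fragSize) {
        long minCredits = Math.round(maxCredits * minThreshold);
        return fragSize < minCredits;
    }

    public static void main(String[] args) {
        // Configuration from this issue: 60k is not below 40k
        System.out.println(fragSizeIsSafe(200_000, 0.20, 60_000));
        // Option 1: reduce frag_size to 20k (below 40k)
        System.out.println(fragSizeIsSafe(200_000, 0.20, 20_000));
        // Option 2a (illustrative value): max_credits = 400k, product is 80k
        System.out.println(fragSizeIsSafe(400_000, 0.20, 60_000));
        // Option 2b (illustrative value): min_threshold = 0.40, product is 80k
        System.out.println(fragSizeIsSafe(200_000, 0.40, 60_000));
    }
}
```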
I'll try with multiple nodes on my end to make sure this is the correct recommendation.
> deadlock in MFC with default configuration
> ------------------------------------------
>
> Key: JGRP-1659
> URL: https://issues.jboss.org/browse/JGRP-1659
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.2.7
> Reporter: Mircea Markus
> Assignee: Bela Ban
> Fix For: 3.4
>
> Attachments: expiration-test.zip
>
>
> MFC.down does the following:
> {code:java}
> credits.decrement(length, block_time); //A
> if(needToSendCreditRequest()) //B
> sendCreditRequest(tuple.getVal1(), Math.min(max_credits)
> {code}
> A blocks forever even if the MFC.max_block_time is configured:
> {code:xml}
> <MFC max_credits="200k" min_threshold="0.20" max_block_time="1"/>
> {code}
> This happens at the same time on the whole cluster. B never gets invoked, so neither wake-up condition for credits.decrement (credits received or timeout) is ever satisfied, and the whole cluster freezes.