[jboss-jira] [JBoss JIRA] (JGRP-1835) DONT_LOOPBACK flag causes credit exhaustion in MFC

Fri May 9 09:07:57 EDT 2014

    [ https://issues.jboss.org/browse/JGRP-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966770#comment-12966770 ] 

Bela Ban edited comment on JGRP-1835 at 5/9/14 9:07 AM:
--------------------------------------------------------

Hmm, the above comment is wrong: A would have to wait for B's credits (slower) before it can send new messages.

However, I found a bug: in {{Table.add(seqno, message, remove_if_possible)}}, we remove the message only if {{hd+1 == seqno}}. This condition does not always hold when multiple threads add messages marked as {{DONT_LOOPBACK}}, e.g. (next expected seqno is 4, i.e. {{hd}} is 3):
* T1 adds seqno=5
* T2 adds seqno=4
* T3 adds seqno=6

The condition {{hd+1 == seqno}} holds only for T2, not for T1 and T3. So 5 is added *but not removed*. Then 4 is added and removed. Message 6 is added but not removed either. We end up with messages 5 and 6 in the retransmission table, and they are not removed until a regular message is received. This may never happen if all messages are sent as {{DONT_LOOPBACK}}!

SOLUTION:
* When adding a {{DONT_LOOPBACK}} message to the table, add it via a new {{Table.addAndDrop(seqno, msg, filter)}} method, which adds the message and *removes as many consecutive messages as possible that pass the filter*.
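
The scenario and the proposed fix can be sketched as follows. This is only an illustration under stated assumptions: {{TableSketch}} and {{Msg}} are hypothetical stand-ins, not the JGroups {{Table}} and {{Message}} classes, and the three adds run sequentially in the order T1, T2, T3 rather than on real threads.

```java
import java.util.TreeMap;
import java.util.function.Predicate;

/** Simplified stand-in for the retransmission table (hypothetical, not the JGroups class). */
class TableSketch {
    /** Hypothetical message carrying only the flag of interest. */
    static class Msg {
        final boolean dontLoopback;
        Msg(boolean dontLoopback) { this.dontLoopback = dontLoopback; }
    }

    private final TreeMap<Long, Msg> msgs = new TreeMap<>();
    private long hd; // highest delivered (removed) seqno

    TableSketch(long highestDelivered) { this.hd = highestDelivered; }

    /** Behaviour described above: the message is removed only if it is the next expected one. */
    synchronized void add(long seqno, Msg msg, boolean removeIfPossible) {
        if (seqno <= hd) return;              // already delivered
        if (removeIfPossible && hd + 1 == seqno) {
            hd = seqno;                       // added and removed in one step
            return;
        }
        msgs.put(seqno, msg);                 // stays until something else removes it
    }

    /** Proposed fix: add, then drop every consecutive message that passes the filter. */
    synchronized void addAndDrop(long seqno, Msg msg, Predicate<Msg> filter) {
        if (seqno <= hd) return;
        msgs.put(seqno, msg);
        Msg next;
        while ((next = msgs.get(hd + 1)) != null && filter.test(next)) {
            msgs.remove(hd + 1);
            hd++;
        }
    }

    synchronized int size() { return msgs.size(); }
    synchronized long highestDelivered() { return hd; }

    public static void main(String[] args) {
        long[] order = {5, 4, 6};             // T1, T2, T3 from the example above

        TableSketch buggy = new TableSketch(3);
        for (long seqno : order) buggy.add(seqno, new Msg(true), true);
        System.out.println("add:        left in table = " + buggy.size());  // 5 and 6 remain

        TableSketch fixed = new TableSketch(3);
        for (long seqno : order) fixed.addAndDrop(seqno, new Msg(true), m -> m.dontLoopback);
        System.out.println("addAndDrop: left in table = " + fixed.size());  // nothing remains
    }
}
```

In this sketch the filter stops the removal loop at the first regular message, so regular messages would still be delivered through the normal removal path.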

> DONT_LOOPBACK flag causes credit exhaustion in MFC
> --------------------------------------------------
>
>                 Key: JGRP-1835
>                 URL: https://issues.jboss.org/browse/JGRP-1835
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.5
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.5
>
>
> When a message M is multicast by sender P and has the {{DONT_LOOPBACK}} transient flag set, it will not get looped back up the stack.
> When M passes MFC on the way down, it decrements the sent credits for P. When M is received by P, the received credits are decremented and credits are possibly sent back to P.
> However, M is *never received by P* because {{DONT_LOOPBACK}} causes it to be dropped.
> As a result, P's credits for itself get exhausted if we send many messages. Credit requests sent by P (every 5 s by default) do replenish P's credits, but if P sends many messages this is not enough, and sending slows down.
> SOLUTION:
> # Don't use {{DONT_LOOPBACK}} in upper protocols or application space. JGroups itself uses it for discovery (lower protocols).
> # Don't maintain credits for self; i.e. P doesn't decrement or replenish credits for itself, only for others
> #* Make sure P doesn't starve processing of messages from all other members (who use flow control) by circumventing flow control...
> # When sending a message tagged as {{DONT_LOOPBACK}}, decrement the *received* credits immediately and send new credits if needed
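
The exhaustion described in the issue, and the third option from the list, can be illustrated with a toy model. It is a sketch under assumptions, not the MFC implementation: {{CreditSketch}}, its fields, and the values 1000, 400 and 100 are hypothetical, only the credits P keeps for itself are modelled, and the periodic credit requests are left out.

```java
/** Toy model of the credits P keeps for itself (hypothetical, not the JGroups MFC class). */
class CreditSketch {
    private final long maxCredits;
    private final long minThreshold;
    private long sentCredits;     // bytes P may still send before it has to wait
    private long receivedCredits; // bytes P may still receive from itself before replenishing

    CreditSketch(long maxCredits, long minThreshold) {
        this.maxCredits = maxCredits;
        this.minThreshold = minThreshold;
        this.sentCredits = maxCredits;
        this.receivedCredits = maxCredits;
    }

    /** Returns false where a real sender would block waiting for credits. */
    boolean send(long length, boolean dontLoopback, boolean decrementOnSend) {
        if (sentCredits < length) return false;
        sentCredits -= length;
        // A looped-back message reaches the receive path; a DONT_LOOPBACK message
        // does so only if we account for it at send time (third option in the list).
        if (!dontLoopback || decrementOnSend) received(length);
        return true;
    }

    private void received(long length) {
        receivedCredits -= length;
        if (receivedCredits <= minThreshold) {
            long replenish = maxCredits - receivedCredits;
            receivedCredits = maxCredits;
            sentCredits = Math.min(maxCredits, sentCredits + replenish); // credits sent back to P
        }
    }

    /** Counts how many of the attempted DONT_LOOPBACK sends go through without blocking. */
    static int countSends(boolean decrementOnSend, int attempts) {
        CreditSketch p = new CreditSketch(1000, 400);
        int ok = 0;
        for (int i = 0; i < attempts; i++)
            if (p.send(100, true, decrementOnSend)) ok++;
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("without fix: " + countSends(false, 100) + " of 100 sends succeed");
        System.out.println("with fix:    " + countSends(true, 100) + " of 100 sends succeed");
    }
}
```

Without the send-time accounting, the model runs out of credits after ten 100-byte messages and never recovers on its own; with it, the sender's credits are topped up each time the received credits reach the threshold.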

--
This message was sent by Atlassian JIRA
(v6.2.3#6260)