[jboss-jira] [JBoss JIRA] (JGRP-1362) NAKACK: second line of defense for requested retransmissions that are not found

Bela Ban (JIRA) issues at jboss.org
Fri Jan 15 08:03:00 EST 2016


     [ https://issues.jboss.org/browse/JGRP-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bela Ban closed JGRP-1362.
--------------------------
    Resolution: Cannot Reproduce


> NAKACK: second line of defense for requested retransmissions that are not found
> -------------------------------------------------------------------------------
>
>                 Key: JGRP-1362
>                 URL: https://issues.jboss.org/browse/JGRP-1362
>             Project: JGroups
>          Issue Type: Enhancement
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.6.8
>
>
> When the original sender B is asked by A to retransmit message M, but doesn't have M in its retransmission table anymore, it should tell A, or else A will keep sending retransmission requests to B until A or B leaves.
> This problem should have been fixed by JGRP-1251, but if it turns out it wasn't, then this JIRA (1) provides a second line of defense to stop the endless retransmission requests and (2) gives us valuable diagnostic information to fix the underlying problem (should there still be one).
> Problem:
> - A has a NakReceiverWindow (NRW) of 50 (highest_delivered seqno) for B
> - B's NRW, however, is 200. B garbage collected all messages below 150.
> - When B sends message 201, A will ask B for retransmission of [51-200]
> - B will retransmit messages [150-200], but it cannot send messages [51-149], as it doesn't have them anymore!
> - A will add messages [150-200], but its NRW is still 50 (highest_delivered)
> - A will continue asking B for messages [51-149] (it does have [150-201])
> - This will go on forever, or until B or A leaves
> Solution:
> - When the *original sender* B of message M receives a retransmission request for M (from A), and it doesn't have M in its retransmission table, it should send back a MSG_NOT_FOUND message to A including B's digest
> - When A receives the MSG_NOT_FOUND message, it does the following:
>   - It logs its own NRW for B
>   - It logs B's digest
>   - It logs its digest history
>   (This information is valuable for investigating the underlying issue)
>   - Then A's NRW for B is adjusted:
>     - The highest_delivered seqno is set to B.digest.highest_delivered
>     - All messages in xmit_table below B.digest.highest_delivered are removed
>     - All retransmission tasks in the retransmitter <= B.digest.highest_delivered are cancelled and removed
>       (This will stop the retransmission)
> Again, this is a second line of defense, which should never be used. If the underlying problem does occur, however, we'll have valuable information in the logs to diagnose what went wrong.
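
The scenario and the proposed adjustment can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the JGroups implementation: the class NakWindowSketch and its methods are hypothetical names, message payloads are plain strings, B's digest is reduced to its highest_delivered seqno, and resuming in-order delivery after the adjustment is an assumption that the issue text does not spell out.

```java
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical model of A's receive window for one sender B (not the JGroups API). */
final class NakWindowSketch {
    private long highestDelivered;   // seqno of the last message delivered in order
    private long highestReceived;    // highest seqno seen so far
    // Received messages by seqno; delivered ones stay here in this simplified model.
    private final TreeMap<Long, String> xmitTable = new TreeMap<>();
    // Seqnos for which retransmission is still being requested.
    private final TreeSet<Long> pendingXmits = new TreeSet<>();

    NakWindowSketch(long highestDelivered) {
        this.highestDelivered = highestDelivered;
        this.highestReceived = highestDelivered;
    }

    /** Adds a message from B; every gap below it becomes a pending retransmission. */
    void receive(long seqno, String msg) {
        if (seqno <= highestDelivered) return;   // already delivered, ignore
        for (long missing = highestReceived + 1; missing < seqno; missing++)
            pendingXmits.add(missing);
        highestReceived = Math.max(highestReceived, seqno);
        pendingXmits.remove(seqno);
        xmitTable.put(seqno, msg);
        deliverInOrder();
    }

    /** Handles MSG_NOT_FOUND from B, given highest_delivered from B's digest. */
    void onMsgNotFound(long senderHighestDelivered) {
        // Diagnostic output; a real implementation would also log the digest history.
        System.out.printf("MSG_NOT_FOUND: own highest_delivered=%d, sender's=%d, pending=%d%n",
                highestDelivered, senderHighestDelivered, pendingXmits.size());
        if (senderHighestDelivered <= highestDelivered) return;   // nothing to adjust
        highestDelivered = senderHighestDelivered;
        highestReceived = Math.max(highestReceived, highestDelivered);
        // Remove all stored messages below the sender's highest_delivered.
        xmitTable.headMap(senderHighestDelivered, false).clear();
        // Cancel all retransmissions <= the sender's highest_delivered.
        pendingXmits.headSet(senderHighestDelivered, true).clear();
        // Assumption (not in the issue text): resume in-order delivery afterwards.
        deliverInOrder();
    }

    /** Advances highestDelivered over every contiguous seqno that is present. */
    private void deliverInOrder() {
        while (xmitTable.containsKey(highestDelivered + 1))
            highestDelivered++;
    }

    long highestDelivered() { return highestDelivered; }
    int pendingCount() { return pendingXmits.size(); }
    int storedCount() { return xmitTable.size(); }
}
```

Replaying the numbers from the description: a window created with highest_delivered 50 that receives 201 and then the retransmitted [150-200] is left with 99 pending requests for [51-149] and cannot advance. Calling onMsgNotFound(200) clears those requests and lets the window move past 200.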



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)

