[jboss-jira] [JBoss JIRA] (JGRP-1396) NAKACK2: merge NakReceiverWindow and Retransmitter

Bela Ban (Commented) (JIRA) jira-events at lists.jboss.org
Mon Jan 2 09:26:09 EST 2012


    [ https://issues.jboss.org/browse/JGRP-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653339#comment-12653339 ] 

Bela Ban commented on JGRP-1396:
--------------------------------

Here's another problem: we want to advance 'low' when removing a message (and nulling it). This happens at the receivers when discard_delivered_msgs is true. But there is a problem with the following scenario (threads T1 and T2):
- HD=9
- T1 calls add(10). 10 already exists
- T2 calls remove() with nullify=true
- T1 reads HD, value is 9, so it continues (it would return if HD were 10)
- T2 checks that element at index HD+1 is not null
- T2 gets element at index HD+1
- T2 increments HD to 10 and nulls the element at index 10
- T1 does a CAS(null, 10) and succeeds because T2 just nulled the element
==> We now deliver the message at index 10 TWICE (or multiple times)!
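
To make the interleaving above concrete, here is a minimal single-threaded replay in Java. It is only a sketch: the class name DuplicateDeliveryReplay, the buffer size of 16, the use of the seqno directly as the array index, and the "msg-10" payload are assumptions for illustration, not the actual NAKACK2 code. Each block of statements stands for the step T1 or T2 takes in the list above, executed in that order.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical replay of the T1/T2 interleaving, run on a single thread.
class DuplicateDeliveryReplay {

    /** Returns how many times the message with seqno 10 ends up being delivered. */
    static int replay() {
        AtomicReferenceArray<String> buf = new AtomicReferenceArray<>(16);
        long hd = 9;               // HD=9
        int deliveries = 0;
        buf.set(10, "msg-10");     // 10 already exists

        // T1: add(10) reads HD, sees 9 and continues
        long hdSeenByT1 = hd;
        boolean t1Continues = 10 > hdSeenByT1;

        // T2: remove() with nullify=true
        int next = (int) (hd + 1);
        String removed = buf.get(next);   // element at HD+1 is not null
        if (removed != null) {
            hd = hd + 1;                  // increments HD to 10
            buf.set(next, null);          // nulls the element at index 10
            deliveries++;                 // T2 delivers message 10
        }

        // T1: CAS(null, 10) succeeds because T2 just nulled the element
        if (t1Continues && buf.compareAndSet(10, null, "msg-10")) {
            deliveries++;                 // message 10 is passed up a second time
        }
        return deliveries;
    }
}
```

Calling DuplicateDeliveryReplay.replay() counts two deliveries for seqno 10, which is the duplicate described above.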

SOLUTION:
- Order the reads and writes of T1 and T2 such that this outcome is impossible:
- remove():
  #1 write HD
  #2 null element
- add(seqno) [with nullify=true]:
  #1 read HD (return if seqno <= HD)
  #2 read element (return if element != null)
  #3 read HD (again!) (return if seqno <= HD)
  #4 CAS(null, element): if true: pass message up, else: discard

With the above solution, no interleaving of add() and remove() ends up with an incorrect delivery!
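
The ordering above could be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the actual NAKACK2 implementation: the class name RingBufferSketch, the fixed capacity, the seqno-modulo-capacity indexing, non-negative seqnos, and the synchronized remove() (a single remover at a time) are all hypothetical. The sketch also ignores wrap-around, i.e. it assumes no seqno is more than one capacity ahead of HD.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch of the add()/remove() ordering; not the JGroups code.
class RingBufferSketch {
    private final AtomicReferenceArray<Object> buf;
    private final AtomicLong hd;   // HD: highest delivered seqno

    RingBufferSketch(int capacity, long initialHd) {
        buf = new AtomicReferenceArray<>(capacity);
        hd = new AtomicLong(initialHd);
    }

    private int index(long seqno) {
        return (int) (seqno % buf.length());
    }

    /** Returns true if msg was added and should be passed up, false if it is discarded. */
    boolean add(long seqno, Object msg) {
        if (seqno <= hd.get()) return false;      // #1 read HD
        int i = index(seqno);
        if (buf.get(i) != null) return false;     // #2 read element
        if (seqno <= hd.get()) return false;      // #3 read HD again
        return buf.compareAndSet(i, null, msg);   // #4 CAS(null, element)
    }

    /** Removes and returns the message at HD+1, or null if it has not been received. */
    synchronized Object remove() {
        long next = hd.get() + 1;
        int i = index(next);
        Object msg = buf.get(i);
        if (msg == null) return null;
        hd.set(next);        // #1 write HD
        buf.set(i, null);    // #2 null element
        return msg;
    }

    long highestDelivered() {
        return hd.get();
    }
}
```

The second HD read in add() is what closes the window: if add() observed a null element only because remove() had just nulled it, remove() must already have written HD, so read #3 sees seqno <= HD and returns before the CAS. For example, with HD=9, add(10, m) returns true, a second add(10, m) returns false, remove() returns m and moves HD to 10, and a later add(10, m) returns false.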
                
> NAKACK2: merge NakReceiverWindow and Retransmitter
> --------------------------------------------------
>
>                 Key: JGRP-1396
>                 URL: https://issues.jboss.org/browse/JGRP-1396
>             Project: JGroups
>          Issue Type: Enhancement
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.1
>
>
> Both NakReceiverWindow and Retransmitter use their own data structures to keep a list of messages received (NRW) and seqnos to be retransmitted (Retransmitter). This is redundant and costly memory-wise.
> I suggest merging the 2 classes, or at least letting them share the data structure which keeps track of received messages.
> Suggestion II: create a ring buffer with a (changeable) capacity that keeps track of received messages and messages to be retransmitted.

