[jboss-jira] [JBoss JIRA] (JGRP-1807) UNICAST: skipping of seqnos

Bela Ban (JIRA) issues at jboss.org
Wed Mar 12 02:17:10 EDT 2014


    [ https://issues.jboss.org/browse/JGRP-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12952170#comment-12952170 ] 

Bela Ban commented on JGRP-1807:
--------------------------------

Same change for NAKACK2
                
> UNICAST: skipping of seqnos
> ---------------------------
>
>                 Key: JGRP-1807
>                 URL: https://issues.jboss.org/browse/JGRP-1807
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.2.13, 3.5
>
>
> {noformat}
> The log starts with:
> 10-Mar-2014 13:21:47 WARN  [org.jgroups.protocols.UNICAST2] (OOB-105,shared=tcp) AS_DR_IBE03/web: (requester=AS_DR_IBE06/web) message AS_DR_IBE06/web::1511786 not found in retransmission table of AS_DR_IBE06/web:
> [1511785 | 1511785 | 1511857] (53 elements, 19 missing)
> The numbers are 1511786-1511804  for "not found in retransmission...."
> And end:
> 10-Mar-2014 14:48:26 WARN  [org.jgroups.protocols.UNICAST2] (OOB-118,shared=tcp) AS_DR_IBE03/web: (requester=AS_DR_IBE06/web) message AS_DR_IBE06/web::1511804 not found in retransmission table of AS_DR_IBE06/web:
> [1511785 | 1511785 | 1514802] (2998 elements, 19 missing) 
> {noformat}
> It seems that 03 is missing messages 1511785-1511804 which it sent to 06. Since a null message cannot be added to the sender table (the {{msg.isFlagSet()}} call would throw an NPE first), I assume we're skipping a seqno:
> In {{down()}} of {{UNICAST}}, {{UNICAST2}} and {{UNICAST3}}, a skipped seqno leads to endless retransmission requests. Example:
> * We get the next seqno 1, add the message to the table and send it
> * We get the next seqno 2. However, if {{running}} is false, we don't add the message
> * We get the next seqno 3. Now {{running}} is true, and we add 3 to the table
> --> Now message 2 is missing and will always be null, as it was never added to the table
> This is highly unlikely, as I haven't been able to find a scenario where {{running}} flips from true to false and back to true quickly. If it flips from true to false, this is because {{stop()}} has been called. Also, in {{down()}}, we actually check {{running}} and return if it is false.
> When {{stop()}} is called, the connections are all removed, so the seqno is reset to 1.
> Anyway, I'm going to replace the {{while(running)}} loop with a {{do while(running)}} loop, so we always add the message to the table, even if {{running}} is false.
> [1] https://github.com/belaban/JGroups/blob/Branch_JGroups_3_2/src/org/jgroups/protocols/UNICAST2.java#L490
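
The scenario in the description can be reproduced with a small model. This is only a sketch under stated assumptions, not the JGroups code: the class {{SeqnoSkipSketch}}, its {{TreeMap}}-based table and the method names are all hypothetical, and the retry-on-exception shape of the loop is assumed. It shows that a {{while(running)}} loop can consume a seqno without storing the message, while a {{do while(running)}} loop always stores it at least once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of a unicast sender; NOT the JGroups classes.
final class SeqnoSkipSketch {
    // Stands in for the sender's retransmission table (seqno -> message).
    private final TreeMap<Long, String> sentMsgs = new TreeMap<>();
    private long nextSeqno = 1;
    volatile boolean running = true;

    // Old shape: the seqno is consumed, but if running is false the
    // loop body never executes and nothing is stored under that seqno.
    long sendWithWhile(String msg) {
        long seqno = nextSeqno++;
        while (running) {
            try {
                sentMsgs.put(seqno, msg);
                break;
            } catch (RuntimeException e) {
                // assumed behaviour: retry for as long as we are running
            }
        }
        return seqno;
    }

    // New shape: the body runs at least once, so every consumed seqno
    // gets an entry in the table even when running is false.
    long sendWithDoWhile(String msg) {
        long seqno = nextSeqno++;
        do {
            try {
                sentMsgs.put(seqno, msg);
                break;
            } catch (RuntimeException e) {
                // assumed behaviour: retry for as long as we are running
            }
        } while (running);
        return seqno;
    }

    // Seqnos between the lowest and highest stored entry that have no message.
    List<Long> missing() {
        List<Long> gaps = new ArrayList<>();
        if (sentMsgs.isEmpty()) return gaps;
        for (long s = sentMsgs.firstKey(); s < sentMsgs.lastKey(); s++) {
            if (!sentMsgs.containsKey(s)) gaps.add(s);
        }
        return gaps;
    }

    // Replays the three steps from the description: send 1, send 2 while
    // running is false, send 3 after running is true again.
    static List<Long> gaps(boolean useDoWhile) {
        SeqnoSkipSketch s = new SeqnoSkipSketch();
        boolean[] runningPerSend = {true, false, true};
        for (boolean r : runningPerSend) {
            s.running = r;
            if (useDoWhile) s.sendWithDoWhile("msg"); else s.sendWithWhile("msg");
        }
        return s.missing();
    }

    public static void main(String[] args) {
        System.out.println("while:    missing " + gaps(false)); // missing [2]
        System.out.println("do-while: missing " + gaps(true));  // missing []
    }
}
```

In this model the gap at seqno 2 never heals, because the receiver keeps asking for a message the sender never stored; that matches the repeated "not found in retransmission table" warnings in the log.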
