[jboss-jira] [JBoss JIRA] (JGRP-1563) UNICAST2: first-message-lost problem

Dan Berindei (JIRA) jira-events at lists.jboss.org
Sun Feb 10 04:55:56 EST 2013


    [ https://issues.jboss.org/browse/JGRP-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753315#comment-12753315 ] 

Dan Berindei commented on JGRP-1563:
------------------------------------

Bela, I think there may be a concurrency issue in your fix. I got this NPE in the UNICAST2 RetransmitTask while testing a large cluster:

{noformat}
21:00:39,998 TRACE (OOB-1,ISPN,NodeJ-57905:) [RequestCorrelator] sending rsp for 1054 to NodeAB-41979
21:00:40,002 ERROR (Timer-4,ISPN,NodeJ-57905:) [DefaultTimeScheduler] exception executing task UNICAST2: RetransmitTask (interval=1000 ms): java.lang.NullPointerException
21:00:39,999 TRACE (OOB-1,ISPN,NodeJ-57905:) [UNICAST2] NodeJ-57905: created connection to NodeAB-41979 (conn_id=19)
21:00:40,004 TRACE (OOB-1,ISPN,NodeJ-57905:) [UNICAST2] NodeJ-57905 --> DATA(NodeAB-41979: #1, conn_id=19, first)
21:00:40,011 TRACE (OOB-1,ISPN,NodeJ-57905:) [TCP] sending msg to NodeAB-41979, src=NodeJ-57905, headers are RequestCorrelator: id=200, type=RSP, id=1054, rsp_expected=false, UNICAST2: DATA, seqno=1, conn_id=19, first, TCP: [channel_name=ISPN]
21:00:40,011 TRACE (OOB-1,ISPN,NodeJ-57905:) [TCP] dest=127.0.0.1:7927 (81 bytes)
{noformat}

                
> UNICAST2: first-message-lost problem
> ------------------------------------
>
>                 Key: JGRP-1563
>                 URL: https://issues.jboss.org/browse/JGRP-1563
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.2.6, 3.3
>
>
> If A sends its first message A#1 to B, but the network drops A#1, B won't be able to receive A#1 unless A sends another message.
> Even the stable task won't help: as B only sends stable messages to connected members (and A isn't), it will never send a stable message to A.
> SOLUTION:
> - Use acking for the first message
> - Continue sending the first message until it has been acked, or a configurable time period has elapsed
> - Note that we cannot cancel this task based on *membership* (view changes), as unicast messages can be sent to members outside of our view
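
The solution outlined above (resend the first message until it is acked or a configurable period has elapsed) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual UNICAST2 code: the class FirstMessageResender, its methods, and the maxTimeMs parameter are hypothetical names, and the periodic timer that would call onTimer() is left to the caller. Note that the task stops only on an ack or on the deadline, never on a view change.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; names are illustrative and not part of the JGroups API. */
class FirstMessageResender {
    private final Runnable resend;   // assumed callback that puts the first message on the wire again
    private final long deadlineMs;   // absolute time after which we give up
    private final AtomicBoolean acked = new AtomicBoolean(false);
    private int attempts;

    FirstMessageResender(Runnable resend, long startMs, long maxTimeMs) {
        this.resend = resend;
        this.deadlineMs = startMs + maxTimeMs;
    }

    /** Called (possibly from another thread) when the receiver acks the first message. */
    void ack() {
        acked.set(true);
    }

    /**
     * Called by a periodic timer with the current time.
     * Returns true if the task should stay scheduled, false if it is finished.
     * Membership is deliberately not consulted: the destination may be outside our view.
     */
    synchronized boolean onTimer(long nowMs) {
        if (acked.get() || nowMs >= deadlineMs)
            return false;
        resend.run();
        attempts++;
        return true;
    }

    synchronized int attempts() {
        return attempts;
    }

    public static void main(String[] args) {
        FirstMessageResender task = new FirstMessageResender(
                () -> System.out.println("resending first message"), 0L, 5000L);
        task.onTimer(0L);      // first message resent
        task.onTimer(1000L);   // still no ack, resent again
        task.ack();            // receiver acknowledged
        boolean keepGoing = task.onTimer(2000L);
        System.out.println("attempts=" + task.attempts() + ", keepGoing=" + keepGoing);
    }
}
```

Passing the time in as an argument keeps the stop conditions easy to test; a real protocol would more likely read the clock itself and run inside its existing timer.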


