[jboss-jira] [JBoss JIRA] (JGRP-2167) Highest seqno is not resent nor recorded on receivers
Bela Ban (JIRA)
issues at jboss.org
Thu May 11 04:59:00 EDT 2017
[ https://issues.jboss.org/browse/JGRP-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404904#comment-13404904 ]
Bela Ban edited comment on JGRP-2167 at 5/11/17 4:58 AM:
---------------------------------------------------------
OK, I added unit test {{LastMessageDroppedTest.testLastMessageAndLastSeqnoDropped()}} with {{STABLE.desired_avg_gossip_time}} set to 3000. This passes fine.
The test drops msg #3 out of 3 messages, and also drops the subsequent LAST_SEQNO (HIGHEST_SEQNO) message. However, STABLE kicks in and fixes this. STABLE messages are sent at random times within the range {{\[0 .. desired_avg_gossip_time*2\]}}, so in the above test the maximum wait is ~6 seconds.
If you want, we can add another property, {{stable_interval}}, which would send STABLE messages at a fixed interval (rather than at random times within a range) to make this more predictable. If set, it would override {{desired_avg_gossip_time}}.
Note that STABLE messages are sent _unreliably_, so if a STABLE message is dropped, we have to wait for the next round to get stability.
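To make the timing concrete, here is a rough sketch of the arithmetic described above. It is not the STABLE implementation: the class and method names are made up for illustration, the uniform draw is an assumption, and {{stable_interval}} is only the property proposed in this comment (modelled here as a plain int where 0 means "not set").

```java
import java.util.Random;

public class GossipDelaySketch {

    // Upper bound of the range [0 .. desired_avg_gossip_time*2], in milliseconds.
    public static int maxDelayMs(int desiredAvgGossipTimeMs) {
        return desiredAvgGossipTimeMs * 2;
    }

    // Picks the delay before the next STABLE round.
    // stableIntervalMs models the proposed (hypothetical) stable_interval; 0 means "not set".
    public static int nextDelayMs(int desiredAvgGossipTimeMs, int stableIntervalMs, Random rnd) {
        if (stableIntervalMs > 0) {
            return stableIntervalMs; // a fixed interval overrides the random range
        }
        // Assumption: uniform draw over [0 .. max], both ends inclusive.
        return rnd.nextInt(maxDelayMs(desiredAvgGossipTimeMs) + 1);
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        // With desired_avg_gossip_time=3000 the upper bound is 6000 ms.
        System.out.println("max delay (ms): " + maxDelayMs(3000));
        for (int i = 0; i < 3; i++) {
            System.out.println("random delay (ms): " + nextDelayMs(3000, 0, rnd));
        }
        System.out.println("fixed delay (ms): " + nextDelayMs(3000, 5000, rnd));
    }
}
```

Since STABLE messages are unreliable, a dropped STABLE message would add one more such delay before the next round can bring stability.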
> Highest seqno is not resent nor recorded on receivers
> -----------------------------------------------------
>
> Key: JGRP-2167
> URL: https://issues.jboss.org/browse/JGRP-2167
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.1
> Reporter: Radim Vansa
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 4.0.4
>
>
> I am investigating an issue in a stress test which leads me to a situation where in a TCP-based configuration a {{GMS[VIEW]}} is broadcast to all nodes, but it is not received by some of them. Soon after that there's a {{NAKACK2.HIGHEST_SEQNO}} that causes the node that is missing the last seqno to resend it, but the retransmit is not received either. There are no further retries, and generally no NAKACK2 activity until about 30 seconds later (when another node leaves after some timeout in the test).
> The receiver does not keep asking for retransmissions until it gets them, but it seems that {{NAKACK2.handleHighestSeqno}} doesn't update {{Table.hr}} (not sure if having highest received set to non-received msg would be legal, though).
> The sender uses default value {{NAKACK2.resend_last_seqno_max_times=1}}, and as there are no further mcast messages, the highest sent seqno does not change on sender.