[jboss-jira] [JBoss JIRA] Commented: (JGRP-523) Multicasts fail after a merge
Bela Ban (JIRA)
jira-events at lists.jboss.org
Fri Jun 8 03:38:11 EDT 2007
[ http://jira.jboss.com/jira/browse/JGRP-523?page=comments#action_12364650 ]
Bela Ban commented on JGRP-523:
-------------------------------
Done as suggested. Suppose the members have agreed on digest D1 but continue sending multicast messages during the merge. The actual state is then a newer digest D2, which is higher than D1. Members which didn't generate D2 will still install D1, so the messages in the delta between D2 and D1 get retransmitted. One of those messages is the merge view. Retransmitting it is harmless: the merge view was already received, so the duplicate is discarded.
Example:
- Members 192.168.5.2:4624, 192.168.5.2:4629 (4624 and 4629)
- Digest agreed upon between coordinator 4624 and 4629 is
192.168.5.2:4624: [26 : 29 (29)], 192.168.5.2:4629: [31 : 34 (34)]
The merge of the digest happens as follows:
4624:
xmit_table before: 192.168.5.2:4624: [26 : 32 (32) (size=6, missing=0, highest stability=26)]
digest received: 192.168.5.2:4624: [26 : 29 (29)], 192.168.5.2:4629: [31 : 34 (34)]
xmit_table after:
192.168.5.2:4624: [26 : 32 (32) (size=6, missing=0, highest stability=26)]
192.168.5.2:4629: [31 : 34 (34)]
4624 doesn't overwrite its own entry (it's already in the table) and only adds the entry for 4629. Note that if 4624 had reset its own entry to the previously agreed-upon digest, 4629 would never receive messages 30-32 from 4624. That would not be incorrect either, but we chose the other approach.
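The rule applied on 4624 can be sketched in a few lines of Java. This is only an illustration, not the NAKACK code: DigestMergeSketch, Entry and merge() are hypothetical names, the table is modelled as a plain map keyed by member address, and reading the three numbers of an entry as low, high and highest received is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DigestMergeSketch {

    /** Hypothetical stand-in for one entry, printed as [low : high (highReceived)]. */
    static final class Entry {
        final long low, high, highReceived;

        Entry(long low, long high, long highReceived) {
            this.low = low;
            this.high = high;
            this.highReceived = highReceived;
        }

        @Override
        public String toString() {
            return "[" + low + " : " + high + " (" + highReceived + ")]";
        }
    }

    /** Keeps every entry already in the table and adds only the senders that are missing. */
    static Map<String, Entry> merge(Map<String, Entry> xmitTable, Map<String, Entry> digest) {
        Map<String, Entry> merged = new LinkedHashMap<>(xmitTable);
        digest.forEach(merged::putIfAbsent); // an existing entry is never overwritten
        return merged;
    }

    public static void main(String[] args) {
        // State of 4624 before the merge, taken from the example above
        Map<String, Entry> table = new LinkedHashMap<>();
        table.put("192.168.5.2:4624", new Entry(26, 32, 32));

        // The agreed-upon digest
        Map<String, Entry> digest = new LinkedHashMap<>();
        digest.put("192.168.5.2:4624", new Entry(26, 29, 29));
        digest.put("192.168.5.2:4629", new Entry(31, 34, 34));

        merge(table, digest).forEach((member, entry) -> System.out.println(member + ": " + entry));
    }
}
```

Running it keeps 4624's own entry at [26 : 32 (32)] and adds [31 : 34 (34)] for 4629, matching the xmit_table shown above (without the size/missing/stability details, which the sketch does not model).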
4629:
xmit_table before: 192.168.5.2:4629: [31 : 37 (37) (size=6, missing=0, highest stability=31)]
digest received: 192.168.5.2:4624: [26 : 29 (29)], 192.168.5.2:4629: [31 : 34 (34)]
xmit_table after:
192.168.5.2:4624: [26 : 29 (29)]
192.168.5.2:4629: [31 : 37 (37) (size=6, missing=0, highest stability=31)]
As we can see, 4629 thinks the highest message received from 4624 is 29, but 4624 has actually sent messages up to 32. So 4629 will ask 4624 for retransmission of messages 30-32:
NAKACK.retransmit(): 192.168.5.2:4629: sending XMIT_REQ ([30, 32]) to 192.168.5.2:4624
NAKACK.up(): received missing messages [30 : 32]
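The range in the XMIT_REQ above is simple arithmetic on the two sequence numbers. A minimal sketch, assuming the receiver already knows the sender's highest sequence number (how NAKACK learns it is not shown here); RetransmitRangeSketch and missingRange() are hypothetical names, not part of JGroups:

```java
final class RetransmitRangeSketch {

    /**
     * Returns {first, last} of the missing sequence numbers (inclusive),
     * or an empty array if the receiver is already up to date.
     */
    static long[] missingRange(long highestReceived, long highestSent) {
        if (highestSent <= highestReceived) {
            return new long[0];
        }
        return new long[] {highestReceived + 1, highestSent};
    }

    public static void main(String[] args) {
        // 4629 installed 29 for 4624, but 4624 has sent up to 32
        long[] range = missingRange(29, 32);
        if (range.length == 2) {
            System.out.println("XMIT_REQ ([" + range[0] + ", " + range[1] + "])");
        }
    }
}
```

With the values from the example this gives the range [30, 32], the same range 4629 requests from 4624.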
> Multicasts fail after a merge
> -----------------------------
>
> Key: JGRP-523
> URL: http://jira.jboss.com/jira/browse/JGRP-523
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 2.5
> Reporter: Bela Ban
> Assigned To: Bela Ban
> Priority: Critical
> Fix For: 2.5
>
>
> To reproduce:
> - Start GossipRouter
> - Start 2 Draws with tunnel.xml, connecting to the GR
> - Paint a bit in the 2 windows, to generate messages (and thus change the digests)
> - Kill GR
> - When both Draw instances become cluster singletons, start GR again
> ==> Observe that we get messages that indicate a requested retransmission couldn't be satisfied because the message was not found:
> 14:38:16,203 [WARN] [OOB Thread,demo,192.168.5.2:4397] NAKACK.handleMessage(): 192.168.5.2:4397] discarded message from non-member 192.168.5.2:4393, my view is [192.168.5.2:4397|2] [192.168.5.2:4397]
> 14:38:16,890 [WARN] [OOB Thread,demo,192.168.5.2:4393] NAKACK.handleMessage(): 192.168.5.2:4393] discarded message from non-member 192.168.5.2:4397, my view is [192.168.5.2:4393|2] [192.168.5.2:4393]