[jboss-jira] [JBoss JIRA] Commented: (JGRP-659) Merge and UNICAST sequencing problem
Bela Ban (JIRA)
jira-events at lists.jboss.org
Fri Apr 25 04:10:08 EDT 2008
[ http://jira.jboss.com/jira/browse/JGRP-659?page=comments#action_12410664 ]
Bela Ban commented on JGRP-659:
-------------------------------
I like the idea of sending the view ID (VID) with each unicast message. However, because this changes the serialization format, we cannot backport it to 2.6.3.
The receivers maintain a hashmap, keyed by VID. The values are receiver windows (Entry with AckReceiverWindow). Upon reception of a message with a new VID, we create a new entry and add the message to the corresponding receiver window.
Example:
- V2={A,B} and V3={C,D}.
- Now a merge occurs with MergeView V5. A and C install V5 in their respective subgroups
- At time T, A and B install V5
- At time T+5, C and D install V5
- At time T+2 (*before* {C,D} install V5), B sends a unicast message M1 to C
- C's receiver window is keyed by V3 and contains (C:50, D:20).
- Now C creates a new entry for M1: V5 with B:2
- At T+5, {C,D} receive MergeView V5
- {C,D} trash all connections in VIDs smaller than V5, so the entry for V3 and all of its connections are discarded
- C's receiver window for V5 is now A:1, B:2, C:1, D:1
- At T+8, B sends another unicast message M2 to C
- C looks up the correct receiver window for V5 (sent with M2) and adds M2
- Now the receiver window at C for V5 is A:1, B:3, C:1, D:1
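The walk-through above can be sketched roughly as follows. This is a minimal model and not JGroups code: the class ViewKeyedReceiverTable and its methods are hypothetical, view IDs are reduced to plain longs, each receiver window is reduced to a "next expected seqno" counter per sender (which is how I read the A:1, B:2 notation), and out-of-order messages are simply rejected rather than buffered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch, not part of JGroups: receiver state keyed by view ID (VID).
final class ViewKeyedReceiverTable {
    // VID -> (sender -> next expected seqno). A TreeMap keeps the VIDs ordered,
    // which makes "drop everything older than V" a single range operation.
    private final TreeMap<Long, Map<String, Long>> windowsByVid = new TreeMap<>();
    private final List<String> accepted = new ArrayList<>();

    // Handles a unicast message tagged with the sender's VID.
    // A message carrying an unknown VID creates a new entry on the fly.
    boolean receive(long vid, String sender, long seqno, String payload) {
        Map<String, Long> window = windowsByVid.computeIfAbsent(vid, k -> new TreeMap<>());
        long expected = window.getOrDefault(sender, 1L);
        if (seqno != expected) {
            return false; // simplification: a real window would buffer or retransmit
        }
        accepted.add(payload);
        window.put(sender, expected + 1);
        return true;
    }

    // Installing view 'vid' trashes all entries with a smaller VID and keeps
    // (or creates) the entry for 'vid' itself.
    void installView(long vid) {
        windowsByVid.headMap(vid, false).clear();
        windowsByVid.computeIfAbsent(vid, k -> new TreeMap<>());
    }

    long nextExpected(long vid, String sender) {
        Map<String, Long> window = windowsByVid.get(vid);
        return window == null ? 1L : window.getOrDefault(sender, 1L);
    }

    Set<Long> knownViewIds() {
        return windowsByVid.keySet();
    }

    List<String> accepted() {
        return accepted;
    }

    public static void main(String[] args) {
        ViewKeyedReceiverTable c = new ViewKeyedReceiverTable();
        c.installView(3);                 // C is in V3={C,D}
        c.receive(5, "B", 1, "M1");       // T+2: M1 arrives tagged with V5
        c.installView(5);                 // T+5: C installs V5, the V3 entry is trashed
        c.receive(5, "B", 2, "M2");       // T+8: M2 lands in the existing V5 entry
        System.out.println(c.nextExpected(5, "B")); // prints 3
        System.out.println(c.accepted());           // prints [M1, M2]
    }
}
```

The point of the sketch is that installing V5 only removes entries for older VIDs, so the state created by M1 survives the merge and M2 is not stuck waiting for a seqno that was already consumed.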
> Merge and UNICAST sequencing problem
> ------------------------------------
>
> Key: JGRP-659
> URL: http://jira.jboss.com/jira/browse/JGRP-659
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 2.6, 2.4, 2.5
> Reporter: Vladimir Blagojevic
> Assigned To: Bela Ban
> Fix For: 2.7
>
> Attachments: ConcurrentMemberTest.java
>
>
> The problem is related to the trashing of the connection table in UNICAST during a merge. Consider the following scenario:
> There are 4 nodes in a cluster: A, B, C, and D. After a network split we have two islands, A,B and C,D. When the network heals, a MergeView eventually gets installed in both islands. MergeView installation causes trashing of the UNICAST connection table [1].
> However, if the MergeView gets installed in island A,B at time T and in island C,D at time T+N msec, and a node from island A,B sends a unicast message within this N msec time window, then we'll run into problems with unicast sequencing at C and D. Why? Because the next message coming from island A,B into C,D will have a sequence number > 1, while UNICAST at C,D, after connection trashing (from the merge), expects a starting sequence number of 1. This causes UNICAST in C and/or D to wait forever for missing messages. The final outcome is that no more unicast messages coming from A and/or B will ever be delivered at C and/or D!
> [1]http://jira.jboss.com/jira/browse/JGRP-348
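The stall described in the issue can be reproduced with a toy model. The sketch below is not the UNICAST implementation: the class TrashedWindowStall is hypothetical, it keeps one window per sender (no view ID), and it assumes a fresh window always starts at seqno 1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not part of JGroups: one receiver window per sender,
// trashed wholesale when a MergeView is installed.
final class TrashedWindowStall {
    private final Map<String, Long> nextExpected = new HashMap<>();
    private final Map<String, TreeMap<Long, String>> buffered = new HashMap<>();
    int deliveredCount;

    // Models the connection-table trashing triggered by MergeView installation.
    void trashConnections() {
        nextExpected.clear();
        buffered.clear();
    }

    // Buffers the message, then delivers in order starting from the next expected seqno.
    void receive(String sender, long seqno, String payload) {
        TreeMap<Long, String> buf = buffered.computeIfAbsent(sender, k -> new TreeMap<>());
        buf.put(seqno, payload);
        long next = nextExpected.getOrDefault(sender, 1L); // fresh window expects seqno 1
        while (buf.containsKey(next)) {
            buf.remove(next);
            deliveredCount++;
            next++;
        }
        nextExpected.put(sender, next);
    }

    int pending(String sender) {
        TreeMap<Long, String> buf = buffered.get(sender);
        return buf == null ? 0 : buf.size();
    }

    public static void main(String[] args) {
        TrashedWindowStall c = new TrashedWindowStall();
        c.receive("B", 1, "M1");   // sent by B inside the N msec window, delivered at C
        c.trashConnections();      // C installs the MergeView at T+N
        c.receive("B", 2, "M2");   // C expects seqno 1 again, so M2 is only buffered
        c.receive("B", 3, "M3");
        System.out.println(c.deliveredCount); // prints 1
        System.out.println(c.pending("B"));   // prints 2
    }
}
```

Since B never resends seqno 1, everything after the trash stays buffered, which matches the "wait forever" outcome in the report.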