[jboss-jira] [JBoss JIRA] Commented: (JGRP-659) Merge and UNICAST sequencing problem

Troy Schulz (JIRA) jira-events at lists.jboss.org
Tue Mar 25 13:15:40 EDT 2008


    [ http://jira.jboss.com/jira/browse/JGRP-659?page=comments#action_12404536 ] 
            
Troy Schulz commented on JGRP-659:
----------------------------------

As for resolution of the problem without FLUSH, I would just like to 
throw out an idea:

Since the Receiver and Sender windows are directly tied to MergeView 
boundaries, it seems that passing the 'version' of the window along with 
each message would let the receiver tell whether the message belongs in 
a new version of the window rather than in the existing one. The 
receiver would continue to use the old 'version' of the windows until it 
processes the MergeView. When it receives the MergeView, it would clean 
out the old version of the window and then process the messages 
associated with the new window 'version'. It may also be a good idea to 
pass the last id of the previous 'version' of the window, so that all of 
the old window's messages are properly processed before it gets cleaned 
up and the receiver moves on to the new version of the windows.
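
To make the idea concrete, here is a rough sketch of what a versioned 
receive window could look like. It is not JGroups code: the class 
VersionedWindow and its methods receive, onMergeView and delivered are 
all hypothetical names, and it assumes one window per sender with 
sequence numbers starting at 1 in every version. The 'last id' 
refinement is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch (not JGroups code): a receive window tagged with a version. */
public class VersionedWindow {
    private int version = 0;       // version this receiver is currently using
    private long nextSeqno = 1;    // next seqno to deliver in the current version
    private TreeMap<Long, String> window = new TreeMap<>();      // current version
    private TreeMap<Long, String> nextWindow = new TreeMap<>();  // version + 1, held back
    private final List<String> delivered = new ArrayList<>();

    /** Each message carries the window version the sender was using. */
    public void receive(int msgVersion, long seqno, String payload) {
        if (msgVersion == version) {
            if (seqno >= nextSeqno) {          // ignore duplicates already delivered
                window.put(seqno, payload);
                deliverInOrder();
            }
        } else if (msgVersion == version + 1) {
            // The sender has already installed the MergeView, we have not:
            // park the message instead of mixing it into the old window.
            nextWindow.put(seqno, payload);
        }
        // Any other version is stale and is dropped.
    }

    /** Called when this receiver processes the MergeView. */
    public void onMergeView() {
        version++;
        window = nextWindow;           // the old window is cleaned out here
        nextWindow = new TreeMap<>();
        nextSeqno = 1;                 // assumption: seqnos restart at 1 per version
        deliverInOrder();              // process what was parked for the new version
    }

    private void deliverInOrder() {
        String payload;
        while ((payload = window.remove(nextSeqno)) != null) {
            delivered.add(payload);
            nextSeqno++;
        }
    }

    public List<String> delivered() {
        return delivered;
    }
}
```

With this, a message tagged with the new version that arrives before 
the local MergeView is held in nextWindow and only delivered once 
onMergeView() runs, so it is neither lost nor mixed into the old window.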

> Merge and UNICAST sequencing problem
> ------------------------------------
>
>                 Key: JGRP-659
>                 URL: http://jira.jboss.com/jira/browse/JGRP-659
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.6, 2.4, 2.5
>            Reporter: Vladimir Blagojevic
>         Assigned To: Bela Ban
>             Fix For: 2.7
>
>         Attachments: ConcurrentMemberTest.java
>
>
> The problem is related to the trashing of the connection table in UNICAST during a merge. Consider the following scenario:
> There are 4 nodes in a cluster: A, B, C, and D. After a network split we have two islands, A,B and C,D. When the network heals, a MergeView eventually gets installed in both islands. MergeView installation causes trashing of the UNICAST connection table [1].
> However, if the MergeView gets installed in island A,B at time T and in island C,D at time T+N msec, and a node from island A,B sends a unicast message within this N msec window, then we run into problems with unicast sequencing at C and D. Why? Because the next message coming from island A,B into C,D will have a sequence number > 1, while UNICAST at C,D, after the connection trashing caused by the merge, expects a starting sequence number of 1. This causes UNICAST in C and/or D to wait forever for the missing messages. The final outcome is that no more unicast messages coming from A and/or B will ever be delivered at C and/or D!
> [1] http://jira.jboss.com/jira/browse/JGRP-348
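
The scenario quoted above can be reproduced with a toy model. This is 
only a sketch: MergeSequencingDemo and its nested Receiver are 
hypothetical stand-ins, not the real UNICAST classes, and it assumes 
that C starts with a fresh connection from A and that trashing the 
connection table simply resets the expected sequence number to 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class MergeSequencingDemo {

    /** Simplified stand-in for one receive connection at node C (hypothetical). */
    public static class Receiver {
        public long expected = 1;                                   // next seqno to deliver
        public final TreeMap<Long, String> buffered = new TreeMap<>();
        public final List<String> delivered = new ArrayList<>();

        public void receive(long seqno, String payload) {
            buffered.put(seqno, payload);
            String p;
            // Deliver strictly in order; stop at the first gap.
            while ((p = buffered.remove(expected)) != null) {
                delivered.add(p);
                expected++;
            }
        }

        /** Models the connection table being trashed when the MergeView is installed. */
        public void installMergeView() {
            buffered.clear();
            expected = 1;
        }
    }

    public static void main(String[] args) {
        Receiver c = new Receiver();

        // Time T: A has installed the MergeView and sends seqno 1 to C
        // inside the N msec window, before C has installed it.
        c.receive(1, "m1");

        // Time T+N: C installs the MergeView and resets its connection.
        c.installMergeView();

        // A keeps counting from where it was, so C now sees seqnos > 1
        // while waiting for a seqno 1 that will never be sent again.
        c.receive(2, "m2");
        c.receive(3, "m3");

        System.out.println("delivered: " + c.delivered);          // [m1]
        System.out.println("stuck:     " + c.buffered.keySet());  // [2, 3]
    }
}
```

Every later message from A ends up in the buffer behind the gap at 
sequence number 1, which matches the 'wait forever' outcome described 
in the issue.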
