[jboss-jira] [JBoss JIRA] Commented: (JGRP-665) Merge and Multiplexer have a race issue

Wed Jan 16 19:24:19 EST 2008

    [ http://jira.jboss.com/jira/browse/JGRP-665?page=comments#action_12395419 ] 

Vladimir Blagojevic commented on JGRP-665:
------------------------------------------

The solution has each multiplexer send its service list repeatedly until either the merge times out or we are notified that all members have merged correctly. It is a rather simple algorithm, and TCP mux merging now works 100% of the time (I ran the test 50 times). Oddly enough, UDP mux merging started to work as well. We thought it would fail because of the UNICAST issue - http://jira.jboss.com/jira/browse/JGRP-659.
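
For illustration only, the resend loop could look roughly like the sketch below. This is not the actual Multiplexer code: the class name ServiceListResender, the method resendUntilMerged, and its parameters (sendServiceList, allMerged, mergeTimeoutMs, resendIntervalMs) are all hypothetical names, and how the "all members merged" notification is detected is left to the caller.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of the resend-until-merged idea; not JGroups code. */
public class ServiceListResender {

    /**
     * Keeps sending the local service list until allMerged reports true
     * or the merge timeout expires.
     *
     * @param sendServiceList  assumed callback that multicasts the service list
     * @param allMerged        assumed check for "all members merged correctly"
     * @param mergeTimeoutMs   upper bound on the whole exchange
     * @param resendIntervalMs pause between two sends
     * @return true if the merge completed, false on timeout or interruption
     */
    public static boolean resendUntilMerged(Runnable sendServiceList,
                                            BooleanSupplier allMerged,
                                            long mergeTimeoutMs,
                                            long resendIntervalMs) {
        // nanoTime is monotonic, so wall-clock adjustments cannot skew the deadline
        long deadline = System.nanoTime() + mergeTimeoutMs * 1_000_000L;
        while (true) {
            if (allMerged.getAsBoolean()) {
                return true;                 // everyone has our service list
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false;                // merge timed out
            }
            // A send made before the peer installed the view may be discarded,
            // so the same list is simply sent again on the next pass.
            sendServiceList.run();
            try {
                Thread.sleep(Math.min(resendIntervalMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The point of resending is that a service list discarded by a member that has not yet installed the MergeView is retried until that member catches up, with the merge timeout as the bound.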

> Merge and Multiplexer have a race issue
> ---------------------------------------
>
>                 Key: JGRP-665
>                 URL: http://jira.jboss.com/jira/browse/JGRP-665
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.6, 2.7
>            Reporter: Vladimir Blagojevic
>         Assigned To: Vladimir Blagojevic
>             Fix For: 2.6.2, 2.7
>
>
> After the introduction of a new MergeTest, the following race condition has been observed.
> Say we have nodes A and B, each running one service S. They split and then heal again. Recall that we first install the view down the stack and then up the stack. When the MergeView travels up the stack to the multiplexer, we consolidate the merged service views in Multiplexer#handleMergeView.
> However, the MergeView might arrive at node A at time T and at node B at time T+N msec. One node then installs the view sooner than the other, which leads to service ack messages being discarded (in the NAKACK layer) during service view merge consolidation.
