[jboss-jira] [JBoss JIRA] (JGRP-1524) RELAY2 loses route to site if a partition occurs within the site

Bela Ban (JIRA) jira-events at lists.jboss.org
Mon Oct 15 03:59:03 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726338#comment-12726338 ] 

Bela Ban commented on JGRP-1524:
--------------------------------

On a view change, routes are currently established as follows:
- add routes for all members of the existing view
- remove the routes of members that left

This leaves us with no route to A: the first step adds the route to A (for A1), and the second step then removes it because A2 left.
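The effect of that ordering can be reproduced with a small sketch. This is a hypothetical model, not the RELAY2 source: it assumes routes are held in a map keyed by site name (so A1 and A2 share the single entry for A), and the names AddThenRemoveSketch, siteOf and addThenRemove are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not the RELAY2 source: routes are keyed by site name,
// so two bridge members of the same site share one map entry.
class AddThenRemoveSketch {
    // Site of a bridge member, taken from the first letter of its name ("A1" -> "A").
    static String siteOf(String member) {
        return member.substring(0, 1);
    }

    // Order described above: add routes for the members of the view,
    // then remove the routes of the members that left.
    static Map<String, String> addThenRemove(Map<String, String> routes,
                                             List<String> view,
                                             List<String> left) {
        Map<String, String> result = new HashMap<>(routes);
        for (String member : view) {
            result.put(siteOf(member), member);
        }
        for (String member : left) {
            result.remove(siteOf(member));
        }
        return result;
    }

    public static void main(String[] args) {
        // During the partition B1 saw both A1 and A2; the entry for A points at A2.
        Map<String, String> routes = new HashMap<>();
        routes.put("A", "A2");

        // After the heal the site-to-site view is {A1, B1} and A2 has left.
        Map<String, String> after =
            addThenRemove(routes, Arrays.asList("A1", "B1"), Arrays.asList("A2"));

        // Prints "route to A: null": the entry just added for A1 was removed again.
        System.out.println("route to A: " + after.get("A"));
    }
}
```

Because the removal is keyed by site rather than by member, removing A2's route also discards the entry that was just written for A1.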

The simple fix is to reverse the order: first remove the routes of all members that left, then add routes for all members in the new view:
- the route to A is removed (as A2 ceases to be coordinator)
- the route to A is added again (for A1)

Note that this also works if A2 remains coordinator and A1 stops being coordinator.
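A minimal sketch of the remove-then-add order, under the same assumption as the issue description that a route is a single record per site. It is an illustration only, not the actual patch: RemoveThenAddSketch, siteOf and removeThenAdd are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not the RELAY2 source: one route record per site.
class RemoveThenAddSketch {
    // Site of a bridge member, taken from the first letter of its name ("A2" -> "A").
    static String siteOf(String member) {
        return member.substring(0, 1);
    }

    // Proposed order: remove the routes of the members that left first,
    // then add routes for all members of the new view.
    static Map<String, String> removeThenAdd(Map<String, String> routes,
                                             List<String> newView,
                                             List<String> left) {
        Map<String, String> result = new HashMap<>(routes);
        for (String member : left) {
            result.remove(siteOf(member));
        }
        for (String member : newView) {
            result.put(siteOf(member), member);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> routes = new HashMap<>();
        routes.put("A", "A2"); // state during the partition

        // Case 1: A1 stays coordinator, A2 leaves the site-to-site view.
        Map<String, String> case1 =
            removeThenAdd(routes, Arrays.asList("A1", "B1"), Arrays.asList("A2"));
        System.out.println("case 1, route to A: " + case1.get("A")); // A1

        // Case 2: A2 stays coordinator, A1 leaves the site-to-site view.
        Map<String, String> case2 =
            removeThenAdd(routes, Arrays.asList("A2", "B1"), Arrays.asList("A1"));
        System.out.println("case 2, route to A: " + case2.get("A")); // A2
    }
}
```

In both cases the entry for site A is rewritten last, so it always points at whichever member is still in the new view.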
                
> RELAY2 loses route to site if a partition occurs within the site
> ----------------------------------------------------------------
>
>                 Key: JGRP-1524
>                 URL: https://issues.jboss.org/browse/JGRP-1524
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.2
>            Reporter: Radim Vansa
>            Assignee: Bela Ban
>             Fix For: 3.2
>
>
> When a partition occurs within site A, resulting in two coordinators in site A (say A1 and A2), and the site master in site B sees both of these site masters, then after the partition is healed the site master of B loses its route to site A.
> Steps:
> 1. A1 is the coordinator of A, A2 is an ordinary node
> 2. B1 has a route to site A via node A1
> 3. Site A is partitioned, A2 becomes a coordinator
> 4. B1 notices that A2 has joined the site-to-site view and adds another route to site A, overwriting the original route (to A1)
> 5. A is healed, A2 ceases to be the bridge
> 6. B1 notices that A2 has left the site-to-site view and removes the route to site A
> 7. There is no route from B to A, as the original route was overwritten in step 4 and the record was then deleted in step 6
