[jboss-jira] [JBoss JIRA] (JGRP-1486) Merge failure when dead instances remain in view

Bela Ban (JIRA) jira-events at lists.jboss.org
Thu Jul 5 02:06:12 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704596#comment-12704596 ] 

Bela Ban commented on JGRP-1486:
--------------------------------

Have you had a chance to test whether JGRP-1489 fixed this issue as well?
                
> Merge failure when dead instances remain in view
> ------------------------------------------------
>
>                 Key: JGRP-1486
>                 URL: https://issues.jboss.org/browse/JGRP-1486
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.0.10
>            Reporter: David Hotham
>            Assignee: Bela Ban
>             Fix For: 3.1
>
>
> I've hit this testing my JGRP-1485 fix, but I think it's a logically independent issue.
> So, I've reached a point where:
> -  A, B and C all have view {C,A,B}
> -  D has view {B', D', D, A', C}, in which B', D' and A' are all dead instances
> As in JGRP-1485, an optimal fix would surely be to allow D to recover all by itself, but it's not clear to me how to do that.  However, my expectation was that a merge should sort things out; and I think that if it did then that ought to be good enough.
> But what's actually happening is this:
> -  C becomes merge leader
> -  determines that merge participants are C, D', D, A'
> -  sends MERGE_REQ to those members
> -  the MERGE_REQ to D' reaches D (and that to A' reaches A)
> -  D sends a positive response to the MERGE_REQ that was meant for it, but 2.5 seconds later it also sends a negative response to the MERGE_REQ meant for D'.  (I think the negative response is sent because it can't fetch the digest from D'.)
> -  likewise A sends a negative response to the MERGE_REQ meant for A'
> So what C sees is:
> -  good responses from C and D, followed by merge_rejected responses from A and D
> -  so it removes A' and D' from the merge (it didn't get responses from them)
> -  then it removes D from the merge (because the most recent response from D said merge_rejected)
> -  so it is left only with itself, and comes up with a consolidated view that is identical to its original view
> In short: the merge doesn't do anything useful after all.
> I think that the key here is the confusion between D and D'.  Possibly the fix is as simple as: ignore MERGE_REQs where the destination address on the message is not the local address.
> I'll try this out and, if it looks good, submit a pull request.
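
The failure sequence and the proposed destination check can be modelled in a few lines. The following is only a sketch of the scenario as described above, not JGroups code: the class MergeSketch, its Response type and the methods respond, participants and runMerge are all hypothetical names, and the leader's "most recent response wins" rule is taken from the description rather than from the JGroups source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the reported scenario; none of these types are JGroups classes.
final class MergeSketch {

    /** A merge response as the leader sees it: the sender and whether it accepted. */
    static final class Response {
        final String sender;
        final boolean accepted;
        Response(String sender, boolean accepted) {
            this.sender = sender;
            this.accepted = accepted;
        }
    }

    /**
     * A live member answers every MERGE_REQ delivered to it. A request addressed
     * to a dead instance is rejected (the digest cannot be fetched). With
     * filterByDest set, such requests are dropped instead (the proposed fix).
     */
    static List<Response> respond(String local, List<String> deliveredDests, boolean filterByDest) {
        List<Response> out = new ArrayList<>();
        for (String dest : deliveredDests) {
            boolean forMe = dest.equals(local);
            if (!forMe && filterByDest)
                continue; // ignore requests whose destination is not the local address
            out.add(new Response(local, forMe));
        }
        return out;
    }

    /** The leader keeps the most recent response per sender and retains the accepting senders. */
    static List<String> participants(List<Response> responses) {
        Map<String, Boolean> latest = new LinkedHashMap<>();
        for (Response r : responses)
            latest.put(r.sender, r.accepted);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : latest.entrySet())
            if (e.getValue())
                result.add(e.getKey());
        return result;
    }

    /** C sends MERGE_REQ to C, D', D and A'; the ones for D' and A' reach D and A. */
    static List<String> runMerge(boolean filterByDest) {
        List<Response> all = new ArrayList<>();
        all.addAll(respond("C", Arrays.asList("C"), filterByDest));
        all.addAll(respond("D", Arrays.asList("D", "D'"), filterByDest));
        all.addAll(respond("A", Arrays.asList("A'"), filterByDest));
        return participants(all);
    }

    public static void main(String[] args) {
        System.out.println("without filter: " + runMerge(false));
        System.out.println("with filter:    " + runMerge(true));
    }
}
```

In this model the unfiltered run leaves the leader with [C] only, because D's later rejection overwrites its earlier acceptance. With the destination check enabled, D's single acceptance stands and the result is [C, D]; A simply sends no response, since no MERGE_REQ was addressed to it.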

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

