[jboss-jira] [JBoss JIRA] (JGRP-1493) Merge fails because failing to get physical address takes too long

David Hotham (JIRA) jira-events at lists.jboss.org
Fri Aug 31 07:18:34 EDT 2012


    [ https://issues.jboss.org/browse/JGRP-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715104#comment-12715104 ] 

David Hotham commented on JGRP-1493:
------------------------------------

A's view is {A,B,C}.  A has no way to suspect C'.
                
> Merge fails because failing to get physical address takes too long
> ------------------------------------------------------------------
>
>                 Key: JGRP-1493
>                 URL: https://issues.jboss.org/browse/JGRP-1493
>             Project: JGroups
>          Issue Type: Feature Request
>    Affects Versions: 3.1
>            Reporter: David Hotham
>            Assignee: Bela Ban
>             Fix For: 3.2
>
>
> Start with the following views:
> -  A, B and C all have {A,B,C}
> -  D has {B', D, A, C'}, where B' and C' are dead.
> A decides to lead a merge (he's the only 'actual' coordinator).  By the time we've been through view-sanitization and so on and reached getMergeDataFromSubgroupCoordinators(), coords are {D, C', A}.
> Here A tries to send MERGE_REQ to each of those members.  However, A does not have a physical address for C', and in fact neither does anyone else.  So when trying to send the MERGE_REQ to C', A will always spend a little over 5 seconds in TP.sendToSingleMember(), trying and failing to discover that physical address.
> Of course A won't get a response from C' either, so it will take another 5 seconds for merge_rsps.waitForAllResponses to time out.
> Together that is a little over 10 seconds, which means the MergeKiller is certain to kick in first.
> Therefore the merge can never progress.  
> (Presumably the situation would be even worse if D's view had contained further dead members).
> I expect to work around this by tweaking the timings somewhere: probably in startMergeKiller, so that the MergeKiller takes longer to be scheduled.
> I'd think the right fix would be to ensure that the MergeTask is not blocked when TP has no physical address for a member.
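
The timing argument in the description can be sketched as simple arithmetic. This is only an illustrative model, not JGroups code: the class and method names are hypothetical, the two 5-second figures come from the description above (the discovery stall is "a little over" 5 seconds, so 5000 ms is a lower bound), and the 10-second MergeKiller delay is an assumption chosen to match the observed behaviour, not a value taken from the JGroups source.

```java
// Hypothetical model of the merge leader's timing; none of these names exist in JGroups.
public class MergeTimingSketch {

    // Elapsed time when every send to an unresolvable member blocks the merge task
    // before the leader starts waiting for responses.
    static long blockingElapsedMs(int deadCoords, long discoveryBlockMs, long responseTimeoutMs) {
        return deadCoords * discoveryBlockMs + responseTimeoutMs;
    }

    // Elapsed time if sends never block the merge task: only the response wait remains.
    static long nonBlockingElapsedMs(long responseTimeoutMs) {
        return responseTimeoutMs;
    }

    // The merge can only progress if it finishes before the killer fires.
    static boolean finishesBeforeKiller(long elapsedMs, long killerDelayMs) {
        return elapsedMs < killerDelayMs;
    }

    public static void main(String[] args) {
        long discoveryBlockMs = 5_000;   // from the description (lower bound)
        long responseTimeoutMs = 5_000;  // from the description
        long killerDelayMs = 10_000;     // ASSUMPTION, not taken from JGroups

        long blocking = blockingElapsedMs(1, discoveryBlockMs, responseTimeoutMs);
        long nonBlocking = nonBlockingElapsedMs(responseTimeoutMs);

        System.out.println("blocking send:     " + blocking + " ms, survives killer: "
                + finishesBeforeKiller(blocking, killerDelayMs));
        System.out.println("non-blocking send: " + nonBlocking + " ms, survives killer: "
                + finishesBeforeKiller(nonBlocking, killerDelayMs));
    }
}
```

Under these assumptions, each additional dead member in D's view adds another discovery stall to the blocking case, which is why the situation would get worse with more dead members, while the non-blocking case is unaffected.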
