[jboss-jira] [JBoss JIRA] Resolved: (JGRP-1282) Race condition in FLUSH when master leaves cluster

Vladimir Blagojevic (JIRA) jira-events at lists.jboss.org
Wed Feb 9 08:54:47 EST 2011


     [ https://issues.jboss.org/browse/JGRP-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Blagojevic resolved JGRP-1282.
---------------------------------------

    Resolution: Done


On master https://github.com/belaban/JGroups/commit/8e5eafa
On 2.11 https://github.com/belaban/JGroups/commit/41078c9
On 2.6 https://github.com/belaban/JGroups/commit/e319bb5

> Race condition in FLUSH when master leaves cluster
> --------------------------------------------------
>
>                 Key: JGRP-1282
>                 URL: https://issues.jboss.org/browse/JGRP-1282
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.6.16
>            Reporter: Dennis Reed
>            Assignee: Vladimir Blagojevic
>             Fix For: 2.6.19, 2.11.2, 2.12
>
>
> There's a race condition in FLUSH when the master node is leaving the cluster
> that can cause the master to leave without first sending a new view (with a new master).
> The FLUSH is started when GMS sends down an Event.SUSPEND.
> FLUSH.down calls FLUSH.startFlush, which calls FLUSH.onSuspend.
> onSuspend sends a START_FLUSH message down.
> In the working case, the local node gets the START_FLUSH first.
> FLUSH.up calls FLUSH.handleStartFlush, which calls FLUSH.onStartFlush.
> onStartFlush sets the member variable "flushMembers".
> Then the other nodes reply to the START_FLUSH with a FLUSH_COMPLETED.
> FLUSH.up calls FLUSH.onFlushCompleted.
> onFlushCompleted checks "flushMembers" against the list of replies.
> If they match (and flushMembers is not null), the flush completes.
> But in the non-working case, the FLUSH_COMPLETED from the other
> nodes is processed before the local START_FLUSH.
> In this case, flushMembers has not been set, and onFlushCompleted
> does nothing, expecting more replies (which never come).
> I believe this will only be triggered when the master is leaving,
> because the leaving master does not include itself in the FLUSH.  If it were a flush
> member, there would be a FLUSH_COMPLETED reply from itself to
> trigger setting flushMembers at some point.
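
The ordering problem described above can be reproduced with a small standalone model. This is only a sketch under stated assumptions: the class FlushRaceSketch, its fields, and the run helper are hypothetical and are not taken from the JGroups source. It assumes onFlushCompleted simply records the sender and finishes the flush once flushMembers is non-null and every flush member has replied. The leaving master "A" is deliberately left out of the flush members, as in the report.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the race described in the report.
// This is NOT the JGroups FLUSH implementation.
public class FlushRaceSketch {
    private Set<String> flushMembers;                       // stays null until START_FLUSH is handled
    private final Set<String> completed = new HashSet<>();  // senders of FLUSH_COMPLETED seen so far
    private boolean flushDone;

    // Models onStartFlush: records which members take part in the flush.
    void onStartFlush(List<String> members) {
        flushMembers = new HashSet<>(members);
    }

    // Models onFlushCompleted (assumed behavior): the flush completes only if
    // flushMembers is already set and every flush member has replied.
    void onFlushCompleted(String sender) {
        completed.add(sender);
        if (flushMembers != null && completed.containsAll(flushMembers)) {
            flushDone = true;
        }
    }

    boolean isFlushDone() {
        return flushDone;
    }

    // Plays out one ordering on the leaving master "A", which is not a flush
    // member itself; only "B" and "C" reply with FLUSH_COMPLETED.
    static boolean run(boolean startFlushFirst) {
        FlushRaceSketch master = new FlushRaceSketch();
        List<String> members = Arrays.asList("B", "C");
        if (startFlushFirst) {
            master.onStartFlush(members);   // working case: local START_FLUSH arrives first
        }
        master.onFlushCompleted("B");
        master.onFlushCompleted("C");
        if (!startFlushFirst) {
            master.onStartFlush(members);   // non-working case: START_FLUSH arrives too late
        }
        return master.isFlushDone();
    }

    public static void main(String[] args) {
        System.out.println("START_FLUSH first:     " + run(true));
        System.out.println("FLUSH_COMPLETED first: " + run(false));
    }
}
```

In this model the first ordering completes the flush, while in the second nothing re-checks the replies once flushMembers is finally set, so the flush stays incomplete, which matches the hang described in the report.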

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

