[jboss-jira] [JBoss JIRA] (JGRP-1674) STOP_FLUSH race condition
Dennis Reed (JIRA)
jira-events at lists.jboss.org
Fri Aug 9 15:30:26 EDT 2013
[ https://issues.jboss.org/browse/JGRP-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796090#comment-12796090 ]
Dennis Reed commented on JGRP-1674:
-----------------------------------
The only information currently sent in STOP_FLUSH is the view ID.
This is not sufficient in this case, because the view ID of the joiner's FLUSH is the same as the view ID in the master's STOP_FLUSH.
I don't see a way to fix this without adding the member list to the STOP_FLUSH message, which introduces a backwards compatibility issue.
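
A rough sketch of the idea, to show why the view ID alone cannot tell the two flushes apart while a member list can. This is not the JGroups API: StopFlush, acceptByViewId and acceptByMembership are hypothetical names made up for illustration, members are plain strings, and the view ID is simplified to a long.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopFlushSketch {

    /** Hypothetical STOP_FLUSH payload: the view ID sent today plus the proposed member list. */
    static final class StopFlush {
        final long viewId;               // simplified; assumed to identify the view
        final Set<String> participants;  // assumption: members of the flush being stopped

        StopFlush(long viewId, List<String> participants) {
            this.viewId = viewId;
            this.participants = new HashSet<String>(participants);
        }
    }

    /** Check based on the view ID only: cannot distinguish the two flush rounds. */
    static boolean acceptByViewId(long localFlushViewId, StopFlush msg) {
        return localFlushViewId == msg.viewId;
    }

    /** Proposed check: also require that the local node took part in the flush being stopped. */
    static boolean acceptByMembership(String localAddress, long localFlushViewId, StopFlush msg) {
        return localFlushViewId == msg.viewId && msg.participants.contains(localAddress);
    }

    public static void main(String[] args) {
        // MASTER flushed the existing members A and B before admitting JOINER C.
        StopFlush fromMaster = new StopFlush(5L, Arrays.asList("A", "B"));

        // C has since started its own flush, which carries the same view ID.
        long joinerFlushViewId = 5L;

        // View ID only: C would wrongly accept MASTER's STOP_FLUSH.
        System.out.println(acceptByViewId(joinerFlushViewId, fromMaster));
        // With the member list: C can see it was not part of that flush and ignore it.
        System.out.println(acceptByMembership("C", joinerFlushViewId, fromMaster));
        // An existing member still accepts the STOP_FLUSH as before.
        System.out.println(acceptByMembership("A", 5L, fromMaster));
    }
}
```

Older nodes would neither send nor expect the extra member list, which is the backwards compatibility issue mentioned above.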
> STOP_FLUSH race condition
> -------------------------
>
> Key: JGRP-1674
> URL: https://issues.jboss.org/browse/JGRP-1674
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 2.6.21
> Reporter: Dennis Reed
> Assignee: Bela Ban
>
> There is a race condition in STOP_FLUSH when a node joins the cluster.
> JOINER sends JOIN_REQ to MASTER
> MASTER does a flush on the existing members (START_FLUSH, ...)
> MASTER sends JOIN_RSP
> MASTER sends STOP_FLUSH
> JOINER receives JOIN_RSP
> JOINER fetches state, sends START_FLUSH
> JOINER receives STOP_FLUSH from MASTER
> onStopFlush never verifies that the current node was part of the flush being stopped, and therefore that the STOP_FLUSH is valid for the current node.
> So this STOP_FLUSH corrupts JOINER's FLUSH by resetting all the member variables (and probably unblocking).