[jboss-jira] [JBoss JIRA] (JGRP-1742) BARRIER: minimize closing time
Bela Ban (JIRA)
jira-events at lists.jboss.org
Wed Dec 4 09:18:05 EST 2013
[ https://issues.jboss.org/browse/JGRP-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928556#comment-12928556 ]
Bela Ban commented on JGRP-1742:
--------------------------------
OK, so the following things might solve this puzzle:
h5. Coordinator blocked during fetching of digest
* Since BARRIER is closed during a state transfer, the coordinator will not only drop messages from other members (except from members for which holes were punched into BARRIER), but also *from itself*
* Currently, the only message that causes problems when dropped is a VIEW change multicast
* SOLUTION: when multicasting a view V, *have the coord install V locally before multicasting it*. When the coord later receives its own multicast of V, V is dropped because it is already installed
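A minimal sketch of the proposed ordering, not the actual GMS/BARRIER code: {{Member}}, {{Coordinator}}, {{handleView()}}, {{castView()}} and the {{barrierClosed}} flag are hypothetical names, and a view is reduced to a numeric id. It only illustrates that installing V locally first makes it irrelevant whether the coord's own copy of the multicast is dropped by a closed BARRIER or arrives later as a duplicate.

```java
import java.util.ArrayList;
import java.util.List;

class ViewInstallSketch {

    /** Hypothetical member that remembers the id of the last installed view. */
    static class Member {
        long currentViewId = -1;                 // -1: no view installed yet
        final List<Long> installed = new ArrayList<>();

        /** Installs the view unless it is already installed (or older); returns true if installed. */
        boolean handleView(long viewId) {
            if (viewId <= currentViewId)
                return false;                    // duplicate or stale view: drop
            currentViewId = viewId;
            installed.add(viewId);
            return true;
        }
    }

    /** Hypothetical coordinator; barrierClosed simulates BARRIER dropping incoming messages. */
    static class Coordinator extends Member {
        boolean barrierClosed;

        /** Installs the view locally first, then delivers the multicast to every member. */
        void castView(long viewId, List<Member> members) {
            handleView(viewId);                  // local install does not pass through BARRIER
            for (Member m : members) {
                if (m == this && barrierClosed)
                    continue;                    // own multicast is dropped by the closed BARRIER
                m.handleView(viewId);            // on the coordinator this is a dropped duplicate
            }
        }
    }

    public static void main(String[] args) {
        Coordinator coord = new Coordinator();
        Member other = new Member();
        List<Member> members = List.of(coord, other);

        coord.barrierClosed = true;              // state transfer in progress
        coord.castView(1, members);
        coord.barrierClosed = false;
        coord.castView(2, members);              // own copy arrives this time and is dropped

        System.out.println("coord installed: " + coord.installed);
        System.out.println("other installed: " + other.installed);
    }
}
```

In both cases the coord ends up with views 1 and 2 installed exactly once.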
h5. BARRIER skips threads in BLOCKED or WAITING state
* Don't skip these, as a blocked thread might simply block on a lock, before changing state (e.g. Infinispan)
* If message P:10 was blocked, and we skipped it when fetching the digest, we'd include P:10 in the digest, but not in the state. This would mean that the state requester will never get P:10, neither as part of the state, nor as a retransmission from P
h5. Flushing of threads in BARRIER should time out
* We cannot wait forever for the in-flight threads to complete
* The timeout passed to {{getState(timeout)}} should be used to bound the max duration for flushing the threads
* A timeout of 0 means wait forever
* Closing the channel should terminate the flush
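The three rules above could look roughly like this. This is a sketch under assumptions, not BARRIER's implementation: {{BarrierFlushSketch}}, {{enter()}}, {{exit()}}, {{closeChannel()}} and {{flush()}} are made-up names, and the timeout is assumed to be in milliseconds. Only {{ReentrantLock}} and {{Condition}} from {{java.util.concurrent.locks}} are real APIs.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class BarrierFlushSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition noPending = lock.newCondition();
    private int inFlight;            // threads currently delivering messages
    private boolean channelClosed;

    /** Called when a thread starts delivering a message. */
    void enter() {
        lock.lock();
        try { inFlight++; } finally { lock.unlock(); }
    }

    /** Called when a thread is done delivering a message. */
    void exit() {
        lock.lock();
        try { if (--inFlight == 0) noPending.signalAll(); } finally { lock.unlock(); }
    }

    /** Closing the channel wakes up and terminates a pending flush. */
    void closeChannel() {
        lock.lock();
        try { channelClosed = true; noPending.signalAll(); } finally { lock.unlock(); }
    }

    /**
     * Waits until no thread is in flight. A timeout of 0 waits forever.
     * Returns true if flushed, false on timeout, channel close or interrupt.
     */
    boolean flush(long timeoutMs) {
        long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (inFlight > 0 && !channelClosed) {
                if (timeoutMs == 0) {
                    noPending.await();
                } else {
                    if (remaining <= 0)
                        return false;                        // timeout elapsed
                    remaining = noPending.awaitNanos(remaining);
                }
            }
            return inFlight == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();              // preserve the interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        BarrierFlushSketch barrier = new BarrierFlushSketch();
        barrier.enter();                                     // one thread is delivering a message
        System.out.println(barrier.flush(50));               // bounded wait, nobody exits: false
        new Thread(barrier::exit).start();                   // the delivering thread finishes
        System.out.println(barrier.flush(0));                // 0: wait until flushed: true
    }
}
```

The caller would pass the value given to {{getState(timeout)}} straight into the flush, so a slow or stuck thread delays the state transfer by at most that amount.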
> BARRIER: minimize closing time
> ------------------------------
>
> Key: JGRP-1742
> URL: https://issues.jboss.org/browse/JGRP-1742
> Project: JGroups
> Issue Type: Enhancement
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 3.5
>
>
> During a state transfer, BARRIER.up() waits until all incoming threads (delivering messages to the application) are done, and blocks further incoming messages. This is done to get the digest and the state.
> However, during the block, the following messages are not sent up:
> * Views!
> * STABLE messages, triggering retransmissions
> This is bad, so we should try to minimize the time BARRIER is closed. This can be done with JGRP-1352.
> However, we could also do the following:
> * A state request is received
> * Close BARRIER and flush all pending threads. This ensures that any message which updated the *digest* also updated the *application state*
> * Get the digest D
> * *Open* BARRIER. Messages will now be delivered and thus applied to the state
> * Get the application state S
> * When done, return D and S to the state requester
> The difference from JGRP-1352 is that we don't queue messages during state transfer. How does this work? It is critical to ensure that all messages which updated the digest D also updated the state S, or else messages present in D but not in S would not be retransmitted. However, if there are more messages in S than in D, this is not an issue as they will be retransmitted again.
> Example:
> * BARRIER is closed and pending threads are flushed
> * Digest D is (only for a given member P) 5, state S is 5 as well
> * Now we open BARRIER
> * P sends a few more messages (6, 7 and 8)
> * The digest is now 8, but the copy we have is still 5
> * State S is 8
> * We return D=5 and S=8
> * The state requester closes BARRIER and sets its digest to 5 and its state to 8
> * Since the digest is only 5 for P, the state requester asks P for retransmission of messages 6, 7 and 8
> * Messages 6, 7 and 8 from P are received and applied to the state
> * The assumption here is that if messages 6, 7 and 8 are applied twice, the state doesn't change (idempotency). This should be the case with Infinispan.
> The advantage of this approach over JGRP-1352 is that we don't need to queue messages for a long time if the state is large.
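The D=5 / S=8 example from the description can be replayed in a few lines. This is only an illustration under assumptions: {{Provider}}, {{State}} and {{missing()}} are hypothetical, the digest is reduced to a single seqno for member P, and the application state is modelled as an idempotent map, which is the property the description relies on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class DigestStateSketch {

    /** Hypothetical idempotent application state: applying the same seqno twice changes nothing. */
    static class State {
        final TreeMap<Long, String> applied = new TreeMap<>();
        void apply(long seqno) { applied.put(seqno, "msg-" + seqno); }
        State copy() { State s = new State(); s.applied.putAll(applied); return s; }
    }

    /** Hypothetical state provider tracking the highest seqno delivered from P. */
    static class Provider {
        long digest;                                   // highest delivered seqno from P
        final State state = new State();
        void deliver(long seqno) { digest = Math.max(digest, seqno); state.apply(seqno); }
    }

    /** Seqnos the requester asks P to retransmit: everything above its digest. */
    static List<Long> missing(long digest, long highestSentByP) {
        List<Long> result = new ArrayList<>();
        for (long seq = digest + 1; seq <= highestSentByP; seq++)
            result.add(seq);
        return result;
    }

    public static void main(String[] args) {
        Provider provider = new Provider();
        for (long seq = 1; seq <= 5; seq++)
            provider.deliver(seq);

        long digestD = provider.digest;                // BARRIER closed and flushed: D = 5
        for (long seq = 6; seq <= 8; seq++)            // BARRIER open again: 6, 7 and 8 arrive
            provider.deliver(seq);
        State stateS = provider.state.copy();          // S already reflects 8

        // The requester sets its digest to D and its state to S, then asks P for 6..8
        State requesterState = stateS.copy();
        List<Long> retransmit = missing(digestD, 8);
        for (long seq : retransmit)
            requesterState.apply(seq);                 // applied a second time

        System.out.println("retransmitted: " + retransmit);
        System.out.println("state unchanged: " + requesterState.applied.equals(stateS.applied));
    }
}
```

Messages 6, 7 and 8 are retransmitted and applied twice, and the requester's state stays equal to S. The reverse case (a message counted in D but missing from S) is the one that would never be requested, because {{missing()}} only looks above D.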