[jboss-jira] [JBoss JIRA] (JGRP-1389) Synchronous messages
Bela Ban (Commented) (JIRA)
jira-events at lists.jboss.org
Wed Nov 30 04:40:41 EST 2011
[ https://issues.jboss.org/browse/JGRP-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646556#comment-12646556 ]
Bela Ban commented on JGRP-1389:
--------------------------------
This *has* to be done in NAKACK and UNICAST2 directly, because retransmission in a separate protocol would send *new* messages, whereas NAKACK or UNICAST2 would resend the same message!
> Synchronous messages
> --------------------
>
> Key: JGRP-1389
> URL: https://issues.jboss.org/browse/JGRP-1389
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 3.1
>
>
> FLUSH ensures that every member has received all messages from *all* other members. This is quite costly, especially if we have a large cluster.
> However, sometimes, it is only necessary to ensure that a given message M sent by P has been received by everyone, for example, if P is the coordinator and installs view V, then it would be sufficient to ensure that everyone received V.
> So synchronous messages are messages that block the sender until all of the (non-faulty) recipients have ack'ed their reception, or a timeout occurs.
> A user should be able to configure whether the message send is complete when all recipients have *received* the message, or when they have *delivered* it.
> Reception of a message means that the message was added to the recipient's buffer; delivery means that the message was consumed by the application.
> A message M4 sent by P is sometimes delivered late because M4 is the last message sent by P: if members Q and R don't receive M4 (e.g. because it was dropped), and P doesn't send another message M5, then Q and R have no means of detecting that M4 was sent by P, and thus cannot ask P for retransmission of M4.
> This is of course solved by STABLE which periodically broadcasts the highest seqnos sent, and then Q and R can ask P for retransmission of M4.
> However, if we don't want to wait that long, and don't want to risk Q and R leaving before they've received M4, we can implement a *sender-flush* of M4 sent by P.
> This works as follows:
> - P can add an RSVP flag to M4
> - P adds M4 to a retransmit table and keeps retransmitting M4 until it has received ACKs from every non-faulty member in the target set, or P leaves
> - When a message is received that is tagged with RSVP, an ACK is sent back to the sender
> - When P has received ACKs from everyone, the message send returns. The caller is blocked until this is the case (maybe bounded by a timeout?)
> The reason why this works is that every receiver gets the latest message M4 from P, and adds it to its retransmit table. If there is a gap, it will ask P for retransmission of missing messages. This way, a receiver won't have to wait until STABLE sends a digest to find out it's missing messages from P.
> There could be an option for a receiver R to delay sending an ACK back to P until it has actually *received* P's missing messages. Without this delay, P could leave or crash *before* R got all of P's missing messages.
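
The sender side of the scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the JGroups implementation: the class AckCollector, its methods, and the use of plain strings as member addresses are all hypothetical. It keeps resending the same message until every member of the target set has either ACKed or been removed as faulty, or until the timeout expires.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sender-side tracker for one RSVP-tagged message (not a JGroups class). */
final class AckCollector {
    private final Set<String> missing = new HashSet<>();   // members that have not ACKed yet
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();

    AckCollector(Set<String> targets) {
        missing.addAll(targets);
    }

    /** Called when an ACK for the RSVP message arrives from a member. */
    void ack(String member) { remove(member); }

    /** Called when a member is considered faulty; we stop waiting for its ACK. */
    void suspect(String member) { remove(member); }

    private void remove(String member) {
        lock.lock();
        try {
            missing.remove(member);
            if (missing.isEmpty())
                changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Runs transmit (assumed to resend the *same* message) every intervalMs until
     * all ACKs are in or timeoutMs has elapsed. Returns true if everyone ACKed.
     * The lock is held while transmit runs, which keeps the sketch short; a real
     * implementation would likely send outside the lock.
     */
    boolean sendAndWait(Runnable transmit, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        long interval = TimeUnit.MILLISECONDS.toNanos(intervalMs);
        lock.lock();
        try {
            while (!missing.isEmpty()) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0)
                    return false;                 // timed out, caller is unblocked
                transmit.run();                   // initial send or retransmission
                if (missing.isEmpty())
                    break;
                changed.awaitNanos(Math.min(remaining, interval));
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Snapshot of the members still owing an ACK. */
    Set<String> missing() {
        lock.lock();
        try {
            return new HashSet<>(missing);
        } finally {
            lock.unlock();
        }
    }
}
```

A caller would create one collector per RSVP message with the current target set, route incoming ACKs to ack() and membership changes to suspect(), and block in sendAndWait(). Whether a receiver ACKs on reception or on delivery is decided on the receiver side and does not change this sketch.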