[jboss-jira] [JBoss JIRA] (JGRP-1389) Synchronous messages

Bela Ban (Issue Comment Edited) (JIRA) jira-events at lists.jboss.org
Wed Nov 30 05:40:40 EST 2011


    [ https://issues.jboss.org/browse/JGRP-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646576#comment-12646576 ] 

Bela Ban edited comment on JGRP-1389 at 11/30/11 5:40 AM:
----------------------------------------------------------

Perhaps we can use a separate RSVP protocol anyway:
- Each message carries a UUID to identify it (we cannot use seqnos as we don't know them in RSVP)
- Ack collection, ack sending, timeout handling and caller blocking are done in RSVP
- When a message carries an RSVP tag, NAKACK and UNICAST2 need to make sure the message is sent until it is acked
- We could then use this also for UNICAST (which doesn't need to be changed as it retransmits messages anyway)
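
To make the division of labour above concrete, here is a minimal sketch of the bookkeeping an RSVP layer might do on the sender side: one entry per message UUID, a set of members still owing an ack, and a caller that blocks until the set is empty or a timeout elapses. This is not JGroups code; the class RsvpSketch, its method names, and the use of plain strings as member addresses are assumptions made for illustration only.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the JGroups RSVP protocol implementation.
class RsvpSketch {
    // One entry per RSVP-tagged message: who still has to ack, and a latch for the caller.
    private static final class Entry {
        final Set<String> missing = ConcurrentHashMap.newKeySet();
        final CountDownLatch done = new CountDownLatch(1);
        Entry(Collection<String> targets) { missing.addAll(targets); }
    }

    // Keyed by message UUID, since seqnos are not known at this layer.
    private final Map<UUID, Entry> pending = new ConcurrentHashMap<>();

    // Sender side: register the targets before the message goes down the stack.
    UUID register(Collection<String> targets) {
        UUID id = UUID.randomUUID();
        Entry e = new Entry(targets);
        pending.put(id, e);
        if (targets.isEmpty()) e.done.countDown(); // nobody to wait for
        return id;
    }

    // Called when an ack for message 'id' arrives from 'member'.
    void ack(UUID id, String member) {
        Entry e = pending.get(id);
        if (e != null && e.missing.remove(member) && e.missing.isEmpty())
            e.done.countDown();
    }

    // Called when a member leaves or is suspected: stop waiting for its acks.
    void memberLeft(String member) {
        for (Entry e : pending.values())
            if (e.missing.remove(member) && e.missing.isEmpty())
                e.done.countDown();
    }

    // Blocks the caller; returns true if all acks arrived, false on timeout or interrupt.
    boolean await(UUID id, long timeoutMs) {
        Entry e = pending.get(id);
        if (e == null) return true;
        try {
            return e.done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pending.remove(id); // the entry is discarded either way
        }
    }
}
```

Retransmission is deliberately absent from the sketch: per the points above, that stays with NAKACK, UNICAST2 or UNICAST, and this layer only tracks acks and unblocks the caller.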
                
      was (Author: belaban):
    Perhaps we can use a separate RSVP protocol anyway:
- Ack collection, ack sending, timeout handling and caller blocking are done in RSVP
- When a message carries an RSVP tag, NAKACK and UNICAST2 need to make sure the message is sent until it is acked
- We could then use this also for UNICAST (which doesn't need to be changed as it retransmits messages anyway)
                  
> Synchronous messages
> --------------------
>
>                 Key: JGRP-1389
>                 URL: https://issues.jboss.org/browse/JGRP-1389
>             Project: JGroups
>          Issue Type: Feature Request
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.1
>
>
> FLUSH ensures that every member has received all messages from *all* other members. This is quite costly, especially if we have a large cluster.
> However, sometimes it is only necessary to ensure that a given message M sent by P has been received by everyone. For example, if P is the coordinator and installs view V, it would be sufficient to ensure that everyone received V.
> So synchronous messages are messages that block the sender until all of the (non-faulty) recipients have ack'ed their reception, or a timeout occurs.
> Note that if P sends messages 5, 6, 7 and 8 and tags only 8 as synchronous, then when the send of 8 returns, JGroups guarantees that everyone has also received all messages from P lower than 8.
> A user should be able to configure whether the message send is complete when all recipients have *received* the message, or when they have *delivered* it.
> Reception of a message means that the message was added to the recipient's buffer; delivery means that the message was consumed by the application.
> A message M4 sent by P is sometimes delivered late because M4 is the last message sent by P: if members Q and R don't receive M4 (e.g. because it was dropped), and P doesn't send another message M5, then Q and R have no means of detecting that M4 was sent by P, and thus cannot ask P for retransmission of M4.
> This is of course solved by STABLE which periodically broadcasts the highest seqnos sent, and then Q and R can ask P for retransmission of M4.
> However, if we don't want to wait that long, and don't want to risk Q and R leaving before they've received M4, we can implement a *sender-flush* of M4 sent by P.
> This works as follows:
> - P can add an RSVP flag to M4
> - P adds M4 to a retransmit table and keeps retransmitting M4 until it has received ACKs from every non-faulty member in the target set, or P leaves
> - When a message is received that is tagged with RSVP, an ACK is sent back to the sender
> - When P has received ACKs from everyone, the message send returns. The caller is blocked until this is the case (perhaps bounded by a timeout?)
> The reason why this works is that every receiver gets the latest message M4 from P, and adds it to its retransmit table. If there is a gap, it will ask P for retransmission of missing messages. This way, a receiver won't have to wait until STABLE sends a digest to find out it's missing messages from P.
> There could be an option for a receiver R to delay sending an ACK back to P until it has actually *received* the messages from P that it is missing. Without this delay, P could leave or crash *before* R got all of the missing messages.
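
The receiver side of the sender-flush described above (gap detection on receipt of the RSVP-tagged message, plus the optional delayed ACK) can be sketched roughly as follows. This is an illustrative model of one receiver's state for a single sender P, assuming seqnos are visible at this point (as they would be in the retransmission layer); the class ReceiverWindowSketch and its methods are hypothetical and do not correspond to existing JGroups classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of a receiver's view of one sender P; not JGroups code.
class ReceiverWindowSketch {
    private long highestContiguous;                             // every seqno <= this was received
    private final TreeSet<Long> outOfOrder = new TreeSet<>();   // received seqnos above a gap
    private final TreeSet<Long> pendingRsvp = new TreeSet<>();  // RSVP seqnos whose ACK is delayed

    ReceiverWindowSketch(long alreadyReceivedUpTo) {
        highestContiguous = alreadyReceivedUpTo;
    }

    // Records reception of 'seqno' and returns the RSVP seqnos that can now be ACKed,
    // i.e. those for which no lower message from P is still missing.
    List<Long> receive(long seqno, boolean rsvp) {
        if (seqno > highestContiguous) outOfOrder.add(seqno);
        // Advance the contiguous prefix as far as the received messages allow.
        while (outOfOrder.remove(highestContiguous + 1)) highestContiguous++;
        if (rsvp) pendingRsvp.add(seqno); // a retransmitted RSVP message is simply ACKed again
        List<Long> ackable = new ArrayList<>(pendingRsvp.headSet(highestContiguous, true));
        pendingRsvp.removeAll(ackable);
        return ackable;
    }

    // Seqnos the receiver should ask P to retransmit.
    List<Long> missing() {
        List<Long> gaps = new ArrayList<>();
        if (outOfOrder.isEmpty()) return gaps;
        for (long s = highestContiguous + 1; s < outOfOrder.last(); s++)
            if (!outOfOrder.contains(s)) gaps.add(s);
        return gaps;
    }
}
```

With the 5, 6, 7, 8 example from the description: a receiver that has everything up to 4 and then gets only the RSVP-tagged 8 learns immediately that 5, 6 and 7 are missing, without waiting for a STABLE digest, and in this delayed-ACK variant it ACKs 8 only once those three have arrived.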
