[jboss-jira] [JBoss JIRA] (JGRP-1389) Synchronous messages
Bela Ban (Updated) (JIRA)
jira-events at lists.jboss.org
Tue Nov 29 05:44:40 EST 2011
[ https://issues.jboss.org/browse/JGRP-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bela Ban updated JGRP-1389:
---------------------------
Summary: Synchronous messages (was: Per-sender flush)
Description:
FLUSH ensures that every member has received all messages from *all* other members. This is quite costly, especially if we have a large cluster.
However, sometimes it is only necessary to ensure that a given message M sent by P has been received by everyone. For example, if P is the coordinator and installs view V, it would be sufficient to ensure that everyone received V.
So synchronous messages are messages that block the sender until all of the (non-faulty) recipients have acked their reception, or a timeout occurs.
A user should be able to configure whether the message send is complete when all recipients have *received* the message, or when they have *delivered* it.
Reception of a message means that the message was added to the recipient's buffer; delivery means that the message was consumed by the application.
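The difference between the two completion modes can be illustrated with a small sketch. The names `AckMode` and `AckingReceiver` are hypothetical and not part of JGroups; the sketch only assumes that an ACK is recorded either when a message enters the buffer or when the application consumes it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical setting: when does a recipient acknowledge a synchronous message? */
enum AckMode { ON_RECEPTION, ON_DELIVERY }

/** Hypothetical recipient that records the ACKs it would send back to the sender. */
class AckingReceiver {
    private final AckMode mode;
    private final Queue<String> buffer = new ArrayDeque<>();
    final List<String> acksSent = new ArrayList<>();

    AckingReceiver(AckMode mode) { this.mode = mode; }

    /** Reception: the message is added to the recipient's buffer. */
    void receive(String msg) {
        buffer.add(msg);
        if (mode == AckMode.ON_RECEPTION)
            acksSent.add(msg);
    }

    /** Delivery: the application consumes the next buffered message (null if none). */
    String deliver() {
        String msg = buffer.poll();
        if (msg != null && mode == AckMode.ON_DELIVERY)
            acksSent.add(msg);
        return msg;
    }
}
```

With `ON_DELIVERY`, the sender stays blocked for as long as the message sits unconsumed in the buffer, which is the stronger (and slower) of the two guarantees.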
A message M4 sent by P is sometimes delivered late because M4 is the last message sent by P: if members Q and R don't receive M4 (e.g. because it was dropped), and P doesn't send another message M5, then Q and R have no means of detecting that M4 was sent by P, and thus never ask P for retransmission of M4.
This is of course solved by STABLE, which periodically broadcasts the highest seqnos sent, so that Q and R can then ask P for retransmission of M4.
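As a rough illustration of how a digest exposes a lost trailing message, the sketch below computes which seqnos a receiver would have to request. `DigestCheck` and `missingAfter` are hypothetical names, and the real STABLE digest format is not modelled; the only assumption is that the receiver learns the highest seqno P has sent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: compares a receiver's state with a digest entry for sender P. */
class DigestCheck {
    /**
     * Returns the seqnos to request from P, given the highest seqno received
     * without gaps and the highest seqno P reports having sent.
     */
    static List<Long> missingAfter(long highestReceived, long highestSent) {
        List<Long> missing = new ArrayList<>();
        for (long seqno = highestReceived + 1; seqno <= highestSent; seqno++)
            missing.add(seqno);
        return missing;
    }
}
```

If Q has received M1 to M3 and the digest reports 4 as P's highest sent seqno, `missingAfter(3, 4)` yields only seqno 4, i.e. M4. Without the digest, Q has nothing to compare against.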
However, if we don't want to wait that long, and don't want to risk Q and R leaving before they've received M4, we can implement a *sender-flush* of M4 sent by P.
This works as follows:
- P can add an RSVP flag to M4
- P adds M4 to a retransmit table and keeps retransmitting M4 until it has received ACKs from every non-faulty member in the target set, or P leaves
- When a message is received that is tagged with RSVP, an ACK is sent back to the sender
- When P has received ACKs from everyone, the message send returns. The caller is blocked until this is the case (maybe bounded by a timeout?)
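The sender side of the steps above could be sketched roughly as follows. This is not the JGroups implementation: `AckCollector`, `sendAndWait` and the use of plain strings as member names are assumptions made for illustration, and the transmission itself is abstracted into a `Runnable`.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: tracks which members still owe an ACK for one RSVP-tagged message. */
class AckCollector {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition allAcked = lock.newCondition();
    private final Set<String> missing;

    AckCollector(Collection<String> targets) { missing = new HashSet<>(targets); }

    /** Called when an ACK arrives; may also be called when a member leaves or is declared faulty. */
    void ack(String member) {
        lock.lock();
        try {
            if (missing.remove(member) && missing.isEmpty())
                allAcked.signalAll();
        } finally { lock.unlock(); }
    }

    /** Blocks until all ACKs have arrived (true), or the timeout or an interrupt occurs (false). */
    boolean waitForAll(long timeoutMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (!missing.isEmpty()) {
                if (nanos <= 0)
                    return false;
                nanos = allAcked.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        } finally { lock.unlock(); }
    }

    /** Sends, then retransmits every retransmitMs until everyone acked or timeoutMs elapsed. */
    boolean sendAndWait(Runnable transmit, long retransmitMs, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            transmit.run(); // send M4, or retransmit it
            long leftMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (waitForAll(Math.max(0, Math.min(retransmitMs, leftMs))))
                return true;
            if (leftMs <= 0 || Thread.currentThread().isInterrupted())
                return false;
        }
    }
}
```

A caller would create one collector per RSVP message with the current target set, feed incoming ACKs to `ack()`, and treat a `false` result from `sendAndWait()` as a timeout.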
The reason why this works is that every receiver gets the latest message M4 from P, and adds it to its retransmit table. If there is a gap, it will ask P for retransmission of missing messages. This way, a receiver won't have to wait until STABLE sends a digest to find out it's missing messages from P.
There could be an option for a receiver R to delay sending an ACK back to P until it has actually *received* P's missing messages. Otherwise, P could leave or crash *before* R got all of P's missing messages.
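The receiver side, including the optional delayed ACK, might look like the following sketch. `GapTrackingReceiver` is a hypothetical name; the sketch assumes seqnos from P start at 1 and models the retransmit table as a sorted set of seqnos received out of order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical receiver-side state for messages from a single sender P. */
class GapTrackingReceiver {
    private final TreeSet<Long> outOfOrder = new TreeSet<>();  // received, but a gap precedes them
    private final TreeSet<Long> pendingAcks = new TreeSet<>(); // RSVP seqnos not yet acked
    private long highestContiguous = 0;                        // all seqnos <= this were received
    final List<Long> acksSent = new ArrayList<>();

    /** Handles one message from P and returns the seqnos to request for retransmission. */
    List<Long> receive(long seqno, boolean rsvp) {
        if (seqno > highestContiguous)
            outOfOrder.add(seqno);
        // Advance over every seqno that is now contiguous.
        while (outOfOrder.remove(highestContiguous + 1))
            highestContiguous++;
        if (rsvp)
            pendingAcks.add(seqno);
        // Delayed ACK: only acknowledge RSVP messages once all earlier messages arrived.
        // A retransmitted RSVP message is acked again, in case the first ACK was lost.
        while (!pendingAcks.isEmpty() && pendingAcks.first() <= highestContiguous)
            acksSent.add(pendingAcks.pollFirst());
        List<Long> missing = new ArrayList<>();
        if (!outOfOrder.isEmpty())
            for (long s = highestContiguous + 1; s < outOfOrder.last(); s++)
                if (!outOfOrder.contains(s))
                    missing.add(s);
        return missing;
    }
}
```

Receiving M4 with the RSVP flag while M3 is still missing returns seqno 3 as the retransmission request and withholds the ACK; once M3 arrives, the ACK for M4 is recorded.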
was:
FLUSH ensures that every member has received all messages from *all* other members. This is quite costly, especially if we have a large cluster.
However, sometimes it is only necessary to ensure that a given message M sent by P has been received by everyone. For example, if P is the coordinator and installs view V, it would be sufficient to ensure that everyone received V.
A message M4 sent by P is sometimes delivered late because M4 is the last message sent by P: if members Q and R don't receive M4 (e.g. because it was dropped), and P doesn't send another message M5, then Q and R have no means of detecting that M4 was sent by P, and thus never ask P for retransmission of M4.
This is of course solved by STABLE, which periodically broadcasts the highest seqnos sent, so that Q and R can then ask P for retransmission of M4.
However, if we don't want to wait that long, and don't want to risk Q and R leaving before they've received M4, we can implement a *sender-flush* of M4 sent by P.
This works as follows:
- P can add an RSVP flag to M4
- P adds M4 to a retransmit table and keeps retransmitting M4 until it has received ACKs from every non-faulty member in the target set, or P leaves
- When a message is received that is tagged with RSVP, an ACK is sent back to the sender
- When P has received ACKs from everyone, the message send returns. The caller is blocked until this is the case (maybe bounded by a timeout?)
The reason why this works is that every receiver gets the latest message M4 from P, and adds it to its retransmit table. If there is a gap, it will ask P for retransmission of missing messages. This way, a receiver won't have to wait until STABLE sends a digest to find out it's missing messages from P.
There could be an option for a receiver R to delay sending an ACK back to P until it has actually *received* P's missing messages. Otherwise, P could leave or crash *before* R got all of P's missing messages.
> Synchronous messages
> --------------------
>
> Key: JGRP-1389
> URL: https://issues.jboss.org/browse/JGRP-1389
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 3.1