[
https://issues.jboss.org/browse/JGRP-1904?page=com.atlassian.jira.plugin....
]
Bela Ban updated JGRP-1904:
---------------------------
Description:
When we multicast messages (with {{NAKACK2}}), if the last message M is dropped (e.g.
because the receiver has a full thread pool), then it will take a while for M to be
retransmitted.
Currently, detection of missing messages is done in {{STABLE}}: this protocol periodically
(or based on receiving a certain number of bytes) broadcasts its digest. When the
coordinator has received all digests, it computes the minimum and broadcasts a {{STABILITY}}
message.
When members receive this message, they can purge all messages below the minimum vector.
Receivers also detect if the digest contains messages higher than the ones they actually
received and can thus perform retransmission, if needed.
So the main purpose of {{STABLE}} is purging of messages seen by everyone, *not* detecting
missing messages. It also takes a while (consensus from all members) to generate a
{{STABILITY}} message, so detecting a missing message takes a long time.
We therefore need a quicker way to detect and retransmit missing messages.
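For reference, the {{STABLE}}-based detection described above can be sketched roughly as follows. This is only an illustration, not the JGroups implementation: {{DigestSketch}}, {{minimum}} and {{sendersWithGaps}} are hypothetical names, a digest is modelled as a plain map from sender name to highest seqno, and every digest is assumed to cover the same set of senders.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are JGroups API.
final class DigestSketch {

    /**
     * Coordinator side: per-sender minimum over all members' digests.
     * Messages up to this seqno have been seen by everyone and can be purged.
     * Assumes every digest contains an entry for every sender.
     */
    static Map<String, Long> minimum(List<Map<String, Long>> digests) {
        Map<String, Long> min = new HashMap<>();
        for (Map<String, Long> digest : digests)
            for (Map.Entry<String, Long> e : digest.entrySet())
                min.merge(e.getKey(), e.getValue(), Long::min);
        return min;
    }

    /**
     * Receiver side: senders for which an advertised digest is ahead of what
     * was received locally, i.e. candidates for a retransmission request.
     */
    static List<String> sendersWithGaps(Map<String, Long> advertised,
                                        Map<String, Long> received) {
        List<String> behind = new ArrayList<>();
        for (Map.Entry<String, Long> e : advertised.entrySet()) {
            long local = received.getOrDefault(e.getKey(), 0L);
            if (e.getValue() > local)
                behind.add(e.getKey());
        }
        return behind;
    }
}
```

The sketch makes the latency problem visible: a gap is only noticed once a digest arrives, and the {{STABILITY}} round needs input from every member first.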
h5. Solution
* Add a flag to {{NAKACK2}} (*not* {{STABLE}}) which, if set, periodically (or based on
the number of bytes sent) broadcasts the seqno of the highest message it sent.
** {{xmit_interval}} can be reused as the interval
* When a receiver detects that it hasn't received the message with this seqno, it can ask
the sender for retransmission
* The task should quiesce when no messages are missing (i.e. the highest delivered
message == the highest received message)
** When the current seqno hasn't changed for N consecutive runs, stop sending the highest
seqno
* While messages are being sent, the task should not run either
* It should also be possible to trigger the sending programmatically, or via JMX
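The bullets above could be sketched along these lines. This is one possible reading of the proposal, not the eventual {{NAKACK2}} code: {{LastSeqnoSketch}}, {{Sender}}, {{Receiver}}, {{tick}} and {{trigger}} are hypothetical names, "the task should not run while messages are sent" is interpreted as "skip a run if the seqno changed since the previous run", and the receiver only tracks the highest contiguously received seqno.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposal; none of these names are JGroups API.
final class LastSeqnoSketch {

    /** Sender side: decides on each periodic run whether to announce the highest sent seqno. */
    static final class Sender {
        private final int maxUnchangedRuns; // the "N" from the description
        private long highestSent;           // seqno of the last message multicast
        private long seenOnLastRun = -1;    // value of highestSent at the previous run
        private int unchangedRuns;          // announcements made since the seqno last changed

        Sender(int maxUnchangedRuns) { this.maxUnchangedRuns = maxUnchangedRuns; }

        /** Multicasts a message (not modelled here) and returns its seqno. */
        long send() { return ++highestSent; }

        /** One periodic run; returns the seqno to announce, or -1 to stay quiet. */
        long tick() {
            if (highestSent != seenOnLastRun) {
                // Messages were sent since the last run: regular traffic already
                // lets receivers detect gaps, so skip this run.
                seenOnLastRun = highestSent;
                unchangedRuns = 0;
                return -1;
            }
            if (highestSent == 0 || unchangedRuns >= maxUnchangedRuns)
                return -1; // nothing sent yet, or quiesced after N announcements
            unchangedRuns++;
            return highestSent;
        }

        /** Programmatic (or JMX-style) trigger: always announce the current seqno. */
        long trigger() { return highestSent; }
    }

    /** Receiver side: simplified to contiguous delivery, no buffering of out-of-order messages. */
    static final class Receiver {
        private long highestReceived;

        void receive(long seqno) {
            if (seqno == highestReceived + 1)
                highestReceived = seqno;
        }

        /** Seqnos to request from the sender after an announcement arrives. */
        List<Long> onHighestSeqno(long announced) {
            List<Long> missing = new ArrayList<>();
            for (long s = highestReceived + 1; s <= announced; s++)
                missing.add(s);
            return missing;
        }
    }
}
```

With N = 2, a sender whose last message was dropped stays quiet on the first run after sending, announces the seqno on the next two runs, and then goes quiet until it sends again or {{trigger}} is called.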
was:
When we multicast messages (with {{NAKACK2}}), if the last message M is dropped (e.g.
because the receiver has a full thread pool), then it will take a while for M to be
retransmitted.
Currently, detection of missing messages is done in {{STABLE}}: this protocol periodically
(or based on receiving a certain number of bytes) broadcasts its digest. When the
coordinator has received all digests, it computes the minimum and broadcasts a {{STABILITY}}
message.
When members receive this message, they can purge all messages below the minimum vector.
Receivers also detect if the digest contains messages higher than the ones they actually
received and can thus perform retransmission, if needed.
So the main purpose of {{STABLE}} is purging of messages seen by everyone, *not* detecting
missing messages. It also takes a while (consensus from all members) to generate a
{{STABILITY}} message, so detecting a missing message takes a long time.
We therefore need a quicker way to detect and retransmit missing messages.
h5. Solution
* Add a flag to {{NAKACK2}} (*not* {{STABLE}}) which, if set, periodically (or based on
the number of bytes sent) broadcasts the seqno of the highest message it sent.
* When a receiver detects that it hasn't received the message with this seqno, it can ask
the sender for retransmission
* The task should quiesce when no messages are missing (i.e. the highest delivered
message == the highest received message)
** While messages are being sent, the task should not run either
* It should also be possible to trigger the sending programmatically, or via JMX
NAKACK2: retransmit the last-message-missing sooner
---------------------------------------------------
Key: JGRP-1904
URL: https://issues.jboss.org/browse/JGRP-1904
Project: JGroups
Issue Type: Enhancement
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 3.6.1
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)