[jboss-jira] [JBoss JIRA] Updated: (JGRP-510) NAKACK: adjust retransmission times based on statistics

Fri Aug 3 09:27:57 EDT 2007

     [ http://jira.jboss.com/jira/browse/JGRP-510?page=all ]

Bela Ban updated JGRP-510:
--------------------------

    Description: 
NAKACK can maintain a rolling average of the time it takes for a missing message to get retransmitted (time between sending the XMIT-REQ and reception of the missing message). This can be done per sender.
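
A minimal sketch of such a per-sender rolling average, assuming an exponentially weighted moving average. The class name XmitStats, the alpha weight and the use of a String as the sender address are illustrative assumptions, not existing JGroups code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of JGroups): keeps, per sender, a rolling
// average of the time between sending an XMIT-REQ and receiving the
// missing message.
class XmitStats {
    private final double alpha; // weight given to the newest sample (assumed)
    private final Map<String, Double> averages = new HashMap<>();

    XmitStats(double alpha) {
        if (alpha <= 0 || alpha > 1)
            throw new IllegalArgumentException("alpha must be in (0,1]");
        this.alpha = alpha;
    }

    // Records one measured retransmission time (ms) and returns the new average.
    double record(String sender, double millis) {
        Double old = averages.get(sender);
        double updated = (old == null) ? millis : (1 - alpha) * old + alpha * millis;
        averages.put(sender, updated);
        return updated;
    }

    // Returns the current average for a sender, or the fallback if no sample exists yet.
    double average(String sender, double fallback) {
        return averages.getOrDefault(sender, fallback);
    }
}
```

A smaller alpha makes the average react more slowly to a single outlier; a plain sliding window over the last N samples would work as well.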

The retransmit_timeout values can be derived from the rolling average, which makes them completely dynamic: instead of a fixed set of retransmit timeouts, we define only a single initial retransmit_timeout, e.g. 30ms.

If a missing message is not received after the first retransmission request, we simply double that value (exponential backoff) until a max limit is reached. For each successfully retransmitted message, we reduce the timeout linearly. This resembles the way TCP backs off its retransmission timer.
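
The doubling and linear reduction could be sketched as follows. The class name BackoffTimeout, the lower bound and the step size are assumptions made for illustration; the issue only fixes the idea of a max limit:

```java
// Hypothetical helper (not part of JGroups): doubles the timeout when a
// retransmission request goes unanswered and lowers it by a fixed step
// after each successful retransmission.
class BackoffTimeout {
    private final long min;  // assumed floor for the timeout (ms)
    private final long max;  // max limit mentioned in the description (ms)
    private final long step; // assumed linear decrement (ms)
    private long current;

    BackoffTimeout(long initial, long min, long max, long step) {
        if (min <= 0 || max < min || initial < min || initial > max || step <= 0)
            throw new IllegalArgumentException("inconsistent timeout bounds");
        this.min = min;
        this.max = max;
        this.step = step;
        this.current = initial;
    }

    // Called when the missing message did not arrive within the timeout.
    long onRetransmitFailed() {
        current = Math.min(current * 2, max);
        return current;
    }

    // Called when a retransmitted message was received.
    long onRetransmitSucceeded() {
        current = Math.max(current - step, min);
        return current;
    }

    long current() {
        return current;
    }
}
```

With an initial value of 30ms, a max of 200ms and a step of 10ms, three failures in a row give 60, 120 and 200ms, and the next success brings the timeout down to 190ms.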

When we discover that the rolling average is less than 30ms (e.g. 4ms), we lower the retransmission timeout accordingly, e.g. to 4ms plus possibly a safety margin of, say, 2ms, for a total of 6ms. This makes message retransmission faster, since we no longer have to wait 30ms before asking for retransmission.

OTOH, if the rolling average increases, we can also increase our retransmission timeout, to avoid overloading the network with spurious retransmission requests.
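
Deriving the timeout from the rolling average, in both directions, might look like this. RetransmitTimeouts and its parameters are hypothetical names; rounding up and capping at a maximum are assumptions:

```java
// Hypothetical helper (not part of JGroups): computes a retransmission
// timeout from the rolling average plus a safety margin, capped at a maximum.
final class RetransmitTimeouts {
    private RetransmitTimeouts() {}

    static long fromAverage(double avgMillis, long safetyMillis, long maxMillis) {
        if (avgMillis < 0 || safetyMillis < 0 || maxMillis <= 0)
            throw new IllegalArgumentException("arguments must be non-negative, max positive");
        // Round the average up so the timeout is never shorter than the measured time.
        long candidate = (long) Math.ceil(avgMillis) + safetyMillis;
        return Math.min(candidate, maxMillis);
    }
}
```

For the example above, fromAverage(4.0, 2, 1000) yields 6ms; a larger average raises the timeout until the cap is reached.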

RESULT: message retransmission is more dynamic:
- Retransmit timeout is *per sender*
- On low retransmission ack times, we ask for retransmission sooner, therefore increasing message rates
- On high retransmission ack times, we throttle retransmission timeouts (and use exponential backoff), therefore reducing traffic

  was:
NAKACK can maintain a rolling average of the time it takes for a missing message to get retransmitted (time between sending the XMIT-REQ and reception of the missing message). This can be done per sender.

The retransmit_timeout values can be based on the rolling average, e.g. make this completely dynamic: instead of a set of retransmit timeouts we only define a retransmit_timeout, e.g. 30ms. 

If a message is not able to get retransmitted the first time, we simply double that value (exponential backoff), until a max limit is reached.

When we discover that the rolling timeout is less than 30ms (e.g. 4ms), then we lower the retransmission timeout, e.g. to 4 ms (plus possibly a safety time, say 2ms), to a total of 6ms. This will make message retransmission faster, since we don't have to wait for 30ms to ask for retransmission.

OTOH, if the rolling average increases, we can also increase our retransmission timeout, to avoid overloading the network with spurious retransmission requests.

RESULT: message retransmission is more dynamic:
- Retransmit timeout is *per sender*
- On low retransmission ack times, we ask for retransmission sooner, therefore increasing message rates
- On high retransmission ack times, we throttle retransmission timeouts (and use exponential backoff), therefore reducing traffic

> NAKACK: adjust retransmission times based on statistics
> -------------------------------------------------------
>
>                 Key: JGRP-510
>                 URL: http://jira.jboss.com/jira/browse/JGRP-510
>             Project: JGroups
>          Issue Type: Feature Request
>            Reporter: Bela Ban
>         Assigned To: Bela Ban
>            Priority: Minor
>             Fix For: 2.6
