[jboss-jira] [JBoss JIRA] (JGRP-2171) New bundler with max_bundle_size for each destination

Tue Jun 13 09:58:00 EDT 2017

    [ https://issues.jboss.org/browse/JGRP-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418615#comment-13418615 ] 

Bela Ban edited comment on JGRP-2171 at 6/13/17 9:57 AM:
---------------------------------------------------------

h2. Alternative designs

h3. Alternating bundler
This bundler sends a batch as soon as the target destination changes, e.g. for sequence {{\[A A B C C C B A A A\]}}, batches {{\[A A\]}}, {{B}} (single message), {{\[C C C\]}}, {{B}} (single message) and {{\[A A A\]}} will be sent.
Of course, {{max_bundle_size}} is still observed; if the accumulated size of a run of messages to the same destination exceeds it, the batch is sent immediately.

The advantage is that messages or message batches are sent immediately (reducing latency) and that this is a simple design (similar to the design in the issue description with {{num_flips=1}}). The disadvantage is that for sequences such as {{\[A B A B A B\]}}, we'll send 6 single messages, so this degenerates into a {{NoBundler}}.
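
To make the splitting rule concrete, here is a minimal sketch (not JGroups code). The class {{AlternatingSketch}} and its method {{batches()}} are hypothetical names, messages are represented only by their destination, and {{max_bundle_size}} is ignored for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: models messages as destination names
// and ignores max_bundle_size.
class AlternatingSketch {

    /** Returns one list per batch; a new batch starts whenever the destination changes. */
    static List<List<String>> batches(List<String> destinations) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String dest : destinations) {
            // A change of destination closes the batch collected so far
            if (!current.isEmpty() && !current.get(0).equals(dest)) {
                result.add(current);
                current = new ArrayList<>();
            }
            current.add(dest);
        }
        if (!current.isEmpty())
            result.add(current); // flush the trailing batch
        return result;
    }

    public static void main(String[] args) {
        // Mixed sequence: runs of equal destinations become batches
        System.out.println(batches(Arrays.asList("A", "A", "B", "C", "C", "C", "B", "A", "A", "A")));
        // Strictly alternating sequence: every batch holds a single message
        System.out.println(batches(Arrays.asList("A", "B", "A", "B", "A", "B")));
    }
}
```

The second call shows the degenerate case: with strictly alternating destinations, no batch ever holds more than one message.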

h3. Remove-queue based bundler
This bundler drains messages from the main queue (to which sender threads add their messages) into a remove-queue of fixed length. Then we iterate through the remove-queue, add each message to a list keyed by its target destination, and finally send a batch (or single message) for each destination.

In the above example, we'd send 3 batches {{\[A A A A A\]}}, {{\[B B\]}} and {{\[C C C\]}}. Contrast this with the 5 batches (or single messages) that we send with the alternating bundler above.
The size of the remove-queue determines the max latency: a bigger queue will result in more throughput but also higher latency. A queue of size 1 behaves more or less like the {{NoBundler}}.

In contrast to {{TransferQueueBundler}}, this bundler uses a {{RingBuffer}} rather than an {{ArrayBlockingQueue}}, and the size of the remove-queue is fixed. {{TransferQueueBundler}} increases the size of the remove-queue dynamically, which leads to higher latency if the remove-queue grows too much.
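
The drain-and-group step could look roughly like the sketch below. This is an assumption-laden illustration, not the proposed implementation: {{RemoveQueueSketch}} and {{drainAndGroup()}} are hypothetical names, messages are modelled as destination strings, and a plain {{ArrayBlockingQueue}} stands in for the main queue (the design above would use a {{RingBuffer}} instead).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only: an ArrayBlockingQueue stands in for the
// main queue, and each message is represented by its destination name.
class RemoveQueueSketch {

    /** Drains at most removeQueueSize messages and groups them by destination. */
    static Map<String, List<String>> drainAndGroup(BlockingQueue<String> mainQueue, int removeQueueSize) {
        // Fixed-size remove-queue: never holds more than removeQueueSize messages
        List<String> removeQueue = new ArrayList<>(removeQueueSize);
        mainQueue.drainTo(removeQueue, removeQueueSize);

        // One list per destination, in order of first appearance
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (String dest : removeQueue)
            batches.computeIfAbsent(dest, k -> new ArrayList<>()).add(dest);
        return batches;
    }

    public static void main(String[] args) {
        BlockingQueue<String> mainQueue = new ArrayBlockingQueue<>(16);
        mainQueue.addAll(Arrays.asList("A", "A", "B", "C", "C", "C", "B", "A", "A", "A"));
        // A remove-queue of 10 takes the whole sequence in one pass
        System.out.println(drainAndGroup(mainQueue, 10));
    }
}
```

A smaller {{removeQueueSize}} means each pass sees fewer messages, so batches are smaller but are handed off sooner, which is the throughput/latency trade-off described above.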

> New bundler with max_bundle_size for each destination
> -----------------------------------------------------
>
>                 Key: JGRP-2171
>                 URL: https://issues.jboss.org/browse/JGRP-2171
>             Project: JGroups
>          Issue Type: Feature Request
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 4.0.4
>
>
> The current bundlers queue all messages and when the total size of all messages for all destinations would exceed {{max_bundle_size}}, message batches for each destination are sent.
> This negatively affects latency-sensitive applications, e.g. when we have a queue such as this: {{A B B C B B D B B}}, then the message for A has to wait until either the queue is full ({{max_bundle_size}} exceeded), or no more messages are received (and then we send the batches anyway).
> The goal is to write a new bundler which keeps a count for _each destination_ and sends batches to different destinations sooner. Also introduce a counter {{num_flips}} (find a better name!), which determines when a message batch is to be sent.
> This counter is decremented when a message to be sent has a destination that's different from the previous destination. When the counter is 0, we send the batch to the previous destination(s).
> We have a main queue, into which the senders write, and a runner thread (same as {{run()}} in {{TransferQueueBundler}}), which continually removes messages from the main queue and inserts them into queues for each destination.
> So 1 main queue and 1 queue for each destination.
> h4. Example:
> * {{num_flips}} is 2
> * A message for A is sent, added to the main queue and removed by the runner. It is queued in A's queue
> * Another message for A is sent. Also queued (A's queue: {{A A}})
> * A message to B is sent: A's {{num_flips}} is now 1. A's queue is {{A A}}, B's queue is {{B}}
> * Another message to A is sent. This resets A's {{num_flips}} to 2, B's {{num_flips}} is now 1
> * 2 messages to C are sent. This causes {{num_flips}} for A and B to be 0, so the batches to A (with 3 msgs) and B (1 msg) are also sent
> * No more messages are received, so the batch to C is also sent
> The value of {{num_flips}} should be computed as the rolling (weighted) average of the number of *adjacent messages to the same destination*. It is maintained for each destination separately (probably in the queue for that destination).
> h4. Misc
> * Should the sending of batches be delegated to a thread pool?
> * Should the senders add their messages directly to the destination queues instead of the main queue? That would result in less contention on the main queue, but it would also require 1 thread per destination queue, which creates too many threads...
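
The {{num_flips}} example from the issue description can be traced with a small simulation. This is only a sketch under stated assumptions: {{NumFlipsSketch}} and {{simulate()}} are hypothetical names, messages are modelled as destination strings, {{num_flips}} is a fixed value rather than a rolling average, and a destination's counter is assumed to be decremented by every message addressed to a different destination (the reading that matches the example).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only. Assumption: each message to another
// destination decrements the counter of every destination with pending messages.
class NumFlipsSketch {

    /** Returns the batches in send order, formatted as "destination=messageCount". */
    static List<String> simulate(List<String> destinations, int numFlips) {
        Map<String, Integer> pending = new LinkedHashMap<>(); // dest -> queued message count
        Map<String, Integer> flips = new HashMap<>();         // dest -> remaining flips
        List<String> sent = new ArrayList<>();

        for (String dest : destinations) {
            pending.merge(dest, 1, Integer::sum); // queue the message
            flips.put(dest, numFlips);            // a new message resets this destination's counter

            Iterator<Map.Entry<String, Integer>> it = pending.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Integer> e = it.next();
                if (e.getKey().equals(dest))
                    continue;
                // Every other destination with pending messages loses one flip
                int remaining = flips.merge(e.getKey(), -1, Integer::sum);
                if (remaining <= 0) {
                    sent.add(e.getKey() + "=" + e.getValue()); // counter exhausted: send the batch
                    flips.remove(e.getKey());
                    it.remove();
                }
            }
        }
        // No more messages: send whatever is still queued
        for (Map.Entry<String, Integer> e : pending.entrySet())
            sent.add(e.getKey() + "=" + e.getValue());
        return sent;
    }

    public static void main(String[] args) {
        // Sequence from the example: A, A, B, A, C, C with num_flips = 2
        System.out.println(simulate(Arrays.asList("A", "A", "B", "A", "C", "C"), 2));
    }
}
```

For the example sequence this yields a batch of 1 message to B, a batch of 3 messages to A, and finally a batch of 2 messages to C.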
