[JBoss JIRA] (JGRP-2281) MERGE3 blocks unnecessarily in discovery when non-multicast transport is used

Wednesday, 27 June 2018

     [ https://issues.jboss.org/browse/JGRP-2281?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-2281:
---------------------------
    Description: 
When MERGE3 uses TCP, it cannot multicast its INFO message, and therefore uses the
discovery protocol (e.g. MPING) to fetch the targets to send the INFO message to.

Since we don't know how many responses to expect, we simply block for {{(min_interval
+ max_interval /2) ms}}. This is bad, as it delays the sending of INFO messages, which
results in a partial merge as we're likely not to get responses from *all* members.
This delays a full merge, e.g. when we have many singleton subclusters. A heavily split
cluster will therefore likely require more merge rounds than necessary when using TCP,
compared to (e.g.) UDP.

h4. Solution:
* The discovery process should be _reactive_ rather than blocking: instead of waiting for
N seconds, we simply pass a function to the discovery protocol that gets invoked whenever
a response has been received
* When the function gets invoked, it sends an INFO to the respective member
* This prevents 1 thread from blocking for N seconds
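
The reactive approach in the bullets above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the JGroups API or the code in the pull request: the class {{ReactiveDiscoverySketch}}, the methods {{discover}} and {{sendInfoReactively}}, the simulated delays and the member names are all hypothetical. Discovery is simulated with a thread pool, and "sending an INFO" is modeled as adding the member to a list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for a discovery protocol; none of these names exist in JGroups.
class ReactiveDiscoverySketch {

    // Simulates discovery: each member "responds" after its own delay, and the
    // callback is invoked as soon as that single response arrives.
    static CountDownLatch discover(List<String> members, long delayStepMs,
                                   ExecutorService pool, Consumer<String> onResponse) {
        CountDownLatch done = new CountDownLatch(members.size());
        for (int i = 0; i < members.size(); i++) {
            final String member = members.get(i);
            final long delay = delayStepMs * (i + 1);
            pool.execute(() -> {
                try {
                    Thread.sleep(delay);        // simulated network latency
                    onResponse.accept(member);  // react immediately, no fixed wait
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        return done;
    }

    // Runs one simulated discovery round and returns the members an INFO was "sent" to.
    static List<String> sendInfoReactively(List<String> members) {
        List<String> infoSentTo = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, members.size()));
        try {
            // The callback models "send an INFO to the member that just responded".
            CountDownLatch done = discover(members, 10, pool, infoSentTo::add);
            // Only this demo waits, so that it can return the result; a real sender
            // would return right after registering the callback.
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return infoSentTo;
    }

    public static void main(String[] args) {
        System.out.println(sendInfoReactively(Arrays.asList("A", "B", "C")));
    }
}
```

The point of the sketch is that each INFO goes out when its response arrives, so early responders are not held back until a fixed interval elapses, and no thread has to sleep for that interval.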

See \[1\] for details.
\[1\] https://github.com/belaban/JGroups/pull/389

  was:
When MERGE3 uses TCP, it cannot multicast its INFO message, and therefore uses the
discovery protocol (e.g. MPING) to fetch the targets to send the INFO message to.

Since we don't know how many responses to expect, we simply block for {{(min_interval
+ max_interval /2) ms}}. This is bad, as it delays the sending of INFO messages, which
results in a partial merge as we're likely not to get responses from *all* members.
This delays a full merge, e.g. when we have many singleton subclusters. A heavily split
cluster will therefore likely require more merge rounds than necessary when using TCP,
compared to (e.g.) UDP.

h4. Solution:
* The discovery process should be _reactive_ rather than blocking: instead of waiting for
N seconds, we simply pass a function to the discovery protocol that gets invoked whenever
a response has been received
* When the function gets invoked, it sends an INFO to the respective member
* This prevents up 1 thread from blocking for N seconds

See \[1\] for details.
\[1\] https://github.com/belaban/JGroups/pull/389

...
 MERGE3 blocks unnecessarily in discovery when non-multicast transport is used
 -----------------------------------------------------------------------------

                 Key: JGRP-2281
                 URL: https://issues.jboss.org/browse/JGRP-2281
             Project: JGroups
          Issue Type: Enhancement
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 4.0.13


--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
