[jboss-jira] [JBoss JIRA] (JGRP-1902) Simplify failure detection and merge timeout configuration

Tuesday, 2 December 2014

     [ https://issues.jboss.org/browse/JGRP-1902?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-1902:
---------------------------
    Fix Version/s: 3.6.2
                       (was: 3.6.1)

...
 Simplify failure detection and merge timeout configuration
 ----------------------------------------------------------

                 Key: JGRP-1902
                 URL: https://issues.jboss.org/browse/JGRP-1902
             Project: JGroups
          Issue Type: Enhancement
    Affects Versions: 3.6
            Reporter: Dan Berindei
            Assignee: Bela Ban
            Priority: Minor
             Fix For: 3.6.2, 4.0

 The FD/FD_ALL/FD_ALL2/FD_SOCK javadoc doesn't give any guidance as to how long it
takes to detect a leaving member. The MERGE2/MERGE3 javadoc also doesn't say how long
it takes to detect that the network has healed.
 As an example of how misleading the current settings can be, I have seen MERGE3 take
more than 20s to merge two partitions with min_interval=1000 and max_interval=5000. FD
also detects a leaver after {{timeout * max_tries}} in the best case, and twice that if 2
consecutive nodes (in the members list) leave at the same time.
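
 To make the FD arithmetic above concrete, here is a rough sketch. It is not JGroups
code: the class and method names are made up for illustration, the values in main()
are hypothetical, and it assumes timeout is given in milliseconds. It only encodes the
two figures quoted above (timeout * max_tries in the best case, twice that when 2
consecutive members leave together).

```java
// Back-of-the-envelope estimate only; names are hypothetical, not part of JGroups.
public final class FdDetectionEstimate {

    // Best case quoted in the issue: timeout * max_tries.
    static long bestCaseMillis(long timeoutMillis, int maxTries) {
        return timeoutMillis * maxTries;
    }

    // Figure quoted in the issue for 2 consecutive members (in the
    // members list) leaving at the same time: twice the best case.
    static long twoConsecutiveLeaversMillis(long timeoutMillis, int maxTries) {
        return 2 * bestCaseMillis(timeoutMillis, maxTries);
    }

    public static void main(String[] args) {
        long timeout = 3000; // hypothetical FD timeout, assumed to be in ms
        int maxTries = 5;    // hypothetical FD max_tries

        System.out.println("best case: "
                + bestCaseMillis(timeout, maxTries) + " ms");
        System.out.println("two consecutive leavers: "
                + twoConsecutiveLeaversMillis(timeout, maxTries) + " ms");
    }
}
```

 With these example values, an RPC timeout much below 30s could expire before FD
reports the second leaver.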
 The maximum time it takes to detect a leaver is of particular interest to Infinispan
users, because Infinispan is supposed to protect against nodes leaving. But if the users
don't configure a high enough RPC timeout in Infinispan, we don't get to detect
the node leaving.
 Ideally, the user should be able to specify a maximum detection time, and the protocol
should adjust the existing settings to meet that (most of the time). 
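
 One possible shape for such a setting, sketched under assumptions: given a maximum
detection time and a fixed max_tries, derive the largest FD timeout that still fits
the "2 consecutive leavers" figure from the description (2 * timeout * max_tries).
The class and method names are hypothetical; nothing here is an existing JGroups API.

```java
// Hypothetical helper, not part of JGroups: derives a timeout from a budget.
public final class FdTimeoutFromBudget {

    // Returns the largest timeout (ms) such that
    // 2 * timeout * maxTries <= maxDetectionMillis.
    static long timeoutForBudget(long maxDetectionMillis, int maxTries) {
        if (maxDetectionMillis <= 0 || maxTries <= 0) {
            throw new IllegalArgumentException("budget and maxTries must be positive");
        }
        // Integer division rounds down, so the budget is never exceeded.
        return maxDetectionMillis / (2L * maxTries);
    }

    public static void main(String[] args) {
        long budget = 30_000; // hypothetical maximum detection time in ms
        int maxTries = 3;     // hypothetical FD max_tries
        System.out.println("timeout: " + timeoutForBudget(budget, maxTries) + " ms");
    }
}
```

 This only covers the two cases named in the description; longer runs of consecutive
leavers or the other FD_* protocols would need their own formulas.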

--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
