[JBoss JIRA] (JGRP-1956) S3_PING / FILE_PING: remove failed members

Friday, 28 August 2015

    [ https://issues.jboss.org/browse/JGRP-1956?page=com.atlassian.jira.plugin.... ]

Karsten Ohme edited comment on JGRP-1956 at 8/28/15 8:53 PM:
-------------------------------------------------------------

This seems to be open again. My development system runs on localhost and works in
single mode.
When the server starts, a new file is created in the S3 bucket, named after the server
name plus a random number. When the server is restarted, the old address is read from the
bucket and a new one is generated. So, for example, after 7 restarts there are 7 server
addresses stored in the bucket, and the server tries to reach all of them at startup to
find other members. I have set the timeout to one second to limit the effect, but the
server still tries to connect 10 times before it switches to single mode.

The stale files should be removed somehow, even if the server crashes, or the method for
calculating the unique server name should be deterministic. This worked with versions
lower than 3.6.4.
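
To illustrate the second option (a deterministic name), here is a minimal sketch in plain
Java. It is not how JGroups names its files; the class StableMemberName, the method
fileNameFor, the ".node" suffix and the cluster/host/port key are all hypothetical. The
idea is only that a name derived from stable inputs is identical after every restart, so
the restarted server overwrites its old entry instead of adding a new one.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class StableMemberName {

    // Hypothetical helper: derives a file name from values that do not
    // change across restarts (cluster name, host, port).
    public static String fileNameFor(String clusterName, String host, int port) {
        String key = clusterName + "/" + host + ":" + port;
        // nameUUIDFromBytes returns a name-based (type 3) UUID, so the same
        // input bytes always give the same UUID.
        UUID id = UUID.nameUUIDFromBytes(key.getBytes(StandardCharsets.UTF_8));
        return host + "-" + id + ".node";
    }

    public static void main(String[] args) {
        // Both calls print the same name, simulating two starts of one server.
        System.out.println(fileNameFor("demo-cluster", "localhost", 7800));
        System.out.println(fileNameFor("demo-cluster", "localhost", 7800));
    }
}
```

Two servers that share host and port would collide under this scheme, so it only fits
setups where that pair is unique per member.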

was (Author: k_o_):
This seems to be open again. When the server starts, a new file is created in the S3
bucket, named after the single DNS name plus a random number. When the server is
restarted, the old address is read from the bucket and a new one is generated. After 7
restarts there are 7 server addresses stored in the bucket, and the server tries to reach
all of them. I have set the timeout to one second to limit the effect, but the server
still tries to connect 10 times before it switches to single mode.

The stale files should be removed somehow, even if the server crashes, or the method for
calculating the unique server name should be deterministic. This worked with versions
lower than 3.6.4.

...
 S3_PING / FILE_PING: remove failed members
 ------------------------------------------

                 Key: JGRP-1956
                 URL: https://issues.jboss.org/browse/JGRP-1956
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 3.6.4
            Reporter: Karsten Ohme
            Assignee: Bela Ban

 When we terminate a member (EC2's "terminate" function) or kill -9 it, the file (or
bucket data in S3) won't get removed. This leads to stale data. On EC2, I expect that
virtualized instances are often simply terminated, so this problem is compounded there.
 SOLUTION:
 - Periodically write own data to the file system (FILE_PING) or S3 (S3_PING)
 - On a view change: remove all data that's not in the current view 
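
The two steps of the proposed solution can be sketched against a plain directory, as
FILE_PING would use. This is a rough illustration under stated assumptions, not the
JGroups implementation: the class StaleEntryCleaner, the one-file-per-member layout, the
".node" suffix and the use of plain strings as member IDs are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class StaleEntryCleaner {

    // Hypothetical layout: one file per member, named <memberId>.node
    static final String SUFFIX = ".node";

    // Step 1: (re)write this member's own entry. Calling this periodically
    // restores the entry if another member removed it by mistake.
    public static void writeOwnEntry(Path dir, String memberId, String address)
            throws IOException {
        Files.createDirectories(dir);
        Files.write(dir.resolve(memberId + SUFFIX),
                    address.getBytes(StandardCharsets.UTF_8));
    }

    // Step 2: on a view change, delete every entry whose member ID is not in
    // the current view. Returns the removed member IDs in sorted order.
    public static List<String> removeStale(Path dir, Set<String> currentView)
            throws IOException {
        List<String> removed = new ArrayList<>();
        try (DirectoryStream<Path> entries =
                 Files.newDirectoryStream(dir, "*" + SUFFIX)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                String memberId = name.substring(0, name.length() - SUFFIX.length());
                if (!currentView.contains(memberId)) {
                    Files.deleteIfExists(entry);
                    removed.add(memberId);
                }
            }
        }
        Collections.sort(removed);
        return removed;
    }
}
```

In this sketch a member that was killed with kill -9 leaves its file behind, and the next
view change on any surviving member removes it. A server running alone (as in the comment
above) would pass a view that contains only itself, which clears entries left by its
earlier runs.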

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
