[jboss-jira] [JBoss JIRA] (JGRP-2245) JGroup JDBC_PING is not clearing the crashed members

Wednesday, 31 January 2018



    [
https://issues.jboss.org/browse/JGRP-2245?page=com.atlassian.jira.plugin....
] 

Bela Ban edited comment on JGRP-2245 at 1/31/18 10:30 AM:
----------------------------------------------------------

Will you be able to make JGroups write only its own address to the DB, or just the new or
latest view?


was (Author: sibin.karnavar):
Will you be able to make J GROUP writes only its own address to DB or just the new view or
latest

...
 JGroup JDBC_PING is not clearing the crashed members
 ----------------------------------------------------

                 Key: JGRP-2245
                 URL: https://issues.jboss.org/browse/JGRP-2245
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 4.0.8
            Reporter: Sibin Karnavar
            Assignee: Bela Ban
             Fix For: 4.0.10


 1) In AWS cloud environments, the IP address changes when a node crashes and a
new cluster node is created to replace it.
 2) In this situation, JGroups does not clear logical_addr_cache and gets confused
when we restart the cluster nodes.
 3) logical_addr_cache_max_size and the eviction did not work because the cache keeps
getting updated from the ping data, so the stale entries never get marked as removable.
 I think the issue is that the handleView method always rewrites the entire cache to the
DB on a view change. So even if we clear the table with the help of the flags
remove_all_data_on_view_change and remove_old_coords_on_view_change, the stale entries
get rewritten to the table.
 {code:java}
  // remove all files which are not from the current members
     protected void handleView(View new_view, View old_view, boolean coord_changed) {
         if(is_coord) {
             if(coord_changed) {
                 if(remove_all_data_on_view_change)
                     removeAll(cluster_name);
                 else if(remove_old_coords_on_view_change) {
                     Address old_coord=old_view != null? old_view.getCreator() : null;
                     if(old_coord != null)
                         remove(cluster_name, old_coord);
                 }
             }
             if(coord_changed || View.diff(old_view, new_view)[1].length > 0) {
                 writeAll();
                 if(remove_all_data_on_view_change || remove_old_coords_on_view_change)
                     startInfoWriter();
             }
         }
         else if(coord_changed) // I'm no longer the coordinator
             remove(cluster_name, local_addr);
     }
 {code}
 4) Because of the crashed members (non-existent IP addresses), we are getting a lot of
socket timeouts: sendToMembers in TP tries to send messages to the old crashed members
and writes error logs during startup.
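
The effect described in points 3 and 4 can be illustrated with a small model. This is only a sketch, not JGroups code: plain strings stand in for member addresses, a Set stands in for the JDBC_PING table, and the class and method names (StaleEntrySketch, writeWholeCache, writeViewMembersOnly) are hypothetical. It assumes the cache still holds the crashed member "B" while the current view contains only "A" and "C".

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not JGroups code: strings stand in for member addresses
// and a Set stands in for the rows of the JDBC_PING table.
class StaleEntrySketch {

    // Mimics the reported behavior: every cached address is written back to
    // the table, including members that are no longer in the view.
    static Set<String> writeWholeCache(Set<String> cache) {
        return new LinkedHashSet<>(cache);
    }

    // Possible alternative: write only the cached addresses that are also
    // members of the current view, so crashed members are dropped.
    static Set<String> writeViewMembersOnly(Set<String> cache, List<String> view) {
        Set<String> table = new LinkedHashSet<>(cache);
        table.retainAll(view);
        return table;
    }

    public static void main(String[] args) {
        Set<String> cache = new LinkedHashSet<>(Arrays.asList("A", "B", "C"));
        List<String> view = Arrays.asList("A", "C"); // "B" has crashed

        System.out.println(writeWholeCache(cache));            // prints [A, B, C]
        System.out.println(writeViewMembersOnly(cache, view)); // prints [A, C]
    }
}
```

In the first case the crashed member "B" ends up in the table again even if the table was cleared beforehand, which matches the stale rows observed after a restart. Whether filtering by the current view is the right fix for JDBC_PING is left open here.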


--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
