[
https://issues.jboss.org/browse/JGRP-2253?page=com.atlassian.jira.plugin....
]
Sibin Karnavar reopened JGRP-2253:
----------------------------------
I would like to reopen this case. In the AWS environment, FD_SOCK sometimes does not work
when we kill the process with kill -9 or terminate the EC2 instance from the
console.
Most of the time, the broken connection is detected within 3 seconds of crashing the node
(e.g. kill -9, so the node leaves the cluster without a graceful
shutdown).
Sometimes it is not detected immediately and takes 12 to 13 seconds. So I am assuming
that in this case the FD configuration, rather than FD_SOCK, is what detects the node
failure.
There are no warnings related to FD_SOCK in the log files.
TCP configuration:
jgroup-ext-addr is the IP address of each node.
<TCP
external_addr="${jgroup-ext-addr}"
bind_addr="match-interface:eth0"
bind_port="7803"
port_range="0"
diagnostics_port="7806"
recv_buf_size="20M"
send_buf_size="10M"
max_bundle_size="64k"
enable_diagnostics="true"
thread_naming_pattern="cl"
thread_pool.enabled="true"
thread_pool.min_threads="2"
thread_pool.max_threads="16"
thread_pool.keep_alive_time="5000" />
FD protocols:
<FD_SOCK start_port="7804" external_addr="${jgroup-ext-addr}"
bind_addr="match-interface:eth0" client_bind_port="7805"/>
<FD timeout="3000" max_tries="3" />
<VERIFY_SUSPECT timeout="3000" />
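As a rough check of the 12 to 13 second figure, here is a minimal sketch of the arithmetic. It assumes FD suspects a member after max_tries missed heartbeats of timeout milliseconds each, and that VERIFY_SUSPECT then waits its own timeout before confirming. The class and method names are made up for illustration and are not part of JGroups.

```java
public class DetectionTimeEstimate {

    // Assumption: FD needs fdMaxTries missed heartbeats, each waited on for
    // fdTimeout ms, before suspecting a member; VERIFY_SUSPECT then waits
    // verifySuspectTimeout ms before the suspicion is confirmed.
    static long worstCaseMillis(long fdTimeout, int fdMaxTries, long verifySuspectTimeout) {
        return fdTimeout * fdMaxTries + verifySuspectTimeout;
    }

    public static void main(String[] args) {
        // Values taken from the FD and VERIFY_SUSPECT settings above.
        long ms = worstCaseMillis(3000, 3, 3000);
        System.out.println("Estimated FD + VERIFY_SUSPECT detection time: " + ms + " ms");
    }
}
```

With the values above this gives 12000 ms, which would be consistent with FD, not FD_SOCK, detecting the failure in the slow cases.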
Thanks
FD_SOCK is not working in AWS environment
-----------------------------------------
Key: JGRP-2253
URL: https://issues.jboss.org/browse/JGRP-2253
Project: JGroups
Issue Type: Bug
Affects Versions: 4.0.10
Environment: AWS - EC2
Reporter: Sibin Karnavar
Assignee: Bela Ban
Our failure detection is configured as follows:
<FD_SOCK external_port="7804" />
<FD timeout="3000" max_tries="3" />
<VERIFY_SUSPECT timeout="3000" />
Please note that we have used FD instead of FD_ALL in AWS. We will change it to
FD_ALL later, after detailed testing.
Locally, this works perfectly. As soon as I kill my node, I can see the view change
happening immediately with FD_SOCK.
Initially we did not set external_port on FD_SOCK. Later I suspected the port might be
the issue, so I set it to 7804 and added that port to the security group so that all
nodes can reach each other on it. So the port should not be the issue.
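One way to double-check that the port is reachable between nodes is a plain TCP connect from another node while the member is running. The following is only a sketch: the PortCheck class, the canConnect helper, and the default host are placeholders, not part of JGroups.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: treat all of these as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the other node's address as the first argument; 7804 is the port used above.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println(host + ":7804 reachable = " + canConnect(host, 7804, 2000));
    }
}
```

A successful connect only shows that the security group allows the traffic and that something is listening; it does not prove that FD_SOCK is using that port.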
Can you please let us know whether any additional configuration is needed to make FD_SOCK
work well in AWS?
Thanks,
Sibin
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)