[jboss-jira] [JBoss JIRA] (JGRP-1497) FD_SOCK server socket is never closed after network interruption
Dennis Reed (JIRA)
jira-events at lists.jboss.org
Fri Jul 27 11:47:06 EDT 2012
[ https://issues.jboss.org/browse/JGRP-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dennis Reed updated JGRP-1497:
------------------------------
Steps to Reproduce:
Start two clustered instances.
Interrupt the communication between them (pull the network plug, or simulate it using a firewall) for long enough to split the cluster apart.
After TCP keepalive kicks in, the existing FD_SOCK TCP sockets between the instances should be closed on both sides.
In EAP 5.1 and later, the default FD_SOCK ports are in the ranges 5420x, 5320x, 5790x.
Before the patch, the server side of the FD_SOCK TCP connections is never closed until the JGroups channel is shut down, and every time the cluster splits and rejoins, new connections are created.
The default TCP keepalive time is a little over 2 hours on most operating systems. Reducing it in an OS-dependent way can speed up this test.
On Linux, it's controlled by:
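One way to watch for the leftover connections is to filter a socket listing by the default port ranges given above. This is only a sketch: the helper name fd_sock_filter is made up for illustration, and it assumes the default EAP port ranges are in use and that the listing puts the port after a colon (as `ss -tan` and `netstat -tan` do on Linux).

```shell
#!/bin/sh
# Hypothetical helper: keep only lines that mention a port in the default
# FD_SOCK ranges named above (5420x, 5320x, 5790x).
# Reads socket-listing lines on stdin, writes the matching lines to stdout.
fd_sock_filter() {
    # ":<prefix><one digit>" followed by a non-digit or end of line,
    # so that only five-digit ports in these ranges match.
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E ':(5420|5320|5790)[0-9]([^0-9]|$)' || true
}
```

On Linux this could be used as `ss -tan | fd_sock_filter`. Running it before the split, after the split, and again after the keepalive timeout should show whether the server-side sockets have gone away.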
/proc/sys/net/ipv4/tcp_keepalive_time
/proc/sys/net/ipv4/tcp_keepalive_intvl
/proc/sys/net/ipv4/tcp_keepalive_probes
The socket should be closed at (keepalive_time + keepalive_intvl * keepalive_probes), from the point the socket is created, if the connection is no longer valid.
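As a worked example of that formula, the sketch below computes the timeout in seconds. The function name keepalive_timeout is made up for illustration. The values 7200, 75 and 9 are the usual Linux defaults for the three settings, which is where the "a little over 2 hours" figure comes from; other systems may differ.

```shell
#!/bin/sh
# Hypothetical helper: seconds until a dead, idle connection is dropped.
#   $1 = keepalive_time, $2 = keepalive_intvl, $3 = keepalive_probes
keepalive_timeout() {
    echo $(( $1 + $2 * $3 ))
}

# Usual Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
keepalive_timeout 7200 75 9    # prints 7875 (2 h 11 min 15 s)

# With the live values from /proc, when they are readable (Linux only).
p=/proc/sys/net/ipv4
if [ -r "$p/tcp_keepalive_time" ]; then
    keepalive_timeout "$(cat "$p/tcp_keepalive_time")" \
                      "$(cat "$p/tcp_keepalive_intvl")" \
                      "$(cat "$p/tcp_keepalive_probes")"
fi
```

To shorten the test, the values can be lowered as root, for example with `sysctl -w net.ipv4.tcp_keepalive_time=60`. Note that this changes the setting for the whole host, not just for the instances under test.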
On Linux with instances running on 127.0.0.1 and 127.0.0.2 with -u 230.1.2.3, the network split can be simulated with:
iptables -I INPUT -s 127.0.0.1 -d 127.0.0.2 -j DROP
iptables -I INPUT -s 127.0.0.2 -d 127.0.0.1 -j DROP
iptables -I INPUT -d 230.1.2.3 -j DROP
And restored with:
iptables -D INPUT 1
iptables -D INPUT 1
iptables -D INPUT 1
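Deleting by position assumes that nothing else has been inserted at the top of the INPUT chain in the meantime. A possible alternative, sketched below, is to delete each rule by its full specification, which iptables supports with -D. The helper name split_rules is made up for illustration, and the addresses are the ones used in this example setup. The function only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the iptables commands for this test setup.
#   $1 = -I to create the split, -D to remove exactly the same rules.
# Deleting by rule specification avoids removing an unrelated rule that
# happens to sit at position 1 of the INPUT chain.
split_rules() {
    action=$1
    echo "iptables $action INPUT -s 127.0.0.1 -d 127.0.0.2 -j DROP"
    echo "iptables $action INPUT -s 127.0.0.2 -d 127.0.0.1 -j DROP"
    echo "iptables $action INPUT -d 230.1.2.3 -j DROP"
}

split_rules -I   # print the commands that create the split
split_rules -D   # print the commands that remove those same rules
```

To apply the printed commands, the output can be piped to a root shell, for example `split_rules -I | sh`.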
was:
Start two clustered instances.
Interrupt the communication between them (pull the network plug, or simulate it using a firewall) for long enough to split the cluster apart.
After TCP keepalive kicks in, the existing FD_SOCK TCP sockets between the instances should be closed on both sides.
In EAP 5.1 and later, the default FD_SOCK ports are in the ranges 5420x, 5320x, 5790x.
Before the patch, the server side of the FD_SOCK TCP connections is never closed until the JGroups channel is shut down, and every time the cluster splits and rejoins, new connections are created.
The default TCP keepalive time is a little over 2 hours on most operating systems. Reducing it in an OS-dependent way can speed up this test.
On Linux, it's controlled by:
/proc/sys/net/ipv4/tcp_keepalive_time
/proc/sys/net/ipv4/tcp_keepalive_intvl
/proc/sys/net/ipv4/tcp_keepalive_probes
The socket should be closed at (keepalive_time + keepalive_intvl * keepalive_probes), from the point the socket is created, if the network is down.
On Linux with instances running on 127.0.0.1 and 127.0.0.2 with -u 230.1.2.3, the network split can be simulated with:
iptables -I INPUT -s 127.0.0.1 -d 127.0.0.2 -j DROP
iptables -I INPUT -s 127.0.0.2 -d 127.0.0.1 -j DROP
iptables -I INPUT -d 230.1.2.3 -j DROP
And restored with:
iptables -D INPUT 1
iptables -D INPUT 1
iptables -D INPUT 1
> FD_SOCK server socket is never closed after network interruption
> ----------------------------------------------------------------
>
> Key: JGRP-1497
> URL: https://issues.jboss.org/browse/JGRP-1497
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 2.6.21
> Reporter: Dennis Reed
> Assignee: Dennis Reed
> Fix For: 3.2
>
>
> The server side of the FD_SOCK socket is never closed after a network interruption during which the client side is closed.
> JGRP-195 added TCP keepalive to FD_SOCK, but only to the client side.