[jboss-jira] [JBoss JIRA] (JGRP-1571) Race condition in tcpgossip discovery

Ramky Kandula (JIRA) jira-events at lists.jboss.org
Fri Jan 25 00:06:47 EST 2013


Ramky Kandula created JGRP-1571:
-----------------------------------

             Summary: Race condition in tcpgossip discovery
                 Key: JGRP-1571
                 URL: https://issues.jboss.org/browse/JGRP-1571
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 3.1
            Reporter: Ramky Kandula
            Assignee: Bela Ban


We are observing a race condition in TCPGOSSIP based discovery.  

A is gossip router and B is a node connecting to gossip router. 

Timer-1 on B starts connecting, does ping. 
Timer-2 on B starts reconnecting 
A sees this new connection and closes the old one from Timer-1 from B. 
Timer-2 on B does ping 
Timer-1 on B tries to do gossip_get and encounters broken pipe as the socket 
is closed. 

This process goes on forever.

Logs from Gossip Router A 
========================= 

2012-12-14 13:20:47,687 DEBUG  [GossipRouter][] org.jgroups.stack.GossipRouter- 
Accepted connection, socket is 
Socket[addr=/192.168.1.11,port=21009,localport=12001] 
2012-12-14 13:20:47,687 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] entering 
receive loop 
2012-12-14 13:20:47,749 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] received 
CONNECT(group=TestCluster, addr=dd-ace1-20957, 
logical_name=dd-ace1-20957, physical_addrs=192.168.1.11:7800) 
2012-12-14 13:20:47,749 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- Found old connection[ConnectionHandler[peer: 
/192.168.1.11, logical_addrs: dd-ace1-20957, dd-ace1-20957]] for 
addr[dd-ace1-20957]. Closing old connection ... 
2012-12-14 13:20:47,749 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] connection handshake completed, added 
dd-ace1-20957 to group TestCluster 
2012-12-14 13:20:47,749 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] processed  CONNECT(group=TestCluster, 
addr=dd-ace1-20957, logical_name=dd-ace1-20957, 
physical_addrs=192.168.1.11:7800) 
2012-12-14 13:20:48,875 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] received PING(group=null, addr=null) 
2012-12-14 13:20:48,875 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] processed  PING(group=null, addr=null) 
2012-12-14 13:20:53,735 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] received CONNECT(group=TestCluster, 
addr=dd-ace1-20957, logical_name=dd-ace1-20957, 
physical_addrs=192.168.1.11:7800) 
2012-12-14 13:20:53,736 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- Found old connection[ConnectionHandler[peer: 
/192.168.1.11, logical_addrs: dd-ace1-20957]] for addr[dd-ace1-20957]. Closing 
old connection ... 
2012-12-14 13:20:53,736 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] is being closed 
2012-12-14 13:20:53,838 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957, dd-ace1-20957] connection handshake completed, 
added dd-ace1-20957 to group TestCluster 
2012-12-14 13:20:53,838 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957, dd-ace1-20957] processed 
CONNECT(group=TestCluster, addr=dd-ace1-20957, 
logical_name=dd-ace1-20957, physical_addrs=192.168.1.11:7800) 
2012-12-14 13:21:04,023 DEBUG  [GossipRouter][] org.jgroups.stack.GossipRouter- 
Accepted connection, socket is 
Socket[addr=/192.168.1.11,port=37357,localport=12001] 
2012-12-14 13:21:04,023 DEBUG  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] entering 
receive loop 
2012-12-14 13:21:04,085 TRACE  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] received 
CONNECT(group=TestCluster, addr=dd-ace1-20957, 
logical_name=dd-ace1-20957, physical_addrs=192.168.1.11:7800) 
2012-12-14 13:21:04,085 DEBUG  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- Found old connection[ConnectionHandler[peer: 
/192.168.1.11, logical_addrs: dd-ace1-20957, dd-ace1-20957]] for 
addr[dd-ace1-20957]. Closing old connection ... 
2012-12-14 13:21:04,086 DEBUG  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] connection handshake completed, added 
dd-ace1-20957 to group TestCluster 
2012-12-14 13:21:04,086 TRACE  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] processed  CONNECT(group=TestCluster, 
addr=dd-ace1-20957, logical_name=dd-ace1-20957, 
physical_addrs=192.168.1.11:7800) 

Logs from Gossip Client B 
========================= 

2012-12-14 13:20:47,687 DEBUG  [GossipRouter][] org.jgroups.stack.GossipRouter- 
Accepted connection, socket is 
Socket[addr=/192.168.1.11,port=21009,localport=12001] 
2012-12-14 13:20:47,687 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] entering 
receive loop 
2012-12-14 13:20:47,749 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] received 
CONNECT(group=TestCluster, addr=dd-ace1-20957, 
logical_name=dd-ace1-20957, physical_addrs=192.168.1.11:7800) 
2012-12-14 13:20:47,749 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- Found old connection[ConnectionHandler[peer: 
/192.168.1.11, logical_addrs: dd-ace1-20957, dd-ace1-20957]] for 
addr[dd-ace1-20957]. Closing old connection ... 
2012-12-14 13:20:47,749 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] connection handshake completed, added 
dd-ace1-20957 to group TestCluster 
2012-12-14 13:20:47,749 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] processed  CONNECT(group=TestCluster, 
addr=dd-ace1-20957, logical_name=dd-ace1-20957, 
physical_addrs=192.168.1.11:7800) 
2012-12-14 13:20:48,875 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] received PING(group=null, addr=null) 
2012-12-14 13:20:48,875 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] processed  PING(group=null, addr=null) 
2012-12-14 13:20:53,735 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] received CONNECT(group=TestCluster, 
addr=dd-ace1-20957, logical_name=dd-ace1-20957, 
physical_addrs=192.168.1.11:7800) 
2012-12-14 13:20:53,736 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- Found old connection[ConnectionHandler[peer: 
/192.168.1.11, logical_addrs: dd-ace1-20957]] for addr[dd-ace1-20957]. Closing 
old connection ... 
2012-12-14 13:20:53,736 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] is being closed 
2012-12-14 13:20:53,838 DEBUG  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957, dd-ace1-20957] connection handshake completed, 
added dd-ace1-20957 to group TestCluster 
2012-12-14 13:20:53,838 TRACE  [gossip-handlers--9882][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957, dd-ace1-20957] processed 
CONNECT(group=TestCluster, addr=dd-ace1-20957, 
logical_name=dd-ace1-20957, physical_addrs=192.168.1.11:7800) 
2012-12-14 13:21:04,023 DEBUG  [GossipRouter][] org.jgroups.stack.GossipRouter- 
Accepted connection, socket is 
Socket[addr=/192.168.1.11,port=37357,localport=12001] 
2012-12-14 13:21:04,023 DEBUG  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] entering 
receive loop 
2012-12-14 13:21:04,085 TRACE  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11] received 
CONNECT(group=TestCluster, addr=dd-ace1-20957, 
logical_name=dd-ace1-20957, physical_addrs=192.168.1.11:7800) 
2012-12-14 13:21:04,085 DEBUG  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- Found old connection[ConnectionHandler[peer: 
/192.168.1.11, logical_addrs: dd-ace1-20957, dd-ace1-20957]] for 
addr[dd-ace1-20957]. Closing old connection ... 
2012-12-14 13:21:04,086 DEBUG  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] connection handshake completed, added 
dd-ace1-20957 to group TestCluster 
2012-12-14 13:21:04,086 TRACE  [gossip-handlers--9881][] 
org.jgroups.stack.GossipRouter- ConnectionHandler[peer: /192.168.1.11, 
logical_addrs: dd-ace1-20957] processed  CONNECT(group=TestCluster, 
addr=dd-ace1-20957, logical_name=dd-ace1-20957, 
physical_addrs=192.168.1.11:7800) 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


More information about the jboss-jira mailing list