[jboss-jira] [JBoss JIRA] Created: (JGRP-1261) Get Socket closed exception, the nodes leaves the group and doesn't re-join anymore

Nick Semchenkov (JIRA) jira-events at lists.jboss.org
Fri Dec 10 09:55:52 EST 2010


Get Socket closed exception, the nodes leaves the group and doesn't re-join anymore
-----------------------------------------------------------------------------------

                 Key: JGRP-1261
                 URL: https://issues.jboss.org/browse/JGRP-1261
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 2.6.10
         Environment: JBoss 5.1GA, RHEL 5.3
            Reporter: Nick Semchenkov
            Assignee: Bela Ban


The JBoss server works fine as a cluster member for a while and then all of a sudden gets a "Socket closed" exception. After that, it is no longer a member of the cluster.

Please note that we set shun=false to try to keep it in the cluster, but it didn't help.

In the past we had JBoss 4.2 running on the same hardware and had no issues, so it's probably not a networking problem.

Please find below an excerpt from cluster.log. The JGroups config is attached.

cluster.log ------------------------------------------------

2010-12-10 09:16:14,160 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:16:14,161 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:16:14,161 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:16:14,548 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:7901 (own address=10.170.1.24:7900)
2010-12-10 09:17:06,591 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to 10.170.1.24:7901: java.net.SocketException: Socket closed
2010-12-10 09:17:06,628 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:47583 (own address=10.170.1.24:38487)
2010-12-10 09:17:06,638 DEBUG [org.jgroups.protocols.FD] heartbeat missing from 10.170.1.24:47583 (number=0)
2010-12-10 09:17:06,640 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:7901 (own address=10.170.1.24:7900)
2010-12-10 09:17:06,644 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from 10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from 10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from 10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from 10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] heartbeat missing from 10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] heartbeat missing from 10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,852 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to 10.170.1.25:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:07,960 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to 10.170.1.17:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:08,113 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to 10.170.1.16:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:08,167 WARN  [org.jgroups.protocols.FD] I was suspected by 10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,176 WARN  [org.jgroups.protocols.FD] I was suspected by 10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,177 WARN  [org.jgroups.protocols.FD] I was suspected by 10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,177 WARN  [org.jgroups.protocols.FD] I was suspected by 10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,213 WARN  [org.jgroups.protocols.FD] I was suspected by 10.170.1.15:33585; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,256 WARN  [org.jgroups.protocols.FD] I was suspected by 10.170.1.15:33585; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,297 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to 10.170.1.15:7901: java.net.SocketException: Broken pipe
2010-12-10 09:17:08,309 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,309 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,310 DEBUG [org.jgroups.protocols.pbcast.GMS] view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,310 DEBUG [org.jgroups.protocols.pbcast.GMS] [local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,310 WARN  [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,311 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948 received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,311 DEBUG [org.jgroups.protocols.pbcast.GMS] view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,311 DEBUG [org.jgroups.protocols.pbcast.GMS] [local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,312 WARN  [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,312 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948 received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,314 DEBUG [org.jgroups.protocols.pbcast.GMS] view=[10.170.1.25:56405|11] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,314 DEBUG [org.jgroups.protocols.pbcast.GMS] [local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|11] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,314 WARN  [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am not a member of view [10.170.1.25:56405|11] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS] view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS] view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS] [local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS] [local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 WARN  [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,315 WARN  [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,316 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948 received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,316 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948 received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,317 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948 received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,378 DEBUG [org.jgroups.protocols.pbcast.GMS] view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,378 DEBUG [org.jgroups.protocols.pbcast.GMS] [local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,379 WARN  [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,379 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948 received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,678 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to 10.170.1.15:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:08,877 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to 10.170.1.16:7901: java.net.SocketException: Socket closed
2010-12-10 09:17:12,126 WARN  [org.jgroups.protocols.VIEW_SYNC] discarding view as I (10.170.1.24:60948) am not member of view ([10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588])
2010-12-10 09:17:12,648 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to 10.170.1.24:39588 (own address=10.170.1.24:60948)
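
The repeated "am not a member of view" warnings above can be checked mechanically. The following is only a minimal sketch in Python, not part of JGroups: the regular expression and the helper name check_membership are assumptions made for illustration, based solely on the line format visible in this excerpt. It takes one GMS "view is" line, extracts the local address and the member list, and reports whether the local address is in the view and which members share its host.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   ... [local_addr=HOST:PORT] view is [COORD|VIEW_ID] [MEMBER, MEMBER, ...]
VIEW_RE = re.compile(
    r"\[local_addr=(?P<local>[\d.]+:\d+)\] view is "
    r"\[(?P<coord>[\d.]+:\d+)\|(?P<view_id>\d+)\] "
    r"\[(?P<members>[^\]]*)\]"
)


def check_membership(line):
    """Return (local_addr, view_id, is_member, same_host_members),
    or None if the line is not a GMS 'view is' line."""
    m = VIEW_RE.search(line)
    if m is None:
        return None
    local = m.group("local")
    members = [a.strip() for a in m.group("members").split(",") if a.strip()]
    host = local.split(":")[0]
    # Members that run on the same IP as the local address.
    same_host = [a for a in members if a.split(":")[0] == host]
    return local, int(m.group("view_id")), local in members, same_host


# One GMS line from the excerpt (timestamp 09:17:08,310).
sample = (
    "2010-12-10 09:17:08,310 DEBUG [org.jgroups.protocols.pbcast.GMS] "
    "[local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] "
    "[10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984, "
    "10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, "
    "10.170.1.24:39588]"
)

print(check_membership(sample))
```

For the sample line, the local address 10.170.1.24:60948 is not in view 12, while 10.170.1.24:39588 on the same host is. This matches what the GMS warning reports; it does not by itself explain why the address was dropped from the view.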





