Bela Ban commented on JGRP-1261:
--------------------------------
This is a very imprecise description; just cutting and pasting a log won't cut it. If you
have something to reproduce this, I might be willing to take a look, although I don't
really support 2.6 (see [1])...
[1]
Get Socket closed exception, the node leaves the group and
doesn't re-join anymore
-----------------------------------------------------------------------------------
Key: JGRP-1261
URL: https://issues.jboss.org/browse/JGRP-1261
Project: JGroups
Issue Type: Bug
Affects Versions: 2.6.10
Environment: JBoss 5.1GA, RHEL 5.3
Reporter: Nick Semchenkov
Assignee: Bela Ban
Attachments: jgroups-channelfactory-stacks.xml
The JBoss server works fine as a cluster member for a while and then suddenly gets a
"Socket closed" exception. After that it's no longer a member of the cluster.
Please note, we have set shun=false to try to keep it in the cluster, but it didn't help.
In the past we had JBoss 4.2 running on the same hardware and had no issues, so it's
probably not a networking problem.
Please find below an excerpt from the cluster.log; the JGroups config is attached.
cluster.log ------------------------------------------------
2010-12-10 09:16:14,160 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:16:14,161 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:16:14,161 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:16:14,548 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:7901 (own address=10.170.1.24:7900)
2010-12-10 09:17:06,591 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to
10.170.1.24:7901: java.net.SocketException: Socket closed
2010-12-10 09:17:06,628 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:47583 (own address=10.170.1.24:38487)
2010-12-10 09:17:06,638 DEBUG [org.jgroups.protocols.FD] heartbeat missing from
10.170.1.24:47583 (number=0)
2010-12-10 09:17:06,640 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:7901 (own address=10.170.1.24:7900)
2010-12-10 09:17:06,644 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from
10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from
10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from
10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,645 DEBUG [org.jgroups.protocols.FD] heartbeat missing from
10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] heartbeat missing from
10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)
2010-12-10 09:17:06,646 DEBUG [org.jgroups.protocols.FD] heartbeat missing from
10.170.1.24:39588 (number=0)
2010-12-10 09:17:06,852 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to
10.170.1.25:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:07,960 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to
10.170.1.17:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:08,113 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to
10.170.1.16:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:08,167 WARN [org.jgroups.protocols.FD] I was suspected by
10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,176 WARN [org.jgroups.protocols.FD] I was suspected by
10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,177 WARN [org.jgroups.protocols.FD] I was suspected by
10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,177 WARN [org.jgroups.protocols.FD] I was suspected by
10.170.1.15:54399; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,213 WARN [org.jgroups.protocols.FD] I was suspected by
10.170.1.15:33585; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,256 WARN [org.jgroups.protocols.FD] I was suspected by
10.170.1.15:33585; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
2010-12-10 09:17:08,297 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to
10.170.1.15:7901: java.net.SocketException: Broken pipe
2010-12-10 09:17:08,309 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH
at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,309 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH
at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,310 DEBUG [org.jgroups.protocols.pbcast.GMS]
view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984,
10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,310 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405,
10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399,
10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,310 WARN [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am
not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656,
10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399,
10.170.1.15:33585, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,311 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948
received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,311 DEBUG [org.jgroups.protocols.pbcast.GMS]
view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984,
10.170.1.16:40389, 10.170.1.15:54399, 10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,311 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405,
10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399,
10.170.1.15:33585, 10.170.1.24:39588]
2010-12-10 09:17:08,312 WARN [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am
not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656,
10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:54399,
10.170.1.15:33585, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,312 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948
received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH
at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH
at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH
at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,314 DEBUG [org.jgroups.protocols.pbcast.GMS]
view=[10.170.1.25:56405|11] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984,
10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,314 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|11] [10.170.1.25:56405,
10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,314 WARN [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am
not a member of view [10.170.1.25:56405|11] [10.170.1.25:56405, 10.170.1.17:39656,
10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS]
view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984,
10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS]
view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984,
10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405,
10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405,
10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,315 WARN [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am
not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656,
10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,315 WARN [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am
not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656,
10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,316 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948
received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,316 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948
received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,317 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948
received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,313 DEBUG [org.jgroups.protocols.pbcast.FLUSH] Received START_FLUSH
at 10.170.1.24:60948 but I am not flush participant, not responding
2010-12-10 09:17:08,378 DEBUG [org.jgroups.protocols.pbcast.GMS]
view=[10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656, 10.170.1.16:33984,
10.170.1.16:40389, 10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,378 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=10.170.1.24:60948] view is [10.170.1.25:56405|12] [10.170.1.25:56405,
10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]
2010-12-10 09:17:08,379 WARN [org.jgroups.protocols.pbcast.GMS] I (10.170.1.24:60948) am
not a member of view [10.170.1.25:56405|12] [10.170.1.25:56405, 10.170.1.17:39656,
10.170.1.16:33984, 10.170.1.16:40389, 10.170.1.15:33585,
10.170.1.15:54399, 10.170.1.24:39588]; discarding view
2010-12-10 09:17:08,379 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At 10.170.1.24:60948
received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
2010-12-10 09:17:08,678 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to
10.170.1.15:7900: java.net.SocketException: Socket closed
2010-12-10 09:17:08,877 ERROR [org.jgroups.blocks.ConnectionTable] failed sending data to
10.170.1.16:7901: java.net.SocketException: Socket closed
2010-12-10 09:17:12,126 WARN [org.jgroups.protocols.VIEW_SYNC] discarding view as I
(10.170.1.24:60948) am not member of view ([10.170.1.25:56405|12] [10.170.1.25:56405,
10.170.1.17:39656, 10.170.1.16:33984, 10.170.1.16:40389,
10.170.1.15:33585, 10.170.1.15:54399, 10.170.1.24:39588])
2010-12-10 09:17:12,648 DEBUG [org.jgroups.protocols.FD] sending are-you-alive msg to
10.170.1.24:39588 (own address=10.170.1.24:60948)