[
https://issues.jboss.org/browse/JGRP-1299?page=com.atlassian.jira.plugin....
]
Igor M updated JGRP-1299:
-------------------------
Description:
This is what we see in production:
1. Node 1 does not send pings for 25 seconds
2. Node 2 notices 6 lost pings (in 15 seconds)
3. Node 2 starts sending "broadcast SUSPECT"
4. Node 1 replies to a few of them
5. Node 2 does not receive the replies until more than 15 seconds after it suspected Node 1
6. Node 2 removes Node 1 from the view
7. Node 1 keeps sending "are-you-alive" and Node 2 is now discarding them
At this point Node 1 believes there are two nodes in the cluster, and Node 2 sees only
itself.
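The timing in steps 1-3 can be checked with a little arithmetic. The sketch below is illustrative only and is not JGroups code. It assumes the failure detector probes every 2.5 seconds and suspects a peer after 6 consecutive misses, the values the Node 2 log reports as "6 times (15000 milliseconds)". The function and parameter names are hypothetical.

```python
def suspect_after_ms(timeout_ms: int, max_tries: int) -> int:
    """Time without heartbeat acks after which a peer is suspected."""
    return timeout_ms * max_tries


def is_suspected(pause_ms: int, timeout_ms: int = 2500, max_tries: int = 6) -> bool:
    """True if a pause of pause_ms is long enough to trigger suspicion.

    Assumption: suspicion fires once the silent period reaches
    timeout_ms * max_tries; the real protocol may differ in detail.
    """
    return pause_ms >= suspect_after_ms(timeout_ms, max_tries)


print(suspect_after_ms(2500, 6))  # 15000
print(is_suspected(25_000))       # True: a 25 s pause exceeds the 15 s window
print(is_suspected(10_000))       # False: a 10 s pause stays below it
```

Under these assumptions, any pause longer than 15 seconds on Node 1 is enough to get it suspected, so the 25-second silence in step 1 is well past the threshold.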
In the lab we were able to reproduce the problem by stopping the Node 1 process:
pstop {PID} ; sleep 35 ; prun {PID}
Once the process is resumed, it can never rejoin the cluster.
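On systems without pstop/prun, a similar pause can probably be produced with the SIGSTOP and SIGCONT signals. The following is a minimal sketch under that assumption (POSIX only). The helper name pause_process is hypothetical, and this is not the exact procedure used in the lab.

```python
import os
import signal
import time


def pause_process(pid: int, seconds: float) -> None:
    """Stop the process with SIGSTOP, wait, then resume it with SIGCONT."""
    os.kill(pid, signal.SIGSTOP)
    try:
        time.sleep(seconds)
    finally:
        # Always resume the process, even if the sleep is interrupted.
        os.kill(pid, signal.SIGCONT)
```

Calling pause_process(pid, 35) with the PID of the Node 1 JVM would mirror the 35-second pause in the command above.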
Here is the log snippet from Node 1. The first two lines show a 26-second interval between
pings, while it should have been 2.5 seconds. The Node 2 logs for the same time interval
follow the Node 1 logs.
I traced the 26-second delay to a GC cycle on Node 1. pstop/sleep/prun has almost the
same effect.
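A gap like this can be located mechanically in a log. The sketch below assumes the log format shown in this report (a leading "[Mon DD HH:MM:SS]" timestamp and the text "sending are-you-alive" on the first line of each ping record). The function name ping_gaps and the 5-second threshold are illustrative choices, not part of JGroups.

```python
import re
from datetime import datetime

# Matches the leading "[Mar 05 21:34:20]" timestamp of a log record.
TIMESTAMP = re.compile(r"^\[(\w{3} \d{2} \d{2}:\d{2}:\d{2})\]")


def ping_gaps(lines, max_gap_s=5.0):
    """Return (previous, current, gap_seconds) for ping gaps above max_gap_s."""
    gaps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match or "sending are-you-alive" not in line:
            continue
        # The log carries no year; 2000 is an arbitrary placeholder that only
        # serves to compute differences between nearby timestamps.
        current = datetime.strptime("2000 " + match.group(1), "%Y %b %d %H:%M:%S")
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap > max_gap_s:
                gaps.append((previous.strftime("%H:%M:%S"),
                             current.strftime("%H:%M:%S"), gap))
        previous = current
    return gaps
```

Run over the Node 1 log, this should report the interval between the 21:34:20 and 21:34:46 pings.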
Node 1 logs:
[Mar 05 21:34:20] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:34:46] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:34:46] [0001DE2E] org.jgroups.blocks.ConnectionTable ERROR failed sending data
to 10.10.10.165:32012: java.net.SocketException: Broken pipe
[Mar 05 21:34:47] [0001DE2E] org.jgroups.protocols.FD WARN I was suspected by
10.10.10.165:32012; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
[Mar 05 21:34:47] [0001DE38] org.jgroups.blocks.ConnectionTable ERROR failed sending data
to 10.10.10.165:32012: java.net.SocketException: Socket closed
[Mar 05 21:34:49] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:34:49] [0001DE38] org.jgroups.protocols.FD WARN I was suspected by
10.10.10.165:32012; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
[Mar 05 21:34:51] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:34:51] [0001DE38] org.jgroups.protocols.FD WARN I was suspected by
10.10.10.165:32012; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
[Mar 05 21:34:54] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:34:54] [0001DE38] org.jgroups.protocols.FD WARN I was suspected by
10.10.10.165:32012; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
[Mar 05 21:34:56] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:34:59] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:01] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:04] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:06] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:09] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:11] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:14] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:16] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:19] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:21] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:24] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:26] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:29] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:31] [0001D86E] org.jgroups.blocks.VotingAdapter VERBOSE Voting on decree
Vote[topic=DefaultDataSource_failover_server,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER_DSN=PC02ZONE1;TTC_SERVER=PC02ZONE1;TCP_PORT=17013;TTC_Timeout=180;,
index=1, requesterId=PC02ZONE1_PC02_1]] : VoteResult: up=2, down=0
[Mar 05 21:35:31] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:34] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:37] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:39] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:42] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:44] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:47] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:49] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:52] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:54] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:57] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:35:59] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:02] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:04] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:07] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:09] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:12] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:14] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:17] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:19] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:22] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:24] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:27] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:29] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:32] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:34] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:37] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:39] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:42] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:44] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:47] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:49] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:52] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:54] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:57] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:36:59] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:02] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:04] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:07] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:09] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:12] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:14] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:17] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:19] [0000000A] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
[Mar 05 21:37:22] [0000005C] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.165:32012 (own address=10.10.10.164:32012)
Node 2 logs:
[Mar 05 21:34:26] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:34:26] [00000009] org.jgroups.protocols.FD VERBOSE heartbeat missing from
10.10.10.164:32012 (number=0)
[Mar 05 21:34:29] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:34:29] [00000009] org.jgroups.protocols.FD VERBOSE heartbeat missing from
10.10.10.164:32012 (number=1)
[Mar 05 21:34:31] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:34:31] [00000009] org.jgroups.protocols.FD VERBOSE heartbeat missing from
10.10.10.164:32012 (number=2)
[Mar 05 21:34:32] [0002AD0D] org.jgroups.blocks.VotingAdapter VERBOSE Checking responses.
[Mar 05 21:34:32] [0002AD0D] org.jgroups.blocks.VotingAdapter VERBOSE Response from node
10.10.10.164:32012 was not received.
[Mar 05 21:34:32] [0002A9F8] org.jgroups.blocks.VotingAdapter VERBOSE Voting on decree
Vote[topic=DefaultDataSource_failover_server,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER_DSN=PC02ZONE1;TTC_SERVER=PC02ZONE1;TCP_PORT=17013;TTC_Timeout=180;,
index=1, requesterId=PC02ZONE1_PC02_1]] : VoteResult: up=2, down=0
[Mar 05 21:34:34] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:34:34] [00000009] org.jgroups.protocols.FD VERBOSE heartbeat missing from
10.10.10.164:32012 (number=3)
[Mar 05 21:34:36] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:34:36] [00000185] org.jgroups.protocols.FD VERBOSE heartbeat missing from
10.10.10.164:32012 (number=4)
[Mar 05 21:34:39] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:34:39] [00000001] org.jgroups.protocols.FD VERBOSE [10.10.10.165:32012]:
received no heartbeat ack from 10.10.10.164:32012 for 6 times (15000 milliseconds),
suspecting it
[Mar 05 21:34:39] [00000185] org.jgroups.protocols.FD VERBOSE broadcasting SUSPECT message
[suspected_mbrs=[10.10.10.164:32012]] to group
[Mar 05 21:34:41] [00000001] org.jgroups.protocols.FD VERBOSE broadcasting SUSPECT message
[suspected_mbrs=[10.10.10.164:32012]] to group
[Mar 05 21:34:44] [00000001] org.jgroups.protocols.FD VERBOSE broadcasting SUSPECT message
[suspected_mbrs=[10.10.10.164:32012]] to group
[Mar 05 21:34:46] [00000001] org.jgroups.blocks.ConnectionTable ERROR failed sending data
to 10.10.10.164:32012: java.net.SocketException: Socket closed
[Mar 05 21:34:46] [00000185] org.jgroups.protocols.FD VERBOSE broadcasting SUSPECT message
[suspected_mbrs=[10.10.10.164:32012]] to group
[Mar 05 21:34:49] [0000000B] org.jgroups.protocols.FD VERBOSE broadcasting SUSPECT message
[suspected_mbrs=[10.10.10.164:32012]] to group
[Mar 05 21:34:51] [00000001] org.jgroups.protocols.FD VERBOSE broadcasting SUSPECT message
[suspected_mbrs=[10.10.10.164:32012]] to group
[Mar 05 21:34:54] [0002AF48] org.jgroups.protocols.pbcast.FLUSH VERBOSE Suspect is
10.10.10.164:32012,completed false, flushOkSet {} flushMembers []
[Mar 05 21:34:54] [0002AF48] org.jgroups.blocks.RequestCorrelator VERBOSE
suspect=10.10.10.164:32012
[Mar 05 21:34:54] [0002AF48] org.jgroups.blocks.RequestCorrelator VERBOSE
suspect=10.10.10.164:32012
[Mar 05 21:34:54] [0002AF49] org.jgroups.blocks.VotingAdapter VERBOSE Conducting voting on
decree Vote[topic=DefaultDataSource_failover_server,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER=PC01ZONE1;TCP_PORT=17013;TTC_Server_DSN=PC01ZONE1;TTC_Timeout=120;,
index=0, requesterId=PC02ZONE1_PC02_1]], consensus type VOTE_ALL, timeout 60500
[Mar 05 21:34:54] [0002AF48] org.jgroups.blocks.RequestCorrelator VERBOSE
suspect=10.10.10.164:32012
[Mar 05 21:34:54] [0002AF48] org.jgroups.blocks.RequestCorrelator VERBOSE
suspect=10.10.10.164:32012
[Mar 05 21:34:54] [0002AF49] org.jgroups.blocks.VotingAdapter VERBOSE Calling remote
methods...
[Mar 05 21:34:54] [0002AF48] org.jgroups.blocks.RequestCorrelator VERBOSE
suspect=10.10.10.164:32012
[Mar 05 21:34:54] [0000000B] org.jgroups.protocols.FD VERBOSE broadcasting SUSPECT message
[suspected_mbrs=[10.10.10.164:32012]] to group
[Mar 05 21:34:54] [0002AF4B] org.jgroups.protocols.pbcast.GMS VERBOSE new=[],
suspected=[10.10.10.164:32012], leaving=[], new view: [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:34:54] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
Event[type=SUSPEND, arg=[10.10.10.165:32012]] at 10.10.10.165:32012. Running FLUSH...
[Mar 05 21:34:54] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Flush coordinator
10.10.10.165:32012 is starting FLUSH with participants [10.10.10.165:32012]
[Mar 05 21:34:54] [0002AF4D] org.jgroups.protocols.FD VERBOSE member is
10.10.10.164:32012
[Mar 05 21:34:54] [0002AF4D] org.jgroups.protocols.FD_SOCK VERBOSE member is
10.10.10.164:32012
[Mar 05 21:34:56] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 timed out waiting for flush responses after 2000 msec. Rejecting flush
to participants [10.10.10.165:32012]
[Mar 05 21:34:56] [0002AF65] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received ABORT_FLUSH from flush coordinator 10.10.10.165:32012, am i
flush participant=true
[Mar 05 21:34:57] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:34:58] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
Event[type=SUSPEND, arg=[10.10.10.165:32012]] at 10.10.10.165:32012. Running FLUSH...
[Mar 05 21:34:58] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Flush coordinator
10.10.10.165:32012 is starting FLUSH with participants [10.10.10.165:32012]
[Mar 05 21:34:59] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:00] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 timed out waiting for flush responses after 2000 msec. Rejecting flush
to participants [10.10.10.165:32012]
[Mar 05 21:35:00] [0002AF65] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received ABORT_FLUSH from flush coordinator 10.10.10.165:32012, am i
flush participant=true
[Mar 05 21:35:01] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
Event[type=SUSPEND, arg=[10.10.10.165:32012]] at 10.10.10.165:32012. Running FLUSH...
[Mar 05 21:35:01] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Flush coordinator
10.10.10.165:32012 is starting FLUSH with participants [10.10.10.165:32012]
[Mar 05 21:35:02] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:03] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 timed out waiting for flush responses after 2000 msec. Rejecting flush
to participants [10.10.10.165:32012]
[Mar 05 21:35:03] [0002AF65] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received ABORT_FLUSH from flush coordinator 10.10.10.165:32012, am i
flush participant=true
[Mar 05 21:35:04] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:07] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:07] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
Event[type=SUSPEND, arg=[10.10.10.165:32012]] at 10.10.10.165:32012. Running FLUSH...
[Mar 05 21:35:07] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Flush coordinator
10.10.10.165:32012 is starting FLUSH with participants [10.10.10.165:32012]
[Mar 05 21:35:09] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:09] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 timed out waiting for flush responses after 2000 msec. Rejecting flush
to participants [10.10.10.165:32012]
[Mar 05 21:35:09] [0002AF65] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received ABORT_FLUSH from flush coordinator 10.10.10.165:32012, am i
flush participant=true
[Mar 05 21:35:11] [0002AF4B] org.jgroups.protocols.pbcast.GMS WARN GMS flush by
coordinator at 10.10.10.165:32012 failed
[Mar 05 21:35:12] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:13] [0002AF4B] org.jgroups.protocols.pbcast.GMS WARN 10.10.10.165:32012
failed to collect all ACKs (1) for mcasted view [10.10.10.165:32012|6]
[10.10.10.165:32012] after 2000ms, missing ACKs from [10.10.10.165:32012],
local_addr=10.10.10.165:32012
[Mar 05 21:35:13] [0002AF4B] org.jgroups.protocols.pbcast.GMS VERBOSE 10.10.10.165:32012
sending RESUME event
[Mar 05 21:35:13] [0002AF4B] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received RESUME at
10.10.10.165:32012, sent STOP_FLUSH to all
[Mar 05 21:35:14] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:17] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:19] [0000000B] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:22] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:24] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:27] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:29] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:32] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:34] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:37] [00000001] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:39] [0000000B] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:42] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:44] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:47] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:49] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:52] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:54] [00000185] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:54] [0002AF49] org.jgroups.blocks.VotingAdapter VERBOSE Checking responses.
[Mar 05 21:35:54] [0002AF49] org.jgroups.blocks.VotingAdapter VERBOSE Response from node
10.10.10.164:32012 was not received.
[Mar 05 21:35:54] [00029C9D] org.jgroups.blocks.VotingAdapter VERBOSE Voting on decree
Vote[topic=DefaultDataSource_failover_server,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER=PC01ZONE1;TCP_PORT=17013;TTC_Server_DSN=PC01ZONE1;TTC_Timeout=120;,
index=0, requesterId=PC02ZONE1_PC02_1]] : VoteResult: up=2, down=0
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
START_FLUSH at 10.10.10.165:32012 responded with FLUSH_COMPLETED to 10.10.10.165:32012
[Mar 05 21:35:57] [0002B0DC] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 FLUSH_COMPLETED from 10.10.10.165:32012,completed true,flushMembers
[10.10.10.165:32012],flushCompleted [10.10.10.165:32012]
[Mar 05 21:35:57] [0002B0DC] org.jgroups.protocols.pbcast.FLUSH VERBOSE All
FLUSH_COMPLETED received at 10.10.10.165:32012
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
START_FLUSH at 10.10.10.165:32012 responded with FLUSH_NOT_COMPLETED to
10.10.10.165:32012
[Mar 05 21:35:57] [0002B0DC] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received FLUSH_NOT_COMPLETED from 10.10.10.165:32012
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
START_FLUSH at 10.10.10.165:32012 responded with FLUSH_NOT_COMPLETED to
10.10.10.165:32012
[Mar 05 21:35:57] [0002B0DD] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received FLUSH_NOT_COMPLETED from 10.10.10.165:32012
[Mar 05 21:35:57] [0002B0DC] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received FLUSH_NOT_COMPLETED from 10.10.10.165:32012 collision=false
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.FLUSH VERBOSE Received
START_FLUSH at 10.10.10.165:32012 responded with FLUSH_NOT_COMPLETED to
10.10.10.165:32012
[Mar 05 21:35:57] [0002B0DD] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received FLUSH_NOT_COMPLETED from 10.10.10.165:32012 collision=false
[Mar 05 21:35:57] [0002B0DE] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received FLUSH_NOT_COMPLETED from 10.10.10.165:32012
[Mar 05 21:35:57] [0002B0DE] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received FLUSH_NOT_COMPLETED from 10.10.10.165:32012 collision=false
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.GMS VERBOSE
view=[10.10.10.165:32012|6] [10.10.10.165:32012]
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.GMS VERBOSE
[local_addr=10.10.10.165:32012] view is [10.10.10.165:32012|6] [10.10.10.165:32012]
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.NAKACK VERBOSE removing
10.10.10.164:32012 from xmit_table (not member anymore)
[Mar 05 21:35:57] [00000001] org.jgroups.protocols.FD_SOCK VERBOSE VIEW_CHANGE received:
[10.10.10.165:32012]
[Mar 05 21:35:57] [00000009] org.jgroups.protocols.FD VERBOSE sending are-you-alive msg to
10.10.10.164:32012 (own address=10.10.10.165:32012)
[Mar 05 21:35:57] [00002A22] org.jgroups.protocols.FD_SOCK VERBOSE socket to null was
reset
[Mar 05 21:35:57] [00002A22] org.jgroups.protocols.FD_SOCK VERBOSE pinger thread
terminated
[Mar 05 21:35:57] [00000009] org.jgroups.blocks.ConnectionTable ERROR exception is
java.io.InterruptedIOException
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.FLUSH VERBOSE Installing view at
10.10.10.165:32012 view is [10.10.10.165:32012|6] [10.10.10.165:32012]
[Mar 05 21:35:57] [0002B0DD] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:35:57] [0002B0DE] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:35:57] [0002B0DD] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:35:57] [0002AF49] org.jgroups.protocols.pbcast.FLUSH VERBOSE At
10.10.10.165:32012 received STOP_FLUSH, unblocking FLUSH.down() and sending UNBLOCK up
[Mar 05 21:35:58] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:00] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:00] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:01] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002B0DD] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002B0DE] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF49] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002B13F] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002B0DE] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:02] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:03] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [00029C9D] org.jgroups.blocks.VotingAdapter VERBOSE Conducting voting on
decree Vote[topic=DefaultDataSource_failover_server,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER_DSN=PC02ZONE1;TTC_SERVER=PC02ZONE1;TCP_PORT=17013;TTC_Timeout=180;,
index=1, requesterId=PC02ZONE1_PC02_1]], consensus type VOTE_ALL, timeout 60500
[Mar 05 21:36:04] [00029C9D] org.jgroups.blocks.VotingAdapter VERBOSE Calling remote
methods...
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:04] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:05] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:05] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:05] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:05] [0002A120] org.jgroups.blocks.VotingAdapter VERBOSE Voting on decree
Vote[topic=DefaultDataSource_failover_server,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER_DSN=PC02ZONE1;TTC_SERVER=PC02ZONE1;TCP_PORT=17013;TTC_Timeout=180;,
index=1, requesterId=PC02ZONE1_PC02_1]] : VoteResult: up=2, down=0
[Mar 05 21:36:05] [00029C9D] org.jgroups.blocks.VotingAdapter VERBOSE Checking responses.
[Mar 05 21:36:05] [00029C9D] org.jgroups.blocks.VotingAdapter VERBOSE Conducting voting on
decree Vote[topic=DefaultDataSource_trigger_server_failover,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER_DSN=PC02ZONE1;TTC_SERVER=PC02ZONE1;TCP_PORT=17013;TTC_Timeout=180;,
index=1, requesterId=PC02ZONE1_PC02_1]], consensus type VOTE_ALL, timeout 60500
[Mar 05 21:36:05] [00029C9D] org.jgroups.blocks.VotingAdapter VERBOSE Calling remote
methods...
[Mar 05 21:36:05] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:05] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:06] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:06] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:07] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:09] [0002A120] org.jgroups.blocks.VotingAdapter VERBOSE Voting on decree
Vote[topic=DefaultDataSource_trigger_server_failover,
data=ConsensualTargetServerVote[targetName=DefaultDataSource,
url=jdbc:timesten:client:TTC_SERVER_DSN=PC02ZONE1;TTC_SERVER=PC02ZONE1;TCP_PORT=17013;TTC_Timeout=180;,
index=1, requesterId=PC02ZONE1_PC02_1]] : VoteResult: up=2, down=0
[Mar 05 21:36:09] [00029C9D] org.jgroups.blocks.VotingAdapter VERBOSE Checking responses.
[Mar 05 21:36:09] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:09] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:10] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:10] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:11] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:11] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
[Mar 05 21:36:11] [0002AF43] org.jgroups.protocols.pbcast.NAKACK WARN 10.10.10.165:32012]
discarded message from non-member 10.10.10.164:32012, my view is [10.10.10.165:32012|6]
[10.10.10.165:32012]
Steps to Reproduce:
1. Start two apps and let them form a cluster
2. Run this on one of the nodes: pstop {PID} ; sleep 35 ; prun {PID}
3. Watch jgroups logs
The sleep duration must exceed the time it takes for a node to be suspected PLUS the time
it stays on the SUSPECT list. In our case each is 15 seconds, so the pause has to be longer
than 30 seconds.
was:
1. Start two apps and let them form a cluster
2. Run this on one of the nodes: pstop {PID} ; sleep 35 ; prun {PID}
3. Watch jgroups logs
The sleep timeout has to be set to exceed the time it takes to have a node suspected PLUS
the time on the SUSPECT list. In our case is 15 seconds each
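The threshold for the pause can be worked out from the figures in this report. The sketch below is only an illustration: it assumes the time to suspicion is the ping interval (2.5 seconds) times the number of lost pings (6), and that the time on the SUSPECT list is a fixed 15 seconds. The variable and function names are made up for this example and are not JGroups configuration properties.

```python
# Figures taken from this report; the names are illustrative only.
PING_INTERVAL_S = 2.5   # expected interval between are-you-alive pings
MAX_LOST_PINGS = 6      # lost pings before Node 2 suspects Node 1
SUSPECT_LIST_S = 15.0   # time on the SUSPECT list before removal from the view


def min_pause_seconds(ping_interval_s, max_lost_pings, suspect_list_s):
    """Return the pause length above which the stopped node gets excluded."""
    time_to_suspect = ping_interval_s * max_lost_pings
    return time_to_suspect + suspect_list_s


threshold = min_pause_seconds(PING_INTERVAL_S, MAX_LOST_PINGS, SUSPECT_LIST_S)
print(f"pause must exceed {threshold:.1f} s")  # prints "pause must exceed 30.0 s"
```

With these numbers the 35-second sleep used in step 2 is above the threshold, which matches the observed exclusion of the stopped node.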
Node does not re-join the cluster after several lost pings
----------------------------------------------------------
Key: JGRP-1299
URL: https://issues.jboss.org/browse/JGRP-1299
Project: JGroups
Issue Type: Bug
Affects Versions: 2.6.15
Environment: Solaris OS 10 & Java 1.5 & 1.6
Reporter: Igor M
Assignee: Bela Ban
Priority: Critical
This is what we see in production:
1. Node 1 does not send pings for 25 seconds
2. Node 2 notices 6 lost pings (in 15 seconds)
3. Node 2 starts sending "broadcast SUSPECT"
4. Node 1 replies to a few of them
5. Node 2 does not receive the replies until more than 15 seconds after it suspected Node 1
6. Node 2 removes Node 1 from the view
7. Node 1 keeps sending "are-you-alive" and Node 2 is now discarding them
At this point Node 1 believes there are two nodes in the cluster, and Node 2 only sees
itself.
In the lab we were able to reproduce the problem by stopping Node 1 process:
pstop {PID} ; sleep 35 ; prun {PID}
Once the process is resumed it can never join the cluster.
Here is the log snippet from Node 1. The first two lines show a 26-second interval between
pings, while it should have been 2.5 seconds. The Node 2 logs for the same time interval
follow the Node 1 logs.
I traced the 26-second delay to a GC cycle on Node 1. pstop/sleep/prun has almost the
same effect.
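A gap like the 26-second one can be found in a log mechanically. The following is a rough sketch, not part of JGroups: it assumes each log line starts with a timestamp in the "[Mar 05 21:34:20]" form seen in these logs and that ping lines contain the text "sending are-you-alive". The function name, the dummy year and the "more than twice the expected interval" cut-off are assumptions made for this example.

```python
import re
from datetime import datetime

# Matches the leading "[Mar 05 21:34:20]" timestamp of a log line.
STAMP = re.compile(r"^\[(\w{3} \d{2} \d{2}:\d{2}:\d{2})\]")


def ping_gaps(lines, expected_s=2.5, marker="sending are-you-alive"):
    """Yield (previous, current, gap_seconds) for suspiciously long ping gaps.

    A gap is reported when it is more than twice expected_s (an arbitrary
    cut-off chosen for this sketch).
    """
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match or marker not in line:
            continue
        # The log has no year, so a dummy year is prepended (assumption);
        # gaps across a year boundary are not handled.
        current = datetime.strptime("2000 " + match.group(1), "%Y %b %d %H:%M:%S")
        if previous is not None:
            gap = (current - previous).total_seconds()
            if gap > 2 * expected_s:
                yield previous, current, gap
        previous = current
```

Passing the lines of the Node 1 log to ping_gaps should report each pair of consecutive pings that are more than 5 seconds apart, together with the gap length in seconds.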