[infinispan-issues] [JBoss JIRA] (ISPN-7489) org.jgroups.protocols.TCP emits errors when node leaves the cluster
Sebastian Łaskawiec (JIRA)
issues at jboss.org
Mon Feb 27 05:24:00 EST 2017
[ https://issues.jboss.org/browse/ISPN-7489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13369482#comment-13369482 ]
Sebastian Łaskawiec commented on ISPN-7489:
-------------------------------------------
With Infinispan {{9.0.0.CR2}} it's much worse. The messages keep appearing in a loop:
{code}
[transactions-repository-1-44qbv] 10:14:46,375 INFO [org.infinispan.CLUSTER] (transport-thread--p4-t25) ISPN000310: Starting cluster-wide rebalance for cache transactions, topology CacheTopology{id=7, rebalanceId=3, currentCH=DefaultConsistentHash{ns=20, owners = (2)[transactions-repository-1-p3ghx: 10+5, transactions-repository-1-44qbv: 10+4]}, pendingCH=DefaultConsistentHash{ns=20, owners = (3)[transactions-repository-1-p3ghx: 6+6, transactions-repository-1-44qbv: 7+6, transactions-repository-2-3v86c: 7+8]}, unionCH=null, actualMembers=[transactions-repository-1-p3ghx, transactions-repository-1-44qbv, transactions-repository-2-3v86c], persistentUUIDs=[8719fa74-ec0d-4b0d-a3c8-8d9d996b13b9, f0baa91e-a685-4483-85a7-ff58c1137705, 7d231114-72e4-485b-a697-3fac399bc1dc]}
[transactions-repository-1-44qbv] 10:14:46,394 INFO [org.infinispan.CLUSTER] (transport-thread--p4-t25) [Context=transactions][Context=transactions-repository-1-44qbv]ISPN100002: Started local rebalance
[transactions-repository-1-d6j5g] *** JBossAS process (81) received TERM signal ***
[transactions-repository-1-p3ghx] 10:14:47,644 INFO [org.jboss.as.protocol] (management I/O-2) WFLYPRT0057: cancelled task by interrupting thread Thread[management-handler-thread - 1,5,management-handler-thread]
[transactions-repository-1-44qbv] 10:14:47,725 ERROR [org.jgroups.protocols.TCP] (jgroups-21,transactions-repository-1-44qbv) JGRP000029: transactions-repository-1-44qbv: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=4243, TP: [cluster_name=cluster]
[transactions-repository-1-p3ghx] 10:14:47,924 ERROR [org.jgroups.protocols.TCP] (jgroups-19,transactions-repository-1-p3ghx) JGRP000029: transactions-repository-1-p3ghx: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=3647, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:14:47,934 ERROR [org.jgroups.protocols.TCP] (jgroups-17,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-d6j5g (70 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=35, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] [GC (Allocation Failure) 342729K->59071K(1013632K), 0.0709969 secs]
[transactions-repository-1-44qbv] [GC (Allocation Failure) 627366K->340840K(1013632K), 0.0618653 secs]
[transactions-repository-1-44qbv] 10:14:48,569 ERROR [org.jgroups.protocols.TCP] (jgroups-21,transactions-repository-1-44qbv) JGRP000029: transactions-repository-1-44qbv: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=4243, TP: [cluster_name=cluster]
[transactions-repository-1-p3ghx] 10:14:48,726 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-1-p3ghx) JGRP000029: transactions-repository-1-p3ghx: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=3647, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:14:48,741 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-d6j5g (70 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=35, TP: [cluster_name=cluster]
[transactions-repository-1-44qbv] 10:14:49,371 ERROR [org.jgroups.protocols.TCP] (jgroups-4,transactions-repository-1-44qbv) JGRP000029: transactions-repository-1-44qbv: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=4243, TP: [cluster_name=cluster]
[transactions-repository-1-p3ghx] [GC (Allocation Failure) 630424K->385468K(1013632K), 0.1033192 secs]
[transactions-repository-1-p3ghx] 10:14:49,528 ERROR [org.jgroups.protocols.TCP] (jgroups-26,transactions-repository-1-p3ghx) JGRP000029: transactions-repository-1-p3ghx: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=3647, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:14:49,557 ERROR [org.jgroups.protocols.TCP] (jgroups-7,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-d6j5g (70 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=35, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] [GC (Allocation Failure) 338687K->59794K(1013632K), 0.0479196 secs]
[transactions-repository-1-44qbv] 10:14:50,178 ERROR [org.jgroups.protocols.TCP] (jgroups-3,transactions-repository-1-44qbv) JGRP000029: transactions-repository-1-44qbv: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=4243, TP: [cluster_name=cluster]
[transactions-repository-1-44qbv] [GC (Allocation Failure) 620456K->345034K(1013632K), 0.0384319 secs]
[transactions-repository-1-p3ghx] 10:14:50,336 ERROR [org.jgroups.protocols.TCP] (jgroups-7,transactions-repository-1-p3ghx) JGRP000029: transactions-repository-1-p3ghx: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=3647, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:14:50,363 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-d6j5g (70 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=35, TP: [cluster_name=cluster]
...
[transactions-repository-2-3v86c] 10:16:22,972 ERROR [org.jgroups.protocols.TCP] (jgroups-15,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-p3ghx (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=9006, conn_id=1, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:16:23,772 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-44qbv (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=4222, conn_id=3, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:16:24,073 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-p3ghx (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=9006, conn_id=1, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:16:24,875 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-44qbv (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=4222, conn_id=3, TP: [cluster_name=cluster]
[transactions-repository-2-3v86c] 10:16:25,176 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000029: transactions-repository-2-3v86c: failed sending message to transactions-repository-1-p3ghx (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=9006, conn_id=1, TP: [cluster_name=cluster]
{code}
Full logs: https://gist.github.com/slaskawi/b016250a867134e667502d3690a3eea1
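To see which nodes keep retrying against which targets, the JGRP000029 lines can be tallied per sender/target pair. The following is only a minimal sketch: the regular expression is an assumption derived from the lines quoted above (not from any documented JGroups log format), the function name {{count_send_failures}} is hypothetical, and the sample lines are copies of the log above, truncated after the exception message.

```python
import re
from collections import Counter

# Assumed pattern, based only on the JGRP000029 lines quoted in this comment.
SEND_FAILURE = re.compile(
    r"JGRP000029: (?P<sender>\S+): failed sending message to (?P<target>\S+) \("
)


def count_send_failures(lines):
    """Return a Counter keyed by (sender, target) for JGRP000029 lines."""
    counts = Counter()
    for line in lines:
        match = SEND_FAILURE.search(line)
        if match:
            counts[(match.group("sender"), match.group("target"))] += 1
    return counts


# Lines copied from the log above, truncated after the exception message.
sample = [
    "[transactions-repository-1-44qbv] 10:14:47,725 ERROR [org.jgroups.protocols.TCP] (jgroups-21,transactions-repository-1-44qbv) JGRP000029: transactions-repository-1-44qbv: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out",
    "[transactions-repository-1-p3ghx] 10:14:47,924 ERROR [org.jgroups.protocols.TCP] (jgroups-19,transactions-repository-1-p3ghx) JGRP000029: transactions-repository-1-p3ghx: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out",
    "[transactions-repository-2-3v86c] [GC (Allocation Failure) 342729K->59071K(1013632K), 0.0709969 secs]",
    "[transactions-repository-1-44qbv] 10:14:48,569 ERROR [org.jgroups.protocols.TCP] (jgroups-21,transactions-repository-1-44qbv) JGRP000029: transactions-repository-1-44qbv: failed sending message to transactions-repository-1-d6j5g (71 bytes): java.net.SocketTimeoutException: connect timed out",
]

failures = count_send_failures(sample)
for (sender, target), n in failures.most_common():
    print(f"{n:4d}  {sender} -> {target}")
```

Run against the full gist, a tally like this should make it easy to tell whether every surviving node is retrying against the terminated pod or only some of them.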
After a while the messages become even scarier:
{code}
10:18:40,778 WARN [org.jgroups.protocols.TCP] (jgroups-15,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-44qbv, dropping message
10:18:40,778 WARN [org.jgroups.protocols.TCP] (jgroups-15,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-p3ghx, dropping message
10:18:40,778 WARN [org.jgroups.protocols.TCP] (jgroups-15,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-d6j5g, dropping message
10:18:42,779 WARN [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-44qbv, dropping message
10:18:42,779 WARN [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-p3ghx, dropping message
10:18:42,779 WARN [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-d6j5g, dropping message
10:18:44,780 WARN [org.jgroups.protocols.TCP] (jgroups-17,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-44qbv, dropping message
10:18:44,780 WARN [org.jgroups.protocols.TCP] (jgroups-17,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-p3ghx, dropping message
10:18:44,780 WARN [org.jgroups.protocols.TCP] (jgroups-17,transactions-repository-2-3v86c) JGRP000032: transactions-repository-2-3v86c: no physical address for transactions-repository-1-d6j5g, dropping message
{code}
Eventually, at {{10:20:04,819}}, they stop. They first appeared at {{10:14:47,934}}, so they lasted about 5 minutes.
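For reference, the exact interval between those two timestamps can be checked with a few lines of Python. This is just a sketch of the arithmetic; it assumes both timestamps fall on the same day, and the helper name {{elapsed_seconds}} is made up for illustration.

```python
from datetime import datetime

# The log timestamps use a comma before the milliseconds.
FORMAT = "%H:%M:%S,%f"


def elapsed_seconds(first, last):
    """Seconds between two same-day log timestamps such as '10:14:47,934'."""
    delta = datetime.strptime(last, FORMAT) - datetime.strptime(first, FORMAT)
    return delta.total_seconds()


duration = elapsed_seconds("10:14:47,934", "10:20:04,819")
print(f"{duration:.3f} s")  # prints 316.885 s
```

That is 5 minutes and roughly 17 seconds between the first error on that node and the point where the messages stop.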
> org.jgroups.protocols.TCP emits errors when node leaves the cluster
> -------------------------------------------------------------------
>
> Key: ISPN-7489
> URL: https://issues.jboss.org/browse/ISPN-7489
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations, Core
> Affects Versions: 9.0.0.CR1
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Custom Infinispan Server build (based on [these instructions|https://github.com/slaskawi/infinispan-1/tree/custom_image]). SHA1 {{2b0731b21649a88a75ed71d21b9cc06ba365e947}}
> Reporter: Sebastian Łaskawiec
>
> When I was performing the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/] I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-04x09] 18:09:12,193 ERROR [org.jgroups.protocols.TCP] (jgroups-30,transactions-repository-1-04x09) JGRP000029: transactions-repository-1-04x09: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=5262, TP: [cluster_name=cluster]
> [transactions-repository-1-1f8dx] 18:09:12,310 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-1-1f8dx) JGRP000029: transactions-repository-1-1f8dx: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=6259, TP: [cluster_name=cluster]
> [transactions-repository-1-04x09] 18:09:12,997 ERROR [org.jgroups.protocols.TCP] (jgroups-22,transactions-repository-1-04x09) JGRP000029: transactions-repository-1-04x09: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=5262, TP: [cluster_name=cluster]
> [transactions-repository-1-1f8dx] 18:09:13,113 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-1-1f8dx) JGRP000029: transactions-repository-1-1f8dx: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=6259, TP: [cluster_name=cluster]
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/530241bb695f1f490bcb25eabaf9d676
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Do the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)