July 2018 - jboss-jira - Jboss List Archives

[JBoss JIRA] (WFLY-10755) ISPN000208: No live owners found for segments

by tommaso borgato (JIRA)

[ https://issues.jboss.org/browse/WFLY-10755?page=com.atlassian.jira.plugin... ]

tommaso borgato updated WFLY-10755:
-----------------------------------
Description:
This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...].
The scenario consists of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database:
{noformat}
<cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan">
    <transport lock-timeout="60000"/>
    <distributed-cache owners="2" name="dist">
        <locking isolation="REPEATABLE_READ"/>
        <transaction mode="BATCH"/>
        <file-store/>
    </distributed-cache>
    <replicated-cache name="repl">
        <locking isolation="REPEATABLE_READ"/>
        <transaction mode="BATCH"/>
        <file-store/>
    </replicated-cache>
    <invalidation-cache name="offload">
        <locking isolation="REPEATABLE_READ"/>
        <transaction mode="BATCH"/>
        <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES">
            <table prefix="s">
                <id-column name="id" type="VARCHAR(255)"/>
                <data-column name="datum" type="BYTEA"/>
                <timestamp-column name="version" type="BIGINT"/>
            </table>
        </jdbc-store>
    </invalidation-cache>
</cache-container>
{noformat}

h2. First run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4 run 20|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...]
The error is observed on node dev212:
right after node dev214 left the cluster:
{noformat}
[JBossINF] 09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215]
[JBossINF] 09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster
[JBossINF] 09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215
[JBossINF]     at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
[JBossINF]     at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
[JBossINF]     at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
[JBossINF]     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[JBossINF]     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
[JBossINF]     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
[JBossINF]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[JBossINF]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[JBossINF]     at java.lang.Thread.run(Thread.java:748)
[JBossINF]
...
[JBossINF] 09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev214]
{noformat}
right after node dev215 left the cluster:
{noformat}
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212]
{noformat}
Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying that node dev213 left the cluster at 9:11:32 looks suspicious.
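The ISPN000208 messages from the first run list the affected segments in a compact range notation, which makes them hard to compare by eye. The following is a minimal sketch, not part of the original report, that expands the notation and compares the three sets copied from the dev212 log above. The helper name `expand_segments` is hypothetical and is not an Infinispan API. On these values, the second error logged at 09:11:32 covers exactly the union of the two earlier sets; what that implies about state transfer is left open here.

```python
def expand_segments(spec: str) -> set:
    """Expand a segment list such as "{2-4 7}" into {2, 3, 4, 7}."""
    segments = set()
    for token in spec.strip("{} ").split():
        first, _, last = token.partition("-")
        # A single number has no "-", so the range collapses to one element.
        segments.update(range(int(first), int(last or first) + 1))
    return segments


# Segment lists copied verbatim from the three ISPN000208 errors on dev212.
first = expand_segments(
    "{4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251}"
)
second = expand_segments(
    "{2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230}"
)
third = expand_segments(
    "{2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 "
    "118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251}"
)

# True when the last error covers exactly the union of the two earlier sets.
print(third == first | second)
```

The same helper can be pointed at the segment lists from the other runs to check whether they overlap.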
This run already used modified JGroups timeouts:
{noformat}
<protocol type="FD_ALL">
    <property name="timeout">10000</property>
    <property name="interval">2000</property>
    <property name="timeout_check_interval">1000</property>
</protocol>
<protocol type="VERIFY_SUSPECT">
    <property name="timeout">1000</property>
</protocol>
{noformat}

h2. Second run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 18|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...]
The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified.

h2. Third run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 21|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...]
The error is also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set according to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were:
{noformat}
<FD_ALL timeout="60000"
        interval="15000"
        timeout_check_interval="5000"
/>
{noformat}
In this run, the error is observed on node dev212:
{noformat}
[JBossINF] 03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message
[JBossINF] 03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster
[JBossINF] 03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215]
[JBossINF] 03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215]
[JBossINF] 03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected
{noformat}
but the logs on dev214 show the node wasn't down; it was just restarted and logged the following:
{noformat}
[JBossINF] 03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
[JBossINF] 03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management
[JBossINF] 03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990
[JBossINF] 03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand)
2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started!
[JBossINF] 03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms)
[JBossINF] 03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms)
[JBossINF] 03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message
[JBossINF] 03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message
[JBossINF] 03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message
[JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster
[JBossINF] 03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster
[JBossINF] 03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster
[JBossINF] 03:58:02,341 INFO [org.infinispan.CLUSTER]
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster
[JBossINF] 03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster
[JBossINF] 03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster
[JBossINF] 03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] 03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster
[JBossINF] 03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster
[JBossINF] 03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message
[JBossINF] 03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message
[JBossINF] 03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message
...
{noformat}

> ISPN000208: No live owners found for segments
> ---------------------------------------------
>
> Key: WFLY-10755
> URL: https://issues.jboss.org/browse/WFLY-10755
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 14.0.0.CR1
> Reporter: tommaso borgato
> Assignee: Paul Ferraro
>
> This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...].
> The scenario consists of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database:
> {noformat}
> <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan">
>     <transport lock-timeout="60000"/>
>     <distributed-cache owners="2" name="dist">
>         <locking isolation="REPEATABLE_READ"/>
>         <transaction mode="BATCH"/>
>         <file-store/>
>     </distributed-cache>
>     <replicated-cache name="repl">
>         <locking isolation="REPEATABLE_READ"/>
>         <transaction mode="BATCH"/>
>         <file-store/>
>     </replicated-cache>
>     <invalidation-cache name="offload">
>         <locking isolation="REPEATABLE_READ"/>
>         <transaction mode="BATCH"/>
>         <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES">
>             <table prefix="s">
>                 <id-column name="id" type="VARCHAR(255)"/>
>                 <data-column name="datum" type="BYTEA"/>
>                 <timestamp-column name="version" type="BIGINT"/>
>             </table>
>         </jdbc-store>
>     </invalidation-cache>
> </cache-container>
> {noformat}
> h2. First run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4 run 20|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...]
> The error is observed on node dev212: > right after Node dev214 left the cluster: > {noformat} > [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] > [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 > [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) > [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [JBossINF] at java.lang.Thread.run(Thread.java:748) > [JBossINF] > ... > [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] > {noformat} > right after Node dev215 left the cluster: > {noformat} > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] > {noformat} > Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log saying node dev213 left the cluster at 9:11:32 looks suspicious. 
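The segment lists in the three ISPN000208 errors quoted above for the first run can be compared mechanically. The following is a minimal sketch in Python, not part of Infinispan or of the test suite; the helper name `expand_segments` is hypothetical, and the segment strings are copied from the log lines above.

```python
def expand_segments(spec: str) -> set[int]:
    """Expand a segment list such as '2-4 7 9-10' into a set of integers."""
    segments: set[int] = set()
    for token in spec.split():
        if "-" in token:
            first, last = token.split("-")
            segments.update(range(int(first), int(last) + 1))
        else:
            segments.add(int(token))
    return segments


# 09:08:52,772 - excluded owners: [dev214]
after_dev214_left = expand_segments(
    "4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251"
)
# 09:11:32,030 (first error) - excluded owners: [dev215, dev214]
after_dev215_left = expand_segments(
    "2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230"
)
# 09:11:32,030 (second error) - excluded owners: [dev215, dev214]
combined_error = expand_segments(
    "2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 "
    "118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251"
)

# The two earlier lists do not overlap ...
print(after_dev214_left.isdisjoint(after_dev215_left))
# ... and the last list is exactly their union.
print(after_dev214_left | after_dev215_left == combined_error)
```

Both checks print `True`: the last error lists exactly the segments of the two earlier errors combined. This may indicate that the segments reported at 09:08:52 still had no live owner at 09:11:32, but that interpretation is an assumption and is not confirmed by the logs.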
> This run already used modified JGroups time-outs: > {noformat} > <protocol type="FD_ALL"> > <property name="timeout">10000</property> > <property name="interval">2000</property> > <property name="timeout_check_interval">1000</property> > </protocol> > <protocol type="VERIFY_SUSPECT"> > <property name="timeout">1000</property> > </protocol> > {noformat} > h2. Second run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 18|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] > The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. > h2. Third run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 21|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] > The error is also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set according to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: > {noformat} > <FD_ALL timeout="60000" > interval="15000" > timeout_check_interval="5000" > /> > {noformat} > In this run, the error is observed on node dev212: > {noformat} > [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO 
[org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[31m03:58:02,339 ERROR 
[org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected > {noformat} > but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: > {noformat} > [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) > 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
> [JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) > [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) > [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO 
[org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > ... > {noformat} -- This message was sent by Atlassian JIRA (v7.5.0#75005)
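For reference, the two FD_ALL configurations quoted in the description above can be compared with some simple arithmetic. This is only a rough sketch: it assumes that FD_ALL suspects a member once neither a heartbeat nor data has been received from it for `timeout` ms, that this condition is checked every `timeout_check_interval` ms, and that VERIFY_SUSPECT then waits up to its own `timeout` before confirming. The class and method names are hypothetical, and the results are estimated upper bounds, not values measured in these runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FdAll:
    """FD_ALL settings in milliseconds (names mirror the XML attributes)."""
    timeout_ms: int
    interval_ms: int
    timeout_check_interval_ms: int

    def missed_heartbeats(self) -> int:
        # Heartbeat periods that fit into the timeout window.
        return self.timeout_ms // self.interval_ms

    def worst_case_suspect_ms(self) -> int:
        # Assumption: the timeout expires just after a check, so the
        # suspicion is only raised one check interval later.
        return self.timeout_ms + self.timeout_check_interval_ms


first_run = FdAll(timeout_ms=10000, interval_ms=2000, timeout_check_interval_ms=1000)
third_run = FdAll(timeout_ms=60000, interval_ms=15000, timeout_check_interval_ms=5000)
VERIFY_SUSPECT_MS = 1000  # only quoted for the first run

print(first_run.missed_heartbeats(), first_run.worst_case_suspect_ms() + VERIFY_SUSPECT_MS)
print(third_run.missed_heartbeats(), third_run.worst_case_suspect_ms())
```

Under these assumptions the first-run settings tolerate about 5 missed heartbeats and confirm a suspicion within roughly 12 s, while the third-run FD_ALL settings tolerate 4 missed heartbeats and raise a suspicion within roughly 65 s (VERIFY_SUSPECT is not quoted for that run, so it is left out).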

7 years, 11 months


[JBoss JIRA] (WFLY-10755) ISPN000208: No live owners found for segments

by tommaso borgato (JIRA)

[ https://issues.jboss.org/browse/WFLY-10755?page=com.atlassian.jira.plugin... ] tommaso borgato updated WFLY-10755: ----------------------------------- Description: h2. First run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4-20|https://...] This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. The scenario is composed of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] {noformat} right after Node dev215 left the cluster: {noformat} [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] {noformat} Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log saying node dev213 left the cluster at 9:11:32 looks suspicious. 
This run already used modified jgroups time-outs: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} h2. Second run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB-18|http...] The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. h2. Third run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB-21|http...] The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} In this run, the error is observed on node dev212: {noformat} [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, 
MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache 
clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected {noformat} but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: {noformat} [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
[JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] 
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message ... {noformat} was: h3. first run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...] This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. 
The scenario is composed of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] {noformat} right after Node dev215 left the cluster: {noformat} [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] {noformat} Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log saying node dev213 left the cluster at 9:11:32 looks suspicious. 
This run already used modified jgroups time-outs: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} In this run, the error is observed on node dev212: {noformat} [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] 
(thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev215] [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected {noformat} but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: {noformat} [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
[JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] 
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message ... {noformat} > ISPN000208: No live owners found for segments > --------------------------------------------- > > Key: WFLY-10755 > URL: https://issues.jboss.org/browse/WFLY-10755 > Project: WildFly > Issue Type: Bug > Components: Clustering > Affects Versions: 14.0.0.CR1 > Reporter: tommaso borgato > Assignee: Paul Ferraro > > h2. 
First run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4-20|https://...] > This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. > The scenario consists of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: > {noformat} > <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> > <transport lock-timeout="60000"/> > <distributed-cache owners="2" name="dist"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </distributed-cache> > <replicated-cache name="repl"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </replicated-cache> > <invalidation-cache name="offload"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> > <table prefix="s"> > <id-column name="id" type="VARCHAR(255)"/> > <data-column name="datum" type="BYTEA"/> > <timestamp-column name="version" type="BIGINT"/> > </table> > </jdbc-store> > </invalidation-cache> > </cache-container> > {noformat} > The error is observed on node dev212: > right after Node dev214 left the cluster: > {noformat} > [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] > [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 > [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) > [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [JBossINF] at java.lang.Thread.run(Thread.java:748) > [JBossINF] > ... > [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] > {noformat} > right after Node dev215 left the cluster: > {noformat} > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] > {noformat} > Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying node dev213 left the cluster at 9:11:32 looks suspicious. 
> This run already used modified jgroups time-outs: > {noformat} > <protocol type="FD_ALL"> > <property name="timeout">10000</property> > <property name="interval">2000</property> > <property name="timeout_check_interval">1000</property> > </protocol> > <protocol type="VERIFY_SUSPECT"> > <property name="timeout">1000</property> > </protocol> > {noformat} > h2. Second run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB-18|http...] > The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. > h2. Third run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB-21|http...] > The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: > {noformat} > <FD_ALL timeout="60000" > interval="15000" > timeout_check_interval="5000" > /> > {noformat} > In this run, the error is observed on node dev212: > {noformat} > [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO 
[org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 
53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected > {noformat} > but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: > {noformat} > [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) > 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
> [JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) > [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) > [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO 
[org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > ... > {noformat} -- This message was sent by Atlassian JIRA (v7.5.0#75005)
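For readers comparing the FD_ALL settings quoted in the description, the arithmetic below may help. It is only a rough sketch and rests on an assumed reading of the JGroups FD_ALL properties: `interval` is the heartbeat period, `timeout` is how long a member may stay silent before it is suspected, and `timeout_check_interval` is how often the timeouts are checked, so detection takes roughly up to `timeout + timeout_check_interval`. The class and method names are hypothetical and not part of JGroups or WildFly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FdAllSettings:
    """FD_ALL values as quoted in the issue description (milliseconds)."""
    timeout_ms: int
    interval_ms: int
    timeout_check_interval_ms: int

    def heartbeats_per_timeout(self) -> int:
        # How many heartbeat periods fit into one timeout window.
        return self.timeout_ms // self.interval_ms

    def approx_max_detection_ms(self) -> int:
        # Assumption: the checker may run just before the timeout expires,
        # so the silence is noticed at most one check interval later.
        return self.timeout_ms + self.timeout_check_interval_ms


# Values copied from the two configurations in the description.
first_run = FdAllSettings(timeout_ms=10000, interval_ms=2000,
                          timeout_check_interval_ms=1000)
third_run = FdAllSettings(timeout_ms=60000, interval_ms=15000,
                          timeout_check_interval_ms=5000)

for name, settings in (("first run", first_run), ("third run", third_run)):
    print(name,
          "heartbeats per timeout:", settings.heartbeats_per_timeout(),
          "approx. max detection (ms):", settings.approx_max_detection_ms())
```

Under these assumptions the third-run values tolerate a far longer silence in absolute time, yet the error is still reported in that run, so tuning FD_ALL alone does not appear to make it go away.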

[JBoss JIRA] (WFLY-10755) ISPN000208: No live owners found for segments

by tommaso borgato (JIRA)

[ https://issues.jboss.org/browse/WFLY-10755?page=com.atlassian.jira.plugin... ] tommaso borgato updated WFLY-10755: ----------------------------------- Description: h2. First run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4 run 20|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. The scenario consists of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the 
prepare phase. Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] {noformat} right after Node dev215 left the cluster: {noformat} [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] {noformat} Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying node dev213 left the cluster at 9:11:32 looks suspicious. 
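A side observation on the ISPN000208 entries from the first run: the larger segment set logged at 09:11:32 appears to be the union of the set logged at 09:08:52 (excluded owner dev214) and the smaller set logged at 09:11:32. The sketch below checks this. It assumes the segment notation is a space-separated list of integers and inclusive `a-b` ranges; `expand_segments` is a hypothetical helper written for this check, not an Infinispan API.

```python
def expand_segments(spec):
    """Expand a segment list such as '{2-4 7 9-10}' into a set of ints.

    Assumption: tokens are either single integers or inclusive 'lo-hi' ranges.
    """
    segments = set()
    for token in spec.strip("{} ").split():
        if "-" in token:
            lo, hi = token.split("-")
            segments.update(range(int(lo), int(hi) + 1))
        else:
            segments.add(int(token))
    return segments


# Segment lists copied from the three ISPN000208 log entries of the first run.
at_090852 = expand_segments(
    "{4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251}")
at_091132_small = expand_segments(
    "{2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230}")
at_091132_large = expand_segments(
    "{2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 "
    "118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251}")

# True if the large set is exactly the two smaller sets combined.
print(at_090852 | at_091132_small == at_091132_large)
# True if the two smaller sets share no segment.
print(at_090852.isdisjoint(at_091132_small))
```

If both checks hold, the segments that lost their owner when dev214 left had still not found a live owner by the time dev215 was excluded as well.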
This run already used modified jgroups time-outs: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} h2. Second run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 18|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. h2. Third run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 21|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} In this run, the error is observed on node dev212: {noformat} [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) 
ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No 
live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected {noformat} but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: {noformat} [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
[JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] 
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message ... {noformat} was: h2. First run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4-20|https://...] This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. 
The scenario is composed of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] {noformat} right after Node dev215 left the cluster: {noformat} [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] {noformat} Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying node dev213 left the cluster at 9:11:32 looks suspicious. 
This run already used modified JGroups timeouts: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} h2. Second run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB-18|http...] The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. h2. Third run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB-21|http...] The error is also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set according to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} In this run, the error is observed on node dev212: {noformat} [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, 
MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache 
clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected {noformat} but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: {noformat} [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
[JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] 
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message ... {noformat} > ISPN000208: No live owners found for segments > --------------------------------------------- > > Key: WFLY-10755 > URL: https://issues.jboss.org/browse/WFLY-10755 > Project: WildFly > Issue Type: Bug > Components: Clustering > Affects Versions: 14.0.0.CR1 > Reporter: tommaso borgato > Assignee: Paul Ferraro > > h2. 
First run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4 run 20|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] > This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. > The scenario is composed of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: > {noformat} > <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> > <transport lock-timeout="60000"/> > <distributed-cache owners="2" name="dist"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </distributed-cache> > <replicated-cache name="repl"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </replicated-cache> > <invalidation-cache name="offload"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> > <table prefix="s"> > <id-column name="id" type="VARCHAR(255)"/> > <data-column name="datum" type="BYTEA"/> > <timestamp-column name="version" type="BIGINT"/> > </table> > </jdbc-store> > </invalidation-cache> > </cache-container> > {noformat} > The error is observed on node dev212: > right after Node dev214 left the cluster: > {noformat} > [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] > [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 > [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) > [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [JBossINF] at java.lang.Thread.run(Thread.java:748) > [JBossINF] > ... > [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] > {noformat} > right after Node dev215 left the cluster: > {noformat} > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] > {noformat} > Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying node dev213 left the cluster at 9:11:32 looks suspicious. 
> This run already used modified JGroups timeouts: > {noformat} > <protocol type="FD_ALL"> > <property name="timeout">10000</property> > <property name="interval">2000</property> > <property name="timeout_check_interval">1000</property> > </protocol> > <protocol type="VERIFY_SUSPECT"> > <property name="timeout">1000</property> > </protocol> > {noformat} > h2. Second run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 18|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] > The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. > h2. Third run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4_JJB run 21|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EA...] > The error is also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set according to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: > {noformat} > <FD_ALL timeout="60000" > interval="15000" > timeout_check_interval="5000" > /> > {noformat} > In this run, the error is observed on node dev212: > {noformat} > [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO 
[org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[31m03:58:02,339 ERROR 
[org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected > {noformat} > but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: > {noformat} > [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) > 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
> [JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) > [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) > [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO 
[org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > ... > {noformat} -- This message was sent by Atlassian JIRA (v7.5.0#75005)
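When comparing the ISPN000208 entries quoted above, it can help to expand the segment ranges into plain sets. The following is a minimal sketch, not part of the original report: the helper names `expand_segments` and `parse_line` are hypothetical, and the regular expression assumes the message layout exactly as it appears in these logs. Applied to the first-run entries, it shows that the second segment set logged at 09:11:32 is the union of the set logged at 09:08:52 and the first set logged at 09:11:32; the sketch only checks that relationship and says nothing about its cause.

```python
import re

# Assumed layout, copied from the log lines quoted in this report.
ISPN000208 = re.compile(
    r"ISPN000208: No live owners found for segments \{(?P<segments>[^}]*)\} "
    r"of cache (?P<cache>\S+)\. Excluded owners: \[(?P<owners>[^\]]*)\]"
)


def expand_segments(spec):
    """Expand a range list such as '2-3 36 48' into a set of ints."""
    segments = set()
    for token in spec.split():
        first, _, last = token.partition("-")
        segments.update(range(int(first), int(last or first) + 1))
    return segments


def parse_line(line):
    """Return (cache, segments, excluded_owners), or None if the line does not match."""
    match = ISPN000208.search(line)
    if match is None:
        return None
    owners = [o.strip() for o in match.group("owners").split(",") if o.strip()]
    return match.group("cache"), expand_segments(match.group("segments")), owners


# Segment sets from the first run, as logged on dev212.
at_090852 = expand_segments(
    "4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251"
)
at_091132_first = expand_segments(
    "2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230"
)
at_091132_second = expand_segments(
    "2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 "
    "118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251"
)

# Compare the union of the two smaller sets with the larger one.
print((at_090852 | at_091132_first) == at_091132_second)
```

`parse_line` can be run over a whole server log to collect the affected segments and excluded owners per entry before doing this kind of comparison.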

7 years, 11 months


[JBoss JIRA] (WFLY-10755) ISPN000208: No live owners found for segments

by tommaso borgato (JIRA)

[ https://issues.jboss.org/browse/WFLY-10755?page=com.atlassian.jira.plugin... ]

tommaso borgato updated WFLY-10755:
-----------------------------------
Description:
h3. first run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]

This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...].
The scenario consists of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database:
{noformat}
<cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan">
    <transport lock-timeout="60000"/>
    <distributed-cache owners="2" name="dist">
        <locking isolation="REPEATABLE_READ"/>
        <transaction mode="BATCH"/>
        <file-store/>
    </distributed-cache>
    <replicated-cache name="repl">
        <locking isolation="REPEATABLE_READ"/>
        <transaction mode="BATCH"/>
        <file-store/>
    </replicated-cache>
    <invalidation-cache name="offload">
        <locking isolation="REPEATABLE_READ"/>
        <transaction mode="BATCH"/>
        <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES">
            <table prefix="s">
                <id-column name="id" type="VARCHAR(255)"/>
                <data-column name="datum" type="BYTEA"/>
                <timestamp-column name="version" type="BIGINT"/>
            </table>
        </jdbc-store>
    </invalidation-cache>
</cache-container>
{noformat}
The error is observed on node dev212:
right after Node dev214 left the cluster:
{noformat}
[JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215]
[JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster
[JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase.
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214]
{noformat}
right after Node dev215 left the cluster:
{noformat}
[JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster
[JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212]
{noformat}
Please note that node dev213 did not actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying that node dev213 left the cluster at 9:11:32 looks suspicious.
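The segment lists in the three ISPN000208 errors above can be compared mechanically. The following is a minimal Python sketch, not part of Infinispan or WildFly: the helper names `parse_segments` and `parse_error` are hypothetical, and the only inputs are the segment lists copied from the dev212 log excerpts. It checks whether the larger list logged at 09:11:32 is exactly the union of the list logged at 09:08:52 and the smaller list logged at 09:11:32.

```python
import re

# Captures the segment list and the excluded owners of one ISPN000208 message.
# The pattern is an assumption based only on the log lines quoted in this report.
ISPN000208 = re.compile(
    r"ISPN000208: No live owners found for segments \{([^}]*)\}.*?"
    r"Excluded owners: \[([^\]]*)\]",
    re.DOTALL,
)


def parse_segments(spec):
    """Expand a range list such as '2-4 7' into the set {2, 3, 4, 7}."""
    segments = set()
    for token in spec.split():
        first, _, last = token.partition("-")
        segments.update(range(int(first), int(last or first) + 1))
    return segments


def parse_error(line):
    """Return (segments, excluded_owners) for an ISPN000208 line, else None."""
    match = ISPN000208.search(line)
    if match is None:
        return None
    owners = [o.strip() for o in match.group(2).split(",") if o.strip()]
    return parse_segments(match.group(1)), owners


# Segment lists copied from the dev212 log excerpts.
AT_090852 = ("4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 "
             "156-157 196 205 235 251")
AT_091132_A = ("2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 "
               "176-177 179-180 204 229-230")
AT_091132_B = ("2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 "
               "108-109 118-120 126 150 156-157 172 176-177 179-180 196 "
               "204-205 229-230 235 251")

first = parse_segments(AT_090852)
second = parse_segments(AT_091132_A)
combined = parse_segments(AT_091132_B)

# True when the larger 09:11:32 list is the union of the two smaller lists.
is_union = combined == (first | second)
print(is_union, len(first), len(second), len(combined))
```

For these inputs the larger list is the union of the other two. That suggests, but does not prove, that the segments reported at 09:08:52 were still without a live owner at 09:11:32.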
This run already used modified JGroups timeouts:
{noformat}
<protocol type="FD_ALL">
    <property name="timeout">10000</property>
    <property name="interval">2000</property>
    <property name="timeout_check_interval">1000</property>
</protocol>
<protocol type="VERIFY_SUSPECT">
    <property name="timeout">1000</property>
</protocol>
{noformat}
The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified.
The error is also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous FD_ALL values were:
{noformat}
<FD_ALL timeout="60000"
        interval="15000"
        timeout_check_interval="5000"
/>
{noformat}
In this run, the error is observed on node dev212:
{noformat}
[JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message
[JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster
[JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
[JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER]
(thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev215] [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected {noformat} but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: {noformat} [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
[JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] 
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message ... {noformat} was: This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. 
The scenario is composed of 4 nodes cluster configured with an invalidation cache backed by a PostreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] {noformat} right after Node dev215 left the cluster: {noformat} [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] {noformat} Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log saying node dev213 left the cluster at 9:11:32 look suspicious. 
This run already used modified jgroups time-outs: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} In this run, the error is observed on node dev212: {noformat} [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] 
(thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev215] [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected {noformat} but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: {noformat} [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
[JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] 
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message ... {noformat} > ISPN000208: No live owners found for segments > --------------------------------------------- > > Key: WFLY-10755 > URL: https://issues.jboss.org/browse/WFLY-10755 > Project: WildFly > Issue Type: Bug > Components: Clustering > Affects Versions: 14.0.0.CR1 > Reporter: tommaso borgato > Assignee: Paul Ferraro > > h3. 
first run [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...] > This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. > The scenario is composed of 4 nodes cluster configured with an invalidation cache backed by a PostreSQL database: > {noformat} > <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> > <transport lock-timeout="60000"/> > <distributed-cache owners="2" name="dist"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </distributed-cache> > <replicated-cache name="repl"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </replicated-cache> > <invalidation-cache name="offload"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> > <table prefix="s"> > <id-column name="id" type="VARCHAR(255)"/> > <data-column name="datum" type="BYTEA"/> > <timestamp-column name="version" type="BIGINT"/> > </table> > </jdbc-store> > </invalidation-cache> > </cache-container> > {noformat} > The error is observed on node dev212: > right after Node dev214 left the cluster: > {noformat} > [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] > [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 > [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) > [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [JBossINF] at java.lang.Thread.run(Thread.java:748) > [JBossINF] > ... > [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] > {noformat} > right after Node dev215 left the cluster: > {noformat} > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster > [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] > [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] > {noformat} > Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log saying node dev213 left the cluster at 9:11:32 look suspicious. 
> This run already used modified jgroups time-outs: > {noformat} > <protocol type="FD_ALL"> > <property name="timeout">10000</property> > <property name="interval">2000</property> > <property name="timeout_check_interval">1000</property> > </protocol> > <protocol type="VERIFY_SUSPECT"> > <property name="timeout">1000</property> > </protocol> > {noformat} > The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. > The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: > {noformat} > <FD_ALL timeout="60000" > interval="15000" > timeout_check_interval="5000" > /> > {noformat} > In this run, the error is observed on node dev212: > {noformat} > [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > 
[JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev215] > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected > {noformat} > but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: > {noformat} > [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) > 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
> [JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) > [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) > [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO 
[org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > ... > {noformat} -- This message was sent by Atlassian JIRA (v7.5.0#75005)

7 years, 11 months


[JBoss JIRA] (WFLY-10755) ISPN000208: No live owners found for segments

by tommaso borgato (JIRA)

[ https://issues.jboss.org/browse/WFLY-10755?page=com.atlassian.jira.plugin... ] tommaso borgato updated WFLY-10755: ----------------------------------- Description: This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. The scenario is composed of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] {noformat} right after Node dev215 left the cluster: {noformat} [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] {noformat} Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log saying node dev213 left the cluster at 9:11:32 looks suspicious. 
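
As an illustrative aside, the segment lists in the three ISPN000208 errors above can be compared mechanically. The following is a minimal Python sketch, assuming the lists are copied verbatim from the log lines; the helper name parse_segments is hypothetical and not part of Infinispan or WildFly. It expands each list into a set of integers and checks whether the second 09:11:32 error covers the union of the two earlier ones.

```python
from typing import Set


def parse_segments(spec: str) -> Set[int]:
    """Expand a segment list such as '{2-4 7 12-13}' into a set of ints.

    Hypothetical helper for log analysis; the format is assumed from the
    ISPN000208 messages quoted in this report.
    """
    segments: Set[int] = set()
    for token in spec.strip("{} ").split():
        if "-" in token:
            # A token like '7-9' denotes the inclusive range 7, 8, 9.
            start, end = token.split("-")
            segments.update(range(int(start), int(end) + 1))
        else:
            segments.add(int(token))
    return segments


# Segment lists copied from the three ISPN000208 errors logged on dev212.
first = parse_segments(
    "{4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251}"
)
second = parse_segments(
    "{2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230}"
)
third = parse_segments(
    "{2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 "
    "118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251}"
)

# Does the last error report exactly the segments of the two earlier ones?
print((first | second) == third)
```

A set comparison like this only describes which segments are reported together; it says nothing about why the owners were excluded.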
This run already used modified jgroups time-outs: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} In this run, the error is observed on node dev212: {noformat} [JBossINF] [0m[33m03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] 
(thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev215] [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected {noformat} but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: {noformat} [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
[JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] 
(thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message ... {noformat} was: This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. 
The scenario is composed of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214] {noformat} right after Node dev215 left the cluster: {noformat} [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster [JBossINF] [0m[0m09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[31m09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214] [JBossINF] [0m[0m09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212] {noformat} Please note that node dev213 didn't actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log saying node dev213 left the cluster at 9:11:32 looks suspicious. 
This run already used modified jgroups time-outs: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. The error is also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set according to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} > ISPN000208: No live owners found for segments > --------------------------------------------- > > Key: WFLY-10755 > URL: https://issues.jboss.org/browse/WFLY-10755 > Project: WildFly > Issue Type: Bug > Components: Clustering > Affects Versions: 14.0.0.CR1 > Reporter: tommaso borgato > Assignee: Paul Ferraro > > This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. 
> The scenario is composed of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database: > {noformat} > <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> > <transport lock-timeout="60000"/> > <distributed-cache owners="2" name="dist"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </distributed-cache> > <replicated-cache name="repl"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <file-store/> > </replicated-cache> > <invalidation-cache name="offload"> > <locking isolation="REPEATABLE_READ"/> > <transaction mode="BATCH"/> > <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> > <table prefix="s"> > <id-column name="id" type="VARCHAR(255)"/> > <data-column name="datum" type="BYTEA"/> > <timestamp-column name="version" type="BIGINT"/> > </table> > </jdbc-store> > </invalidation-cache> > </cache-container> > {noformat} > The error is observed on node dev212: > right after Node dev214 left the cluster: > {noformat} > [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] > [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 > [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) > [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [JBossINF] at java.lang.Thread.run(Thread.java:748) > [JBossINF] > ... > [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214]
> {noformat}
> right after Node dev215 left the cluster:
> {noformat}
> [JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster
> [JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster
> [JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster
> [JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
> [JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
> [JBossINF] 09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212]
> {noformat}
> Please note that node dev213 did not actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying that node dev213 left the cluster at 9:11:32 looks suspicious.
> This run already used modified JGroups timeouts:
> {noformat}
> <protocol type="FD_ALL">
>     <property name="timeout">10000</property>
>     <property name="interval">2000</property>
>     <property name="timeout_check_interval">1000</property>
> </protocol>
> <protocol type="VERIFY_SUSPECT">
>     <property name="timeout">1000</property>
> </protocol>
> {noformat}
> The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified.
> The error was also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were set to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states were the previous values for FD_ALL:
> {noformat}
> <FD_ALL timeout="60000"
> interval="15000"
> timeout_check_interval="5000"
> />
> {noformat}
> In this run, the error is observed on node dev212:
> {noformat}
> [JBossINF] 03:56:59,728 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev212) JGRP000032: dev212: no physical address for 2806f77e-ee15-45dc-283d-683a4828e878, dropping message
> [JBossINF] 03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214]
> [JBossINF] 03:58:02,336 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster
> [JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster
> [JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster
> [JBossINF] 03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > 
[JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100000: Node dev214 joined the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev215 left the cluster > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-30,ejb,dev212) ISPN100001: Node dev214 left the cluster > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev215] > [JBossINF] [0m[31m03:58:02,339 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 21-26 30 46 53-54 58-59 64 69 75 82-83 88 142 150 233 236} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215] > [JBossINF] [0m[33m03:58:02,340 WARN [org.infinispan.statetransfer.InboundTransferTask] (stateTransferExecutor-thread--p20-t14) ISPN000210: Failed to request state of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar from node dev214, segments {47-48 65 87 102 157 163 187-188 190-191 221-223 228 232}: org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400: Node dev214 was suspected > {noformat} > but the logs on dev214 show the node wasn't down; it was just restarted and logged the following: > {noformat} > [JBossINF] [0m[0m03:56:14,093 INFO [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://10.16.176.60:9990/management > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://10.16.176.60:9990 > [JBossINF] [0m[0m03:56:14,095 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Full 14.0.0.Beta2-SNAPSHOT (WildFly Core 6.0.0.Alpha4) started in 8533ms - Started 1156 of 1353 services (511 services are lazy, passive or on-demand) > 2018/07/29 03:56:14:095 EDT [DEBUG][Thread-89] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - JBossStartup, server started! 
> [JBossINF] [0m[33m03:57:13,441 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 43 from non-member dev213 (view=[dev214|0] (1) [dev214]) (received 17 identical messages from dev213 in the last 61714 ms) > [JBossINF] [0m[33m03:57:15,289 WARN [org.jgroups.protocols.pbcast.NAKACK2] (thread-8,ejb,dev214) JGRP000011: dev214: dropped message 90 from non-member dev215 (view=[dev214|0] (1) [dev214]) (received 3 identical messages from dev215 in the last 61551 ms) > [JBossINF] [0m[33m03:57:57,334 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:57:59,339 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:01,342 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[0m03:58:02,337 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,338 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,339 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,340 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO 
[org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,341 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[0m03:58:02,342 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev212|10] (3) [dev212, dev213, dev214], 1 subgroups: [dev214|0] (1) [dev214] > [JBossINF] [0m[0m03:58:02,343 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev212 joined the cluster > [JBossINF] [0m[0m03:58:02,344 INFO [org.infinispan.CLUSTER] (thread-13,ejb,dev214) ISPN100000: Node dev213 joined the cluster > [JBossINF] [0m[33m03:58:03,345 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:05,347 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > [JBossINF] [0m[33m03:58:07,350 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ejb,dev214) JGRP000032: dev214: no physical address for 710670e7-7bb0-9e01-743e-abad40b595ec, dropping message > ... > {noformat} -- This message was sent by Atlassian JIRA (v7.5.0#75005)
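A note on the dev212 log from the first run quoted above: the segment list in the second 09:11:32,030 ISPN000208 entry appears to be exactly the union of the list reported at 09:08:52,772 (excluded owner dev214) and the list in the first 09:11:32,030 entry. The following is a minimal Python sketch for checking this. The helper name `parse_segments` is hypothetical and not part of Infinispan; the segment strings are copied from the log entries above.

```python
def parse_segments(text):
    """Expand a segment list such as '{4 7-9 12}' into a set of ints."""
    segments = set()
    for token in text.strip("{} ").split():
        if "-" in token:
            # A token like '7-9' is an inclusive range of segment ids.
            start, end = token.split("-")
            segments.update(range(int(start), int(end) + 1))
        else:
            segments.add(int(token))
    return segments


# 09:08:52,772, excluded owners: [dev214]
first = parse_segments(
    "{4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251}"
)
# 09:11:32,030 (first entry), excluded owners: [dev215, dev214]
second = parse_segments(
    "{2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230}"
)
# 09:11:32,030 (second entry), excluded owners: [dev215, dev214]
combined = parse_segments(
    "{2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 "
    "118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251}"
)

print(first | second == combined)  # is the last list the union of the other two?
print(sorted(first & second))      # segments reported in both earlier lists
```

The same helper can be applied to the ISPN000208 entries from the other runs to see whether the affected segments accumulate in the same way there.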

7 years, 11 months


[JBoss JIRA] (WFLY-10755) ISPN000208: No live owners found for segments

by tommaso borgato (JIRA)

[ https://issues.jboss.org/browse/WFLY-10755?page=com.atlassian.jira.plugin... ] tommaso borgato updated WFLY-10755: ----------------------------------- Description: This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. The scenario is composed of 4 nodes cluster configured with an invalidation cache backed by a PostreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214]
{noformat}
right after Node dev215 left the cluster:
{noformat}
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212]
{noformat}
Please note that node dev213 did not actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying that node dev213 left the cluster at 9:11:32 looks suspicious.
This run already used modified jgroups time-outs: {noformat} <protocol type="FD_ALL"> <property name="timeout">10000</property> <property name="interval">2000</property> <property name="timeout_check_interval">1000</property> </protocol> <protocol type="VERIFY_SUSPECT"> <property name="timeout">1000</property> </protocol> {noformat} The error was observed also in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified. The error is observed also in [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set accordingly to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were: {noformat} <FD_ALL timeout="60000" interval="15000" timeout_check_interval="5000" /> {noformat} was: This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. The scenario is composed of 4 nodes cluster configured with an invalidation cache backed by a PostreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after 
Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214]
{noformat}
right after Node dev215 left the cluster:
{noformat}
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212]
{noformat}
Please note that node dev213 did not actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying that node dev213 left the cluster at 9:11:32 looks suspicious.
This run already used modified JGroups timeouts:
{noformat}
<protocol type="FD_ALL">
    <property name="timeout">10000</property>
    <property name="interval">2000</property>
    <property name="timeout_check_interval">1000</property>
</protocol>
<protocol type="VERIFY_SUSPECT">
    <property name="timeout">1000</property>
</protocol>
{noformat}
The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified.
The error was also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were set to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states were the previous values for FD_ALL.
> ISPN000208: No live owners found for segments
> ---------------------------------------------
>
> Key: WFLY-10755
> URL: https://issues.jboss.org/browse/WFLY-10755
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 14.0.0.CR1
> Reporter: tommaso borgato
> Assignee: Paul Ferraro
>
> This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...].
> The scenario consists of a 4-node cluster configured with an invalidation cache backed by a PostgreSQL database:
> {noformat}
> <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan">
>     <transport lock-timeout="60000"/>
>     <distributed-cache owners="2" name="dist">
>         <locking isolation="REPEATABLE_READ"/>
>         <transaction mode="BATCH"/>
>         <file-store/>
>     </distributed-cache>
>     <replicated-cache name="repl">
>         <locking isolation="REPEATABLE_READ"/>
>         <transaction mode="BATCH"/>
>         <file-store/>
>     </replicated-cache>
>     <invalidation-cache name="offload">
>         <locking isolation="REPEATABLE_READ"/>
>         <transaction mode="BATCH"/>
>         <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES">
>             <table prefix="s">
>                 <id-column name="id" type="VARCHAR(255)"/>
>                 <data-column name="datum" type="BYTEA"/>
>                 <timestamp-column name="version" type="BIGINT"/>
>             </table>
>         </jdbc-store>
>     </invalidation-cache>
> </cache-container>
> {noformat}
> The error is observed on node dev212:
> right after Node dev214 left the cluster:
> {noformat}
> [JBossINF] 09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215]
> [JBossINF] 09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster
> [JBossINF] 09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase.
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 > [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) > [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) > [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [JBossINF] at java.lang.Thread.run(Thread.java:748) > [JBossINF] > ... > [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214]
> {noformat}
> right after Node dev215 left the cluster:
> {noformat}
> [JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster
> [JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster
> [JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster
> [JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
> [JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
> [JBossINF] 09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212]
> {noformat}
> Please note that node dev213 did not actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying that node dev213 left the cluster at 9:11:32 looks suspicious.
> This run already used modified JGroups timeouts:
> {noformat}
> <protocol type="FD_ALL">
>     <property name="timeout">10000</property>
>     <property name="interval">2000</property>
>     <property name="timeout_check_interval">1000</property>
> </protocol>
> <protocol type="VERIFY_SUSPECT">
>     <property name="timeout">1000</property>
> </protocol>
> {noformat}
> The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified.
> The error was also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were set to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states were the previous values for FD_ALL:
> {noformat}
> <FD_ALL timeout="60000"
> interval="15000"
> timeout_check_interval="5000"
> />
> {noformat}
--
This message was sent by Atlassian JIRA (v7.5.0#75005)
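To compare the failure-detection settings tried in these runs, here is a rough Python sketch that estimates how long a silent member might go unnoticed before it is reported as suspected. It rests on assumptions that are not stated in this report: that FD_ALL suspects a member once no heartbeat has arrived for `timeout` ms, that this condition is checked every `timeout_check_interval` ms, and that VERIFY_SUSPECT then waits up to its own `timeout` before confirming. The function name and the simple additive model are illustrative only and are not taken from JGroups.

```python
def worst_case_detection_ms(fd_timeout, fd_check_interval, verify_timeout=0):
    """Rough upper bound, in ms, before a silent member is confirmed as suspected.

    Assumed model: the member is flagged at the first periodic check after
    fd_timeout ms without a heartbeat, then VERIFY_SUSPECT waits verify_timeout ms.
    """
    return fd_timeout + fd_check_interval + verify_timeout


# Modified run: FD_ALL timeout=10000, timeout_check_interval=1000,
# VERIFY_SUSPECT timeout=1000.
modified = worst_case_detection_ms(10000, 1000, 1000)

# Values quoted from ISPN-9087: FD_ALL timeout=60000, timeout_check_interval=5000.
# No VERIFY_SUSPECT value is given for that run, so it is left out here.
previous = worst_case_detection_ms(60000, 5000)

print(modified, previous)
```

Under this model the two configurations differ by roughly a factor of five in detection time, yet the report states that ISPN000208 was observed with both.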

7 years, 11 months


[JBoss JIRA] (WFLY-10755) ISPN000208: No live owners found for segments

by tommaso borgato (JIRA)

[ https://issues.jboss.org/browse/WFLY-10755?page=com.atlassian.jira.plugin... ] tommaso borgato updated WFLY-10755: ----------------------------------- Description: This error was observed in scenario [eap-7x-db-failover-db-session-shutdown-repl-sync-postgres-9-4|https://jen...]. The scenario is composed of 4 nodes cluster configured with an invalidation cache backed by a PostreSQL database: {noformat} <cache-container name="web" default-cache="repl" module="org.wildfly.clustering.web.infinispan"> <transport lock-timeout="60000"/> <distributed-cache owners="2" name="dist"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </distributed-cache> <replicated-cache name="repl"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <file-store/> </replicated-cache> <invalidation-cache name="offload"> <locking isolation="REPEATABLE_READ"/> <transaction mode="BATCH"/> <jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="POSTGRES"> <table prefix="s"> <id-column name="id" type="VARCHAR(255)"/> <data-column name="datum" type="BYTEA"/> <timestamp-column name="version" type="BIGINT"/> </table> </jdbc-store> </invalidation-cache> </cache-container> {noformat} The error is observed on node dev212: right after Node dev214 left the cluster: {noformat} [JBossINF] [0m[0m09:08:34,196 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN000094: Received new cluster view for channel ejb: [dev212|8] (3) [dev212, dev213, dev215] [JBossINF] [0m[0m09:08:34,197 INFO [org.infinispan.CLUSTER] (thread-22,ejb,dev212) ISPN100001: Node dev214 left the cluster [JBossINF] [0m[33m09:08:34,362 WARN [org.infinispan.interceptors.impl.InvalidationInterceptor] (timeout-thread--p10-t1) ISPN000268: Unable to broadcast evicts as a part of the prepare phase. 
Rolling back.: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 33 from dev215 [JBossINF] at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) [JBossINF] at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) [JBossINF] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [JBossINF] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [JBossINF] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [JBossINF] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [JBossINF] at java.lang.Thread.run(Thread.java:748) [JBossINF] ... [JBossINF] [0m[31m09:08:52,772 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 118-120 156-157 196 205 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. 
Excluded owners: [dev214]
{noformat}
right after Node dev215 left the cluster:
{noformat}
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100000: Node dev214 joined the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev213 left the cluster
[JBossINF] 09:11:32,029 INFO [org.infinispan.CLUSTER] (thread-24,ejb,dev212) ISPN100001: Node dev215 left the cluster
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-3 36 48 55-58 65 75 90 93 108-109 126 150 172 176-177 179-180 204 229-230} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:11:32,030 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (transport-thread--p15-t15) ISPN000208: No live owners found for segments {2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 204-205 229-230 235 251} of cache clusterbench-ee7.ear/clusterbench-ee7-ejb.jar. Excluded owners: [dev215, dev214]
[JBossINF] 09:12:29,829 INFO [org.infinispan.CLUSTER] (thread-21,ejb,dev212) ISPN000093: Received new, MERGED cluster view for channel ejb: MergeView::[dev214|10] (4) [dev214, dev212, dev213, dev215], 2 subgroups: [dev212|8] (3) [dev212, dev213, dev215], [dev214|9] (2) [dev214, dev212]
{noformat}
Please note that node dev213 did not actually leave the cluster: it was started at 8:59:53 and then restarted at 9:12:29, so the log entry saying that node dev213 left the cluster at 9:11:32 looks suspicious.
This run already used modified JGroups timeouts:
{noformat}
<protocol type="FD_ALL">
    <property name="timeout">10000</property>
    <property name="interval">2000</property>
    <property name="timeout_check_interval">1000</property>
</protocol>
<protocol type="VERIFY_SUSPECT">
    <property name="timeout">1000</property>
</protocol>
{noformat}
The error was also observed in a [previous run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values were unmodified.
The error is also observed in a [run|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/E...] where those values are set according to what this [JIRA|https://issues.jboss.org/browse/ISPN-9087] states the previous values for FD_ALL were.

> ISPN000208: No live owners found for segments
> ---------------------------------------------
>
> Key: WFLY-10755
> URL: https://issues.jboss.org/browse/WFLY-10755
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 14.0.0.CR1
> Reporter: tommaso borgato
> Assignee: Paul Ferraro

-- This message was sent by Atlassian JIRA (v7.5.0#75005)
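The ISPN000208 messages in this report list segments in a compact range notation (for example 7-9). As a reading aid, here is a minimal Java sketch that expands those lists so they can be compared between log lines. It is not part of Infinispan or of the original report: the class name SegmentSets and its parse method are hypothetical, and the segment strings are copied from the log excerpts quoted in the description.

```java
import java.util.TreeSet;

// Hypothetical helper (not an Infinispan API): expands the segment lists
// printed inside the braces of ISPN000208 messages.
class SegmentSets {

    // Parses a list such as "4 7-9 12-13" into the individual segment numbers.
    static TreeSet<Integer> parse(String segments) {
        TreeSet<Integer> result = new TreeSet<>();
        for (String token : segments.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields an empty set
            }
            String[] bounds = token.split("-");
            int first = Integer.parseInt(bounds[0]);
            int last = Integer.parseInt(bounds[bounds.length - 1]);
            for (int segment = first; segment <= last; segment++) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 09:08:52,772 message, excluded owners [dev214]
        TreeSet<Integer> first = parse("4 7-9 12-13 30-31 37 49 59 76-77 88-89 92 "
                + "118-120 156-157 196 205 235 251");
        // first 09:11:32,030 message, excluded owners [dev215, dev214]
        TreeSet<Integer> second = parse("2-3 36 48 55-58 65 75 90 93 108-109 126 150 "
                + "172 176-177 179-180 204 229-230");
        // second 09:11:32,030 message, excluded owners [dev215, dev214]
        TreeSet<Integer> third = parse("2-4 7-9 12-13 30-31 36-37 48-49 55-59 65 75-77 "
                + "88-90 92-93 108-109 118-120 126 150 156-157 172 176-177 179-180 196 "
                + "204-205 229-230 235 251");

        TreeSet<Integer> union = new TreeSet<>(first);
        union.addAll(second);

        System.out.println("first=" + first.size() + " second=" + second.size()
                + " third=" + third.size());
        System.out.println("union of first and second equals third: " + union.equals(third));
    }
}
```

With the lists from the report, the first two messages cover 25 and 24 segments, and their union is exactly the 49 segments named in the last message.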

7 years, 11 months


[JBoss JIRA] (JBMETA-407) jboss-common_7_0.xsd should not include itself

by Kaz Nishimura (JIRA)

Kaz Nishimura created JBMETA-407:
------------------------------------

Summary: jboss-common_7_0.xsd should not include itself
Key: JBMETA-407
URL: https://issues.jboss.org/browse/JBMETA-407
Project: JBoss Metadata
Issue Type: Bug
Components: common
Reporter: Kaz Nishimura
Priority: Minor

The file [jboss-common_7_0.xsd|https://github.com/jboss/metadata/blob/b55f691848d9d...] contains this line:
{{<xsd:include schemaLocation="jboss-common_7_0.xsd"/>}}
It includes the file itself.

-- This message was sent by Atlassian JIRA (v7.5.0#75005)
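A self-include like the one reported here can be detected mechanically. The following is a minimal Java sketch under stated assumptions, not code from the JBoss Metadata project: the class SelfIncludeCheck and its method are hypothetical, the sample schema is a stripped-down stand-in rather than the real jboss-common_7_0.xsd, and the check only compares the schemaLocation attribute against a plain file name without resolving relative paths or URIs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reports whether a schema document contains an
// xsd:include whose schemaLocation is the schema's own file name.
class SelfIncludeCheck {

    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema";

    static boolean includesItself(String fileName, String schemaXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList includes = doc.getElementsByTagNameNS(XSD_NS, "include");
            for (int i = 0; i < includes.getLength(); i++) {
                Element include = (Element) includes.item(i);
                if (fileName.equals(include.getAttribute("schemaLocation"))) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema " + fileName, e);
        }
    }

    public static void main(String[] args) {
        // Simplified stand-in for the reported file, containing only the include line.
        String schema = "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xsd:include schemaLocation=\"jboss-common_7_0.xsd\"/>"
                + "</xsd:schema>";
        System.out.println(includesItself("jboss-common_7_0.xsd", schema));
    }
}
```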

7 years, 11 months


[JBoss JIRA] (JBMETA-406) Deprecate ReplicationConfig and remove from next schema version

by Paul Ferraro (JIRA)

[ https://issues.jboss.org/browse/JBMETA-406?page=com.atlassian.jira.plugin... ]

Paul Ferraro updated JBMETA-406:
--------------------------------
Git Pull Request: https://github.com/jboss/metadata/pull/120

> Deprecate ReplicationConfig and remove from next schema version
> ---------------------------------------------------------------
>
> Key: JBMETA-406
> URL: https://issues.jboss.org/browse/JBMETA-406
> Project: JBoss Metadata
> Issue Type: Task
> Components: web
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
>
> This configuration is planned to grow in complexity and is moving into a WildFly subsystem.
> See: https://issues.jboss.org/browse/WFLY-5550

-- This message was sent by Atlassian JIRA (v7.5.0#75005)

7 years, 11 months


[JBoss JIRA] (WFLY-10754) NullPointerException using Stateless with configured interceptors

by Luca Stancapiano (JIRA)

[ https://issues.jboss.org/browse/WFLY-10754?page=com.atlassian.jira.plugin... ]

Luca Stancapiano updated WFLY-10754:
------------------------------------

Description:
I am reporting strange behavior on WildFly 13 when configuring interceptors on a stateless session bean. Below I describe the scenario:

Here is a simple interceptor:
{code:java}
package it.vige.injection.interceptors;

import javax.interceptor.AroundInvoke;
import javax.interceptor.Interceptor;
import javax.interceptor.InvocationContext;

@Interceptor
public class OKInterceptor {

    @AroundInvoke
    public Object aroundInvoke(InvocationContext ic) throws Exception {
        return ic.proceed();
    }
}
{code}
Here is an annotation used as an interceptor binding:
{code:java}
package it.vige.injection.interceptors;

import static java.lang.annotation.ElementType.CONSTRUCTOR;
import static java.lang.annotation.ElementType.METHOD;
import static java.lang.annotation.ElementType.TYPE;
import static java.lang.annotation.RetentionPolicy.RUNTIME;

import java.lang.annotation.Retention;
import java.lang.annotation.Target;

import javax.interceptor.InterceptorBinding;

@Retention(RUNTIME)
@Target({ METHOD, TYPE, CONSTRUCTOR })
@InterceptorBinding
public @interface NotOK {
}
{code}
Here is an interceptor annotated with the interceptor binding:
{code:java}
package it.vige.injection.interceptors;

import javax.interceptor.AroundInvoke;
import javax.interceptor.Interceptor;
import javax.interceptor.InvocationContext;

@Interceptor
@NotOK
public class NotOKInterceptor {

    @AroundInvoke
    public Object aroundInvoke(InvocationContext ic) throws Exception {
        return ic.proceed();
    }
}
{code}
Here is the stateless service configured with both interceptors:
{code:java}
package it.vige.injection.interceptors;

import javax.ejb.Stateless;
import javax.interceptor.Interceptors;

@Stateless
public class SimpleService {

    @Interceptors({ OKInterceptor.class })
    public void ok() {
    }

    @NotOK
    public void notOk() {
    }
}
{code}
This service must have two methods, one attached to the simple interceptor and the other attached to the interceptor binding.
Here is the beans.xml configuration:
{code:xml}
<beans xmlns="http://xmlns.jcp.org/xml/ns/javaee"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
                           http://xmlns.jcp.org/xml/ns/javaee/beans_2_0.xsd"
       version="2.0" bean-discovery-mode="all">
    <interceptors>
        <class>it.vige.injection.interceptors.OKInterceptor</class>
        <class>it.vige.injection.interceptors.NotOKInterceptor</class>
    </interceptors>
</beans>
{code}
And finally the client that calls the service:
{code:java}
....
@Inject
private SimpleService simpleService;
...
// this call works:
simpleService.ok();
// this call throws a NullPointerException:
simpleService.notOk();
...
{code}
When I call the notOk method I get this exception:
{code:java}
javax.ejb.EJBException: java.lang.NullPointerException
    at deployment.test.war//it.vige.injection.test.InterceptorsTestCase.testNotOk(InterceptorsTestCase.java:52)
Caused by: java.lang.NullPointerException
    at deployment.test.war//it.vige.injection.test.InterceptorsTestCase.testNotOk(InterceptorsTestCase.java:52)
{code}
The same scenario was tested on WildFly 12.0.0.Final and it worked.
On WildFly 13.0.0.Final, it works if I remove the @Stateless annotation from the service.

> NullPointerException using Stateless with configured interceptors
> -----------------------------------------------------------------
>
> Key: WFLY-10754
> URL: https://issues.jboss.org/browse/WFLY-10754
> Project: WildFly
> Issue Type: Bug
> Components: CDI / Weld
> Affects Versions: 13.0.0.Final
> Environment: WildFly 13.0.0.Final and java 10.0.1
> Reporter: Luca Stancapiano
> Assignee: Matej Novotny

-- This message was sent by Atlassian JIRA (v7.5.0#75005)

7 years, 11 months

