[jboss-jira] [JBoss JIRA] (WFLY-10773) JGRP000029: failed sending message: java.io.IOException: Socket Closed

tommaso borgato (JIRA) issues at jboss.org
Tue Jul 31 08:07:00 EDT 2018


     [ https://issues.jboss.org/browse/WFLY-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

tommaso borgato updated WFLY-10773:
-----------------------------------
    Description: 
The error was observed in scenario {{*[eap-7x-db-failover-db-session-shutdown-repl-sync-mysql-5-7|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EAP7-Clustering_JJB/view/clustering-db-session-tests/job/eap-7x-db-failover-db-session-shutdown-repl-sync-mysql-5-7_JJB/22/]*}}: a 4-node cluster behind a mod_jk load balancer, where fail-over is triggered by shutting down and restarting servers.

The cluster nodes were configured to use the TCP stack for communication:

{code:xml}
<subsystem xmlns="urn:jboss:domain:jgroups:6.0" default-stack="tcp">
    <channels default="ee">
        <channel name="ee" stack="tcp" cluster="ejb"/>
    </channels>
    <stacks>
        <stack name="udp">
            <transport type="UDP" socket-binding="jgroups-udp"/>
            <protocol type="PING"/>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="UFC"/>
            <protocol type="MFC"/>
            <protocol type="FRAG3"/>
        </stack>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <socket-protocol type="MPING" socket-binding="jgroups-mping"/>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="MFC"/>
            <protocol type="FRAG3"/>
        </stack>
    </stacks>
</subsystem>
{code}
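
To double-check which protocol layers the {{ee}} channel actually runs through, the configuration can be parsed mechanically. The following is only a minimal sketch: the helper name {{stack_for_channel}} is hypothetical, and the embedded XML is an abbreviated copy of the subsystem above with the unused {{udp}} stack left out.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the subsystem element above.
NS = {"jg": "urn:jboss:domain:jgroups:6.0"}

# Abbreviated copy of the configuration above (udp stack omitted).
SUBSYSTEM_XML = """
<subsystem xmlns="urn:jboss:domain:jgroups:6.0" default-stack="tcp">
    <channels default="ee">
        <channel name="ee" stack="tcp" cluster="ejb"/>
    </channels>
    <stacks>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <socket-protocol type="MPING" socket-binding="jgroups-mping"/>
            <protocol type="MERGE3"/>
            <protocol type="FD_SOCK"/>
            <protocol type="FD_ALL"/>
            <protocol type="VERIFY_SUSPECT"/>
            <protocol type="pbcast.NAKACK2"/>
            <protocol type="UNICAST3"/>
            <protocol type="pbcast.STABLE"/>
            <protocol type="pbcast.GMS"/>
            <protocol type="MFC"/>
            <protocol type="FRAG3"/>
        </stack>
    </stacks>
</subsystem>
"""


def stack_for_channel(xml_text, channel_name):
    """Return (stack name, layer types in document order) for a channel."""
    root = ET.fromstring(xml_text)
    channel = root.find(f"jg:channels/jg:channel[@name='{channel_name}']", NS)
    stack_name = channel.get("stack")
    stack = root.find(f"jg:stacks/jg:stack[@name='{stack_name}']", NS)
    # Every child of <stack> is the transport or a protocol; keep its type.
    layers = [child.get("type") for child in stack]
    return stack_name, layers


if __name__ == "__main__":
    name, layers = stack_for_channel(SUBSYSTEM_XML, "ee")
    print(name, layers)
```

For the configuration above this resolves the {{ee}} channel to the {{tcp}} stack, whose layers include both {{TCP}} and {{UNICAST3}}.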

The 4 cluster nodes store session data in an invalidation cache backed by a MySQL database:

{code:xml}
<invalidation-cache name="offload">
	<locking isolation="REPEATABLE_READ"/>
	<transaction mode="BATCH"/>
	<jdbc-store data-source="testDS" fetch-state="false" passivation="false" purge="false" shared="true" dialect="MYSQL">
		<table prefix="s">
			<id-column name="id" type="VARCHAR(255)"/>
			<data-column name="datum" type="VARBINARY(10000)"/>
			<timestamp-column name="version" type="BIGINT"/>
		</table>
	</jdbc-store>
</invalidation-cache>
{code}

The error was observed on node {{*[dev214|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EAP7-Clustering_JJB/view/clustering-db-session-tests/job/eap-7x-db-failover-db-session-shutdown-repl-sync-mysql-5-7_JJB/22/console-dev214/]*}}; here is an attempt to isolate the events that may be relevant:

* node dev213 was shut-down and re-started but had not yet re-joined the cluster:
{noformat}
[JBossINF] 02:19:07,082 INFO  [org.infinispan.CLUSTER] (thread-21,ejb,dev214) ISPN100001: Node dev213 left the cluster
{noformat}

* current node dev214 is initiating shut-down:
{noformat}
2018/07/31 02:21:43:593 EDT [INFO ][Thread-88] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - [SHUTDOWN] JBossShutdown server host: dev214:9990
{noformat}

* then we observe the error:
{noformat}
[JBossINF] 02:21:44,588 ERROR [org.jgroups.protocols.TCP] (TQ-Bundler-30,ejb,dev214) JGRP000029: dev214: failed sending message to dev215 (59 bytes): java.io.IOException: Socket Closed, headers: UNICAST3: ACK, seqno=137, conn_id=1, ts=131, TP: [cluster_name=ejb]
{noformat}

* current node dev214 completes shut-down:
{noformat}
2018/07/31 02:21:45:459 EDT [DEBUG][RMI TCP Connection(27)-10.16.91.122] HOST dev220.mw.lab.eng.bos.redhat.com:rootProcess:test - Server is down.
{noformat}
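
The ordering of the events above can be made explicit by computing each timestamp's offset from the shut-down request. This is a rough sketch only: it assumes the server log and the test-harness log share the same clock and time zone, the timestamps are copied by hand from the excerpts and normalised to {{HH:MM:SS.mmm}}, and the event labels and the helper {{offsets_from}} are hypothetical names.

```python
from datetime import datetime

# Timestamps copied from the log excerpts above (all on 2018-07-31).
# Assumption: both logs use the same clock and time zone.
EVENTS = [
    ("dev213 left the cluster", "02:19:07.082"),
    ("dev214 shut-down requested", "02:21:43.593"),
    ("JGRP000029 logged on dev214", "02:21:44.588"),
    ("dev214 reported down", "02:21:45.459"),
]


def parse(ts):
    """Parse an HH:MM:SS.mmm timestamp (the date part is irrelevant here)."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


def offsets_from(events, reference_label):
    """Return seconds between each event and the reference event."""
    reference = parse(dict(events)[reference_label])
    return {
        label: (parse(ts) - reference).total_seconds()
        for label, ts in events
    }


if __name__ == "__main__":
    offsets = offsets_from(EVENTS, "dev214 shut-down requested")
    for label, seconds in offsets.items():
        print(f"{seconds:+9.3f} s  {label}")
```

Under that assumption the error is logged about 0.995 s after the shut-down request and about 0.871 s before the server is reported down, so it falls inside dev214's own shut-down window.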




> JGRP000029: failed sending message: java.io.IOException: Socket Closed
> ----------------------------------------------------------------------
>
>                 Key: WFLY-10773
>                 URL: https://issues.jboss.org/browse/WFLY-10773
>             Project: WildFly
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 14.0.0.CR1
>            Reporter: tommaso borgato
>            Assignee: Paul Ferraro
>


