[jboss-jira] [JBoss JIRA] (WFLY-6781) Wildfly cluster's failover functionality doesn't work as expected
Preeta Kuruvilla (JIRA)
issues at jboss.org
Wed Jun 29 03:25:00 EDT 2016
[ https://issues.jboss.org/browse/WFLY-6781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258291#comment-13258291 ]
Preeta Kuruvilla commented on WFLY-6781:
----------------------------------------
The cluster that we have is a domain managed cluster where we have domain.xml and host.xml configured on Node1 and only host.xml configured on Node2. The jgroups is a subsystem in the domain.xml for the profile "ha".
Regarding our application, we have 2 components - RC.war and SL.war. The JMS is configured on SL. Only component RC is clustered. SL is not clustered.
Node 1 has - 2 server instances- RC and SL.
<servers>
<server name="server-host1-RC" group="main-server-group" auto-start="true">
<jvm name="default">
<heap size="2048m" max-size="2048m"/>
<permgen size="512m" max-size="512m"/>
<jvm-options>
<!--<option value="-Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n"/>-->
<option value="-XX:CompileCommand=exclude,com/newscale/bfw/signon/filters,AuthenticationFilter"/>
<option value="-XX:CompileCommand=exclude,org/apache/xml/dtm/ref/sax2dtm/SAX2DTM,startElement"/>
<option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
<option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
<option value="-XX:CompileCommand=exclude,org/apache/xpath/compiler/XPathParser,UnionExpr"/>
</jvm-options>
</jvm>
<socket-bindings socket-binding-group="ha-sockets" port-offset="0"/>
</server>
<server name="server-host1-SL" group="other-server-group" auto-start="true">
<jvm name="default">
<heap size="2048m" max-size="2048m"/>
<permgen size="512m" max-size="512m"/>
<jvm-options>
<option value="-server"/>
</jvm-options>
</jvm>
<socket-bindings socket-binding-group="standard-sockets" port-offset="0"/>
</server>
</servers>
Node 2 : has only one server instance and that has RC
<servers>
<server name="server-host2-RC" group="main-server-group" auto-start="true">
<jvm name="default">
<heap size="2048m" max-size="2048m"/>
<permgen size="512m" max-size="512m"/>
<jvm-options>
<!--<option value="-Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n"/>-->
<option value="-XX:CompileCommand=exclude,com/newscale/bfw/signon/filters,AuthenticationFilter"/>
<option value="-XX:CompileCommand=exclude,org/apache/xml/dtm/ref/sax2dtm/SAX2DTM,startElement"/>
<option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
<option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
<option value="-XX:CompileCommand=exclude,org/apache/xpath/compiler/XPathParser,UnionExpr"/>
</jvm-options>
</jvm>
<socket-bindings socket-binding-group="ha-sockets" port-offset="0"/>
</server>
</servers>
Now when I say its not working as expected, when we test failover, I mean - the communication of RC and SL is broken.
By the way RC communicates remotely with SL using the below url :
http-remoting://<ip address of SL which is Node1>:6080/
Just a note:- everything is working properly in production if we don't try disabling network or powering off etc.
Let me know if you need any other info.
Thanks,
Preeta
> Wildfly cluster's failover functionality doesn't work as expected
> -----------------------------------------------------------------
>
> Key: WFLY-6781
> URL: https://issues.jboss.org/browse/WFLY-6781
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> Following are the testing scenarios we did and the outcome:-
> 1. Network disabling on a VM for testing failover – Not working for both Linux and Windows environment.
> 2. Power off of a VM using VMware client for testing failover – Is working on Linux environment but not working on windows environment.
> 3. Ctrl + C method to stop services on a node for testing failover – works on both linux and windows environment
> 4. Stopping server running on Node /VM using Admin Console for testing failover - works on both linux and windows environment.
> Jgroups subsystem configuration in domain.xml we have is below:-
> <subsystem xmlns="urn:jboss:domain:jgroups:2.0" default-stack="udp">
> <stack name="udp">
> <transport type="UDP" socket-binding="jgroups-udp"/>
> <protocol type="PING"/>
> <protocol type="MERGE3"/>
> <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
> <protocol type="FD_ALL"/>
> <protocol type="VERIFY_SUSPECT"/>
> <protocol type="pbcast.NAKACK2"/>
> <protocol type="UNICAST3"/>
> <protocol type="pbcast.STABLE"/>
> <protocol type="pbcast.GMS"/>
> <protocol type="UFC"/>
> <protocol type="MFC"/>
> <protocol type="FRAG2"/>
> <protocol type="RSVP"/>
> </stack>
> <stack name="tcp">
> <transport type="TCP" socket-binding="jgroups-tcp"/>
> <protocol type="MPING" socket-binding="jgroups-mping"/>
> <protocol type="MERGE2"/>
> <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
> <protocol type="FD"/>
> <protocol type="VERIFY_SUSPECT"/>
> <protocol type="pbcast.NAKACK2"/>
> <protocol type="UNICAST3"/>
> <protocol type="pbcast.STABLE"/>
> <protocol type="pbcast.GMS"/>
> <protocol type="MFC"/>
> <protocol type="FRAG2"/>
> <protocol type="RSVP"/>
> </stack>
> </subsystem>
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
More information about the jboss-jira
mailing list