[jboss-jira] [JBoss JIRA] (WFLY-6781) Wildfly cluster's failover functionality doesn't work as expected

Preeta Kuruvilla (JIRA) issues at jboss.org
Wed Jun 29 03:25:00 EDT 2016


    [ https://issues.jboss.org/browse/WFLY-6781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258291#comment-13258291 ] 

Preeta Kuruvilla commented on WFLY-6781:
----------------------------------------

Our cluster is domain-managed: domain.xml and host.xml are configured on Node1, and only host.xml is configured on Node2. JGroups is configured as a subsystem in domain.xml under the "ha" profile.

Our application has two components, RC.war and SL.war. JMS is configured on SL. Only RC is clustered; SL is not.

Node 1 has two server instances, RC and SL:

<servers>
    <server name="server-host1-RC" group="main-server-group" auto-start="true">
        <jvm name="default">
            <heap size="2048m" max-size="2048m"/>
            <permgen size="512m" max-size="512m"/>
            <jvm-options>
                <!--<option value="-Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n"/>-->
                <option value="-XX:CompileCommand=exclude,com/newscale/bfw/signon/filters,AuthenticationFilter"/>
                <option value="-XX:CompileCommand=exclude,org/apache/xml/dtm/ref/sax2dtm/SAX2DTM,startElement"/>
                <option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
                <option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
                <option value="-XX:CompileCommand=exclude,org/apache/xpath/compiler/XPathParser,UnionExpr"/>
            </jvm-options>
        </jvm>
        <socket-bindings socket-binding-group="ha-sockets" port-offset="0"/>
    </server>
    <server name="server-host1-SL" group="other-server-group" auto-start="true">
        <jvm name="default">
            <heap size="2048m" max-size="2048m"/>
            <permgen size="512m" max-size="512m"/>
            <jvm-options>
                <option value="-server"/>
            </jvm-options>
        </jvm>
        <socket-bindings socket-binding-group="standard-sockets" port-offset="0"/>
    </server>
</servers>

Node 2 has only one server instance, which runs RC:

<servers>
    <server name="server-host2-RC" group="main-server-group" auto-start="true">
        <jvm name="default">
            <heap size="2048m" max-size="2048m"/>
            <permgen size="512m" max-size="512m"/>
            <jvm-options>
                <!--<option value="-Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n"/>-->
                <option value="-XX:CompileCommand=exclude,com/newscale/bfw/signon/filters,AuthenticationFilter"/>
                <option value="-XX:CompileCommand=exclude,org/apache/xml/dtm/ref/sax2dtm/SAX2DTM,startElement"/>
                <option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
                <option value="-XX:CompileCommand=exclude,org/exolab/castor/xml/Marshaller,marshal"/>
                <option value="-XX:CompileCommand=exclude,org/apache/xpath/compiler/XPathParser,UnionExpr"/>
            </jvm-options>
        </jvm>
        <socket-bindings socket-binding-group="ha-sockets" port-offset="0"/>
    </server>
</servers>

When I say failover is not working as expected, I mean that communication between RC and SL is broken when we test failover.

RC communicates with SL remotely using the URL below:
http-remoting://<ip address of SL which is Node1>:6080/

Note: everything works properly in production as long as we don't disable the network, power off a VM, etc.

Let me know if you need any other info.

Thanks,
Preeta



> Wildfly cluster's failover functionality doesn't work as expected
> -----------------------------------------------------------------
>
>                 Key: WFLY-6781
>                 URL: https://issues.jboss.org/browse/WFLY-6781
>             Project: WildFly
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 8.2.0.Final
>            Reporter: Preeta Kuruvilla
>            Assignee: Paul Ferraro
>            Priority: Blocker
>
> The following are the failover test scenarios we ran and their outcomes:
> 1. Disabling the network on a VM: does not work on either Linux or Windows.
> 2. Powering off a VM using the VMware client: works on Linux but not on Windows.
> 3. Stopping services on a node with Ctrl + C: works on both Linux and Windows.
> 4. Stopping the server running on a node/VM using the Admin Console: works on both Linux and Windows.
> The JGroups subsystem configuration in our domain.xml is below:
> <subsystem xmlns="urn:jboss:domain:jgroups:2.0" default-stack="udp">
>                 <stack name="udp">
>                     <transport type="UDP" socket-binding="jgroups-udp"/>
>                     <protocol type="PING"/>
>                     <protocol type="MERGE3"/>
>                     <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
>                     <protocol type="FD_ALL"/>
>                     <protocol type="VERIFY_SUSPECT"/>
>                     <protocol type="pbcast.NAKACK2"/>
>                     <protocol type="UNICAST3"/>
>                     <protocol type="pbcast.STABLE"/>
>                     <protocol type="pbcast.GMS"/>
>                     <protocol type="UFC"/>
>                     <protocol type="MFC"/>
>                     <protocol type="FRAG2"/>
>                     <protocol type="RSVP"/>
>                 </stack>
>                 <stack name="tcp">
>                     <transport type="TCP" socket-binding="jgroups-tcp"/>
>                     <protocol type="MPING" socket-binding="jgroups-mping"/>
>                     <protocol type="MERGE2"/>
>                     <protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
>                     <protocol type="FD"/>
>                     <protocol type="VERIFY_SUSPECT"/>
>                     <protocol type="pbcast.NAKACK2"/>
>                     <protocol type="UNICAST3"/>
>                     <protocol type="pbcast.STABLE"/>
>                     <protocol type="pbcast.GMS"/>
>                     <protocol type="MFC"/>
>                     <protocol type="FRAG2"/>
>                     <protocol type="RSVP"/>
>                 </stack>
>             </subsystem>
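
For reference, a possible explanation of scenarios 1 and 2 in the quoted description: FD_SOCK detects a failed member when the TCP socket to it closes, which happens on Ctrl + C or a console shutdown, but usually not when the network is disabled or a VM is abruptly powered off. In those cases detection falls back to FD_ALL, which only suspects a member after its heartbeat timeout expires. The fragment below is a minimal sketch of how that timeout could be set explicitly in the "udp" stack. The values are illustrative assumptions, not taken from the reported configuration, and this is not a confirmed fix for the issue.

```xml
<!-- Sketch only: the timeout and interval values are assumptions for illustration -->
<protocol type="FD_ALL">
    <!-- Milliseconds without a heartbeat before a member is suspected -->
    <property name="timeout">15000</property>
    <!-- Milliseconds between heartbeats sent by each member -->
    <property name="interval">3000</property>
</protocol>
```

Note that SL is not clustered and is reached at a single fixed address on Node1, so membership changes in the RC cluster do not by themselves provide a second SL endpoint when Node1 becomes unreachable.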



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)


