[jboss-jira] [JBoss JIRA] (WFLY-6762) Wildfly cluster failover test not working as expected on windows OS, when network is disabled on a VM(Node) or by shutting down the VM (Node).
Preeta Kuruvilla (JIRA)
issues at jboss.org
Mon Jun 27 11:15:01 EDT 2016
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257507#comment-13257507 ]
Preeta Kuruvilla commented on WFLY-6762:
----------------------------------------
Oops! sorry. I just missed to state that we had this jgroups configuration for cache replication too and wanted to be sure that it does not interfere with the jgroups dealings with the cluster failover and hence I mentioned it in my previous comment.
As for the domain.xml for wildfly cluster -the jgroups subsytems, is as under:-
<subsystem xmlns="urn:jboss:domain:jgroups:2.0" default-stack="udp">
<stack name="udp">
<transport type="UDP" socket-binding="jgroups-udp"/>
<protocol type="PING"/>
<protocol type="MERGE3"/>
<protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
<protocol type="FD_ALL"/>
<protocol type="VERIFY_SUSPECT"/>
<protocol type="pbcast.NAKACK2"/>
<protocol type="UNICAST3"/>
<protocol type="pbcast.STABLE"/>
<protocol type="pbcast.GMS"/>
<protocol type="UFC"/>
<protocol type="MFC"/>
<protocol type="FRAG2"/>
<protocol type="RSVP"/>
</stack>
<stack name="tcp">
<transport type="TCP" socket-binding="jgroups-tcp"/>
<protocol type="MPING" socket-binding="jgroups-mping"/>
<protocol type="MERGE2"/>
<protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
<protocol type="FD"/>
<protocol type="VERIFY_SUSPECT"/>
<protocol type="pbcast.NAKACK2"/>
<protocol type="UNICAST3"/>
<protocol type="pbcast.STABLE"/>
<protocol type="pbcast.GMS"/>
<protocol type="MFC"/>
<protocol type="FRAG2"/>
<protocol type="RSVP"/>
</stack>
</subsystem>
Thanks,
Preeta
> Wildfly cluster failover test not working as expected on windows OS, when network is disabled on a VM(Node) or by shutting down the VM (Node).
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you have said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Also we tried to test the same failover scenario by Shutting down/power off a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
> Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-service-on-windows
> https://developer.jboss.org/thread/238135?tstart=0
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
More information about the jboss-jira
mailing list