[jboss-jira] [JBoss JIRA] (WFLY-6762) Wildfly cluster failover test not working as expected on windows OS, when network is disabled on a VM(Node) or by shutting down the VM (Node).

Preeta Kuruvilla (JIRA) issues at jboss.org
Mon Jun 27 10:14:00 EDT 2016


    [ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257451#comment-13257451 ] 

Preeta Kuruvilla commented on WFLY-6762:
----------------------------------------

To elaborate on our issues: failover testing by disabling the network on a VM does not work for the WildFly cluster on either Linux or Windows.

Failover testing by powering off a VM works on Linux but does not work on Windows.

Could you also explain the reason for the above?

Thanks,
Preeta

> Wildfly cluster failover test not working as expected on windows OS, when network is disabled on a VM(Node) or by shutting down the VM (Node).
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: WFLY-6762
>                 URL: https://issues.jboss.org/browse/WFLY-6762
>             Project: WildFly
>          Issue Type: Quality Risk
>          Components: Clustering
>    Affects Versions: 8.2.0.Final
>            Reporter: Preeta Kuruvilla
>            Assignee: Paul Ferraro
>            Priority: Blocker
>
> In your mail regarding WFLY-6749 you said the following:
> **The default stack contains the following failure detection protocols:
> FD_SOCK 
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure, by disabling the network of the host machine, is not being detected by FD_SOCK. It will, however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be adjusted via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I looked at the testing scenarios listed in "Table 29. Failure detection behavior" at the link you provided: http://www.jgroups.org/manual/index.html#FailureDetection. Nowhere is it mentioned that disabling the network on a node is a valid testing scenario for a WildFly cluster.
> For our application, failover works properly on a WebLogic cluster when the network on a node is disabled. On a WildFly cluster, however, failover does not happen when we disable the network on a node, and application functionality is impaired.
> As I said earlier, failover on the WildFly cluster does work when we stop a node from the admin console or press Ctrl + C to stop the services on a node.
> I would like confirmation from you that disabling the network on a node is not a valid failover testing scenario for a WildFly cluster.
> We also tested the same failover scenario by shutting down/powering off a VM (node) in the WildFly cluster. It did not work in the Windows environment, although it worked in the Linux environment.
> Note: we are using a Windows 2012 environment. Here are two links I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-service-on-windows
> https://developer.jboss.org/thread/238135?tstart=0
> Thanks,
> Preeta
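
For reference, the FD_ALL tuning quoted above could look roughly like the fragment below in the jgroups subsystem of the server configuration. This is only a sketch under stated assumptions: the stack name and the FD_SOCK socket-binding name are assumed to match a default WildFly HA profile, the 15000 ms timeout and 3000 ms heartbeat interval are illustrative values rather than recommendations, and the remaining protocols of the stack are left out.

```xml
<!-- Sketch only: stack and socket-binding names are assumed defaults -->
<stack name="udp">
    <!-- transport and discovery protocols omitted -->
    <protocol type="FD_SOCK" socket-binding="jgroups-udp-fd"/>
    <protocol type="FD_ALL">
        <!-- Suspect a member after 15 s without a heartbeat (illustrative value) -->
        <property name="timeout">15000</property>
        <!-- Send heartbeats every 3 s (illustrative value; keep well below timeout) -->
        <property name="interval">3000</property>
    </protocol>
    <!-- remaining protocols omitted -->
</stack>
```

A shorter timeout should make a node with a disabled network be suspected sooner, at the cost of a higher risk of false suspicions on a slow or congested network.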



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)