[JBoss JIRA] (WFLY-6762) Windows environment: WildFly failover test not working when disabling the network on a VM (node) or shutting down the VM (node).

Monday, 27 June 2016

     [ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]

Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
    Description: 
In your mail related to WFLY-6749, you said the following:

**The default stack contains the following failure detection protocols:
FD_SOCK 
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host
machine - is not being detected by FD_SOCK. It will, however, be detected by FD_ALL, but only
after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout
property.
e.g.
<protocol type="FD_ALL"><property name="timeout">60000</property></protocol>
**************************************************************************************************
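
To illustrate the suggestion quoted above, a shorter detection window for FD_ALL might be sketched as follows. This is only a sketch: the 15000 and 3000 values are assumptions chosen for illustration, not tested recommendations, and the fragment assumes the same property-style syntax as the example in the quote. The interval property is FD_ALL's heartbeat send interval in JGroups; whether these values suit a given network is untested here.

```xml
<!-- Illustrative values only (assumed, untested in this environment) -->
<protocol type="FD_ALL">
    <!-- Time in ms without a heartbeat before a member is suspected -->
    <property name="timeout">15000</property>
    <!-- How often in ms each member sends its heartbeat (assumed value) -->
    <property name="interval">3000</property>
</protocol>
```

A lower timeout shortens the delay before a node with a disabled network is suspected, at the cost of a higher risk of false suspicions on a slow or congested network.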

Thanks for the quick response on WFLY-6749.

Based on your suggestion, I took a look at the testing scenarios listed in
"Table 29. Failure detection behavior" at the link you provided:
http://www.jgroups.org/manual/index.html#FailureDetection. Nowhere does it mention that
disabling the network on a node is a valid testing scenario for a WildFly cluster.

For our application, failover works properly on a WebLogic cluster when the network on a
node is disabled. However, when we disable the network on a node in a WildFly cluster,
failover does not work and the application's functionality is impaired.
As I said earlier, failover on the WildFly cluster does work when we stop a node from the
admin console or press Ctrl + C to stop the services on a node.

I would like you to confirm that disabling the network on a node is not a valid failover
testing scenario for a WildFly cluster.

We also tested the same failover scenario by shutting down a VM (node) in a WildFly
cluster. It did not work in the Windows environment, although it worked in the Linux
environment.

Note: we are using a Windows 2012 environment. Here is a link I found:
http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...

Thanks,
Preeta

  was:
In your mail related to WFLY-6749, you said the following:

**The default stack contains the following failure detection protocols:
FD_SOCK 
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host
machine - is not being detected by FD_SOCK. It will, however, be detected by FD_ALL, but only
after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout
property.
e.g.
<protocol type="FD_ALL"><property name="timeout">60000</property></protocol>
**

Thanks for the quick response on WFLY-6749.

Based on your suggestion, I took a look at the testing scenarios listed in
"Table 29. Failure detection behavior" at the link you provided:
http://www.jgroups.org/manual/index.html#FailureDetection. Nowhere does it mention that
disabling the network on a node is a valid testing scenario for a WildFly cluster.

For our application, failover works properly on a WebLogic cluster when the network on a
node is disabled. However, when we disable the network on a node in a WildFly cluster,
failover does not work and the application's functionality is impaired.
As I said earlier, failover on the WildFly cluster does work when we stop a node from the
admin console or press Ctrl + C to stop the services on a node.

I would like you to confirm that disabling the network on a node is not a valid failover
testing scenario for a WildFly cluster.

Thanks,
Preeta

...
 Windows environment: WildFly failover test not working when disabling the network
on a VM (node) or shutting down the VM (node).

-------------------------------------------------------------------------------------------------------------------------------

                 Key: WFLY-6762
                 URL: https://issues.jboss.org/browse/WFLY-6762
             Project: WildFly
          Issue Type: Quality Risk
          Components: Clustering
    Affects Versions: 8.2.0.Final
            Reporter: Preeta Kuruvilla
            Assignee: Paul Ferraro
            Priority: Blocker


--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
