Preeta Kuruvilla commented on WFLY-6762:
----------------------------------------
To elaborate on our issues: disabling the network on a VM for failover testing does not
work for the WildFly cluster on either the Linux or the Windows environment.
Powering off a VM for failover testing works on the Linux environment but does not work
on the Windows environment of the WildFly cluster.
Could you also explain the reason for this?
Thanks,
Preeta
WildFly cluster failover test not working as expected on Windows OS,
when network is disabled on a VM (Node) or by shutting down the VM (Node).
----------------------------------------------------------------------------------------------------------------------------------------------
Key: WFLY-6762
URL:
https://issues.jboss.org/browse/WFLY-6762
Project: WildFly
Issue Type: Quality Risk
Components: Clustering
Affects Versions: 8.2.0.Final
Reporter: Preeta Kuruvilla
Assignee: Paul Ferraro
Priority: Blocker
In your mail related to WFLY-6749, you said the following:
**The default stack contains the following failure detection protocols:
FD_SOCK
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host
machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only
after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout
property.
e.g.
<protocol type="FD_ALL"><property name="timeout">60000</property></protocol>
**************************************************************************************************
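For illustration, a shorter detection window could be sketched as below, using the same
property syntax as the quoted example. The values are placeholders rather than
recommendations, and the interval property (the FD_ALL heartbeat interval) is an
assumption added here; it is not mentioned in the quoted mail. The timeout should stay
comfortably larger than the interval so that a single delayed heartbeat does not cause a
false suspicion.

```xml
<!-- Illustrative values only: suspect a silent member after about 10 seconds -->
<protocol type="FD_ALL">
    <!-- Assumed: how often each member multicasts a heartbeat, in ms -->
    <property name="interval">3000</property>
    <!-- A member with no heartbeat for this long (ms) is suspected -->
    <property name="timeout">10000</property>
</protocol>
```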
Thanks for the quick response on WFLY-6749.
Based on your suggestion, I took a look at the testing scenarios mentioned in
"Table 29. Failure detection behavior" at the link that you provided:
http://www.jgroups.org/manual/index.html#FailureDetection. Nowhere is it mentioned that
disabling the network on a node is a valid testing scenario for a WildFly cluster.
Failover works properly for our application when the network on a node is disabled in a
WebLogic cluster. However, when we disable the network on a node in a WildFly cluster,
failover does not work and the application functionality is hampered.
However, as I said earlier, failover on the WildFly cluster works when we stop a node from
the admin console or press Ctrl + C to stop the services on a node.
I would like confirmation from you that disabling the network on a node is not a
valid failover testing scenario for a WildFly cluster.
We also tried to test the same failover scenario by shutting down/powering off a VM (node)
in a WildFly cluster. It did not work in the Windows environment, although it worked in
the Linux environment.
Note: we are using a Windows 2012 environment. Here are two links I found:
http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
https://developer.jboss.org/thread/238135?tstart=0
Thanks,
Preeta