[jboss-jira] [JBoss JIRA] (WFLY-6762) Is Network Disabling on a Node, a valid test scenario for testing Wildfly Cluster Failover?
Preeta Kuruvilla (JIRA)
issues at jboss.org
Mon Jun 27 02:23:00 EDT 2016
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Description:
In your mail related to WFLY-6749 you has said the below :-
**The default stack contains the following failure detection protocols:
FD_SOCK
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
e.g.
<protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
**
Thanks for the quick response on WFLY-6749.
Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
Thanks,
Preeta
was:
In your mail related to WFLY-6749 you has said the below :-
*The default stack contains the following failure detection protocols:
FD_SOCK
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
e.g.
<protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
*
Thanks for the quick response on WFLY-6749.
Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
Thanks,
Preeta
> Is Network Disabling on a Node, a valid test scenario for testing Wildfly Cluster Failover?
> -------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: CTS Challenge
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Critical
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
More information about the jboss-jira
mailing list