[JBoss JIRA] (WFLY-6762) Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Summary: Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by shutting down the VM (Node). (was: Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).)
> Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by shutting down the VM (Node).
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
> Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
> https://developer.jboss.org/thread/238135?tstart=0
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (DROOLS-1212) Input stream finalized or forced closed during server startup
by Petr Široký (JIRA)
[ https://issues.jboss.org/browse/DROOLS-1212?page=com.atlassian.jira.plugi... ]
Petr Široký edited comment on DROOLS-1212 at 6/27/16 6:27 AM:
--------------------------------------------------------------
Thanks for reporting this issue[~fmrj]. I fixed the first occurrence as part of https://github.com/droolsjbpm/drools/commit/31f9f994f74e95e5118f627e9338f.... The second one needs to be directly fixed in ECJ. I've proposed a fix via GitHub PR: https://github.com/eclipse/eclipse.jdt.core/pull/5 (not sure if it gets accepted though).
was (Author: psiroky):
Thanks for reporting his [~fmrj]. I fixed the first occurrence as part of https://github.com/droolsjbpm/drools/commit/31f9f994f74e95e5118f627e9338f.... The second one needs to be directly fixed in ECJ. I've proposed a fix via GitHub PR: https://github.com/eclipse/eclipse.jdt.core/pull/5
> Input stream finalized or forced closed during server startup
> -------------------------------------------------------------
>
> Key: DROOLS-1212
> URL: https://issues.jboss.org/browse/DROOLS-1212
> Project: Drools
> Issue Type: Bug
> Components: core engine
> Affects Versions: 6.3.0.Final
> Environment: Glassfish 4.1.1
> Reporter: Fernando Machado
> Assignee: Petr Široký
> Fix For: 7.0.0.Beta1
>
>
> I'm getting some exceptions during GF4 startup related with _"Input stream has been finalized or forced closed without being explicitly closed; stream instantiation reported in following stack trace"_ and it seems similar to DROOLS-812.
> org.drools.template.parser.DefaultTemplateRuleBase.readKnowledgeBase(String)
> {code}
> private InternalKnowledgeBase readKnowledgeBase(String drl) {
> try {
> // logger.info(drl);
> // read in the source
> Reader source = new StringReader(drl);
> KnowledgeBuilderImpl builder = new KnowledgeBuilderImpl();
> builder.addPackageFromDrl(source);
> InternalKnowledgePackage pkg = builder.getPackage();
> // add the package to a rulebase (deploy the rule package).
> InternalKnowledgeBase kBase = (InternalKnowledgeBase) KnowledgeBaseFactory.newKnowledgeBase();
> kBase.addPackage(pkg);
> return kBase;
> } catch (Exception e) {
> throw new RuntimeException(e);
> }
> }
> {code}
> Is related with:
> {noformat}
> [#|2016-06-15T13:51:07.671+0200|WARNING|glassfish 4.1|javax.enterprise.system.util|_ThreadID=74442;_ThreadName=RunLevelControllerThread-1465991462855;_TimeMillis=1465991467671;_LevelValue=900;_MessageID=NCLS-COMUTIL-00023;|Input stream has been finalized or forced closed without being explicitly closed; stream instantiation reported in following stack trace
> java.lang.Throwable
> at org.drools.template.parser.DefaultTemplateRuleBase.readKnowledgeBase(DefaultTemplateRuleBase.java:133)
> at org.drools.template.parser.DefaultTemplateRuleBase.<init>(DefaultTemplateRuleBase.java:56)
> at org.drools.template.parser.TemplateDataListener.<init>(TemplateDataListener.java:74)
> at org.drools.template.parser.TemplateDataListener.<init>(TemplateDataListener.java:50)
> at org.drools.template.ObjectDataCompiler.compile(ObjectDataCompiler.java:57)
> at com.mycompany.mypackage.MyClass.init(MyClass.java:xxx)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.sun.ejb.containers.interceptors.BeanCallbackInterceptor.intercept(InterceptorManager.java:1035)
> at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:72)
> at com.sun.ejb.containers.interceptors.CallbackInvocationContext.proceed(CallbackInvocationContext.java:205)
> at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73)
> at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.sun.ejb.containers.interceptors.CallbackInterceptor.intercept(InterceptorManager.java:986)
> at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:72)
> at com.sun.ejb.containers.interceptors.CallbackInvocationContext.proceed(CallbackInvocationContext.java:205)
> at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
> at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.init(SystemInterceptorProxy.java:125)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.sun.ejb.containers.interceptors.CallbackInterceptor.intercept(InterceptorManager.java:986)
> at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:72)
> at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:412)
> at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:375)
> at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:2014)
> at com.sun.ejb.containers.AbstractSingletonContainer.createSingletonEJB(AbstractSingletonContainer.java:468)
> at com.sun.ejb.containers.AbstractSingletonContainer.access$000(AbstractSingletonContainer.java:74)
> at com.sun.ejb.containers.AbstractSingletonContainer$SingletonContextFactory.create(AbstractSingletonContainer.java:647)
> at com.sun.ejb.containers.AbstractSingletonContainer.instantiateSingletonInstance(AbstractSingletonContainer.java:389)
> at org.glassfish.ejb.startup.SingletonLifeCycleManager.initializeSingleton(SingletonLifeCycleManager.java:219)
> at org.glassfish.ejb.startup.SingletonLifeCycleManager.initializeSingleton(SingletonLifeCycleManager.java:180)
> at org.glassfish.ejb.startup.SingletonLifeCycleManager.doStartup(SingletonLifeCycleManager.java:158)
> ...
> {noformat}
> There is another class that is also throwing the same exception:
> {noformat}
> [#|2016-06-14T11:08:19.953+0200|WARNING|glassfish 4.1|javax.enterprise.system.util|_ThreadID=68093;_ThreadName=RunLevelControllerThread-1465895294550;_TimeMillis=1465895299953;_LevelValue=900;_MessageID=NCLS-COMUTIL-00023;|Input stream has been finalized or forced closed without being explicitly closed;
> stream instantiation reported in following stack trace
> java.lang.Throwable
> at com.sun.enterprise.loader.ASURLClassLoader$SentinelInputStream.<init>(ASURLClassLoader.java:1278)
> at com.sun.enterprise.loader.ASURLClassLoader$InternalJarURLConnection.getInputStream(ASURLClassLoader.java:1386)
> at java.net.URLClassLoader.getResourceAsStream(URLClassLoader.java:238)
> at com.sun.enterprise.loader.ASURLClassLoader.getResourceAsStream(ASURLClassLoader.java:930)
> at java.lang.Class.getResourceAsStream(Class.java:2223)
> at org.eclipse.jdt.internal.compiler.parser.Parser.readReadableNameTable(Parser.java:719)
> at org.eclipse.jdt.internal.compiler.parser.Parser.initTables(Parser.java:615)
> at org.eclipse.jdt.internal.compiler.parser.Parser.<clinit>(Parser.java:124)
> at org.eclipse.jdt.internal.compiler.Compiler.initializeParser(Compiler.java:687)
> at org.eclipse.jdt.internal.compiler.Compiler.<init>(Compiler.java:285)
> at org.eclipse.jdt.internal.compiler.Compiler.<init>(Compiler.java:206)
> at org.drools.compiler.commons.jci.compilers.EclipseJavaCompiler.compile(EclipseJavaCompiler.java:416)
> at org.drools.compiler.commons.jci.compilers.AbstractJavaCompiler.compile(AbstractJavaCompiler.java:49)
> at org.drools.compiler.rule.builder.dialect.java.JavaDialect.compileAll(JavaDialect.java:417)
> at org.drools.compiler.compiler.DialectCompiletimeRegistry.compileAll(DialectCompiletimeRegistry.java:61)
> at org.drools.compiler.compiler.PackageRegistry.compileAll(PackageRegistry.java:138)
> at org.drools.compiler.builder.impl.KnowledgeBuilderImpl.compileAll(KnowledgeBuilderImpl.java:1314)
> at org.drools.compiler.builder.impl.KnowledgeBuilderImpl.compileAllRules(KnowledgeBuilderImpl.java:953)
> at org.drools.compiler.builder.impl.KnowledgeBuilderImpl.addPackage(KnowledgeBuilderImpl.java:944)
> at org.drools.compiler.builder.impl.KnowledgeBuilderImpl.addPackageFromDrl(KnowledgeBuilderImpl.java:363)
> at org.drools.compiler.builder.impl.KnowledgeBuilderImpl.addPackageFromDrl(KnowledgeBuilderImpl.java:339)
> at org.drools.template.parser.DefaultTemplateRuleBase.readKnowledgeBase(DefaultTemplateRuleBase.java:133)
> at org.drools.template.parser.DefaultTemplateRuleBase.<init>(DefaultTemplateRuleBase.java:56)
> at org.drools.template.parser.TemplateDataListener.<init>(TemplateDataListener.java:74)
> at org.drools.template.parser.TemplateDataListener.<init>(TemplateDataListener.java:50)
> at org.drools.template.ObjectDataCompiler.compile(ObjectDataCompiler.java:57)
> at com.mycompany.mypackage.MyClass.init(MyClass.java:xxx)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.sun.ejb.containers.interceptors.BeanCallbackInterceptor.intercept(InterceptorManager.java:1035)
> at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:72)
> at com.sun.ejb.containers.interceptors.CallbackInvocationContext.proceed(CallbackInvocationContext.java:205)
> at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73)
> at org.jboss.weld.ejb.SessionBeanInterceptor.aroundInvoke(SessionBeanInterceptor.java:52)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.sun.ejb.containers.interceptors.CallbackInterceptor.intercept(InterceptorManager.java:986)
> at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:72)
> at com.sun.ejb.containers.interceptors.CallbackInvocationContext.proceed(CallbackInvocationContext.java:205)
> at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCall(SystemInterceptorProxy.java:163)
> at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.init(SystemInterceptorProxy.java:125)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.sun.ejb.containers.interceptors.CallbackInterceptor.intercept(InterceptorManager.java:986)
> at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:72)
> at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:412)
> at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:375)
> at com.sun.ejb.containers.BaseContainer.intercept(BaseContainer.java:2014)
> at com.sun.ejb.containers.AbstractSingletonContainer.createSingletonEJB(AbstractSingletonContainer.java:468)
> at com.sun.ejb.containers.AbstractSingletonContainer.access$000(AbstractSingletonContainer.java:74)
> at com.sun.ejb.containers.AbstractSingletonContainer$SingletonContextFactory.create(AbstractSingletonContainer.java:647)
> at com.sun.ejb.containers.AbstractSingletonContainer.instantiateSingletonInstance(AbstractSingletonContainer.java:389)
> at org.glassfish.ejb.startup.SingletonLifeCycleManager.initializeSingleton(SingletonLifeCycleManager.java:219)
> at org.glassfish.ejb.startup.SingletonLifeCycleManager.initializeSingleton(SingletonLifeCycleManager.java:180)
> at org.glassfish.ejb.startup.SingletonLifeCycleManager.doStartup(SingletonLifeCycleManager.java:158)
> at org.glassfish.ejb.startup.EjbApplication.start(EjbApplication.java:166)
> ...
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (WFLY-6762) Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Description:
In your mail related to WFLY-6749 you has said the below :-
**The default stack contains the following failure detection protocols:
FD_SOCK
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
e.g.
<protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
**************************************************************************************************
Thanks for the quick response on WFLY-6749.
Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
https://developer.jboss.org/thread/238135?tstart=0
Thanks,
Preeta
was:
In your mail related to WFLY-6749 you has said the below :-
**The default stack contains the following failure detection protocols:
FD_SOCK
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
e.g.
<protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
**************************************************************************************************
Thanks for the quick response on WFLY-6749.
Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
Thanks,
Preeta
> Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
> Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
> https://developer.jboss.org/thread/238135?tstart=0
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (WFLY-6762) Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Priority: Blocker (was: Critical)
> Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
> Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (WFLY-6762) Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Summary: Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node). (was: Windows ennvironment: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).)
> Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
> Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (WFLY-6762) Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Priority: Critical (was: Blocker)
> Windows ennvironment: Wildfly failover test not working when network is disabled on a VM(Node) or by Shutting down the VM (Node).
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Critical
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
> Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (WFLY-6762) Windows ennvironment: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Description:
In your mail related to WFLY-6749 you has said the below :-
**The default stack contains the following failure detection protocols:
FD_SOCK
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
e.g.
<protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
**************************************************************************************************
Thanks for the quick response on WFLY-6749.
Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
Thanks,
Preeta
was:
In your mail related to WFLY-6749 you has said the below :-
**The default stack contains the following failure detection protocols:
FD_SOCK
FD_ALL
These protocols are described here:
http://www.jgroups.org/manual/index.html#FailureDetection
I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
e.g.
<protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
**
Thanks for the quick response on WFLY-6749.
Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
Thanks,
Preeta
> Windows ennvironment: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **************************************************************************************************
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Also we tried to test the same failover scenario by Shutting down a VM (node) in a wildfly cluster. It did not work for Windows Environment although it worked for linux environment.
> Note: we are using Windows 2012 environment. Here is a link I found: http://stackoverflow.com/questions/31218710/unable-to-stop-wildfly-8-2-se...
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (WFLY-6762) Windows ennvironment: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Summary: Windows ennvironment: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node). (was: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).)
> Windows ennvironment: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).
> -------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (WFLY-6762) Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).
by Preeta Kuruvilla (JIRA)
[ https://issues.jboss.org/browse/WFLY-6762?page=com.atlassian.jira.plugin.... ]
Preeta Kuruvilla updated WFLY-6762:
-----------------------------------
Issue Type: Quality Risk (was: CTS Challenge)
Summary: Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node). (was: Is Network Disabling on a Node, a valid test scenario for testing Wildfly Cluster Failover?)
Priority: Blocker (was: Critical)
> Wildfly failover test not working when Network Disabling on a VM(Node) or by Shutting down the VM (Node).
> ---------------------------------------------------------------------------------------------------------
>
> Key: WFLY-6762
> URL: https://issues.jboss.org/browse/WFLY-6762
> Project: WildFly
> Issue Type: Quality Risk
> Components: Clustering
> Affects Versions: 8.2.0.Final
> Reporter: Preeta Kuruvilla
> Assignee: Paul Ferraro
> Priority: Blocker
>
> In your mail related to WFLY-6749 you has said the below :-
> **The default stack contains the following failure detection protocols:
> FD_SOCK
> FD_ALL
> These protocols are described here:
> http://www.jgroups.org/manual/index.html#FailureDetection
> I suspect that your method of simulating a failure - by disabling the network of the host machine is not being detected by FD_SOCK. It will however, be detected by FD_ALL, but only after 1 minute. The heartbeat timeout used by FD_ALL can be manipulated via the timeout property.
> e.g.
> <protocol type="FD_ALL" ><property name="timeout">60000</property></protocol>
> **
> Thanks for the quick response on WFLY-6749.
> Based on your suggestion, I had a taken a look at the testing scenarios mentioned in "Table 29. Failure detection behavior" in the link that you provided- http://www.jgroups.org/manual/index.html#FailureDetection. No where its mentioned that disabling a network on a node, is a valid testing scenario in Wildfly cluster.
> The Failover is working properly when the network on a node is disabled on a weblogic cluster for our application. However it doesn't work and it hampers the application functionality on Wildfly cluster when we try to disable the network on a node in Wildfly cluster.
> However as I said earlier, the failover on wildfly cluster works when we stop a node from admin console or give Ctrl + C to stop the services on a node.
> Would like to get a confirmation from you that disabling the network on a node is not the valid failover testing scenario for wildfly cluster.
> Thanks,
> Preeta
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months
[JBoss JIRA] (JGRP-2030) GMS: view_ack_collection_timeout delay when last 2 members leave concurrently
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-2030?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-2030:
---------------------------
Fix Version/s: 3.6.11
(was: 3.6.10)
Hi Dan,
is this still relevant? I don't want to introduce retrying at the transport level, IMO this is the task of one of the retransmission layers.
> GMS: view_ack_collection_timeout delay when last 2 members leave concurrently
> -----------------------------------------------------------------------------
>
> Key: JGRP-2030
> URL: https://issues.jboss.org/browse/JGRP-2030
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.8
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 3.6.11, 4.0
>
>
> When the coordinator ({{NodeE}}) leaves, it tries to install a new view on behalf of the new coordinator ({{NodeG}}, the last member).
> {noformat}
> 21:33:26,844 TRACE (ViewHandler,InitialClusterSizeTest-NodeE-42422:) [GMS] InitialClusterSizeTest-NodeE-42422: mcasting view [InitialClusterSizeTest-NodeG-30521|3] (1) [InitialClusterSizeTest-NodeG-30521] (1 mbrs)
> 21:33:26,844 TRACE (ViewHandler,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: sending msg to null, src=InitialClusterSizeTest-NodeE-42422, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=1], TP: [cluster_name=ISPN]
> {noformat}
> The message is actually sent later by the bundler, but {{NodeG}} is also sending its {{LEAVE_REQ}} message at the same time. Both nodes try to create a connection to each other, and only {{NodeG}} succeeds:
> {noformat}
> 21:33:26,844 TRACE (ForkThread-2,InitialClusterSizeTest:) [TCP_NIO2] InitialClusterSizeTest-NodeG-30521: sending msg to InitialClusterSizeTest-NodeE-42422, src=InitialClusterSizeTest-NodeG-30521, headers are GMS: GmsHeader[LEAVE_REQ]: mbr=InitialClusterSizeTest-NodeG-30521, UNICAST3: DATA, seqno=1, conn_id=1, first, TP: [cluster_name=ISPN]
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] InitialClusterSizeTest-NodeG-30521: sending 1 msgs (83 bytes (0.27% of max_bundle_size) to 1 dests(s): [ISPN:InitialClusterSizeTest-NodeE-42422]
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: sending 1 msgs (91 bytes (0.29% of max_bundle_size) to 1 dests(s): [ISPN]
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] dest=127.0.0.1:7900 (86 bytes)
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] dest=127.0.0.1:7920 (94 bytes)
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] 127.0.0.1:7900: connecting to 127.0.0.1:7920
> 21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] 127.0.0.1:7920: connecting to 127.0.0.1:7900
> 21:33:26,866 TRACE (NioConnection.Reader [null],InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] 127.0.0.1:7920: rejected connection from 127.0.0.1:7900 (connection existed and my address won as it's higher)
> 21:33:26,867 TRACE (OOB-1,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: received [dst: InitialClusterSizeTest-NodeE-42422, src: InitialClusterSizeTest-NodeG-30521 (3 headers), size=0 bytes, flags=OOB], headers are GMS: GmsHeader[LEAVE_REQ]: mbr=InitialClusterSizeTest-NodeG-30521, UNICAST3: DATA, seqno=1, conn_id=1, first, TP: [cluster_name=ISPN]
> {noformat}
> I'm guessing {{NodeE}} would need a {{STABLE}} round in order to retransmit the {{VIEW}} message, but I'm not sure if the stable round would work, since it already (partially?) installed the new view with {{NodeG}} as the only member. However, I think it should be possible for {{NodeE}} to remove {{NodeG}} from it's {{AckCollector}} once it receives its {{LEAVE_REQ}}, and stop blocking.
> This is a minor annoyance a few the Infinispan tests - most of them shut down the nodes serially, so they don't see this delay.
> The question is whether the concurrent connection setup can have an impact for other messages as well - e.g. during startup, when there aren't a lot of messages being sent around to trigger retransmission. Could the node that failed to open its connection retry immediately on the connection opened by the other node?
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 4 months