[JBoss JIRA] (WFCORE-4740) (7.3.0) preferIPv6Addresses and preferIPv4Stack System Properties are Mishandled in the Config
by Ivo Studensky (Jira)
[ https://issues.jboss.org/browse/WFCORE-4740?page=com.atlassian.jira.plugi... ]
Ivo Studensky updated WFCORE-4740:
----------------------------------
Fix Version/s: (was: 11.0.0.Beta1)
> (7.3.0) preferIPv6Addresses and preferIPv4Stack System Properties are Mishandled in the Config
> ----------------------------------------------------------------------------------------------
>
> Key: WFCORE-4740
> URL: https://issues.jboss.org/browse/WFCORE-4740
> Project: WildFly Core
> Issue Type: Bug
> Components: Server
> Affects Versions: 10.0.0.Final
> Environment: * JBoss EAP 7.1/7.2
> * Interface attached to port 0.0.0.0
> * Red Hat Enterprise Linux 7
> * IPv6 disabled in the kernel
> sysctl -w net.ipv6.conf.all.disable_ipv6=1
> sysctl -w net.ipv6.conf.default.disable_ipv6=1
> * System properties -Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true
> Reporter: Ivo Studensky
> Assignee: Ivo Studensky
> Priority: Major
> Labels: ipv6, startup
>
> An error is thrown on startup:
> Caused by: java.net.SocketException: Protocol family unavailable
> at sun.nio.ch.Net.bind0(Native Method)
> at sun.nio.ch.Net.bind(Net.java:433)
> at sun.nio.ch.Net.bind(Net.java:425)
> at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> at org.xnio.nio.NioXnioWorker.createTcpConnectionServer(NioXnioWorker.java:179)
> at org.xnio.XnioWorker.createStreamConnectionServer(XnioWorker.java:310)
> at io.undertow.protocols.ssl.UndertowXnioSsl.createSslConnectionServer(UndertowXnioSsl.java:301)
> at org.wildfly.extension.undertow.HttpsListenerService.startListening(HttpsListenerService.java:127)
> The customer ships a turnkey solution that sets both system properties (-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true): some of their customers run IPv6 and need them, while others want to harden their systems by disabling IPv4.
> This works on JBoss EAP 6, but on JBoss EAP 7 it throws the error above with the same version of Java. Furthermore, setting just -Djava.net.preferIPv4Stack=false causes the same failure, even though false is the default value, while leaving the property off lets the server start.
> This appears to be related to controller/src/main/java/org/jboss/as/controller/interfaces/OverallInterfaceCriteria.java#pruneIPTypes: if both properties are null it leaves the set of candidate addresses alone, but if either is set it strips out all IPv4 addresses.
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (WFCORE-4740) (7.3.0) preferIPv6Addresses and preferIPv4Stack System Properties are Mishandled in the Config
by Ivo Studensky (Jira)
Ivo Studensky created WFCORE-4740:
-------------------------------------
Summary: (7.3.0) preferIPv6Addresses and preferIPv4Stack System Properties are Mishandled in the Config
Key: WFCORE-4740
URL: https://issues.jboss.org/browse/WFCORE-4740
Project: WildFly Core
Issue Type: Bug
Components: Server
Affects Versions: 10.0.0.Final
Environment: * JBoss EAP 7.1/7.2
* Interface attached to port 0.0.0.0
* Red Hat Enterprise Linux 7
* IPv6 disabled in the kernel
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
* System properties -Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true
Reporter: Ivo Studensky
Assignee: Teresa Miyar Gil
Fix For: 11.0.0.Beta1
An error is thrown on startup:
Caused by: java.net.SocketException: Protocol family unavailable
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.xnio.nio.NioXnioWorker.createTcpConnectionServer(NioXnioWorker.java:179)
at org.xnio.XnioWorker.createStreamConnectionServer(XnioWorker.java:310)
at io.undertow.protocols.ssl.UndertowXnioSsl.createSslConnectionServer(UndertowXnioSsl.java:301)
at org.wildfly.extension.undertow.HttpsListenerService.startListening(HttpsListenerService.java:127)
The customer ships a turnkey solution that sets both system properties (-Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true): some of their customers run IPv6 and need them, while others want to harden their systems by disabling IPv4.
This works on JBoss EAP 6, but on JBoss EAP 7 it throws the error above with the same version of Java. Furthermore, setting just -Djava.net.preferIPv4Stack=false causes the same failure, even though false is the default value, while leaving the property off lets the server start.
This appears to be related to controller/src/main/java/org/jboss/as/controller/interfaces/OverallInterfaceCriteria.java#pruneIPTypes: if both properties are null it leaves the set of candidate addresses alone, but if either is set it strips out all IPv4 addresses.
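For illustration, here is a minimal sketch of that behaviour as described above (hypothetical class and method signatures, not the actual OverallInterfaceCriteria source): the candidate set is only left untouched when both properties are unset, so explicitly passing the default -Djava.net.preferIPv4Stack=false still takes the pruning branch.
{code}
import java.net.Inet4Address;
import java.net.InetAddress;
import java.util.Set;
import java.util.stream.Collectors;

final class PruneIPTypesSketch {

    // Sketch only: mirrors the behaviour reported in this issue, not the real implementation.
    static Set<InetAddress> pruneIPTypes(Set<InetAddress> candidates,
                                         Boolean preferIPv4Stack,
                                         Boolean preferIPv6Addresses) {
        if (preferIPv4Stack == null && preferIPv6Addresses == null) {
            // Neither property set: all candidate addresses are kept.
            return candidates;
        }
        // Either property present (even at its default value): every IPv4
        // address is dropped, leaving nothing bindable when IPv6 is disabled
        // in the kernel.
        return candidates.stream()
                .filter(address -> !(address instanceof Inet4Address))
                .collect(Collectors.toSet());
    }
}
{code}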
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (JGRP-2395) LOCAL_PING fails when 2 nodes start at the same time
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2395?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-2395:
---------------------------
Fix Version/s: 4.1.8
> LOCAL_PING fails when 2 nodes start at the same time
> ----------------------------------------------------
>
> Key: JGRP-2395
> URL: https://issues.jboss.org/browse/JGRP-2395
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.1.6
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.1.8
>
>
> We have a test that starts 2 nodes in parallel ({{ConcurrentStartTest}}), and it has been failing randomly since we started using {{LOCAL_PING}}.
> {noformat}
> 01:02:11,930 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: discovery took 3 ms, members: 1 rsps (0 coords) [done]
> 01:02:11,930 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-43694: discovery took 3 ms, members: 1 rsps (0 coords) [done]
> 01:02:11,931 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-43694: could not determine coordinator from rsps 1 rsps (0 coords) [done]
> 01:02:11,931 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: could not determine coordinator from rsps 1 rsps (0 coords) [done]
> 01:02:11,931 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: nodes to choose new coord from are: [Test-NodeB-43694, Test-NodeA-29550]
> 01:02:11,931 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-43694: nodes to choose new coord from are: [Test-NodeB-43694, Test-NodeA-29550]
> 01:02:11,931 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-43694: I (Test-NodeB-43694) am the first of the nodes, will become coordinator
> 01:02:11,931 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: I (Test-NodeA-29550) am not the first of the nodes, waiting for another client to become coordinator
> 01:02:11,932 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: discovery took 0 ms, members: 1 rsps (0 coords) [done]
> 01:02:11,932 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: could not determine coordinator from rsps 1 rsps (0 coords) [done]
> 01:02:11,932 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: nodes to choose new coord from are: [Test-NodeB-43694, Test-NodeA-29550]
> ...
> 01:02:11,941 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: could not determine coordinator from rsps 1 rsps (0 coords) [done]
> 01:02:11,941 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: nodes to choose new coord from are: [Test-NodeB-43694, Test-NodeA-29550]
> 01:02:11,941 TRACE (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: I (Test-NodeA-29550) am not the first of the nodes, waiting for another client to become coordinator
> 01:02:11,942 WARN (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: too many JOIN attempts (10): becoming singleton
> 01:02:11,942 DEBUG (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: installing view [Test-NodeA-29550|0] (1) [Test-NodeA-29550]
> 01:02:11,977 DEBUG (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-43694: created cluster (first member). My view is [Test-NodeB-43694|0], impl is org.jgroups.protocols.pbcast.CoordGmsImpl
> 01:02:11,977 DEBUG (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-29550: created cluster (first member). My view is [Test-NodeA-29550|0], impl is org.jgroups.protocols.pbcast.CoordGmsImpl
> {noformat}
> The problem seems to be that the coordinator takes longer to install the initial view and update {{LOCAL_PING}}'s {{PingData}} than the other node takes to retry the discovery process 10 times.
> In some cases there is no retry, because one node starts slightly faster, but it's not yet coordinator when the 2nd node does its discovery, and both nodes decide they should be coordinator:
> {noformat}
> 01:13:44,460 INFO (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-5386: no members discovered after 3 ms: creating cluster as first member
> 01:13:44,463 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-51165: discovery took 1 ms, members: 1 rsps (0 coords) [done]
> 01:13:44,465 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-51165: could not determine coordinator from rsps 1 rsps (0 coords) [done]
> 01:13:44,465 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-51165: nodes to choose new coord from are: [Test-NodeB-51165, Test-NodeA-5386]
> 01:13:44,466 TRACE (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-51165: I (Test-NodeB-51165) am the first of the nodes, will become coordinator
> 01:13:44,466 DEBUG (ForkThread-2,ConcurrentStartTest:[]) [GMS] Test-NodeB-51165: installing view [Test-NodeB-51165|0] (1) [Test-NodeB-51165]
> 01:13:44,466 DEBUG (ForkThread-1,ConcurrentStartTest:[]) [GMS] Test-NodeA-5386: installing view [Test-NodeA-5386|0] (1) [Test-NodeA-5386]
> {noformat}
> This second failure mode seems to go away if I move the {{discovery}} map access inside the {{synchronized}} block both in {{findMembers()}} and in {{down()}}.
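> As a rough illustration of that change (simplified, hypothetical types rather than the actual {{LOCAL_PING}} source), the idea is that the discovery map lookup must happen under the same lock that guards updates, otherwise a concurrently starting node can read stale data:
> {code}
> import java.util.ArrayList;
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
>
> final class LocalPingSketch {
>     // Shared discovery state, keyed by cluster name (simplified stand-in for PingData).
>     private final Map<String, List<String>> discovery = new HashMap<>();
>
>     // Before the change, the map was read outside the synchronized block,
>     // racing with concurrent updates; here both the lookup and the copy hold the lock.
>     List<String> findMembers(String cluster) {
>         synchronized (discovery) {
>             List<String> members = discovery.get(cluster);
>             return members == null ? new ArrayList<>() : new ArrayList<>(members);
>         }
>     }
>
>     // Registration path: update the shared map under the same lock.
>     void down(String cluster, String member) {
>         synchronized (discovery) {
>             discovery.computeIfAbsent(cluster, key -> new ArrayList<>()).add(member);
>         }
>     }
> }
> {code}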
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (WFCORE-957) Wildfly not respecting subdeployment dependency order
by Katarina Hermanova (Jira)
[ https://issues.jboss.org/browse/WFCORE-957?page=com.atlassian.jira.plugin... ]
Katarina Hermanova reassigned WFCORE-957:
-----------------------------------------
Assignee: Katarina Hermanova
> Wildfly not respecting subdeployment dependency order
> -----------------------------------------------------
>
> Key: WFCORE-957
> URL: https://issues.jboss.org/browse/WFCORE-957
> Project: WildFly Core
> Issue Type: Bug
> Components: Server
> Reporter: Brian Riehman
> Assignee: Katarina Hermanova
> Priority: Minor
> Attachments: wildfly-deployment-order.ear, wildfly-deployment-order.tar.gz
>
>
> When loading an EAR with {{initialize-in-order}} set to {{true}} in the {{application.xml}} and subdeployment dependencies defined in {{jboss-deployment-structure.xml}}, WildFly does not load the EAR modules in the order specified nor in the order defined by the dependencies.
> I have attached both a [source repository|https://github.com/briehman/wildfly-deployment-order] and the generated EAR. The {{application.xml}} and {{jboss-deployment-structure.xml}} are as below:
> {code}
> <?xml version="1.0" encoding="UTF-8"?>
> <application xmlns="http://java.sun.com/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/application_6.xsd" version="6">
> <display-name>ear</display-name>
> <initialize-in-order>true</initialize-in-order>
> <module>
> <web>
> <web-uri>webapp-one-1.0-SNAPSHOT.war</web-uri>
> <context-root>/one</context-root>
> </web>
> </module>
> <module>
> <web>
> <web-uri>webapp-two-1.0-SNAPSHOT.war</web-uri>
> <context-root>/two</context-root>
> </web>
> </module>
> <module>
> <web>
> <web-uri>webapp-three-1.0-SNAPSHOT.war</web-uri>
> <context-root>/three</context-root>
> </web>
> </module>
> </application>
> {code}
> {code}
> <jboss-deployment-structure xmlns="urn:jboss:deployment-structure:1.2">
> <sub-deployment name="webapp-one-1.0-SNAPSHOT.war">
> </sub-deployment>
> <sub-deployment name="webapp-two-1.0-SNAPSHOT.war">
> <dependencies>
> <module name="deployment.wildfly-deployment-order.ear.webapp-one-1.0-SNAPSHOT.war" />
> </dependencies>
> </sub-deployment>
> <sub-deployment name="webapp-three-1.0-SNAPSHOT.war">
> <dependencies>
> <module name="deployment.wildfly-deployment-order.ear.webapp-two-1.0-SNAPSHOT.war" />
> </dependencies>
> </sub-deployment>
> </jboss-deployment-structure>
> {code}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (WFLY-12753) Intermittent failure in LastNodeToLeaveRemoteEJBTestCase
by Brian Stansberry (Jira)
[ https://issues.jboss.org/browse/WFLY-12753?page=com.atlassian.jira.plugin... ]
Brian Stansberry commented on WFLY-12753:
-----------------------------------------
Scrolling up further in the log, past the previous test execution (which doesn't involve any server running on 10090) to the one before it (which does), I see:
{code}
[20:38:47][Step 2/3] [INFO] Running org.jboss.as.test.clustering.cluster.ejb.xpc.StatefulWithXPCFailoverTestCase
[20:39:42][Step 2/3] [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.511 s - in org.jboss.as.test.clustering.cluster.ejb.xpc.StatefulWithXPCFailoverTestCase
[20:39:42][Step 2/3] [INFO] Running org.jboss.as.test.clustering.cluster.ejb2.stateful.failover.RemoteEJBClientStatefulBeanFailoverTestCase
[21:59:11][Step 2/3] [INFO]
[21:59:11][Step 2/3] [INFO] Results:
[21:59:11][Step 2/3] [INFO]
[21:59:11][Step 2/3] [WARNING] Tests run: 37, Failures: 0, Errors: 0, Skipped: 5
[21:59:11][Step 2/3] [INFO]
[21:59:11][Step 2/3] [ERROR] There was a timeout or other error in the fork
{code}
So the problem looks like some sort of hang in RemoteEJBClientStatefulBeanFailoverTestCase.
> Intermittent failure in LastNodeToLeaveRemoteEJBTestCase
> --------------------------------------------------------
>
> Key: WFLY-12753
> URL: https://issues.jboss.org/browse/WFLY-12753
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Reporter: Brian Stansberry
> Assignee: Paul Ferraro
> Priority: Major
>
> It's getting fairly common lately to see dozens/hundreds of failures like at https://ci.wildfly.org/viewLog.html?buildId=173641&buildTypeId=WF_PullReq...
> The problem is that LastNodeToLeaveRemoteEJBTestCase, the first test class run in a surefire execution, fails due to a port conflict, perhaps caused by a leftover process.
> {code}
> [21:59:27][Step 2/3] [INFO] Running org.jboss.as.test.clustering.cluster.ejb.remote.byteman.LastNodeToLeaveRemoteEJBTestCase
> [21:59:36][Step 2/3] [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.321 s <<< FAILURE! - in org.jboss.as.test.clustering.cluster.ejb.remote.byteman.LastNodeToLeaveRemoteEJBTestCase
> [21:59:36][Step 2/3] [ERROR] testDNRContentsAfterLastNodeToLeave(org.jboss.as.test.clustering.cluster.ejb.remote.byteman.LastNodeToLeaveRemoteEJBTestCase) Time elapsed: 6.8 s <<< ERROR!
> [21:59:36][Step 2/3] org.jboss.arquillian.container.spi.client.container.LifecycleException:
> [21:59:36][Step 2/3] The port 10090 is already in use. It means that either the server might be already running or there is another process using port 10090.
> [21:59:36][Step 2/3] Managed containers do not support connecting to running server instances due to the possible harmful effect of connecting to the wrong server.
> [21:59:36][Step 2/3] Please stop server (or another process) before running, change to another type of container (e.g. remote) or use jboss.socket.binding.port-offset variable to change the default port.
> [21:59:36][Step 2/3] To disable this check and allow Arquillian to connect to a running server, set allowConnectingToRunningServer to true in the container configuration
> [21:59:36][Step 2/3] at org.jboss.as.arquillian.container.managed.ManagedDeployableContainer.failDueToRunning(ManagedDeployableContainer.java:323)
> [21:59:36][Step 2/3] at org.jboss.as.arquillian.container.managed.ManagedDeployableContainer.startInternal(ManagedDeployableContainer.java:81)
> [21:59:36][Step 2/3] at org.jboss.as.arquillian.container.CommonDeployableContainer.start(CommonDeployableContainer.java:123)
> [21:59:36][Step 2/3] at org.jboss.arquillian.container.impl.ContainerImpl.start(ContainerImpl.java:179)
> {code}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (WFLY-12753) Intermittent failure in LastNodeToLeaveRemoteEJBTestCase
by Brian Stansberry (Jira)
Brian Stansberry created WFLY-12753:
---------------------------------------
Summary: Intermittent failure in LastNodeToLeaveRemoteEJBTestCase
Key: WFLY-12753
URL: https://issues.jboss.org/browse/WFLY-12753
Project: WildFly
Issue Type: Bug
Components: Clustering
Reporter: Brian Stansberry
Assignee: Paul Ferraro
It's getting fairly common lately to see dozens/hundreds of failures like at https://ci.wildfly.org/viewLog.html?buildId=173641&buildTypeId=WF_PullReq...
The problem is that LastNodeToLeaveRemoteEJBTestCase, the first test class run in a surefire execution, fails due to a port conflict, perhaps caused by a leftover process.
{code}
[21:59:27][Step 2/3] [INFO] Running org.jboss.as.test.clustering.cluster.ejb.remote.byteman.LastNodeToLeaveRemoteEJBTestCase
[21:59:36][Step 2/3] [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.321 s <<< FAILURE! - in org.jboss.as.test.clustering.cluster.ejb.remote.byteman.LastNodeToLeaveRemoteEJBTestCase
[21:59:36][Step 2/3] [ERROR] testDNRContentsAfterLastNodeToLeave(org.jboss.as.test.clustering.cluster.ejb.remote.byteman.LastNodeToLeaveRemoteEJBTestCase) Time elapsed: 6.8 s <<< ERROR!
[21:59:36][Step 2/3] org.jboss.arquillian.container.spi.client.container.LifecycleException:
[21:59:36][Step 2/3] The port 10090 is already in use. It means that either the server might be already running or there is another process using port 10090.
[21:59:36][Step 2/3] Managed containers do not support connecting to running server instances due to the possible harmful effect of connecting to the wrong server.
[21:59:36][Step 2/3] Please stop server (or another process) before running, change to another type of container (e.g. remote) or use jboss.socket.binding.port-offset variable to change the default port.
[21:59:36][Step 2/3] To disable this check and allow Arquillian to connect to a running server, set allowConnectingToRunningServer to true in the container configuration
[21:59:36][Step 2/3] at org.jboss.as.arquillian.container.managed.ManagedDeployableContainer.failDueToRunning(ManagedDeployableContainer.java:323)
[21:59:36][Step 2/3] at org.jboss.as.arquillian.container.managed.ManagedDeployableContainer.startInternal(ManagedDeployableContainer.java:81)
[21:59:36][Step 2/3] at org.jboss.as.arquillian.container.CommonDeployableContainer.start(CommonDeployableContainer.java:123)
[21:59:36][Step 2/3] at org.jboss.arquillian.container.impl.ContainerImpl.start(ContainerImpl.java:179)
{code}
--
This message was sent by Atlassian Jira
(v7.13.8#713008)