[JBoss JIRA] (WFLY-12633) Server doesn't start when DNS_PING is configured
by James Perkins (Jira)
James Perkins created WFLY-12633:
------------------------------------
Summary: Server doesn't start when DNS_PING is configured
Key: WFLY-12633
URL: https://issues.jboss.org/browse/WFLY-12633
Project: WildFly
Issue Type: Bug
Components: Clustering
Reporter: Jean Francois Denise
Assignee: Radoslav Husar
NPE at start.
The root cause is that <module name="sun.jdk"/> is missing from org.wildfly.clustering.service module.
Exception:
Caused by: java.lang.NullPointerException
at org.jgroups.protocols.dns.DNS_PING.destroy(DNS_PING.java:70)
at java.util.ArrayList.forEach(ArrayList.java:1257)
at org.jgroups.stack.ProtocolStack.destroy(ProtocolStack.java:876)
at org.jgroups.stack.ProtocolStack.initProtocolStack(ProtocolStack.java:867)
at org.jgroups.stack.ProtocolStack.init(ProtocolStack.java:849)
at org.jgroups.JChannel.<init>(JChannel.java:155)
at org.jboss.as.clustering.jgroups.JChannelFactory.createChannel(JChannelFactory.java:116)
at org.jboss.as.clustering.jgroups.subsystem.ChannelServiceConfigurator.get(ChannelServiceConfigurator.java:96)
Hidden exception:
Failed instantiate InitialContextFactory com.sun.jndi.dns.DnsContextFactory from classloader ModuleClassLoader for Module "org.wildfly.clustering.service" version 18.0.0.Final-SNAPSHOT from local module loader @2d3fcdbd (finder: local module finder @617c74e5 (roots: /home/jdenise/workspaces/wildfly-jfdenise/build/target/wildfly-18.0.0.Final-SNAPSHOT/modules,/home/jdenise/workspaces/wildfly-jfdenise/build/target/wildfly-18.0.0.Final-SNAPSHOT/modules/system/layers/base)) [Root exception is java.lang.ClassNotFoundException: com.sun.jndi.dns.DnsContextFactory from [Module "org.wildfly.clustering.service" version 18.0.0.Final-SNAPSHOT from local module loader @2d3fcdbd (finder: local module finder @617c74e5 (roots: /home/jdenise/workspaces/wildfly-jfdenise/build/target/wildfly-18.0.0.Final-SNAPSHOT/modules,/home/jdenise/workspaces/wildfly-jfdenise/build/target/wildfly-18.0.0.Final-SNAPSHOT/modules/system/layers/base))]]
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (JGRP-2387) Message from a non-member causes FD_ALL to continually suspect it
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2387?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2387:
--------------------------------
{quote}
This triggers VERIFY_SUSPECT to try to ping it, which it can't because it doesn't have the address (but can cause a "no physical address" log in some cases).
{quote}
I'm wondering why the logical address cache in the transport would _not_ have the address: unless the member to which we're trying to send the message _has never been a member_, its address should still be in the cache! The cache only removes 'removable' elements when UDP.logical_addr_cache_max_size (default=2000) has been exceeded, but this should not be the case! Or does JDG set this value by default?
> Message from a non-member causes FD_ALL to continually suspect it
> -----------------------------------------------------------------
>
> Key: JGRP-2387
> URL: https://issues.jboss.org/browse/JGRP-2387
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.1
> Reporter: Dennis Reed
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.1.6
>
> Attachments: Test.java
>
>
> If an FD_ALL control message from a non-member is seen by FD_ALL, it will start continually suspecting that node. If msg_counts_as_heartbeat=true then any message from a non-member triggers the issue. The issue is cleared on the next view change.
> This does not cause any functional issues in the cluster, but can cause repeated WARN logs in some cases.
[JBoss JIRA] (JGRP-2387) Message from a non-member causes FD_ALL to continually suspect it
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2387?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2387:
--------------------------------
Workaround to make the warnings go away:
* Set VERIFY_SUSPECT.use_icmp to true
* Remove VERIFY_SUSPECT
* Increase UDP.who_has_cache_timeout (this should reduce the frequency of the warnings, but not completely eliminate them)
I haven't tried these, but perhaps it's worth experimenting with these workarounds until I have a fix?
> Message from a non-member causes FD_ALL to continually suspect it
> -----------------------------------------------------------------
>
> Key: JGRP-2387
> URL: https://issues.jboss.org/browse/JGRP-2387
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.1
> Reporter: Dennis Reed
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.1.6
>
> Attachments: Test.java
>
>
> If an FD_ALL control message from a non-member is seen by FD_ALL, it will start continually suspecting that node. If msg_counts_as_heartbeat=true then any message from a non-member triggers the issue. The issue is cleared on the next view change.
> This does not cause any functional issues in the cluster, but can cause repeated WARN logs in some cases.
[JBoss JIRA] (WFLY-12632) Downgrade MicroProfile Health to 2.0.1
by Jeff Mesnil (Jira)
Jeff Mesnil created WFLY-12632:
----------------------------------
Summary: Downgrade MicroProfile Health to 2.0.1
Key: WFLY-12632
URL: https://issues.jboss.org/browse/WFLY-12632
Project: WildFly
Issue Type: Component Upgrade
Components: MP Health
Reporter: Jeff Mesnil
Assignee: Jeff Mesnil
Fix For: 18.0.0.Final
MP Health was upgraded to 2.1 in WFLY-12555 at the same time that smallrye-health was upgraded to 2.0.0 in WFLY-12554.
That was a mistake, as smallrye-health depends on MP Health 2.0.1.
There is no harm in staying on MP Health 2.1, as the changes are internal to the API and do not require any implementation from the provider:
https://github.com/eclipse/microprofile-health/compare/2.0.1...2.1
However, if we decide to keep WildFly 18 under the same MP Umbrella release (3.0) at the moment, it might be better to downgrade to 2.0.1
[JBoss JIRA] (DROOLS-4563) Upgrade javax.validation from 1.0.0.GA to 2.0.1.Final
by Marek Novotny (Jira)
[ https://issues.jboss.org/browse/DROOLS-4563?page=com.atlassian.jira.plugi... ]
Marek Novotny updated DROOLS-4563:
----------------------------------
Sprint: 2019 Week 38-40 (from Sep 16)
> Upgrade javax.validation from 1.0.0.GA to 2.0.1.Final
> ------------------------------------------------------
>
> Key: DROOLS-4563
> URL: https://issues.jboss.org/browse/DROOLS-4563
> Project: Drools
> Issue Type: Feature Request
> Reporter: Michael Biarnes Kiefer
> Assignee: Marek Novotny
> Priority: Optional
>
> To upgrade javax.validation [(PR)|https://github.com/kiegroup/droolsjbpm-build-bootstrap/pull/1055], all GWT repos (uberfire, errai, wb, etc.) need to be checked, because the new javax.validation version is not supported in these repos.
> The version upgrade is needed for spring-boot and quarkus, so it should be upgraded in kie-parent but overridden with the old version in all poms of GWT repos that don't support the new version. The idea is to add a dependency override in the root pom of those repos so that they keep using the original older version.
> On the other hand, existing overrides like [this|https://github.com/kiegroup/droolsjbpm-integration/blob/81415ae5cdc4...] should then be removed.
> All repos should be examined to see whether they have an override or whether they need the old version.
[JBoss JIRA] (JGRP-2387) Message from a non-member causes FD_ALL to continually suspect it
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2387?page=com.atlassian.jira.plugin.... ]
Bela Ban edited comment on JGRP-2387 at 10/3/19 9:13 AM:
---------------------------------------------------------
FD_ALL or FD_SOCK? I guess FD_ALL... Looking into this now... Changed the original issue to refer to FD_ALL, not FD_SOCK
was (Author: belaban):
FD_ALL or FD_SOCK? I guess FD_ALL... Looking into this now...
> Message from a non-member causes FD_ALL to continually suspect it
> -----------------------------------------------------------------
>
> Key: JGRP-2387
> URL: https://issues.jboss.org/browse/JGRP-2387
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.1
> Reporter: Dennis Reed
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.1.6
>
> Attachments: Test.java
>
>
> If an FD_ALL control message from a non-member is seen by FD_ALL, it will start continually suspecting that node. If msg_counts_as_heartbeat=true then any message from a non-member triggers the issue. The issue is cleared on the next view change.
> This does not cause any functional issues in the cluster, but can cause repeated WARN logs in some cases.
[JBoss JIRA] (JGRP-2387) Message from a non-member causes FD_ALL to continually suspect it
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2387?page=com.atlassian.jira.plugin.... ]
Bela Ban edited comment on JGRP-2387 at 10/3/19 9:13 AM:
---------------------------------------------------------
The technical detail:
FD_ALL keeps track of the time the last message from each member was seen in the "timestamps" map.
It periodically suspects any entries in this map whose timestamps are too old.
When a new view is installed, any members that left are removed from the map, and an entry is added for each member if it doesn't already exist.
When any FD_ALL message is received from a member its entry in "timestamps" is updated.
If msg_counts_as_heartbeat is on then the same is done for every message from that member. (this is off by default)
The problem: When it updates the timestamp, no membership check is done first.
So a message from a non-member causes an entry to be added to the map; that entry is not removed until the next view is processed, and in the meantime FD_ALL continually sends suspect events for it up the stack.
This triggers VERIFY_SUSPECT to try to ping it, which it can't because it doesn't have the address (but can cause a "no physical address" log in some cases).
VERIFY_SUSPECT will eventually send SUSPECT events up the stack, which are ignored by GMS because the node isn't part of the cluster.
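The missing membership check described above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions: `HeartbeatTracker`, its methods and the `checkMembership` flag are hypothetical names invented for illustration, members are plain strings, and none of this is the actual FD_ALL code or the eventual fix.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified model of a heartbeat-timestamp tracker; not the JGroups FD_ALL source. */
class HeartbeatTracker {
    // Last time a message was seen from each tracked node (the "timestamps" map).
    private final Map<String, Long> timestamps = new ConcurrentHashMap<>();
    private volatile Set<String> members = new HashSet<>();
    // Assumption: toggles the membership check so both behaviours can be compared.
    private final boolean checkMembership;

    HeartbeatTracker(boolean checkMembership) {
        this.checkMembership = checkMembership;
    }

    /** New view: drop entries for nodes that left, add entries for members not yet tracked. */
    void installView(Collection<String> newMembers, long now) {
        members = new HashSet<>(newMembers);
        timestamps.keySet().retainAll(members);
        for (String member : members) {
            timestamps.putIfAbsent(member, now);
        }
    }

    /** A heartbeat (or any message, if msg_counts_as_heartbeat is on) arrived from sender. */
    void update(String sender, long now) {
        if (checkMembership && !members.contains(sender)) {
            return; // ignore non-members instead of creating an entry for them
        }
        timestamps.put(sender, now);
    }

    /** Returns every tracked node whose last timestamp is older than the timeout. */
    List<String> suspects(long now, long timeout) {
        List<String> result = new ArrayList<>();
        timestamps.forEach((node, seen) -> {
            if (now - seen > timeout) {
                result.add(node);
            }
        });
        return result;
    }
}
```

With `checkMembership` set to false, a single `update` call for a non-member leaves that node in the map, and every later `suspects` call reports it until the next `installView`. With the check enabled, the message is ignored and no entry is created.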
was (Author: dereed):
The technical detail:
FD_SOCK keeps track of the time the last message from each member was seen in the "timestamps" map.
It periodically suspects any entries in this map whose timestamps are too old.
When a new view is installed, any members that left are removed from the map, and an entry is added for each member if it doesn't already exist.
When any FD_SOCK message is received from a member its entry in "timestamps" is updated.
If msg_counts_as_heartbeat is on then the same is done for every message from that member. (this is off by default)
The problem: When it updates the timestamp, no membership check is done first.
So a message from a non-member triggers an entry added to the table, which is never removed until the next view is processed, and will continually send suspect events up the stack.
This triggers VERIFY_SUSPECT to try to ping it, which it can't because it doesn't have the address (but can cause a "no physical address" log in some cases).
VERIFY_SUSPECT will eventually send SUSPECT events up the stack, which are ignored by GMS because the node isn't part of the cluster.
> Message from a non-member causes FD_ALL to continually suspect it
> -----------------------------------------------------------------
>
> Key: JGRP-2387
> URL: https://issues.jboss.org/browse/JGRP-2387
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.1
> Reporter: Dennis Reed
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.1.6
>
> Attachments: Test.java
>
>
> If an FD_ALL control message from a non-member is seen by FD_ALL, it will start continually suspecting that node. If msg_counts_as_heartbeat=true then any message from a non-member triggers the issue. The issue is cleared on the next view change.
> This does not cause any functional issues in the cluster, but can cause repeated WARN logs in some cases.
[JBoss JIRA] (JGRP-2387) Message from a non-member causes FD_ALL to continually suspect it
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2387?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-2387:
---------------------------
Description:
If an FD_ALL control message from a non-member is seen by FD_ALL, it will start continually suspecting that node. If msg_counts_as_heartbeat=true then any message from a non-member triggers the issue. The issue is cleared on the next view change.
This does not cause any functional issues in the cluster, but can cause repeated WARN logs in some cases.
was:
If a FD_SOCK control message from a non-member is seen by FD_SOCK, it will start continually suspecting that node. If msg_counts_as_heartbeat=true then any message from a non-member triggers the issue. The issue is cleared on the next view change.
This does not cause any functional issues in the cluster, but can cause repeated WARN logs in some cases.
> Message from a non-member causes FD_ALL to continually suspect it
> -----------------------------------------------------------------
>
> Key: JGRP-2387
> URL: https://issues.jboss.org/browse/JGRP-2387
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.1
> Reporter: Dennis Reed
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.1.6
>
> Attachments: Test.java
>
>
> If an FD_ALL control message from a non-member is seen by FD_ALL, it will start continually suspecting that node. If msg_counts_as_heartbeat=true then any message from a non-member triggers the issue. The issue is cleared on the next view change.
> This does not cause any functional issues in the cluster, but can cause repeated WARN logs in some cases.
[JBoss JIRA] (JGRP-2388) DNS_PING#destroy could yield NPE hiding the root cause
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2388?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-2388:
---------------------------
Fix Version/s: 4.1.6
> DNS_PING#destroy could yield NPE hiding the root cause
> ------------------------------------------------------
>
> Key: JGRP-2388
> URL: https://issues.jboss.org/browse/JGRP-2388
> Project: JGroups
> Issue Type: Bug
> Reporter: Radoslav Husar
> Assignee: Radoslav Husar
> Priority: Minor
> Fix For: 4.1.6
>
>
> Caused by: java.lang.NullPointerException
> at org.jgroups.protocols.dns.DNS_PING.destroy(DNS_PING.java:70)
> at java.util.ArrayList.forEach(ArrayList.java:1257)
> at org.jgroups.stack.ProtocolStack.destroy(ProtocolStack.java:876)
> at org.jgroups.stack.ProtocolStack.initProtocolStack(ProtocolStack.java:867)
> at org.jgroups.stack.ProtocolStack.init(ProtocolStack.java:849)
> at org.jgroups.JChannel.<init>(JChannel.java:155)
> at org.jboss.as.clustering.jgroups.JChannelFactory.createChannel(JChannelFactory.java:116)
> at org.jboss.as.clustering.jgroups.subsystem.ChannelServiceConfigurator.get(ChannelServiceConfigurator.java:96)
[JBoss JIRA] (JGRP-2388) DNS_PING#destroy could yield NPE hiding the root cause
by Radoslav Husar (Jira)
[ https://issues.jboss.org/browse/JGRP-2388?page=com.atlassian.jira.plugin.... ]
Radoslav Husar commented on JGRP-2388:
--------------------------------------
e.g. when the DNS resolver classes fail to load; with the proposed fix, the above would result in
{code}
Caused by: java.lang.ClassNotFoundException: com.sun.jndi.dns.DnsContextFactory from [Module "org.wildfly.clustering.service" version 18.0.0.Final-SNAPSHOT from local module loader @6537cf78 (finder: local module finder @67b6d4ae (roots: /Users/rhusar/git/wildfly/build/target/wildfly-18.0.0.Final-SNAPSHOT/modules,/Users/rhusar/git/wildfly/build/target/wildfly-18.0.0.Final-SNAPSHOT/modules/system/layers/base))]
at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:255)
at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:410)
at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:398)
at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:116)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.jboss.as.naming.InitialContext.getDefaultInitCtx(InitialContext.java:115)
... 25 more
{code}
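How a cleanup method can mask the original failure, and how a null guard avoids it, can be sketched in a few lines of Java. The sketch assumes that a failed init leaves a field unset and that destroy is then called as cleanup, which is what the stack trace in the issue description suggests. `DiscoverySketch`, `Resolver` and `start` are hypothetical names; this is not the DNS_PING source and not necessarily the shape of the proposed fix.

```java
/** Hypothetical stand-in for a discovery protocol; not the JGroups DNS_PING source. */
class DiscoverySketch {
    /** Hypothetical resource that init() is supposed to create. */
    interface Resolver {
        void close();
    }

    private Resolver resolver; // stays null if init() fails before the assignment

    void init(boolean resolverAvailable) {
        if (!resolverAvailable) {
            // Stands in for the resolver classes failing to load.
            throw new IllegalStateException("resolver classes failed to load");
        }
        resolver = () -> { };
    }

    void destroy() {
        // Without this guard, destroy() after a failed init() throws a
        // NullPointerException, and that NPE is what the caller sees.
        if (resolver != null) {
            resolver.close();
        }
    }

    /** Mimics a stack that destroys the protocol when its init fails. */
    static Throwable start(boolean resolverAvailable) {
        DiscoverySketch protocol = new DiscoverySketch();
        try {
            protocol.init(resolverAvailable);
            return null;
        } catch (RuntimeException initFailure) {
            protocol.destroy(); // cleanup must not throw, or it replaces initFailure
            return initFailure;
        }
    }
}
```

In this model, `DiscoverySketch.start(false)` returns the original `IllegalStateException` rather than failing with a `NullPointerException` from the cleanup path.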
> DNS_PING#destroy could yield NPE hiding the root cause
> ------------------------------------------------------
>
> Key: JGRP-2388
> URL: https://issues.jboss.org/browse/JGRP-2388
> Project: JGroups
> Issue Type: Bug
> Reporter: Radoslav Husar
> Assignee: Radoslav Husar
> Priority: Minor
>
> Caused by: java.lang.NullPointerException
> at org.jgroups.protocols.dns.DNS_PING.destroy(DNS_PING.java:70)
> at java.util.ArrayList.forEach(ArrayList.java:1257)
> at org.jgroups.stack.ProtocolStack.destroy(ProtocolStack.java:876)
> at org.jgroups.stack.ProtocolStack.initProtocolStack(ProtocolStack.java:867)
> at org.jgroups.stack.ProtocolStack.init(ProtocolStack.java:849)
> at org.jgroups.JChannel.<init>(JChannel.java:155)
> at org.jboss.as.clustering.jgroups.JChannelFactory.createChannel(JChannelFactory.java:116)
> at org.jboss.as.clustering.jgroups.subsystem.ChannelServiceConfigurator.get(ChannelServiceConfigurator.java:96)