[
https://issues.jboss.org/browse/WFCORE-3572?page=com.atlassian.jira.plugi...
]
Brian Stansberry edited comment on WFCORE-3572 at 2/5/18 8:58 AM:
------------------------------------------------------------------
[~dmlloyd] I believe this is the same thing as what we were discussing in the Chicago
airport on the way to Europe: the initial failure on
https://ci.wildfly.org/viewLog.html?buildId=86953&tab=buildResultsDiv....
I'm pretty sure the difference between the two is that the CI job ran with asserts
enabled, so it failed with an AssertionError at L71, while Martin didn't have asserts
enabled, so it NPE'd at L72.
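To illustrate why the same double call can surface as two different exceptions, here is
a minimal Java sketch. It assumes a hypothetical stopDone-like method with an assert on
one line and a dereference on the next; the class and field names are invented for the
example and are not taken from ManagementWorkerService.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class AssertVsNpeSketch {
    // Hypothetical stand-in for the context that a stop callback completes.
    static final class StopContext {
        final AtomicInteger completions = new AtomicInteger();
        void complete() { completions.incrementAndGet(); }
    }

    private StopContext stopContext;

    void beginStop(StopContext ctx) { this.stopContext = ctx; }

    // Mirrors the "assert, then dereference" shape described above (assumed).
    void stopDone() {
        StopContext ctx = stopContext;
        stopContext = null;
        assert ctx != null;   // AssertionError here only when run with -ea
        ctx.complete();       // NullPointerException here when asserts are off
    }

    // Calls stopDone() twice and reports what the second call threw.
    static String secondCallOutcome() {
        AssertVsNpeSketch svc = new AssertVsNpeSketch();
        svc.beginStop(new StopContext());
        svc.stopDone();
        try {
            svc.stopDone();
            return "none";
        } catch (AssertionError e) {
            return "AssertionError";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(secondCallOutcome());
    }
}
```

Run with `java -ea` the second call reports AssertionError; without `-ea` it reports
NullPointerException, one line later.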
I analyzed the uses of the 'worker' controlled by ManagementWorkerService (MWS) and I
don't see any place where its lifecycle is being incorrectly controlled (e.g.
worker.shutdown() being called when it shouldn't be).
My look into this was not exhaustive, but the only thing I could think of is the
possibility of a logic flaw within EnhancedQueueExecutor (EQE). The failure arises out of
its "completeTermination()" method, which I see can be called both from
EQE.shutdown(boolean) and EQE.tryDeallocateThread(long). If a logic flaw results in one
thread calling completeTermination from shutdown and another thread calling it from
tryDeallocateThread, we would see MWS.stopDone being called twice.
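The suspected double invocation can be sketched in isolation. The following is a toy
model, not EnhancedQueueExecutor's actual code: ToyExecutor and its two entry points are
hypothetical stand-ins. It shows how a termination path reachable from two callers runs
the termination task twice when unguarded, and how a compare-and-set guard limits it to
one run.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class ToyExecutor {
    private final Runnable terminationTask;
    private final boolean guarded;
    private final AtomicBoolean terminated = new AtomicBoolean();

    ToyExecutor(Runnable terminationTask, boolean guarded) {
        this.terminationTask = terminationTask;
        this.guarded = guarded;
    }

    // Hypothetical entry point 1: an explicit shutdown request.
    void shutdown() { completeTermination(); }

    // Hypothetical entry point 2: the last worker thread exiting.
    void lastThreadExited() { completeTermination(); }

    private void completeTermination() {
        // With the guard, only the first caller wins the CAS and runs the task.
        if (guarded && !terminated.compareAndSet(false, true)) {
            return;
        }
        terminationTask.run();
    }

    // Drives both entry points and returns how often the task ran.
    static int runsWhenBothPathsFire(boolean guarded) {
        AtomicInteger runs = new AtomicInteger();
        ToyExecutor executor = new ToyExecutor(runs::incrementAndGet, guarded);
        executor.shutdown();
        executor.lastThreadExited();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + runsWhenBothPathsFire(false));
        System.out.println("guarded:   " + runsWhenBothPathsFire(true));
    }
}
```

The demo drives both paths sequentially to keep it deterministic; the same guard would
also cover the concurrent case, since compareAndSet lets only one thread through.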
A simple workaround is to remove the assert at L71 in MWS.stopDone and replace it with a
null check. But that would be papering over the problem.
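A minimal sketch of that workaround, assuming a hypothetical service that stores its stop
context in a field; the names below are invented and are not ManagementWorkerService's
real members. Swapping the field to null atomically makes a second stopDone() call a
no-op, and counting the ignored calls keeps the underlying double invocation visible
instead of hiding it entirely.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

final class TolerantStopSketch {
    // Hypothetical stand-in for the object completed when the stop finishes.
    interface StopContext { void complete(); }

    private final AtomicReference<StopContext> stopContext = new AtomicReference<>();
    private final AtomicInteger ignoredCalls = new AtomicInteger();

    void beginStop(StopContext ctx) { stopContext.set(ctx); }

    // Null check instead of assert: only the first caller sees a non-null context.
    void stopDone() {
        StopContext ctx = stopContext.getAndSet(null);
        if (ctx == null) {
            ignoredCalls.incrementAndGet(); // record the duplicate for diagnosis
            return;
        }
        ctx.complete();
    }

    int ignoredCalls() { return ignoredCalls.get(); }

    // Calls stopDone() twice; returns {completions, ignored duplicate calls}.
    static int[] demo() {
        AtomicInteger completions = new AtomicInteger();
        TolerantStopSketch svc = new TolerantStopSketch();
        svc.beginStop(completions::incrementAndGet);
        svc.stopDone();
        svc.stopDone();
        return new int[] { completions.get(), svc.ignoredCalls() };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("completions=" + result[0] + ", ignored=" + result[1]);
    }
}
```

This only tolerates the duplicate call; it does not explain why the termination task
would run twice in the first place.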
NullPointerException during server reload
-----------------------------------------
Key: WFCORE-3572
URL: https://issues.jboss.org/browse/WFCORE-3572
Project: WildFly Core
Issue Type: Bug
Components: Domain Management, Remoting
Affects Versions: 4.0.0.Alpha7
Reporter: Martin Choma
Assignee: David Lloyd
Priority: Blocker
Reproducer:
{code}
0. build wildfly from master
1. ./standalone.sh
2. ./jboss-cli.sh
connect
reload
reload
reload
...
{code}
An NPE appears in the log:
{code}
09:52:28,772 INFO [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: WildFly Core 4.0.0.Alpha7 "Kenny" started in 196ms - Started 292 of 532 services (327 services are lazy, passive or on-demand)
09:52:32,175 INFO [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0008: Undertow HTTPS listener https suspending
09:52:32,176 INFO [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-8) WFLYJCA0010: Unbound data source [java:jboss/datasources/ExampleDS]
09:52:32,176 INFO [org.wildfly.extension.undertow] (MSC service thread 1-6) WFLYUT0008: Undertow HTTP listener default suspending
09:52:32,176 INFO [org.wildfly.extension.undertow] (MSC service thread 1-5) WFLYUT0019: Host default-host stopping
09:52:32,176 INFO [org.wildfly.extension.undertow] (MSC service thread 1-6) WFLYUT0007: Undertow HTTP listener default stopped, was bound to 127.0.0.1:8080
09:52:32,177 INFO [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0007: Undertow HTTPS listener https stopped, was bound to 127.0.0.1:8443
09:52:32,179 INFO [org.wildfly.extension.undertow] (MSC service thread 1-8) WFLYUT0004: Undertow 1.4.22.Final stopping
09:52:32,180 INFO [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-8) WFLYJCA0019: Stopped Driver service with driver-name = h2
09:52:32,184 INFO [org.jboss.as.mail.extension] (MSC service thread 1-1) WFLYMAIL0002: Unbound mail session [java:jboss/mail/Default]
09:52:32,182 ERROR [org.jboss.threads.errors] (management task-1) Thread Thread[management task-1,5,main] threw an uncaught exception: java.lang.NullPointerException
    at org.jboss.as.server.mgmt.ManagementWorkerService.stopDone(ManagementWorkerService.java:72)
    at org.xnio.XnioWorker$1.run(XnioWorker.java:138)
    at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1979)
    at org.jboss.threads.EnhancedQueueExecutor.completeTermination(EnhancedQueueExecutor.java:1755)
    at org.jboss.threads.EnhancedQueueExecutor.tryDeallocateThread(EnhancedQueueExecutor.java:1578)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1393)
    at java.lang.Thread.run(Thread.java:745)
09:52:32,187 INFO [org.jboss.as] (MSC service thread 1-6) WFLYSRV0050: WildFly Core 4.0.0.Alpha7 "Kenny" stopped in 20ms
09:52:32,189 INFO [org.jboss.as] (MSC service thread 1-6) WFLYSRV0049: WildFly Core 4.0.0.Alpha7 "Kenny" starting
{code}
Sometimes (in the TS) I also see an AssertionError thrown from the same area of code:
{code}
13:59:39,966 ERROR [org.jboss.threads.errors] (management task-3) Thread Thread[management task-3,5,main] threw an uncaught exception: java.lang.AssertionError
    at org.jboss.as.server.mgmt.ManagementWorkerService.stopDone(ManagementWorkerService.java:71)
    at org.xnio.XnioWorker$1.run(XnioWorker.java:138)
    at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1979)
    at org.jboss.threads.EnhancedQueueExecutor.completeTermination(EnhancedQueueExecutor.java:1755)
    at org.jboss.threads.EnhancedQueueExecutor.tryDeallocateThread(EnhancedQueueExecutor.java:1578)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1393)
    at java.lang.Thread.run(Thread.java:748)
{code}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)