Richard Opalka commented on WFCORE-3590:
----------------------------------------
"MSC service thread 1-8" was blocked because the MSC queue executor had been
terminated prematurely, so any attempt by one of its core threads to schedule new tasks
(while executing its current task) was rejected.
The MSC ServiceControllerImpl source code makes the current thread handle
RejectedExecutionException by simply executing the rejected tasks itself.
In this scenario "MSC service thread 1-8" acquired the "Lockable read lock" in one
task in the rejected tasks queue and then tried to acquire the "Lockable write lock" in
another rejected task.
Because the rejected tasks are chained (each one runs nested inside the previous one),
the thread does not release the read lock before moving on to the next task.
So the thread will wait forever for a read lock that it holds itself to be released.
The proper fix is to avoid shutting down the queue executor until all scheduled tasks
have completed.
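The read-then-write problem described above can be reproduced in isolation. The following is only a minimal sketch: it uses java.util.concurrent's ReentrantReadWriteLock as a stand-in for MSC's Lockable (an assumption, since the two are different classes), and the class and method names are hypothetical. It uses the non-blocking tryLock() so that the demo reports the failure instead of hanging the way the real thread did.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadThenWriteDemo {
    // Stand-in for org.jboss.msc.service.Lockable (assumption, not the real class).
    static final ReentrantReadWriteLock LOCK = new ReentrantReadWriteLock();

    /** Plays the role of the second rejected task: it needs the write lock. */
    static boolean tryRemoveTask() {
        // Non-blocking attempt; a blocking lock() here would wait forever
        // if the calling thread still holds the read lock.
        boolean acquired = LOCK.writeLock().tryLock();
        if (acquired) {
            LOCK.writeLock().unlock();
        }
        return acquired;
    }

    /** Plays the role of the first task: takes the read lock, then runs the next task inline. */
    static boolean chainedTasks() {
        LOCK.readLock().lock();
        try {
            // The nested task runs before the read lock is released.
            return tryRemoveTask();
        } finally {
            LOCK.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("write lock acquired while holding read lock: " + chainedTasks());
        System.out.println("write lock acquired after read lock released: " + tryRemoveTask());
    }
}
```

The first call reports false because a read lock cannot be upgraded to a write lock by the thread that holds it; the second reports true once the read lock has been released.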
Hang in ServerStartFailureTestCase
----------------------------------
Key: WFCORE-3590
URL: https://issues.jboss.org/browse/WFCORE-3590
Project: WildFly Core
Issue Type: Bug
Components: Domain Management, Server
Affects Versions: 4.0.0.Alpha9
Reporter: Brian Stansberry
Assignee: Richard Opalka
Priority: Critical
Attachments: WFCORE-3590-threads.txt
Hang observed in
https://ci.wildfly.org/viewLog.html?buildId=88611&buildTypeId=WildFly...
I'll attach the thread dump.
[~dmlloyd] I assigned this to you mostly as a form of ping, as I want to talk to you
about it and you are away today.
Interesting parts of the thread dump:
{code}
"Thread-2" #11 prio=5 os_prio=0 tid=0xe13f0400 nid=0x4c49 waiting on condition [0xde4ed000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xe5ea9de8> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at org.jboss.as.server.BootstrapImpl$ShutdownHook.shutdown(BootstrapImpl.java:276)
at org.jboss.as.server.BootstrapImpl$ShutdownHook.run(BootstrapImpl.java:240)
"Controller Boot Thread" #25 prio=5 os_prio=0 tid=0xe0ca4c00 nid=0x4c35 waiting for monitor entry [0xdf3fe000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Shutdown.exit(Shutdown.java:212)
- waiting to lock <0xe31d5e18> (a java.lang.Class for java.lang.Shutdown)
at java.lang.Runtime.exit(Runtime.java:109)
at java.lang.System.exit(System.java:971)
at org.jboss.as.server.SystemExiter$DefaultExiter.exit(SystemExiter.java:117)
at org.jboss.as.server.SystemExiter.logAndExit(SystemExiter.java:98)
at org.jboss.as.server.ServerService.boot(ServerService.java:405)
at org.jboss.as.controller.AbstractControllerService$1.run(AbstractControllerService.java:370)
at java.lang.Thread.run(Thread.java:748)
"MSC service thread 1-8" #20 prio=5 os_prio=0 tid=0x087b8c00 nid=0x4c2f in Object.wait() [0xe03ba000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0xe32a3f28> (a org.jboss.msc.service.ServiceRegistrationImpl)
at java.lang.Object.wait(Object.java:502)
at org.jboss.msc.service.Lockable.acquireWrite(Lockable.java:97)
at org.jboss.msc.service.ServiceControllerImpl$RemoveTask.execute(ServiceControllerImpl.java:1865)
- locked <0xe32a3f28> (a org.jboss.msc.service.ServiceRegistrationImpl)
at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1527)
at org.jboss.msc.service.ServiceControllerImpl.doExecute(ServiceControllerImpl.java:788)
at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1537)
at org.jboss.msc.service.ServiceControllerImpl.doExecute(ServiceControllerImpl.java:788)
at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1537)
at org.jboss.msc.service.ServiceControllerImpl.doExecute(ServiceControllerImpl.java:788)
at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1537)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1979)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1481)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1374)
at java.lang.Thread.run(Thread.java:748)
"main" #1 prio=5 os_prio=0 tid=0xf6509000 nid=0x4c02 in Object.wait() [0xf6685000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0xe32b58e8> (a org.jboss.as.server.BootstrapImpl$ShutdownHook)
at java.lang.Thread.join(Thread.java:1252)
- locked <0xe32b58e8> (a org.jboss.as.server.BootstrapImpl$ShutdownHook)
at java.lang.Thread.join(Thread.java:1326)
at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
at java.lang.Shutdown.runHooks(Shutdown.java:123)
at java.lang.Shutdown.sequence(Shutdown.java:167)
at java.lang.Shutdown.exit(Shutdown.java:212)
- locked <0xe31d5e18> (a java.lang.Class for java.lang.Shutdown)
at java.lang.Runtime.exit(Runtime.java:109)
at java.lang.System.exit(System.java:971)
at org.jboss.as.server.SystemExiter$DefaultExiter.exit(SystemExiter.java:117)
at org.jboss.as.server.SystemExiter.logAndExit(SystemExiter.java:98)
at org.jboss.as.server.DomainServerMain.main(DomainServerMain.java:183)
at java.lang.invoke.LambdaForm$DMH/7468253.invokeStatic_L_V(LambdaForm$DMH)
at java.lang.invoke.LambdaForm$MH/7742980.invokeExact_MT(LambdaForm$MH)
at org.jboss.modules.Module.runMainMethod(Module.java:348)
at org.jboss.modules.Module.run(Module.java:328)
at org.jboss.modules.Main.main(Main.java:557)
{code}
This is a domain server. The "main" thread has recognized that its
ProcessController has closed its stdin, so it is shutting down via System.exit.
"Thread-2" is running BootstrapImpl.ShutdownHook, waiting on a latch for the
MSC ServiceContainer to complete termination. So the SC not completing termination is the
basic issue.
"Controller Boot Thread" is there because this termination occurred during
boot. That caused some problem during boot (not surprising), so it is responding to that
problem by trying to terminate the process via System.exit. It's blocked waiting for
"main", which has done the same. This thread should not be preventing MSC from
terminating though; it's not, for example, called as part of a
StartContext.asynchronous thing. IOW I don't think this thread is relevant to the
problem.
"MSC service thread 1-8" is the most interesting one to me. An MSC thread is
blocked but it's not clear to me why. An interesting frame in the stack is
org.jboss.msc.service.ServiceControllerImpl.doExecute(ServiceControllerImpl.java:788).
That shows that ServiceControllerImpl$RemoveTask was passed to the executor but a
RejectedExecutionException was thrown, so the task is being run from the thread that
attempted to pass it to the executor. Should the MSC executor be rejecting tasks before
all service controllers are removed?
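To illustrate why the stack shows doExecute and ControllerTask.run nested several times, here is a rough sketch of the "run it yourself when rejected" pattern. It is not the MSC code: the class InlineOnRejectDemo and its methods are hypothetical, and a plain JDK thread pool stands in for EnhancedQueueExecutor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class InlineOnRejectDemo {
    /** Hands the task to the executor, or runs it on the calling thread if it is rejected. */
    static void doExecute(ExecutorService executor, Runnable task) {
        try {
            executor.execute(task);
        } catch (RejectedExecutionException e) {
            // Fallback: the submitting thread runs the task itself, so the task
            // executes nested inside whatever the submitter was already doing.
            task.run();
        }
    }

    /** Schedules a task that schedules another task, on an executor that is already shut down. */
    static List<String> runChain() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        executor.shutdown(); // from here on, execute() rejects new tasks

        List<String> order = new ArrayList<>();
        String caller = Thread.currentThread().getName();
        doExecute(executor, () -> {
            order.add("outer on caller: " + caller.equals(Thread.currentThread().getName()));
            // The nested task is rejected too and runs before the outer task returns.
            doExecute(executor, () ->
                order.add("inner on caller: " + caller.equals(Thread.currentThread().getName())));
        });
        return order;
    }

    public static void main(String[] args) {
        runChain().forEach(System.out::println);
    }
}
```

Both tasks end up running on the submitting thread, one inside the other, so anything the outer task holds (such as a lock) is still held while the inner task runs.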