[
https://issues.jboss.org/browse/WFCORE-2089?page=com.atlassian.jira.plugi...
]
Brian Stansberry edited comment on WFCORE-2089 at 12/7/16 2:07 PM:
-------------------------------------------------------------------
Two thread dumps attached. The first shows ControllerBootThread blocking in
awaitContainerStability, which eventually times out as expected and logs the timeout. Boot
then proceeds to the point captured in the second thread dump, where ControllerBootThread
is trying to write the audit log.
Both dumps show a number of MSC threads blocking, because the security code will now block
forever. A blocking operation has been introduced into service start() calls in a way that
the service author cannot know about. This is why MSC doesn't reach stability.
[~dlofthouse] your JMX change may help with some of this by getting rid of the JMX authz
stuff that seems to be the main thing blocking various MSC threads. But the blocking call
path in OperationContext.getCaller() needs to be addressed. Perhaps during boot the check
for the security identity can be dropped somehow, but I don't think that's enough.
We can't have a call path in management op execution that can block forever like this.
There's a whole set of work (see BlockingTimeout) that's all about avoiding that.
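To illustrate the point about call paths that can block forever, here is a minimal sketch in plain Java of a bounded wait. It is not the WildFly Core code and does not use BlockingTimeout; the class BoundedWaitSketch and the helper awaitBounded are hypothetical names, and the CompletableFuture merely stands in for whatever would supply the security identity.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names exist in WildFly Core.
class BoundedWaitSketch {

    // Waits for the value, but never longer than the given timeout.
    // Returns an empty Optional if the value does not arrive in time,
    // if the source failed, or if the waiting thread was interrupted.
    static <T> Optional<T> awaitBounded(CompletableFuture<T> source, long timeout, TimeUnit unit) {
        try {
            return Optional.ofNullable(source.get(timeout, unit));
        } catch (TimeoutException | ExecutionException e) {
            return Optional.empty();
        } catch (InterruptedException e) {
            // Preserve the interrupt status for callers further up the stack.
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a dependency that never becomes available,
        // e.g. because the service providing it failed to start.
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();

        // An unbounded neverCompleted.get() would hang here forever.
        Optional<String> identity = awaitBounded(neverCompleted, 200, TimeUnit.MILLISECONDS);
        System.out.println("identity present: " + identity.isPresent());
    }
}
```

With an unbounded get() the caller would hang in the same way the MSC threads do in the dumps; with the timed variant the caller gets control back and can decide whether to fail the operation or continue without the value.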
Infinite wildfly boot on StartException in service start
--------------------------------------------------------
Key: WFCORE-2089
URL: https://issues.jboss.org/browse/WFCORE-2089
Project: WildFly Core
Issue Type: Bug
Components: Domain Management, Security
Reporter: Jan Kalina
Assignee: Darran Lofthouse
Priority: Blocker
Attachments: threads-2.txt, threads.txt
The following exception (and probably similar ones) will cause WildFly to freeze during
startup. This is especially visible during integration tests (which will never end), but
it is directly reproducible too (see steps).
{code:java}
15:59:37,252 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-4) MSC000001: Failed to start service org.wildfly.security.security-realm.ManagementRealm: org.jboss.msc.service.StartException in service org.wildfly.security.security-realm.ManagementRealm: WFLYELY00025: Referenced property file is invalid: ELY01006: No realm name found in password property file
    at org.wildfly.extension.elytron.PropertiesRealmDefinition$1$1.get(PropertiesRealmDefinition.java:185)
    at org.wildfly.extension.elytron.PropertiesRealmDefinition$1$1.get(PropertiesRealmDefinition.java:164)
    at org.wildfly.extension.elytron.TrivialService.start(TrivialService.java:53)
    at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1963)
    at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1896)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
*Update:* This infinite-boot problem will occur every time the start() method of
some service throws a StartException. It is not an Elytron problem.