[JBoss JIRA] (WFCORE-2089) Infinite wildfly boot on StartException in service start
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2089?page=com.atlassian.jira.plugi... ]
Brian Stansberry commented on WFCORE-2089:
------------------------------------------
[~honza889] You wrote:
"Update: This problem with infinite boot will occur every time the start() method of some service throws StartException. Not an Elytron problem."
I don't believe this is true. Why do you say that? This security case is quite unique.
> Infinite wildfly boot on StartException in service start
> --------------------------------------------------------
>
> Key: WFCORE-2089
> URL: https://issues.jboss.org/browse/WFCORE-2089
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management, Security
> Reporter: Jan Kalina
> Assignee: Darran Lofthouse
> Priority: Blocker
> Attachments: threads-2.txt, threads.txt
>
>
> The following exception (and probably similar ones) causes WildFly to freeze during startup. This is especially visible during integration tests (which never end), but it is reproducible directly too (see steps).
> {code:java}
> 15:59:37,252 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-4) MSC000001: Failed to start service org.wildfly.security.security-realm.ManagementRealm: org.jboss.msc.service.StartException in service org.wildfly.security.security-realm.ManagementRealm: WFLYELY00025: Referenced property file is invalid: ELY01006: No realm name found in password property file
> at org.wildfly.extension.elytron.PropertiesRealmDefinition$1$1.get(PropertiesRealmDefinition.java:185)
> at org.wildfly.extension.elytron.PropertiesRealmDefinition$1$1.get(PropertiesRealmDefinition.java:164)
> at org.wildfly.extension.elytron.TrivialService.start(TrivialService.java:53)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1963)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1896)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> *Update:* This problem with infinite boot will occur every time the start() method of some service throws StartException. Not an Elytron problem.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
7 years, 7 months
[JBoss JIRA] (WFCORE-2089) Infinite wildfly boot on StartException in service start
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2089?page=com.atlassian.jira.plugi... ]
Brian Stansberry edited comment on WFCORE-2089 at 12/7/16 2:09 PM:
-------------------------------------------------------------------
Two thread dumps attached. First with ControllerBootThread blocking in awaitContainerStability, which eventually times out as expected and logs. The boot proceeds and gets to the second thread dump, where ControllerBootThread is trying to write the audit log.
In both cases ControllerBootThread is blocking waiting for a future that will never return from get() because the failed service means the dependency will never become available.
Both logs also show a bunch of MSC threads blocking, because now the security stuff will block forever. A blocking operation has been introduced into service start() calls in a way that is unknowable by the service author. This is why MSC doesn't reach stability.
[~dlofthouse] your JMX change may help with some of this by getting rid of the JMX authz stuff that seems to be the main thing blocking various MSC threads. But the blocking call path in OperationContext.getCaller() needs to be addressed. Perhaps during boot the check for the security identity can be dropped somehow, but I don't think that's enough. We can't have a call path in management op execution that can block forever like this. There's a whole set of work (see BlockingTimeout) that's all about avoiding that.
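For illustration only, here is a minimal sketch of the difference between an unbounded Future.get() and a bounded one. It is plain JDK code, not WildFly or MSC code; the class name BoundedWaitSketch, the helper getOrFallback and the "anonymous" fallback value are hypothetical names invented for this example, and it does not reproduce the actual BlockingTimeout API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitSketch {

    /**
     * Hypothetical helper: waits at most timeoutMillis for the future and
     * returns the fallback if the value does not arrive in time.
     */
    public static <T> T getOrFallback(Future<T> future, long timeoutMillis, T fallback) {
        try {
            // Bounded wait: unlike the no-argument get(), this cannot block forever.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The producer never completed the future (for example, it failed to start).
            return fallback;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for the caller and give up waiting.
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Producer failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // Stands in for a value supplied by a service that failed to start:
        // nothing will ever complete this future, so a plain get() would hang.
        CompletableFuture<String> neverAvailable = new CompletableFuture<>();
        System.out.println("identity = " + getOrFallback(neverAvailable, 200, "anonymous"));

        // A future whose producer did start returns its value immediately.
        CompletableFuture<String> available = CompletableFuture.completedFuture("admin");
        System.out.println("identity = " + getOrFallback(available, 200, "anonymous"));
    }
}
```

Run as written, the first call returns the fallback after roughly 200 ms and the second returns "admin" at once. Whether a fallback, an error, or something else is the right reaction to the timeout in the management call path is a separate design question that this sketch does not try to answer.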
was (Author: brian.stansberry):
Two thread dumps attached. First with ControllerBootThread blocking in awaitContainerStability, which eventually times out as expected and logs. The boot proceeds and gets to the second thread dump, where ControllerBootThread is trying to write the audit log.
Both logs show a bunch of MSC threads blocking, because now the security stuff will block forever. A blocking operation has been introduced into service start() calls in a way that is unknowable by the service author. This is why MSC doesn't reach stability.
[~dlofthouse] your JMX change may help with some of this by getting rid of the JMX authz stuff that seems to be the main thing blocking various MSC threads. But the blocking call path in OperationContext.getCaller() needs to be addressed. Perhaps during boot the check for the security identity can be dropped somehow, but I don't think that's enough. We can't have a call path in management op execution that can block forever like this. There's a whole set of work (see BlockingTimeout) that's all about avoiding that.
[JBoss JIRA] (WFCORE-2089) Infinite wildfly boot on StartException in service start
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2089?page=com.atlassian.jira.plugi... ]
Brian Stansberry reassigned WFCORE-2089:
----------------------------------------
Component/s: Security
Assignee: Darran Lofthouse
I'm assigning this back to Darran as the fundamental issue is the addition of a permanently blocking call into the management call path. I can help try and fix it or assign someone on my team but will need advice on what to do.
[JBoss JIRA] (WFCORE-2089) Infinite wildfly boot on StartException in service start
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2089?page=com.atlassian.jira.plugi... ]
Brian Stansberry edited comment on WFCORE-2089 at 12/7/16 2:07 PM:
-------------------------------------------------------------------
Two thread dumps attached. First with ControllerBootThread blocking in awaitContainerStability, which eventually times out as expected and logs. The boot proceeds and gets to the second thread dump, where ControllerBootThread is trying to write the audit log.
Both logs show a bunch of MSC threads blocking, because now the security stuff will block forever. A blocking operation has been introduced into service start() calls in a way that is unknowable by the service author. This is why MSC doesn't reach stability.
[~dlofthouse] your JMX change may help with some of this by getting rid of the JMX authz stuff that seems to be the main thing blocking various MSC threads. But the blocking call path in OperationContext.getCaller() needs to be addressed. Perhaps during boot the check for the security identity can be dropped somehow, but I don't think that's enough. We can't have a call path in management op execution that can block forever like this. There's a whole set of work (see BlockingTimeout) that's all about avoiding that.
was (Author: brian.stansberry):
Two thread dumps attached. First with ControllerBootThread blocking in awaitContainerStability, which eventually times out as expected and logs. The boot proceeds and gets to the second thread dump, where ControllerBootThread is trying to log.
Both logs show a bunch of MSC threads blocking, because now the security stuff will block forever. A blocking operation has been introduced into service start() calls in a way that is unknowable by the service author. This is why MSC doesn't reach stability.
[~dlofthouse] your JMX change may help with some of this by getting rid of the JMX authz stuff that seems to be the main thing blocking various MSC threads. But the blocking call path in OperationContext.getCaller() needs to be addressed. Perhaps during boot the check for the security identity can be dropped somehow, but I don't think that's enough. We can't have a call path in management op execution that can block forever like this. There's a whole set of work (see BlockingTimeout) that's all about avoiding that.
[JBoss JIRA] (WFCORE-2089) Infinite wildfly boot on StartException in service start
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2089?page=com.atlassian.jira.plugi... ]
Brian Stansberry commented on WFCORE-2089:
------------------------------------------
Two thread dumps attached. First with ControllerBootThread blocking in awaitContainerStability, which eventually times out as expected and logs. The boot proceeds and gets to the second thread dump, where ControllerBootThread is trying to log.
Both logs show a bunch of MSC threads blocking, because now the security stuff will block forever. A blocking operation has been introduced into service start() calls in a way that is unknowable by the service author. This is why MSC doesn't reach stability.
[~dlofthouse] your JMX change may help with some of this by getting rid of the JMX authz stuff that seems to be the main thing blocking various MSC threads. But the blocking call path in OperationContext.getCaller() needs to be addressed. Perhaps during boot the check for the security identity can be dropped somehow, but I don't think that's enough. We can't have a call path in management op execution that can block forever like this. There's a whole set of work (see BlockingTimeout) that's all about avoiding that.
> Infinite wildfly boot on StartException in service start
> --------------------------------------------------------
>
> Key: WFCORE-2089
> URL: https://issues.jboss.org/browse/WFCORE-2089
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Reporter: Jan Kalina
> Priority: Blocker
> Attachments: threads-2.txt, threads.txt
>
>
> The following exception (and probably similar ones) causes WildFly to freeze during startup. This is especially visible during integration tests (which never end), but it is reproducible directly too (see steps).
> {code:java}
> 15:59:37,252 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-4) MSC000001: Failed to start service org.wildfly.security.security-realm.ManagementRealm: org.jboss.msc.service.StartException in service org.wildfly.security.security-realm.ManagementRealm: WFLYELY00025: Referenced property file is invalid: ELY01006: No realm name found in password property file
> at org.wildfly.extension.elytron.PropertiesRealmDefinition$1$1.get(PropertiesRealmDefinition.java:185)
> at org.wildfly.extension.elytron.PropertiesRealmDefinition$1$1.get(PropertiesRealmDefinition.java:164)
> at org.wildfly.extension.elytron.TrivialService.start(TrivialService.java:53)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1963)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1896)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> *Update:* This problem with infinite boot will occur every time the start() method of some service throws StartException. Not an Elytron problem.
[JBoss JIRA] (WFLY-7746) Upgrade to a version of Infinispan that depends on JBoss Marshalling 2.0.0.Beta3
by David Lloyd (JIRA)
[ https://issues.jboss.org/browse/WFLY-7746?page=com.atlassian.jira.plugin.... ]
David Lloyd commented on WFLY-7746:
-----------------------------------
In the absolute worst-case scenario, we could possibly just have both.
> Upgrade to a version of Infinispan that depends on JBoss Marshalling 2.0.0.Beta3
> --------------------------------------------------------------------------------
>
> Key: WFLY-7746
> URL: https://issues.jboss.org/browse/WFLY-7746
> Project: WildFly
> Issue Type: Component Upgrade
> Components: Clustering
> Reporter: Farah Juma
> Assignee: Paul Ferraro
> Attachments: org.jboss.as.test.clustering.cluster.sso.ClusteredSingleSignOnTestCase-SYNC-tcp-output.txt
>
>
> As part of the wildfly-naming-client integration work, we need to upgrade to JBoss Marshalling 2.0.0.Beta3 (see WFCORE-2044 and WFLY-7675). This upgrade currently results in testsuite failures in {{org.jboss.as.test.clustering.cluster.sso.ClusteredSingleSignOnTestCase}} since Infinispan is still on JBoss Marshalling 1.4.x.
> ISPN-3391 was created a while back to upgrade Infinispan to JBoss Marshalling 2.0.0 but it seems this issue was waiting on a JBoss Marshalling release that adds back some classes that had previously been removed. JBoss Marshalling 2.0.0.Beta3 contains these classes so it should be possible now to upgrade Infinispan to this new version.
[JBoss JIRA] (WFLY-7746) Upgrade to a version of Infinispan that depends on JBoss Marshalling 2.0.0.Beta3
by Farah Juma (JIRA)
[ https://issues.jboss.org/browse/WFLY-7746?page=com.atlassian.jira.plugin.... ]
Farah Juma commented on WFLY-7746:
----------------------------------
Thanks, Paul.
[JBoss JIRA] (WFLY-7746) Upgrade to a version of Infinispan that depends on JBoss Marshalling 2.0.0.Beta3
by Paul Ferraro (JIRA)
[ https://issues.jboss.org/browse/WFLY-7746?page=com.atlassian.jira.plugin.... ]
Paul Ferraro commented on WFLY-7746:
------------------------------------
[~fjuma] Thanks for the details. Let me investigate the possibility of moving Infinispan 8.2.x to JBM2. If this isn't feasible, I'll need to escalate this to Jason and Tristan. I'll keep you posted.
[JBoss JIRA] (WFCORE-2089) Infinite wildfly boot on StartException in service start
by Brian Stansberry (JIRA)
[ https://issues.jboss.org/browse/WFCORE-2089?page=com.atlassian.jira.plugi... ]
Brian Stansberry updated WFCORE-2089:
-------------------------------------
Attachment: threads.txt
threads-2.txt