[jboss-jira] [JBoss JIRA] (WFCORE-4623) Intermittent failures in IdentityOperationsTestCase
Brian Stansberry (Jira)
issues at jboss.org
Sat Aug 31 08:11:00 EDT 2019
[ https://issues.jboss.org/browse/WFCORE-4623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777400#comment-13777400 ]
Brian Stansberry commented on WFCORE-4623:
------------------------------------------
This looks like a JVM bug:
{code}
MSC service thread 1-4
-- locked <0xa7769568> (a java.io.File)
-- running clinit of java.nio.file.FileSystems$DefaultFileSystemHolder
-- waiting to lock <0xa71ae1b8> (a java.lang.Runtime)
MSC service thread 1-8
-- locked <0xa718dae0> (a java.lang.Class for java.security.Security)
-- locked <0xa71ae1b8> (a java.lang.Runtime)
-- waiting for completion of clinit of java.nio.file.FileSystems$DefaultFileSystemHolder
{code}
The dump doesn't show 'locks' for this, but internally the VM allows only one thread to run a given class's clinit, so effectively thread 1-4 has 'locked java.nio.file.FileSystems$DefaultFileSystemHolder' while 1-8 is 'waiting to lock java.nio.file.FileSystems$DefaultFileSystemHolder'. So this is a typical deadlock, with two threads acquiring locks in a different order: 1-4 holds the clinit and wants the java.lang.Runtime monitor, while 1-8 holds the java.lang.Runtime monitor and wants the clinit.
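For illustration, the same lock ordering can be reproduced outside the JDK. This is only a minimal sketch, not the actual JDK code path: ClinitDeadlockDemo, Holder, RUNTIME_LOCK and the two latches are hypothetical names, with RUNTIME_LOCK standing in for the java.lang.Runtime monitor and Holder standing in for FileSystems$DefaultFileSystemHolder. The latches only force the interleaving seen in the dump; the threads are daemons so the JVM can still exit.

```java
import java.util.concurrent.CountDownLatch;

final class ClinitDeadlockDemo {
    // Hypothetical stand-in for the java.lang.Runtime monitor in the dump.
    static final Object RUNTIME_LOCK = new Object();
    // Latches that force the interleaving; not part of the real scenario.
    static final CountDownLatch lockHeld = new CountDownLatch(1);
    static final CountDownLatch clinitStarted = new CountDownLatch(1);

    // Hypothetical stand-in for FileSystems$DefaultFileSystemHolder.
    static final class Holder {
        static final Object VALUE;
        static {
            clinitStarted.countDown();      // initializer is now running
            synchronized (RUNTIME_LOCK) {   // blocks: the other thread owns it
                VALUE = new Object();
            }
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if both threads are still alive after the timeouts.
    // Only meaningful once per JVM, since Holder is initialized at most once.
    static boolean runDemo(long timeoutMillis) {
        Thread t14 = new Thread(() -> {
            await(lockHeld);                // wait until the monitor is taken
            Object unused = Holder.VALUE;   // starts Holder's clinit
        }, "demo thread 1-4");
        Thread t18 = new Thread(() -> {
            synchronized (RUNTIME_LOCK) {
                lockHeld.countDown();
                await(clinitStarted);       // wait until clinit is in progress
                Object unused = Holder.VALUE; // waits for clinit to complete
            }
        }, "demo thread 1-8");
        t14.setDaemon(true);
        t18.setDaemon(true);
        t14.start();
        t18.start();
        try {
            t14.join(timeoutMillis);
            t18.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t14.isAlive() && t18.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads stuck: " + runDemo(1000));
    }
}
```

The first thread owns the class initialization and waits for the monitor; the second owns the monitor and waits for the class initialization, so neither can make progress, which matches the shape of the dump above.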
> Intermittent failures in IdentityOperationsTestCase
> ---------------------------------------------------
>
> Key: WFCORE-4623
> URL: https://issues.jboss.org/browse/WFCORE-4623
> Project: WildFly Core
> Issue Type: Bug
> Components: Security
> Reporter: Brian Stansberry
> Assignee: Ashley Abdel-Sayed
> Priority: Major
> Attachments: WFCORE-4623_hang.txt
>
>
> IdentityOperationsTestCase fails intermittently, producing a set of 21 failures. When this happens the entire job seems to time out.
> https://ci.wildfly.org/project.html?projectId=WildFlyCore_PullRequest&buildTypeId=&tab=testDetails&testNameId=7747204239795063105&order=TEST_STATUS_DESC&branch_WildFlyCore_PullRequest=__all_branches__&itemsCount=50
> The problem seems to involve a server not being able to reach MSC stability during boot, followed by a lot of problems trying to roll back the boot op. The rollback problems (the stack trace in the snippet below) are mostly noise. The key thing is the failure to reach MSC stability.
> {code}
> 17:11:15,658 INFO (main) [org.wildfly.security] <Version.java:55> ELY00001: WildFly Elytron version 1.10.0.CR5
> 17:11:15,878 INFO (main) [org.jboss.msc] <ServiceContainerImpl.java:90> JBoss MSC version 1.4.8.Final
> 17:11:15,893 INFO (main) [org.jboss.threads] <Version.java:52> JBoss Threads version 2.3.3.Final
> 17:11:16,027 TRACE (main) [org.wildfly.security] <SecurityDomain.java:1056> Building security domain with defaultRealmName Empty.
> 17:11:16,037 TRACE (main) [org.wildfly.security] <SecurityDomain.java:708> Role mapping: principal [anonymous] -> decoded roles [] -> realm mapped roles [] -> domain mapped roles []
> 17:11:16,312 TRACE (MSC service thread 1-2) [org.wildfly.extension.elytron] <ProviderDefinitions.java:238> Loaded providers [WildFlyElytron version 1.0]
> 17:16:16,313 ERROR (Controller Boot Thread) [org.jboss.as.controller.management-operation] <OperationContextImpl.java:489> WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[("path" => "jboss.server.data.dir")]'
> 17:16:21,322 ERROR (Controller Boot Thread) [org.jboss.as.controller.management-operation] <AbstractOperationContext.java:1525> WFLYCTL0190: Step handler org.jboss.as.controller.AbstractAddStepHandler$1 at b0b65f for operation add at address [
> ("subsystem" => "elytron"),
> ("filesystem-realm" => "FileSystemRealm")
> ] failed handling operation rollback -- java.util.concurrent.TimeoutException: java.util.concurrent.TimeoutException
> at org.jboss.as.controller.OperationContextImpl.waitForRemovals(OperationContextImpl.java:523)
> at org.jboss.as.controller.AbstractOperationContext$Step.handleResult(AbstractOperationContext.java:1518)
> at org.jboss.as.controller.AbstractOperationContext$Step.finalizeInternal(AbstractOperationContext.java:1472)
> at org.jboss.as.controller.AbstractOperationContext$Step.finalizeStep(AbstractOperationContext.java:1455)
> at org.jboss.as.controller.AbstractOperationContext$Step.access$400(AbstractOperationContext.java:1319)
> at org.jboss.as.controller.AbstractOperationContext.executeResultHandlerPhase(AbstractOperationContext.java:876)
> at org.jboss.as.controller.AbstractOperationContext.processStages(AbstractOperationContext.java:726)
> at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:467)
> at org.jboss.as.controller.OperationContextImpl.executeOperation(OperationContextImpl.java:1413)
> at org.jboss.as.controller.ModelControllerImpl.boot(ModelControllerImpl.java:495)
> {code}