[jboss-jira] [JBoss JIRA] (WFLY-9531) Deadlock in model controller encountered in basic test suite
Brian Stansberry (JIRA)
issues at jboss.org
Tue Feb 20 17:42:00 EST 2018
[ https://issues.jboss.org/browse/WFLY-9531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487195#comment-13487195 ]
Brian Stansberry edited comment on WFLY-9531 at 2/20/18 5:41 PM:
-----------------------------------------------------------------
The problem:
Two different services in the same deployment manipulate the deployment=* MRR (ManagementResourceRegistration) tree in order to register custom resource types that expose service-specific mgmt information (e.g. stats that a particular resource adapter provides).
As part of undeploy, in Service.stop() each is trying to delete the items in the tree. Both are trying to delete the same general-purpose nodes; they aren't focused on the service-specific nodes they added in start.
*Thread MSC 1-1*
Trying to remove the MRR for /deployment=*/subsystem=resource-adapters
Has the write lock on /deployment=*
Wants the read lock on child subsystem=resource-adapters.
*Thread MSC 1-2*
Trying to remove the MRR for /deployment=*/subsystem=resource-adapters/ironjacamar=ironjacamar
Has the write lock on /deployment=*/subsystem=resource-adapters
Wants the read lock on /deployment=*.
*Result*: deadlock.
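The lock-order inversion described above can be reproduced in a toy model. This is only a sketch under stated assumptions: the two plain ReentrantLocks stand in for the per-node locks of /deployment=* and /deployment=*/subsystem=resource-adapters, the class and thread names are hypothetical, and tryLock with a timeout replaces lock() so that the demo terminates instead of hanging. It is not the WildFly code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: hypothetical names, not the WildFly registry classes.
final class LockOrderInversionSketch {

    /** Returns how many of the two threads could not obtain their second lock. */
    static int run() {
        ReentrantLock deploymentLock = new ReentrantLock(); // stands in for /deployment=*
        ReentrantLock subsystemLock = new ReentrantLock();  // stands in for .../subsystem=resource-adapters
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // "toy MSC 1-1" locks parent then child; "toy MSC 1-2" locks child then parent.
        Thread t1 = new Thread(() -> acquire(deploymentLock, subsystemLock,
                bothHoldFirst, bothAttempted, blocked), "toy MSC 1-1");
        Thread t2 = new Thread(() -> acquire(subsystemLock, deploymentLock,
                bothHoldFirst, bothAttempted, blocked), "toy MSC 1-2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked.get();
    }

    private static void acquire(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothAttempted,
                                AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until the other thread also holds its first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With lock() this call would never return; the timeout makes the cycle observable.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep holding the first lock until both threads have finished their attempt.
            bothAttempted.countDown();
            bothAttempted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads blocked on their second lock: " + run());
    }
}
```

In this model both threads time out, because each one holds the lock the other needs. Acquiring the locks in a single agreed order (parent before child), or never holding one node's lock while asking for another's, removes the cycle.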
The "DeploymentScanner-threads - 2" in the description doesn't seem relevant; it's just being impeded by the deadlock.
There's a basic problem in MRR that needs to be fixed, so I cloned this to WFCORE-3410. But there are things the resource-adapters subsystem is doing wrong that contribute to this, and fixing those will work around the problem. Hence I left WFLY-9531 open and cloned it into WFCORE instead of just moving WFLY-9531.
The simplest workaround is dealing with the fact that IronJacamarActivationResourceService.stop first removes {code}/deployment=*/subsystem=resource-adapters/ironjacamar=ironjacamar{code} and then removes {code}/deployment=*/subsystem=resource-adapters{code}. The two MSC threads described above are at different points in that logic. But the first step is not necessary because the second step would remove the child anyway. Just dropping the /deployment=*/subsystem=resource-adapters/ironjacamar=ironjacamar:remove will prevent the deadlock.
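A toy illustration of why the first removal is redundant, assuming (as stated above) that removing a registration drops its whole subtree. ToyRegistration and all of its methods are hypothetical stand-ins written for this sketch; they are not the WildFly registration API and they do no locking at all.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a node in the registration tree.
final class ToyRegistration {
    private final Map<String, ToyRegistration> children = new ConcurrentHashMap<>();

    /** Registers (or returns the existing) child under the given path element. */
    ToyRegistration registerSubModel(String element) {
        return children.computeIfAbsent(element, k -> new ToyRegistration());
    }

    /** Drops the child and, with it, everything registered beneath that child. */
    void unregisterSubModel(String element) {
        children.remove(element);
    }

    /** Returns true if every element of the path resolves to a registered node. */
    boolean hasChild(String... path) {
        ToyRegistration node = this;
        for (String element : path) {
            node = node.children.get(element);
            if (node == null) {
                return false;
            }
        }
        return true;
    }

    /** Removes only the parent and checks that the grandchild is gone as well. */
    static boolean parentRemovalDropsChild() {
        ToyRegistration deployment = new ToyRegistration();
        deployment.registerSubModel("subsystem=resource-adapters")
                  .registerSubModel("ironjacamar=ironjacamar");
        boolean before = deployment.hasChild("subsystem=resource-adapters", "ironjacamar=ironjacamar");

        // Single removal: no separate step for ironjacamar=ironjacamar.
        deployment.unregisterSubModel("subsystem=resource-adapters");
        boolean after = deployment.hasChild("subsystem=resource-adapters", "ironjacamar=ironjacamar");
        return before && !after;
    }
}
```

With only one removal per stop(), a thread in this model never needs the child's lock first and the parent's lock second, which is the ordering that collides with the other thread.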
> Deadlock in model controller encountered in basic test suite
> ------------------------------------------------------------
>
> Key: WFLY-9531
> URL: https://issues.jboss.org/browse/WFLY-9531
> Project: WildFly
> Issue Type: Bug
> Components: Domain Management, JCA, Test Suite
> Reporter: David Lloyd
> Assignee: Brian Stansberry
> Priority: Critical
> Attachments: stack.txt
>
>
> A Java-level deadlock was encountered while running {{org.jboss.as.test.integration.jca.workmanager.LongRunningThreadsCheckTestCase}}. The thread dump is attached.
> This occurred during testing of the new thread pool (WFLY-5332 and related), but it seems unlikely to be caused by this change, unless the new pool's scheduling behavior differed enough from the default that it exposed an already-existing race condition. The deadlock was hit just once in more than 10 runs.
> This is the deadlock:
> {noformat}
> Found one Java-level deadlock:
> =============================
> "DeploymentScanner-threads - 2":
> waiting for ownable synchronizer 0x000000075357cef0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
> which is held by "MSC service thread 1-1"
> "MSC service thread 1-1":
> waiting for ownable synchronizer 0x000000071dff9b08, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
> which is held by "MSC service thread 1-2"
> "MSC service thread 1-2":
> waiting for ownable synchronizer 0x000000075357cef0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
> which is held by "MSC service thread 1-1"
> {noformat}
> Here is the specific deadlock info (also in the attachment), showing that it has something to do with the model controller:
> {noformat}
> Java stack information for the threads listed above:
> ===================================================
> "DeploymentScanner-threads - 2":
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x000000075357cef0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getOperationEntry(ConcreteResourceRegistration.java:304)
> at org.jboss.as.controller.registry.NodeSubregistry.getOperationEntry(NodeSubregistry.java:186)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getOperationEntry(ConcreteResourceRegistration.java:300)
> at org.jboss.as.controller.registry.NodeSubregistry.getOperationEntry(NodeSubregistry.java:186)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getOperationEntry(ConcreteResourceRegistration.java:300)
> at org.jboss.as.controller.registry.AbstractResourceRegistration.getOperationEntry(AbstractResourceRegistration.java:180)
> at org.jboss.as.controller.registry.AbstractResourceRegistration.getOperationEntry(AbstractResourceRegistration.java:175)
> at org.jboss.as.controller.OperationContextImpl.getAuthorizationAction(OperationContextImpl.java:1890)
> at org.jboss.as.controller.OperationContextImpl.getBasicAuthorizationResponse(OperationContextImpl.java:1832)
> at org.jboss.as.controller.OperationContextImpl.authorize(OperationContextImpl.java:1756)
> at org.jboss.as.controller.OperationContextImpl.authorize(OperationContextImpl.java:1306)
> at org.jboss.as.controller.operations.global.ReadResourceHandler.doExecuteInternal(ReadResourceHandler.java:316)
> at org.jboss.as.controller.operations.global.ReadResourceHandler.doExecute(ReadResourceHandler.java:171)
> at org.jboss.as.controller.operations.global.GlobalOperationHandlers$AbstractMultiTargetHandler.execute(GlobalOperationHandlers.java:231)
> at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:982)
> at org.jboss.as.controller.AbstractOperationContext.processStages(AbstractOperationContext.java:726)
> at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:450)
> at org.jboss.as.controller.OperationContextImpl.executeOperation(OperationContextImpl.java:1402)
> at org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:418)
> at org.jboss.as.controller.ModelControllerImpl.lambda$execute$1(ModelControllerImpl.java:243)
> at org.jboss.as.controller.ModelControllerImpl$$Lambda$656/1109010279.run(Unknown Source)
> at org.wildfly.security.auth.server.SecurityIdentity$$Lambda$657/2059677950.run(Unknown Source)
> at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:263)
> at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:229)
> at org.jboss.as.controller.ModelControllerImpl.execute(ModelControllerImpl.java:243)
> at org.jboss.as.controller.ModelControllerClientFactoryImpl$LocalClient.executeOperation(ModelControllerClientFactoryImpl.java:131)
> at org.jboss.as.controller.ModelControllerClientFactoryImpl$1$$Lambda$654/385827253.apply(Unknown Source)
> at org.jboss.as.controller.ModelControllerClientFactoryImpl.lambda$executeInVm$0(ModelControllerClientFactoryImpl.java:296)
> at org.jboss.as.controller.ModelControllerClientFactoryImpl$$Lambda$655/955046503.run(Unknown Source)
> at org.jboss.as.controller.access.InVmAccess.runInVm(InVmAccess.java:85)
> at org.jboss.as.controller.ModelControllerClientFactoryImpl.executeInVm(ModelControllerClientFactoryImpl.java:296)
> at org.jboss.as.controller.ModelControllerClientFactoryImpl.access$000(ModelControllerClientFactoryImpl.java:54)
> at org.jboss.as.controller.ModelControllerClientFactoryImpl$1.executeOperation(ModelControllerClientFactoryImpl.java:77)
> at org.jboss.as.controller.LocalModelControllerClient.execute(LocalModelControllerClient.java:54)
> at org.jboss.as.controller.LocalModelControllerClient.execute(LocalModelControllerClient.java:39)
> at org.jboss.as.server.deployment.scanner.DefaultDeploymentOperations$$Lambda$652/737892850.apply(Unknown Source)
> at org.jboss.as.server.deployment.scanner.DefaultDeploymentOperations$Execution$1.execute(DefaultDeploymentOperations.java:135)
> at org.jboss.as.server.deployment.scanner.DefaultDeploymentOperations.getDeploymentsStatus(DefaultDeploymentOperations.java:80)
> at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService$ScanContext.<init>(FileSystemDeploymentService.java:1687)
> at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService$ScanContext.<init>(FileSystemDeploymentService.java:1636)
> at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService.scan(FileSystemDeploymentService.java:589)
> at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService.scan(FileSystemDeploymentService.java:493)
> at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService$DeploymentScanRunnable.run(FileSystemDeploymentService.java:255)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> at org.jboss.threads.JBossThread.run(JBossThread.java:484)
> "MSC service thread 1-1":
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x000000071dff9b08> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getCapabilities(ConcreteResourceRegistration.java:607)
> at org.jboss.as.controller.registry.NodeSubregistry.unregisterSubModel(NodeSubregistry.java:172)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.unregisterSubModel(ConcreteResourceRegistration.java:273)
> at org.jboss.as.connector.services.resourceadapters.IronJacamarActivationResourceService.stop(IronJacamarActivationResourceService.java:283)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1761)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.execute(ServiceControllerImpl.java:1734)
> at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1521)
> at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1964)
> at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1467)
> at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1346)
> at java.lang.Thread.run(Thread.java:748)
> "MSC service thread 1-2":
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x000000075357cef0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getSubregistry(ConcreteResourceRegistration.java:571)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getChildAddresses(ConcreteResourceRegistration.java:829)
> at org.jboss.as.controller.registry.NodeSubregistry.getChildAddresses(NodeSubregistry.java:349)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getChildAddresses(ConcreteResourceRegistration.java:833)
> at org.jboss.as.controller.registry.NodeSubregistry.getChildAddresses(NodeSubregistry.java:349)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.getChildAddresses(ConcreteResourceRegistration.java:833)
> at org.jboss.as.controller.registry.AbstractResourceRegistration.getChildAddresses(AbstractResourceRegistration.java:326)
> at org.jboss.as.controller.registry.AbstractResourceRegistration.getChildAddresses(AbstractResourceRegistration.java:323)
> at org.jboss.as.controller.registry.ConcreteResourceRegistration.unregisterSubModel(ConcreteResourceRegistration.java:264)
> at org.jboss.as.connector.services.resourceadapters.IronJacamarActivationResourceService.stop(IronJacamarActivationResourceService.java:281)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1761)
> at org.jboss.msc.service.ServiceControllerImpl$StopTask.execute(ServiceControllerImpl.java:1734)
> at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1521)
> at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1964)
> at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1467)
> at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1346)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
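The "Found one Java-level deadlock" report quoted above comes from a thread dump. The same condition can also be queried from inside the JVM with the standard java.lang.management API. The following is a minimal sketch, assuming a JVM that supports ownable-synchronizer monitoring (HotSpot does); the class name, thread names and the two throwaway locks are hypothetical and exist only to give the probe something to find.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; only ThreadMXBean is a real JDK API here.
final class DeadlockProbeSketch {

    /** Deadlocks two throwaway threads and returns how many threads the JVM reports as deadlocked. */
    static int countDeadlockedThreads() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = start("toy-1", a, b, bothHoldFirst);
        Thread t2 = start("toy-2", b, a, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        try {
            // Poll for up to about five seconds: the threads need a moment to park on their second lock.
            for (int i = 0; i < 100 && ids == null; i++) {
                Thread.sleep(50);
                ids = mx.findDeadlockedThreads(); // null while no deadlock is detected
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // Break the deadlock so the demo threads can exit.
            t1.interrupt();
            t2.interrupt();
        }
        return ids == null ? 0 : ids.length;
    }

    private static Thread start(String name, ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            try {
                first.lockInterruptibly();
                try {
                    bothHoldFirst.countDown();
                    bothHoldFirst.await();
                    second.lockInterruptibly(); // parks here until interrupted
                    second.unlock();
                } finally {
                    first.unlock();
                }
            } catch (InterruptedException e) {
                // Expected: the probe interrupts the thread to end the demo.
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads reported: " + countDeadlockedThreads());
    }
}
```

A check like this could run in a test suite's teardown to fail fast with a clear message rather than waiting for a timeout, although that use is a suggestion and not something the issue itself proposes.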
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)