[
https://issues.jboss.org/browse/WFLY-9531?page=com.atlassian.jira.plugin....
]
Brian Stansberry edited comment on WFLY-9531 at 2/20/18 5:41 PM:
-----------------------------------------------------------------
The problem:
Two different services in the same deployment manipulate the deployment=* MRR (ManagementResourceRegistration) tree in
order to register custom resource types that expose service-specific mgmt information
(e.g. stats that a particular resource adapter provides).
As part of undeploy, in Service.stop() each is trying to delete the items in the tree. Both
are trying to delete the same general-purpose nodes; they aren't focused on the
service-specific nodes they added in start.
*Thread MSC 1-1*
Trying to remove the MRR for /deployment=*/subsystem=resource-adapters
Has the write lock on /deployment=*, wants the read lock on child
subsystem=resource-adapters
*Thread MSC 1-2*
Trying to remove the MRR for
/deployment=*/subsystem=resource-adapters/ironjacamar=ironjacamar
Has the write lock on /deployment=*/subsystem=resource-adapters
Wants the read lock on /deployment=*.
*Result*: deadlock.
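The lock-order inversion described above can be reproduced in isolation. The following is only a minimal sketch, not the actual MRR code: the class and method names are hypothetical, and a plain ReentrantLock (the lock type that appears in the thread dump) stands in for each registry node's lock. It uses tryLock with a timeout instead of lock() so the demo reports the inversion rather than hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; it does not use any WildFly API.
class LockOrderInversionDemo {

    /** True when neither thread could take its second lock, i.e. plain lock() calls would hang. */
    static boolean wouldDeadlock() {
        ReentrantLock parent = new ReentrantLock(); // stands in for /deployment=*
        ReentrantLock child = new ReentrantLock();  // stands in for subsystem=resource-adapters
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        // One thread locks parent then child; the other locks child then parent.
        Thread t1 = new Thread(
                () -> gotSecond[0] = tryInOrder(parent, child, bothHoldFirst, bothTried), "MSC 1-1");
        Thread t2 = new Thread(
                () -> gotSecond[1] = tryInOrder(child, parent, bothHoldFirst, bothTried), "MSC 1-2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !gotSecond[0] && !gotSecond[1];
    }

    /** Holds the first lock, then tries (with a timeout) to take the second one. */
    static boolean tryInOrder(ReentrantLock first, ReentrantLock second,
                              CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await(); // wait until the other thread also holds its first lock
            boolean acquired = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await(); // keep the first lock until both attempts have finished
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("would deadlock with lock(): " + wouldDeadlock());
    }
}
```

Acquiring the two locks in the same order on both threads (parent before child) would let the second tryLock succeed on one of them.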
The "DeploymentScanner-threads - 2" in the description doesn't seem
relevant; it's just being impeded by the deadlock.
There's a basic problem in MRR that needs to be fixed, so I cloned this to
WFCORE-3410. But there are things the resource-adapters subsystem is doing wrong that
contribute to this, and fixing those will work around the problem. Hence I left WFLY-9531
open and cloned it into WFCORE instead of just moving it.
The simplest workaround is dealing with the fact that
IronJacamarActivationResourceService.stop first removes
{code}/deployment=*/subsystem=resource-adapters/ironjacamar=ironjacamar{code} and then removes
{code}/deployment=*/subsystem=resource-adapters{code}. The two MSC threads described above
are at different points in that logic. But the first step is not necessary because the second
step would remove the child anyway. Just dropping the
/deployment=*/subsystem=resource-adapters/ironjacamar=ironjacamar:remove will prevent the
deadlock.
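To illustrate why the first removal is redundant, here is a minimal sketch using a hypothetical in-memory tree (the Node class and method names are assumptions for illustration, not the ManagementResourceRegistration API). Removing the subsystem=resource-adapters node drops its ironjacamar=ironjacamar child with it, so stop() only needs the single parent-level removal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a registration tree; not WildFly code.
class SubtreeRemovalSketch {

    static final class Node {
        final Map<String, Node> children = new LinkedHashMap<>();

        Node addChild(String key) {
            return children.computeIfAbsent(key, k -> new Node());
        }

        /** Removes the child and, implicitly, everything registered beneath it. */
        boolean removeChild(String key) {
            return children.remove(key) != null;
        }

        /** True if the given path of child keys exists below this node. */
        boolean has(String... path) {
            Node current = this;
            for (String key : path) {
                current = current.children.get(key);
                if (current == null) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Builds deployment=* with subsystem=resource-adapters/ironjacamar=ironjacamar below it. */
    static Node buildDeployment() {
        Node deployment = new Node();
        deployment.addChild("subsystem=resource-adapters").addChild("ironjacamar=ironjacamar");
        return deployment;
    }

    /** The simplified stop(): one removal at the parent level, no separate child removal. */
    static void stopRemovingParentOnly(Node deployment) {
        deployment.removeChild("subsystem=resource-adapters");
    }

    public static void main(String[] args) {
        Node deployment = buildDeployment();
        stopRemovingParentOnly(deployment);
        System.out.println("child still registered: "
                + deployment.has("subsystem=resource-adapters", "ironjacamar=ironjacamar"));
    }
}
```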
Deadlock in model controller encountered in basic test suite
------------------------------------------------------------
Key: WFLY-9531
URL: https://issues.jboss.org/browse/WFLY-9531
Project: WildFly
Issue Type: Bug
Components: Domain Management, JCA, Test Suite
Reporter: David Lloyd
Assignee: Brian Stansberry
Priority: Critical
Attachments: stack.txt
A Java-level deadlock was encountered while running
{{org.jboss.as.test.integration.jca.workmanager.LongRunningThreadsCheckTestCase}}. The
thread dump is attached.
This occurred during testing of the new thread pool (WFLY-5332 and related), but it seems
unlikely to be caused by this change, unless the new pool's scheduling behavior
differed enough from the default that it exposed an already-existing race condition. The
deadlock was hit just once in more than 10 runs.
This is the deadlock:
{noformat}
Found one Java-level deadlock:
=============================
"DeploymentScanner-threads - 2":
  waiting for ownable synchronizer 0x000000075357cef0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "MSC service thread 1-1"
"MSC service thread 1-1":
  waiting for ownable synchronizer 0x000000071dff9b08, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "MSC service thread 1-2"
"MSC service thread 1-2":
  waiting for ownable synchronizer 0x000000075357cef0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "MSC service thread 1-1"
{noformat}
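For reference, the same kind of report can be obtained from inside a running JVM with the standard java.lang.management API. This is only a sketch; the DeadlockProbe class and describeDeadlocks method are hypothetical names, not something the test suite uses. ThreadMXBean.findDeadlockedThreads() covers both object monitors and ownable synchronizers such as ReentrantLock, and returns null when no deadlock is found.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; uses only the JDK's java.lang.management API.
class DeadlockProbe {

    /** Returns one line per deadlocked thread, or an empty string if none are found. */
    static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when there is no deadlock
        if (ids == null) {
            return "";
        }
        StringBuilder report = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread terminated between the two calls
            }
            report.append('"').append(info.getThreadName()).append('"')
                  .append(" waiting for ").append(info.getLockName())
                  .append(" held by \"").append(info.getLockOwnerName()).append("\"\n");
        }
        return report.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "no deadlock detected" : report);
    }
}
```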
Here is the specific deadlock info (also in the attachment), showing that it has
something to do with the model controller:
{noformat}
Java stack information for the threads listed above:
===================================================
"DeploymentScanner-threads - 2":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000075357cef0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getOperationEntry(ConcreteResourceRegistration.java:304)
    at org.jboss.as.controller.registry.NodeSubregistry.getOperationEntry(NodeSubregistry.java:186)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getOperationEntry(ConcreteResourceRegistration.java:300)
    at org.jboss.as.controller.registry.NodeSubregistry.getOperationEntry(NodeSubregistry.java:186)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getOperationEntry(ConcreteResourceRegistration.java:300)
    at org.jboss.as.controller.registry.AbstractResourceRegistration.getOperationEntry(AbstractResourceRegistration.java:180)
    at org.jboss.as.controller.registry.AbstractResourceRegistration.getOperationEntry(AbstractResourceRegistration.java:175)
    at org.jboss.as.controller.OperationContextImpl.getAuthorizationAction(OperationContextImpl.java:1890)
    at org.jboss.as.controller.OperationContextImpl.getBasicAuthorizationResponse(OperationContextImpl.java:1832)
    at org.jboss.as.controller.OperationContextImpl.authorize(OperationContextImpl.java:1756)
    at org.jboss.as.controller.OperationContextImpl.authorize(OperationContextImpl.java:1306)
    at org.jboss.as.controller.operations.global.ReadResourceHandler.doExecuteInternal(ReadResourceHandler.java:316)
    at org.jboss.as.controller.operations.global.ReadResourceHandler.doExecute(ReadResourceHandler.java:171)
    at org.jboss.as.controller.operations.global.GlobalOperationHandlers$AbstractMultiTargetHandler.execute(GlobalOperationHandlers.java:231)
    at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:982)
    at org.jboss.as.controller.AbstractOperationContext.processStages(AbstractOperationContext.java:726)
    at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:450)
    at org.jboss.as.controller.OperationContextImpl.executeOperation(OperationContextImpl.java:1402)
    at org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:418)
    at org.jboss.as.controller.ModelControllerImpl.lambda$execute$1(ModelControllerImpl.java:243)
    at org.jboss.as.controller.ModelControllerImpl$$Lambda$656/1109010279.run(Unknown Source)
    at org.wildfly.security.auth.server.SecurityIdentity$$Lambda$657/2059677950.run(Unknown Source)
    at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:263)
    at org.wildfly.security.auth.server.SecurityIdentity.runAs(SecurityIdentity.java:229)
    at org.jboss.as.controller.ModelControllerImpl.execute(ModelControllerImpl.java:243)
    at org.jboss.as.controller.ModelControllerClientFactoryImpl$LocalClient.executeOperation(ModelControllerClientFactoryImpl.java:131)
    at org.jboss.as.controller.ModelControllerClientFactoryImpl$1$$Lambda$654/385827253.apply(Unknown Source)
    at org.jboss.as.controller.ModelControllerClientFactoryImpl.lambda$executeInVm$0(ModelControllerClientFactoryImpl.java:296)
    at org.jboss.as.controller.ModelControllerClientFactoryImpl$$Lambda$655/955046503.run(Unknown Source)
    at org.jboss.as.controller.access.InVmAccess.runInVm(InVmAccess.java:85)
    at org.jboss.as.controller.ModelControllerClientFactoryImpl.executeInVm(ModelControllerClientFactoryImpl.java:296)
    at org.jboss.as.controller.ModelControllerClientFactoryImpl.access$000(ModelControllerClientFactoryImpl.java:54)
    at org.jboss.as.controller.ModelControllerClientFactoryImpl$1.executeOperation(ModelControllerClientFactoryImpl.java:77)
    at org.jboss.as.controller.LocalModelControllerClient.execute(LocalModelControllerClient.java:54)
    at org.jboss.as.controller.LocalModelControllerClient.execute(LocalModelControllerClient.java:39)
    at org.jboss.as.server.deployment.scanner.DefaultDeploymentOperations$$Lambda$652/737892850.apply(Unknown Source)
    at org.jboss.as.server.deployment.scanner.DefaultDeploymentOperations$Execution$1.execute(DefaultDeploymentOperations.java:135)
    at org.jboss.as.server.deployment.scanner.DefaultDeploymentOperations.getDeploymentsStatus(DefaultDeploymentOperations.java:80)
    at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService$ScanContext.<init>(FileSystemDeploymentService.java:1687)
    at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService$ScanContext.<init>(FileSystemDeploymentService.java:1636)
    at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService.scan(FileSystemDeploymentService.java:589)
    at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService.scan(FileSystemDeploymentService.java:493)
    at org.jboss.as.server.deployment.scanner.FileSystemDeploymentService$DeploymentScanRunnable.run(FileSystemDeploymentService.java:255)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    at org.jboss.threads.JBossThread.run(JBossThread.java:484)
"MSC service thread 1-1":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000071dff9b08> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getCapabilities(ConcreteResourceRegistration.java:607)
    at org.jboss.as.controller.registry.NodeSubregistry.unregisterSubModel(NodeSubregistry.java:172)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.unregisterSubModel(ConcreteResourceRegistration.java:273)
    at org.jboss.as.connector.services.resourceadapters.IronJacamarActivationResourceService.stop(IronJacamarActivationResourceService.java:283)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1761)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.execute(ServiceControllerImpl.java:1734)
    at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1521)
    at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1964)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1467)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1346)
    at java.lang.Thread.run(Thread.java:748)
"MSC service thread 1-2":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000075357cef0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getSubregistry(ConcreteResourceRegistration.java:571)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getChildAddresses(ConcreteResourceRegistration.java:829)
    at org.jboss.as.controller.registry.NodeSubregistry.getChildAddresses(NodeSubregistry.java:349)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getChildAddresses(ConcreteResourceRegistration.java:833)
    at org.jboss.as.controller.registry.NodeSubregistry.getChildAddresses(NodeSubregistry.java:349)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.getChildAddresses(ConcreteResourceRegistration.java:833)
    at org.jboss.as.controller.registry.AbstractResourceRegistration.getChildAddresses(AbstractResourceRegistration.java:326)
    at org.jboss.as.controller.registry.AbstractResourceRegistration.getChildAddresses(AbstractResourceRegistration.java:323)
    at org.jboss.as.controller.registry.ConcreteResourceRegistration.unregisterSubModel(ConcreteResourceRegistration.java:264)
    at org.jboss.as.connector.services.resourceadapters.IronJacamarActivationResourceService.stop(IronJacamarActivationResourceService.java:281)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:1761)
    at org.jboss.msc.service.ServiceControllerImpl$StopTask.execute(ServiceControllerImpl.java:1734)
    at org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1521)
    at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1964)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1467)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1346)
    at java.lang.Thread.run(Thread.java:748)
{noformat}