[jboss-jira] [JBoss JIRA] (JBMESSAGING-1941) Messaging deadlocks on AspectManager
Aaron Ogburn (JIRA)
jira-events at lists.jboss.org
Tue Mar 5 12:49:57 EST 2013
[ https://issues.jboss.org/browse/JBMESSAGING-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758633#comment-12758633 ]
Aaron Ogburn edited comment on JBMESSAGING-1941 at 3/5/13 12:49 PM:
--------------------------------------------------------------------
Saw another occurrence of this. It's a bit different but fundamentally the same:
{noformat}
"WorkManager(2)-2" daemon prio=10 tid=0x000000004079b000 nid=0x1f99 waiting for monitor entry [0x00007f2d94a00000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.jboss.aop.AspectManager.getAdvisor(AspectManager.java:728)
- waiting to lock <0x000000077039a500> (a org.jboss.aop.AspectManager)
at org.jboss.jms.client.delegate.ClientConnectionDelegate.<clinit>(ClientConnectionDelegate.java) <-- native classloader lock grabbed
at org.jboss.jms.delegate.CreateConnectionResult.read(CreateConnectionResult.java:106)
at org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateResponse.read(ConnectionFactoryCreateConnectionDelegateResponse.java:71)
at org.jboss.jms.wireformat.JMSWireFormat.read(JMSWireFormat.java:298)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedRead(MicroSocketClientInvoker.java:1301)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:907)
at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:461)
at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:167)
at org.jboss.remoting.Client.invoke(Client.java:2034)
at org.jboss.remoting.Client.invoke(Client.java:877)
"WorkManager(2)-3" daemon prio=10 tid=0x00007f2d9811d000 nid=0x2003 in Object.wait() [0x00007f2d94125000]
java.lang.Thread.State: RUNNABLE
at org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegateInternal(ServerConnectionFactoryEndpoint.java:296) <-- waiting for that native cl lock
- locked <0x000000077039a500> (a org.jboss.aop.AspectManager)
at org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegate(ServerConnectionFactoryEndpoint.java:170)
at org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.org$jboss$jms$server$endpoint$advised$ConnectionFactoryAdvised$createConnectionDelegate$aop(ConnectionFactoryAdvised.java:108)
at org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.createConnectionDelegate(ConnectionFactoryAdvised.java)
at org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateRequest.serverInvoke(ConnectionFactoryCreateConnectionDelegateRequest.java:91)
at org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:157)
{noformat}
So we're deadlocked still on the same locks. WorkManager(2)-2 is grabbing the native cl lock then the AspectManager, while WorkManager(2)-3 grabs the AspectManager then the native cl lock. I think the problem is the AspectManager synchronization that was added in JBMESSAGING-797 as a JBM level workaround for JBAOP-360. I think this was fine until synchronization was added in AOP to fix JBAOP-360 as well, and that AOP level fix is in EAP 5 now to conflict with that prior messaging fix. JBAOP-360 resulted in AspectManager.getAdvisor being made a synchronized method and would address the issue without JBMESSAGING-797. JBAOP-360 and JBMESSAGING-797 together though causes the issue as it introduces potential inverse use of the AspectManager and native cl lock.
The synchronization in AspectManager has been changed even further beyond JBAOP-360 (by JBAOP-501 & JBAOP-653 for instance), but it seems pretty clear the synchronization we have in place from JBMESSAGING-797 is a relic at this point not necessarily needed any longer with the current AOP branch used on EAP 5. Can we remove the JBMESSAGING-797 changes to address this issue? If any AOP deadlock were to crop back up similar to JBMESSAGING-797, then I think it should be fixed in AOP.
was (Author: aogburn):
Saw another occurrence of this. It's a bit different but fundamentally the same:
{noformat}
"WorkManager(2)-2" daemon prio=10 tid=0x000000004079b000 nid=0x1f99 waiting for monitor entry [0x00007f2d94a00000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.jboss.aop.AspectManager.getAdvisor(AspectManager.java:728)
- waiting to lock <0x000000077039a500> (a org.jboss.aop.AspectManager)
at org.jboss.jms.client.delegate.ClientConnectionDelegate.<clinit>(ClientConnectionDelegate.java) <-- native classloader lock grabbed
at org.jboss.jms.delegate.CreateConnectionResult.read(CreateConnectionResult.java:106)
at org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateResponse.read(ConnectionFactoryCreateConnectionDelegateResponse.java:71)
at org.jboss.jms.wireformat.JMSWireFormat.read(JMSWireFormat.java:298)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedRead(MicroSocketClientInvoker.java:1301)
at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:907)
at org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:461)
at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:167)
at org.jboss.remoting.Client.invoke(Client.java:2034)
at org.jboss.remoting.Client.invoke(Client.java:877)
"WorkManager(2)-3" daemon prio=10 tid=0x00007f2d9811d000 nid=0x2003 in Object.wait() [0x00007f2d94125000]
java.lang.Thread.State: RUNNABLE
at org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegateInternal(ServerConnectionFactoryEndpoint.java:296) <-- waiting for that native cl lock
- locked <0x000000077039a500> (a org.jboss.aop.AspectManager)
at org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegate(ServerConnectionFactoryEndpoint.java:170)
at org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.org$jboss$jms$server$endpoint$advised$ConnectionFactoryAdvised$createConnectionDelegate$aop(ConnectionFactoryAdvised.java:108)
at org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.createConnectionDelegate(ConnectionFactoryAdvised.java)
at org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateRequest.serverInvoke(ConnectionFactoryCreateConnectionDelegateRequest.java:91)
at org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:157)
{noformat}
So we're deadlocked still on the same locks. WorkManager(2)-2 is grabbing the native cl lock then the AspectManager, while WorkManager(2)-3 grabs the AspectManager then the native cl lock. I think the problem is the AspectManager synchronization that was added in JBMESSAGING-797 as a JBM level workaround for JBAOP-360. I think this was fine until synchronization was added in AOP to fix JBAOP-360 as well, and that AOP level fix is in EAP 5 now to conflict with that prior messaging fix. JBAOP-360 resulted in AspectManager.getAdvisor being made a synchronized method and should address the issue without JBMESSAGING-797. JBAOP-360 and JBMESSAGING-797 together though causes the issue as it introduces potential inverse use of the AspectManager and native cl lock. Can we remove the JBMESSAGING-797 changes to address this issue?
> Messaging deadlocks on AspectManager
> ------------------------------------
>
> Key: JBMESSAGING-1941
> URL: https://issues.jboss.org/browse/JBMESSAGING-1941
> Project: JBoss Messaging
> Issue Type: Bug
> Affects Versions: 1.4.8.SP9
> Environment: -JBoss Enterprise Application Platform (EAP) 5
> Reporter: Aaron Ogburn
> Attachments: 00779442testpatch.diff
>
>
> Messaging can stall JBoss at start up. The hang occurs as the main thread blocks in different locations in JMS trying to acquire the AspectManager instance lock:
> {noformat}
> "main" prio=10 tid=0x00007fe69c1e1000 nid=0x3613 waiting for monitor entry [0x00007fe6828e5000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.jboss.aop.AspectManager.getAdvisor(AspectManager.java:728)
> - waiting to lock <0x00000000a5e43e50> (a org.jboss.aop.AspectManager)
> at org.jboss.jms.server.endpoint.advised.SessionAdvised.<clinit>(SessionAdvised.java)
> at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:273)
> - locked <0x00000000a5600f48> (a org.jboss.aop.asintegration.jboss5.ScopedVFSClassLoaderDomain)
> at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:229)
> {noformat}
> Since the main thread is stalled, so too is start up. This lock is always held by another making a similar call like so:
> {noformat}
> "WorkManager(2)-1" daemon prio=10 tid=0x00007fe62dd96000 nid=0x363e in
> Object.wait() [0x00007fe68029a000]
> java.lang.Thread.State: RUNNABLE
> at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:273)
> - locked <0x00000000a5e43e50> (a org.jboss.aop.AspectManager)
> at org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:229)
> {noformat}
> The code there at line 273 is simple enough and shouldn't cause a long stall:
> {code:title=org.jboss.jms.server.endpoint.ServerConnectionEndpoint.java|borderStyle=solid}
> synchronized (AspectManager.instance())
> {
> advised = new SessionAdvised(ep); // line 273
> }
> {code}
> Nonetheless, it does clearly hang at this line as WorkManager(2)-1 never progresses. There's no GC/CPU issues or anything of that nature, but we get a slight in the thread dump as WorkManager(2)-1 is shown to be in Object.wait(). Checking the thread through pstack, we can see the issue. It is waiting in native/JDK level operations required to initialize the SessionAdvised class:
> {noformat}
> Thread 2 (Thread 0x7fe68029d700 (LWP 13886)):
> #0 0x0000003b3e40b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #1 0x00007fe6a12d447e in os:latformEvent:ark() () from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
> #2 0x00007fe6a12c54fa in ObjectMonitor::wait(long, bool, Thread*) () from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
> #3 0x00007fe6a13ab5f0 in ObjectSynchronizer::waitUninterruptibly(Handle, long, Thread*) () from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
> #4 0x00007fe6a107115b in instanceKlass::initialize_impl(instanceKlassHandle, Thread*) () from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
> #5 0x00007fe6a107077a in instanceKlass::initialize(Thread*) () from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
> #6 0x00007fe6a1097c5c in InterpreterRuntime::_new(JavaThread*, constantPoolOopDesc*, int) () from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
> {noformat}
> These calls are naturally synchronized internally in the JVM so only one thread can proceed through initialization of a class at a given time. Checking for any threads in process of initializing SessionAdvised, we can see only the main thread is. But the main thread is still blocked for the AspectManager lock.
> So we have a deadlock here. The main thread holds the initialization lock while waiting for the AspectManager lock, while the WorkManager(2)-1 thread holds the AspectManager lock while waiting for the initialization lock.
> The initialization of SessionAdvised is wrapped in the synchronized block so the main thread should not have been able to proceed into the initialization at all without already possessing the AspectManager lock. It locks the ScopedVFSClassLoaderDomain, which extends the AspectManager. This is different from the actual AspectManager though that main tries to lock in the clinit call and that it deadlocks on.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
More information about the jboss-jira
mailing list