[
https://issues.jboss.org/browse/JBMESSAGING-1941?page=com.atlassian.jira....
]
Aaron Ogburn edited comment on JBMESSAGING-1941 at 3/5/13 12:49 PM:
--------------------------------------------------------------------
Saw another occurrence of this. It's a bit different but fundamentally the same:
{noformat}
"WorkManager(2)-2" daemon prio=10 tid=0x000000004079b000 nid=0x1f99 waiting for
monitor entry [0x00007f2d94a00000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.jboss.aop.AspectManager.getAdvisor(AspectManager.java:728)
- waiting to lock <0x000000077039a500> (a org.jboss.aop.AspectManager)
at
org.jboss.jms.client.delegate.ClientConnectionDelegate.<clinit>(ClientConnectionDelegate.java)
<-- native classloader lock grabbed
at org.jboss.jms.delegate.CreateConnectionResult.read(CreateConnectionResult.java:106)
at
org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateResponse.read(ConnectionFactoryCreateConnectionDelegateResponse.java:71)
at org.jboss.jms.wireformat.JMSWireFormat.read(JMSWireFormat.java:298)
at
org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedRead(MicroSocketClientInvoker.java:1301)
at
org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:907)
at
org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:461)
at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:167)
at org.jboss.remoting.Client.invoke(Client.java:2034)
at org.jboss.remoting.Client.invoke(Client.java:877)
"WorkManager(2)-3" daemon prio=10 tid=0x00007f2d9811d000 nid=0x2003 in
Object.wait() [0x00007f2d94125000]
java.lang.Thread.State: RUNNABLE
at
org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegateInternal(ServerConnectionFactoryEndpoint.java:296)
<-- waiting for that native cl lock
- locked <0x000000077039a500> (a org.jboss.aop.AspectManager)
at
org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegate(ServerConnectionFactoryEndpoint.java:170)
at
org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.org$jboss$jms$server$endpoint$advised$ConnectionFactoryAdvised$createConnectionDelegate$aop(ConnectionFactoryAdvised.java:108)
at
org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.createConnectionDelegate(ConnectionFactoryAdvised.java)
at
org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateRequest.serverInvoke(ConnectionFactoryCreateConnectionDelegateRequest.java:91)
at
org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:157)
{noformat}
So we're deadlocked still on the same locks. WorkManager(2)-2 is grabbing the native
cl lock then the AspectManager, while WorkManager(2)-3 grabs the AspectManager then the
native cl lock. I think the problem is the AspectManager synchronization that was added
in JBMESSAGING-797 as a JBM level workaround for JBAOP-360. I think this was fine until
synchronization was added in AOP to fix JBAOP-360 as well, and that AOP level fix is in
EAP 5 now to conflict with that prior messaging fix. JBAOP-360 resulted in
AspectManager.getAdvisor being made a synchronized method and would address the issue
without JBMESSAGING-797. JBAOP-360 and JBMESSAGING-797 together though causes the issue
as it introduces potential inverse use of the AspectManager and native cl lock.
The synchronization in AspectManager has been changed even further beyond JBAOP-360 (by
JBAOP-501 & JBAOP-653 for instance), but it seems pretty clear the synchronization we
have in place from JBMESSAGING-797 is a relic at this point not necessarily needed any
longer with the current AOP branch used on EAP 5. Can we remove the JBMESSAGING-797
changes to address this issue? If any AOP deadlock were to crop back up similar to
JBMESSAGING-797, then I think it should be fixed in AOP.
was (Author: aogburn):
Saw another occurrence of this. It's a bit different but fundamentally the same:
{noformat}
"WorkManager(2)-2" daemon prio=10 tid=0x000000004079b000 nid=0x1f99 waiting for
monitor entry [0x00007f2d94a00000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.jboss.aop.AspectManager.getAdvisor(AspectManager.java:728)
- waiting to lock <0x000000077039a500> (a org.jboss.aop.AspectManager)
at
org.jboss.jms.client.delegate.ClientConnectionDelegate.<clinit>(ClientConnectionDelegate.java)
<-- native classloader lock grabbed
at org.jboss.jms.delegate.CreateConnectionResult.read(CreateConnectionResult.java:106)
at
org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateResponse.read(ConnectionFactoryCreateConnectionDelegateResponse.java:71)
at org.jboss.jms.wireformat.JMSWireFormat.read(JMSWireFormat.java:298)
at
org.jboss.remoting.transport.socket.MicroSocketClientInvoker.versionedRead(MicroSocketClientInvoker.java:1301)
at
org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:907)
at
org.jboss.remoting.transport.bisocket.BisocketClientInvoker.transport(BisocketClientInvoker.java:461)
at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:167)
at org.jboss.remoting.Client.invoke(Client.java:2034)
at org.jboss.remoting.Client.invoke(Client.java:877)
"WorkManager(2)-3" daemon prio=10 tid=0x00007f2d9811d000 nid=0x2003 in
Object.wait() [0x00007f2d94125000]
java.lang.Thread.State: RUNNABLE
at
org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegateInternal(ServerConnectionFactoryEndpoint.java:296)
<-- waiting for that native cl lock
- locked <0x000000077039a500> (a org.jboss.aop.AspectManager)
at
org.jboss.jms.server.endpoint.ServerConnectionFactoryEndpoint.createConnectionDelegate(ServerConnectionFactoryEndpoint.java:170)
at
org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.org$jboss$jms$server$endpoint$advised$ConnectionFactoryAdvised$createConnectionDelegate$aop(ConnectionFactoryAdvised.java:108)
at
org.jboss.jms.server.endpoint.advised.ConnectionFactoryAdvised.createConnectionDelegate(ConnectionFactoryAdvised.java)
at
org.jboss.jms.wireformat.ConnectionFactoryCreateConnectionDelegateRequest.serverInvoke(ConnectionFactoryCreateConnectionDelegateRequest.java:91)
at
org.jboss.jms.server.remoting.JMSServerInvocationHandler.invoke(JMSServerInvocationHandler.java:157)
{noformat}
So we're deadlocked still on the same locks. WorkManager(2)-2 is grabbing the native
cl lock then the AspectManager, while WorkManager(2)-3 grabs the AspectManager then the
native cl lock. I think the problem is the AspectManager synchronization that was added
in JBMESSAGING-797 as a JBM level workaround for JBAOP-360. I think this was fine until
synchronization was added in AOP to fix JBAOP-360 as well, and that AOP level fix is in
EAP 5 now to conflict with that prior messaging fix. JBAOP-360 resulted in
AspectManager.getAdvisor being made a synchronized method and should address the issue
without JBMESSAGING-797. JBAOP-360 and JBMESSAGING-797 together though causes the issue
as it introduces potential inverse use of the AspectManager and native cl lock. Can we
remove the JBMESSAGING-797 changes to address this issue?
Messaging deadlocks on AspectManager
------------------------------------
Key: JBMESSAGING-1941
URL:
https://issues.jboss.org/browse/JBMESSAGING-1941
Project: JBoss Messaging
Issue Type: Bug
Affects Versions: 1.4.8.SP9
Environment: -JBoss Enterprise Application Platform (EAP) 5
Reporter: Aaron Ogburn
Attachments: 00779442testpatch.diff
Messaging can stall JBoss at start up. The hang occurs as the main thread blocks in
different locations in JMS trying to acquire the AspectManager instance lock:
{noformat}
"main" prio=10 tid=0x00007fe69c1e1000 nid=0x3613 waiting for monitor entry
[0x00007fe6828e5000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.jboss.aop.AspectManager.getAdvisor(AspectManager.java:728)
- waiting to lock <0x00000000a5e43e50> (a org.jboss.aop.AspectManager)
at
org.jboss.jms.server.endpoint.advised.SessionAdvised.<clinit>(SessionAdvised.java)
at
org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:273)
- locked <0x00000000a5600f48> (a
org.jboss.aop.asintegration.jboss5.ScopedVFSClassLoaderDomain)
at
org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:229)
{noformat}
Since the main thread is stalled, so too is start up. This lock is always held by
another making a similar call like so:
{noformat}
"WorkManager(2)-1" daemon prio=10 tid=0x00007fe62dd96000 nid=0x363e in
Object.wait() [0x00007fe68029a000]
java.lang.Thread.State: RUNNABLE
at
org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:273)
- locked <0x00000000a5e43e50> (a org.jboss.aop.AspectManager)
at
org.jboss.jms.server.endpoint.ServerConnectionEndpoint.createSessionDelegate(ServerConnectionEndpoint.java:229)
{noformat}
The code there at line 273 is simple enough and shouldn't cause a long stall:
{code:title=org.jboss.jms.server.endpoint.ServerConnectionEndpoint.java|borderStyle=solid}
synchronized (AspectManager.instance())
{
advised = new SessionAdvised(ep); // line 273
}
{code}
Nonetheless, it does clearly hang at this line as WorkManager(2)-1 never progresses.
There's no GC/CPU issues or anything of that nature, but we get a slight in the thread
dump as WorkManager(2)-1 is shown to be in Object.wait(). Checking the thread through
pstack, we can see the issue. It is waiting in native/JDK level operations required to
initialize the SessionAdvised class:
{noformat}
Thread 2 (Thread 0x7fe68029d700 (LWP 13886)):
#0 0x0000003b3e40b43c in pthread_cond_wait@(a)GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007fe6a12d447e in os:latformEvent:ark() () from
/opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
#2 0x00007fe6a12c54fa in ObjectMonitor::wait(long, bool, Thread*) () from
/opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
#3 0x00007fe6a13ab5f0 in ObjectSynchronizer::waitUninterruptibly(Handle, long, Thread*)
() from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
#4 0x00007fe6a107115b in instanceKlass::initialize_impl(instanceKlassHandle, Thread*) ()
from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
#5 0x00007fe6a107077a in instanceKlass::initialize(Thread*) () from
/opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
#6 0x00007fe6a1097c5c in InterpreterRuntime::_new(JavaThread*, constantPoolOopDesc*,
int) () from /opt/app/jvm/jdk1.6.0_38/jre/lib/amd64/server/libjvm.so
{noformat}
These calls are naturally synchronized internally in the JVM so only one thread can
proceed through initialization of a class at a given time. Checking for any threads in
process of initializing SessionAdvised, we can see only the main thread is. But the main
thread is still blocked for the AspectManager lock.
So we have a deadlock here. The main thread holds the initialization lock while waiting
for the AspectManager lock, while the WorkManager(2)-1 thread holds the AspectManager lock
while waiting for the initialization lock.
The initialization of SessionAdvised is wrapped in the synchronized block so the main
thread should not have been able to proceed into the initialization at all without already
possessing the AspectManager lock. It locks the ScopedVFSClassLoaderDomain, which extends
the AspectManager. This is different from the actual AspectManager though that main tries
to lock in the clinit call and that it deadlocks on.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
http://www.atlassian.com/software/jira