Brian Stansberry updated JBAS-2647:
-----------------------------------
Fix Version/s: JBossAS-5.0.0.CR1
(was: JBossAS-5.0.0.Beta4)
I'll do this for CR1. It's fairly low priority as it's a quite unlikely
scenario; I'm more concerned about breaking something by changing the logic.
Remove potential deadlock condition from HASingletonSupport
-----------------------------------------------------------
Key: JBAS-2647
URL: http://jira.jboss.com/jira/browse/JBAS-2647
Project: JBoss Application Server
Issue Type: Sub-task
Security Level: Public(Everyone can see)
Components: Clustering
Reporter: Brian Stansberry
Assigned To: Brian Stansberry
Fix For: JBossAS-5.0.0.CR1
The startService() implementation HASingletonSupport inherits from HAServiceMBeanSupport
has a slight potential for deadlock if a cluster topology change occurs while the
singleton service itself is being deployed. The only known use case where this would
occur is with the HASingletonDeployer service.
Details:
In Thread A
1) HASingletonDeployerServices is being deployed, and therefore has synchronized on
org.jboss.system.ServiceController.
2) Calls DRM.registerListener()
3) Calls DRM.add() (this is the next line of code)
4) As part of add processing, DRM calls back to the HASingleton.
5) Inside a synchronized block in the callback method, singleton determines if it is the
master node, goes on to do its work.
Problem occurs if a cluster topology change occurs between steps 2 and 3. In that case,
the following would happen in another thread, Thread B.
1) Topology changes, so DRM notifies listeners.
2) Our HASingleton is registered as a listener, so step 5 above occurs.
3) Since it's the master, it goes and tries to deploy things in deploy-hasingleton.
4) Deployment can't proceed because Thread A has synchronized on
org.jboss.system.ServiceController.
5) Thread A can't proceed because Thread B is stuck inside the synchronized block in
the callback method. Deadlock.
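
The lock ordering in the two threads above can be sketched in isolation. This is only an illustration, not the JBoss code: the two ReentrantLock fields are hypothetical stand-ins for the monitor on org.jboss.system.ServiceController and for the synchronized block in the callback method, and tryLock() with a timeout is used so the sketch reports the blocked state instead of hanging the way the real monitors would.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {
    // Hypothetical stand-in for the monitor on org.jboss.system.ServiceController.
    static final ReentrantLock serviceController = new ReentrantLock();
    // Hypothetical stand-in for the synchronized block in the singleton's callback.
    static final ReentrantLock callback = new ReentrantLock();

    /** Runs both threads and returns how many failed to get their second lock. */
    static int run() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Thread A: holds ServiceController (deployment), then needs the callback lock.
        Thread a = new Thread(() -> acquireInOrder(
                serviceController, callback, bothHoldFirst, bothTried, blocked), "Thread A");
        // Thread B: holds the callback lock (topology change), then needs ServiceController.
        Thread b = new Thread(() -> acquireInOrder(
                callback, serviceController, bothHoldFirst, bothTried, blocked), "Thread B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked.get();
    }

    static void acquireInOrder(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                               AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until each thread holds its first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With plain synchronized blocks this attempt would wait forever.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep the first lock until both attempts have finished.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads blocked on their second lock: " + run());
    }
}
```

Each thread takes the two locks in the opposite order, which is the essence of steps 4 and 5: neither can finish while the other holds what it needs.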
This is an unlikely scenario, but I'm marking this issue as major since if it does
occur it deadlocks the node.
A likely fix will involve overriding the startService() implementation so it doesn't
rely on the callback to determine whether or not it's the master node. Instead it directly
does what the callback code does, and then registers as a listener. Have to be careful
not to drop any topology changes in the middle.
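
A rough sketch of that ordering, under loud assumptions: ToyRegistry and ToySingleton are hypothetical toy classes, not the DRM or HASingletonSupport APIs, and "first node in the view is the master" is an assumed election rule used only for illustration. The sketch decides mastership directly from the current view, then registers as a listener, then re-reads the view once so a change that landed in the gap isn't dropped. The gapHook parameter exists only to simulate such a change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class StartOrderSketch {
    /** Hypothetical stand-in for the replicant registry; NOT the real DRM API. */
    static class ToyRegistry {
        private final List<String> nodes = new ArrayList<>();
        private final List<Consumer<List<String>>> listeners = new ArrayList<>();

        synchronized List<String> view() {
            return new ArrayList<>(nodes);
        }

        synchronized void addListener(Consumer<List<String>> listener) {
            listeners.add(listener);
        }

        /** Simulates a topology change; listeners are notified outside the lock. */
        void setNodes(List<String> newNodes) {
            List<String> snapshot;
            List<Consumer<List<String>>> toNotify;
            synchronized (this) {
                nodes.clear();
                nodes.addAll(newNodes);
                snapshot = new ArrayList<>(nodes);
                toNotify = new ArrayList<>(listeners);
            }
            for (Consumer<List<String>> listener : toNotify) {
                listener.accept(snapshot);
            }
        }
    }

    /** Hypothetical singleton; only tracks the master flag, does no deployment work. */
    static class ToySingleton {
        private final String self;
        private final ToyRegistry registry;
        private boolean master;

        ToySingleton(String self, ToyRegistry registry) {
            this.self = self;
            this.registry = registry;
        }

        void start(Runnable gapHook) {
            // 1) Decide directly from the current view instead of waiting for a callback.
            onView(registry.view());
            // Test hook: a topology change arriving before we are registered.
            gapHook.run();
            // 2) Only now register for later topology changes.
            registry.addListener(this::onView);
            // 3) Re-read the view so a change that landed in the gap is not lost.
            onView(registry.view());
        }

        // Assumed election rule for this sketch: first node in the view is master.
        synchronized void onView(List<String> view) {
            master = !view.isEmpty() && view.get(0).equals(self);
        }

        synchronized boolean isMaster() {
            return master;
        }
    }

    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        registry.setNodes(Arrays.asList("n1", "n2"));
        ToySingleton singleton = new ToySingleton("n2", registry);
        singleton.start(() -> registry.setNodes(Arrays.asList("n2")));
        System.out.println("n2 is master: " + singleton.isMaster());
    }
}
```

Whether a re-read after registration is enough in the real service depends on how DRM delivers notifications, so treat step 3 as one possible way to cover the gap rather than the fix itself.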