JBoss 6.0.0 fails to restart HA Singletons after recovering from a split brain
-------------------------------------------------------------------------------
Key: JBAS-9456
URL: https://issues.jboss.org/browse/JBAS-9456
Project: Legacy JBoss Application Server 6
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Clustering
Affects Versions: 6.0.0.Final
Environment: Any
Reporter: Robert Hayward
Assignee: Paul Ferraro
We've been running JBoss 6.0.0 clustered across 2 boxes, with a number of HA
Singletons deployed. A brief network outage caused the cluster to split and the HA
Singletons to start up on the second box as well. After the network issues were resolved,
the JBoss instances correctly re-clustered, but the HA Singletons remained running on
both boxes.
I believe that they should have stopped automatically, and that only the HA Singletons
on the master node should have started back up.
I've finally tracked the issue down to common/lib/jboss-ha-server-core.jar, working from
the source code at
http://grepcode.com/snapshot/repository.jboss.org/nexus/content/repositor...
The bug is in the file:
org/jboss/ha/core/framework/server/DistributedReplicantManagerImpl.java
In the method:
    /**
     * Add a replicant to the replicants map.
     * @param key replicant key name
     * @param nodeName name of the node that adds this replicant
     * @param replicant Serialized representation of the replica
     * @return true, if this replicant was newly added to the map, false otherwise
     */
    protected boolean addReplicant(String key, String nodeName, Serializable replicant)
    {
        ConcurrentMap<String, Serializable> map = new ConcurrentHashMap<String, Serializable>();
        ConcurrentMap<String, Serializable> existingMap = this.replicants.putIfAbsent(key, map);
        return (((existingMap != null) ? existingMap : map).put(nodeName, replicant) != null);
    }
The last line of the method should be changed to:
        return (((existingMap != null) ? existingMap : map).put(nodeName, replicant) == null);
According to its Javadoc, addReplicant() should return true if the replicant wasn't
previously in the map, which is exactly the case in which Map.put() returns null. With the
current "!= null" comparison the result is inverted. It looks like the return value of this
method is only checked when merging a split cluster.
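
To illustrate the inverted return value outside of JBoss, here is a minimal standalone
sketch. The class ReplicantMapSketch and the method names addReplicantAsReported,
addReplicantProposed and mapFor are hypothetical, made up for this example. Only the
putIfAbsent()/put() logic is taken from the method quoted above. It is not the actual
DistributedReplicantManagerImpl and does not reproduce the cluster merge.

```java
import java.io.Serializable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the replicant bookkeeping; not JBoss code.
class ReplicantMapSketch {
    // key -> (node name -> replicant), the same shape the quoted method works on
    private final ConcurrentMap<String, ConcurrentMap<String, Serializable>> replicants =
            new ConcurrentHashMap<String, ConcurrentMap<String, Serializable>>();

    // Comparison as quoted in the report: true only when an existing entry was replaced.
    boolean addReplicantAsReported(String key, String nodeName, Serializable replicant) {
        return mapFor(key).put(nodeName, replicant) != null;
    }

    // Comparison as proposed in the report: true only when the entry is new.
    boolean addReplicantProposed(String key, String nodeName, Serializable replicant) {
        return mapFor(key).put(nodeName, replicant) == null;
    }

    // Returns the per-key map, creating it atomically if it does not exist yet.
    private ConcurrentMap<String, Serializable> mapFor(String key) {
        ConcurrentMap<String, Serializable> map = new ConcurrentHashMap<String, Serializable>();
        ConcurrentMap<String, Serializable> existingMap = this.replicants.putIfAbsent(key, map);
        return (existingMap != null) ? existingMap : map;
    }

    public static void main(String[] args) {
        ReplicantMapSketch reported = new ReplicantMapSketch();
        // Map.put() returns null for a new entry, so "!= null" reports false here.
        System.out.println(reported.addReplicantAsReported("svc", "nodeA", "stub")); // false
        System.out.println(reported.addReplicantAsReported("svc", "nodeA", "stub")); // true

        ReplicantMapSketch proposed = new ReplicantMapSketch();
        System.out.println(proposed.addReplicantProposed("svc", "nodeA", "stub"));   // true
        System.out.println(proposed.addReplicantProposed("svc", "nodeA", "stub"));   // false
    }
}
```

With the "== null" comparison, the first add for a node returns true and a repeated add
returns false, which matches the @return description in the Javadoc.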
Probably affects JBoss 6.1.0 - not sure about 7.X.X though.