[ http://jira.jboss.com/jira/browse/JGRP-620?page=comments#action_12387395 ]
P Eger commented on JGRP-620:
-----------------------------
Browsing the code, the following method in GroupRequest.java looks suspect (no pun
intended ;-):
    private void adjustMembership() {
        if(requests.isEmpty())
            return;

        Address mbr;
        Rsp rsp;
        synchronized(members) {
            for(Map.Entry<Address,Rsp> entry: requests.entrySet()) {
                mbr=entry.getKey();
                // <-- should be !suspects.contains(mbr) ??
                if((!this.members.contains(mbr)) || suspects.contains(mbr)) {
                    addSuspect(mbr);
                    rsp=entry.getValue();
                    rsp.setValue(null);
                    rsp.setSuspected(true);
                }
            }
        }
    }
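To make the question about the condition concrete, here is a small standalone sketch (not JGroups code; members are plain strings and the class and method names are made up for illustration) that evaluates both variants for the four possible cases. Under these assumptions the two variants agree for members that have left the view and differ only for members still in it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Standalone illustration; not JGroups code. Member names are plain strings.
class SuspectConditionDemo {

    // Condition as it appears in adjustMembership(): flag a member that has
    // left the view OR is already in the suspects list.
    static boolean current(Set<String> members, Set<String> suspects, String mbr) {
        return !members.contains(mbr) || suspects.contains(mbr);
    }

    // Variant raised in the comment: flag a member that has left the view
    // OR is NOT in the suspects list.
    static boolean proposed(Set<String> members, Set<String> suspects, String mbr) {
        return !members.contains(mbr) || !suspects.contains(mbr);
    }

    public static void main(String[] args) {
        Set<String> members = new HashSet<>(Arrays.asList("A", "B"));
        Set<String> suspects = new HashSet<>(Arrays.asList("B", "D"));
        // A: in view, not suspected    B: in view, suspected
        // C: left view, not suspected  D: left view, suspected
        for (String mbr : new String[] {"A", "B", "C", "D"}) {
            System.out.println(mbr
                + " current=" + current(members, suspects, mbr)
                + " proposed=" + proposed(members, suspects, mbr));
        }
    }
}
```

In this model the current condition flags B (in view, already suspected) but not A, while the proposed one flags A (in view, not suspected) but not B. Which behavior is intended depends on how the suspects list is maintained elsewhere in GroupRequest.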
Also this code seems strange, though not necessarily incorrect:
    public void viewChange(View new_view) {
        ...
        // <-- should be just if(tmp != null)?
        // "modified" variable appears useless / redundant
        if(modified || tmp != null) {
            synchronized(requests) {
                for(Address suspect: tmp) {
                    addSuspect(suspect);
                }
                requests.notifyAll();
            }
        }
    }
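One way to see why the "modified" part of the guard looks odd: in Java, an enhanced for loop over a null reference throws NullPointerException. The elided part of viewChange() may well guarantee that tmp is non-null whenever modified is true, so the following standalone sketch (hypothetical names, not JGroups code) only shows what a guard of this shape would do if that were not the case.

```java
import java.util.Arrays;
import java.util.List;

// Standalone illustration; not JGroups code. Suspects are plain strings.
class ViewChangeGuardDemo {

    // Same guard shape as the excerpt: enters the block if either flag holds,
    // then iterates tmp. Returns how many suspects were visited.
    static int withModified(boolean modified, List<String> tmp) {
        int visited = 0;
        if (modified || tmp != null) {
            for (String suspect : tmp) {   // throws NullPointerException if tmp is null
                visited++;
            }
        }
        return visited;
    }

    // Guard reduced to the null check alone.
    static int nullCheckOnly(List<String> tmp) {
        int visited = 0;
        if (tmp != null) {
            for (String suspect : tmp) {
                visited++;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(withModified(true, Arrays.asList("A", "B")));
        System.out.println(nullCheckOnly(null));
        try {
            withModified(true, null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: modified is true but tmp is null");
        }
    }
}
```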
Also, 2 additional notes on the code:
1) What's the deal with "max_suspects"? It seems arbitrarily set to 40.
2) Why is the GroupRequest locking so complicated? I see synchronization on
"requests", "members", "this", and "suspects", which seems to complicate things;
see the fix for http://jira.jboss.com/jira/browse/JGRP-554, for instance.
Would you accept a patch to unify everything under 1 lock? I don't see
any callouts/callbacks that would result in deadlocks if this is done, and I doubt
the performance impact would be noticeable.
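As a rough idea of what "everything under 1 lock" could look like, here is a minimal sketch. It is not the GroupRequest implementation and not a patch: SingleLockRequest, its fields and its methods are hypothetical, members are plain strings, and response values are plain objects. The only point is that responses, suspicions and view changes all update state and notify waiters under the same monitor.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a group request; names are illustrative.
class SingleLockRequest {
    private final Object lock = new Object();                   // the only monitor used
    private final Set<String> pending = new HashSet<>();        // members not yet accounted for
    private final Map<String, Object> responses = new HashMap<>();
    private final Set<String> suspects = new HashSet<>();

    SingleLockRequest(Collection<String> members) {
        pending.addAll(members);
    }

    void receiveResponse(String mbr, Object value) {
        synchronized (lock) {
            if (pending.remove(mbr)) {
                responses.put(mbr, value);
                lock.notifyAll();
            }
        }
    }

    void suspect(String mbr) {
        synchronized (lock) {
            suspects.add(mbr);
            if (pending.remove(mbr)) {
                lock.notifyAll();
            }
        }
    }

    // Members that are still pending but absent from the new view are treated as suspected.
    void viewChange(Collection<String> newMembers) {
        synchronized (lock) {
            boolean changed = false;
            for (Iterator<String> it = pending.iterator(); it.hasNext();) {
                String mbr = it.next();
                if (!newMembers.contains(mbr)) {
                    it.remove();
                    suspects.add(mbr);
                    changed = true;
                }
            }
            if (changed) {
                lock.notifyAll();
            }
        }
    }

    // Returns true once no member is pending, false on timeout.
    // A timeout <= 0 waits indefinitely.
    boolean awaitAll(long timeoutMillis) throws InterruptedException {
        synchronized (lock) {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (!pending.isEmpty()) {
                if (timeoutMillis <= 0) {
                    lock.wait();
                } else {
                    long remaining = deadline - System.currentTimeMillis();
                    if (remaining <= 0) {
                        return false;
                    }
                    lock.wait(remaining);
                }
            }
            return true;
        }
    }

    Set<String> suspectsSnapshot() {
        synchronized (lock) {
            return new HashSet<>(suspects);
        }
    }
}
```

Because every state change and every wait happens on the same monitor, a waiter cannot check the pending set under one lock while a notifier updates it under another, so a notification cannot slip in between the check and the wait. The cost is that all operations on a request serialize on that one lock.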
RpcDispatcher.callRemoteMethods() hangs
---------------------------------------
Key: JGRP-620
URL:
http://jira.jboss.com/jira/browse/JGRP-620
Project: JGroups
Issue Type: Bug
Affects Versions: 2.5.1
Environment: RHEL4 64 bit: Linux svr5 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16
17:13:42 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
SUN JDK 1.5.0_12-b04 64 bit
Reporter: P Eger
Assigned To: Bela Ban
Attachments: stacktrace.txt, tcp-config.xml
RpcDispatcher.callRemoteMethods() hangs when there is a lot of member churn (4 servers
starting into a cluster of 4). One of the 4 servers starting up is hung with the
attached stack trace.
-------------------------------------------------------------------
channel=new JChannel(jgroups_config_file);
channel.setOpt(Channel.AUTO_RECONNECT, Boolean.TRUE);
channel.addChannelListener(this);
//TODO: verify these startup params
//NOTE: deadlock detection leaks memory as of 2.5b2, do not use
disp=new RpcDispatcher(channel, null, this, this, false, true);
//force connect
channel.connect(clusterName);
MethodCall mc = new MethodCall("remoteBroadcastAvailability",
    new Object[]{peer,sequence,serviceStatus,rotationStatus},
    new Class[]{Address.class,Long.class,ServiceStatus.class,RotationStatus.class});
disp.callRemoteMethods(channel.getView().getMembers(), mc, GroupRequest.GET_ALL, 0);
-------------------------------------------------------------------
Name: main
State: WAITING on java.util.HashMap@2d95bbec
Total blocked: 53 Total waited: 12
Stack trace:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:474)
org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:479)
org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:190)
org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:430)
org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:199)
org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:167)
org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:163)
utils.cluster.PeerClusterManager.broadcastAvailability(PeerClusterManager.java:1110)
utils.cluster.PeerClusterManager.broadcastMyAvailability(PeerClusterManager.java:428)
init.InitManager.ensureInitialized(InitManager.java:552)
init.InitManager.__init(InitManager.java:409)
init.InitManager.init(InitManager.java:300)
init.ServletListener.contextInitialized(ServletListener.java:23)
org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:3669)
org.apache.catalina.core.StandardContext.start(StandardContext.java:4104)
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1012)
org.apache.catalina.core.StandardHost.start(StandardHost.java:718)
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1012)
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:442)
org.apache.catalina.core.StandardService.start(StandardService.java:450)
org.apache.catalina.core.StandardServer.start(StandardServer.java:683)
org.apache.catalina.startup.Catalina.start(Catalina.java:537)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
java.lang.reflect.Method.invoke(Method.java:585)
org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:271)
org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:409)