[jboss-jira] [JBoss JIRA] Commented: (JGRP-620) RpcDispatcher.callRemoteMethods() hangs

P Eger (JIRA) jira-events at lists.jboss.org
Tue Nov 13 17:52:18 EST 2007


    [ http://jira.jboss.com/jira/browse/JGRP-620?page=comments#action_12387395 ] 
            
P Eger commented on JGRP-620:
-----------------------------

While browsing the code, I noticed that the following method in GroupRequest.java looks suspect (no pun intended ;-):


private void adjustMembership() {
  if(requests.isEmpty())
    return;

  Address mbr;
  Rsp rsp;
  synchronized(members) {
    for(Map.Entry<Address,Rsp> entry: requests.entrySet()) {
      mbr=entry.getKey();
      if((!this.members.contains(mbr)) || suspects.contains(mbr)) { //<-- should be !suspects.contains(mbr) ??
        addSuspect(mbr);
        rsp=entry.getValue();
        rsp.setValue(null);
        rsp.setSuspected(true);
      }
    }
  }
}
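
One way to compare the condition as written with the negated variant asked about in the inline comment is to evaluate both over the four possible membership/suspicion cases. This is only a sketch: the class SuspectCheck, its two methods, and the String addresses are hypothetical stand-ins and not part of JGroups.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for comparing the two conditions; not part of JGroups. */
class SuspectCheck {
    // Condition as it appears in adjustMembership() above
    static boolean current(Set<String> members, Set<String> suspects, String mbr) {
        return !members.contains(mbr) || suspects.contains(mbr);
    }

    // Variant with the negation asked about in the inline comment
    static boolean negated(Set<String> members, Set<String> suspects, String mbr) {
        return !members.contains(mbr) || !suspects.contains(mbr);
    }

    public static void main(String[] args) {
        Set<String> members = new HashSet<>(Arrays.asList("A", "B"));
        Set<String> suspects = new HashSet<>(Arrays.asList("B", "C"));
        // A: in view, not suspected    B: in view, suspected
        // C: left the view, suspected  D: left the view, not suspected
        for (String mbr : Arrays.asList("A", "B", "C", "D")) {
            System.out.println(mbr + " current=" + current(members, suspects, mbr)
                    + " negated=" + negated(members, suspects, mbr));
        }
    }
}
```

The two variants agree for members that have left the view (C and D) and differ only for members still in it: the negated form marks the healthy member A as suspected and skips the already suspected member B.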

Also this code seems strange, though not necessarily incorrect:

public void viewChange(View new_view) {

  ...

  if(modified || tmp != null) { //<-- should be just if(tmp != null)? "modified" variable appears useless / redundant
    synchronized(requests) {
      for(Address suspect: tmp) {
        addSuspect(suspect);
      }
      requests.notifyAll();
    }
  }
}

Also, two additional notes on the code:

1) What is the purpose of "max_suspects"? It seems to be arbitrarily set to 40.
2) Why is the GroupRequest locking so complicated? I see synchronization on "requests", "members", "this", and "suspects", which makes the code hard to reason about; the fix for http://jira.jboss.com/jira/browse/JGRP-554 is one example. Would you accept a patch to unify everything under one lock? I don't see any callouts/callbacks that would result in deadlocks if this is done, and I doubt the performance impact would be noticeable.
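
To make the single-lock idea in note 2 concrete, here is a minimal sketch. It is not the JGroups code: the class SingleLockRequest, its method names, and the use of String for addresses are all hypothetical, and it assumes no callbacks are invoked while the lock is held.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical, simplified stand-in for GroupRequest; not the JGroups class. */
class SingleLockRequest {
    // The single monitor guarding every field below
    // (assumption: no callbacks run while it is held)
    private final Object lock = new Object();
    private final Set<String> pending = new HashSet<>();   // members we still wait for
    private final Set<String> suspects = new HashSet<>();  // members we gave up on

    SingleLockRequest(Collection<String> targets) {
        pending.addAll(targets);
    }

    /** A response arrived from the given member. */
    void receiveResponse(String member) {
        synchronized (lock) {
            if (pending.remove(member)) lock.notifyAll();
        }
    }

    /** The member was reported as suspected. */
    void suspect(String member) {
        synchronized (lock) {
            if (pending.remove(member)) {
                suspects.add(member);
                lock.notifyAll();
            }
        }
    }

    /** A new view was installed: stop waiting for members that left. */
    void viewChange(Collection<String> newMembers) {
        synchronized (lock) {
            boolean changed = false;
            for (Iterator<String> it = pending.iterator(); it.hasNext();) {
                String mbr = it.next();
                if (!newMembers.contains(mbr)) {
                    it.remove();
                    suspects.add(mbr);
                    changed = true;
                }
            }
            if (changed) lock.notifyAll();
        }
    }

    /** Blocks until nothing is pending; timeoutMs <= 0 waits indefinitely.
     *  Returns false on timeout or interrupt. */
    boolean awaitAll(long timeoutMs) {
        synchronized (lock) {
            long deadline = System.currentTimeMillis() + timeoutMs;
            try {
                while (!pending.isEmpty()) {
                    if (timeoutMs <= 0) {
                        lock.wait();
                    } else {
                        long remaining = deadline - System.currentTimeMillis();
                        if (remaining <= 0) return false;
                        lock.wait(remaining);
                    }
                }
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
    }

    /** Snapshot of the suspected members. */
    Set<String> getSuspects() {
        synchronized (lock) {
            return new HashSet<>(suspects);
        }
    }
}
```

Because every state change and the matching notifyAll() happen under the same monitor that awaitAll() waits on, the waiter cannot miss a notification, and a membership change cannot interleave with a response or a suspicion.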



> RpcDispatcher.callRemoteMethods() hangs
> ---------------------------------------
>
>                 Key: JGRP-620
>                 URL: http://jira.jboss.com/jira/browse/JGRP-620
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>         Environment: RHEL4 64 bit: Linux svr5 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:13:42 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
> SUN JDK 1.5.0_12-b04 64 bit
>            Reporter: P Eger
>         Assigned To: Bela Ban
>         Attachments: stacktrace.txt, tcp-config.xml
>
>
> RpcDispatcher.callRemoteMethods() hangs when there is a lot of member churn (4 servers starting into a cluster of 4). One of the 4 servers starting up hangs with the attached stack trace.
> -------------------------------------------------------------------
> channel=new JChannel(jgroups_config_file);
> channel.setOpt(Channel.AUTO_RECONNECT, Boolean.TRUE);
> channel.addChannelListener(this);
> 		
> //TODO: verify these startup params
> //NOTE: deadlock detection leaks memory as of 2.5b2, do not use
> disp=new RpcDispatcher(channel, null, this, this, false, true);
> 		
> //force connect
> channel.connect(clusterName);
> MethodCall mc = new MethodCall("remoteBroadcastAvailability", new Object[]{peer,sequence,serviceStatus,rotationStatus}, new Class[]{Address.class,Long.class,ServiceStatus.class,RotationStatus.class});
> disp.callRemoteMethods(channel.getView().getMembers(), mc, GroupRequest.GET_ALL, 0);
> -------------------------------------------------------------------
> Name: main
> State: WAITING on java.util.HashMap at 2d95bbec
> Total blocked: 53  Total waited: 12
> Stack trace: 
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:474)
> org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:479)
> org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:190)
> org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:430)
> org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:199)
> org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:167)
> org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:163)
> utils.cluster.PeerClusterManager.broadcastAvailability(PeerClusterManager.java:1110)
> utils.cluster.PeerClusterManager.broadcastMyAvailability(PeerClusterManager.java:428)
> init.InitManager.ensureInitialized(InitManager.java:552)
> init.InitManager.__init(InitManager.java:409)
> init.InitManager.init(InitManager.java:300)
> init.ServletListener.contextInitialized(ServletListener.java:23)
> org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:3669)
> org.apache.catalina.core.StandardContext.start(StandardContext.java:4104)
> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1012)
> org.apache.catalina.core.StandardHost.start(StandardHost.java:718)
> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1012)
> org.apache.catalina.core.StandardEngine.start(StandardEngine.java:442)
> org.apache.catalina.core.StandardService.start(StandardService.java:450)
> org.apache.catalina.core.StandardServer.start(StandardServer.java:683)
> org.apache.catalina.startup.Catalina.start(Catalina.java:537)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> java.lang.reflect.Method.invoke(Method.java:585)
> org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:271)
> org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:409)


        


