[jboss-jira] [JBoss JIRA] Commented: (JGRP-594) Intermittently, a Null pointer exception is thrown when trying to remove non-existent entry in ReplicatedHashMap

Dipak Kothari (JIRA) jira-events at lists.jboss.org
Thu Dec 13 06:33:51 EST 2007


    [ http://jira.jboss.com/jira/browse/JGRP-594?page=comments#action_12391935 ] 
            
Dipak Kothari commented on JGRP-594:
------------------------------------

Hi,

I have worked on this issue again.  As the problem is intermittent, I created a test that ran for 12-24 hours, capturing the output.

Details of test:

Started 60 services, each adding (put) entries into the ReplicatedHashMap<String, String>.  After 3 minutes, the logs were checked for a null pointer exception.  If the exception was found, the test took a backup of the logs.  The services were then killed using "kill -9" and the process repeated.
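
The log check and backup step described above could look roughly like the following.  This is only a sketch of the idea, not the actual test harness: the class name NpeLogCheck, the method names, and the assumption that all service logs sit as plain files in one directory are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: scans service logs for the NPE and backs up any that contain it.
public class NpeLogCheck {
    static final String MARKER = "java.lang.NullPointerException";

    /** Returns true if any line of the log file contains the NPE marker. */
    static boolean containsNpe(Path log) throws IOException {
        try (Stream<String> lines = Files.lines(log)) {
            return lines.anyMatch(line -> line.contains(MARKER));
        }
    }

    /** Copies every regular file in logDir that contains the marker into backupDir. */
    static List<Path> backupFailingLogs(Path logDir, Path backupDir) throws IOException {
        List<Path> saved = new ArrayList<>();
        Files.createDirectories(backupDir);
        try (Stream<Path> files = Files.list(logDir)) {
            for (Path log : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(log) && containsNpe(log)) {
                    Path target = backupDir.resolve(log.getFileName());
                    Files.copy(log, target, StandardCopyOption.REPLACE_EXISTING);
                    saved.add(target);
                }
            }
        }
        return saved;
    }
}
```

A driver would call backupFailingLogs once per 3-minute cycle, before killing the services and starting the next round.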

Snippet of code:

    registry = new ReplicatedHashMap(GROUP, factory, properties, false, 10*1000);
    LOGGER.debug("Group membership = " + registry.getChannel().getView().getMembers().size());
    localAddress = registry.getLocalAddress();
    LOGGER.debug("Local address = " + localAddress);
    channel = (JChannel)registry.getChannel();
    MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
    JmxConfigurator.registerChannel(channel, mbs, "JGroups.channel", channel.getChannelName(), true);
    registry.addNotifier(new RHMListener());
    dumpMap();
    registry.put(service, "STARTING");
    LOGGER.debug("Successfully put value");
    double delay = Math.random();
    Thread.sleep((long)(delay*5000));
    registry.put(service, "RUNNING");
    LOGGER.debug("Updated state of service ...");
    while(true) {
      try {
        Thread.sleep(10*1000);
        LOGGER.debug("Group membership = " + registry.getChannel().getView().getMembers().size());
        dumpMap();
      }
      catch(Exception ex) {
        LOGGER.debug("Failed to get size details", ex);
      }
    }
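
dumpMap() is not shown in the snippet above.  For anyone trying to reproduce the test, a stand-in could be as simple as the sketch below.  It is an assumption about what such a helper does, not the code used in the test: the class MapDump and the method dump are hypothetical, and it assumes the registry can be passed as a plain java.util.Map<String, String>.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for dumpMap(): formats the registry contents for logging.
public class MapDump {
    /** Returns the entry count followed by one "key = value" line per entry, sorted by key. */
    static String dump(Map<String, String> registry) {
        StringBuilder sb = new StringBuilder("Registry (" + registry.size() + " entries)\n");
        // Copy into a TreeMap so the output order is stable between dumps.
        for (Map.Entry<String, String> e : new TreeMap<>(registry).entrySet()) {
            sb.append("  ").append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

The result would be passed to LOGGER.debug, so that each 10-second cycle records both the group size and the replicated state seen by that member.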

This was done for JGroups 2.5.0 (our current version) and for JGroups 2.6.1.  With 2.5.0, we got 2 occurrences in a 12-hour run.  With 2.6.1, I got it to occur once in a 24-hour run (it happened after 4-5 hours).  I also get a number of errors reported.

I have attached the log file for the 2.6.1 run.

Let me know if you require any other information.

Thanks.

> Intermittently, a Null pointer exception is thrown when trying to remove non-existent entry in ReplicatedHashMap
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: JGRP-594
>                 URL: http://jira.jboss.com/jira/browse/JGRP-594
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.5
>         Environment: Linux
>            Reporter: Dipak Kothari
>         Assigned To: Bela Ban
>             Fix For: 2.7
>
>
> Intermittently, when an entry is removed from a ReplicatedHashMap (where the entry does not exist) the following exception is thrown:
> java.lang.RuntimeException: remove(APMExample.Services.examples.ServerA09) failed
>         at org.jgroups.blocks.ReplicatedHashMap.remove(ReplicatedHashMap.java:405)
>         at com.ubs.apm.control.service.nameservice.jgroup.JGroupNameService.unRegisterService(JGroupNameService.java:132)
>         at com.ubs.apm.control.sensors.ControlSensorManager.cleanup(ControlSensorManager.java:468)
>         at com.ubs.apm.control.sensors.ControlSensorManager.<init>(ControlSensorManager.java:125)
>         at com.ubs.apm.control.sensors.ControlSensorManager.<init>(ControlSensorManager.java:106)
>         at com.ubs.apm.control.example.ManagedServer.init(ManagedServer.java:20)
>         at com.ubs.apm.control.example.ManagedServer.main(ManagedServer.java:45)
> Caused by: java.lang.RuntimeException: failed executing request [req_id=1189617222436
> caller=14.64.61.201:6825
> 14.64.61.201:6838: sender=14.64.61.201:6838, retval=null, received=false, suspected=false
> .... many such lines ...
> 14.64.61.201:6860: sender=14.64.61.201:6860, retval=null, received=false, suspected=false
> 14.64.61.201:6847: sender=14.64.61.201:6847, retval=null, received=false, suspected=false
> request_msg: [dst: <null>, src: 14.64.61.201:6825 (2 headers), size=143 bytes]
> rsp_mode: GET_NONE
> done: true
> timeout: 5000
> expected_mbrs: 0 ([14.64.61.201:6815, 14.64.61.201:6816, 14.64.61.201:6824, 14.64.61.201:6825, 14.64.61.201:6826, 14.64.61.201:6827, 14.64.61.201:6828, 14.64.61.201:6829, 14.64.61.201:6830, 14.64.61.201:6831, 14.64.61.201:6833, 14.64.61.201:6834, 14.64.61.201:6835, 14.64.61.201:6836, 14.64.61.201:6837, 14.64.61.201:6838, 14.64.61.201:6839, 14.64.61.201:6842, 14.64.61.201:6843, 14.64.61.201:6844, 14.64.61.201:6845, 14.64.61.201:6846, 14.64.61.201:6847, 14.64.61.201:6848, 14.64.61.201:6849, 14.64.61.201:6850, 14.64.61.201:6851, 14.64.61.201:6852, 14.64.61.201:6853, 14.64.61.201:6854, 14.64.61.201:6855, 14.64.61.201:6856, 14.64.61.201:6857, 14.64.61.201:6858, 14.64.61.201:6859, 14.64.61.201:6860, 14.64.61.201:6861, 14.64.61.201:6862, 14.64.61.201:6863, 14.64.61.201:6864, 14.64.61.201:6865, 14.64.61.201:6866])]
>         at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:433)
>         at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:199)
>         at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:167)
>         at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:163)
>         at org.jgroups.blocks.ReplicatedHashMap.remove(ReplicatedHashMap.java:402)
>         ... 6 more
> Caused by: java.lang.RuntimeException: failure adding msg [dst: <null>, src: 14.64.61.201:6825 (2 headers), size=143 bytes] to the retransmit table for 14.64.61.201:6825
>         at org.jgroups.protocols.pbcast.NAKACK.send(NAKACK.java:636)
>         at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:438)
>         at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:317)
>         at org.jgroups.protocols.pbcast.GMS.down(GMS.java:782)
>         at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:221)
>         at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:339)
>         at org.jgroups.JChannel.downcall(JChannel.java:1240)
>         at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:752)
>         at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:301)
>         at org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:440)
>         at org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:190)
>         at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:430)
>         ... 10 more
> Caused by: java.lang.NullPointerException
>         at org.jgroups.protocols.pbcast.NAKACK.send(NAKACK.java:632)
>         ... 21 more

