]
Dipak Kothari commented on JGRP-594:
------------------------------------
Hi,
I have worked on this issue again. As the problem is intermittent, I created a test that
ran for 12-24 hours, capturing the output.
Details of test:
Started 60 services, each adding (put) entries into the ReplicatedHashMap<String,
String>. After 3 minutes, the logs were checked for a NullPointerException. If the
exception was found, the logs were backed up. The services were then killed
using "kill -9" and the process repeated.
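The check-and-back-up step described above could be sketched roughly as follows. This is not the harness used for the test, only an illustration of the idea; the class name NpeLogCheck, its method names, and the backup file naming are hypothetical, and it assumes the exception appears in the log under its fully qualified name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical helper, not part of JGroups or of the original test.
class NpeLogCheck {
    // Assumption: the logger prints the fully qualified exception name.
    static final String MARKER = "java.lang.NullPointerException";

    // Returns true if any log line mentions a NullPointerException.
    static boolean containsNpe(List<String> lines) {
        for (String line : lines) {
            if (line.contains(MARKER)) {
                return true;
            }
        }
        return false;
    }

    // Copies the log into backupDir when it contains the exception.
    // Returns true if a backup was taken.
    static boolean backupIfNpe(Path log, Path backupDir) throws IOException {
        if (!containsNpe(Files.readAllLines(log))) {
            return false;
        }
        Files.createDirectories(backupDir);
        // Timestamp prefix keeps backups from successive runs apart.
        Path target = backupDir.resolve(
                System.currentTimeMillis() + "-" + log.getFileName());
        Files.copy(log, target, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }
}
```

A driver would call backupIfNpe once per service log after the 3-minute wait, then kill the services and start the next round.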
Snippet of code:
// Create the replicated map and log the initial group size.
registry = new ReplicatedHashMap(GROUP, factory, properties, false, 10*1000);
LOGGER.debug("Group membership = " +
    registry.getChannel().getView().getMembers().size());
localAddress = registry.getLocalAddress();
LOGGER.debug("Local address = " + localAddress);

// Register the channel with the platform MBean server for JMX monitoring.
channel = (JChannel)registry.getChannel();
MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
JmxConfigurator.registerChannel(channel, mbs, "JGroups.channel",
    channel.getChannelName(), true);
registry.addNotifier(new RHMListener());
dumpMap();

// Publish the service state, then update it after a random delay of up to 5 seconds.
registry.put(service, "STARTING");
LOGGER.debug("Successfully put value");
double delay = Math.random();
Thread.sleep((long)(delay*5000));
registry.put(service, "RUNNING");
LOGGER.debug("Updated state of service ...");

// Every 10 seconds, log the group size and dump the map.
while(true) {
    try {
        Thread.sleep(10*1000);
        LOGGER.debug("Group membership = " +
            registry.getChannel().getView().getMembers().size());
        dumpMap();
    }
    catch(Exception ex) {
        LOGGER.debug("Failed to get size details", ex);
    }
}
This was done with JGroups 2.5.0 (our current version) and with JGroups 2.6.1. With 2.5.0,
we got 2 occurrences in a 12-hour run. With 2.6.1, I got it to occur once in a 24-hour run
(although it happened after 4-5 hours). I also got a number of errors reported.
I have attached the log file for the 2.6.1 run.
Let me know if you require any other information.
Thanks.
Intermittently, a Null pointer exception is thrown when trying to
remove non-existent entry in ReplicatedHashMap
----------------------------------------------------------------------------------------------------------------
Key: JGRP-594
URL:
http://jira.jboss.com/jira/browse/JGRP-594
Project: JGroups
Issue Type: Bug
Affects Versions: 2.5
Environment: Linux
Reporter: Dipak Kothari
Assigned To: Bela Ban
Fix For: 2.7
Intermittently, when an entry is removed from a ReplicatedHashMap (where the entry does
not exist) the following exception is thrown:
java.lang.RuntimeException: remove(APMExample.Services.examples.ServerA09) failed
at org.jgroups.blocks.ReplicatedHashMap.remove(ReplicatedHashMap.java:405)
at com.ubs.apm.control.service.nameservice.jgroup.JGroupNameService.unRegisterService(JGroupNameService.java:132)
at com.ubs.apm.control.sensors.ControlSensorManager.cleanup(ControlSensorManager.java:468)
at com.ubs.apm.control.sensors.ControlSensorManager.<init>(ControlSensorManager.java:125)
at com.ubs.apm.control.sensors.ControlSensorManager.<init>(ControlSensorManager.java:106)
at com.ubs.apm.control.example.ManagedServer.init(ManagedServer.java:20)
at com.ubs.apm.control.example.ManagedServer.main(ManagedServer.java:45)
Caused by: java.lang.RuntimeException: failed executing request [req_id=1189617222436
caller=14.64.61.201:6825
14.64.61.201:6838: sender=14.64.61.201:6838, retval=null, received=false,
suspected=false
.... many such lines ...
14.64.61.201:6860: sender=14.64.61.201:6860, retval=null, received=false,
suspected=false
14.64.61.201:6847: sender=14.64.61.201:6847, retval=null, received=false,
suspected=false
request_msg: [dst: <null>, src: 14.64.61.201:6825 (2 headers), size=143 bytes]
rsp_mode: GET_NONE
done: true
timeout: 5000
expected_mbrs: 0 ([14.64.61.201:6815, 14.64.61.201:6816, 14.64.61.201:6824,
14.64.61.201:6825, 14.64.61.201:6826, 14.64.61.201:6827, 14.64.61.201:6828,
14.64.61.201:6829, 14.64.61.201:6830, 14.64.61.201:6831, 14.64.61.201:6833,
14.64.61.201:6834, 14.64.61.201:6835, 14.64.61.201:6836, 14.64.61.201:6837,
14.64.61.201:6838, 14.64.61.201:6839, 14.64.61.201:6842, 14.64.61.201:6843,
14.64.61.201:6844, 14.64.61.201:6845, 14.64.61.201:6846, 14.64.61.201:6847,
14.64.61.201:6848, 14.64.61.201:6849, 14.64.61.201:6850, 14.64.61.201:6851,
14.64.61.201:6852, 14.64.61.201:6853, 14.64.61.201:6854, 14.64.61.201:6855,
14.64.61.201:6856, 14.64.61.201:6857, 14.64.61.201:6858, 14.64.61.201:6859,
14.64.61.201:6860, 14.64.61.201:6861, 14.64.61.201:6862, 14.64.61.201:6863,
14.64.61.201:6864, 14.64.61.201:6865, 14.64.61.201:6866])]
at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:433)
at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:199)
at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:167)
at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:163)
at org.jgroups.blocks.ReplicatedHashMap.remove(ReplicatedHashMap.java:402)
... 6 more
Caused by: java.lang.RuntimeException: failure adding msg [dst: <null>, src:
14.64.61.201:6825 (2 headers), size=143 bytes] to the retransmit table for
14.64.61.201:6825
at org.jgroups.protocols.pbcast.NAKACK.send(NAKACK.java:636)
at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:438)
at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:317)
at org.jgroups.protocols.pbcast.GMS.down(GMS.java:782)
at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:221)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:339)
at org.jgroups.JChannel.downcall(JChannel.java:1240)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:752)
at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:301)
at org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:440)
at org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:190)
at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:430)
... 10 more
Caused by: java.lang.NullPointerException
at org.jgroups.protocols.pbcast.NAKACK.send(NAKACK.java:632)
... 21 more
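In the trace above, the NullPointerException is wrapped in three RuntimeExceptions, so a caller of remove() only sees the outermost one. A minimal sketch of how calling code could unwrap the chain to recognise this case is shown below; the class name RootCause is hypothetical and this is not part of JGroups.

```java
// Hypothetical helper for illustration; not part of JGroups.
class RootCause {
    // Follows getCause() links until the innermost throwable is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // The second condition guards against a throwable that reports itself as its cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }
}
```

For example, a catch block around registry.remove(...) could test whether rootCause(ex) is an instance of NullPointerException before deciding to retry or to log the failure.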