Dipak Kothari commented on JGRP-594:
------------------------------------
The initialisation of my service initialises a number of items, including joining the
JGroups group. If an exception is thrown, it does some cleanup, including removing the
service from the ReplicatedHashMap. So my code looked something like this:
try {
    // ... initialisation, including joining the group ...
} catch (Exception ex) {
    cleanup();
    throw ex;
}
Unfortunately, I could not see what caused the initial exception, because the cleanup
itself threw a runtime exception. However, I can confirm that the service did not
register (join the group).
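One way to keep the initial exception visible when the cleanup also fails is to catch the cleanup failure separately and attach it to the original. The following is only a minimal sketch, assuming Java 7 or later for addSuppressed (the lambdas in main need Java 8). The Step interface and the initialise method are hypothetical stand-ins, not part of JGroups or of the service code described here. On an older JVM, logging the cleanup exception instead of attaching it would serve the same purpose.

```java
public class InitWithCleanup {

    /** Hypothetical stand-in for an initialisation or cleanup step. */
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs init; if it fails, runs cleanup and rethrows the ORIGINAL
     * exception. A failure inside cleanup is attached as a suppressed
     * exception instead of replacing the original one.
     */
    static void initialise(Step init, Step cleanup) throws Exception {
        try {
            init.run();
        } catch (Exception ex) {
            try {
                cleanup.run();
            } catch (Exception cleanupEx) {
                // Keep the first failure as the primary exception.
                ex.addSuppressed(cleanupEx);
            }
            throw ex;
        }
    }

    public static void main(String[] args) {
        try {
            initialise(
                () -> { throw new IllegalStateException("join failed"); },
                () -> { throw new RuntimeException("remove(...) failed"); });
        } catch (Exception ex) {
            System.out.println("primary: " + ex.getMessage());
            for (Throwable t : ex.getSuppressed()) {
                System.out.println("suppressed: " + t.getMessage());
            }
        }
    }
}
```

With this shape, the stack trace of the failed initialisation is reported first and the cleanup failure appears beneath it as a suppressed exception.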
My configuration:
<config>
<TCP
start_port="${start_port}"
bind_addr="${bind_addr}"
loopback="true"
discard_incompatible_packets="true"
use_send_queues="false"
max_bundle_size="64000"
max_bundle_timeout="30"
enable_bundling="false"
sock_conn_timeout="300"
skip_suspected_members="true"
enable_diagnostics="false"
use_concurrent_stack="true"
thread_pool.enabled="true"
thread_pool.min_threads="1"
thread_pool.max_threads="8"
thread_pool.keep_alive_time="5000"
thread_pool.queue_enabled="true"
thread_pool.queue_max_size="100"
thread_pool.rejection_policy="Run"
oob_thread_pool.enabled="true"
oob_thread_pool.min_threads="1"
oob_thread_pool.max_threads="8"
oob_thread_pool.keep_alive_time="5000"
oob_thread_pool.queue_enabled="true"
oob_thread_pool.queue_max_size="100"
oob_thread_pool.rejection_policy="Run"
/>
<TCPPING
timeout="3000"
initial_hosts="${initial_hosts}"
num_initial_members="1"
port_range="${port_range}"/>
<MERGE2
max_interval="30000"
min_interval="10000"/>
<FD_SOCK/>
<VERIFY_SUSPECT
timeout="1000"/>
<pbcast.NAKACK
gc_lag="50"
retransmit_timeout="300,600,1200,2400,4800"
use_mcast_xmit="false"
max_xmit_size="64000"
discard_delivered_msgs="false"/>
<pbcast.STABLE
stability_delay="1000"
desired_avg_gossip="20000"/>
<pbcast.GMS
print_local_addr="true"
join_timeout="3000"
join_retry_timeout="5000"
view_bundling="true"
shun="false"/>
<pbcast.STATE_TRANSFER/>
</config>
I will try the 2.6 head code and check; however, this may take some time.
Intermittently, a Null pointer exception is thrown when trying to
remove non-existent entry in ReplicatedHashMap
----------------------------------------------------------------------------------------------------------------
Key: JGRP-594
URL: http://jira.jboss.com/jira/browse/JGRP-594
Project: JGroups
Issue Type: Bug
Affects Versions: 2.5
Environment: Linux
Reporter: Dipak Kothari
Assigned To: Bela Ban
Fix For: 2.6
Intermittently, when an entry is removed from a ReplicatedHashMap (where the entry does
not exist) the following exception is thrown:
java.lang.RuntimeException: remove(APMExample.Services.examples.ServerA09) failed
at org.jgroups.blocks.ReplicatedHashMap.remove(ReplicatedHashMap.java:405)
at com.ubs.apm.control.service.nameservice.jgroup.JGroupNameService.unRegisterService(JGroupNameService.java:132)
at com.ubs.apm.control.sensors.ControlSensorManager.cleanup(ControlSensorManager.java:468)
at com.ubs.apm.control.sensors.ControlSensorManager.<init>(ControlSensorManager.java:125)
at com.ubs.apm.control.sensors.ControlSensorManager.<init>(ControlSensorManager.java:106)
at com.ubs.apm.control.example.ManagedServer.init(ManagedServer.java:20)
at com.ubs.apm.control.example.ManagedServer.main(ManagedServer.java:45)
Caused by: java.lang.RuntimeException: failed executing request [req_id=1189617222436
caller=14.64.61.201:6825
14.64.61.201:6838: sender=14.64.61.201:6838, retval=null, received=false, suspected=false
.... many such lines ...
14.64.61.201:6860: sender=14.64.61.201:6860, retval=null, received=false, suspected=false
14.64.61.201:6847: sender=14.64.61.201:6847, retval=null, received=false, suspected=false
request_msg: [dst: <null>, src: 14.64.61.201:6825 (2 headers), size=143 bytes]
rsp_mode: GET_NONE
done: true
timeout: 5000
expected_mbrs: 0 ([14.64.61.201:6815, 14.64.61.201:6816, 14.64.61.201:6824,
14.64.61.201:6825, 14.64.61.201:6826, 14.64.61.201:6827, 14.64.61.201:6828,
14.64.61.201:6829, 14.64.61.201:6830, 14.64.61.201:6831, 14.64.61.201:6833,
14.64.61.201:6834, 14.64.61.201:6835, 14.64.61.201:6836, 14.64.61.201:6837,
14.64.61.201:6838, 14.64.61.201:6839, 14.64.61.201:6842, 14.64.61.201:6843,
14.64.61.201:6844, 14.64.61.201:6845, 14.64.61.201:6846, 14.64.61.201:6847,
14.64.61.201:6848, 14.64.61.201:6849, 14.64.61.201:6850, 14.64.61.201:6851,
14.64.61.201:6852, 14.64.61.201:6853, 14.64.61.201:6854, 14.64.61.201:6855,
14.64.61.201:6856, 14.64.61.201:6857, 14.64.61.201:6858, 14.64.61.201:6859,
14.64.61.201:6860, 14.64.61.201:6861, 14.64.61.201:6862, 14.64.61.201:6863,
14.64.61.201:6864, 14.64.61.201:6865, 14.64.61.201:6866])]
at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:433)
at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:199)
at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:167)
at org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:163)
at org.jgroups.blocks.ReplicatedHashMap.remove(ReplicatedHashMap.java:402)
... 6 more
Caused by: java.lang.RuntimeException: failure adding msg [dst: <null>, src: 14.64.61.201:6825 (2 headers), size=143 bytes] to the retransmit table for 14.64.61.201:6825
at org.jgroups.protocols.pbcast.NAKACK.send(NAKACK.java:636)
at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:438)
at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:317)
at org.jgroups.protocols.pbcast.GMS.down(GMS.java:782)
at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:221)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:339)
at org.jgroups.JChannel.downcall(JChannel.java:1240)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:752)
at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:301)
at org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:440)
at org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:190)
at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:430)
... 10 more
Caused by: java.lang.NullPointerException
at org.jgroups.protocols.pbcast.NAKACK.send(NAKACK.java:632)
... 21 more
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: