I'm experiencing an intermittent loss of replication after my applications have been
running for several hours. They use JBossCache 1.4.1.SP9 with JGroups 2.6.3.GA for
Hibernate second-level caching. I can reproduce the problem only on Windows XP, which
is my local development environment. The same code and configuration work fine on
Linux and Solaris.
Here is my JGroups configuration.
<config>
    <UDP mcast_addr="224.0.10.8"
         mcast_port="45586"
         ip_ttl="1"
         tos="8"
         ucast_recv_buf_size="20000000"
         ucast_send_buf_size="640000"
         mcast_recv_buf_size="25000000"
         mcast_send_buf_size="640000"
         loopback="true"
         discard_incompatible_packets="true"
         max_bundle_size="64000"
         max_bundle_timeout="30"
         use_incoming_packet_handler="true"
         enable_bundling="true"
         use_concurrent_stack="true"
         thread_pool.enabled="true"
         thread_pool.min_threads="1"
         thread_pool.max_threads="25"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="false"
         thread_pool.queue_max_size="100"
         thread_pool.rejection_policy="Run"
         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="1"
         oob_thread_pool.max_threads="8"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="100"
         oob_thread_pool.rejection_policy="Run"/>
    <PING timeout="2000" num_initial_members="3"/>
    <MERGE2 max_interval="100000" min_interval="20000"/>
    <FD_SOCK/>
    <FD timeout="10000" max_tries="5" shun="true"/>
    <VERIFY_SUSPECT timeout="1500"/>
    <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"
                   retransmit_timeout="300,600,1200,2400,4800"
                   discard_delivered_msgs="false"/>
    <UNICAST timeout="300,600,1200,2400,3600"/>
    <pbcast.STABLE stability_delay="1000"
                   desired_avg_gossip="50000" max_bytes="400000"/>
    <pbcast.GMS print_local_addr="true" join_timeout="3000"
                shun="false" view_bundling="true"
                view_ack_collection_timeout="5000"/>
    <FC max_credits="2000000" min_threshold="0.10"/>
    <FRAG2 frag_size="60000"/>
    <pbcast.STREAMING_STATE_TRANSFER/>
    <!-- <pbcast.STATE_TRANSFER/> -->
    <pbcast.FLUSH timeout="0"/>
</config>
Here is the error that I see when the replication begins to fail.
17:54:50,123 ERROR [UDP] failed sending message to null (75 bytes)
java.lang.Exception: dest=/228.1.2.3:45566 (78 bytes)
    at org.jgroups.protocols.UDP._send(UDP.java:333)
    at org.jgroups.protocols.UDP.sendToAllMembers(UDP.java:283)
    at org.jgroups.protocols.TP.doSend(TP.java:1327)
    at org.jgroups.protocols.TP.send(TP.java:1317)
    at org.jgroups.protocols.TP.down(TP.java:1038)
    at org.jgroups.protocols.Discovery.down(Discovery.java:350)
    at org.jgroups.protocols.MERGE2.down(MERGE2.java:176)
    at org.jgroups.protocols.FD$BroadcastTask.run(FD.java:689)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.NoRouteToHostException: No route to host: Datagram send failed
    at java.net.PlainDatagramSocketImpl.send(Native Method)
    at java.net.DatagramSocket.send(DatagramSocket.java:612)
    at org.jgroups.protocols.UDP._send(UDP.java:324)
    ... 16 more
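One detail that may be worth checking: the destination in the exception (228.1.2.3:45566) is not the mcast_addr/mcast_port pair from the configuration above (224.0.10.8:45586). The sketch below is only a hypothetical helper for pulling the destination out of such a log line and comparing it with the configured values. The class and method names are made up for illustration and are not part of JGroups or JBossCache.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class McastDestCheck {
    // Matches the "dest=/<ip>:<port>" fragment as it appears in the stack trace.
    private static final Pattern DEST =
            Pattern.compile("dest=/(\\d{1,3}(?:\\.\\d{1,3}){3}):(\\d+)");

    /** Returns "ip:port" from a log line, or null if no dest fragment is present. */
    public static String parseDest(String logLine) {
        Matcher m = DEST.matcher(logLine);
        return m.find() ? m.group(1) + ":" + m.group(2) : null;
    }

    /** True only if the logged destination equals the configured multicast address and port. */
    public static boolean matchesConfig(String logLine, String mcastAddr, int mcastPort) {
        String dest = parseDest(logLine);
        return dest != null && dest.equals(mcastAddr + ":" + mcastPort);
    }

    public static void main(String[] args) {
        // Log line copied from the error output; address and port copied from the <UDP> element.
        String line = "java.lang.Exception: dest=/228.1.2.3:45566 (78 bytes)";
        System.out.println("destination in log: " + parseDest(line));
        System.out.println("matches configured UDP address: "
                + matchesConfig(line, "224.0.10.8", 45586));
    }
}
```

With the values from this post, the comparison reports a mismatch, which might mean the failing send comes from a channel that was started with a different configuration than the one shown. That is an assumption I have not confirmed.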
I've seen other posts with similar issues but no resolutions that seem to help. Any
help would be greatly appreciated.