Hi. Here's the problem in a nutshell. I have a 3-node cluster with a shared tree cache. Nodes 1
and 2 go away at around the same time (via an unplugged network cable). Node 3 is
notified within 10-12 seconds that Node 1 is gone and makes a few changes to the
cache (within a transaction). The cache tries to replicate to Node 2 (not knowing it has gone
away) and fails (ReplicationException). Node 3 thinks that its local cache has been
updated, but it hasn't, because of the replication failure. Node 3 is notified
that Node 2 has gone away after ~50 seconds and again updates its cache, which works
because there is no one left to replicate to.
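
Roughly, the sequence above behaves like the toy model below. This is only a sketch to make the scenario concrete: it is not JBoss Cache or JGroups code, and every name in it (ToyCache, Peer, ReplicationFailure, putStrict, putLenient) is hypothetical. putStrict models what seems to be happening now (a failed replication leaves the local cache unchanged), and putLenient models the behavior being asked for (the local update is kept and the failure is only reported).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicationSketch {

    /** Hypothetical stand-in for the ReplicationException seen on Node 3. */
    static class ReplicationFailure extends RuntimeException {
        ReplicationFailure(String message) {
            super(message);
        }
    }

    /** Hypothetical link to another cluster member. */
    interface Peer {
        void replicate(String key, String value);
    }

    /** Toy cache: a local map plus the peers the node still believes are alive. */
    static class ToyCache {
        final Map<String, String> local = new HashMap<>();
        final List<Peer> peers = new ArrayList<>();

        /** Observed behavior: if any peer fails, the local write is never applied. */
        void putStrict(String key, String value) {
            for (Peer peer : peers) {
                peer.replicate(key, value); // an exception here aborts the whole update
            }
            local.put(key, value);
        }

        /** Desired behavior: apply locally first, then report whether all peers accepted. */
        boolean putLenient(String key, String value) {
            local.put(key, value);
            boolean allReplicated = true;
            for (Peer peer : peers) {
                try {
                    peer.replicate(key, value);
                } catch (ReplicationFailure e) {
                    allReplicated = false; // keep the local value, remember the failure
                }
            }
            return allReplicated;
        }
    }

    public static void main(String[] args) {
        ToyCache node3 = new ToyCache();
        // Node 2 is unreachable, but Node 3 has not been told yet.
        node3.peers.add((key, value) -> {
            throw new ReplicationFailure("Node 2 unreachable");
        });

        try {
            node3.putStrict("/a/b", "v1");
        } catch (ReplicationFailure e) {
            System.out.println("strict put failed: " + e.getMessage());
        }
        System.out.println("local after strict put: " + node3.local);

        boolean replicated = node3.putLenient("/a/b", "v1");
        System.out.println("lenient put replicated everywhere: " + replicated);
        System.out.println("local after lenient put: " + node3.local);
    }
}
```

Once Node 2 is removed from the peer list (the second view change), both variants behave the same, which matches the update succeeding after ~50 seconds.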
There are two things I need help with:
1. I need my local cache to be updated even when replication fails.
2. Why does it take so long to receive notification that the second node has gone away
when they were both on the same network cable that I unplugged? My JGroups timeout is set
to 12 seconds max (counting retries). The two JGroups viewChange notifications are
sometimes more than 60 seconds apart.
Thanks for the help!
Jim
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3973649#...