[jboss-user] [Clustering/JBoss] - JGroups Multiple Registrations

kvbisme do-not-reply at jboss.com
Tue Mar 18 10:29:39 EDT 2008


We have two JBoss servers (4.0.5.GA) clustered together. Each machine has two network cards: one connects to the outside world and the other to a small subnet containing the JBoss servers and the other support servers used by the enterprise (database and the like).
In run.conf we added a -Djboss.bind.address= at startup to point to the network card connected to that smaller subnet.
Every so often we start getting the following messages (the machine names and IP addresses have been changed to accommodate my overly concerned boss):

anonymous wrote : 
  | [org.jgroups.protocols.FD] I was suspected, but will not remove myself from membership (waiting for EXIT message)
  | [org.jgroups.protocols.pbcast.GMS] checkSelfInclusion() failed, Machine1-int:33252 (additional data: 18 bytes) is not a member of view [Machine2-int:33200 (additional data: 18 bytes) |2] [Machine2-int:33200 (additional data: 18 bytes)]; discarding view
  | [org.jgroups.protocols.pbcast.GMS] I (Machine1-int:33252 (additional data: 18 bytes)) am being shunned, will leave and rejoin group (prev_members are [Machine1-int:33252 (additional data: 18 bytes) Machine2-int:33200 (additional data: 18 bytes) ])
  | [org.jgroups.protocols.pbcast.NAKACK] [Machine1-int:33252 (additional data: 18 bytes)] discarded message from non-member Machine2-int:33200 (additional data: 18 bytes)
  | [org.jgroups.protocols.pbcast.NAKACK] [Machine1-int:33252 (additional data: 18 bytes)] discarded message from non-member Machine2-int:33200 (additional data: 18 bytes)
  | [org.jgroups.protocol.PING] down_handler thread for PING was interrupted (in order to be terminated), but is is still alive
  | ----------------------------------------------------------------------------
  | GMS: address is Machine1-int:33265 (additional data: 18 bytes)
  | ----------------------------------------------------------------------------
  | [org.jgroups.protocol.pbcast.NAKACK] sender Machine1-int:33252 (additional data: 18 bytes) not found in received_msgs
  | [org.jgroups.protocol.pbcast.NAKACK] range is null
  | [org.jgroups.protocol.pbcast.NAKACK] sender Machine2-int:33200 (additional data: 18 bytes) not found in received_msgs
  | [org.jgroups.protocol.pbcast.NAKACK] range is null
  | [org.jgroups.protocol.pbcast.Digest] sender is null, will not add it !
  | [org.jgroups.protocol.pbcast.Digest] sender is null, will not add it !
  | [org.jgroups.protocols.pbcast.NAKACK] sender at index 1 in digest is null
  | [org.jgroups.protocols.pbcast.NAKACK] sender at index 2 in digest is null
  | [org.jboss.ha.framework.interfaces.HAPartition.lifecycle.DefaultPartition] New cluster view for partition DefaultPartition ( id: 3, delta: 1) : [111.222.333.001:1099, 111.222.333.002:1099, 111.222.333.001:1099]
  | [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] I am (111.222.333.001:1099) receivedmembershipChanged event:
  | [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] Dead Members: 0 ([])
  | [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] New Members: 0 ([])
  | [org.jboss.ha.framework.server.DistributedReplicantManagerImpl.DefaultPartition] All Members : 3 ([ 111.222.333.001:1099, 111.222.333.002:1099, 111.222.333.001:1099])
  | [org.jgroups.protocols.pbcast.STATE_TRANSFER] GET_APPLSTATE_OK: received application state, but there are no requestors !
  | 
Then there are four sets of the following messages, with sequence numbers starting at 0 and ending at 1273:
anonymous wrote : 
  | [org.jgroups.protocols.pbcast.NAKACK] (requestor=Machine1-int:33265 (additional data: 18 bytes), local_addr=Machine1-int:33252 (additional data: 18 bytes)) message with seqno=0 not found in sent_msgs ! sent_msgs=[1274 - 1274]
  | 
 . . .
anonymous wrote : 
  | [org.jgroups.protocols.pbcast.NAKACK] (requestor=Machine1-int:33265 (additional data: 18 bytes), local_addr=Machine1-int:33252 (additional data: 18 bytes)) message with seqno=1273 not found in sent_msgs ! sent_msgs=[1274 - 1274]
  | 
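In these NAKACK lines the requestor and local_addr are the same host on different ports (33265 versus 33252), which may mean that the old and the new address of Machine1 are talking to each other. A minimal Python sketch for flagging such lines in a log follows. It is not part of our setup: the function name and the regular expression are made up for illustration and assume the address format shown above.

```python
import re

# Hypothetical pattern: assumes addresses are logged as host:port, as in the excerpt above.
NAKACK_RE = re.compile(
    r"requestor=(?P<req_host>[^:\s]+):(?P<req_port>\d+).*?"
    r"local_addr=(?P<loc_host>[^:\s]+):(?P<loc_port>\d+)"
)

def same_host_different_port(log_line):
    """Return True if requestor and local_addr share a host but not a port."""
    m = NAKACK_RE.search(log_line)
    if m is None:
        return False
    return m["req_host"] == m["loc_host"] and m["req_port"] != m["loc_port"]

sample = ("[org.jgroups.protocols.pbcast.NAKACK] (requestor=Machine1-int:33265 "
          "(additional data: 18 bytes), local_addr=Machine1-int:33252 "
          "(additional data: 18 bytes)) message with seqno=1273 not found in sent_msgs !")
print(same_host_different_port(sample))
```

A line that the function flags only shows the address mismatch; it does not by itself explain why the old address is still answering.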
At this point Machine1 starts adding itself to the cluster over and over again until we have to stop and restart the machine.
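The repeated registration is already visible in the "New cluster view" line above, where 111.222.333.001:1099 is listed twice. As a rough sketch (the helper name is hypothetical, and it assumes the member list is the last bracketed group on the line), a few lines of Python can pick out such duplicates from a log line:

```python
import re
from collections import Counter

def duplicate_members(view_line):
    """Return the members that appear more than once in a cluster view log line."""
    # The member list is assumed to be the final [...] group on the line.
    match = re.search(r"\[([^\[\]]*)\]\s*$", view_line)
    if match is None:
        return []
    members = [m.strip() for m in match.group(1).split(",") if m.strip()]
    counts = Counter(members)
    return [member for member, n in counts.items() if n > 1]

line = ("[org.jboss.ha.framework.interfaces.HAPartition.lifecycle.DefaultPartition] "
        "New cluster view for partition DefaultPartition ( id: 3, delta: 1) : "
        "[111.222.333.001:1099, 111.222.333.002:1099, 111.222.333.001:1099]")
print(duplicate_members(line))
```

Run against the full server log, this would show at which view id Machine1 first appears more than once.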

What could possibly be going on here?


View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4137434#4137434
