[jboss-jira] [JBoss JIRA] Commented: (JGRP-1060) NAKACK has inconsistent internal state after concurrent node startup
Dennis Reed (JIRA)
jira-events at lists.jboss.org
Thu Sep 24 17:49:49 EDT 2009
[ https://jira.jboss.org/jira/browse/JGRP-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12487332#action_12487332 ]
Dennis Reed commented on JGRP-1060:
-----------------------------------
What I think is happening is:
- Node2 gets view (Node1, Node2). NAKACK sets "members" and "received_msgs" to (Node1, Node2)
- Node1 sends the state to Node2, including its current digest (Node1, Node2)
- Node2 gets view (Node1, Node2, Node3). NAKACK sets "members" and "received_msgs" to (Node1, Node2, Node3)
- Node2 processes the state. STATE_TRANSFER sends Event.SET_DIGEST (Node1, Node2) to NAKACK
- NAKACK sets "received_msgs" to (Node1, Node2), while "members" is still (Node1, Node2, Node3)
At this point, "received_msgs" is missing Node3, so Node2 drops all messages from Node3.
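
The sequence above can be replayed with a small toy model. This is only a sketch of the suspected race, not the actual org.jgroups.protocols.pbcast.NAKACK code: the class ToyNakack, its methods handleView/setDigest/accepts, and the use of a plain map for "received_msgs" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for NAKACK's bookkeeping; names are illustrative only.
class ToyNakack {
    final List<String> members = new ArrayList<>();
    // sender -> highest seqno seen (stand-in for the per-sender receive window)
    final Map<String, Long> receivedMsgs = new LinkedHashMap<>();

    // A new view replaces "members" and adds/removes senders in "received_msgs".
    void handleView(List<String> view) {
        members.clear();
        members.addAll(view);
        receivedMsgs.keySet().retainAll(view);
        for (String member : view) {
            receivedMsgs.putIfAbsent(member, 0L);
        }
    }

    // Assumed behaviour: SET_DIGEST overwrites "received_msgs" wholesale.
    void setDigest(Map<String, Long> digest) {
        receivedMsgs.clear();
        receivedMsgs.putAll(digest);
    }

    // A message is only accepted if its sender has an entry in "received_msgs".
    boolean accepts(String sender) {
        return receivedMsgs.containsKey(sender);
    }
}

class DigestRaceDemo {
    // Replays the five steps from Node2's point of view.
    static ToyNakack replay() {
        ToyNakack node2 = new ToyNakack();
        node2.handleView(Arrays.asList("Node1", "Node2"));

        // Node1 captures its digest while the view is still (Node1, Node2).
        Map<String, Long> staleDigest = new LinkedHashMap<>();
        staleDigest.put("Node1", 0L);
        staleDigest.put("Node2", 0L);

        // The newer view arrives before the state (and its digest) is processed.
        node2.handleView(Arrays.asList("Node1", "Node2", "Node3"));
        node2.setDigest(staleDigest);
        return node2;
    }

    public static void main(String[] args) {
        ToyNakack node2 = replay();
        System.out.println("members       = " + node2.members);
        System.out.println("received_msgs = " + node2.receivedMsgs.keySet());
        System.out.println("accepts Node3 = " + node2.accepts("Node3"));
    }
}
```

In this model Node3 stays in "members" but has no entry in "received_msgs", which matches the two log messages quoted below.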
> NAKACK has inconsistent internal state after concurrent node startup
> --------------------------------------------------------------------
>
> Key: JGRP-1060
> URL: https://jira.jboss.org/jira/browse/JGRP-1060
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 2.4.5
> Reporter: Dennis Reed
> Assignee: Bela Ban
>
> Three nodes are started concurrently. The log from the second node to join shows the following (IPs/ports have been replaced):
> 05:26:00,594 45102 INFO [org.jboss.cache.TreeCache] (main:) viewAccepted(): [node1:1234|1] [node1:1234, node2:1234]
> 05:26:00,732 45240 INFO [org.jboss.cache.TreeCache] (main:) TreeCache local address is node2:1234
> 05:26:00,852 45360 INFO [org.jboss.cache.TreeCache] (IncomingPacketHandler (channel=Tomcat-DefaultPartition):) viewAccepted(): [node1:1234|2] [node1:1234, node2:1234, node3:1234]
> 05:26:00,861 45369 INFO [org.jboss.cache.TreeCache] (IncomingPacketHandler (channel=Tomcat-DefaultPartition):) received the state (size=1024 bytes)
> This is followed by many instances of the following log messages, repeated for more than a day:
> WARN [org.jgroups.protocols.pbcast.NAKACK] (IncomingPacketHandler (channel=Tomcat-DefaultPartition):) node2:1234] discarded message from non-member node3:1234, my view is [node1:1234|2] [node1:1234, node2:1234, node3:1234]
> ERROR [org.jgroups.protocols.pbcast.NAKACK] (Timer-3:) sender node3:1234 not found in received_msgs
> For these messages to be logged, NAKACK must be in an inconsistent internal state: the addresses in "members" do not match those in "received_msgs".
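
The inconsistency described above amounts to a set difference between "members" and the senders known to "received_msgs". A minimal sketch of such a check follows; the class StateCheck and the method missingFromReceivedMsgs are hypothetical helpers, not part of JGroups, and the addresses are the anonymized ones from the log.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical diagnostic helper; not a JGroups API.
class StateCheck {
    // Returns the members that have no entry in "received_msgs".
    static Set<String> missingFromReceivedMsgs(Collection<String> members,
                                               Set<String> receivedMsgsSenders) {
        Set<String> missing = new TreeSet<>(members);
        missing.removeAll(receivedMsgsSenders);
        return missing;
    }

    public static void main(String[] args) {
        Collection<String> members =
                Arrays.asList("node1:1234", "node2:1234", "node3:1234");
        Set<String> receivedMsgsSenders =
                new HashSet<>(Arrays.asList("node1:1234", "node2:1234"));
        // prints: [node3:1234]
        System.out.println(missingFromReceivedMsgs(members, receivedMsgsSenders));
    }
}
```

An empty result would mean the two structures agree; any address returned is a member whose messages would be discarded.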