[jboss-jira] [JBoss JIRA] (JGRP-1529) RELAY2: Intra-site view not being accepted upon inter-site installation failure
Bela Ban (JIRA)
jira-events at lists.jboss.org
Fri Nov 2 09:40:18 EDT 2012
[ https://issues.jboss.org/browse/JGRP-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731199#comment-12731199 ]
Bela Ban commented on JGRP-1529:
--------------------------------
Fixed; property async_relay_creation in RELAY2 installs the relay in a separate task, enabling the view callback to return immediately.
Please reopen the issue if this doesn't work for you, and include steps to reproduce.
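The idea behind the fix can be illustrated with a minimal sketch. This is not the RELAY2 source; AsyncRelaySketch, createRelay(), handleView() and the simulated join delay are hypothetical names invented for illustration. It only shows the general pattern the comment describes: when the blocking relay setup is handed to a separate task, the thread delivering the view can return right away.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not JGroups code: contrasts creating the relay inline
// with creating it in a separate task.
public class AsyncRelaySketch {
    private final ExecutorService relayExecutor = Executors.newSingleThreadExecutor();
    private final CountDownLatch relayReady = new CountDownLatch(1);
    private final long joinDelayMillis;

    public AsyncRelaySketch(long joinDelayMillis) {
        this.joinDelayMillis = joinDelayMillis;
    }

    // Stand-in for joining the inter-site cluster; the sleep simulates the
    // time spent on discovery while the remote coordinator is unreachable.
    private void createRelay() {
        try {
            Thread.sleep(joinDelayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        relayReady.countDown();
    }

    // Simulates handling a view change. Returns the milliseconds the calling
    // thread spent here, i.e. how long the view callback would be delayed.
    public long handleView(boolean async) {
        long start = System.nanoTime();
        if (async) {
            relayExecutor.submit(this::createRelay); // returns immediately
        } else {
            createRelay();                           // blocks the caller
        }
        // An application-level viewAccepted(...) would be invoked at this point.
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Waits until the simulated relay is up; false on timeout or interrupt.
    public boolean awaitRelay(long timeoutMillis) {
        try {
            return relayReady.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public void shutdown() {
        relayExecutor.shutdownNow();
    }

    public static void main(String[] args) {
        AsyncRelaySketch sync = new AsyncRelaySketch(500);
        System.out.println("inline: callback delayed by ~" + sync.handleView(false) + " ms");
        sync.shutdown();

        AsyncRelaySketch async = new AsyncRelaySketch(500);
        System.out.println("async:  callback delayed by ~" + async.handleView(true) + " ms");
        System.out.println("relay eventually ready: " + async.awaitRelay(5000));
        async.shutdown();
    }
}
```

In the asynchronous case the relay may still fail or take a long time to come up; the difference is only that the intra-site view is delivered to the application in the meantime.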
> RELAY2: Intra-site view not being accepted upon inter-site installation failure
> -------------------------------------------------------------------------------
>
> Key: JGRP-1529
> URL: https://issues.jboss.org/browse/JGRP-1529
> Project: JGroups
> Issue Type: Bug
> Reporter: Radim Vansa
> Assignee: Bela Ban
> Fix For: 3.3
>
>
> When a node becomes coordinator, it sends the VIEW_CHANGE event up the stack. This should result in a call to Receiver.viewAccepted(...). However, when RELAY2 is in the stack and the global (inter-site) coordinator cannot be reached, RELAY2 blocks the thread (sending discovery pings), so the viewAccepted callback is postponed.
> In my opinion, the inter-site stack should be created and handled in a different thread.
> Context:
> In my case, the coordinator of both the local cluster and the global (inter-site) cluster was killed. FD_SOCK on the inter-site stack somehow failed to notice that the coordinator had crashed (more investigation required), and the nodes in the global cluster still reported the crashed node as the global coordinator.
> Therefore, the new coordinator of the local cluster failed to join the global cluster (it obviously got no response from the dead global coordinator).
> The restarted node joined the local cluster and then tried to join the local Infinispan cache with a new local view ID. However, the coordinator failed to notice (in the Infinispan viewAccepted handler, which was not called) that it had already installed a new JGroups view, and it did not respond to the cache join request because it was waiting for the new JGroups view (which, again, had been installed in JGroups, but viewAccepted had not notified Infinispan of it).