[jboss-jira] [JBoss JIRA] (JGRP-2159) Delta view cannot be installed
Bela Ban (JIRA)
issues at jboss.org
Mon Feb 27 03:03:00 EST 2017
[ https://issues.jboss.org/browse/JGRP-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13369414#comment-13369414 ]
Bela Ban commented on JGRP-2159:
--------------------------------
Here's how this can be reproduced (unit test: {{DeltaViewTest}}):
* J is the coordinator and has view J|0=J
* K joins and sends a JOIN-REQ to J
* J creates new view J|1=J,K (setting {{ltime}} to 1) and multicasts it, but the multicast is delayed (e.g. dropped and retransmitted)
* Finally, J sends a JOIN-RSP with view J|1 to K
* Before receiving the new view, K times out and sends another JOIN-REQ to J
* K receives view J|1 and installs it
* J creates a new view J|2=J,K (setting {{ltime}} to 2) and multicasts it. The multicast is again delayed.
* J sends a JOIN-RSP to K with view J|2, K installs it
* J finally gets the first view multicast and installs J|1=J,K
* New member L sends a JOIN-REQ to J
* J creates view J|3=J,K,L and multicasts it, and then sends a JOIN-RSP to L
* The multicast of J|3 is a *DeltaView* with ref-view-id=J|1 and joiners=L
* L installs the new view
* J installs the new view J|3
* However, K cannot install the new view: its current view is J|2, so ref-view-id=J|1 does not match the view it has installed!
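
The last step of the sequence above can be illustrated with a small self-contained model. This is only a sketch and does not use the JGroups classes: {{View}}, {{Delta}} and {{resolve}} are hypothetical names, and view IDs are plain strings such as "J|1". It assumes a delta view can only be applied when the receiver's current view ID equals the delta's ref-view-id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class DeltaViewSketch {

    /** A full view: an ID such as "J|1" plus the ordered member list (hypothetical model). */
    static final class View {
        final String id;
        final List<String> members;

        View(String id, String... members) {
            this.id = id;
            this.members = new ArrayList<>(Arrays.asList(members));
        }
    }

    /** A delta view: only the changes relative to the view named by refViewId. */
    static final class Delta {
        final String id;
        final String refViewId;
        final List<String> joiners;
        final List<String> leavers;

        Delta(String id, String refViewId, List<String> joiners, List<String> leavers) {
            this.id = id;
            this.refViewId = refViewId;
            this.joiners = joiners;
            this.leavers = leavers;
        }
    }

    /** Rebuilds the full view, or returns empty if the receiver is not on the reference view. */
    static Optional<View> resolve(View current, Delta delta) {
        if (!current.id.equals(delta.refViewId))
            return Optional.empty();
        List<String> members = new ArrayList<>(current.members);
        members.removeAll(delta.leavers);
        members.addAll(delta.joiners);
        return Optional.of(new View(delta.id, members.toArray(new String[0])));
    }

    public static void main(String[] args) {
        // J|3 as multicast by J: relative to J|1, with L joining
        Delta j3 = new Delta("J|3", "J|1", Arrays.asList("L"), new ArrayList<String>());
        View atJ = new View("J|1", "J", "K");   // J installed J|1 last
        View atK = new View("J|2", "J", "K");   // K installed the spurious J|2 last

        System.out.println("J: " + resolve(atJ, j3).map(v -> v.id + "=" + v.members).orElse("cannot install"));
        System.out.println("K: " + resolve(atK, j3).map(v -> v.id + "=" + v.members).orElse("cannot install"));
        // prints "J: J|3=[J, K, L]" and "K: cannot install"
    }
}
```

J|1 and J|2 have identical membership, so the failure at K comes purely from the view ID mismatch, not from any difference in members.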
SOLUTION:
* The reason why the spurious view J|2 is sent to K is that J hasn't yet installed view J|1 locally when the second JOIN-REQ arrives. Had J already installed J|1, it would see that K is already a member and simply resend view J|1, instead of creating view J|2.
* We therefore need to make sure a new view is installed in the coordinator *before* multicasting it, and this can be done by setting {{install_view_locally_first}} to true by default (or even removing the attribute)
* As a second line of defense, make the recipient of a DeltaView that cannot be installed send a request to the coordinator to resend the view as a full view instead of a delta view.
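
The effect of installing the view at the coordinator before sending it can be sketched as follows. This is a simplified model, not the GMS implementation: {{CoordinatorSketch}}, {{handleJoin}} and the {{installLocallyFirst}} flag are hypothetical stand-ins (the flag mirrors the idea behind {{install_view_locally_first}}), and the delayed multicast is modelled simply by the coordinator not installing the view at all.

```java
import java.util.ArrayList;
import java.util.List;

public class CoordinatorSketch {
    private final String name;
    private final boolean installLocallyFirst;
    private long ltime;                      // logical time of the last view created
    private String installedViewId;          // view currently installed at the coordinator
    private List<String> installedMembers;

    CoordinatorSketch(String name, boolean installLocallyFirst) {
        this.name = name;
        this.installLocallyFirst = installLocallyFirst;
        this.installedViewId = name + "|0";
        this.installedMembers = new ArrayList<>();
        this.installedMembers.add(name);
    }

    /** Handles a JOIN-REQ and returns the ID of the view sent back in the JOIN-RSP. */
    String handleJoin(String joiner) {
        if (installedMembers.contains(joiner))
            return installedViewId;          // already a member: resend the current view

        String newViewId = name + "|" + (++ltime);
        List<String> newMembers = new ArrayList<>(installedMembers);
        newMembers.add(joiner);

        if (installLocallyFirst) {
            // install before multicasting, so a repeated JOIN-REQ sees the joiner as a member
            installedViewId = newViewId;
            installedMembers = newMembers;
        }
        // otherwise the coordinator would install the view only when its own
        // (possibly delayed) multicast arrives, which this sketch does not model
        return newViewId;
    }

    public static void main(String[] args) {
        CoordinatorSketch late = new CoordinatorSketch("J", false);
        System.out.println(late.handleJoin("K") + ", " + late.handleJoin("K"));   // prints "J|1, J|2"

        CoordinatorSketch first = new CoordinatorSketch("J", true);
        System.out.println(first.handleJoin("K") + ", " + first.handleJoin("K")); // prints "J|1, J|1"
    }
}
```

With local installation first, K's repeated JOIN-REQ is answered with the existing J|1, so no spurious J|2 is created.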
> Delta view cannot be installed
> ------------------------------
>
> Key: JGRP-2159
> URL: https://issues.jboss.org/browse/JGRP-2159
> Project: JGroups
> Issue Type: Bug
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 4.1, 4.0.1
>
> Attachments: discarded_delta_view.log
>
>
> A DeltaView cannot be installed because the ref view-id is not the current view-id.
> Looking at the view sequence for members J, K and L:
> {noformat}
> 19:22:54,278 DEBUG (testng-Test:[]) [GMS] J: installing view [J|0] (1) [J]
> 19:22:56,519 DEBUG (testng-Test:[]) [GMS] K: installing view [J|1] (2) [J, K]
> 19:22:56,572 DEBUG (jgroups-7,J:[]) [GMS] J: installing view [J|1] (2) [J, K]
> 19:22:56,590 DEBUG (jgroups-5,K:[]) [GMS] K: installing view [J|2] (2) [J, K]
> 19:22:58,585 DEBUG (jgroups-5,J:[]) [GMS] J: installing view [J|3] (3) [J, K, L]
> 19:23:00,603 DEBUG (testng-Test:[]) [GMS] L: installing view [J|3] (3) [J, K, L]
> {noformat}
> K cannot install DeltaView J|3 because it has view J|2 but the DeltaView has ref view-id J|1.
> The reason is that J|2 was apparently installed *only* at K (but not at coordinator J!), despite it being the same view as J|1.
> We need to look into why J|2 was installed at K only. Second line of defense: when a DeltaView cannot be installed, send a message to the view sender (coord) and solicit the full view instead.
> See the attached log.
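
The second line of defense described above might look roughly like this on the receiving side. Again a sketch under assumptions: the {{Coordinator}} interface, {{fullView}} and {{onDeltaView}} are hypothetical and not part of the JGroups API, and the request/response exchange with the coordinator is collapsed into a direct method call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FullViewFallbackSketch {

    /** Stand-in for the coordinator: answers a request for the full member list of a view. */
    interface Coordinator {
        List<String> fullView(String viewId);
    }

    static final class Member {
        String viewId;
        List<String> members;
        int fullViewRequests = 0;

        Member(String viewId, String... members) {
            this.viewId = viewId;
            this.members = new ArrayList<>(Arrays.asList(members));
        }

        /** Applies the delta if possible, otherwise asks the coordinator for the full view. */
        void onDeltaView(String newViewId, String refViewId, List<String> joiners, Coordinator coord) {
            if (viewId.equals(refViewId)) {
                members.addAll(joiners);                 // normal case: delta applies cleanly
            } else {
                fullViewRequests++;                      // ref view unknown: solicit the full view
                members = new ArrayList<>(coord.fullView(newViewId));
            }
            viewId = newViewId;
        }
    }

    public static void main(String[] args) {
        Coordinator coord = id -> Arrays.asList("J", "K", "L");
        Member k = new Member("J|2", "J", "K");          // K is on J|2, but the delta refers to J|1
        k.onDeltaView("J|3", "J|1", Arrays.asList("L"), coord);
        System.out.println(k.viewId + "=" + k.members + " after " + k.fullViewRequests + " full-view request(s)");
        // prints "J|3=[J, K, L] after 1 full-view request(s)"
    }
}
```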
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)