[jboss-jira] [JBoss JIRA] (JGRP-1817) OverlappingMergeTest testSameCreatorDifferentIDs fails to create correct merged view
Richard Achmatowicz (JIRA)
issues at jboss.org
Thu Mar 27 12:38:14 EDT 2014
[ https://issues.jboss.org/browse/JGRP-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Achmatowicz updated JGRP-1817:
--------------------------------------
Description:
This test does the following:
- creates three channels a,b,c
- injects views A: {A|5 A}, B:{A|6 A,B}, C:{A|7 A,B,C}
- injects a merge event in each of channels A,B,C representing these three views
- checks that all channels have the final view of size 3
The test fails intermittently on RHEL, with the same failure each time:
{noformat}
-------------------------------------------------------------------
GMS: address=A, cluster=OverlappingMergeTest, physical address=10.16.95.7:27215
-------------------------------------------------------------------
-------------------------------------------------------------------
GMS: address=B, cluster=OverlappingMergeTest, physical address=10.16.95.7:27216
-------------------------------------------------------------------
-------------------------------------------------------------------
GMS: address=C, cluster=OverlappingMergeTest, physical address=10.16.95.7:27217
-------------------------------------------------------------------
------------- testSameCreatorDifferentIDs -----------
[A] view=[A|5] [A]
[B] view=[A|6] [A, B]
[C] view=[A|7] [A, B, C]
A's view: [A|5] [A]
B's view: [A|6] [A, B]
C's view: [A|7] [A, B, C]
Enabling TRACE debugging for GMS, MERGE2 and Discovery
==== triggering merge solicitation ====:
212534 [TRACE] TCPPING: - A: sending discovery request to 10.16.95.7:27216
212537 [TRACE] TCPPING: - A: sending discovery request to 10.16.95.7:27218
212538 [TRACE] TCPPING: - A: sending discovery request to 10.16.95.7:27217
215538 [TRACE] TCPPING: - A: discovery took 3004 ms: responses: 1 total (1 servers (0 coord), 0 clients)
215539 [TRACE] MERGE2: - Discovery results:
[B]: view_id=[A|6] ([A|6] [A, B])
[A]: view_id=[A|5] ([A|5] [A])
215539 [DEBUG] MERGE2: - A found different views : [A|5], [A|6]; sending up MERGE event with merge participants [B, A].
Discovery results:
[B]: coord=A
[A]: coord=A
==== checking views after merge ====:
....................Disabling TRACE debugging for GMS, MERGE2 and Discovery
A's view: [A|7] [A, B]
B's view: [A|7] [A, B]
C's view: [A|7] [A, B, C]
{noformat}
Whenever this test fails, it is the discovery phase that fails to find the correct set of views. The trace shows A receiving only one response to its discovery requests, so instead of finding views for channels A, B and C, it only finds views for channels A and B.
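To make the effect of the missing response concrete, here is a minimal sketch in plain Java. It is a simplified model and not JGroups code: the class and method names are hypothetical, view IDs are plain strings, and the rule "every responder becomes a merge participant when more than one distinct view ID is seen" is an assumption read off the MERGE2 debug line in the trace above.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not JGroups code: view IDs are plain strings such as "A|5".
class MergeParticipantsSketch {

    // Returns the members that would take part in a merge: every member we
    // have a view ID for, but only if more than one distinct view ID was seen.
    static Set<String> mergeParticipants(Map<String, String> viewIdByMember) {
        Set<String> distinctViewIds = new LinkedHashSet<>(viewIdByMember.values());
        if (distinctViewIds.size() <= 1) {
            return new TreeSet<>(); // all views agree, nothing to merge
        }
        return new TreeSet<>(viewIdByMember.keySet());
    }

    public static void main(String[] args) {
        // Discovery results as in the failing run: only A and B are present.
        Map<String, String> failing = new LinkedHashMap<>();
        failing.put("A", "A|5");
        failing.put("B", "A|6");
        System.out.println("failing run: " + mergeParticipants(failing));

        // What the test needs: C's view must be part of the results as well.
        Map<String, String> complete = new LinkedHashMap<>(failing);
        complete.put("C", "A|7");
        System.out.println("complete:    " + mergeParticipants(complete));
    }
}
```

Under this model C can never be pulled into the merge if its response is missing, which matches the final views in the log: A and B end up with [A, B] while C keeps [A, B, C].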
Also, the discovery requests are sent to host:port combinations which are offset by 1. For example, in the case above, the host:port combinations of the channels are 10.16.95.7:27215, 10.16.95.7:27216, and 10.16.95.7:27217, but the pings go out to 10.16.95.7:27216, 10.16.95.7:27217, and 10.16.95.7:27218. Not sure if this is significant as it still covers the channels B and C.
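One possible reading of the offset, which the log does not confirm: if discovery starts at A's port and probes a few consecutive ports after it (TCPPING's `initial_hosts` and `port_range` settings would be the likely source, though the values this test uses are not shown here), and A skips its own address, then the three targets in the trace fall out directly. The sketch below is a plain Java model of that assumption; the class name, the method name and the values 27215 and 3 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of port probing, not JGroups code.
class ProbePortsSketch {

    // Lists basePort..basePort+portRange (inclusive), skipping the sender's own port.
    static List<Integer> probeTargets(int basePort, int portRange, int ownPort) {
        List<Integer> targets = new ArrayList<>();
        for (int port = basePort; port <= basePort + portRange; port++) {
            if (port != ownPort) {
                targets.add(port);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Assumed values: A listens on 27215 and three further ports are probed.
        System.out.println(probeTargets(27215, 3, 27215)); // prints [27216, 27217, 27218]
    }
}
```

If that reading holds, the apparent offset is only A leaving itself out of the probe list, and the request to 27218 goes to a port no channel is bound to.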
was:
This test does the following:
- creates four channels a,b,c,d
- injects views A: {A,C,B}, B:{A,C,B}, C:{A,C,B} and D: {B,A,C,D}
- injects a merge event in each of channels A,B,C,D representing these four views
- checks that all channels have the final view of size 4
The test fails intermittently on RHEL, with the same failure each time:
{noformat}
181595 [DEBUG] GMS: - A: installing view [A|2] [A]
[testng] 181596 [TRACE] GMS: - A: received all 1 ACKs from members for view [A|2]
[testng] 181866 [TRACE] GMS: - view [A|3] [] is empty: will not multicast it (last view)
[testng]
[testng] -------------------------------------------------------------------
[testng] GMS: address=A, cluster=OverlappingMergeTest, physical address=10.16.94.42:27199
[testng] -------------------------------------------------------------------
[testng] 184954 [TRACE] GMS: - A: no initial members discovered: creating group as first member
[testng] 184954 [DEBUG] GMS: - A: installing view [A|0] [A]
[testng] 184955 [DEBUG] GMS: - created group (first member). My view is [A|0], impl is org.jgroups.protocols.pbcast.CoordGmsImpl
[testng]
[testng] -------------------------------------------------------------------
[testng] GMS: address=B, cluster=OverlappingMergeTest, physical address=10.16.94.42:27200
[testng] -------------------------------------------------------------------
[testng] 184961 [TRACE] GMS: - B: initial_mbrs are A
[testng] 184961 [DEBUG] GMS: - election results: {A=1}
[testng] 184961 [DEBUG] GMS: - sending JOIN(B) to A
[testng] 185013 [TRACE] GMS: - A: new members=[B], suspected=[], leaving=[], new view: [A|1] [A, B]
[testng] 185014 [TRACE] GMS: - A: mcasting view [A|1] [A, B] (2 mbrs)
[testng]
[testng] 185025 [DEBUG] GMS: - A: installing view [A|1] [A, B]
[testng] 185026 [TRACE] GMS: - A: received all 1 ACKs from members for view [A|1]
[testng] 185055 [TRACE] GMS: - B: JOIN-RSP=[A|1] [A, B] [size=2]
[testng]
[testng]
[testng] 185055 [DEBUG] GMS: - B: installing view [A|1] [A, B]
[testng] 185057 [TRACE] GMS: - A: received all ACKs (1) from joiners for view [A|1]
[testng]
[testng] -------------------------------------------------------------------
[testng] GMS: address=C, cluster=OverlappingMergeTest, physical address=10.16.94.42:27201
[testng] -------------------------------------------------------------------
[testng] 185064 [TRACE] GMS: - C: initial_mbrs are B A
[testng] 185064 [DEBUG] GMS: - election results: {A=2}
[testng] 185064 [DEBUG] GMS: - sending JOIN(C) to A
[testng] 185108 [TRACE] GMS: - A: new members=[C], suspected=[], leaving=[], new view: [A|2] [A, B, C]
[testng] 185108 [TRACE] GMS: - A: mcasting view [A|2] [A, B, C] (3 mbrs)
[testng]
[testng] 185117 [DEBUG] GMS: - A: installing view [A|2] [A, B, C]
[testng] 185118 [DEBUG] GMS: - B: installing view [A|2] [A, B, C]
[testng] 185119 [TRACE] GMS: - A: received all 2 ACKs from members for view [A|2]
[testng] 185148 [TRACE] GMS: - C: JOIN-RSP=[A|2] [A, B, C] [size=3]
[testng]
[testng]
[testng] 185149 [DEBUG] GMS: - C: installing view [A|2] [A, B, C]
[testng] 185151 [TRACE] GMS: - A: received all ACKs (1) from joiners for view [A|2]
[testng]
[testng] -------------------------------------------------------------------
[testng] GMS: address=D, cluster=OverlappingMergeTest, physical address=10.16.94.42:27202
[testng] -------------------------------------------------------------------
[testng] 185164 [TRACE] GMS: - D: initial_mbrs are B C A
[testng] 185164 [DEBUG] GMS: - election results: {A=3}
[testng] 185164 [DEBUG] GMS: - sending JOIN(D) to A
[testng] 185203 [TRACE] GMS: - A: new members=[D], suspected=[], leaving=[], new view: [A|3] [A, B, C, D]
[testng] 185203 [TRACE] GMS: - A: mcasting view [A|3] [A, B, C, D] (4 mbrs)
[testng]
[testng] 185210 [DEBUG] GMS: - A: installing view [A|3] [A, B, C, D]
[testng] 185211 [DEBUG] GMS: - B: installing view [A|3] [A, B, C, D]
[testng] 185211 [DEBUG] GMS: - C: installing view [A|3] [A, B, C, D]
[testng] 185213 [TRACE] GMS: - A: received all 3 ACKs from members for view [A|3]
[testng] 185242 [TRACE] GMS: - D: JOIN-RSP=[A|3] [A, B, C, D] [size=4]
[testng]
[testng]
[testng] 185242 [DEBUG] GMS: - D: installing view [A|3] [A, B, C, D]
[testng] 185242 [TRACE] GMS: - A: received all ACKs (1) from joiners for view [A|3]
[testng]
[testng] ==== Injecting view [A|4] [A, C, B] into A, B and C ====
[testng] 185243 [DEBUG] GMS: - A: installing view [A|4] [A, C, B]
[testng] 185243 [DEBUG] GMS: - B: installing view [A|4] [A, C, B]
[testng] 185244 [DEBUG] GMS: - C: installing view [A|4] [A, C, B]
[testng]
[testng] ==== Injecting view [B|4] [B, A, C, D] into D ====
[testng]
[testng] 185245 [DEBUG] GMS: - D: installing view [B|4] [B, A, C, D]
[testng] A: [A|4] [A, C, B]
[testng] B: [A|4] [A, C, B]
[testng] C: [A|4] [A, C, B]
[testng] D: [B|4] [B, A, C, D]
[testng]
[testng] ==== Injecting a merge event into A, B, C and D====
[testng] 185251 [TRACE] GMS: - A: got merge response from A, merge_id=A::3, merge data is sender=A, view=[A|4] [A, C, B], digest=C: [0 (0)], B: [0 (0)], A: [4 (4)]
[testng] 185253 [TRACE] GMS: - B: queue is suspended; request MERGE(4 views) is discarded
[testng] 185255 [TRACE] GMS: - C: queue is suspended; request MERGE(4 views) is discarded
[testng] 185255 [TRACE] GMS: - A: got merge response from B, merge_id=A::3, merge data is sender=B, view=[A|4] [A, C, B], digest=C: [0 (0)], B: [0 (1)], A: [4 (4)]
[testng] 190286 [TRACE] GMS: - A: mcasting view MergeView::[A|5] [A, B, C], subgroups=[A|4] [A, C, B], [A|4] [A, C, B] (3 mbrs)
[testng]
[testng] 190286 [TRACE] GMS: - B: mcasting view MergeView::[A|5] [A, B, C], subgroups=[A|4] [A, C, B], [A|4] [A, C, B] (3 mbrs)
[testng]
[testng] 190317 [DEBUG] GMS: - A: installing view MergeView::[A|5] [A, B, C], subgroups=[A|4] [A, C, B], [A|4] [A, C, B]
[testng] 190318 [DEBUG] GMS: - B: installing view MergeView::[A|5] [A, B, C], subgroups=[A|4] [A, C, B], [A|4] [A, C, B]
[testng] 190318 [DEBUG] GMS: - C: installing view MergeView::[A|5] [A, B, C], subgroups=[A|4] [A, C, B], [A|4] [A, C, B]
[testng] 190320 [TRACE] GMS: - A: received all 3 ACKs from members for view [A|5]
[testng] 190320 [TRACE] GMS: - B: received all 3 ACKs from members for view [A|5]
[testng] A: [A|5] [A, B, C] (coord=true)
[testng] B: [A|5] [A, B, C] (coord=false)
[testng] C: [A|5] [A, B, C] (coord=false)
[testng] D: [B|4] [B, A, C, D] (coord=false)
[testng] 195277 [DEBUG] GMS: - D: sending LEAVE request to B
[testng] FAIL: [1] org.jgroups.tests.OverlappingMergeTest.testMergeWithDifferentPartitions()
{noformat}
Whenever this test fails, I see that the queues are suspended on the initial merge attempt.
> OverlappingMergeTest testSameCreatorDifferentIDs fails to create correct merged view
> ------------------------------------------------------------------------------------
>
> Key: JGRP-1817
> URL: https://issues.jboss.org/browse/JGRP-1817
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.2.13
> Environment: RHEL
> Reporter: Richard Achmatowicz
> Assignee: Bela Ban
> Fix For: 3.2.14
>
>