[jboss-jira] [JBoss JIRA] (JGRP-2028) GMS sometimes ignores view bundling timeout
Bela Ban (JIRA)
issues at jboss.org
Thu Mar 17 02:00:01 EDT 2016
[ https://issues.jboss.org/browse/JGRP-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bela Ban resolved JGRP-2028.
----------------------------
Resolution: Done
Only waiting if wait_time_ms > 0
> GMS sometimes ignores view bundling timeout
> -------------------------------------------
>
> Key: JGRP-2028
> URL: https://issues.jboss.org/browse/JGRP-2028
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.8
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 3.6.9, 4.0
>
>
> {{GMS.ViewHandler.run()}} has this code:
> {code}
> wait_time=timeout - (System.nanoTime() - start_time);
> if(wait_time > 0 && firstRequest.canBeProcessedTogether(firstRequest)) { // JGRP-1438
> long wait_time_ms=TimeUnit.MILLISECONDS.convert(wait_time, TimeUnit.NANOSECONDS);
> queue.waitUntilClosed(wait_time_ms); // misnomer: waits until element has been added or q closed
> }
> {code}
> The problem is {{wait_time_ms}} will be {{0}} if {{0 < wait_time < 1_000_000}}. {{queue.waitUntilClosed(wait_time_ms)}} then calls {{Object.wait(0)}} and blocks forever.
> Fortunately, the joiners re-sends the {{JOIN_REQ}} message after {{GMS.join_timeout}} (3s by default), so all that happens is that the view is delayed by 3s. It does cause some random failures in tests that expect the view to form in a precise amount of time, though.
> {noformat}
> 17:04:53,034 DEBUG (ForkThread-3,InitialClusterSizeTest:) [GMS] NodeF-49399: sending JOIN(NodeF-49399) to NodeE-37644
> 17:04:53,034 TRACE (ForkThread-3,InitialClusterSizeTest:) [TCP_NIO2] NodeF-49399: sending msg to NodeE-37644, src=NodeF-49399, headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeF-49399, UNICAST3: DATA, seqno=1, first, TP: [cluster_name=ISPN]
> 17:04:53,035 DEBUG (ForkThread-4,InitialClusterSizeTest:) [GMS] NodeG-49307: sending JOIN(NodeG-49307) to NodeE-37644
> 17:04:53,035 TRACE (ForkThread-4,InitialClusterSizeTest:) [TCP_NIO2] NodeG-49307: sending msg to NodeE-37644, src=NodeG-49307, headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeG-49307, UNICAST3: DATA, seqno=1, first, TP: [cluster_name=ISPN]
> 17:04:53,060 TRACE (INT-1,NodeE-37644:) [TCP_NIO2] NodeE-37644: received [dst: NodeE-37644, src: NodeF-49399 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeF-49399, UNICAST3: DATA, seqno=1, first, TP: [cluster_name=ISPN]
> 17:04:53,060 TRACE (INT-2,NodeE-37644:) [TCP_NIO2] NodeE-37644: received [dst: NodeE-37644, src: NodeG-49307 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeG-49307, UNICAST3: DATA, seqno=1, first, TP: [cluster_name=ISPN]
> 17:04:53,060 TRACE (INT-1,NodeE-37644:) [UNICAST3] NodeE-37644 <-- DATA(NodeF-49399: #1, conn_id=0, first)
> 17:04:53,060 TRACE (INT-2,NodeE-37644:) [UNICAST3] NodeE-37644 <-- DATA(NodeG-49307: #1, conn_id=0, first)
> 17:04:53,061 TRACE (INT-2,NodeE-37644:) [UNICAST3] NodeE-37644: delivering NodeG-49307#1
> 17:04:53,061 TRACE (INT-1,NodeE-37644:) [UNICAST3] NodeE-37644: delivering NodeF-49399#1
> 17:04:56,035 WARN (ForkThread-3,InitialClusterSizeTest:) [GMS] NodeF-49399: JOIN(NodeF-49399) sent to NodeE-37644 timed out (after 3000 ms), on try 1
> 17:04:56,035 WARN (ForkThread-4,InitialClusterSizeTest:) [GMS] NodeG-49307: JOIN(NodeG-49307) sent to NodeE-37644 timed out (after 3000 ms), on try 1
> 17:04:56,036 DEBUG (ForkThread-3,InitialClusterSizeTest:) [GMS] NodeF-49399: sending JOIN(NodeF-49399) to NodeE-37644
> 17:04:56,036 TRACE (ForkThread-3,InitialClusterSizeTest:) [TCP_NIO2] NodeF-49399: sending msg to NodeE-37644, src=NodeF-49399, headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeF-49399, UNICAST3: DATA, seqno=2, TP: [cluster_name=ISPN]
> 17:04:56,036 DEBUG (ForkThread-4,InitialClusterSizeTest:) [GMS] NodeG-49307: sending JOIN(NodeG-49307) to NodeE-37644
> 17:04:56,036 TRACE (ForkThread-4,InitialClusterSizeTest:) [TCP_NIO2] NodeG-49307: sending msg to NodeE-37644, src=NodeG-49307, headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeG-49307, UNICAST3: DATA, seqno=2, TP: [cluster_name=ISPN]
> 17:04:56,057 TRACE (INT-1,NodeE-37644:) [TCP_NIO2] NodeE-37644: received [dst: NodeE-37644, src: NodeF-49399 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeF-49399, UNICAST3: DATA, seqno=2, TP: [cluster_name=ISPN]
> 17:04:56,057 TRACE (INT-2,NodeE-37644:) [TCP_NIO2] NodeE-37644: received [dst: NodeE-37644, src: NodeG-49307 (3 headers), size=0 bytes, flags=OOB|INTERNAL], headers are GMS: GmsHeader[JOIN_REQ]: mbr=NodeG-49307, UNICAST3: DATA, seqno=2, TP: [cluster_name=ISPN]
> 17:04:56,057 TRACE (INT-2,NodeE-37644:) [UNICAST3] NodeE-37644: delivering NodeG-49307#2
> 17:04:56,057 TRACE (INT-1,NodeE-37644:) [UNICAST3] NodeE-37644: delivering NodeF-49399#2
> # finally
> 17:04:56,057 TRACE (ViewHandler,NodeE-37644:) [GMS] NodeE-37644: joiners=[NodeG-49307, NodeF-49399], suspected=[], leaving=[], new view: [NodeE-37644|1] (3) [NodeE-37644, NodeG-49307, NodeF-49399]
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
More information about the jboss-jira
mailing list