[JBoss JIRA] (WFLY-13162) ConcurrentModificationException in WildFlyJobXmlResolver
by Hielke Hoeve (Jira)
[ https://issues.redhat.com/browse/WFLY-13162?page=com.atlassian.jira.plugi... ]
Hielke Hoeve updated WFLY-13162:
--------------------------------
Description:
Because of a thread-unsafe construction in WildFlyJobXmlResolver, I occasionally get ConcurrentModificationExceptions while starting my EAR. This happens because every EJB in the EAR is processed by WildFlyJobXmlResolver in parallel threads, even if it has no jbatch dependency and no dependency on an EJB that does.
This is not reproducible on every startup, as it depends on the performance of the host machine, the number of EJBs in the EAR, the number of threads, and thread timing.
I have created an example project representing our project setup, with which I am able to reproduce the issue using a default WildFly build. This test project contains 1 EJB with jbatch, a number of plain EJBs, a WAR which uses the jbatch EJB, and an EAR.
See below for the exception stacktrace.
I have made a small fix for this issue @ https://github.com/hielkehoeve/wildfly/commit/82047233621c3e5bdbd45333ca4.... The example test project can also be found there.
{code:java}
08:32:30,164 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-2) MSC000001: Failed to start service jboss.deployment.subunit."project.ear"."project-d.jar".POST_MODULE: org.jboss.msc.service.StartException in service jboss.deployment.subunit."project.ear"."project-d.jar".POST_MODULE: WFLYSRV0153: Failed to process phase POST_MODULE of subdeployment "project-d.jar" of deployment "project.ear"
at org.jboss.as.server@10.0.0.Final//org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:183)
at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1739)
at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.execute(ServiceControllerImpl.java:1701)
at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1559)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.ConcurrentModificationException
at java.base/java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
at java.base/java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:741)
at java.base/java.util.AbstractCollection.addAll(AbstractCollection.java:351)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.merge(WildFlyJobXmlResolver.java:261)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:127)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.BatchEnvironmentProcessor.deploy(BatchEnvironmentProcessor.java:78)
at org.jboss.as.server@10.0.0.Final//org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:176)
... 8 more
{code}
was:
Because of a thread-unsafe construction in WildFlyJobXmlResolver, I occasionally get ConcurrentModificationExceptions while starting my EAR. This happens because every EJB in the EAR is processed by WildFlyJobXmlResolver in parallel threads, even if it has no jbatch dependency and no dependency on an EJB that does.
This is not reproducible on every startup, as it depends on the performance of the host machine, the number of EJBs in the EAR, the number of threads, and thread timing.
I have created an example project representing our project setup, with which I am able to reproduce the issue using a default WildFly build. This test project contains 1 EJB with jbatch, a number of plain EJBs, a WAR which uses the jbatch EJB, and an EAR.
See below for the exception stacktrace.
I have made a small fix for this issue @ https://github.com/hielkehoeve/wildfly/tree/18.0.x. The example test project can also be found there.
{code:java}
08:32:30,164 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-2) MSC000001: Failed to start service jboss.deployment.subunit."project.ear"."project-d.jar".POST_MODULE: org.jboss.msc.service.StartException in service jboss.deployment.subunit."project.ear"."project-d.jar".POST_MODULE: WFLYSRV0153: Failed to process phase POST_MODULE of subdeployment "project-d.jar" of deployment "project.ear"
at org.jboss.as.server@10.0.0.Final//org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:183)
at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1739)
at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.execute(ServiceControllerImpl.java:1701)
at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1559)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.ConcurrentModificationException
at java.base/java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
at java.base/java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:741)
at java.base/java.util.AbstractCollection.addAll(AbstractCollection.java:351)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.merge(WildFlyJobXmlResolver.java:261)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:127)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.BatchEnvironmentProcessor.deploy(BatchEnvironmentProcessor.java:78)
at org.jboss.as.server@10.0.0.Final//org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:176)
... 8 more
{code}
> ConcurrentModificationException in WildFlyJobXmlResolver
> --------------------------------------------------------
>
> Key: WFLY-13162
> URL: https://issues.redhat.com/browse/WFLY-13162
> Project: WildFly
> Issue Type: Bug
> Components: Batch
> Affects Versions: 18.0.1.Final, 19.0.0.Beta3
> Reporter: Hielke Hoeve
> Assignee: Cheng Fang
> Priority: Major
>
> Because of a thread-unsafe construction in WildFlyJobXmlResolver, I occasionally get ConcurrentModificationExceptions while starting my EAR. This happens because every EJB in the EAR is processed by WildFlyJobXmlResolver in parallel threads, even if it has no jbatch dependency and no dependency on an EJB that does.
> This is not reproducible on every startup, as it depends on the performance of the host machine, the number of EJBs in the EAR, the number of threads, and thread timing.
> I have created an example project representing our project setup, with which I am able to reproduce the issue using a default WildFly build. This test project contains 1 EJB with jbatch, a number of plain EJBs, a WAR which uses the jbatch EJB, and an EAR.
> See below for the exception stacktrace.
> I have made a small fix for this issue @ https://github.com/hielkehoeve/wildfly/commit/82047233621c3e5bdbd45333ca4.... The example test project can also be found there.
> {code:java}
> 08:32:30,164 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-2) MSC000001: Failed to start service jboss.deployment.subunit."project.ear"."project-d.jar".POST_MODULE: org.jboss.msc.service.StartException in service jboss.deployment.subunit."project.ear"."project-d.jar".POST_MODULE: WFLYSRV0153: Failed to process phase POST_MODULE of subdeployment "project-d.jar" of deployment "project.ear"
> at org.jboss.as.server@10.0.0.Final//org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:183)
> at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1739)
> at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.execute(ServiceControllerImpl.java:1701)
> at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1559)
> at org.jboss.threads@2.3.3.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
> at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
> at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
> at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.util.ConcurrentModificationException
> at java.base/java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
> at java.base/java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:741)
> at java.base/java.util.AbstractCollection.addAll(AbstractCollection.java:351)
> at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.merge(WildFlyJobXmlResolver.java:261)
> at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:127)
> at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
> at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
> at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
> at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.WildFlyJobXmlResolver.forDeployment(WildFlyJobXmlResolver.java:130)
> at org.wildfly.extension.batch.jberet@18.0.0.Final//org.wildfly.extension.batch.jberet.deployment.BatchEnvironmentProcessor.deploy(BatchEnvironmentProcessor.java:78)
> at org.jboss.as.server@10.0.0.Final//org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:176)
> ... 8 more
> {code}
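The stack trace above shows {{AbstractCollection.addAll}} iterating a {{LinkedHashMap}} key set inside {{WildFlyJobXmlResolver.merge}}, presumably while another deployment thread modifies the same map. The following is a minimal sketch of one way to avoid that pattern. It is not the actual WildFly code and not the linked fix: the class {{JobNameMerge}}, its field, and its methods are hypothetical, and guarding the map with a lock plus handing out snapshots is only one of several possible approaches.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a resolver that is shared between deployment threads.
public class JobNameMerge {
    // LinkedHashMap is not thread-safe: iterating it while another thread
    // inserts entries can throw ConcurrentModificationException.
    private final Map<String, String> jobXmlByName = new LinkedHashMap<>();

    // Every read and write of the map goes through the same lock.
    public synchronized void register(String jobName, String xmlFile) {
        jobXmlByName.put(jobName, xmlFile);
    }

    public synchronized String xmlFor(String jobName) {
        return jobXmlByName.get(jobName);
    }

    // Returns a copy, so callers never iterate the live key set.
    public synchronized Set<String> jobNames() {
        return new LinkedHashSet<>(jobXmlByName.keySet());
    }

    // Merges another resolver's entries into this one by iterating a snapshot.
    // The two locks are never held at the same time, so two resolvers
    // merging into each other cannot deadlock.
    public void merge(JobNameMerge other) {
        for (String name : other.jobNames()) {
            register(name, other.xmlFor(name));
        }
    }
}
```

Copying the key set costs an allocation per merge, but it keeps the iteration independent of concurrent writers, which is the property the failing {{addAll}} call lacks.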
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 3 months
[JBoss JIRA] (JGRP-2435) ClientGmsImpl ignores newer view during join
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2435?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2435:
---------------------------
Fix Version/s: 4.2.1
(was: 4.2.0)
> ClientGmsImpl ignores newer view during join
> --------------------------------------------
>
> Key: JGRP-2435
> URL: https://issues.redhat.com/browse/JGRP-2435
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.1.9
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.0, 4.2.1
>
>
> We have random failures in a test that starts 4 nodes in parallel ({{org.infinispan.InitialClusterSizeTest.testInitialClusterSize}}). I get at least one failure if I add {{@Test(invocationCount=100)}}, but I did not get any failure when I did the same with {{org.jgroups.tests.ConcurrentStartupTest.testConcurrentJoinWithLOCAL_PING()}}, maybe because the Infinispan test sends some additional messages and sometimes changes how messages are processed.
> The problem seems to be that after receiving a {{JOIN_RSP}} and installing the first view, the GMS implementation is still {{ClientGmsImpl}} when receiving the second view, and the view is ignored. Because this second view already has all 4 members, there is no other {{VIEW}} message and the test just times out.
> I added some logs in TP and GMS for debugging, and this is what I see:
> {noformat}
> 10:37:42,896 TRACE (ucast-receiver-2,Test-NodeC:[]) [UDP] Received oob=false internal=true message GMS: GmsHeader[JOIN_RSP], UNICAST3: DATA, seqno=1, first, TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,898 TRACE (mcast-receiver-3,Test-NodeC:[]) [UDP] Received oob=false internal=false message GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=1], TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,906 TRACE (jgroups-4,Test-NodeC:[]) [UDP] Test-NodeC: received [Test-NodeB to Test-NodeC, 61 bytes, flags=INTERNAL], headers are GMS: GmsHeader[JOIN_RSP], UNICAST3: DATA, seqno=1, first, TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,906 TRACE (jgroups-4,Test-NodeC:[]) [GMS] Handling message GMS: GmsHeader[JOIN_RSP], UNICAST3: DATA, seqno=1, first, TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,907 DEBUG (ForkThread-1,InitialClusterSizeTest:[]) [GMS] Test-NodeC: installing view [Test-NodeB|1] (2) [Test-NodeB, Test-NodeC]
> 10:37:42,947 TRACE (mcast-receiver-3,Test-NodeC:[]) [UDP] Received oob=false internal=false message GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=2], TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,948 TRACE (jgroups-5,Test-NodeC:[]) [UDP] Test-NodeC: received message batch of from Test-NodeB: dest=null, sender=Test-NodeB
> 1:
> #1: GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=1], TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,948 TRACE (jgroups-5,Test-NodeC:[]) [GMS] Handling message GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=1], TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,948 TRACE (jgroups-5,Test-NodeC:[]) [UDP] Test-NodeC: received message batch of from Test-NodeB: dest=null, sender=Test-NodeB
> 1:
> #1: GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=2], TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:42,948 TRACE (jgroups-5,Test-NodeC:[]) [GMS] Handling message GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=2], TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> ### GmsImpl.handleViewChange() {}
> 10:37:42,948 TRACE (jgroups-5,Test-NodeC:[]) [GMS] GmsImpl ignoring view update [Test-NodeB|2] (4) [Test-NodeB, Test-NodeC, Test-NodeA, Test-NodeD]
> 10:37:42,950 TRACE (jgroups-5,Test-NodeC:[]) [UDP] Test-NodeC: sending msg to Test-NodeB, src=Test-NodeC, headers are GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=3, TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> ### The ack for the JOIN_RSP is only sent here
> 10:37:43,034 TRACE (ForkThread-1,InitialClusterSizeTest:[]) [UDP] Test-NodeC: sending msg to Test-NodeB, src=Test-NodeC, headers are GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=4, TP: [cluster=org.infinispan.remoting.transport.InitialClusterSizeTest]
> 10:37:43,034 DEBUG (ForkThread-1,InitialClusterSizeTest:[]) [JGroupsTransport] Waiting for 4 nodes, current view has 2
> {noformat}
> To help debugging, {{TP.passBatchUp}} should really log the headers of the messages in the batch, and {{GMS}} and the {{GmsImpl}} subclasses should log at least a DEBUG message every time they ignore a view (even when {{log_view_warnings==false}}). I also suggest removing the default implementations from {{GmsImpl}} and updating the {{ClientGmsImpl}} javadoc, because it talks about handling {{ViewChange}} instead of {{JoinResponse}}.
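The logging suggestion above can be illustrated with a small sketch in which a view is never dropped silently. This is not JGroups code: {{ViewHandlerSketch}} and its methods are hypothetical, view ids are reduced to a plain {{long}}, and {{java.util.logging}} level {{FINE}} stands in for DEBUG.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical handler: installs newer views and logs every view it ignores.
public class ViewHandlerSketch {
    private static final Logger LOG = Logger.getLogger(ViewHandlerSketch.class.getName());

    // Id of the last installed view; -1 means no view has been installed yet.
    private long currentViewId = -1;

    // Returns true if the view was installed, false if it was ignored.
    public synchronized boolean handleView(long viewId, int memberCount) {
        if (viewId <= currentViewId) {
            // Logged unconditionally, independent of any warning switch.
            LOG.log(Level.FINE, "ignoring view {0} with {1} members; current view is {2}",
                    new Object[] {viewId, memberCount, currentViewId});
            return false;
        }
        currentViewId = viewId;
        return true;
    }

    public synchronized long currentViewId() {
        return currentViewId;
    }
}
```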
--
[JBoss JIRA] (JGRP-2421) Don't log an IllegalArgumentException when receiving an invalid delta view
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2421?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2421:
---------------------------
Fix Version/s: 5.0
4.2.1
(was: 4.2.0)
> Don't log an IllegalArgumentException when receiving an invalid delta view
> --------------------------------------------------------------------------
>
> Key: JGRP-2421
> URL: https://issues.redhat.com/browse/JGRP-2421
> Project: JGroups
> Issue Type: Enhancement
> Affects Versions: 4.1.6
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 5.0, 4.2.1
>
>
> When the cluster is unstable but not because of the network (e.g. excessive load and long GC pauses), a suspected node can keep receiving view updates long after it was excluded from the view.
> The first view triggers a warning about not being a member, but because later delta-views cannot be reconstructed, their warnings look much worse, even though they're about the same thing.
> {noformat}
> 11:31:05,281 WARN [org.jgroups.protocols.pbcast.GMS] (jgroups-79,edg-perf03-47882) edg-perf03-47882: not member of view [edg-perf01-21541|6]; discarding it
> 11:31:16,267 WARN [org.jgroups.protocols.pbcast.GMS] (jgroups-80,edg-perf03-47882) edg-perf03-47882: failed to create view from delta-view; dropping view: java.lang.IllegalStateException: the view-id of the delta view ([edg-perf01-21541|6]) doesn't match the current view-id ([edg-perf01-21541|5]); discarding delta view [edg-perf01-21541|7], ref-view=[edg-perf01-21541|6], left=[edg-perf06-47720]
> {noformat}
> Maybe GMS could keep track of when it receives a view in which it is not a member and stop processing delta views from that node?
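A rough sketch of the tracking idea in the last sentence, assuming members and coordinators can be identified by plain strings. {{ExcludedViewTracker}} and its methods are hypothetical and not part of JGroups; the sketch only shows the bookkeeping, not how GMS would call it.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: remembers which coordinators sent a full view that
// did not contain the local member.
public class ExcludedViewTracker {
    private final String localAddress;
    private final Set<String> excludingCoordinators = ConcurrentHashMap.newKeySet();

    public ExcludedViewTracker(String localAddress) {
        this.localAddress = localAddress;
    }

    // Records a full view. Returns true if the local member is part of it.
    public boolean onFullView(String coordinator, Set<String> members) {
        if (members.contains(localAddress)) {
            excludingCoordinators.remove(coordinator);
            return true;
        }
        excludingCoordinators.add(coordinator);
        return false;
    }

    // True if delta views from this coordinator can be dropped without
    // attempting to reconstruct them (and without logging an exception).
    public boolean shouldSkipDeltaView(String coordinator) {
        return excludingCoordinators.contains(coordinator);
    }
}
```

With this bookkeeping, the first "not member of view" warning would still be logged once, while later delta views from the same coordinator could be discarded at a lower log level.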
--
[JBoss JIRA] (JGRP-2451) FD_ALL2: improvements
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2451?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2451:
---------------------------
Fix Version/s: 4.2.1
(was: 4.2.0)
> FD_ALL2: improvements
> ---------------------
>
> Key: JGRP-2451
> URL: https://issues.redhat.com/browse/JGRP-2451
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.0, 4.2.1
>
>
> Improvements to {{FD_ALL2}}.
> * Messages should count as heartbeats ({{msg_counts_as_heartbeat}} should be *default*, and as such, deprecated).
> * When a multicast message is sent before {{interval}} has elapsed, we suppress sending a heartbeat.
> * There's a map associating members with booleans. True means a heartbeat was received since the last check, false means it wasn't. On a check, the booleans are all set to false.
> It is crucial that setting the boolean in the map is quick (unlike in {{FD_ALL}}, where we fetch the current time from the time service), especially since this is done on every message.
> The advantage is that we only send heartbeats when there is no (multicast) traffic, and we don't suspect a member P when heartbeats have been missing despite receiving traffic from P.
> We need to think about whether to consider unicast messages, too, on the sender side: we could populate a bit map with messages sent to members: on a unicast message to P, P's bit would be set in the bit map. On a multicast message, all bits would be set. Then, we could selectively send heartbeats only to members with bits set to 0.
> However, this is only feasible with sending a message N-1 times (e.g. TCP); for UDP we don't have such an 'anycast' available.
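The member-to-boolean map described above could look roughly like the following. This is a sketch under assumptions, not the {{FD_ALL2}} implementation: {{HeartbeatTable}} and its method names are hypothetical, members are plain strings, and starting new members as "seen" (so they get one full interval before being suspected) is a choice made here for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical table: true = traffic seen since the last check, false = none.
public class HeartbeatTable {
    private final Map<String, AtomicBoolean> seen = new ConcurrentHashMap<>();

    // Assumption: a new member starts as "seen".
    public void addMember(String member) {
        seen.putIfAbsent(member, new AtomicBoolean(true));
    }

    // Hot path, called for every heartbeat or regular message:
    // one map lookup and one flag write, no clock access.
    public void markSeen(String sender) {
        AtomicBoolean flag = seen.get(sender);
        if (flag != null) {
            flag.set(true);
        }
    }

    // Periodic check: members whose flag is still false become suspects,
    // and every flag is reset to false for the next interval.
    public List<String> checkAndReset() {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, AtomicBoolean> entry : seen.entrySet()) {
            if (!entry.getValue().getAndSet(false)) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }
}
```

Because {{markSeen}} is called for regular messages as well as heartbeats, a member that keeps sending traffic is never suspected, which matches the advantage described above.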
--
[JBoss JIRA] (JGRP-2454) Documentation is wrong for ForkChannel creation / Initial messages on fork channel are lost
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2454?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2454:
---------------------------
Fix Version/s: 4.2.1
(was: 4.2.0)
> Documentation is wrong for ForkChannel creation / Initial messages on fork channel are lost
> -------------------------------------------------------------------------------------------
>
> Key: JGRP-2454
> URL: https://issues.redhat.com/browse/JGRP-2454
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.1.8
> Reporter: Mirko Streckenbach
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 5.0, 4.2.1
>
>
> The documentation at
> http://www.jgroups.org/manual/html/user-advanced.html#ForkChannelCreation
> has the following example:
> {code}
> JChannel main_ch=new JChannel("/home/bela/udp.xml").name("A");
> ForkChannel fork_ch=new ForkChannel(main_ch, "lock", "fork-ch4",
> new CENTRAL_LOCK(), new STATS());
> fork_ch.connect("bla");
> main_ch.connect("cluster");
> {code}
> This does not work as "fork_ch.connect" will throw an IllegalStateException because the main channel is not connected at that point.
> But if the connects are reversed, messages for the fork channel may arrive before the fork channel is fully established and cause warnings like
> {code}
> Feb 20, 2020 6:15:37 PM org.jgroups.protocols.FORK$1 handleUnknownForkChannel
> WARNING: marian-20309: fork-channel for id=fork-ch4 not found; discarding message
> {code}
> My application sends a message to every new member in the cluster on a specific fork channel (in ReceiverAdapter.viewAccepted). These messages usually get lost. Is there an alternative pattern for that?
> I can provide example code if required.
--
[JBoss JIRA] (JGRP-2412) GMS: reduce likelyhood of merges on concurrent new member startup
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2412?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2412:
---------------------------
Fix Version/s: 5.0
4.2.1
(was: 4.2.0)
> GMS: reduce likelyhood of merges on concurrent new member startup
> -----------------------------------------------------------------
>
> Key: JGRP-2412
> URL: https://issues.redhat.com/browse/JGRP-2412
> Project: JGroups
> Issue Type: Enhancement
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.0, 4.2.1
>
>
> When multiple new members are started concurrently, and no coord is present yet, different clients get different discovery results, and this may lead to merging.
> If we have no coord, we could run the discovery protocol multiple times, *with the current members list being cached*, and might then end up with a more similar list of new members.
> This could be governed by an additional attribute in {{Discovery}} ({{num_discovery_runs}}?)
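The idea of repeating discovery while caching the members found so far can be sketched as follows. This is only an illustration: {{RepeatedDiscovery}} is a hypothetical class, a single discovery run is modelled as a {{Supplier}} returning member names, and {{numRuns}} plays the role of the proposed {{num_discovery_runs}} attribute.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper: accumulates the responses of several discovery runs.
public class RepeatedDiscovery {
    // Runs discovery numRuns times and returns the union of all responses,
    // in the order in which members were first seen.
    public static Set<String> discover(Supplier<Set<String>> singleRun, int numRuns) {
        Set<String> cached = new LinkedHashSet<>();
        for (int i = 0; i < numRuns; i++) {
            cached.addAll(singleRun.get());
        }
        return cached;
    }
}
```

Clients that each see only part of the cluster in a single run are more likely to converge on the same member list once the results of several runs are merged.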
--