November 2019 - jboss-jira - Jboss List Archives

[JBoss JIRA] (JGRP-1997) Bundler: find new default bundler

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-1997?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-1997:
---------------------------
Fix Version/s: 5.1 (was: 5.0)

> Bundler: find new default bundler
> ---------------------------------
>
> Key: JGRP-1997
> URL: https://issues.jboss.org/browse/JGRP-1997
> Project: JGroups
> Issue Type: Task
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.1
>
>
> Before removing the deprecated sender-sends-with-timer bundler, assess which bundler is the fastest across a range of applications (UPerf, Infinispan etc).

--
This message was sent by Atlassian Jira (v7.13.8#713008)

[JBoss JIRA] (JGRP-1830) SEQUENCER3

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-1830?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-1830:
---------------------------
Fix Version/s: 5.2 (was: 5.0)

> SEQUENCER3
> ----------
>
> Key: JGRP-1830
> URL: https://issues.jboss.org/browse/JGRP-1830
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.2
>
>
> Continuation of JGRP-1821; handle failure cases

--
This message was sent by Atlassian Jira (v7.13.8#713008)

[JBoss JIRA] (JGRP-1680) RDMA based transport

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-1680?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-1680:
---------------------------
Fix Version/s: 5.2 (was: 5.0)

> RDMA based transport
> --------------------
>
> Key: JGRP-1680
> URL: https://issues.jboss.org/browse/JGRP-1680
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.2
>
>
> Investigate whether an RDMA based transport makes sense.
> Advantages:
> * Speed, low latency (TCP/IP is bypassed entirely)
> * Low CPU usage
> Disadvantages:
> * JNI/C code
> ** Such a transport implementation would have to live outside of the JGroups repo
> ** Maintainability nightmare: the C code would also have to be ported to various OSes
> *** Investigate Java based libs (IBM's jVerbs) and C based libs (Apache Portable Runtime?)
> * High memory use, growing with cluster size: similarly to TCP, a 'group multicast' would involve N-1 sends. RDMA requires a Queue Pair (QP) for each destination. Each QP requires pinned memory (receive and send buffer), so each node would have to reserve (pin) N-1 memory buffers [1]
> ** OTOH, we may not use many group multicasts, e.g. with Infinispan's partial replication (DIST mode)
> * High cost of RDMA adapters, NICs and wiring: only a very small fraction of users would run such a transport.
> [1] http://www.hpcwire.com/hpcwire/2006-08-18/a_critique_of_rdma-1.html

--
This message was sent by Atlassian Jira (v7.13.8#713008)
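
The memory concern in the issue above is simple arithmetic: one Queue Pair per destination, each with a pinned send and receive buffer. A minimal sketch of that estimate follows. The class name, method name and buffer sizes are assumptions made for illustration; the issue gives no concrete numbers and nothing here is JGroups or RDMA library API.

```java
// Hypothetical helper, not part of JGroups: estimates pinned memory per node
// under the assumption of one Queue Pair (send + receive buffer) per peer.
final class RdmaMemorySketch {

    // clusterSize is N; each node talks to N-1 peers.
    static long pinnedBytesPerNode(int clusterSize, long sendBufBytes, long recvBufBytes) {
        if (clusterSize < 1)
            throw new IllegalArgumentException("clusterSize must be >= 1");
        long queuePairs = clusterSize - 1L;
        return queuePairs * (sendBufBytes + recvBufBytes);
    }
}
```

With assumed 1 MiB buffers, `RdmaMemorySketch.pinnedBytesPerNode(100, 1L << 20, 1L << 20)` gives 99 Queue Pairs times 2 MiB, so the pinned memory per node grows linearly with cluster size.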

[JBoss JIRA] (JGRP-1681) SCTP transport

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-1681?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-1681:
---------------------------
Fix Version/s: 5.2 (was: 5.0)

> SCTP transport
> --------------
>
> Key: JGRP-1681
> URL: https://issues.jboss.org/browse/JGRP-1681
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.2
>
>
> Provide a new transport based on SCTP. The advantages of SCTP are:
> * Message based (like UDP), not stream based (like TCP), but still reliable
> * Allows to bind to multiple endpoints (failover)
> This requires the new NIO2 based transport to be completed, so this feature will probably be moved into a next release. Also, in JDK we only have a com.sun.nio.sctp package; once this is standardized it will be in a java.nio.channels.sctp package.
> [1] http://www.oracle.com/technetwork/articles/javase/index-139946.html

--
This message was sent by Atlassian Jira (v7.13.8#713008)

[JBoss JIRA] (JGRP-1424) TP: use of multiple transports

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-1424?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-1424:
---------------------------
Fix Version/s: 5.1 (was: 5.0)

> TP: use of multiple transports
> ------------------------------
>
> Key: JGRP-1424
> URL: https://issues.jboss.org/browse/JGRP-1424
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.1
>
>
> Refactor TP so that the socket sending and receiving is done in a separate class (UDP, TCP, TCP_NIO). Once this is done, add the ability to attach multiple transports to TP, e.g. UDP and TCP.
> The UDP transport could then be used for cluster wide messages (null destination) and the TCP transport could be used for unicast messages (non-null destination).
> Or this could be overridden by a message flag on a per-message basis !
> We could even attach multiple transports of the same type, e.g. one per physical network (10.x.x.x and 192.168.x.x), and do round-robin sending over them.

--
This message was sent by Atlassian Jira (v7.13.8#713008)
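
The routing rule described in the issue above (null destination goes to the cluster-wide transport, non-null destination goes to a unicast transport, with round-robin over several unicast transports) could look roughly like the sketch below. `TransportSelector` and the transport names are hypothetical and only illustrate the selection logic; they are not JGroups classes, and the per-message flag override mentioned in the issue is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; transports are represented by plain names, not JGroups classes.
final class TransportSelector {
    private final String multicastTransport;       // used when the destination is null
    private final List<String> unicastTransports;  // rotated for non-null destinations
    private final AtomicInteger next = new AtomicInteger();

    TransportSelector(String multicastTransport, List<String> unicastTransports) {
        if (unicastTransports.isEmpty())
            throw new IllegalArgumentException("need at least one unicast transport");
        this.multicastTransport = multicastTransport;
        this.unicastTransports = new ArrayList<>(unicastTransports);
    }

    // A null destination means "whole cluster"; anything else is a unicast.
    String select(Object destination) {
        if (destination == null)
            return multicastTransport;
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), unicastTransports.size());
        return unicastTransports.get(index);
    }
}
```

A selector built with `"UDP"` and two assumed TCP transports (one per physical network) returns `"UDP"` for a null destination and alternates between the two TCP entries for unicasts.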

[JBoss JIRA] (JGRP-1672) Shared memory to send message between different processes on the same box

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-1672?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-1672:
---------------------------
Fix Version/s: 5.2 (was: 5.0)

> Shared memory to send message between different processes on the same box
> -------------------------------------------------------------------------
>
> Key: JGRP-1672
> URL: https://issues.jboss.org/browse/JGRP-1672
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.2
>
> Attachments: ShmTest.java
>
>
> Investigate whether it makes sense to use shared memory to pass messages between processes on the same box. Say if we have A, B and C on box-1 and X, Y, Z on box-2, when A multicasts a message, it could loop it back to itself, place it into shared memory for B and C to read and multicast it to X, Y, Z. The multicast socket could be non-loopback, so box-1 would not receive it.
> Problems:
> * Shared memory in Java can only be done via memory mapped (sequential or random access) files. To pass a lot of messages, something like a ring buffer would have to be created in shared memory
> * Unless we use FileLock, or polling/busy reading, there is no way to know when a producer has written a message into shared memory. We'd therefore have to use a signalling mechanism, probably a small JGroups message, to notify the consumer(s) of new messages.
> ** Alternatively, we could do busy waiting: the producer writes into a memory location when a message is ready to be consumed. Perhaps this memory location can be the number of messages ready to be read. The consumer could busy-wait, and decrement the number of messages read. This variable could be protected by a file lock, so after some amount of busy-waiting, the consumer could go back and do a real wait on the file lock, instead of burning CPU doing busy-waits.
> * For multicast messages, we'd have 1 producer but many consumers. A RingBuffer would not work here, as we don't know when all consumers have read a given message, ie. when to advance the read pointer
> ** As an alternative, we could have one shared memory buffer per member on the same host. This would also cater to unicast messages. However, then we'd use up a lot of memory.
> * How would this work for TCP ? We'd have to send the message to only members which are outside the local box. How do we identify those members ?
> * Message reception: a multicast message received and targeted to all members on the same box could also be placed into shared memory, so everyone on the same box receives it
> ** How would this work for TCP ? E.g. A sending a multicast message M would use shared memory to deliver M to B and C on box-1, but if it sends it to X, Y and Z, then that's unneeded work, as it could send it only to X, which could place it into shared memory for Y and Z to consume M.
> *** We'd have to include the knowledge of 'affinity' into an address

--
This message was sent by Atlassian Jira (v7.13.8#713008)
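
The "number of messages ready to be read" idea from the issue above can be sketched with a memory-mapped file, which is the mechanism the issue names for shared memory in Java. This is a deliberately small sketch under stated assumptions: a single message slot instead of a ring buffer, one producer and one consumer, no file lock, and no memory-ordering guarantees between processes. `ShmSketch`, its layout and its method names are hypothetical and unrelated to the attached ShmTest.java.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical single-slot layout: [int readyCount][int length][payload bytes...]
final class ShmSketch {
    static final int COUNT_OFFSET = 0;
    static final int LENGTH_OFFSET = 4;
    static final int DATA_OFFSET = 8;
    static final int REGION_SIZE = 1024;

    static Path newBackingFile() {
        try {
            return Files.createTempFile("shm-sketch", ".bin");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Each process (here: each caller) maps the same file; the mapping outlives the channel.
    static MappedByteBuffer map(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, REGION_SIZE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Producer: write the payload first, then bump the ready counter.
    static void produce(MappedByteBuffer shm, String message) {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > REGION_SIZE - DATA_OFFSET)
            throw new IllegalArgumentException("message too large for the slot");
        shm.putInt(LENGTH_OFFSET, bytes.length);
        for (int i = 0; i < bytes.length; i++)
            shm.put(DATA_OFFSET + i, bytes[i]);
        shm.putInt(COUNT_OFFSET, shm.getInt(COUNT_OFFSET) + 1);
    }

    // Consumer: returns null when nothing is ready; a caller would poll or busy-wait on this.
    static String consume(MappedByteBuffer shm) {
        if (shm.getInt(COUNT_OFFSET) <= 0)
            return null;
        byte[] bytes = new byte[shm.getInt(LENGTH_OFFSET)];
        for (int i = 0; i < bytes.length; i++)
            bytes[i] = shm.get(DATA_OFFSET + i);
        shm.putInt(COUNT_OFFSET, shm.getInt(COUNT_OFFSET) - 1);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Two mappings of the same backing file stand in for two processes: a message written through one mapping with `produce` is returned by `consume` on the other. The multi-consumer problem raised in the issue (knowing when every consumer has read a message) is not addressed by this layout.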

[JBoss JIRA] (JGRP-2407) CompositeMessage: collapse into BytesMessage at the receiver

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-2407?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-2407:
---------------------------
Summary: CompositeMessage: collapse into BytesMessage at the receiver (was: CompositeMessage: collapse into BytesMessage at the receiver5)

> CompositeMessage: collapse into BytesMessage at the receiver
> ------------------------------------------------------------
>
> Key: JGRP-2407
> URL: https://issues.jboss.org/browse/JGRP-2407
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.0
>
>
> If a sender sends a CompositeMessage, say of an NioMessage of 500 bytes and a BytesMessage of 10 bytes, provide the option to unmarshal the message into a BytesMessage of 510 bytes at the receiver.

--
This message was sent by Atlassian Jira (v7.13.8#713008)
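
The collapse described in the issue above amounts to concatenating the parts' payloads into one contiguous byte array. The sketch below shows only that copy step with plain `java.nio` types standing in for the message classes; `CollapseSketch` is a hypothetical name, and how JGroups would actually unmarshal a CompositeMessage is not shown in the issue.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: a ByteBuffer stands in for the NioMessage payload,
// a byte[] for the BytesMessage payload.
final class CollapseSketch {

    static byte[] collapse(ByteBuffer nioPart, byte[] bytesPart) {
        ByteBuffer src = nioPart.duplicate();        // leave the caller's position untouched
        int nioLength = src.remaining();
        byte[] out = new byte[nioLength + bytesPart.length];
        src.get(out, 0, nioLength);                  // copy the buffer's remaining bytes
        System.arraycopy(bytesPart, 0, out, nioLength, bytesPart.length);
        return out;
    }
}
```

For the sizes used in the issue, a 500-byte buffer and a 10-byte array collapse into a single array of 510 bytes.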

[JBoss JIRA] (JGRP-2417) Ref-counting for messages

by Bela Ban (Jira)

[ https://issues.jboss.org/browse/JGRP-2417?page=com.atlassian.jira.plugin.... ]

Bela Ban updated JGRP-2417:
---------------------------
Description:
When users do their own memory management, the message passed to {{Channel.send()}} might have a reference to a memory area that's allocated from a pool, and that needs to be returned when done.

However, {{Channel.send()}} does not necessarily mean that the memory area can be reused. If, for example, NAKACK2 or UNICAST3 have the message in their retransmission tables (to potentially retransmit it), then the memory cannot be reused until that message has been purged from the retransmission table.

Add a reference-counting mechanism to {{Message}} (implemented in {{BaseMessage}}) that allows NAKACK2 or UNICAST3 to increment a ref-count. When a message is purged from the retransmission table, decrement its ref-count. When the ref-count is 0, a callback could be called. The callback could for example return the associated memory chunk back to the memory pool.

This could possibly be a trait, with a no-op implementation as default. This could be overwritten, ie.
{code:java}
Message release() {
    if(refcount <= 0)
        // give associated memory area back to pool
}
{code}

See if this needs to be integrated with {{MessageFactory}} as well.

was:
When users do their own memory management, the message passed to {{Channel.send()}} might have a reference to a memory area that's allocated from a pool, and that needs to be returned when done.

However, {{Channel.send()}} does not necessarily mean that the memory area can be reused. If, for example, NAKACK2 or UNICAST3 have the message in their retransmission tables (to potentially retransmit it), then the memory cannot be reused until that message has been purged from the retransmission table.

Add a reference-counting mechanism to {{Message}} (implemented in {{BaseMessage}}) that allows NAKACK2 or UNICAST3 to increment a ref-count. When a message is purged from the retransmission table, decrement its ref-count. When the ref-count is 0, a callback could be called. The callback could for example return the associated memory chunk back to the memory pool.

This could possibly be a trait, with a no-op implementation as default.

See if this needs to be integrated with {{MessageFactory}} as well.

> Ref-counting for messages
> -------------------------
>
> Key: JGRP-2417
> URL: https://issues.jboss.org/browse/JGRP-2417
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Major
> Fix For: 5.0
>
>
> When users do their own memory management, the message passed to {{Channel.send()}} might have a reference to a memory area that's allocated from a pool, and that needs to be returned when done.
> However, {{Channel.send()}} does not necessarily mean that the memory area can be reused. If, for example, NAKACK2 or UNICAST3 have the message in their retransmission tables (to potentially retransmit it), then the memory cannot be reused until that message has been purged from the retransmission table.
> Add a reference-counting mechanism to {{Message}} (implemented in {{BaseMessage}}) that allows NAKACK2 or UNICAST3 to increment a ref-count. When a message is purged from the retransmission table, decrement its ref-count. When the ref-count is 0, a callback could be called. The callback could for example return the associated memory chunk back to the memory pool.
> This could possibly be a trait, with a no-op implementation as default. This could be overwritten, ie.
> {code:java}
> Message release() {
>     if(refcount <= 0)
>         // give associated memory area back to pool
> }
> {code}
> See if this needs to be integrated with {{MessageFactory}} as well.

--
This message was sent by Atlassian Jira (v7.13.8#713008)
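
The ref-counting proposal above can be illustrated outside JGroups with a small stand-alone class. This is a sketch under assumptions, not the planned `Message`/`BaseMessage` change: `RefCountedBuffer`, `retain()` and the `Runnable` callback are hypothetical names, and the sketch assumes the sender holds the first reference, which the issue does not specify.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names are JGroups API.
final class RefCountedBuffer {
    private final byte[] payload;
    private final AtomicInteger refCount = new AtomicInteger(1); // assumption: sender owns one reference
    private final Runnable onRelease;                            // e.g. returns the chunk to a pool

    RefCountedBuffer(byte[] payload, Runnable onRelease) {
        this.payload = payload;
        this.onRelease = onRelease;
    }

    byte[] payload() {
        return payload;
    }

    // Called by whoever keeps the message around, e.g. a retransmission table.
    RefCountedBuffer retain() {
        int current;
        do {
            current = refCount.get();
            if (current <= 0)
                throw new IllegalStateException("already released");
        } while (!refCount.compareAndSet(current, current + 1));
        return this;
    }

    // Called when a holder is done; runs the callback exactly once, when the count reaches 0.
    boolean release() {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0)
            throw new IllegalStateException("released more often than retained");
        if (remaining == 0) {
            onRelease.run();
            return true;
        }
        return false;
    }
}
```

In this sketch a retransmission table would call `retain()` when it stores the message and `release()` when the message is purged; the pool callback fires only after the sender has also released its reference.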

[JBoss JIRA] (JGRP-2417) Ref-counting for messages

by Bela Ban (Jira)

Bela Ban created JGRP-2417:
------------------------------

Summary: Ref-counting for messages
Key: JGRP-2417
URL: https://issues.jboss.org/browse/JGRP-2417
Project: JGroups
Issue Type: Feature Request
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 5.0

When users do their own memory management, the message passed to {{Channel.send()}} might have a reference to a memory area that's allocated from a pool, and that needs to be returned when done.

However, {{Channel.send()}} does not necessarily mean that the memory area can be reused. If, for example, NAKACK2 or UNICAST3 have the message in their retransmission tables (to potentially retransmit it), then the memory cannot be reused until that message has been purged from the retransmission table.

Add a reference-counting mechanism to {{Message}} (implemented in {{BaseMessage}}) that allows NAKACK2 or UNICAST3 to increment a ref-count. When a message is purged from the retransmission table, decrement its ref-count. When the ref-count is 0, a callback could be called. The callback could for example return the associated memory chunk back to the memory pool.

This could possibly be a trait, with a no-op implementation as default.

See if this needs to be integrated with {{MessageFactory}} as well.

--
This message was sent by Atlassian Jira (v7.13.8#713008)

[JBoss JIRA] (WFLY-12824) Clustering: java.lang.StackOverflowError in scattered cache scenarios

by Tommasso Borgato (Jira)

[ https://issues.jboss.org/browse/WFLY-12824?page=com.atlassian.jira.plugin... ]

Tommasso Borgato updated WFLY-12824:
------------------------------------
Priority: Critical (was: Blocker)

> Clustering: java.lang.StackOverflowError in scattered cache scenarios
> ---------------------------------------------------------------------
>
> Key: WFLY-12824
> URL: https://issues.jboss.org/browse/WFLY-12824
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 17.0.1.Final, 18.0.1.Final
> Reporter: Tommasso Borgato
> Assignee: Paul Ferraro
> Priority: Critical
>
> We are in clustering test using scattered cache where fail-over is introduced via server shutdown ([eap-7.x-clustering-http-session-shutdown-scattered|https://eap-qe-jenkins...]) and replication is set at the whole session level;
> Server configuration is:
> {noformat}
> embed-server --server-config=standalone-ha.xml
> /subsystem=jgroups/channel=ee:write-attribute(name=stack,value=tcp)
> /subsystem=infinispan/cache-container=web/scattered-cache=testScattered:add()
> /subsystem=infinispan/cache-container=web/scattered-cache=testScattered/component=state-transfer:add(timeout=0)
> /subsystem=infinispan/cache-container=web:write-attribute(name=default-cache, value=testScattered)
> {noformat}
> With all the following distributions:
> - [wildfly-18.0.1.Final.zip|https://download.jboss.org/wildfly/18.0.1.Final/...]
> - [wildfly-17.0.1.Final.zip|https://download.jboss.org/wildfly/17.0.1.Final/...]
> - [wildfly master|https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/cluster...]
> we get {{java.lang.StackOverflowError}} after WildFly restart (complete logs [1|https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clus...], [2|https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clus...], [3|https://eap-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/eap-7.x-clus...]):
> {noformat}
> 2019-11-24 18:30:00,221 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 86) WFLYCLINF0002: Started clusterbench-ee8.ear.clusterbench-ee8-web-passivating.war cache from web container
> 2019-11-24 18:30:00,837 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-3) MSC000001: Failed to start service jboss.clustering.web."clusterbench-ee8.ear.clusterbench-ee8-web.war": org.jboss.msc.service.StartException in service jboss.clustering.web."clusterbench-ee8.ear.clusterbench-ee8-web.war": org.infinispan.commons.CacheException: java.util.concurrent.ExecutionException: java.lang.StackOverflowError
>     at org.wildfly.clustering.service@18.0.1.Final//org.wildfly.clustering.service.FunctionalService.start(FunctionalService.java:70)
>     at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1739)
>     at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$StartTask.execute(ServiceControllerImpl.java:1701)
>     at org.jboss.msc@1.4.11.Final//org.jboss.msc.service.ServiceControllerImpl$ControllerTask.run(ServiceControllerImpl.java:1559)
>     at org.jboss.threads@2.3.3.Final//org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
>     at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
>     at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
>     at org.jboss.threads@2.3.3.Final//org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.infinispan.commons.CacheException: java.util.concurrent.ExecutionException: java.lang.StackOverflowError
>     at org.infinispan@9.4.16.Final//org.infinispan.interceptors.impl.PrefetchInterceptor$BackingIterator.hasNext(PrefetchInterceptor.java:651)
>     at org.infinispan.commons@9.4.16.Final//org.infinispan.commons.util.IteratorMapper.hasNext(IteratorMapper.java:27)
>     at org.wildfly.clustering.web.infinispan@18.0.1.Final//org.wildfly.clustering.web.infinispan.session.InfinispanSessionManagerFactory.schedule(InfinispanSessionManagerFactory.java:232)
>     at org.wildfly.clustering.web.infinispan@18.0.1.Final//org.wildfly.clustering.web.infinispan.session.InfinispanSessionManagerFactory.<init>(InfinispanSessionManagerFactory.java:120)
>     at org.wildfly.clustering.web.infinispan@18.0.1.Final//org.wildfly.clustering.web.infinispan.session.InfinispanSessionManagerFactoryServiceConfigurator.get(InfinispanSessionManagerFactoryServiceConfigurator.java:92)
>     at org.wildfly.clustering.web.infinispan@18.0.1.Final//org.wildfly.clustering.web.infinispan.session.InfinispanSessionManagerFactoryServiceConfigurator.get(InfinispanSessionManagerFactoryServiceConfigurator.java:69)
>     at org.wildfly.clustering.service@18.0.1.Final//org.wildfly.clustering.service.FunctionalService.start(FunctionalService.java:67)
>     ... 8 more
> Caused by: java.util.concurrent.ExecutionException: java.lang.StackOverflowError
>     at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
>     at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2022)
>     at org.infinispan@9.4.16.Final//org.infinispan.interceptors.impl.PrefetchInterceptor$BackingIterator.hasNext(PrefetchInterceptor.java:649)
>     ... 14 more
> Caused by: java.lang.StackOverflowError
>     at java.base/java.lang.Throwable.getMessage(Throwable.java:382)
>     at java.base/java.lang.Throwable.getLocalizedMessage(Throwable.java:396)
>     at java.base/java.lang.Throwable.toString(Throwable.java:485)
>     at java.base/java.lang.Throwable.<init>(Throwable.java:316)
>     at java.base/java.lang.Exception.<init>(Exception.java:102)
>     at java.base/java.lang.RuntimeException.<init>(RuntimeException.java:96)
>     at java.base/java.util.concurrent.CompletionException.<init>(CompletionException.java:88)
>     at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
>     at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1113)
>     at java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235)
>     at org.infinispan@9.4.16.Final//org.infinispan.scattered.impl.ScatteredVersionManagerImpl.valuesFuture(ScatteredVersionManagerImpl.java:348)
>     at org.infinispan@9.4.16.Final//org.infinispan.scattered.impl.ScatteredVersionManagerImpl.lambda$valuesFuture$3(ScatteredVersionManagerImpl.java:348)
>     at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1106)
>     at java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235)
>     at org.infinispan@9.4.16.Final//org.infinispan.scattered.impl.ScatteredVersionManagerImpl.valuesFuture(ScatteredVersionManagerImpl.java:348)
>     at org.infinispan@9.4.16.Final//org.infinispan.scattered.impl.ScatteredVersionManagerImpl.lambda$valuesFuture$3(ScatteredVersionManagerImpl.java:348)
>     at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1106)
>     at java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235)
>     at org.infinispan@9.4.16.Final//org.infinispan.scattered.impl.ScatteredVersionManagerImpl.valuesFuture(ScatteredVersionManagerImpl.java:348)
>     at org.infinispan@9.4.16.Final//org.infinispan.scattered.impl.ScatteredVersionManagerImpl.lambda$valuesFuture$3(ScatteredVersionManagerImpl.java:348)
>     at java.base/java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:1106)
>     at java.base/java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2235)
>     ...
> {noformat}

--
This message was sent by Atlassian Jira (v7.13.8#713008)
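
The repeating frames at the bottom of the trace above show `valuesFuture` re-entering itself through `thenCompose`. When `thenCompose` is called on an already-completed `CompletableFuture`, the supplied function runs inline on the calling thread, so a method that recurses this way gains stack depth with every step. The sketch below illustrates that general pattern and an iterative alternative; it is an assumption-laden illustration with hypothetical names and does not reproduce or diagnose the Infinispan code.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch of the call pattern only; not Infinispan code.
final class ComposeDepthSketch {

    // Recursive shape: each already-completed stage invokes the next call inline,
    // so the stack grows by a few frames per remaining step.
    static CompletableFuture<Integer> recursive(int remaining) {
        if (remaining == 0)
            return CompletableFuture.completedFuture(0);
        return CompletableFuture.completedFuture(remaining)
                .thenCompose(ignored -> recursive(remaining - 1));
    }

    // Iterative shape: every thenCompose returns before the next one starts,
    // so the stack depth stays constant regardless of the number of steps.
    static CompletableFuture<Integer> iterative(int steps) {
        CompletableFuture<Integer> result = CompletableFuture.completedFuture(0);
        for (int i = 0; i < steps; i++)
            result = result.thenCompose(count -> CompletableFuture.completedFuture(count + 1));
        return result;
    }
}
```

For small depths both shapes complete normally. With a large enough `remaining`, the recursive shape is limited by the thread's stack size, while the iterative shape is not.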
