[JBoss JIRA] (ISPN-9008) RpcManager.invokeCommandOnAll ignores cache member that are not in the cluster view
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9008?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9008:
-------------------------------
Sprint: Sprint 9.3.0.Beta1
> RpcManager.invokeCommandOnAll ignores cache member that are not in the cluster view
> -----------------------------------------------------------------------------------
>
> Key: ISPN-9008
> URL: https://issues.jboss.org/browse/ISPN-9008
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.3.0.Beta1
>
>
> {{RpcManager.invokeCommandOnAll}} broadcasts the command to all the members of the JGroups cluster view, and the {{ResponseCollector}} is not notified if any members of the cache are not in the cluster view.
> This is a problem in replicated caches with partition handling enabled, because it means a write can succeed in a minority partition in the time interval between {{JGroupsTransport}} seeing the minority cluster view and {{DistributionManagerImpl}} installing the {{DEGRADED_MODE}} cache topology.
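The gap described above can be illustrated with a small model. This is only a sketch under stated assumptions, not Infinispan code: the class `BroadcastModel`, its methods, and the use of plain strings as node addresses are all hypothetical. It shows how the cache members that a view-only broadcast never reaches could be computed, so that a collector could be told about them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model; the real classes are RpcManager and ResponseCollector.
class BroadcastModel {
    /** Cache members that a broadcast to the cluster view would never reach. */
    static Set<String> missingFromView(List<String> cacheMembers, Set<String> clusterView) {
        Set<String> missing = new LinkedHashSet<>(cacheMembers);
        missing.removeAll(clusterView);
        return missing;
    }

    /** Assumed rule for this sketch: a broadcast is complete only if no cache member was skipped. */
    static boolean allMembersReachable(List<String> cacheMembers, Set<String> clusterView) {
        return missingFromView(cacheMembers, clusterView).isEmpty();
    }

    public static void main(String[] args) {
        List<String> cacheMembers = Arrays.asList("A", "B", "C");
        // Minority partition: the transport already sees only A, the cache topology still lists A, B, C.
        Set<String> minorityView = new HashSet<>(Arrays.asList("A"));
        System.out.println("Skipped members: " + missingFromView(cacheMembers, minorityView));
        System.out.println("Reachable: " + allMembersReachable(cacheMembers, minorityView));
    }
}
```

In the minority-partition window from the description, the skipped set is non-empty even though every node in the view has answered.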
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9062) JGroupsTransport should only send messages to nodes in the cluster view
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9062?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9062:
-------------------------------
Sprint: Sprint 9.3.0.Beta1
> JGroupsTransport should only send messages to nodes in the cluster view
> -----------------------------------------------------------------------
>
> Key: ISPN-9062
> URL: https://issues.jboss.org/browse/ISPN-9062
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.2.2.Final, 9.3.0.Alpha1
>
>
> {{JGroupsTransport}} only waits for responses from nodes in the JGroups cluster view, but it still sends messages to all the nodes specified as a target. The idea was to optimize the common case by avoiding a {{HashSet.contains()}} call.
> However, when a node is not in the view, messages to it still pass through the entire JGroups stack, and UNICAST3 keeps those messages in a send table for a long time ({{UNICAST3.conn_expiry_timeout}}, changed with ISPN-9038 from {{0}} (unlimited) to 2 minutes (JGroups default)). Having messages queued for a potentially unlimited number of non-members, each with its own send table, makes it much harder to estimate memory usage.
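A minimal sketch of the filtering the title asks for, assuming node addresses are plain strings and the current view is available as a set. `ViewFilter` and `targetsInView` are hypothetical names, not part of `JGroupsTransport`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only.
class ViewFilter {
    /** Keeps only the targets that are in the current view, preserving their order. */
    static List<String> targetsInView(Collection<String> targets, Set<String> view) {
        List<String> result = new ArrayList<>(targets.size());
        for (String target : targets) {
            // This is the contains() lookup that the original optimization tried to avoid.
            if (view.contains(target)) {
                result.add(target);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> view = new HashSet<>(Arrays.asList("A", "B"));
        // C has left the view, so nothing is sent to it and no send table entry is needed for it.
        System.out.println(targetsInView(Arrays.asList("A", "C", "B"), view));
    }
}
```

The trade-off is one hash lookup per target in exchange for never handing a message for a non-member to the JGroups stack.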
--
[JBoss JIRA] (ISPN-8962) PreferAvailabilityStrategy: Rely less on the stable topology
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-8962?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-8962:
-------------------------------
Sprint: Sprint 9.3.0.Beta1
> PreferAvailabilityStrategy: Rely less on the stable topology
> ------------------------------------------------------------
>
> Key: ISPN-8962
> URL: https://issues.jboss.org/browse/ISPN-8962
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.2.2.Final, 9.3.0.Beta1
>
>
> {{PreferAvailabilityStrategy}} checks the size of the stable topology, and only considers cache topologies that are derived from the biggest topology (in size) when picking a post-merge topology.
> Unfortunately, in some situations this algorithm fails badly. If a node has a very long GC pause, when it comes back it will report the old topology *and* the old stable topology. If the rest of the cluster rebalanced in the meantime, the rest of the cluster now has both a smaller current topology and a smaller stable topology than the stale node.
> Furthermore, the stable topology is updated asynchronously, independent from the current topology. So even if there's a split and the minority partition installs a current topology with fewer members, it may take some time for its stable topology to be updated with fewer members. In fact, it appears that when a rebalance is not needed (e.g. because the partition has a single node), the stable topology is never updated!
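The GC-pause scenario can be reproduced with a deliberately simplified model of the selection rule as the description states it ("prefer the biggest stable topology"). This is an assumption-laden sketch: `StableTopologyModel`, `Report`, and `pickByLargestStable` are hypothetical names, and the real `PreferAvailabilityStrategy` is more involved.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model of "pick the report with the biggest stable topology".
class StableTopologyModel {
    static final class Report {
        final String sender;
        final List<String> currentMembers;
        final List<String> stableMembers;

        Report(String sender, List<String> currentMembers, List<String> stableMembers) {
            this.sender = sender;
            this.currentMembers = currentMembers;
            this.stableMembers = stableMembers;
        }
    }

    static Report pickByLargestStable(List<Report> reports) {
        return Collections.max(reports, Comparator.comparingInt((Report r) -> r.stableMembers.size()));
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("A", "B", "C", "D");
        List<String> after = Arrays.asList("A", "B", "C");
        // A, B and C rebalanced while D was paused, so their topologies shrank.
        Report fromA = new Report("A", after, after);
        // D comes back from its GC pause still reporting the old, larger topologies.
        Report fromD = new Report("D", before, before);
        System.out.println("Preferred report comes from: "
                + pickByLargestStable(Arrays.asList(fromA, fromD)).sender);
    }
}
```

Under this rule the stale report from the paused node is preferred over the up-to-date one, which is the failure mode the issue describes.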
--
[JBoss JIRA] (ISPN-9087) Timeout during put operation when a node is blocked
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-9087?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9087:
-------------------------------
Sprint: Sprint 9.3.0.Beta1
> Timeout during put operation when a node is blocked
> ---------------------------------------------------
>
> Key: ISPN-9087
> URL: https://issues.jboss.org/browse/ISPN-9087
> Project: Infinispan
> Issue Type: Bug
> Reporter: Diego Lovison
> Assignee: Dan Berindei
>
> {noformat}
> 2018-04-17 13:30:02.782 ERROR 14932 --- [timeout-thread--p3-t1] o.i.i.impl.InvocationContextInterceptor : ISPN000136: Error executing command PutKeyValueCommand, writing keys [5db796a3-3f65-468a-b86a-6d5ef8b4b330]
>
> org.infinispan.util.concurrent.TimeoutException: ISPN000427: Timeout after 15 seconds waiting for acks. Id=100000
> at org.infinispan.util.concurrent.CommandAckCollector.createTimeoutException(CommandAckCollector.java:188) ~[infinispan-embedded-8.5.0.Final-redhat-6.jar:8.5.0.Final-redhat-6]
> at org.infinispan.util.concurrent.CommandAckCollector.access$300(CommandAckCollector.java:51) ~[infinispan-embedded-8.5.0.Final-redhat-6.jar:8.5.0.Final-redhat-6]
> at org.infinispan.util.concurrent.CommandAckCollector$BaseCollector.call(CommandAckCollector.java:214) [infinispan-embedded-8.5.0.Final-redhat-6.jar:8.5.0.Final-redhat-6]
> at org.infinispan.util.concurrent.CommandAckCollector$BaseCollector.call(CommandAckCollector.java:191) [infinispan-embedded-8.5.0.Final-redhat-6.jar:8.5.0.Final-redhat-6]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_161]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_161]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_161]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> {noformat}
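> After some investigation together with [~dan.berindei], we found that failure detection with {{FD_ALL}} is too slow in this configuration: a blocked node is only suspected after the 60-second {{timeout}}, well beyond the 15-second ack timeout in the log above.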
> {noformat}
> <FD_ALL timeout="60000"
> interval="15000"
> timeout_check_interval="5000"
> />
> {noformat}
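The mismatch can be checked with simple arithmetic. The sketch below assumes that a silent node is suspected on the first periodic check after `timeout` milliseconds without a heartbeat, so the worst case is roughly `timeout + timeout_check_interval`; the class and method names are hypothetical.

```java
// Hypothetical helper; the values come from the FD_ALL configuration and the ISPN000427 message above.
class FailureDetectionMath {
    /** Assumed worst case: the heartbeat timeout expires just after a periodic check has run. */
    static long worstCaseDetectionMillis(long timeoutMillis, long checkIntervalMillis) {
        return timeoutMillis + checkIntervalMillis;
    }

    static boolean suspectedBeforeAckTimeout(long timeoutMillis, long checkIntervalMillis, long ackTimeoutMillis) {
        return worstCaseDetectionMillis(timeoutMillis, checkIntervalMillis) <= ackTimeoutMillis;
    }

    public static void main(String[] args) {
        long timeout = 60_000;        // FD_ALL timeout
        long checkInterval = 5_000;   // FD_ALL timeout_check_interval
        long ackTimeout = 15_000;     // "Timeout after 15 seconds waiting for acks"
        System.out.println("Worst-case detection (ms): " + worstCaseDetectionMillis(timeout, checkInterval));
        System.out.println("Suspected before the put times out: "
                + suspectedBeforeAckTimeout(timeout, checkInterval, ackTimeout));
    }
}
```

Even in the best case under this assumption, detection takes at least the 60-second `timeout`, so the put gives up long before the blocked node is removed from the view.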
--
[JBoss JIRA] (ISPN-8981) Generate Hot Rod parser automatically
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-8981?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-8981:
----------------------------------
Sprint: (was: Sprint 9.3.0.Beta1)
> Generate Hot Rod parser automatically
> -------------------------------------
>
> Key: ISPN-8981
> URL: https://issues.jboss.org/browse/ISPN-8981
> Project: Infinispan
> Issue Type: Enhancement
> Components: Server
> Affects Versions: 9.2.0.Final
> Reporter: Radim Vansa
> Assignee: Radim Vansa
> Fix For: 9.3.0.Final
>
>
> This JIRA has two objectives:
> 1. reduce number of allocated objects
> 2. improve the parsing on server side to avoid chains of lambda mappings
> Manual parsing of the Hot Rod protocol, invoking recursive methods that return {{Optional}}s or {{Optional<Optional<...>>}}s, seems to generate a lot of garbage. A better approach would be a finite state automaton that reads the byte stream and invokes callbacks.
> Such an automaton can be generated from a high-level grammar as part of the build process.
> Along with these changes we can remove the {{Response}} abstraction and write responses directly as {{ByteBuf}}s.
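To make the automaton idea concrete, here is a hand-written sketch for a toy frame format. The format (one magic byte `0x7E`, one length byte, then the payload) is invented for this example and is not the Hot Rod wire format; `FrameAutomaton` is a hypothetical name, and the proposal is to generate such code from a grammar rather than write it by hand.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Toy format, NOT Hot Rod: [magic 0x7E][length 0..255][payload bytes].
class FrameAutomaton {
    private enum State { MAGIC, LENGTH, PAYLOAD }

    static final int MAGIC = 0x7E;

    private final Consumer<byte[]> onFrame;
    private State state = State.MAGIC;
    private byte[] payload;
    private int filled;

    FrameAutomaton(Consumer<byte[]> onFrame) {
        this.onFrame = onFrame;
    }

    /** Consumes one byte and moves to the next state; no Optional wrappers, no recursion. */
    void feed(byte b) {
        switch (state) {
            case MAGIC:
                if ((b & 0xFF) != MAGIC) {
                    throw new IllegalStateException("bad magic byte");
                }
                state = State.LENGTH;
                break;
            case LENGTH:
                payload = new byte[b & 0xFF];
                filled = 0;
                if (payload.length == 0) {
                    emit();
                } else {
                    state = State.PAYLOAD;
                }
                break;
            case PAYLOAD:
                payload[filled++] = b;
                if (filled == payload.length) {
                    emit();
                }
                break;
        }
    }

    /** Frames may be split across chunks; the state survives between calls. */
    void feed(byte[] chunk) {
        for (byte b : chunk) {
            feed(b);
        }
    }

    private void emit() {
        onFrame.accept(payload);
        state = State.MAGIC;
    }

    public static void main(String[] args) {
        FrameAutomaton automaton = new FrameAutomaton(
                frame -> System.out.println("frame: " + new String(frame, StandardCharsets.US_ASCII)));
        automaton.feed(new byte[] {0x7E, 2, 'h'});   // frame split across two reads
        automaton.feed(new byte[] {'i', 0x7E, 0});   // rest of the first frame, then an empty frame
    }
}
```

The only per-frame allocation in this sketch is the payload array; the parser state itself lives in three fields.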
--
[JBoss JIRA] (ISPN-9094) ArrayIndexOutOfBoundsException on server using scattered cache
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-9094?page=com.atlassian.jira.plugin.... ]
Radim Vansa updated ISPN-9094:
------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/5937
> ArrayIndexOutOfBoundsException on server using scattered cache
> ---------------------------------------------------------------
>
> Key: ISPN-9094
> URL: https://issues.jboss.org/browse/ISPN-9094
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.1.Final
> Reporter: Paul Ferraro
> Assignee: Radim Vansa
>
> We hit an ArrayIndexOutOfBoundsException when running tests for RFE EAP7-867.
> The EAP distribution was built from {{https://github.com/pferraro/wildfly/tree/scattered}}.
> Test description: positive stress test (no failover) on a 4-node EAP cluster, starting with 400 clients and raising the number of clients to 6000 by the end of the test.
> The error occurred on server dev215 around the 7th iteration (this can be seen in the performance report, link below):
> {code}
> [JBossINF] 04:26:11,708 ERROR [stderr] (transport-thread--p15-t25) Exception in thread "transport-thread--p15-t25" java.lang.ArrayIndexOutOfBoundsException: 129
> [JBossINF] 04:26:11,708 ERROR [stderr] (transport-thread--p15-t25) at org.infinispan.scattered.impl.ScatteredVersionManagerImpl.lambda$tryRegularInvalidations$4(ScatteredVersionManagerImpl.java:413)
> [JBossINF] 04:26:11,708 ERROR [stderr] (transport-thread--p15-t25) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [JBossINF] 04:26:11,708 ERROR [stderr] (transport-thread--p15-t25) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [JBossINF] 04:26:11,708 ERROR [stderr] (transport-thread--p15-t25) at org.wildfly.clustering.service.concurrent.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:47)
> [JBossINF] 04:26:11,708 ERROR [stderr] (transport-thread--p15-t25) at java.lang.Thread.run(Thread.java:748)
> {code}
> Clients were getting "SocketTimeoutException: Read timed out" exceptions both before and after the ArrayIndexOutOfBoundsException occurred.
> Performance report (accessible only when connected to VPN):
> http://download.eng.brq.redhat.com/scratch/mvinkler/reports/2018-04-19_15...
> One can observe that CPU usage and network usage on dev215 dropped after the 7th iteration.
> dev215 server log link:
> https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/job/eap-7x-stress-se...
--
[JBoss JIRA] (ISPN-9099) Staggered remote get throws IllegalStateException
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-9099?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-9099:
-------------------------------
Sprint: Sprint 9.3.0.Beta1
> Staggered remote get throws IllegalStateException
> -------------------------------------------------
>
> Key: ISPN-9099
> URL: https://issues.jboss.org/browse/ISPN-9099
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.3.0.Beta1
>
>
> {{StaggeredRequest.sendNextMessage()}} throws {{IllegalStateException}} if the last target has left the cluster and the next-to-last target didn't reply within the stagger timeout. This causes a random failure in {{TwoWaySplitAndMergeTest}}:
> {noformat}
> 12:29:10,960 ERROR (testng-TwoWaySplitAndMergeTest[DIST_SYNC]:[]) [TestSuiteProgress] Test failed: org.infinispan.partitionhandling.TwoWaySplitAndMergeTest.testSplitAndMerge4[DIST_SYNC]
> org.infinispan.commons.CacheException: java.lang.IllegalStateException: Request should have been completed already.
> at org.infinispan.interceptors.impl.InvocationContextInterceptor.rethrowException(InvocationContextInterceptor.java:134) ~[classes/:?]
> at org.infinispan.interceptors.impl.InvocationContextInterceptor.lambda$new$0(InvocationContextInterceptor.java:62) ~[classes/:?]
> at org.infinispan.interceptors.InvocationExceptionFunction.apply(InvocationExceptionFunction.java:21) ~[classes/:?]
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:118) ~[classes/:?]
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:81) ~[classes/:?]
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:30) ~[classes/:?]
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) ~[?:1.8.0_171]
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) ~[?:1.8.0_171]
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_171]
> at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_171]
> at org.infinispan.remoting.transport.AbstractRequest.completeExceptionally(AbstractRequest.java:74) ~[classes/:?]
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.sendNextMessage(StaggeredRequest.java:106) ~[classes/:?]
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onTimeout(StaggeredRequest.java:66) ~[classes/:?]
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) ~[classes/:?]
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) ~[classes/:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_171]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
> Suppressed: java.util.concurrent.ExecutionException: org.infinispan.commons.CacheException: java.lang.IllegalStateException: Request should have been completed already.
> at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_171]
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) ~[?:1.8.0_171]
> at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82) ~[classes/:?]
> at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:37) ~[classes/:?]
> at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250) ~[classes/:?]
> at org.infinispan.cache.impl.CacheImpl.get(CacheImpl.java:485) ~[classes/:?]
> at org.infinispan.cache.impl.CacheImpl.get(CacheImpl.java:478) ~[classes/:?]
> at org.infinispan.cache.impl.AbstractDelegatingCache.get(AbstractDelegatingCache.java:348) ~[classes/:?]
> at org.infinispan.cache.impl.EncoderCache.get(EncoderCache.java:658) ~[classes/:?]
> at org.infinispan.partitionhandling.BasePartitionHandlingTest.assertKeyAvailableForRead(BasePartitionHandlingTest.java:396) ~[test-classes/:?]
> at org.infinispan.partitionhandling.BasePartitionHandlingTest$Partition.assertKeyAvailableForRead(BasePartitionHandlingTest.java:325) ~[test-classes/:?]
> at org.infinispan.partitionhandling.TwoWaySplitAndMergeTest.lambda$testSplitAndMerge$1(TwoWaySplitAndMergeTest.java:96) ~[test-classes/:?]
> at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110) ~[?:1.8.0_171]
> at java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:557) ~[?:1.8.0_171]
> at org.infinispan.partitionhandling.TwoWaySplitAndMergeTest.testSplitAndMerge(TwoWaySplitAndMergeTest.java:95) ~[test-classes/:?]
> at org.infinispan.partitionhandling.TwoWaySplitAndMergeTest.testSplitAndMerge4(TwoWaySplitAndMergeTest.java:43) ~[test-classes/:?]
> Caused by: org.infinispan.commons.CacheException: java.lang.IllegalStateException: Request should have been completed already.
> at org.infinispan.interceptors.impl.InvocationContextInterceptor.rethrowException(InvocationContextInterceptor.java:134) ~[classes/:?]
> at org.infinispan.interceptors.impl.InvocationContextInterceptor.lambda$new$0(InvocationContextInterceptor.java:62) ~[classes/:?]
> at org.infinispan.interceptors.InvocationExceptionFunction.apply(InvocationExceptionFunction.java:21) ~[classes/:?]
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:118) ~[classes/:?]
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:81) ~[classes/:?]
> at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:30) ~[classes/:?]
> at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) ~[?:1.8.0_171]
> at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) ~[?:1.8.0_171]
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_171]
> at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[?:1.8.0_171]
> at org.infinispan.remoting.transport.AbstractRequest.completeExceptionally(AbstractRequest.java:74) ~[classes/:?]
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.sendNextMessage(StaggeredRequest.java:106) ~[classes/:?]
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onTimeout(StaggeredRequest.java:66) ~[classes/:?]
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87) ~[classes/:?]
> at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22) ~[classes/:?]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_171]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_171]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_171]
> ... 3 more
> Caused by: java.lang.IllegalStateException: Request should have been completed already.
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.sendNextMessage(StaggeredRequest.java:88) ~[classes/:?]
> ... 9 more
> {noformat}
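The failing case is the one where the stagger timer fires and no remaining target is still in the cluster view. A hedged sketch of handling that case explicitly instead of throwing follows; it is not the actual fix, and `StaggerModel` and `nextLiveTarget` are hypothetical names, with node addresses modelled as strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of choosing the next staggered target; not the StaggeredRequest implementation.
class StaggerModel {
    /** Index of the first target at or after 'from' that is still in the view, or -1 if there is none. */
    static int nextLiveTarget(List<String> targets, int from, Set<String> view) {
        for (int i = Math.max(from, 0); i < targets.size(); i++) {
            if (view.contains(targets.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> targets = Arrays.asList("B", "C");
        // C, the last target, has left the cluster; B was contacted first and has not replied yet.
        Set<String> view = new HashSet<>(Arrays.asList("A", "B"));
        int next = nextLiveTarget(targets, 1, view);
        if (next < 0) {
            // Nothing left to send: keep waiting for B's reply (or the overall timeout) instead of failing.
            System.out.println("No further targets, still waiting for earlier replies");
        } else {
            System.out.println("Sending to " + targets.get(next));
        }
    }
}
```

Returning a sentinel makes "no target left, but an earlier request is still pending" an ordinary outcome rather than an illegal state.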
--
[JBoss JIRA] (ISPN-9099) Staggered remote get throws IllegalStateException
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-9099?page=com.atlassian.jira.plugin.... ]
Ryan Emerson resolved ISPN-9099.
--------------------------------
Resolution: Done
> Staggered remote get throws IllegalStateException
> -------------------------------------------------
>
> Key: ISPN-9099
> URL: https://issues.jboss.org/browse/ISPN-9099
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.3.0.Beta1
>
>
> {{StaggeredRequest.sendNextMessage()}} throws {{IllegalStateException}} if the last target has left the cluster and the next-to-last target didn't reply within the stagger timeout. This causes a random failure in {{TwoWaySplitAndMergeTest}}.
--
[JBoss JIRA] (ISPN-8962) PreferAvailabilityStrategy: Rely less on the stable topology
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8962?page=com.atlassian.jira.plugin.... ]
Ryan Emerson resolved ISPN-8962.
--------------------------------
Resolution: Done
> PreferAvailabilityStrategy: Rely less on the stable topology
> ------------------------------------------------------------
>
> Key: ISPN-8962
> URL: https://issues.jboss.org/browse/ISPN-8962
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.2.2.Final, 9.3.0.Beta1
>
>
> {{PreferAvailabilityStrategy}} checks the size of the stable topology, and only considers cache topologies that are derived from the biggest topology (in size) when picking a post-merge topology.
> Unfortunately, in some situations this algorithm fails badly. If a node has a very long GC pause, when it comes back it will report the old topology *and* the old stable topology. If the rest of the cluster rebalanced in the meantime, the rest of the cluster now has both a smaller current topology and a smaller stable topology than the stale node.
> Furthermore, the stable topology is updated asynchronously, independent from the current topology. So even if there's a split and the minority partition installs a current topology with fewer members, it may take some time for its stable topology to be updated with fewer members. In fact, it appears that when a rebalance is not needed (e.g. because the partition has a single node), the stable topology is never updated!
--