[JBoss JIRA] (ISPN-5241) Cache topology updates should use the NO_FC flag
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5241?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5241:
----------------------------------
Fix Version/s: 9.0.0.CR2
(was: 9.0.0.CR1)
> Cache topology updates should use the NO_FC flag
> ------------------------------------------------
>
> Key: ISPN-5241
> URL: https://issues.jboss.org/browse/ISPN-5241
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.1.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 9.0.0.CR2
>
>
> Topology updates are sent while holding the ClusterCacheStatus lock, so they should never block. However, when MFC is present, the topology update can block waiting for enough credits. As most CacheTopologyControlCommands need to acquire the ClusterCacheStatus lock, this can easily lead to a full remote-executor pool (and OOB pool) and the appearance of a deadlock.
> What's more, if one node is not responsive, it can block all the other nodes from receiving further topology updates. Topology updates should be as prompt as possible, so we should use the NO_FC flag to ensure that each node receives topology updates as soon as possible.
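The blocking behaviour described above can be illustrated with a toy credit gate. This is only a sketch of the idea, not the JGroups MFC implementation: the `CreditGateSketch` class, its `Flag` enum and the `trySend` method are hypothetical names, and a `false` return stands in for the point where a real sender would block waiting for credits.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.Semaphore;

/** Toy model of credit-based flow control; NOT the JGroups MFC protocol. */
final class CreditGateSketch {
    /** Hypothetical flag, loosely mirroring the NO_FC idea from the issue. */
    enum Flag { NO_FC }

    // Remaining send credits (assumption: one credit per byte).
    private final Semaphore credits;

    CreditGateSketch(int initialCredits) {
        this.credits = new Semaphore(initialCredits);
    }

    /**
     * Returns true if the message can go out immediately.
     * A false result models the case where a real sender would block.
     */
    boolean trySend(int size, Set<Flag> flags) {
        if (flags.contains(Flag.NO_FC)) {
            return true; // bypasses the gate: no credits consumed, never waits
        }
        return credits.tryAcquire(size);
    }

    /** Models the receiver handing credits back to the sender. */
    void replenish(int amount) {
        credits.release(amount);
    }

    public static void main(String[] args) {
        CreditGateSketch gate = new CreditGateSketch(100);
        Set<Flag> none = EnumSet.noneOf(Flag.class);
        System.out.println(gate.trySend(100, none));                  // true: uses all credits
        System.out.println(gate.trySend(10, none));                   // false: would block
        System.out.println(gate.trySend(10, EnumSet.of(Flag.NO_FC))); // true: skips flow control
    }
}
```

In this model a topology update tagged with the flag never waits on an unresponsive receiver, which is the property the issue asks for while the ClusterCacheStatus lock is held.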
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (ISPN-5498) ClientClusterFailoverEventsTest.testEventReplayWithAndWithoutStateAfterFailover random failures
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5498?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5498:
----------------------------------
Fix Version/s: 9.0.0.CR2
(was: 9.0.0.CR1)
> ClientClusterFailoverEventsTest.testEventReplayWithAndWithoutStateAfterFailover random failures
> -----------------------------------------------------------------------------------------------
>
> Key: ISPN-5498
> URL: https://issues.jboss.org/browse/ISPN-5498
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Server
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 9.0.0.CR2
>
>
> It appears the cluster event notifier wants to send some events to a node that no longer exists, and the exception is propagated all the way to the HR client:
> {noformat}
> 12:04:03,088 ERROR (HotRodServerWorker-78-1:) [InvocationContextInterceptor] ISPN000136: Execution error
> java.lang.IllegalArgumentException: Target node ClientClusterFailoverEventsTest-NodeA-33510 is not a cluster member, members are [ClientClusterFailoverEventsTest-NodeB-14182]
> at org.infinispan.distexec.DefaultExecutorService.submit(DefaultExecutorService.java:414)
> at org.infinispan.distexec.DefaultExecutorService.submit(DefaultExecutorService.java:403)
> at org.infinispan.distexec.DistributedExecutionCompletionService.submit(DistributedExecutionCompletionService.java:178)
> at org.infinispan.notifications.cachelistener.cluster.impl.BatchingClusterEventManagerImpl$UnicastEventContext.sendToTargets(BatchingClusterEventManagerImpl.java:97)
> at org.infinispan.notifications.cachelistener.cluster.impl.BatchingClusterEventManagerImpl.sendEvents(BatchingClusterEventManagerImpl.java:50)
> at org.infinispan.notifications.cachelistener.CacheNotifierImpl.notifyCacheEntryCreated(CacheNotifierImpl.java:292)
> at org.infinispan.interceptors.locking.ClusteringDependentLogic$AbstractClusteringDependentLogic.notifyCommitEntry(ClusteringDependentLogic.java:138)
> at org.infinispan.interceptors.locking.ClusteringDependentLogic$DistributionLogic.commitSingleEntry(ClusteringDependentLogic.java:493)
> at org.infinispan.interceptors.locking.ClusteringDependentLogic$AbstractClusteringDependentLogic.commitEntry(ClusteringDependentLogic.java:108)
> at org.infinispan.interceptors.EntryWrappingInterceptor.commitContextEntry(EntryWrappingInterceptor.java:367)
> at org.infinispan.interceptors.EntryWrappingInterceptor.commitEntryIfNeeded(EntryWrappingInterceptor.java:545)
> at org.infinispan.interceptors.EntryWrappingInterceptor.commitContextEntries(EntryWrappingInterceptor.java:344)
> at org.infinispan.interceptors.EntryWrappingInterceptor.invokeNextAndApplyChanges(EntryWrappingInterceptor.java:418)
> at org.infinispan.interceptors.EntryWrappingInterceptor.setSkipRemoteGetsAndInvokeNextForDataCommand(EntryWrappingInterceptor.java:449)
> at org.infinispan.interceptors.EntryWrappingInterceptor.visitPutKeyValueCommand(EntryWrappingInterceptor.java:195)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97)
> at org.infinispan.interceptors.locking.AbstractLockingInterceptor.visitNonTxDataWriteCommand(AbstractLockingInterceptor.java:88)
> at org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor.visitDataWriteCommand(NonTransactionalLockingInterceptor.java:40)
> at org.infinispan.interceptors.locking.AbstractLockingInterceptor.visitPutKeyValueCommand(AbstractLockingInterceptor.java:55)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97)
> at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:111)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:44)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97)
> at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:324)
> at org.infinispan.statetransfer.StateTransferInterceptor.handleWriteCommand(StateTransferInterceptor.java:256)
> at org.infinispan.statetransfer.StateTransferInterceptor.visitPutKeyValueCommand(StateTransferInterceptor.java:115)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97)
> at org.infinispan.interceptors.CacheMgmtInterceptor.updateStoreStatistics(CacheMgmtInterceptor.java:191)
> at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:177)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:97)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:102)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:71)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:44)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336)
> at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1617)
> at org.infinispan.cache.impl.CacheImpl.putInternal(CacheImpl.java:1097)
> at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1089)
> at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:522)
> at org.infinispan.server.hotrod.CacheDecodeContext.put(CacheDecodeContext.scala:216)
> at org.infinispan.server.hotrod.HotRodDecoder.org$infinispan$server$hotrod$HotRodDecoder$$decodeValue(HotRodDecoder.scala:132)
> at org.infinispan.server.hotrod.HotRodDecoder$$anonfun$decode$1.apply$mcV$sp(HotRodDecoder.scala:50)
> at org.infinispan.server.hotrod.HotRodDecoder.wrapSecurity(HotRodDecoder.scala:208)
> at org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.scala:45)
> org.infinispan.client.hotrod.exceptions.HotRodClientException:Request for message id[922] returned server error (status=0x85): java.lang.IllegalArgumentException: Target node ClientClusterFailoverEventsTest-NodeA-33510 is not a cluster member, members are [ClientClusterFailoverEventsTest-NodeB-14182]
> at org.infinispan.client.hotrod.impl.protocol.Codec20.checkForErrorsInResponseStatus(Codec20.java:336)
> at org.infinispan.client.hotrod.impl.protocol.Codec20.readPartialHeader(Codec20.java:126)
> at org.infinispan.client.hotrod.impl.protocol.Codec20.readHeader(Codec20.java:112)
> at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:56)
> at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:57)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:31)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:20)
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:52)
> at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:247)
> at org.infinispan.client.hotrod.impl.RemoteCacheSupport.put(RemoteCacheSupport.java:79)
> at org.infinispan.client.hotrod.event.ClientClusterFailoverEventsTest.testEventReplayWithAndWithoutStateAfterFailover(ClientClusterFailoverEventsTest.java:59)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
--
[JBoss JIRA] (ISPN-5475) Narayana should be configured to use a volatile store by default
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5475?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5475:
----------------------------------
Fix Version/s: 9.0.0.CR2
(was: 9.0.0.CR1)
> Narayana should be configured to use a volatile store by default
> ----------------------------------------------------------------
>
> Key: ISPN-5475
> URL: https://issues.jboss.org/browse/ISPN-5475
> Project: Infinispan
> Issue Type: Task
> Components: Test Suite - Core
> Affects Versions: 7.2.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.0.0.CR2
>
>
> The {{jbossts-properties.xml}} configuration file in the core module configures a file store by default, and tests have to call {{TestCacheManagerFactory.markAsTransactional()}} (or one of the methods that calls it) to configure a volatile store instead.
> Furthermore, the {{jbossts-properties.xml}} file is explicitly filtered out of the core tests jar, so other modules can't use it.
--
[JBoss JIRA] (ISPN-5584) Support fine-grained write skew check for FineGrainedAtomicMap entries
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5584?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5584:
----------------------------------
Fix Version/s: 9.0.0.CR2
(was: 9.0.0.CR1)
> Support fine-grained write skew check for FineGrainedAtomicMap entries
> ----------------------------------------------------------------------
>
> Key: ISPN-5584
> URL: https://issues.jboss.org/browse/ISPN-5584
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 8.0.0.Alpha2, 7.2.3.Final
> Reporter: Dan Berindei
> Fix For: 9.0.0.CR2
>
>
> FineGrainedAtomicMap doesn't currently work with write skew check enabled.
> I was able to make it work by adding a special case for DeltaAwareCacheEntry in WriteSkewHelper. However, the map has a single version, so the write skew check fails if any of the sub-keys were modified in parallel. With pessimistic locking, fine-grained maps allow the user to modify different sub-keys concurrently; we should allow the same with optimistic locking.
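To make the request concrete, here is a minimal sketch of a write skew check that keeps one version per sub-key instead of one version for the whole map. It is a toy model and not Infinispan's WriteSkewHelper: the `SubKeyWriteSkewSketch` class, its methods and the "absent means version 0" convention are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy per-sub-key write skew check; NOT Infinispan's WriteSkewHelper. */
final class SubKeyWriteSkewSketch {
    // Last committed version of each sub-key (assumption: absent means version 0).
    private final Map<String, Long> committed = new HashMap<>();

    synchronized long versionOf(String subKey) {
        return committed.getOrDefault(subKey, 0L);
    }

    /**
     * versionsRead maps each sub-key the transaction writes to the version it read.
     * Commits, bumping those versions, only if none of them changed in the meantime.
     */
    synchronized boolean tryCommit(Map<String, Long> versionsRead) {
        for (Map.Entry<String, Long> e : versionsRead.entrySet()) {
            if (versionOf(e.getKey()) != e.getValue()) {
                return false; // write skew detected on this sub-key
            }
        }
        for (String subKey : versionsRead.keySet()) {
            committed.merge(subKey, 1L, Long::sum);
        }
        return true;
    }

    public static void main(String[] args) {
        SubKeyWriteSkewSketch map = new SubKeyWriteSkewSketch();
        // Two transactions that read different sub-keys at version 0 both commit.
        System.out.println(map.tryCommit(Collections.singletonMap("a", 0L))); // true
        System.out.println(map.tryCommit(Collections.singletonMap("b", 0L))); // true
        // A transaction holding a stale version of "a" is rejected.
        System.out.println(map.tryCommit(Collections.singletonMap("a", 0L))); // false
    }
}
```

With a single map-wide version, the second commit above would already fail, which is the behaviour the issue describes.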
--
[JBoss JIRA] (ISPN-5570) Cross-site: retry backup commands
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5570?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5570:
----------------------------------
Fix Version/s: 9.0.0.CR2
(was: 9.0.0.CR1)
> Cross-site: retry backup commands
> ---------------------------------
>
> Key: ISPN-5570
> URL: https://issues.jboss.org/browse/ISPN-5570
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Cross-Site Replication
> Affects Versions: 7.2.3.Final
> Reporter: Dan Berindei
> Fix For: 9.0.0.CR2
>
>
> There are 3 phases in a backup RPC:
> 1. Sender -> Local site master: can fail because the site master is shutting down or crashing, or because of a network split.
> 2. Local site master -> Remote site master:
> 2.1. Local site master is no longer a site master, e.g. because it's shutting down or because it's no longer coordinator after a merge.
> 2.2. Remote site master is no longer a site master.
> 2.3. Link between local site and remote site is down.
> 3. Remote site master -> Backup targets
> Replication failures in phase 3 are handled by retrying (except for TimeoutExceptions), because {{BaseBackupReceiver}} uses regular cache methods to perform the updates.
> But replication failures in phases 1 and 2 are not handled in any way, except for causing the remote site to be taken offline after a certain number of replication failures (if backup is synchronous). We should instead retry backup RPCs when we get a {{SuspectException}} or {{UnreachableException}}, and perhaps even when we get no response (2.2?), and only stop when the timeout expires or when the backup is taken offline.
> Async backup probably needs retrying as well, and perhaps even a more sophisticated approach like I-RAC (ISPN-2634).
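The retry policy suggested for phases 1 and 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: the nested `SuspectException` and `UnreachableException` classes are local stand-ins for the Infinispan exceptions of the same name, and `invokeWithRetry`, the `siteOffline` check and the fixed pause between attempts are hypothetical, not existing Infinispan API.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Sketch of retrying a backup RPC until a timeout expires or the site goes offline. */
final class BackupRetrySketch {
    /** Local stand-ins for the exceptions named in the issue (hypothetical). */
    static class SuspectException extends RuntimeException {}
    static class UnreachableException extends RuntimeException {}

    /**
     * Invokes backupRpc, retrying on SuspectException or UnreachableException.
     * Stops retrying and rethrows once the deadline has passed or siteOffline is true.
     * Any other exception (e.g. a timeout) propagates without a retry.
     */
    static <T> T invokeWithRetry(Supplier<T> backupRpc, BooleanSupplier siteOffline,
                                 long timeoutMillis, long pauseMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            try {
                return backupRpc.get();
            } catch (SuspectException | UnreachableException e) {
                boolean expired = System.nanoTime() - deadline >= 0;
                if (siteOffline.getAsBoolean() || expired) {
                    throw e;
                }
                try {
                    Thread.sleep(pauseMillis); // assumption: fixed pause between attempts
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = invokeWithRetry(() -> {
            if (attempts[0]++ < 2) {
                throw new SuspectException(); // first two attempts hit a leaving site master
            }
            return "backed up";
        }, () -> false, 5000, 10);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Whether the "no response" case (2.2) should also be retried is left open here, as it is in the description.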
--
[JBoss JIRA] (ISPN-5614) Write performance regression after ISPN-5484
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5614?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5614:
----------------------------------
Fix Version/s: 9.0.0.CR2
(was: 9.0.0.CR1)
> Write performance regression after ISPN-5484
> --------------------------------------------
>
> Key: ISPN-5614
> URL: https://issues.jboss.org/browse/ISPN-5614
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.0.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.0.0.CR2
>
>
> Regression test shows a significant drop in throughput in the replicated and distributed write tests.
> This was after adjusting the internal thread pool settings in the JGroups configuration: with the default (min=5, max=20, queue=0), the distributed read test would fail to finish.
--