[JBoss JIRA] (ISPN-2697) HotRodServer startup fails when its record cannot be inserted into topology cache
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/ISPN-2697?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on ISPN-2697:
--------------------------------
A simple solution could be to have everyone send STABLE messages at desired_avg_gossip, not scaling that value by the cluster size.
Another solution would be to have only the coord process STABLE messages, see https://issues.jboss.org/browse/JGRP-1570 for details.
> HotRodServer startup fails when its record cannot be inserted into topology cache
> ---------------------------------------------------------------------------------
>
> Key: ISPN-2697
> URL: https://issues.jboss.org/browse/ISPN-2697
> Project: Infinispan
> Issue Type: Bug
> Components: Remote protocols
> Affects Versions: 5.2.0.Beta6
> Reporter: Radim Vansa
> Assignee: Galder Zamarreño
> Priority: Critical
> Fix For: 5.2.0.Final
>
>
> When the HotRodServer starts it inserts its record to __hotRodTopologyCache ({{HotRodServer.addSelfToTopologyView(...)}}).
> However, this put may very easily fail - as the command is broadcasted using NAKACK2 protocol, if the message gets lost and there's no following broadcasted message, the message will be not retransmitted and the put operation times out (Replication timeout), which fails the whole HotRodServer startup, all because of one lost UDP message.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 11 months
[JBoss JIRA] (ISPN-2739) Incorrect help for "encoding" command
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-2739?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-2739:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Incorrect help for "encoding" command
> --------------------------------------
>
> Key: ISPN-2739
> URL: https://issues.jboss.org/browse/ISPN-2739
> Project: Infinispan
> Issue Type: Bug
> Components: CLI
> Affects Versions: 5.2.0.CR2
> Reporter: Martin Gencur
> Assignee: Tristan Tarrant
> Priority: Minor
> Fix For: 5.2.0.Final
>
>
> The help for "encoding" command shows the following:
> SYNOPSIS
> encoding [cachename]
> DESCRIPTION
> Shows the currently selected cache or selects a cache to be used as default for CLI operations
> ARGUMENTS
> cachename
> (optional) the name of the cache to set as default for the following operations
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 11 months
[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Horia Chiorean (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Horia Chiorean commented on ISPN-2712:
--------------------------------------
After discussing with [~anistor] (based on the previous comment), I can confirm that changing the *concurrencyLevel* to {{concurrencyLevel=50}} fixes the issue, so this can be seen as a potential workaround.
> Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
> ------------------------------------------------------------------------------------------------------
>
> Key: ISPN-2712
> URL: https://issues.jboss.org/browse/ISPN-2712
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.0.CR1
> Reporter: Randall Hauch
> Assignee: Adrian Nistor
> Priority: Critical
> Fix For: 5.2.0.Final
>
> Attachments: ispn_eviction_log.txt, spectrum-repository-infinispan.xml
>
>
> Using a clustered cache with 2 nodes, where the cache in each node is configured identically with replication, eviction and (non-shared) file cache store. (See attached configuration.)
> The first (coordinator) process in the cluster is started and populated with 293 entries. Then the first process continually adds a few entries every 5 seconds. After a short delay, the second process is started, at which point it joins the cluster and starts the state-transfer process; logging shows in the first process that all 293 entries are transferred to the new cluster member, and the second log shows that they are all received. The second process then attempts to look for a specific entry that was created during initial population in the first process. This fails to find the existing entry.
> By enabling trace logging and "IDE breakpoint output messages" around state transfer, it's visible that from the 293 keys, only 218 are placed into the cache, the others being lost.
> (This problem was originally discovered when clustering ModeShape, which behaves roughly in the manner described above. The initial entries that are populated upon initialization are content created when a new repository is started. The second process looks for this content, and if it finds the content it knows not to create all of this initial content. However, if it doesn't find it, it thinks the repository has not yet been initialized and that it should create the initial content. The problem described by this bug then manifests itself in ModeShape through dozens of exceptions because the repository has been corrupted. See MODE-1745 for details on this problem. ModeShape's corresponding known issue for this issue, ISPN-2712, is MODE-1754.)
> The eviction is configured like this:
> {code:xml}
> <eviction strategy="LIRS" maxEntries="1000"/>
> {code}
> The attached log file is from the second process (the "receiver" node) and it contains the following key points:
> * line 40 - the total number of keys & entries to be transferred = 293
> * line 1352 and from there onwards 1358 / 1364 / i + 6 - the data container's size stops growing at 218, while the other entries are being sent. This means that in effect, they are ignored.
> * line 1797 - the loop from {{org.infinispan.statetransfer.StateConsumerImpl#doApplyState}} finishes
> Disabling eviction fixes the problem and all 293 nodes are placed in Node2's cache.
> (I initially marked this as CRITICAL priority, though it is a blocker for our use of Infinispan 5.2.)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 11 months
[JBoss JIRA] (ISPN-2736) ConcurrentModificationException in StateRequestCommand
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-2736?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-2736:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> ConcurrentModificationException in StateRequestCommand
> -------------------------------------------------------
>
> Key: ISPN-2736
> URL: https://issues.jboss.org/browse/ISPN-2736
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.0.CR2
> Reporter: Radim Vansa
> Assignee: Adrian Nistor
> Priority: Critical
> Fix For: 5.2.0.Final
>
>
> I have tried async replication scenario with async marshalling (optimistic transactions) and this has occurred:
> {code}
> 04:34:15,182 ERROR [org.infinispan.remoting.rpc.RpcManagerImpl] (OOB-77,edg-perf11-48420) ISPN000073: Unexpected error while replicating
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
> at java.util.HashMap$KeyIterator.next(HashMap.java:828)
> at org.infinispan.statetransfer.StateProviderImpl.collectTransactionsToTransfer(StateProviderImpl.java:239)
> at org.infinispan.statetransfer.StateProviderImpl.getTransactionsForSegments(StateProviderImpl.java:200)
> at org.infinispan.statetransfer.StateRequestCommand.perform(StateRequestCommand.java:88)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:101)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:122)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:86)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:245)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:218)
> at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:484)
> at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391)
> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249)
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598)
> at org.jgroups.JChannel.up(JChannel.java:707)
> {code}
> Seems that StateRequestCommand is iterating through all transactions' locked keys but the collections of these locked keys can be modified by interfering prepare/commit.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 11 months
[JBoss JIRA] (ISPN-2664) DistSyncCacheStoreNotSharedNotConcurrentTest.testAtomicPutIfAbsentFromNonOwner fails randomly
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-2664?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-2664:
-----------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/1606
> DistSyncCacheStoreNotSharedNotConcurrentTest.testAtomicPutIfAbsentFromNonOwner fails randomly
> ---------------------------------------------------------------------------------------------
>
> Key: ISPN-2664
> URL: https://issues.jboss.org/browse/ISPN-2664
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Cache
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Labels: testsuite_stability
> Fix For: 5.2.0.Final
>
> Attachments: testAtomicPutIfAbsentFromNonOwner-0.log
>
>
> {code}testAtomicPutIfAbsentFromNonOwner(org.infinispan.distribution.DistSyncCacheStoreNotSharedNotConcurrentTest) Time elapsed: 0.004 sec <<< FAILURE!
> java.lang.AssertionError
> at org.infinispan.distribution.DistSyncCacheStoreNotSharedTest.testAtomicPutIfAbsentFromNonOwner(DistSyncCacheStoreNotSharedTest.java:297)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680){code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 11 months
[JBoss JIRA] (ISPN-2664) DistSyncCacheStoreNotSharedNotConcurrentTest.testAtomicPutIfAbsentFromNonOwner fails randomly
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-2664?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-2664:
-----------------------------------
Issue Type: Bug (was: Feature Request)
> DistSyncCacheStoreNotSharedNotConcurrentTest.testAtomicPutIfAbsentFromNonOwner fails randomly
> ---------------------------------------------------------------------------------------------
>
> Key: ISPN-2664
> URL: https://issues.jboss.org/browse/ISPN-2664
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Labels: testsuite_stability
> Fix For: 5.2.0.Final
>
> Attachments: testAtomicPutIfAbsentFromNonOwner-0.log
>
>
> {code}testAtomicPutIfAbsentFromNonOwner(org.infinispan.distribution.DistSyncCacheStoreNotSharedNotConcurrentTest) Time elapsed: 0.004 sec <<< FAILURE!
> java.lang.AssertionError
> at org.infinispan.distribution.DistSyncCacheStoreNotSharedTest.testAtomicPutIfAbsentFromNonOwner(DistSyncCacheStoreNotSharedTest.java:297)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680){code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 11 months