November 2014 - infinispan-issues

[JBoss JIRA] (ISPN-4568) DistSyncL1RepeatableReadFuncTest.testNoEntryInL1MultipleConcurrentGetsWithInvalidation random failures

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4568?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4568:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> DistSyncL1RepeatableReadFuncTest.testNoEntryInL1MultipleConcurrentGetsWithInvalidation random failures
> ------------------------------------------------------------------------------------------------------
>
> Key: ISPN-4568
> URL: https://issues.jboss.org/browse/ISPN-4568
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.1.0.Alpha1
>
>
> Very likely related to ISPN-4564, as there seem to be 2 unjustified pauses ~ 3s and some log messages also appear to be delayed:
> {noformat}
> 08:23:48,443 TRACE (transport-thread-DistSyncL1RepeatableReadFuncTest-NodeAN-p28720-t1:) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key-to-the-cache, value=second-put, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@e9a3538]
> 08:23:48,470 TRACE (transport-thread-DistSyncL1RepeatableReadFuncTest-NodeAN-p28720-t1:) [JGroupsTransport] dests=[DistSyncL1RepeatableReadFuncTest-NodeAN-7764, DistSyncL1RepeatableReadFuncTest-NodeAM-739], command=SingleRpcCommand{cacheName='dist', command=PutKeyValueCommand{key=key-to-the-cache, value=second-put, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}}, mode=SYNCHRONOUS, timeout=60000
> 08:23:50,953 TRACE (remote-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28701-t6:) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key-to-the-cache, value=second-put, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true} and InvocationContext [org.infinispan.context.impl.NonTxInvocationContext@62801f8c]
> 08:23:50,953 TRACE (remote-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28701-t6:) [L1ManagerImpl] Invalidating keys [key-to-the-cache] on nodes [DistSyncL1RepeatableReadFuncTest-NodeAK-9309]. Use multicast? false
> 08:23:51,060 TRACE (transport-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28700-t2:) [JGroupsTransport] dests=[DistSyncL1RepeatableReadFuncTest-NodeAK-9309], command=SingleRpcCommand{cacheName='dist', command=InvalidateL1Command{num keys=1, origin=DistSyncL1RepeatableReadFuncTest-NodeAN-7764}}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=60000
> 08:23:51,062 TRACE (remote-thread-DistSyncL1RepeatableReadFuncTest-NodeAK-p28661-t5:) [BaseRpcInvokingCommand] Invoking command InvalidateL1Command{num keys=1, origin=DistSyncL1RepeatableReadFuncTest-NodeAN-7764}, with originLocal flag set to false
> 08:23:50,972 TRACE (remote-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28701-t6:) [CallInterceptor] Executing command: PutKeyValueCommand{key=key-to-the-cache, value=second-put, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedMetadata{version=null}, successful=true}.
> 08:23:51,786 TRACE (remote-thread-DistSyncL1RepeatableReadFuncTest-NodeAK-p28661-t5:) [InboundInvocationHandlerImpl] About to send back response null for command SingleRpcCommand{cacheName='dist', command=InvalidateL1Command{num keys=1, origin=DistSyncL1RepeatableReadFuncTest-NodeAN-7764}}
> 08:23:51,796 TRACE (transport-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28700-t2:) [CommandAwareRpcDispatcher] Responses: [sender=DistSyncL1RepeatableReadFuncTest-NodeAK-9309, received=true, suspected=false]
> 08:23:54,561 TRACE (transport-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28700-t2:) [RpcManagerImpl] Response(s) to SingleRpcCommand{cacheName='dist', command=InvalidateL1Command{num keys=1, origin=DistSyncL1RepeatableReadFuncTest-NodeAN-7764}} is {}
> 08:23:56,955 ERROR (testng-DistSyncL1RepeatableReadFuncTest:) [UnitTestTestNGListener] Test testNoEntryInL1MultipleConcurrentGetsWithInvalidation(org.infinispan.distribution.DistSyncL1RepeatableReadFuncTest) failed.
> java.util.concurrent.TimeoutException
>     at java.util.concurrent.FutureTask.get(FutureTask.java:201)
>     at org.infinispan.commons.util.concurrent.NotifyingFutureImpl.get(NotifyingFutureImpl.java:84)
>     at org.infinispan.distribution.BaseDistSyncL1Test.testNoEntryInL1MultipleConcurrentGetsWithInvalidation(BaseDistSyncL1Test.java:217)
> 08:23:54,578 TRACE (remote-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28701-t6:) [L1NonTxInterceptor] Allowing entry to commit as local node is owner
> 08:23:57,861 TRACE (remote-thread-DistSyncL1RepeatableReadFuncTest-NodeAM-p28701-t6:) [EntryWrappingInterceptor] About to commit entry RepeatableReadEntry(499752d9){key=key-to-the-cache, value=second-put, oldValue=first-put, isCreated=false, isChanged=true, isRemoved=false, isValid=true, skipRemoteGet=false, metadata=EmbeddedMetadata{version=null}}
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.1#6329)


[JBoss JIRA] (ISPN-4566) ManualIndexingTest.testManualIndexing random failures

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4566?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4566:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> ManualIndexingTest.testManualIndexing random failures
> -----------------------------------------------------
>
> Key: ISPN-4566
> URL: https://issues.jboss.org/browse/ISPN-4566
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Query
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.1.0.Alpha1
>
>
> Random timeouts when TRACE logging is enabled:
> {noformat}
> 04:58:33,679 ERROR (testng-ManualIndexingTest:) [UnitTestTestNGListener] Test testManualIndexing(org.infinispan.query.api.ManualIndexingTest) failed.
> org.infinispan.commons.CacheException: java.util.concurrent.ExecutionException: Map phase executing at ManualIndexingTest-NodeA-44176 did not complete within 20 sec timeout
>     at org.infinispan.distexec.mapreduce.MapReduceTask.executeHelper(MapReduceTask.java:506)
>     at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:407)
>     at org.infinispan.query.impl.massindex.MapReduceMassIndexer.start(MapReduceMassIndexer.java:25)
>     at org.infinispan.query.api.ManualIndexingTest.testManualIndexing(ManualIndexingTest.java:52)
> {noformat}
> Trace log here: http://ci.infinispan.org/viewLog.html?buildId=9816&buildTypeId=Infinispan...


[JBoss JIRA] (ISPN-4564) Optimize test suite GC settings in CI

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4564?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4564:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> Optimize test suite GC settings in CI
> -------------------------------------
>
> Key: ISPN-4564
> URL: https://issues.jboss.org/browse/ISPN-4564
> Project: Infinispan
> Issue Type: Task
> Components: Build process
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 7.1.0.Alpha1
>
>
> Some CI test logs show big pauses (> 5s) that cause intermittent failures. We should monitor the GC activity during the builds and maybe enable UseConcMarkSweepGC or UseG1GC.
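As an illustration of the "monitor the GC activity" step, here is a minimal sketch that reads the JVM's standard `GarbageCollectorMXBean`s before and after a workload. The class name `GcPauseReport` and the placement of the workload are assumptions made for this example; nothing here is taken from the Infinispan build. It reports accumulated collection time only, so it cannot show individual pauses; GC logging (for example `-verbose:gc`) would be needed for that.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Infinispan: sums the GC time reported by the JVM.
class GcPauseReport {

    /** Returns the accumulated collection time in milliseconds over all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcTimeMillis();
        // Placeholder for the workload under observation (e.g. one test class).
        byte[][] garbage = new byte[1024][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[64 * 1024];
        }
        long after = totalGcTimeMillis();
        System.out.println("GC time during workload: " + (after - before) + " ms");
    }
}
```

Running the same workload with `-XX:+UseConcMarkSweepGC` or `-XX:+UseG1GC` (the two options named in the issue) and comparing the reported totals would give a rough first comparison.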


[JBoss JIRA] (ISPN-4585) Prioritize commands in the remote executor

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4585?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4585:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> Prioritize commands in the remote executor
> ------------------------------------------
>
> Key: ISPN-4585
> URL: https://issues.jboss.org/browse/ISPN-4585
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Fix For: 7.1.0.Alpha1
>
>
> The remote executor currently has an unlimited queue of blocked tasks, but the underlying executor cannot use a queue. With a queue, we wouldn't need to overflow remote commands to the OOB threads, and the OOB threads would be free to process response messages.
> The problem is that {{ThreadPoolExecutor}} executes tasks in the order they are in the queue. If a node has a remote executor thread pool of 100 threads and receives a prepare(tx1, put(k, v1)) command, then 1000 prepare(tx_i, put(k, v_i)) commands, and finally a commit(tx1) command, the commit(tx1) command will block until all but 99 of the prepare(tx_i, put(k, v_i)) commands have timed out.
> I think we could help this by using a {{PriorityBlockingQueue}} for the underlying executor, with commands ordered so that state transfer commands < commit/tx completion notification < prepare/lock. The commit command would still have to wait for one of the prepare commands currently running to time out, but it wouldn't have to wait for all of them.
> The current code, without a queue, would fill the remote executor and OOB thread pools, and it would discard the commit message (along with most of the prepare commands). The time it would take to process the commit successfully would depend on the timing of the retransmitted messages.
> Another possible improvement would be to keep track of the commands currently being executed, and always keep some threads free for commands with higher priority. But I'm not sure how easy it would be to do that on top of an injected {{ExecutorService}}.
> I believe there is also a problem with {{BlockingTaskAwareExecutorServiceImpl.checkForReadyTasks()}} after a topology change. Commands with the new topology id are all unblocked by submitting them to the underlying executor in FIFO order, on a single thread, so {{CallerRunsPolicy}} is not a valid rejection policy here.
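The ordering proposed in this issue can be sketched with plain `java.util.concurrent` classes. This is only an illustration under stated assumptions: `PrioritizedRemoteExecutor`, `Priority`, and `PrioritizedTask` are hypothetical names, the tasks are plain `Runnable`s rather than Infinispan commands, and a single worker thread stands in for a saturated pool. Tasks must be passed to `execute()`, because `submit()` wraps them in a `FutureTask` that is not `Comparable`.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Infinispan code.
class PrioritizedRemoteExecutor {

    // Declaration order is the execution order: state transfer < commit < prepare.
    enum Priority { STATE_TRANSFER, COMMIT, PREPARE }

    static final class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        private static final AtomicLong SEQUENCE = new AtomicLong();
        final Priority priority;
        final long sequence = SEQUENCE.getAndIncrement(); // keeps FIFO order within a priority
        final Runnable body;

        PrioritizedTask(Priority priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override public void run() { body.run(); }

        @Override public int compareTo(PrioritizedTask other) {
            int byPriority = priority.compareTo(other.priority);
            return byPriority != 0 ? byPriority : Long.compare(sequence, other.sequence);
        }
    }

    static ThreadPoolExecutor newExecutor(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new PriorityBlockingQueue<Runnable>());
    }

    /** Queues three prepares and then one commit behind a busy worker; returns the run order. */
    static List<String> demoOrder() {
        ThreadPoolExecutor executor = newExecutor(1);
        List<String> order = new CopyOnWriteArrayList<>();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy the only worker so that the following tasks have to wait in the queue.
            executor.execute(new PrioritizedTask(Priority.PREPARE, () -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
            started.await();
            for (int i = 2; i <= 4; i++) {
                String name = "prepare(tx" + i + ")";
                executor.execute(new PrioritizedTask(Priority.PREPARE, () -> order.add(name)));
            }
            executor.execute(new PrioritizedTask(Priority.COMMIT, () -> order.add("commit(tx1)")));
            release.countDown();
            executor.shutdown();
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demoOrder());
    }
}
```

In this sketch the commit is dequeued ahead of the prepares that were queued before it, but it still waits for the task that is already running, which matches the limitation described in the issue.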


[JBoss JIRA] (ISPN-4572) StateTransferReplicationQueueTest.testStateTransferWithNodeRestartedAndBusyNonTx random failures

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4572?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4572:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> StateTransferReplicationQueueTest.testStateTransferWithNodeRestartedAndBusyNonTx random failures
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-4572
> URL: https://issues.jboss.org/browse/ISPN-4572
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer, Test Suite - Core
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.1.0.Alpha1
>
>
> {noformat}
> java.lang.AssertionError:
>     at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
>     at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24)
>     at org.testng.AssertJUnit.assertNull(AssertJUnit.java:282)
>     at org.testng.AssertJUnit.assertNull(AssertJUnit.java:274)
>     at org.infinispan.statetransfer.StateTransferReplicationQueueTest.doWritingCacheTest(StateTransferReplicationQueueTest.java:144)
>     at org.infinispan.statetransfer.StateTransferReplicationQueueTest.testStateTransferWithNodeRestartedAndBusyNonTx(StateTransferReplicationQueueTest.java:88)
> {noformat}
> No trace log available for now.


[JBoss JIRA] (ISPN-4586) Too many OutdatedTopologyExceptions in non-transactional caches

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4586?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4586:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> Too many OutdatedTopologyExceptions in non-transactional caches
> ---------------------------------------------------------------
>
> Key: ISPN-4586
> URL: https://issues.jboss.org/browse/ISPN-4586
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Labels: performance
> Fix For: 7.1.0.Alpha1
>
>
> In a non-tx cache, when the topology id is incremented, owners (both primary and backup) receiving a write command with a lower topology id throw an OutdatedTopologyException so that the originator retries the command on the new owners.
> But the originator needs to retry the command only if the owners of the key changed in any way. During a join or a leave, most of the keys should not change owners, so throwing an OutdatedTopologyException is not necessary most of the time.
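The check suggested in this issue could look roughly like the sketch below. It is an assumption-laden illustration: the class `TopologyRetryCheck` is hypothetical, and a consistent hash is modelled as a plain map from segment number to an ordered owner list rather than with Infinispan's own types. A retry is requested only when the command carries an older topology id and the owner list of its segment differs between the two hashes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not Infinispan code.
class TopologyRetryCheck {

    /** True when the ordered owner list of the segment differs between the two hashes. */
    static boolean ownersChanged(Map<Integer, List<String>> commandCh,
                                 Map<Integer, List<String>> currentCh,
                                 int segment) {
        return !Objects.equals(commandCh.get(segment), currentCh.get(segment));
    }

    /** A retry is needed only for an outdated command whose segment changed owners. */
    static boolean mustRetry(int commandTopologyId, int currentTopologyId,
                             Map<Integer, List<String>> commandCh,
                             Map<Integer, List<String>> currentCh,
                             int segment) {
        return commandTopologyId < currentTopologyId
                && ownersChanged(commandCh, currentCh, segment);
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> before = new HashMap<>();
        before.put(0, Arrays.asList("A", "B"));
        before.put(1, Arrays.asList("B", "A"));

        // Node C joins and takes over as backup owner of segment 1 only.
        Map<Integer, List<String>> after = new HashMap<>();
        after.put(0, Arrays.asList("A", "B"));
        after.put(1, Arrays.asList("B", "C"));

        System.out.println("segment 0 retry: " + mustRetry(5, 6, before, after, 0));
        System.out.println("segment 1 retry: " + mustRetry(5, 6, before, after, 1));
    }
}
```

Comparing ordered lists treats a primary/backup swap as a change, which follows the "changed in any way" wording of the issue.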


[JBoss JIRA] (ISPN-4587) Re-add old owners in the pending CH when a node leaves during rebalance

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4587?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4587:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> Re-add old owners in the pending CH when a node leaves during rebalance
> -----------------------------------------------------------------------
>
> Key: ISPN-4587
> URL: https://issues.jboss.org/browse/ISPN-4587
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core, State Transfer
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Priority: Minor
> Fix For: 7.1.0.Alpha1
>
>
> Say we have a distributed cache \[A, B\] with {{numSegments = 1}} and {{numOwners = 2}}. The initial topology is _T_: currentCH = \{0: A B\}, pendingCH = null
> C joins, and A starts a rebalance. The topology is now _T + 1_: currentCH = \{0: A B\}, pendingCH = \{0: A C\}
> C now leaves, A updates the consistent hashes to remove it with a new topology _T + 2_: currentCH = \{0: A B\}, pendingCH = \{0: A\}
> A doesn't need to receive any data, so the rebalance ends and the pending CH is installed as the current CH in topology _T + 3_: currentCH = \{0: A\}, pendingCH = null
> This algorithm is relatively easy to follow and implement, but it does result in reduced availability of the cache data. It would be better if topology _T + 2_ could re-add B as an owner in the pending CH.
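The proposed behaviour for topology _T + 2_ can be sketched for a single segment as follows. This is a hypothetical illustration, not the Infinispan consistent hash factory: `PendingChRebuild` and its method are invented names, and owners are plain strings. After the leaver is dropped from the pending owners, owners from the current CH are re-added until `numOwners` is reached again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Infinispan code.
class PendingChRebuild {

    /**
     * Removes the leaver from the pending owners of one segment, then tops the list
     * up with surviving owners from the current CH until numOwners is reached.
     */
    static List<String> pendingOwnersAfterLeave(List<String> currentOwners,
                                                List<String> pendingOwners,
                                                String leaver,
                                                int numOwners) {
        List<String> result = new ArrayList<>(pendingOwners);
        result.remove(leaver);
        for (String oldOwner : currentOwners) {
            if (result.size() >= numOwners) {
                break;
            }
            if (!oldOwner.equals(leaver) && !result.contains(oldOwner)) {
                result.add(oldOwner); // old owner already holds the data for this segment
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Topology T + 1 from the issue: currentCH = {0: A B}, pendingCH = {0: A C}; then C leaves.
        List<String> pending = pendingOwnersAfterLeave(
                Arrays.asList("A", "B"), Arrays.asList("A", "C"), "C", 2);
        System.out.println("pendingCH = {0: " + pending + "}");
    }
}
```

For the scenario in the issue this yields pending owners A and B instead of A alone, so the segment keeps two owners once the rebalance completes.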


[JBoss JIRA] (ISPN-4610) Implement total order for non-transactional caches

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4610?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4610:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> Implement total order for non-transactional caches
> --------------------------------------------------
>
> Key: ISPN-4610
> URL: https://issues.jboss.org/browse/ISPN-4610
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Fix For: 7.1.0.Alpha1
>
>
> The current locking algorithm in non-transactional caches needs a remote thread on the primary owner to block while replicating the update to the backup owner. The thread is also holding the lock for the key, so it's blocking other threads that want to write to the same key. When there is a lot of contention, this can exhaust the remote executor thread pool and cause lock timeouts.
> Total order (TO) was designed with high contention in mind, and it doesn't block threads to acquire locks. So it should handle this much better.
> An alternative solution would be the locking rework in ISPN-2849.


[JBoss JIRA] (ISPN-4619) Partition handling support for asymmetric clusters

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4619?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4619:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> Partition handling support for asymmetric clusters
> --------------------------------------------------
>
> Key: ISPN-4619
> URL: https://issues.jboss.org/browse/ISPN-4619
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 7.0.0.Beta1
> Reporter: Dan Berindei
> Labels: partition_handling
> Fix For: 7.1.0.Alpha1
>
>
> Partition handling currently relies on every cache being defined and running on the coordinator. We could relax this, as the partition handling configuration can be easily added to the cache join command.


[JBoss JIRA] (ISPN-4616) PartitionHandlingInterceptor should have special handling for DeltaCompositeKeys

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-4616?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-4616:
-----------------------------------
    Fix Version/s: 7.1.0.Alpha1
                   (was: 7.0.0.Final)

> PartitionHandlingInterceptor should have special handling for DeltaCompositeKeys
> --------------------------------------------------------------------------------
>
> Key: ISPN-4616
> URL: https://issues.jboss.org/browse/ISPN-4616
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.0.0.Beta1
> Reporter: Dan Berindei
> Labels: partition_handling
> Fix For: 7.1.0.Alpha1

