[JBoss JIRA] (ISPN-5046) PartitionHandling: split during commit can leave the cache inconsistent after merge
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-5046?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-5046:
--------------------------------
Fix Version/s: 7.2.0.Alpha1
(was: 7.1.0.Final)
> PartitionHandling: split during commit can leave the cache inconsistent after merge
> -----------------------------------------------------------------------------------
>
> Key: ISPN-5046
> URL: https://issues.jboss.org/browse/ISPN-5046
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.2.0.Alpha1
>
>
> Say we have a cluster ABCD; a transaction T was started on A, with B as the primary owner and C the backup owner. B and C both acknowledge the prepare, and the network splits into AB and CD right before A sends the commit command. Eventually A suspects C and D, but the commit still succeeds on B before C and D are suspected. And SuspectExceptions are ignored for commit commands, so the user won't see any error.
> However, C will eventually suspect A and B. When the CD cache topology is installed, it will roll back transaction T. After the merge, both partitions are in degraded mode, so we assume that they both have the latest data and the key is never updated on C.
> From C's point of view, this is very similar to ISPN-3421. The fix should also be similar: we could delay the transaction rollback on C until we get a confirmation from B that T was not committed there. Since B is inaccessible, C will eventually get a SuspectException and the CD cache topology, at which point the cache is in degraded mode and C can wait for a merge. On merge, C should check the status of the transaction on B again, and either commit or roll back based on what B did.
> We also need to suspend the cleanup of completed transactions while the cache is in degraded mode, otherwise C might not find T on B after the merge.
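The decision C would make under the proposed fix can be sketched roughly as follows. This is only an illustration of the reasoning above, not Infinispan code: `TxStatus`, `Decision` and `PendingTxResolver` are hypothetical names, and treating a transaction that is still only prepared on B as "keep waiting" is an assumption.

```java
// Hypothetical status of transaction T as reported by the primary owner (B).
enum TxStatus { PREPARED, COMMITTED, ROLLED_BACK }

// What the backup owner (C) should do with its own prepared copy of T.
enum Decision { COMMIT, ROLLBACK, WAIT_FOR_MERGE }

final class PendingTxResolver {
    private PendingTxResolver() {}

    /**
     * Decides the fate of a prepared transaction on the backup owner.
     *
     * @param primaryReachable whether the primary owner can currently be contacted
     * @param statusOnPrimary  status reported by the primary owner; ignored when
     *                         the primary is unreachable, must be non-null otherwise
     */
    static Decision resolve(boolean primaryReachable, TxStatus statusOnPrimary) {
        if (!primaryReachable) {
            // Degraded mode: do not roll back yet, wait until the partitions merge.
            return Decision.WAIT_FOR_MERGE;
        }
        switch (statusOnPrimary) {
            case COMMITTED:
                return Decision.COMMIT;
            case ROLLED_BACK:
                return Decision.ROLLBACK;
            default:
                // Assumption: the primary has not decided yet, so keep waiting.
                return Decision.WAIT_FOR_MERGE;
        }
    }
}
```

The lookup on B only works if B still remembers T after the merge, which is why the cleanup of completed transactions would have to be suspended in degraded mode.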
--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
[JBoss JIRA] (ISPN-5042) Remote gets caused by writes could be replicated only to the primary owner
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-5042?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-5042:
--------------------------------
Fix Version/s: 7.2.0.Alpha1
(was: 7.1.0.Final)
> Remote gets caused by writes could be replicated only to the primary owner
> --------------------------------------------------------------------------
>
> Key: ISPN-5042
> URL: https://issues.jboss.org/browse/ISPN-5042
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core, State Transfer
> Affects Versions: 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Labels: 7.0
> Fix For: 7.2.0.Alpha1
>
>
> For write operations that need the previous value, a write CH-only owner that doesn't have a key locally will attempt to retrieve the key from the read CH-owners.
> Sending the remote get command to all the previous owners will create extra load on the cluster during state transfer, so it should be more efficient to send the remote get only to the primary owner. Even though the latency of some write operations will be higher, the average latency should be better.
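The difference between the two strategies can be sketched as a target-selection helper. `RemoteGetTargets` and `select` are hypothetical names used only for illustration, and the sketch assumes the owner list is ordered with the primary owner first.

```java
import java.util.Collections;
import java.util.List;

final class RemoteGetTargets {
    private RemoteGetTargets() {}

    /**
     * Picks the nodes that should receive the remote get.
     *
     * @param readOwners  owners in the read consistent hash, primary owner first (assumption)
     * @param primaryOnly true to contact only the primary owner, false to contact all owners
     */
    static List<String> select(List<String> readOwners, boolean primaryOnly) {
        if (readOwners.isEmpty()) {
            return Collections.emptyList();
        }
        // One request instead of one per owner: less load during state transfer,
        // at the cost of waiting for that single node to answer.
        return primaryOnly ? Collections.singletonList(readOwners.get(0)) : readOwners;
    }
}
```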
--
[JBoss JIRA] (ISPN-5163) A write operation with the SKIP_LOCKING flag can roll back the transaction
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-5163?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-5163:
--------------------------------
Fix Version/s: 7.2.0.Alpha1
(was: 7.1.0.Final)
> A write operation with the SKIP_LOCKING flag can roll back the transaction
> --------------------------------------------------------------------------
>
> Key: ISPN-5163
> URL: https://issues.jboss.org/browse/ISPN-5163
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.0.3.Final, 7.1.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 7.2.0.Alpha1
>
>
> When a write operation has the SKIP_LOCKING flag, it does not send a {{LockControlCommand}} to the primary owner, but it can send a {{ClusteredGetCommand}} with {{acquireRemoteLocks=true}} instead. The {{ClusteredGetCommand}} will then execute a {{LockControlCommand}} with the origin not set properly, and {{TxInterceptor}} will roll back the transaction because the originator ({{null}}) appears to have left the cluster.
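Why a missing origin looks like a departed node can be shown with a plain membership check. This is a simplified stand-in, not the actual {{TxInterceptor}} logic; `OriginatorCheck` and `originatorLeft` are hypothetical names.

```java
import java.util.List;

final class OriginatorCheck {
    private OriginatorCheck() {}

    /**
     * Returns true when the originator is not part of the current view.
     * A null origin is never a member, so it is indistinguishable from a
     * node that has left the cluster.
     */
    static boolean originatorLeft(String origin, List<String> members) {
        return !members.contains(origin);
    }
}
```

With members `[A, B]`, an origin of `A` passes the check, while an unset origin is reported as having left, which in the scenario above would trigger the rollback.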
--
[JBoss JIRA] (ISPN-5151) DistributedSharedCacheTwoNodesMapReduceTest.testInvokeMapReduceOnAllKeys random failures
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-5151?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-5151:
--------------------------------
Fix Version/s: 7.2.0.Alpha1
(was: 7.1.0.Final)
> DistributedSharedCacheTwoNodesMapReduceTest.testInvokeMapReduceOnAllKeys random failures
> ----------------------------------------------------------------------------------------
>
> Key: ISPN-5151
> URL: https://issues.jboss.org/browse/ISPN-5151
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 7.0.3.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.2.0.Alpha1
>
>
> The method {{invokeMapReduce()}} doesn't really invoke the M/R task; it only creates it, and the execution only starts when the test method calls {{task.execute()}} explicitly. It shouldn't try to check the contents of the shared intermediary cache, because the intermediary cache may not exist yet, and the check may accidentally create it with the wrong configuration. I get this error when I run only the {{testInvokeMapReduceOnAllKeys}} method:
> {noformat}
> 09:55:37,632 TRACE (testng-DistributedSharedCacheTwoNodesMapReduceTest:) [DefaultCacheManager] About to wire and start cache __tmpMapReduce
> 09:55:37,646 DEBUG (testng-DistributedSharedCacheTwoNodesMapReduceTest:) [MapReduceTask] Invoking CreateCacheCommand{cacheManager=null, cacheNameToCreate='__tmpMapReduce', cacheConfigurationName='__tmpMapReduce', start=true', size=2} across members [DistributedSharedCacheTwoNodesMapReduceTest-NodeA-19271, DistributedSharedCacheTwoNodesMapReduceTest-NodeB-10341]
> 10:32:56,324 ERROR (testng-DistributedSharedCacheTwoNodesMapReduceTest:) [UnitTestTestNGListener] Test testInvokeMapReduceOnAllKeys(org.infinispan.distexec.mapreduce.DistributedSharedCacheTwoNodesMapReduceTest) failed.
> org.infinispan.distexec.mapreduce.MapReduceException: Map phase failed
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhase(MapReduceTask.java:607)
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeHelper(MapReduceTask.java:473)
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:414)
> at org.infinispan.distexec.mapreduce.BaseWordCountMapReduceTest.testInvokeMapReduceOnAllKeys(BaseWordCountMapReduceTest.java:162)
> Caused by: org.infinispan.commons.CacheException: java.lang.NullPointerException
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForDistributedReduction(MapReduceManagerImpl.java:105)
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.invokeMapCombineLocally(MapReduceTask.java:1174)
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.access$300(MapReduceTask.java:1101)
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$1.call(MapReduceTask.java:1123)
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$1.call(MapReduceTask.java:1119)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapKeysToNodes(MapReduceManagerImpl.java:363)
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.migrateIntermediateKeysAndValues(MapReduceManagerImpl.java:327)
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombine(MapReduceManagerImpl.java:260)
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForDistributedReduction(MapReduceManagerImpl.java:103)
> ... 10 more
> {noformat}
> Even if the check is moved after the M/R task is finished, it still wouldn't be correct, because the task only cleans up the shared intermediary cache asynchronously. So it needs to use {{eventually()}} to avoid errors like this:
> {noformat}
> 04:06:32,260 ERROR (testng-DistributedSharedCacheTwoNodesMapReduceTest:) [UnitTestTestNGListener] Test testInvokeMapReduceOnAllKeys(org.infinispan.distexec.mapreduce.DistributedSharedCacheTwoNodesMapReduceTest) failed.
> java.lang.AssertionError: Shared cache __tmpMapReduce is not empty. It has 5 keys/values: [ImmortalCacheEntry{key=IntermediateCompositeKey [taskId=88948a8b-2a8a-4c13-bc45-4dc3a9f6b0fb, key=is], value=org.infinispan.distexec.mapreduce.MapReduceManagerImpl$DeltaAwareList@21ae10d3}, ImmortalCacheEntry{key=IntermediateCompositeKey [taskId=88948a8b-2a8a-4c13-bc45-4dc3a9f6b0fb, key=JUDCon], value=org.infinispan.distexec.mapreduce.MapReduceManagerImpl$DeltaAwareList@108d6b51}, ImmortalCacheEntry{key=IntermediateCompositeKey [taskId=88948a8b-2a8a-4c13-bc45-4dc3a9f6b0fb, key=cool], value=org.infinispan.distexec.mapreduce.MapReduceManagerImpl$DeltaAwareList@77949e8f}, ImmortalCacheEntry{key=IntermediateCompositeKey [taskId=88948a8b-2a8a-4c13-bc45-4dc3a9f6b0fb, key=Infinispan], value=org.infinispan.distexec.mapreduce.MapReduceManagerImpl$DeltaAwareList@712a6071}, ImmortalCacheEntry{key=IntermediateCompositeKey [taskId=88948a8b-2a8a-4c13-bc45-4dc3a9f6b0fb, key=community], value=org.infinispan.distexec.mapreduce.MapReduceManagerImpl$DeltaAwareList@291bdf76}] expected:<0> but was:<5>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.infinispan.distexec.mapreduce.DistributedSharedCacheTwoNodesMapReduceTest.invokeMapReduce(DistributedSharedCacheTwoNodesMapReduceTest.java:44)
> at org.infinispan.distexec.mapreduce.BaseWordCountMapReduceTest.testInvokeMapReduceOnAllKeys(BaseWordCountMapReduceTest.java:161)
> 04:06:32,579 TRACE (transport-thread-NodeA-p29577-t6:) [InvocationContextInterceptor] Invoked with command RemoveCommand{key=IntermediateCompositeKey [taskId=eb7da48a-5922-4671-9037-4077e209744c, key=RedHat], value=null, flags=null, valueMatcher=MATCH_ALWAYS} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@c0bbc61]
> {noformat}
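The kind of polling assertion meant by {{eventually()}} can be sketched as below. This is a minimal stand-in written for illustration; the class name, signature and 10 ms poll interval are assumptions and do not reproduce the helper in the Infinispan test suite.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class Eventually {
    private Eventually() {}

    /**
     * Polls the condition until it holds or the timeout expires.
     * Throws AssertionError if the condition never became true in time.
     */
    static void eventually(BooleanSupplier condition, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError("Condition not met within " + timeout + " " + unit);
            }
            try {
                Thread.sleep(10); // poll interval, chosen arbitrarily for this sketch
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting for condition", e);
            }
        }
    }
}
```

In the test, the emptiness check on the shared intermediary cache would be passed as the condition and run only after the task has finished, so that the asynchronous cleanup has time to complete.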
--
[JBoss JIRA] (ISPN-5127) LocalEntryRetrieverWithStoreAsBinaryTest.testFilterWithStoreAsBinaryPartialKeys random failures
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-5127?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-5127:
--------------------------------
Fix Version/s: 7.2.0.Alpha1
(was: 7.1.0.Final)
> LocalEntryRetrieverWithStoreAsBinaryTest.testFilterWithStoreAsBinaryPartialKeys random failures
> -----------------------------------------------------------------------------------------------
>
> Key: ISPN-5127
> URL: https://issues.jboss.org/browse/ISPN-5127
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.1.0.Alpha1, 7.0.3.Final
> Reporter: Dan Berindei
> Assignee: William Burns
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.2.0.Alpha1
>
>
> Sometimes the filtered retriever doesn't return any entries:
> {noformat}
> 15:16:26,328 ERROR (testng-LocalEntryRetrieverWithStoreAsBinaryTest:) [UnitTestTestNGListener] Test testFilterWithStoreAsBinaryPartialKeys(org.infinispan.iteration.LocalEntryRetrieverWithStoreAsBinaryTest) failed.
> java.util.NoSuchElementException
> at org.infinispan.iteration.impl.LocalEntryRetriever$Itr.next(LocalEntryRetriever.java:486)
> at org.infinispan.iteration.impl.LocalEntryRetriever$Itr.next(LocalEntryRetriever.java:428)
> at org.infinispan.iteration.LocalEntryRetrieverWithStoreAsBinaryTest.testFilterWithStoreAsBinaryPartialKeys(LocalEntryRetrieverWithStoreAsBinaryTest.java:93)
> {noformat}
> http://ci.infinispan.org/viewLog.html?buildId=14964
> The test should also use custom key/value types, as {{String}} keys/values are not marshalled when {{storeAsBinary}} is enabled (see {{MarshalledValue.isTypeExcluded()}}).
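A custom key type for such a test could look like the sketch below. `CustomKey` is a hypothetical class, not one from the test suite, and it assumes that a plain `Serializable` type is enough for the marshaller in use.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical key type: not a String, so it is not covered by the type exclusion mentioned above.
final class CustomKey implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;

    CustomKey(String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    String getName() {
        return name;
    }

    // equals/hashCode are needed so that a key rebuilt from its binary form
    // still matches the original key.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CustomKey)) return false;
        return name.equals(((CustomKey) o).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public String toString() {
        return "CustomKey{" + name + "}";
    }
}
```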
--
[JBoss JIRA] (ISPN-5093) Granularity of remote event listener implementations doing the same job
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-5093?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-5093:
--------------------------------
Fix Version/s: 7.2.0.Alpha1
(was: 7.1.0.Final)
> Granularity of remote event listener implementations doing the same job
> -----------------------------------------------------------------------
>
> Key: ISPN-5093
> URL: https://issues.jboss.org/browse/ISPN-5093
> Project: Infinispan
> Issue Type: Enhancement
> Components: Remote Protocols
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Fix For: 7.2.0.Alpha1
>
>
> Currently, if N clients add the same listener to a cache that does the same job, e.g. keeping a near cache consistent, this results in N server-side cluster listeners created, each potentially installed in different nodes. If one of those nodes fails, all clients that had a listener registered to that node will have to find a different node for this listener.
> The downside of this approach is that there are as many cluster listeners installed as there are clients that have added listeners (or have near cache enabled), which might not be very efficient. If a node goes down, all clients that have cluster listeners there need to fail over to some other node.
> The advantage of this approach is the simplicity of deciding where to add the listener and where to fail over to.
> For this type of scenario, an alternative setup might be worth exploring:
> If all these client side listeners are interested in exactly the same events, and the client ID would be exposed via the RemoteCache API, a server side cluster listener multi-plexing between all these clients could be potentially built. In other words, instead of having N clients register N cluster listeners, the first client would register the cluster listener with a client listener ID, and if more registrations were added with the same client listener ID, the connections would be added to the existing cluster listener implementation.
> To maximise the efficiency of this solution, all clients (even those running in different JVMs), given the same client listener ID, should agree upon the node on which to add the listener. For a distributed cache, hashing on the cache name would work. For replicated caches, since there's no hashing available, the first node of the view could be used.
> Since the logic to be executed server-side varies between being the first node adding the client listener vs the others, synchronization would be added to make sure that the first invocation only creates the cluster listener, and the others simply add the channel to the listener.
> Failover is a bit more tricky too, because if the node with the cluster listener goes down, all the clients have to failover, which again exposes a 1st vs the others type of logic.
> The advantages of this approach are the reduction in the number of cluster listeners and the potential efficiency gain from a single server-side cluster listener implementation.
> The disadvantages come from the server-side logic to add or fail over a cluster listener, which needs to take into account whether the listener is already present. Other disadvantages come from requiring clients to use specific routing so that listeners with the same ID are added on the same node.
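The two moving parts of the proposal, agreeing on a node and distinguishing the first registration from later ones, can be sketched as follows. Everything here is hypothetical: `SharedListenerRegistry` is not an Infinispan class, channels and nodes are modelled as plain strings, and the modulo over the member list is only a stand-in for whatever hashing a real implementation would use.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class SharedListenerRegistry {
    // Client channels attached to each shared cluster listener, keyed by client listener ID.
    private final Map<String, Set<String>> channelsByListenerId = new ConcurrentHashMap<>();

    /** Picks the node that all clients should agree on for a given cache. */
    static String chooseNode(String cacheName, List<String> members, boolean distributed) {
        if (!distributed) {
            return members.get(0); // replicated cache: first node of the view
        }
        // Distributed cache: hash on the cache name (simplified to a modulo here).
        return members.get(Math.floorMod(cacheName.hashCode(), members.size()));
    }

    /**
     * Registers a client channel under a client listener ID.
     *
     * @return true if this call created the shared cluster listener,
     *         false if the channel was added to an existing one
     */
    boolean register(String listenerId, String channel) {
        boolean[] created = {false};
        // compute() runs atomically per key, so only one caller takes the "first" branch.
        channelsByListenerId.compute(listenerId, (id, channels) -> {
            if (channels == null) {
                channels = ConcurrentHashMap.newKeySet();
                created[0] = true;
            }
            channels.add(channel);
            return channels;
        });
        return created[0];
    }

    /** Number of client channels currently attached to the given listener ID. */
    int channelCount(String listenerId) {
        Set<String> channels = channelsByListenerId.get(listenerId);
        return channels == null ? 0 : channels.size();
    }
}
```

Failover would need the same first-versus-others distinction again on the newly chosen node, which is where most of the extra server-side complexity described above would come from.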
--
[JBoss JIRA] (ISPN-3826) CacheResultInterceptor has no InterceptorBindingType
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-3826?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec commented on ISPN-3826:
-------------------------------------------
Thank you for very valuable input!
> CacheResultInterceptor has no InterceptorBindingType
> ----------------------------------------------------
>
> Key: ISPN-3826
> URL: https://issues.jboss.org/browse/ISPN-3826
> Project: Infinispan
> Issue Type: Bug
> Components: JCache
> Affects Versions: 5.3.0.Final
> Reporter: Nicolai Mainiero
> Assignee: Sebastian Łaskawiec
> Priority: Critical
>
> I'm migrating an enterprise application from JBoss AS 7 to WebSphere AS 8. We use Infinispan as Cache via the JCache API. On JBoss everything works fine. On WebSphere we get the following exception:
> org.apache.webbeans.exception.WebBeansConfigurationException: WebBeans XML configuration defined in file infinispan-jcache-5.3.0.Final.jar!/META-INF/beans.xml is failed. Reason is : Interceptor class : org.infinispan.jcache.annotation.CacheResultInterceptor must have at least one @InterceptorBindingType
> I think the cause is that CacheResultInterceptor has no InterceptorBindingType, as required by the specification (https://docs.jboss.org/cdi/spec/1.0/html/interceptors.html), and WebSphere's CDI implementation (OpenWebBeans) does not accept interceptors without bindings.
--