[JBoss JIRA] (ISPN-2955) Async marshalling executor retry when queue fills
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2955?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2955:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> Async marshalling executor retry when queue fills
> -------------------------------------------------
>
> Key: ISPN-2955
> URL: https://issues.jboss.org/browse/ISPN-2955
> Project: Infinispan
> Issue Type: Enhancement
> Components: Marshalling
> Affects Versions: 5.2.5.Final
> Reporter: Manik Surtani
> Assignee: Manik Surtani
> Fix For: 5.3.0.Beta1, 5.3.0.Final
>
>
> When using an async transport and async marshalling, an executor is used to process the marshalling task in a separate thread and the caller's thread is allowed to return immediately.
> When the executor's queue fills and the queue cannot accept any more tasks, it throws a {{RejectedExecutionException}}, causing a very bad user/developer experience. A more correct approach to this is to catch the {{RejectedExecutionException}}, block and retry the task submission.
> The end result is that, in the degenerate case (when the executor queue is full) instead of throwing exceptions, those invocations will perform slightly slower.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (ISPN-2974) DeltaAware based fine-grained replication corrupts cache data, if eviction is enabled
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2974?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2974:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> DeltaAware based fine-grained replication corrupts cache data, if eviction is enabled
> -------------------------------------------------------------------------------------
>
> Key: ISPN-2974
> URL: https://issues.jboss.org/browse/ISPN-2974
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.1.Final, 5.2.5.Final
> Reporter: Horia Chiorean
> Assignee: Adrian Nistor
> Priority: Critical
> Fix For: 5.3.0.Beta1
>
>
> When using a custom {{DeltaAware}} implementation in a cluster with 2 replicated nodes with eviction enabled, data transferred from one node (the writer) to the another (the reader) causes data stored on this node and evicted at the time of the change, to be rewritten with whatever the partial latest delta was.
> In more detail:
> * configure 2 nodes in replicated mode, with eviction enabled
> * consider NodeA the writer and NodeB the reader
> * NodeA inserts some data (custom entries) into the cache
> * NodeB correctly receives via state transfer the initial data
> * NodeA loads & partially updates some information about an entry which was not in the cache - was evicted previously
> * NodeB receives the partial delta with the changes from NodeA, but *instead of merging* with whatever is stored in the persistent store, *replaces the entire entry in the cache*, leaving it in effect with "partial/corrupt information"
> If eviction is not enabled, everything works as expected.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (ISPN-2964) putForExternalRead to L1 not invalidated
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2964?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2964:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> putForExternalRead to L1 not invalidated
> ----------------------------------------
>
> Key: ISPN-2964
> URL: https://issues.jboss.org/browse/ISPN-2964
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.1.Final
> Reporter: Sebastian Tusk
> Assignee: Mircea Markus
> Fix For: 5.3.0.Beta1, 5.3.0.Final
>
>
> With transactional distributed caches it happens that Cache.putForExternalRead(k,v) places an entry into L1 that never gets invalidated. It seems to happen when the the owner of k doesn't have the entry. In this case the non owner puts k into his cache without having the owner registering this. Usually the owner stores all requesters in L1ManagerImpl.addRequester and sends out invalidations to the requesters. What should happen is that the entry is replicated to the owner.
> Cache config:
> <namedCache name="entity">
> <jmxStatistics enabled="true" />
> <clustering mode="dist">
> <stateTransfer fetchInMemoryState="false" timeout="20000" />
> <async />
> <l1 enabled="true" />
> <hash numOwners="1"/>
> </clustering>
> <locking isolationLevel="READ_COMMITTED"
> lockAcquisitionTimeout="15000" useLockStriping="false" />
> <eviction maxEntries="10000" strategy="LRU" />
> <expiration maxIdle="100000" wakeUpInterval="5000"/>
> <storeAsBinary storeKeysAsBinary="true" storeValuesAsBinary="false" enabled="false" />
> <transaction transactionMode="TRANSACTIONAL" autoCommit="false" lockingMode="OPTIMISTIC"/>
> </namedCache>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (ISPN-2965) L1 and early invalidation leaves inconsistent state
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2965?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2965:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> L1 and early invalidation leaves inconsistent state
> ---------------------------------------------------
>
> Key: ISPN-2965
> URL: https://issues.jboss.org/browse/ISPN-2965
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache, Transactions
> Affects Versions: 5.2.1.Final
> Reporter: Sebastian Tusk
> Assignee: Adrian Nistor
> Labels: 5.2.x
> Fix For: 5.3.0.Beta1, 5.3.0.Final
>
>
> In a distributed transactional cache with L1 enabled I can observe the following.
> Prepare cache by adding an entry with Cache.put( k, v1 ).
> 1. Node B starts with adding a changed value. Cache.put( k, v2 )
> 2. Node B TxDistributionInterceptor.visitPrepareCommand flushL1Caches sends invalidations.
> 3. Node A calls Cache.get( k ) retrieves v1 and stores this value in L1.
> 4. Node B proceeds with transaction.
> The result is that Node A answers subsequent Cache.get(k) with v1 and Node B answers with v2.
> It seems the invalidation is either send to early or must be synchronized in some way with the transaction.
> Cache config:
> <namedCache name="entity">
> <jmxStatistics enabled="true" />
> <clustering mode="dist">
> <stateTransfer fetchInMemoryState="false" timeout="20000" />
> <async />
> <l1 enabled="true" />
> <hash numOwners="1"/>
> </clustering>
> <locking isolationLevel="READ_COMMITTED"
> lockAcquisitionTimeout="15000" useLockStriping="false" />
> <eviction maxEntries="10000" strategy="LRU" />
> <expiration maxIdle="100000" wakeUpInterval="5000"/>
> <storeAsBinary storeKeysAsBinary="true" storeValuesAsBinary="false" enabled="false" />
> <transaction transactionMode="TRANSACTIONAL" autoCommit="false" lockingMode="OPTIMISTIC"/>
> </namedCache>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (ISPN-2950) In distributed mode cache store data should be read through the main data owner (vs directly from the store)
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2950?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2950:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> In distributed mode cache store data should be read through the main data owner (vs directly from the store)
> ------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-2950
> URL: https://issues.jboss.org/browse/ISPN-2950
> Project: Infinispan
> Issue Type: Feature Request
> Components: Loaders and Stores
> Reporter: Sanne Grinovero
> Assignee: Mircea Markus
> Priority: Critical
> Fix For: 5.3.0.Beta1, 5.3.0.Final
>
>
> Dist cache with a cache store (shared or not), k owned by \{N1, N2\}. k is read on N3. What currently happens at this stage, if k is not present in N3's memory (likely unless L1 is configured), the N3's cache store is queried and data is loaded from there. This has several drawbacks:
> - the data might already be in the memory of the owner node (N1,N2) so reading it from the disk is highly inefficient. Especially for hot data: data requested from various nodes at the same time (see also mailing list discussion around lucene query performance depending on this)
> - if this is a local cache store, it might contain stale data which would be returned to the user
> - for async configured cache store this would result in dirty reads, given that a change might be in the async store's memory but not in the store at the moment when it is in read by N3. (Note that using async stores still leaves place to inconsistencies when a node leaves, e.g. because of node crashing before managing to flush the async store.)
> This JIRA is about changing the distribution mode: when asked for a specific key, a node would only touch a cache store if it is an owner of that key, otherwise would first go to the main owner of the key to read the value from there. The ClusterCacheLoader should be deprecated as well.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (ISPN-2971) AsyncStoreStressTest is broken
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2971?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2971:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> AsyncStoreStressTest is broken
> ------------------------------
>
> Key: ISPN-2971
> URL: https://issues.jboss.org/browse/ISPN-2971
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 5.2.5.Final
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Labels: testsuite
> Fix For: 5.3.0.Beta1
>
>
> This test is not run during normal test suite, but we still need to fix this failure and the leaked threads (see System.exit(0) in main method):
> {code}
> testReadWriteRemove(org.infinispan.stress.AsyncStoreStressTest) Time elapsed: 72.377 sec <<< FAILURE!
> java.lang.UnsupportedOperationException
> at java.util.AbstractMap$SimpleImmutableEntry.setValue(AbstractMap.java:726)
> at org.infinispan.stress.AsyncStoreStressTest.testReadWriteRemove(AsyncStoreStressTest.java:140)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:662)
> {code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (ISPN-2956) putIfAbsent on Hot Rod Java client doesn't reliably fulfil contract
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2956?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2956:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> putIfAbsent on Hot Rod Java client doesn't reliably fulfil contract
> -------------------------------------------------------------------
>
> Key: ISPN-2956
> URL: https://issues.jboss.org/browse/ISPN-2956
> Project: Infinispan
> Issue Type: Bug
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Labels: hotrod-java-client, remote-clients
> Fix For: 5.3.0.Beta1, 5.3.0.Final
>
>
> Hot Rod's putIfAbsent might have issues on some edge cases:
> {quote}I want to know whether the putting entry already exists in the remote
> cache cluster, or not.
> I thought that RemoteCache.putIfAbsent() would be useful for that
> purpose, i.e.,
> {code}
> if (remoteCache.putIfAbsent(k,v) == null) {
> // new entry.
> } else {
> // k already exists.
> }
> {code}
> But no.
> The putIfAbsent() for new entry may return non-null value, if one of the
> server crushed while putting.
> The behavior is like the following:
> 1. Client do putIfAbsent(k,v).
> 2. The server receives the request and sends replication requests to
> other servers. If the server crushed before completing replication, some
> servers own that (k,v), but others not.
> 3. Client receives the error. The putIfAbsent() internally retries the
> same request to the next server in the cluster server list.
> 4. If the next server owns the (k,v), the putIfAbsent() returns the
> replicated (k,v) at the step 2, without any error.
> So, putIfAbsent() is not reliable for knowing whether the putting entry
> is *exactly* new or not.
> Does anyone have any idea/workaround for this purpose?{quote}
> A workaround is to do this:
> {quote}We got a simple solution, which can be applied to our customer's application.
> If each value part of putting (k,v) is unique or contains unique value,
> the client can do *double check* wether the entry is new.
> {code}
> val = System.nanoTime(); // or uuid is also useful.
> if ((ret = cache.putIfAbsent(key, val)) == null
> || ret.equals(val)) {
> // new entry, if the return value is just the same.
> } else {
> // key already exists.
> }
> {code}
> We are proposing this workaround which almost works fine.{quote}
> However, this is a bit of a cludge.
> Hot Rod should be improved with an operation that allows a version to be passed when entry is created, instead of relying on the client generating it.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (ISPN-2891) Gap in time between commit of transaction and actual value update
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2891?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2891:
--------------------------------
Fix Version/s: 5.3.0.Beta1
(was: 5.3.0.Alpha1)
> Gap in time between commit of transaction and actual value update
> -----------------------------------------------------------------
>
> Key: ISPN-2891
> URL: https://issues.jboss.org/browse/ISPN-2891
> Project: Infinispan
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 5.2.1.Final
> Reporter: Jim Crossley
> Assignee: Galder Zamarreño
> Labels: 5.2.x
> Fix For: 5.3.0.Beta1, 5.3.0.Final
>
> Attachments: bad.log, good.log, pessimistic.log, ugly.log
>
>
> Since upgrading our AS7.2 dependency in Immutant (transitively pulling in 5.2.1.Final), one of our integration tests has begun failing intermittently on our CI server. We've yet to see the failure in local runs, only on CI, so I suspect there's something racist going on.
> The two tests (one for optimistic locking, the other for pessimistic) integrate an Infinispan cache (on which the Immutant cache is built) with HornetQ and XA transactions. A number of queue listeners respond to messages by attempting to increment a value in the cache. The failure occurs with both locking schemes, but much more often with optimistic.
> We've confirmed the failure on 5.2.2 as well.
> Attached you'll find three traces of the optimistic test: the good, the bad, and the ugly. All three correspond to this test: https://github.com/immutant/immutant/blob/31a2ef6222088ccb828898e9e3e4531...
> So you can correlate the log messages prefixed with "JC:" in the traces to the code. Note in particular the last two lines in locking.clj: a logged message containing the count, and then an assertion of the count. Note that the "bad" trace was an actual failing test, but the "ugly" trace was a successful test, even though the trace clearly shows the count logged as 2, not 3. The Infinispan TRACE output clearly shows the value as 3, hence the ugliness of this test.
> It's important to understand that the "work" function occurs within an XA transaction. This means, as I understand it, that if three messages are published to "/queue/done", the cached count should equal 3. Line #44 in locking.clj will block until it receives 3 messages, after which the cached count should be 3.
> These tests always pass locally. They only ever fail on CI, which runs *very* slowly.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months