[JBoss JIRA] (ISPN-3315) Retry remote get after topology change if all the targets are no longer owners
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3315?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3315:
--------------------------------
Assignee: Pedro Ruivo (was: Dan Berindei)
> Retry remote get after topology change if all the targets are no longer owners
> ------------------------------------------------------------------------------
>
> Key: ISPN-3315
> URL: https://issues.jboss.org/browse/ISPN-3315
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.3.0.Final
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: 620
>
> It's possible for a remote get to reach its intended targets only after they are no longer owners, and they don't have the key anymore. If that happens, instead of returning a {{null}} value, the originator should retry on the new owners (possibly in a loop).
> Note that this is different from all the owners leaving the cluster at the same time: in that case retrying on the new owners wouldn't make a difference, because data would be lost anyway.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-3315) Retry remote get after topology change if all the targets are no longer owners
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3315?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3315:
--------------------------------
Fix Version/s: 6.0.0.Final
> Retry remote get after topology change if all the targets are no longer owners
> ------------------------------------------------------------------------------
>
> Key: ISPN-3315
> URL: https://issues.jboss.org/browse/ISPN-3315
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.3.0.Final
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.Final
>
>
> It's possible for a remote get to reach its intended targets only after they are no longer owners, and they don't have the key anymore. If that happens, instead of returning a {{null}} value, the originator should retry on the new owners (possibly in a loop).
> Note that this is different from all the owners leaving the cluster at the same time: in that case retrying on the new owners wouldn't make a difference, because data would be lost anyway.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-3454) Hot Rod client doesn't retry operation on RemoteException
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-3454?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño commented on ISPN-3454:
----------------------------------------
Try [branch|https://github.com/galderz/infinispan/tree/t_3454] with possible fix in resilience jobs.
> Hot Rod client doesn't retry operation on RemoteException
> ---------------------------------------------------------
>
> Key: ISPN-3454
> URL: https://issues.jboss.org/browse/ISPN-3454
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 6.0.0.Alpha3
> Reporter: Michal Linhard
> Assignee: Galder Zamarreño
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.CR2, 6.0.0.Final
>
>
> This is a client-side problem.
> In a resilience test with 4 nodes where 1 is killed, I'm getting a lot of these:
> {code}
> 08:30:55,198 ERROR [org.jboss.smartfrog.jdg.loaddriver.DriverThread] (DriverThread-369) Error doing: PUT key399869 to node node04, took 493 ms
> org.infinispan.client.hotrod.exceptions.HotRodClientException:Request for message id[821188] returned server error (status=0x85): org.infinispan.remoting.RemoteException: ISPN000217: Received exception from node01/default, see cause for remote stack trace
> at org.infinispan.client.hotrod.impl.protocol.Codec10.checkForErrorsInResponseStatus(Codec10.java:143)
> at org.infinispan.client.hotrod.impl.protocol.Codec10.readHeader(Codec10.java:99)
> at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:56)
> at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:50)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:30)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:19)
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:46)
> at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:209)
> at org.infinispan.client.hotrod.impl.RemoteCacheSupport.put(RemoteCacheSupport.java:79)
> at org.jboss.qa.jdg.adapter.Infinispan60Adapter$HotRodRemoteCacheAdapter.put(Infinispan60Adapter.java:269)
> at org.jboss.smartfrog.jdg.loaddriver.DriverThreadImpl.makeRequest(DriverThreadImpl.java:265)
> at org.jboss.smartfrog.jdg.loaddriver.DriverThreadImpl.run(DriverThreadImpl.java:378)
> {code}
> Isn't this a recoverable problem that shouldn't be left to user to handle ?
> source:
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/RESILIENCE...
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-2177) Refactor AbstractCacheTransaction
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-2177?page=com.atlassian.jira.plugin.... ]
Work on ISPN-2177 stopped by William Burns.
> Refactor AbstractCacheTransaction
> -----------------------------------
>
> Key: ISPN-2177
> URL: https://issues.jboss.org/browse/ISPN-2177
> Project: Infinispan
> Issue Type: Feature Request
> Components: Transactions
> Affects Versions: 5.1.2.FINAL
> Reporter: Mircea Markus
> Assignee: William Burns
> Labels: refactoring, transaction
>
> There are several collections holding transaction related information in the AbstractCacheTransaction:
> - lockedKeys: this holds all the keys that were actually locked on the local node
> - affectedKeys: this holds all the keys that were acquired by the transaction allover the cluster
> - backupKeyLocks: this holds all the locks for which the local node is a secondary data owner.
> To do:
> - affectedKeys belongs to LocalCacheTransaction(subclass) and no point in having it in the AbstractCacheTransaction
> - a better name for affectedKeys might be "clusterLockedKey" and for lockedKeys --> localLokedKeys
> - also add a Javadoc explaining the correlation between these key groups
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-3645) StateTransferLargeObjectTest hangs randomly
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3645?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3645:
--------------------------------
Priority: Major (was: Critical)
> StateTransferLargeObjectTest hangs randomly
> -------------------------------------------
>
> Key: ISPN-3645
> URL: https://issues.jboss.org/browse/ISPN-3645
> Project: Infinispan
> Issue Type: Bug
> Components: RPC
> Affects Versions: 6.0.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: 620
> Fix For: 6.0.0.CR2
>
> Attachments: stlot.stack
>
>
> StateTransferLargeObject sometimes hangs in the second part of the test, when it checks that all the nodes in the cluster can read the inserted values. I was able to make it hang reliably when run separately, by increasing the number of keys from 1000 to 5000.
> The cause is probably JGRP-1675, as many OOB threads appear to be stuck in FlowControl.decrementIfEnoughCredits.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-3556) When LockControlCommand fails on an owner, the rollback command is not sent
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3556?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3556:
--------------------------------
Assignee: Pedro Ruivo (was: Dan Berindei)
> When LockControlCommand fails on an owner, the rollback command is not sent
> ---------------------------------------------------------------------------
>
> Key: ISPN-3556
> URL: https://issues.jboss.org/browse/ISPN-3556
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.2.7.Final, 5.3.0.Final, 6.0.0.Beta1
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.Final
>
>
> If a transaction starts with a {{lock()}} operation and the lock fails on one of the owners (e.g. because of a {{SuspectException}}), the rollback command should still be sent to all the live owners.
> However, because a locked key is only registered in the {{affectedKeys}} collection after a successful lock operation (in {{PessimisticLockingInterceptor.acquireRemoteIfNeeded()}}, the rollback command is not sent to any owners.
> This is in a pessimistic cache. However, looking at the {{OptimisticLockingInterceptor.acquireAllLocks()}} code I think I see a similar problem: it's possible that a key is locked, but the write skew check fails and the key is not added to the {{affectedKeys}} collection. We should always register the key first and attempt to lock it after.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-3316) CDI Cache interceptor tests are failing while running in EAP container
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3316?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3316:
--------------------------------
Assignee: Tristan Tarrant (was: Mircea Markus)
> CDI Cache interceptor tests are failing while running in EAP container
> ----------------------------------------------------------------------
>
> Key: ISPN-3316
> URL: https://issues.jboss.org/browse/ISPN-3316
> Project: Infinispan
> Issue Type: Bug
> Components: CDI integration
> Affects Versions: 5.3.0.Final
> Reporter: Anna Manukyan
> Assignee: Tristan Tarrant
> Labels: 620
>
> While migration of the CDI related tests to work under EAP container (at the moment using 6.0.GA), it was found that the tests related to interceptors are failing. The issue relates to
> org.infinispan.cdi.test.interceptor.CachePutInterceptorTest, org.infinispan.cdi.test.interceptor.CacheRemoveAllInterceptorTest, org.infinispan.cdi.test.interceptor.CacheRemoveInterceptorTest, org.infinispan.cdi.test.interceptor.CacheResultInterceptorTest.
> The failure relates to assertions. The thing is that all actions which are done using javax.cache.cache-api annotations, work properly (I've added some logs e.g. in CachePutInterceptor, and it shows that the data is put to the cache properly).
> But later when the test wants to verify that the data is really put to the data, retrieves the cache from the injected CacheContainer, the cache is empty - the data is not there.
> The issue appeared since the latest changes to the Infinispan-CDI module, and split to infinispan-jcache module. For the previous version, the tests are passed under EAP container.
> The example above was given for the CachePutInterceptorTest, but the same refers to the rest of them.
> The git repo for the sources, will be provided later.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-3432) Data put to index enabled cache with Infinispan Directory provider using Async. JDBC StringBased CacheStore fails
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3432?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3432:
--------------------------------
Fix Version/s: 6.0.0.Final
> Data put to index enabled cache with Infinispan Directory provider using Async. JDBC StringBased CacheStore fails
> -----------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-3432
> URL: https://issues.jboss.org/browse/ISPN-3432
> Project: Infinispan
> Issue Type: Bug
> Components: Querying
> Affects Versions: 6.0.0.Alpha1
> Reporter: Anna Manukyan
> Assignee: Adrian Nistor
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.Final
>
> Attachments: async-config.xml
>
>
> Hi,
> this issue is related to the ISPN-3090, but I thought to specify this case separately for bringing detailed explanation for the configuration and thrown exceptions.
> The issue relates to the performance tests for Index enabled Infinispan cache, with configured Infinispan directory and Async JDBC. String Based Cache store.
> The tests are running on 4 nodes and performing puts/gets on all nodes with many threads.
> The problem is that, during data put, the following exceptions are thrown continuously:
> {code}
> 04:04:05,633 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index query-1) HSEARCH000058: Exception occurred org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: []
> Primary Failure:
> Entity org.radargun.cachewrappers.InfinispanQueryWrapper$QueryableData Id S:_InstallBenchmarkStage_0 Work Type org.hibernate.search.backend.UpdateLuceneWork
> org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: []
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:667)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:554)
> at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1138)
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:148)
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:115)
> at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117)
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:101)
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:67)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> ......
> 04:14:21,605 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index query-1) HSEARCH000058: Exception occurred org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: []
> Primary Failure:
> Entity org.radargun.cachewrappers.InfinispanQueryWrapper$QueryableData Id S:key_0_0_0000000000000017 Work Type org.hibernate.search.backend.UpdateLuceneWork
> org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: []
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:667)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:554)
> at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359)
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1138)
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:148)
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:115)
> at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117)
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:101)
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:67)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {code}
> You can find the cache configuration attached.
> Yet another thing to mention:
> if the following line is added to the cache configuration:
> {code}
> <property name="default.indexmanager" value="org.infinispan.query.indexmanager.InfinispanIndexManager" />
> {code}
> then the issue is gone - no lock issue appears then.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months