November 2013 - infinispan-issues

[JBoss JIRA] (ISPN-3315) Retry remote get after topology change if all the targets are no longer owners

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-3315?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-3315:
--------------------------------
    Assignee: Pedro Ruivo (was: Dan Berindei)

> Retry remote get after topology change if all the targets are no longer owners
> ------------------------------------------------------------------------------
>
> Key: ISPN-3315
> URL: https://issues.jboss.org/browse/ISPN-3315
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.3.0.Final
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: 620
>
> It's possible for a remote get to reach its intended targets only after they are no longer owners, and they don't have the key anymore. If that happens, instead of returning a {{null}} value, the originator should retry on the new owners (possibly in a loop).
> Note that this is different from all the owners leaving the cluster at the same time: in that case retrying on the new owners wouldn't make a difference, because data would be lost anyway.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
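
The retry that ISPN-3315 asks for can be illustrated with a minimal sketch. This is not Infinispan code: `RemoteGetRetrySketch`, `Reply`, `remoteGet`, the `currentOwners` supplier and the `rpc` function are all hypothetical names, and the sketch assumes each target can tell the originator whether it was still an owner when it answered.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Supplier;

public class RemoteGetRetrySketch {

    /** Hypothetical reply: the target reports whether it still owned the key when it answered. */
    static final class Reply {
        final boolean fromOwner;
        final String value; // null is a valid answer when it comes from a current owner

        Reply(boolean fromOwner, String value) {
            this.fromOwner = fromOwner;
            this.value = value;
        }
    }

    /**
     * Asks the current owners for the key. If none of the targets was still an owner,
     * re-reads the owner list and tries again, up to maxAttempts times.
     */
    static String remoteGet(String key,
                            Supplier<List<String>> currentOwners,
                            BiFunction<String, String, Reply> rpc,
                            int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            for (String node : currentOwners.get()) {
                Reply reply = rpc.apply(node, key);
                if (reply.fromOwner) {
                    return reply.value; // authoritative, even if null
                }
            }
            // Every target had already lost ownership: retry on the new owners.
        }
        throw new IllegalStateException("No owner replied after " + maxAttempts + " attempts");
    }
}
```

A `null` is only returned when a current owner says the key is absent. The bounded attempt count is a design choice of this sketch; the issue itself only says "possibly in a loop".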

12 years, 5 months


[JBoss JIRA] (ISPN-3315) Retry remote get after topology change if all the targets are no longer owners

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-3315?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-3315:
--------------------------------
    Fix Version/s: 6.0.0.Final

> Retry remote get after topology change if all the targets are no longer owners
> ------------------------------------------------------------------------------
>
> Key: ISPN-3315
> URL: https://issues.jboss.org/browse/ISPN-3315
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.3.0.Final
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.Final
>
> It's possible for a remote get to reach its intended targets only after they are no longer owners, and they don't have the key anymore. If that happens, instead of returning a {{null}} value, the originator should retry on the new owners (possibly in a loop).
> Note that this is different from all the owners leaving the cluster at the same time: in that case retrying on the new owners wouldn't make a difference, because data would be lost anyway.

12 years, 5 months


[JBoss JIRA] (ISPN-3454) Hot Rod client doesn't retry operation on RemoteException

by Galder Zamarreño (JIRA)

[ https://issues.jboss.org/browse/ISPN-3454?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño commented on ISPN-3454:
----------------------------------------

Try [branch|https://github.com/galderz/infinispan/tree/t_3454] with possible fix in resilience jobs.

> Hot Rod client doesn't retry operation on RemoteException
> ---------------------------------------------------------
>
> Key: ISPN-3454
> URL: https://issues.jboss.org/browse/ISPN-3454
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 6.0.0.Alpha3
> Reporter: Michal Linhard
> Assignee: Galder Zamarreño
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.CR2, 6.0.0.Final
>
> This is a client-side problem.
> In a resilience test with 4 nodes where 1 is killed, I'm getting a lot of these:
> {code}
> 08:30:55,198 ERROR [org.jboss.smartfrog.jdg.loaddriver.DriverThread] (DriverThread-369) Error doing: PUT key399869 to node node04, took 493 ms
> org.infinispan.client.hotrod.exceptions.HotRodClientException:Request for message id[821188] returned server error (status=0x85): org.infinispan.remoting.RemoteException: ISPN000217: Received exception from node01/default, see cause for remote stack trace
>     at org.infinispan.client.hotrod.impl.protocol.Codec10.checkForErrorsInResponseStatus(Codec10.java:143)
>     at org.infinispan.client.hotrod.impl.protocol.Codec10.readHeader(Codec10.java:99)
>     at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:56)
>     at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:50)
>     at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:30)
>     at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:19)
>     at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:46)
>     at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:209)
>     at org.infinispan.client.hotrod.impl.RemoteCacheSupport.put(RemoteCacheSupport.java:79)
>     at org.jboss.qa.jdg.adapter.Infinispan60Adapter$HotRodRemoteCacheAdapter.put(Infinispan60Adapter.java:269)
>     at org.jboss.smartfrog.jdg.loaddriver.DriverThreadImpl.makeRequest(DriverThreadImpl.java:265)
>     at org.jboss.smartfrog.jdg.loaddriver.DriverThreadImpl.run(DriverThreadImpl.java:378)
> {code}
> Isn't this a recoverable problem that shouldn't be left to the user to handle?
> source:
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/RESILIENCE...
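
What the reporter of ISPN-3454 asks for is that the client retries instead of surfacing the error. A minimal sketch of that idea follows. It does not use the Hot Rod client API: `RetryOnServerErrorSketch`, `ServerErrorException` and `executeWithRetry` are hypothetical names, and treating the `0x85` status from the log as retryable is an assumption of this sketch, not a statement about how the fix branch works.

```java
import java.util.function.Supplier;

public class RetryOnServerErrorSketch {

    /** Hypothetical stand-in for a client exception that carries the response status. */
    static final class ServerErrorException extends RuntimeException {
        final int status;

        ServerErrorException(int status, String message) {
            super(message);
            this.status = status;
        }
    }

    // Status copied from the log ("status=0x85"); treating it as retryable is an assumption.
    static final int SERVER_ERROR_STATUS = 0x85;

    /** Runs the operation, retrying up to maxRetries times when the server error status comes back. */
    static <T> T executeWithRetry(Supplier<T> operation, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        ServerErrorException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return operation.get();
            } catch (ServerErrorException e) {
                if (e.status != SERVER_ERROR_STATUS) {
                    throw e; // any other status is not retried in this sketch
                }
                last = e;
            }
        }
        throw last; // retries exhausted: surface the last failure to the caller
    }
}
```

A real client would probably also switch to a different server between attempts; the sketch leaves that out.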

12 years, 5 months


[JBoss JIRA] (ISPN-3426) L1 inconsistency in tx caches when backup owner replies to remote get

by William Burns (JIRA)

[ https://issues.jboss.org/browse/ISPN-3426?page=com.atlassian.jira.plugin.... ]

William Burns resolved ISPN-3426.
---------------------------------
    Resolution: Duplicate Issue

This is fixed by ISPN-3648 by having backup owners also invalidate.

> L1 inconsistency in tx caches when backup owner replies to remote get
> ---------------------------------------------------------------------
>
> Key: ISPN-3426
> URL: https://issues.jboss.org/browse/ISPN-3426
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 6.0.0.Alpha2
> Reporter: Pedro Ruivo
> Assignee: William Burns
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.Final
>
> Consider the following scenario:
> {noformat}
> node1: performs a remote get
> node2 (backup owner): receives and replies to the remote get with v1
> node1: receives the reply and stores it in L1: key => v1
> node2 (backup owner): commits a new value (v2) for the key (i.e. processes a commit command)
> node3 (primary owner): sends the invalidation (but not for node1 because it hasn't received the remote get yet) and commits a new value (v2) for the key (i.e. processes a commit command)
> node3 (primary owner): replies to the remote get with v2
> node1: ignores the reply because it used the node2 reply
> {noformat}
> *conclusion*: node1 keeps the old value stored in L1.
> Possible solutions described here:
> [L1 consistency for transactional caches|http://markmail.org/thread/ckbihuj5ch7qtdzj]
> Could also be interesting:
> [Staggered get question|http://markmail.org/thread/qun2frj2u5a2hnj2]
> and
> [ISPN-825|https://issues.jboss.org/browse/ISPN-825]

12 years, 5 months


[JBoss JIRA] (ISPN-2177) Refactor AbstractCacheTransaction

by William Burns (JIRA)

[ https://issues.jboss.org/browse/ISPN-2177?page=com.atlassian.jira.plugin.... ]

Work on ISPN-2177 stopped by William Burns.

> Refactor AbstractCacheTransaction
> -----------------------------------
>
> Key: ISPN-2177
> URL: https://issues.jboss.org/browse/ISPN-2177
> Project: Infinispan
> Issue Type: Feature Request
> Components: Transactions
> Affects Versions: 5.1.2.FINAL
> Reporter: Mircea Markus
> Assignee: William Burns
> Labels: refactoring, transaction
>
> There are several collections holding transaction-related information in the AbstractCacheTransaction:
> - lockedKeys: this holds all the keys that were actually locked on the local node
> - affectedKeys: this holds all the keys that were acquired by the transaction all over the cluster
> - backupKeyLocks: this holds all the locks for which the local node is a secondary data owner.
> To do:
> - affectedKeys belongs to LocalCacheTransaction (subclass) and there is no point in having it in the AbstractCacheTransaction
> - a better name for affectedKeys might be "clusterLockedKey" and for lockedKeys --> localLockedKeys
> - also add a Javadoc explaining the correlation between these key groups

12 years, 5 months


[JBoss JIRA] (ISPN-3645) StateTransferLargeObjectTest hangs randomly

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-3645?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-3645:
--------------------------------
    Priority: Major (was: Critical)

> StateTransferLargeObjectTest hangs randomly
> -------------------------------------------
>
> Key: ISPN-3645
> URL: https://issues.jboss.org/browse/ISPN-3645
> Project: Infinispan
> Issue Type: Bug
> Components: RPC
> Affects Versions: 6.0.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: 620
> Fix For: 6.0.0.CR2
>
> Attachments: stlot.stack
>
> StateTransferLargeObjectTest sometimes hangs in the second part of the test, when it checks that all the nodes in the cluster can read the inserted values. I was able to make it hang reliably when run separately, by increasing the number of keys from 1000 to 5000.
> The cause is probably JGRP-1675, as many OOB threads appear to be stuck in FlowControl.decrementIfEnoughCredits.

12 years, 5 months


[JBoss JIRA] (ISPN-3556) When LockControlCommand fails on an owner, the rollback command is not sent

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-3556?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-3556:
--------------------------------
    Assignee: Pedro Ruivo (was: Dan Berindei)

> When LockControlCommand fails on an owner, the rollback command is not sent
> ---------------------------------------------------------------------------
>
> Key: ISPN-3556
> URL: https://issues.jboss.org/browse/ISPN-3556
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.2.7.Final, 5.3.0.Final, 6.0.0.Beta1
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.Final
>
> If a transaction starts with a {{lock()}} operation and the lock fails on one of the owners (e.g. because of a {{SuspectException}}), the rollback command should still be sent to all the live owners.
> However, because a locked key is only registered in the {{affectedKeys}} collection after a successful lock operation (in {{PessimisticLockingInterceptor.acquireRemoteIfNeeded()}}), the rollback command is not sent to any owners.
> This is in a pessimistic cache. However, looking at the {{OptimisticLockingInterceptor.acquireAllLocks()}} code I think I see a similar problem: it's possible that a key is locked, but the write skew check fails and the key is not added to the {{affectedKeys}} collection. We should always register the key first and attempt to lock it after.
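
The ordering proposed at the end of ISPN-3556 (register the key first, then attempt the lock) can be shown with a small sketch. It is not the interceptor code: `RegisterBeforeLockSketch`, `rollbackSentFor` and the `remoteLock` predicate are hypothetical, and only the name `affectedKeys` is taken from the issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class RegisterBeforeLockSketch {

    // Keys the transaction may hold locks for anywhere in the cluster (name from the issue).
    final Set<String> affectedKeys = new LinkedHashSet<>();
    // Records which keys a rollback was sent for; stands in for the real rollback command.
    final List<String> rollbackSentFor = new ArrayList<>();

    /** Registers the key first, then attempts the (hypothetical) remote lock. */
    void lock(String key, Predicate<String> remoteLock) {
        affectedKeys.add(key);
        if (!remoteLock.test(key)) {
            rollback();
            throw new IllegalStateException("Could not lock " + key);
        }
    }

    /** Sends the rollback for every registered key, including keys whose lock attempt failed. */
    void rollback() {
        rollbackSentFor.addAll(affectedKeys);
    }
}
```

If the key were added only after a successful lock, `rollback()` would find nothing to send for a failed lock, which is the behaviour the issue reports.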

12 years, 5 months


[JBoss JIRA] (ISPN-3316) CDI Cache interceptor tests are failing while running in EAP container

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-3316?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-3316:
--------------------------------
    Assignee: Tristan Tarrant (was: Mircea Markus)

> CDI Cache interceptor tests are failing while running in EAP container
> ----------------------------------------------------------------------
>
> Key: ISPN-3316
> URL: https://issues.jboss.org/browse/ISPN-3316
> Project: Infinispan
> Issue Type: Bug
> Components: CDI integration
> Affects Versions: 5.3.0.Final
> Reporter: Anna Manukyan
> Assignee: Tristan Tarrant
> Labels: 620
>
> While migrating the CDI-related tests to run in the EAP container (at the moment using 6.0.GA), it was found that the tests related to interceptors are failing. The issue affects
> org.infinispan.cdi.test.interceptor.CachePutInterceptorTest, org.infinispan.cdi.test.interceptor.CacheRemoveAllInterceptorTest, org.infinispan.cdi.test.interceptor.CacheRemoveInterceptorTest, org.infinispan.cdi.test.interceptor.CacheResultInterceptorTest.
> The failures are assertion failures. All actions done through the javax.cache.cache-api annotations work properly (I've added some logs, e.g. in CachePutInterceptor, and they show that the data is put into the cache properly).
> But later, when the test wants to verify that the data was really put into the cache and retrieves the cache from the injected CacheContainer, the cache is empty: the data is not there.
> The issue appeared with the latest changes to the Infinispan-CDI module and the split into the infinispan-jcache module. With the previous version, the tests pass in the EAP container.
> The example above was given for CachePutInterceptorTest, but the same applies to the rest of them.
> The git repo for the sources will be provided later.

12 years, 5 months


[JBoss JIRA] (ISPN-3531) LifecycleManager of query module keeps references to per cache manager resources (jmx related)

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-3531?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-3531:
--------------------------------
    Fix Version/s: 6.0.0.Final

> LifecycleManager of query module keeps references to per cache manager resources (jmx related)
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-3531
> URL: https://issues.jboss.org/browse/ISPN-3531
> Project: Infinispan
> Issue Type: Bug
> Components: Querying
> Affects Versions: 6.0.0.Alpha4
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Labels: 620
> Fix For: 6.0.0.Final
>
> The resources in question are gathered when a cache manager is started and are used again when it is stopped. This does not work well if there are multiple cache managers that have different jmx domains.
> Cache stop event also does not seem to be processed correctly.

12 years, 5 months


[JBoss JIRA] (ISPN-3432) Data put to index enabled cache with Infinispan Directory provider using Async. JDBC StringBased CacheStore fails

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-3432?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-3432:
--------------------------------
    Fix Version/s: 6.0.0.Final

> Data put to index enabled cache with Infinispan Directory provider using Async. JDBC StringBased CacheStore fails
> -----------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-3432
> URL: https://issues.jboss.org/browse/ISPN-3432
> Project: Infinispan
> Issue Type: Bug
> Components: Querying
> Affects Versions: 6.0.0.Alpha1
> Reporter: Anna Manukyan
> Assignee: Adrian Nistor
> Priority: Critical
> Labels: 620
> Fix For: 6.0.0.Final
>
> Attachments: async-config.xml
>
> Hi,
> this issue is related to ISPN-3090, but I am filing this case separately to give a detailed description of the configuration and the exceptions thrown.
> The issue concerns the performance tests for an index-enabled Infinispan cache configured with the Infinispan Directory provider and an async JDBC string-based cache store.
> The tests run on 4 nodes and perform puts/gets on all nodes with many threads.
> The problem is that, during data put, the following exceptions are thrown continuously: > {code} > 04:04:05,633 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index query-1) HSEARCH000058: Exception occurred org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: [] > Primary Failure: > Entity org.radargun.cachewrappers.InfinispanQueryWrapper$QueryableData Id S:_InstallBenchmarkStage_0 Work Type org.hibernate.search.backend.UpdateLuceneWork > org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: [] > at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:667) > at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:554) > at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359) > at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1138) > at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:148) > at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:115) > at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117) > at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:101) > at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:67) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:724) > 
...... > 04:14:21,605 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] (Hibernate Search: Index updates queue processor for index query-1) HSEARCH000058: Exception occurred org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: [] > Primary Failure: > Entity org.radargun.cachewrappers.InfinispanQueryWrapper$QueryableData Id S:key_0_0_0000000000000017 Work Type org.hibernate.search.backend.UpdateLuceneWork > org.apache.lucene.index.IndexNotFoundException: no segments* file found in InfinispanDirectory{indexName='query'}: files: [] > at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:667) > at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:554) > at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:359) > at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1138) > at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:148) > at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:115) > at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriter(AbstractWorkspaceImpl.java:117) > at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:101) > at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:67) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:724) > {code} > You can find the cache configuration attached. 
> Yet another thing to mention:
> if the following line is added to the cache configuration:
> {code}
> <property name="default.indexmanager" value="org.infinispan.query.indexmanager.InfinispanIndexManager" />
> {code}
> then the issue is gone: no lock issue appears.

12 years, 5 months

