March 2012 - infinispan-issues - Jboss List Archives

[JBoss JIRA] (ISPN-1586) inconsistent cache data in replication cluster with local (not shared) cache store

by dex chen (Created) (JIRA)

inconsistent cache data in replication cluster with local (not shared) cache store
----------------------------------------------------------------------------------

Key: ISPN-1586
URL: https://issues.jboss.org/browse/ISPN-1586
Project: Infinispan
Issue Type: Bug
Components: Core API
Affects Versions: 5.0.0.FINAL
Environment: ISPN 5.0.0.FINAL and ISPN 5.1 snapshot, Java 1.7, Linux CentOS
Reporter: dex chen
Assignee: Manik Surtani

I reran my test (an embedded ISPN cluster) with ISPN 5.0.0.FINAL and the 5.1 snapshot code.

It is configured in "replication" mode, using a local cache store, with preload=true and purgeOnStartup=false (see the whole config below).

I get inconsistent data among the nodes in the following scenario:

1) Start a 2-node cluster.
2) After the cluster is formed, add some data to the cache:
   k1-->v1
   k2-->v2
   Data replication works perfectly at this point.
3) Bring node2 down.
4) Delete entry k1-->v1 through node1.
   Note: at this point, the local (persistent) cache store on node2 still has 2 entries.
5) Start node2 and wait for it to join the cluster.
6) After state merging, node1 has 1 entry and node2 has 2 entries.

I expect the data to be consistent across the cluster.

Here is the infinispan config:

<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="urn:infinispan:config:5.0 http://www.infinispan.org/schemas/infinispan-config-5.0.xsd"
            xmlns="urn:infinispan:config:5.0">
  <global>
    <transport clusterName="demoCluster" machineId="node1" rackId="r1" nodeName="dexlaptop">
      <properties>
        <property name="configurationFile" value="./jgroups-tcp.xml" />
      </properties>
    </transport>
    <globalJmxStatistics enabled="true"/>
  </global>
  <default>
    <locking isolationLevel="READ_COMMITTED" lockAcquisitionTimeout="20000"
             writeSkewCheck="false" concurrencyLevel="5000" useLockStriping="false" />
    <jmxStatistics enabled="true"/>
    <clustering mode="replication">
      <stateRetrieval timeout="240000" fetchInMemoryState="true" alwaysProvideInMemoryState="false" />
      <sync replTimeout="20000"/>
    </clustering>
    <loaders passivation="false" shared="false" preload="true">
      <loader class="org.infinispan.loaders.jdbc.stringbased.JdbcStringBasedCacheStore"
              fetchPersistentState="true" purgeOnStartup="false">
        <properties>
          <property name="stringsTableNamePrefix" value="ISPN_STRING_TABLE"/>
          <property name="idColumnName" value="ID_COLUMN"/>
          <property name="dataColumnName" value="DATA_COLUMN"/>
          <property name="timestampColumnName" value="TIMESTAMP_COLUMN"/>
          <property name="timestampColumnType" value="BIGINT"/>
          <property name="connectionFactoryClass" value="org.infinispan.loaders.jdbc.connectionfactory.PooledConnectionFactory"/>
          <property name="connectionUrl" value="jdbc:h2:file:/var/tmp/h2cachestore;DB_CLOSE_DELAY=-1"/>
          <property name="userName" value="sa"/>
          <property name="driverClass" value="org.h2.Driver"/>
          <property name="idColumnType" value="VARCHAR(255)"/>
          <property name="dataColumnType" value="BINARY"/>
          <property name="dropTableOnExit" value="false"/>
          <property name="createTableOnStart" value="true"/>
        </properties>
      </loader>
    </loaders>
  </default>
</infinispan>

Basically, the current ISPN state transfer implementation results in data inconsistency among nodes in replication mode when each node has its own local cache store.

I found that the applyState code in BaseStateTransferManagerImpl does not remove stale data from the local cache store, which results in inconsistent data when a node joins a cluster.

Here is the code snippet of applyState():

public void applyState(Collection<InternalCacheEntry> state,
                       Address sender, int viewId) throws InterruptedException {
   .....

   for (InternalCacheEntry e : state) {
      InvocationContext ctx = icc.createInvocationContext(false, 1);
      // locking not necessary as during rehashing we block all transactions
      ctx.setFlags(CACHE_MODE_LOCAL, SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP, SKIP_SHARED_CACHE_STORE,
                   SKIP_LOCKING, SKIP_OWNERSHIP_CHECK);
      try {
         PutKeyValueCommand put = cf.buildPutKeyValueCommand(e.getKey(), e.getValue(),
               e.getLifespan(), e.getMaxIdle(), ctx.getFlags());
         interceptorChain.invoke(ctx, put);
      } catch (Exception ee) {
         log.problemApplyingStateForKey(ee.getMessage(), e.getKey());
      }
   }

   ...
}

As we can see, the code basically tries to add all data entries received from the cluster (the other nodes). Hence, it does not know about entries that were deleted from the cluster but still exist in its local cache store. This is exactly my test case (in my configuration each node has its own cache store and the cache is in replication mode).

To fix this, we need to delete any entries from the local cache/cache store which no longer exist in the new state.

I modified the above method by adding the following code before the put loop, and it fixed the problem in my configuration:

// Remove entries which no longer exist in the new state from the local cache/cache store
for (InternalCacheEntry ie : dataContainer.entrySet()) {
   if (!state.contains(ie)) {
      log.debug("Try to delete local store entry that no longer exists in the new state: " + ie.getKey());
      InvocationContext ctx = icc.createInvocationContext(false, 1);
      // locking not necessary as during rehashing we block all transactions
      ctx.setFlags(CACHE_MODE_LOCAL, SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP, SKIP_SHARED_CACHE_STORE,
                   SKIP_LOCKING, SKIP_OWNERSHIP_CHECK);
      try {
         RemoveCommand remove = cf.buildRemoveCommand(ie.getKey(), ie.getValue(), ctx.getFlags());
         interceptorChain.invoke(ctx, remove);
         dataContainer.remove(ie.getKey());
      } catch (Exception ee) {
         log.error("failed to delete local store entry", ee);
      }
   }
}
...

Obviously, the above "fix" assumes a configuration in which dataContainer holds all local entries, i.e. preload=true, no eviction, replication mode.

For the real fix, I think we need to delegate syncState(state) to the cache store implementation, where we can check the configuration and do the right thing. For example, the cache store implementation can calculate the changes between the local data and the new state, and apply those changes there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
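
The "calculate the changes based on local data and new state" idea from ISPN-1586 can be sketched independently of Infinispan. The following is a minimal illustration only, assuming the local store and the incoming cluster state can both be viewed as plain maps; the class `StaleEntrySketch` and the methods `staleKeys` and `syncState` are hypothetical names, not Infinispan APIs.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: reconcile a node's local store with the state received on join.
final class StaleEntrySketch {

    /** Keys that exist locally but are absent from the state received from the cluster. */
    static <K, V> Set<K> staleKeys(Map<K, V> localStore, Map<K, V> clusterState) {
        Set<K> stale = new HashSet<>(localStore.keySet());
        stale.removeAll(clusterState.keySet());
        return stale;
    }

    /** Removes the stale keys first, then applies every entry of the incoming state. */
    static <K, V> void syncState(Map<K, V> localStore, Map<K, V> clusterState) {
        localStore.keySet().removeAll(staleKeys(localStore, clusterState));
        localStore.putAll(clusterState);
    }

    public static void main(String[] args) {
        // node2's persistent store still holds k1 and k2 after the restart
        Map<String, String> node2Store = new HashMap<>();
        node2Store.put("k1", "v1");
        node2Store.put("k2", "v2");

        // node1 deleted k1 while node2 was down, so the transferred state holds only k2
        Map<String, String> stateFromNode1 = new HashMap<>();
        stateFromNode1.put("k2", "v2");

        syncState(node2Store, stateFromNode1);
        System.out.println(node2Store); // prints {k2=v2}
    }
}
```

The sketch compares keys rather than whole entries, so a changed value or lifespan does not cause an entry to be treated as stale. Like the patch in the report, it assumes the full local key set is available, which in the reported setup depends on preload=true and no eviction.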

[JBoss JIRA] (ISPN-1830) L1: On topology changes we should propagate the key requestors information to the new owners

by Dan Berindei (JIRA)

Dan Berindei created ISPN-1830:
----------------------------------

Summary: L1: On topology changes we should propagate the key requestors information to the new owners
Key: ISPN-1830
URL: https://issues.jboss.org/browse/ISPN-1830
Project: Infinispan
Issue Type: Task
Components: Distributed Cache
Affects Versions: 5.1.0.FINAL
Reporter: Dan Berindei
Assignee: Manik Surtani
Fix For: 5.2.0.FINAL

I think we are losing information about where a key needs to be invalidated when a node changes from owner to non-owner (e.g. because another node joined):

* We lose the list of requestors stored on this node. Even if all ClusteredGetCommands reached all the current owners (which we are about to change with ISPN-825), once all the current owners leave the new owners will not know about this node's old requestors.
* We don't add ourselves as a requestor for the key on the new owners when we invalidate the entry and move it to L1.

[JBoss JIRA] (ISPN-1797) Implement MongoDB based cache store

by Andrew Pushkin (JIRA)

Andrew Pushkin created ISPN-1797:
------------------------------------

Summary: Implement MongoDB based cache store
Key: ISPN-1797
URL: https://issues.jboss.org/browse/ISPN-1797
Project: Infinispan
Issue Type: Feature Request
Components: Loaders and Stores
Affects Versions: 5.1.0.FINAL
Reporter: Andrew Pushkin
Assignee: Manik Surtani

I have an implementation to submit if you are interested.

[JBoss JIRA] Created: (ISPN-517) CacheStore impl based on LibIO, as per HornetQ's disk spooling

by Manik Surtani (JIRA)

CacheStore impl based on LibIO, as per HornetQ's disk spooling
--------------------------------------------------------------

Key: ISPN-517
URL: https://jira.jboss.org/browse/ISPN-517
Project: Infinispan
Issue Type: Feature Request
Components: Loaders and Stores
Reporter: Manik Surtani
Fix For: 5.1.0.BETA1, 5.1.0.Final

HornetQ uses Linux LibIO API to create a very fast and efficient disk spooling mechanism. Should look into the possibility of creating a CacheStore around this.

[JBoss JIRA] (ISPN-1954) Wrong ServerWorker thread name in Hot rod server

by Michal Linhard (JIRA)

Michal Linhard created ISPN-1954:
------------------------------------

Summary: Wrong ServerWorker thread name in Hot rod server
Key: ISPN-1954
URL: https://issues.jboss.org/browse/ISPN-1954
Project: Infinispan
Issue Type: Bug
Components: Cache Server
Reporter: Michal Linhard
Assignee: Galder Zamarreño
Priority: Minor

When debugging/profiling EDG, we're seeing HotRod server calls being executed by MemcachedServerWorker threads.

[JBoss JIRA] Created: (ISPN-143) Security needs to be considered

by Manik Surtani (JIRA)

Security needs to be considered
-------------------------------

Key: ISPN-143
URL: https://jira.jboss.org/jira/browse/ISPN-143
Project: Infinispan
Issue Type: Feature Request
Reporter: Manik Surtani
Assignee: Manik Surtani
Fix For: 4.1.0.CR1, 4.1.0.GA

We need to consider security for Infinispan. Storing state on the cloud only makes sense if the data can be secured. Both encryption and authentication need to be considered, and all levels (wire protocol, cache store, public API, server module, etc) need to be taken in to consideration.

[JBoss JIRA] (ISPN-1731) Threads waiting for key locks should not block state transfer

by Dan Berindei (JIRA)

Dan Berindei created ISPN-1731:
----------------------------------

Summary: Threads waiting for key locks should not block state transfer
Key: ISPN-1731
URL: https://issues.jboss.org/browse/ISPN-1731
Project: Infinispan
Issue Type: Task
Components: State transfer
Affects Versions: 5.1.0.CR3, 5.0.1.FINAL
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 5.2.0.FINAL

A write/lock command holds the state transfer lock for its entire duration, including while waiting to acquire key locks. Because of this, we can get a deadlock scenario:

1. Tx1 waits for key k1 while holding the state transfer lock
2. State transfer waits for Tx1 while blocking new write commands
3. Tx2 waits for state transfer to end while holding the k1 lock

The only way out of this scenario at the moment is for Tx1 to time out and fail to acquire the lock.

We should make it possible to release the state transfer lock temporarily and return to waiting for the key lock after state transfer has ended.

ISPN-1424 might make this issue obsolete.
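
The "release the state transfer lock temporarily" proposal in ISPN-1731 can be illustrated with standard java.util.concurrent locks. This is only a sketch under stated assumptions and not Infinispan's implementation: the state transfer lock is modelled as a `ReentrantReadWriteLock` (writes take the shared side, state transfer the exclusive side), and the class `StateTransferLockSketch` with its `write` and `stateTransfer` methods is hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: a writer steps aside for state transfer while it waits for a key lock.
final class StateTransferLockSketch {

    // Write commands hold the shared (read) side; state transfer takes the exclusive (write) side.
    private final ReentrantReadWriteLock stateTransferLock = new ReentrantReadWriteLock();

    /**
     * Runs the action while holding keyLock. Returns false if the key lock could not be
     * acquired within timeoutMillis or if the calling thread was interrupted.
     */
    boolean write(ReentrantLock keyLock, long timeoutMillis, Runnable action) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        stateTransferLock.readLock().lock();
        try {
            while (true) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false;
                }
                // Wait for the key lock in short slices instead of one long blocking wait.
                long slice = Math.min(remaining, TimeUnit.MILLISECONDS.toNanos(10));
                if (keyLock.tryLock(slice, TimeUnit.NANOSECONDS)) {
                    try {
                        action.run();
                        return true;
                    } finally {
                        keyLock.unlock();
                    }
                }
                if (stateTransferLock.hasQueuedThreads()) {
                    // Something (e.g. state transfer) is queued behind us: release the shared
                    // lock, then re-acquire it, which waits until the queued writer is done.
                    stateTransferLock.readLock().unlock();
                    stateTransferLock.readLock().lock();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            stateTransferLock.readLock().unlock();
        }
    }

    /** Runs the transfer while new and waiting write commands are held back. */
    void stateTransfer(Runnable transfer) {
        stateTransferLock.writeLock().lock();
        try {
            transfer.run();
        } finally {
            stateTransferLock.writeLock().unlock();
        }
    }
}
```

In the deadlock scenario from the report, Tx1 would give up the shared lock in step 1 as soon as state transfer queues up, so state transfer can finish, Tx2 can proceed and release k1, and Tx1 resumes waiting for k1 afterwards.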

[JBoss JIRA] Created: (ISPN-1271) Consistent Hash externalizer loses grouping configuration

by Dan Berindei (JIRA)

Consistent Hash externalizer loses grouping configuration
---------------------------------------------------------

Key: ISPN-1271
URL: https://issues.jboss.org/browse/ISPN-1271
Project: Infinispan
Issue Type: Bug
Components: Marshalling
Affects Versions: 5.0.0.CR8
Reporter: Dan Berindei
Assignee: Pete Muir

[JBoss JIRA] (ISPN-1704) IllegalStateException in surviving nodes during node crash in cluster

by Michal Linhard (Created) (JIRA)

IllegalStateException in surviving nodes during node crash in cluster
---------------------------------------------------------------------

Key: ISPN-1704
URL: https://issues.jboss.org/browse/ISPN-1704
Project: Infinispan
Issue Type: Bug
Affects Versions: 5.1.0.CR3
Reporter: Michal Linhard
Assignee: Manik Surtani

This bug appeared in EDG build 96, which contains Infinispan 5.1.0.CR3:
http://hudson.qa.jboss.com/hudson/view/EDG6/view/EDG-QE/job/edg-60-build-...

Test scenario:
1. start 4 nodes (distributed cache)
2. wait 2 min
3. kill node2
4. wait 2 min
5. start node2
6. wait 2 min and end the test

server side logs:
http://hudson.qa.jboss.com/hudson/view/EDG6/view/EDG-QE/job/edg-60-experi...

client side logs:
http://hudson.qa.jboss.com/hudson/view/EDG6/view/EDG-QE/job/edg-60-experi...
http://hudson.qa.jboss.com/hudson/view/EDG6/view/EDG-QE/job/edg-60-experi...

After node2 crashed, there were no more successful requests; most of the requests ended with this error:
{code}
ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (HotRodServerWorker-1-43) ISPN000136: Execution error: java.lang.IllegalStateException: Trying to release state transfer shared lock without acquiring it first
{code}

Before the error showed up on the client side, the requests had been blocked for around 1.5 min.

[JBoss JIRA] Created: (ISPN-1263) Allow a type safe selection of the cache used by the JCache interceptors

by Kevin Pollet (JIRA)

Allow a type safe selection of the cache used by the JCache interceptors
------------------------------------------------------------------------

Key: ISPN-1263
URL: https://issues.jboss.org/browse/ISPN-1263
Project: Infinispan
Issue Type: Feature Request
Components: CDI integration
Reporter: Kevin Pollet
Assignee: Kevin Pollet
Priority: Minor

Currently the name of the cache which will be used by the JCache interceptors has to be specified in JCache annotations (e.g. {{@CacheResult(cacheName="greeting-cache")}}). This can be error prone (typos, refactoring, ...).

The CDI integration module provides the possibility to associate a qualifier with a cache. This qualifier could be re-used to provide a type safe selection of the cache used by a JCache interceptor.

Something like:
{code}
public class GreetingService {

   @CacheResult
   @GreetingCache
   public String greet(String name) {
      return "Hello" + name;
   }
}
{code}

Here we need to be in sync with the JCache specification. In my understanding of the spec, the cache name is required for {{@CacheRemoveEntry}} and {{@CacheRemoveAll}} (but currently in the API there is a default value). For {{@CacheResult}}, if no cache name is defined a default one is used.
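
The qualifier-based selection requested in ISPN-1263 can be illustrated without a CDI container. The sketch below is only an approximation under loud assumptions: in real CDI the qualifier would be resolved by the container, whereas here a hypothetical meta-annotation `@BoundToCache` carries the cache name and plain reflection does the lookup. `QualifierCacheSelectionSketch`, `BoundToCache`, `resolveCacheName` and `cacheNameFor` are illustrative names, not part of Infinispan or JCache.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical sketch: pick a cache name from a qualifier annotation instead of a string attribute.
final class QualifierCacheSelectionSketch {

    /** Hypothetical meta-annotation: binds a qualifier annotation to a cache name. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.ANNOTATION_TYPE)
    @interface BoundToCache {
        String value();
    }

    /** The qualifier; the cache name is written once here, not on every cached method. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @BoundToCache("greeting-cache")
    @interface GreetingCache {
    }

    static class GreetingService {
        @GreetingCache
        public String greet(String name) {
            return "Hello " + name;
        }
    }

    /** Returns the cache name bound to the method's qualifier, or defaultName if there is none. */
    static String resolveCacheName(Method method, String defaultName) {
        for (Annotation annotation : method.getAnnotations()) {
            BoundToCache binding = annotation.annotationType().getAnnotation(BoundToCache.class);
            if (binding != null) {
                return binding.value();
            }
        }
        return defaultName;
    }

    /** Convenience lookup by method name over the public methods of a type. */
    static String cacheNameFor(Class<?> type, String methodName, String defaultName) {
        for (Method method : type.getMethods()) {
            if (method.getName().equals(methodName)) {
                return resolveCacheName(method, defaultName);
            }
        }
        throw new IllegalArgumentException("No public method named " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(cacheNameFor(GreetingService.class, "greet", "default-cache"));
    }
}
```

Because the cache is referenced through the `GreetingCache` type, a typo or a rename is caught by the compiler, which is the benefit the request describes over a string-valued `cacheName`.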
