[JBoss JIRA] (ISPN-1586) inconsistent cache data in replication cluster with local (not shared) cache store
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-1586?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-1586:
--------------------------------
Fix Version/s: 5.2.1
5.3.0.Final
(was: 6.0.0.Final)
> inconsistent cache data in replication cluster with local (not shared) cache store
> ----------------------------------------------------------------------------------
>
> Key: ISPN-1586
> URL: https://issues.jboss.org/browse/ISPN-1586
> Project: Infinispan
> Issue Type: Bug
> Components: Core API
> Affects Versions: 5.0.0.FINAL, 5.1.0.CR1
> Environment: ISPN 5.0.0.Final and ISPN 5.1 snapshot
> Java 1.7
> Linux CentOS
> Reporter: dex chen
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.1, 5.3.0.Final
>
>
> I reran my test (an embedded ISPN cluster) with ISPN 5.0.0.Final and the 5.1 snapshot code.
> It is configured for "replication", using a local cache store with preload=true and purgeOnStartup=false (see the full config below).
> I get inconsistent data among the nodes in the following scenario:
> 1) start 2 node cluster
> 2) after the cluster is formed, add some data to the cache
> k1-->v1
> k2-->v2
> I will see the data replication working perfectly at this point.
> 3) bring node 2 down
> 4) delete entry k1-->v1 through node1
> Note: at this point, the local (persistent) cache store on node2 still has 2 entries.
> 5) start node2 and wait for it to join the cluster
> 6) after state merging, you will see that node1 has 1 entry and node2 has 2 entries.
> I am expecting that the data should be consistent across the cluster.
> Here is the infinispan config:
> {code:xml}
> <infinispan
>       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>       xsi:schemaLocation="urn:infinispan:config:5.0 http://www.infinispan.org/schemas/infinispan-config-5.0.xsd"
>       xmlns="urn:infinispan:config:5.0">
>    <global>
>       <transport clusterName="demoCluster"
>                  machineId="node1"
>                  rackId="r1" nodeName="dexlaptop">
>          <properties>
>             <property name="configurationFile" value="./jgroups-tcp.xml" />
>          </properties>
>       </transport>
>       <globalJmxStatistics enabled="true"/>
>    </global>
>    <default>
>       <locking
>             isolationLevel="READ_COMMITTED"
>             lockAcquisitionTimeout="20000"
>             writeSkewCheck="false"
>             concurrencyLevel="5000"
>             useLockStriping="false"
>             />
>       <jmxStatistics enabled="true"/>
>       <clustering mode="replication">
>          <stateRetrieval
>                timeout="240000"
>                fetchInMemoryState="true"
>                alwaysProvideInMemoryState="false"
>                />
>          <!--
>             Network calls are synchronous.
>          -->
>          <sync replTimeout="20000"/>
>       </clustering>
>       <loaders
>             passivation="false"
>             shared="false"
>             preload="true">
>          <loader
>                class="org.infinispan.loaders.jdbc.stringbased.JdbcStringBasedCacheStore"
>                fetchPersistentState="true"
>                purgeOnStartup="false">
>             <!-- set to true for not first node in the cluster in testing/demo -->
>             <properties>
>                <property name="stringsTableNamePrefix" value="ISPN_STRING_TABLE"/>
>                <property name="idColumnName" value="ID_COLUMN"/>
>                <property name="dataColumnName" value="DATA_COLUMN"/>
>                <property name="timestampColumnName" value="TIMESTAMP_COLUMN"/>
>                <property name="timestampColumnType" value="BIGINT"/>
>                <property name="connectionFactoryClass" value="org.infinispan.loaders.jdbc.connectionfactory.PooledConnectionFactory"/>
>                <property name="connectionUrl" value="jdbc:h2:file:/var/tmp/h2cachestore;DB_CLOSE_DELAY=-1"/>
>                <property name="userName" value="sa"/>
>                <property name="driverClass" value="org.h2.Driver"/>
>                <property name="idColumnType" value="VARCHAR(255)"/>
>                <property name="dataColumnType" value="BINARY"/>
>                <property name="dropTableOnExit" value="false"/>
>                <property name="createTableOnStart" value="true"/>
>             </properties>
>             <!--
>                <async enabled="false" />
>             -->
>          </loader>
>       </loaders>
>    </default>
> </infinispan>
> {code}
> Basically, the current ISPN state transfer implementation results in inconsistent data among nodes when running in replication mode with a local cache store on each node.
> I found that BaseStateTransferManagerImpl's applyState() does not remove stale data from the local cache store, resulting in inconsistent data when a node joins the cluster.
> Here is a snippet of applyState():
> {code:java}
> public void applyState(Collection<InternalCacheEntry> state,
>                        Address sender, int viewId) throws InterruptedException {
>    .....
>    for (InternalCacheEntry e : state) {
>       InvocationContext ctx = icc.createInvocationContext(false, 1);
>       // locking not necessary as during rehashing we block all transactions
>       ctx.setFlags(CACHE_MODE_LOCAL, SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP, SKIP_SHARED_CACHE_STORE, SKIP_LOCKING,
>                    SKIP_OWNERSHIP_CHECK);
>       try {
>          PutKeyValueCommand put = cf.buildPutKeyValueCommand(e.getKey(), e.getValue(), e.getLifespan(), e.getMaxIdle(), ctx.getFlags());
>          interceptorChain.invoke(ctx, put);
>       } catch (Exception ee) {
>          log.problemApplyingStateForKey(ee.getMessage(), e.getKey());
>       }
>    }
>    ...
> }
> {code}
> As we can see, the code basically tries to add all data entries received from the cluster (the other node). Hence, it does not know that entries still present in its local cache store were deleted from the cluster. This is exactly my test case (my configuration is replication mode with each node having its own cache store).
> To fix this, we need to delete any entries from the local cache/cache store which no longer exist in the new state.
> I modified the above method by adding the following code before the put loop, and it fixed the problem in my configuration:
> {code:java}
> // Remove entries which no longer exist in the new state from the local cache/cache store
> for (InternalCacheEntry ie : dataContainer.entrySet()) {
>    if (!state.contains(ie)) {
>       log.debug("Trying to delete local store entry that no longer exists in the new state: " + ie.getKey());
>       InvocationContext ctx = icc.createInvocationContext(false, 1);
>       // locking not necessary as during rehashing we block all transactions
>       ctx.setFlags(CACHE_MODE_LOCAL, SKIP_CACHE_LOAD, SKIP_REMOTE_LOOKUP, SKIP_SHARED_CACHE_STORE, SKIP_LOCKING,
>                    SKIP_OWNERSHIP_CHECK);
>       try {
>          RemoveCommand remove = cf.buildRemoveCommand(ie.getKey(), ie.getValue(), ctx.getFlags());
>          interceptorChain.invoke(ctx, remove);
>          dataContainer.remove(ie.getKey());
>       } catch (Exception ee) {
>          log.error("failed to delete local store entry", ee);
>       }
>    }
> }
> ...
> {code}
> Obviously, the above "fix" relies on the assumption/configuration that the dataContainer holds all local entries, i.e., preload=true, no eviction, replication mode.
> The real fix, I think, is to delegate a syncState(state) operation to the cache store implementation, where we can check the configuration and do the right thing.
> For example, in the cache store implementation we can calculate the delta between the local data and the new state, and apply the changes there.
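> A minimal sketch of what such a store-level sync might look like, assuming syncState() is a new, hypothetical method (it is not part of the existing CacheStore SPI) while loadAllKeys(), remove() and store() are the existing CacheLoader/CacheStore operations:
> {code:java}
> // Hypothetical sketch only - syncState() does not exist in the current CacheStore SPI.
> // It computes the delta between the incoming cluster state and the local store.
> public void syncState(Collection<InternalCacheEntry> state) throws CacheLoaderException {
>    // Index the incoming state by key for O(1) membership checks
>    Set<Object> incomingKeys = new HashSet<Object>(state.size());
>    for (InternalCacheEntry e : state)
>       incomingKeys.add(e.getKey());
>
>    // Drop local entries the cluster no longer knows about
>    for (Object localKey : loadAllKeys(null)) {
>       if (!incomingKeys.contains(localKey))
>          remove(localKey);
>    }
>
>    // Store (or overwrite) everything the cluster sent
>    for (InternalCacheEntry e : state)
>       store(e);
> }
> {code}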
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-2598) NullPointerException in case of passing customized FetchOptions to ClusteredQuery iterator() method
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-2598?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero updated ISPN-2598:
----------------------------------
Fix Version/s: 6.0.0.Beta1
(was: 5.3.0.Final)
> NullPointerException in case of passing customized FetchOptions to ClusteredQuery iterator() method
> ---------------------------------------------------------------------------------------------------
>
> Key: ISPN-2598
> URL: https://issues.jboss.org/browse/ISPN-2598
> Project: Infinispan
> Issue Type: Bug
> Components: Querying
> Reporter: Anna Manukyan
> Assignee: Sanne Grinovero
> Fix For: 6.0.0.Beta1
>
> Attachments: ClusteredQueryTest.java
>
>
> While running tests for the query module, I found the following. I'm not sure whether this is a bug, but the same flow behaves differently for CacheQueryImpl than for ClusteredCacheQueryImpl.
> I'm running the following on an already created ClusteredQuery:
> {code}
> ResultIterator iterator = cacheQuery.iterator(new FetchOptions() {
>    public FetchOptions fetchMode(FetchMode fetchMode) {
>       return null;
>    }
> });
> {code}
> This code throws a NullPointerException, because ClusteredCacheQueryImpl checks the FetchMode in a switch/case statement.
> The same code for CacheQuery throws an IllegalArgumentException, because there the check is done with an if/else statement.
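> For illustration: switching on an enum dereferences the value first, so a null FetchMode throws an NPE before any case is reached, and an explicit null check restores the IllegalArgumentException behavior. This is a sketch only, not the actual ClusteredCacheQueryImpl code, and the accessor name is assumed:
> {code:java}
> // Sketch, not the actual implementation; getFetchMode() is an assumed accessor.
> FetchMode fetchMode = fetchOptions.getFetchMode();
> if (fetchMode == null) {
>    // Without this guard, "switch (fetchMode)" throws NullPointerException,
>    // because the compiled switch calls fetchMode.ordinal() before matching any case.
>    throw new IllegalArgumentException("fetchMode cannot be null");
> }
> switch (fetchMode) {
>    case EAGER:
>       // ... build an eager iterator
>       break;
>    case LAZY:
>       // ... build a lazy iterator
>       break;
> }
> {code}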
> Please find the test attached.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-825) Consider staggering remote get requests when using DIST
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-825?page=com.atlassian.jira.plugin.s... ]
Mircea Markus commented on ISPN-825:
------------------------------------
This should also halve the number of OOB threads, see ISPN-2710
> Consider staggering remote get requests when using DIST
> -------------------------------------------------------
>
> Key: ISPN-825
> URL: https://issues.jboss.org/browse/ISPN-825
> Project: Infinispan
> Issue Type: Feature Request
> Components: RPC
> Affects Versions: 4.1.0.Final
> Reporter: Manik Surtani
> Assignee: Mircea Markus
> Labels: optimization, performance
> Fix For: 5.3.0.Final
>
>
> In DIST mode, when a request is made on a key that is not mapped locally, a remote get is sent to all data owners of that key and the first response is used. This can add unnecessary load on the network as all nodes still eventually respond, and if values are large this can cause a lot of unnecessary network traffic.
> The purpose of broadcasting to all data owners is so that (1) if one is down, another could still respond (2) if one is overloaded, others may respond faster.
> A solution could be based on either (or both) of the following, as sketched below:
> * Provide a configurable stagger timeout, e.g. 100ms: RPC to a (random) Owner1; wait for timeout t; if no response, RPC to Owner2; and so on.
> * Always broadcast to a (configurable) subset of owners, e.g. always 2 even if numOwners is 5.
> Needs careful thought and design.
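> A rough sketch of the first (staggered) variant, for illustration only - the executor field, the invokeClusteredGet() helper and replTimeoutMillis are assumptions, not Infinispan's actual RPC API:
> {code:java}
> // Illustrative sketch of staggered remote gets; hypothetical helpers, not real Infinispan API.
> Object staggeredRemoteGet(final Object key, List<Address> owners, long staggerMillis)
>       throws InterruptedException, ExecutionException, TimeoutException {
>    List<Address> order = new ArrayList<Address>(owners);
>    Collections.shuffle(order);                        // contact owners in random order
>    CompletionService<Object> responses = new ExecutorCompletionService<Object>(executor);
>    for (final Address target : order) {
>       responses.submit(new Callable<Object>() {       // fire RPC to the next owner
>          public Object call() throws Exception {
>             return invokeClusteredGet(target, key);   // hypothetical single-target RPC
>          }
>       });
>       // Wait up to the stagger timeout for any response before involving another owner
>       Future<Object> first = responses.poll(staggerMillis, TimeUnit.MILLISECONDS);
>       if (first != null)
>          return first.get();                          // first response wins
>    }
>    // All owners contacted; wait out the remainder of the replication timeout
>    Future<Object> last = responses.poll(replTimeoutMillis, TimeUnit.MILLISECONDS);
>    if (last == null)
>       throw new TimeoutException("No response from any of " + order.size() + " owners");
>    return last.get();
> }
> {code}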
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-2661) Not a mapped entity (don't forget to add @Indexed) is thrown while using Clustered Query for DIST cache with InfinispanIndexManager enabled
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-2661?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero updated ISPN-2661:
----------------------------------
Fix Version/s: 6.0.0.Beta1
(was: 5.3.0.Final)
> Not a mapped entity (don't forget to add @Indexed) is thrown while using Clustered Query for DIST cache with InfinispanIndexManager enabled
> -------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-2661
> URL: https://issues.jboss.org/browse/ISPN-2661
> Project: Infinispan
> Issue Type: Bug
> Components: Querying
> Reporter: Anna Manukyan
> Assignee: Sanne Grinovero
> Fix For: 6.0.0.Beta1
>
> Attachments: ClusteredQueryMassIndexingTest.java, DistributedMassIndexingTest.java
>
>
> Running a ClusteredQuery on an Infinispan cache in distributed mode, with org.infinispan.query.indexmanager.InfinispanIndexManager enabled as the index manager and infinispan as the directory_provider, throws the following exception:
> {code}
> org.infinispan.CacheException: org.hibernate.search.SearchException: Not a mapped entity (don't forget to add @Indexed): class org.infinispan.query.queries.faceting.Car
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:189)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:206)
> at org.infinispan.query.clustered.ClusteredQueryInvoker.broadcast(ClusteredQueryInvoker.java:113)
> at org.infinispan.query.clustered.ClusteredCacheQueryImpl.getResultSize(ClusteredCacheQueryImpl.java:96)
> at org.infinispan.query.distributed.ClusteredQueryMassIndexingTest.verifyFindsCar(ClusteredQueryMassIndexingTest.java:26)
> at org.infinispan.query.distributed.DistributedMassIndexingTest.verifyFindsCar(DistributedMassIndexingTest.java:105)
> at org.infinispan.query.distributed.DistributedMassIndexingTest.testReindexing(DistributedMassIndexingTest.java:68)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: org.hibernate.search.SearchException: Not a mapped entity (don't forget to add @Indexed): class org.infinispan.query.queries.faceting.Car
> at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:567)
> at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:511)
> at org.hibernate.search.query.engine.impl.HSQueryImpl.queryDocumentExtractor(HSQueryImpl.java:304)
> at org.infinispan.query.clustered.commandworkers.CQGetResultSize.perform(CQGetResultSize.java:40)
> at org.infinispan.query.clustered.ClusteredQueryCommand.perform(ClusteredQueryCommand.java:132)
> at org.infinispan.query.clustered.ClusteredQueryCommand.perform(ClusteredQueryCommand.java:127)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:101)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:122)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:86)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:245)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:218)
> at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483)
> at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390)
> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248)
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598)
> at org.jgroups.JChannel.up(JChannel.java:703)
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
> at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
> at org.jgroups.protocols.FC.up(FC.java:479)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244)
> at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:432)
> at org.jgroups.protocols.pbcast.NAKACK2.handleMessage(NAKACK2.java:721)
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:574)
> at org.jgroups.protocols.Discovery.up(Discovery.java:359)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1287)
> at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1850)
> at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1823)
> ... 3 more
> {code}
> You can find the attached test, ClusteredQueryMassIndexingTest.java, which extends DistributedMassIndexingTest (already in the testsuite). I've marked the verifyFindsCar() method there as protected so I could override it. The cache configuration for this issue is:
> https://github.com/andyuk1986/infinispan/blob/master/query/src/test/resou...
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-1568) Clustered Query fail when hibernate search not fully initialized
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-1568?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero updated ISPN-1568:
----------------------------------
Fix Version/s: 6.0.0.Beta1
(was: 5.3.0.Final)
> Clustered Query fail when hibernate search not fully initialized
> ----------------------------------------------------------------
>
> Key: ISPN-1568
> URL: https://issues.jboss.org/browse/ISPN-1568
> Project: Infinispan
> Issue Type: Bug
> Components: Querying, RPC
> Affects Versions: 5.1.0.BETA5
> Reporter: Mathieu Lachance
> Assignee: Sanne Grinovero
> Fix For: 6.0.0.Beta1
>
>
> Hi,
> I'm running into this issue when doing a clustered query in distribution mode:
> org.infinispan.CacheException: org.hibernate.search.SearchException: Not a mapped entity (don't forget to add @Indexed): class com.XXX.Client
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:166)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:181)
> at org.infinispan.query.clustered.ClusteredQueryInvoker.broadcast(ClusteredQueryInvoker.java:113)
> at org.infinispan.query.clustered.ClusteredCacheQueryImpl.broadcastQuery(ClusteredCacheQueryImpl.java:115)
> at org.infinispan.query.clustered.ClusteredCacheQueryImpl.iterator(ClusteredCacheQueryImpl.java:90)
> at org.infinispan.query.impl.CacheQueryImpl.iterator(CacheQueryImpl.java:129)
> at org.infinispan.query.clustered.ClusteredCacheQueryImpl.list(ClusteredCacheQueryImpl.java:133)
> at com.XXX.DistributedCache.cacheQueryList(DistributedCache.java:313)
> at com.XXX.DistributedCache.cacheQueryList(DistributedCache.java:274)
> at com.XXX.ClientCache.getClientsByServerId(ClientCache.java:127)
> at com.XXX.ClientManager.getClientsByServerId(ClientManager.java:157)
> at com.XXX$PingClient.run(PlayerBll.java:890)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
> Caused by: org.hibernate.search.SearchException: Not a mapped entity (don't forget to add @Indexed): class com.XXX.Client
> at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:549)
> at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:493)
> at org.hibernate.search.query.engine.impl.HSQueryImpl.queryDocumentExtractor(HSQueryImpl.java:292)
> at org.infinispan.query.clustered.commandworkers.CQCreateEagerQuery.perform(CQCreateEagerQuery.java:44)
> at org.infinispan.query.clustered.ClusteredQueryCommand.perform(ClusteredQueryCommand.java:135)
> at org.infinispan.query.clustered.ClusteredQueryCommand.perform(ClusteredQueryCommand.java:129)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:170)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:179)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithRetry(InboundInvocationHandlerImpl.java:208)
> at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:156)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:162)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:141)
> at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:447)
> at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:354)
> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:230)
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:556)
> at org.jgroups.JChannel.up(JChannel.java:716)
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1026)
> at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:881)
> at org.jgroups.protocols.pbcast.StreamingStateTransfer.up(StreamingStateTransfer.java:262)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244)
> at org.jgroups.protocols.UNICAST.up(UNICAST.java:332)
> at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:700)
> at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:561)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:140)
> at org.jgroups.protocols.FD.up(FD.java:273)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:284)
> at org.jgroups.protocols.MERGE2.up(MERGE2.java:205)
> at org.jgroups.protocols.Discovery.up(Discovery.java:354)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1174)
> at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1709)
> at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1691)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Using the following cache configuration:
> {code:xml}
> <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>             xsi:schemaLocation="urn:infinispan:config:5.1 http://www.infinispan.org/schemas/infinispan-config-5.1.xsd"
>             xmlns="urn:infinispan:config:5.1">
>    <global>
>       <transport clusterName="XXX-cluster" machineId="XXX" siteId="XXX" rackId="XXX" distributedSyncTimeout="15000">
>          <properties>
>             <property name="configurationFile" value="jgroups-jdbc-ping.xml" />
>          </properties>
>       </transport>
>    </global>
>    <default>
>       <transaction
>             cacheStopTimeout="30000"
>             transactionManagerLookupClass="org.infinispan.transaction.lookup.DummyTransactionManagerLookup"
>             lockingMode="PESSIMISTIC"
>             useSynchronization="true"
>             transactionMode="TRANSACTIONAL"
>             syncCommitPhase="true"
>             syncRollbackPhase="false">
>          <recovery enabled="false" />
>       </transaction>
>       <clustering mode="local" />
>       <indexing enabled="true" indexLocalOnly="true">
>          <properties>
>             <property name="hibernate.search.default.directory_provider" value="ram" />
>          </properties>
>       </indexing>
>    </default>
>    <namedCache name="XXX-Client">
>       <transaction
>             cacheStopTimeout="30000"
>             transactionManagerLookupClass="org.infinispan.transaction.lookup.DummyTransactionManagerLookup"
>             lockingMode="PESSIMISTIC"
>             useSynchronization="true"
>             transactionMode="TRANSACTIONAL"
>             syncCommitPhase="true"
>             syncRollbackPhase="false">
>          <recovery enabled="false" />
>       </transaction>
>       <invocationBatching enabled="false" />
>       <loaders passivation="false" />
>       <clustering mode="distribution">
>          <sync replTimeout="15000" />
>          <stateRetrieval
>                timeout="240000"
>                retryWaitTimeIncreaseFactor="2"
>                numRetries="5"
>                maxNonProgressingLogWrites="100"
>                fetchInMemoryState="false"
>                logFlushTimeout="60000"
>                alwaysProvideInMemoryState="false"
>                />
>       </clustering>
>       <storeAsBinary enabled="false" storeValuesAsBinary="true" storeKeysAsBinary="true" />
>       <deadlockDetection enabled="true" spinDuration="100" />
>       <eviction strategy="NONE" threadPolicy="PIGGYBACK" maxEntries="-1" />
>       <jmxStatistics enabled="true" />
>       <locking writeSkewCheck="false" lockAcquisitionTimeout="10000" isolationLevel="READ_COMMITTED" useLockStriping="false" concurrencyLevel="32" />
>       <expiration wakeUpInterval="60000" lifespan="-1" maxIdle="3000000" />
>    </namedCache>
> </infinispan>
> {code}
> and the following JGroups configuration:
> {code:xml}
> <config xmlns="urn:org:jgroups"
>         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>         xsi:schemaLocation="urn:org:jgroups file:schema/JGroups-3.0.xsd">
>    <TCP
>          bind_port="7800"
>          loopback="true"
>          port_range="30"
>          recv_buf_size="20000000"
>          send_buf_size="640000"
>          discard_incompatible_packets="true"
>          max_bundle_size="64000"
>          max_bundle_timeout="30"
>          enable_bundling="true"
>          use_send_queues="true"
>          sock_conn_timeout="300"
>          enable_diagnostics="false"
>          thread_pool.enabled="true"
>          thread_pool.min_threads="2"
>          thread_pool.max_threads="30"
>          thread_pool.keep_alive_time="5000"
>          thread_pool.queue_enabled="false"
>          thread_pool.queue_max_size="100"
>          thread_pool.rejection_policy="Discard"
>          oob_thread_pool.enabled="true"
>          oob_thread_pool.min_threads="2"
>          oob_thread_pool.max_threads="30"
>          oob_thread_pool.keep_alive_time="5000"
>          oob_thread_pool.queue_enabled="false"
>          oob_thread_pool.queue_max_size="100"
>          oob_thread_pool.rejection_policy="Discard"
>          />
>    <JDBC_PING
>          connection_url="jdbc:jtds:sqlserver://XXX;databaseName=XXX"
>          connection_username="XXX"
>          connection_password="XXX"
>          connection_driver="net.sourceforge.jtds.jdbcx.JtdsDataSource"
>          initialize_sql=""
>          />
>    <MERGE2 max_interval="30000"
>            min_interval="10000"/>
>    <FD_SOCK/>
>    <FD timeout="3000" max_tries="3"/>
>    <VERIFY_SUSPECT timeout="1500"/>
>    <pbcast.NAKACK
>          use_mcast_xmit="false"
>          retransmit_timeout="300,600,1200,2400,4800"
>          discard_delivered_msgs="false"/>
>    <UNICAST timeout="300,600,1200"/>
>    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
>                   max_bytes="400000"/>
>    <pbcast.STATE />
>    <pbcast.GMS print_local_addr="false" join_timeout="7000" view_bundling="true"/>
>    <UFC max_credits="2000000" min_threshold="0.10"/>
>    <MFC max_credits="2000000" min_threshold="0.10"/>
>    <FRAG2 frag_size="60000"/>
> </config>
> {code}
> Though my entity is properly annotated.
> Here are the steps to reproduce:
> 1. boot node A completely.
> 2. boot node B, make all caches start (DefaultCacheManager::startCaches(...)), then breakpoint just after.
> 3. on node A, do a clustered query.
> 4. node A fails because node B has not been fully initialized.
> Here's how I do my query:
> {code:java}
> private CacheQuery getClusteredNonClusteredQuery(Query query)
> {
>    CacheQuery cacheQuery;
>    if (useClusteredQuery)
>    {
>       cacheQuery = searchManager.getClusteredQuery(query, cacheValueClass);
>    }
>    else
>    {
>       cacheQuery = searchManager.getQuery(query, cacheValueClass);
>    }
>    return cacheQuery;
> }
> {code}
> I've also tried without supplying any "cacheValueClass", without success.
> One ugly "workaround" I've found is, as early as possible in the application, to force the local insertion and removal of a dummy key/value pair, which forces initialization of the search manager:
> {code:java}
> cache.getAdvancedCache().withFlags(Flag.CACHE_MODE_LOCAL).put("XXX", new Client("XXX"));
> cache.getAdvancedCache().withFlags(Flag.CACHE_MODE_LOCAL).remove("XXX");
> {code}
> Though this technique still won't guarantee that no clustered query occurs before then.
> I think this might also be related to ISPN-627 (Provision to get Cache from CacheManager).
> Any idea or workaround? Do you think just adding a try/catch and returning an empty list could "fix" the problem?
> EDIT:
> I've added a try/catch inside org.infinispan.query.clustered.commandworkers.CQCreateEagerQuery::perform():
> {code:java}
> @Override
> public QueryResponse perform() {
>    query.afterDeserialise((SearchFactoryImplementor) getSearchFactory());
>    try
>    {
>       DocumentExtractor extractor = query.queryDocumentExtractor();
>       int resultSize = query.queryResultSize();
>       ISPNEagerTopDocs eagerTopDocs = collectKeys(extractor);
>       QueryResponse queryResponse = new QueryResponse(eagerTopDocs,
>             getQueryBox().getMyId(), resultSize);
>       queryResponse.setAddress(cache.getAdvancedCache().getRpcManager()
>             .getAddress());
>       return queryResponse;
>    }
>    catch (SearchException e)
>    {
>       QueryResponse queryResponse = new QueryResponse(new ISPNEagerTopDocs(), getQueryBox().getMyId(), 0);
>       queryResponse.setAddress(cache.getAdvancedCache().getRpcManager().getAddress());
>       return queryResponse;
>    }
> }
> {code}
> I also added a default constructor to org.infinispan.query.clustered.ISPNEagerTopDocs:
> {code:java}
> public ISPNEagerTopDocs()
> {
>    super(0, new ScoreDoc[0], 0);
>    this.keys = new Object[0];
> }
> {code}
> Swallowing the exception appears to be a "good" workaround.
> Thanks a lot,
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-2240) Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2240?page=com.atlassian.jira.plugin.... ]
Mircea Markus commented on ISPN-2240:
-------------------------------------
[~snazy] I've commented on ISPN-2710. Does this still reproduce, or have you been unable to test it because of ISPN-2710?
> Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-2240
> URL: https://issues.jboss.org/browse/ISPN-2240
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.1.6.FINAL, 5.1.x
> Reporter: Robert Stupp
> Assignee: Mircea Markus
> Priority: Critical
> Fix For: 5.3.0.Final
>
> Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
>
>
> Hi,
> I've encountered a lot of TimeoutExceptions just running a load test against an Infinispan cluster.
> I tracked down the reason and found out that the code in org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock() causes these superfluous TimeoutExceptions.
> A small test case is attached (it just prints out timeouts, too-late timeouts and "paints" a lot of dots to the console - more dots/second on the console means better throughput ;-)
> In a short test I extended the class ReentrantPerEntryLockContainer and changed the implementation of releaseLock() as follows:
> {noformat}
> public void releaseLock(Object lockOwner, Object key) {
>    ReentrantLock l = locks.get(key);
>    if (l != null) {
>       if (!l.isHeldByCurrentThread())
>          throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
>       while (l.isHeldByCurrentThread())
>          unlock(l, lockOwner);
>       if (!l.hasQueuedThreads())
>          locks.remove(key);
>    }
>    else
>       throw new IllegalStateException("No lock for [" + key + ']');
> }
> {noformat}
> The main improvement is that locks are not removed from the concurrent map as long as other threads are waiting on that lock.
> If the lock is removed from the map while other threads are waiting for it, they may run into timeouts, resulting in TimeoutExceptions for the client.
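> To illustrate, here is a simplified, hypothetical model of a per-entry lock container (not Infinispan's actual code; locks is an assumed ConcurrentMap field): when the published lock disappears mid-wait, the waiter must retry on a fresh lock and burns its remaining timeout.
> {noformat}
> // Hypothetical simplified model - not Infinispan's actual implementation.
> ReentrantLock acquireLock(Object key, long timeoutMillis)
>       throws InterruptedException, TimeoutException {
>    long deadline = System.currentTimeMillis() + timeoutMillis;
>    while (System.currentTimeMillis() < deadline) {
>       ReentrantLock lock = locks.get(key);
>       if (lock == null) {
>          ReentrantLock fresh = new ReentrantLock();
>          lock = locks.putIfAbsent(key, fresh);
>          if (lock == null) lock = fresh;
>       }
>       if (lock.tryLock(deadline - System.currentTimeMillis(), TimeUnit.MILLISECONDS)) {
>          if (locks.get(key) == lock) return lock;   // still the published lock: success
>          lock.unlock();   // owner removed this lock while we were queued on it;
>                           // retry on a fresh lock, burning more of the timeout
>       }
>    }
>    throw new TimeoutException("could not lock " + key);   // the superfluous timeout
> }
> {noformat}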
> The above method "paints more dots per second", i.e. it gives better throughput for concurrent accesses to the same key.
> The re-implemented method should also fix some replication timeout exceptions.
> Please, please add this to 5.1.7, if possible.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira