[JBoss JIRA] (ISPN-1830) L1: On topology changes we should propagate the key requestors information to the new owners
by Dan Berindei (JIRA)
Dan Berindei created ISPN-1830:
----------------------------------
Summary: L1: On topology changes we should propagate the key requestors information to the new owners
Key: ISPN-1830
URL: https://issues.jboss.org/browse/ISPN-1830
Project: Infinispan
Issue Type: Task
Components: Distributed Cache
Affects Versions: 5.1.0.FINAL
Reporter: Dan Berindei
Assignee: Manik Surtani
Fix For: 5.2.0.FINAL
I think we are losing information about where a key needs to be invalidated when a node changes from owner to non-owner (e.g. because another node joined); a sketch of how this information could travel with the state transfer follows the list:
* We lose the list of requestors stored on this node. Even if all ClusteredGetCommands reached all the current owners (which we are about to change with ISPN-825), once all the current owners leave, the new owners will not know about this node's old requestors.
* We don't add ourselves as a requestor for the key on the new owners when we invalidate the entry and move it to L1.
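Below is a minimal, hypothetical sketch of the idea: a per-key requestor tracker whose sets are extracted when keys are handed off and merged into the new owner's tracker. None of these class or method names are Infinispan APIs; they only illustrate shipping the requestor information alongside the transferred entries.
{noformat}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker of which nodes hold a key in L1 (not Infinispan API). */
final class L1RequestorTracker {

    private final Map<Object, Set<String>> requestors = new ConcurrentHashMap<>();

    /** Called when a remote node fetches a key into its L1 cache. */
    void addRequestor(Object key, String requestingNode) {
        requestors.computeIfAbsent(key, k -> ConcurrentHashMap.newKeySet())
                  .add(requestingNode);
    }

    /**
     * On a topology change, the outgoing owner extracts the requestor sets for
     * the keys it is handing off, so they can be shipped alongside the entries
     * instead of being dropped when this node becomes a non-owner.
     */
    Map<Object, Set<String>> extractForTransfer(Set<Object> keysMovingAway) {
        Map<Object, Set<String>> transferred = new HashMap<>();
        for (Object key : keysMovingAway) {
            Set<String> nodes = requestors.remove(key);
            if (nodes != null) {
                transferred.put(key, nodes);
            }
        }
        return transferred;
    }

    /** The new owner merges the shipped requestor sets into its own tracker. */
    void mergeTransferred(Map<Object, Set<String>> transferred) {
        transferred.forEach((key, nodes) ->
                requestors.computeIfAbsent(key, k -> ConcurrentHashMap.newKeySet())
                          .addAll(nodes));
    }
}
{noformat}
The second bullet would be handled at the same merge point: when the old owner demotes an entry to its own L1, it would register itself via addRequestor on the new owner.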
[JBoss JIRA] (ISPN-2439) Deadlock in Map/Reduce tasks
by Dan Berindei (JIRA)
Dan Berindei created ISPN-2439:
----------------------------------
Summary: Deadlock in Map/Reduce tasks
Key: ISPN-2439
URL: https://issues.jboss.org/browse/ISPN-2439
Project: Infinispan
Issue Type: Bug
Components: Distributed Execution and Map/Reduce
Affects Versions: 5.2.0.Beta2
Reporter: Dan Berindei
Assignee: Vladimir Blagojevic
Fix For: 5.2.0.Final
It looks like the Map/Reduce intermediate caches use pessimistic transactions, but the transactions are not guaranteed to acquire their key locks in the same order. So it's possible for two tasks to get into a deadlock, ending with a TimeoutException (a sketch of the usual key-ordering mitigation follows the trace):
{noformat}
16:18:40,649 ERROR (testng-DistributedFourNodesMapReduceTest:) [UnitTestTestNGListener] Test testCombinerDoesNotChangeResult(org.infinispan.distexec.mapreduce.DistributedFourNodesMapReduceTest) failed.
org.infinispan.CacheException: Could not invoke map phase of MapReduce task on remote nodes
at org.infinispan.distexec.mapreduce.MapReduceTask.invokeEverywhere(MapReduceTask.java:562)
at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhase(MapReduceTask.java:374)
at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:315)
at org.infinispan.distexec.mapreduce.BaseWordCountMapReduceTest.testCombinerDoesNotChangeResult(BaseWordCountMapReduceTest.java:188)
...
Caused by: org.infinispan.CacheException: org.infinispan.CacheException: Could not move intermediate keys/values for M/R task 04244b4b-08b1-4fc4-9755-ed02f3f35a3a
at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForDistributedReduction(MapReduceManagerImpl.java:97)
at org.infinispan.commands.read.MapCombineCommand.perform(MapCombineCommand.java:89)
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:95)
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:110)
at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:82)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:244)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:217)
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483)
...
Caused by: org.infinispan.CacheException: Could not move intermediate keys/values for M/R task 04244b4b-08b1-4fc4-9755-ed02f3f35a3a
at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.combine(MapReduceManagerImpl.java:281)
at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForDistributedReduction(MapReduceManagerImpl.java:95)
... 26 more
Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [JBoss] for requestor [GlobalTransaction:<NodeD-56763>:10429:remote]! Lock held by [GlobalTransaction:<NodeB-55590>:10432:remote]
at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:217)
at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLock(LockManagerImpl.java:190)
at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockKeyAndCheckOwnership(AbstractTxLockingInterceptor.java:190)
at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAndRegisterBackupLock(AbstractTxLockingInterceptor.java:125)
at org.infinispan.interceptors.locking.PessimisticLockingInterceptor.visitLockControlCommand(PessimisticLockingInterceptor.java:248)
at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:131)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)
at org.infinispan.commands.AbstractVisitor.visitLockControlCommand(AbstractVisitor.java:177)
at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:131)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:125)
at org.infinispan.interceptors.TxInterceptor.visitLockControlCommand(TxInterceptor.java:174)
at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:131)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
at org.infinispan.statetransfer.StateTransferInterceptor.handleTopologyAffectedCommand(StateTransferInterceptor.java:212)
at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:187)
at org.infinispan.statetransfer.StateTransferInterceptor.visitLockControlCommand(StateTransferInterceptor.java:131)
at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:131)
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:129)
at org.infinispan.interceptors.InvocationContextInterceptor.visitLockControlCommand(InvocationContextInterceptor.java:98)
at org.infinispan.commands.control.LockControlCommand.acceptVisitor(LockControlCommand.java:131)
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:347)
at org.infinispan.commands.control.LockControlCommand.perform(LockControlCommand.java:150)
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:95)
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:110)
at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:82)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:244)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:217)
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483)
...
{noformat}
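The usual remedy for this class of deadlock is to make every transaction acquire its key locks in one global total order. Whether that is the right fix here is for the implementation to decide; the sketch below is a hypothetical helper, not the actual Infinispan code, and uses the keys' natural ordering.
{noformat}
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: deterministic lock ordering for intermediate keys. */
final class OrderedLocking {

    /**
     * Returns the keys in the order in which their locks should be acquired.
     * Any total order works as long as every transaction uses the same one;
     * if all transactions lock in this order, no lock-wait cycle can form.
     */
    static List<String> lockOrder(Collection<String> intermediateKeys) {
        List<String> ordered = new ArrayList<>(intermediateKeys);
        ordered.sort(Comparator.naturalOrder());
        return ordered;
    }
}
{noformat}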
[JBoss JIRA] (ISPN-2454) The current SQL query that checks table existence is under-optimized on big databases
by Nicolas Filotto (JIRA)
Nicolas Filotto created ISPN-2454:
-------------------------------------
Summary: The current SQL query that checks table existence is under-optimized on big databases
Key: ISPN-2454
URL: https://issues.jboss.org/browse/ISPN-2454
Project: Infinispan
Issue Type: Enhancement
Components: Loaders and Stores
Affects Versions: 5.1.0.FINAL
Reporter: Nicolas Filotto
Assignee: Mircea Markus
Last year I proposed an enhancement that uses the implicit DB schema to ease the life of the end user; unfortunately, it seems to affect the startup time of the product when the tables contain a lot of rows. Indeed, the existence-check query can take several seconds to return a result.
The idea of this task is to optimize this query for all supported databases in order to avoid this drawback.
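One possible direction (an assumption on my part, not necessarily the chosen fix) is to avoid any query that touches table contents and consult the JDBC metadata instead, whose cost does not grow with the row count:
{noformat}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Sketch of a row-count-independent table existence check. */
final class TableExistenceCheck {

    static boolean tableExists(Connection connection, String tableName) throws SQLException {
        // DatabaseMetaData.getTables() consults the catalog, not the table's
        // rows. Note that some drivers treat tableName case-sensitively.
        try (ResultSet tables = connection.getMetaData()
                .getTables(null, null, tableName, new String[] {"TABLE"})) {
            return tables.next();
        }
    }
}
{noformat}
An alternative with the same property is a query guaranteed to match nothing, e.g. SELECT 1 FROM <table> WHERE 1 = 0, relying on the database to fail fast when the table is missing.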
[JBoss JIRA] (ISPN-2431) Default task failover policy enhancements
by Erik Salter (JIRA)
Erik Salter created ISPN-2431:
---------------------------------
Summary: Default task failover policy enhancements
Key: ISPN-2431
URL: https://issues.jboss.org/browse/ISPN-2431
Project: Infinispan
Issue Type: Feature Request
Components: Distributed Execution and Map/Reduce
Affects Versions: 5.2.0.Beta2
Reporter: Erik Salter
Assignee: Vladimir Blagojevic
The new failover policy enhancements behave differently than the 5.1 release. The default is a random failover, which causes problems in my environment due to pessimistic locks being acquired. The default policy should be "none" (see the sketch below).
Secondly, if a user specifies a key set, the failover policy should respect it. Using pessimistic locks as an example, a valid optimization would be to submit the task on the local data owner to avoid additional RPCs; a random-node failover policy really doesn't fit in this case.
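A hypothetical sketch of what a "none" policy could look like; the interface and names below are illustrative stand-ins, not Infinispan's actual failover SPI:
{noformat}
import java.util.List;

/** Illustrative stand-in for a task failover SPI (not the real interface). */
interface TaskFailoverPolicy {
    /** Returns the node to re-run the failed task on, or null to give up. */
    String failoverTarget(String failedNode, List<String> candidateNodes);
}

/** "none": never retry elsewhere. */
final class NoFailoverPolicy implements TaskFailoverPolicy {
    @Override
    public String failoverTarget(String failedNode, List<String> candidateNodes) {
        // Re-running the task on a random node could contend with pessimistic
        // locks still held on behalf of the failed attempt, so do nothing.
        return null;
    }
}
{noformat}
A key-set-aware policy would similarly restrict candidateNodes to the owners of the user-supplied keys instead of picking at random.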
[JBoss JIRA] (ISPN-2460) Allow 1PCs for synchronous xsite replication
by Erik Salter (JIRA)
Erik Salter created ISPN-2460:
---------------------------------
Summary: Allow 1PCs for synchronous xsite replication
Key: ISPN-2460
URL: https://issues.jboss.org/browse/ISPN-2460
Project: Infinispan
Issue Type: Feature Request
Components: Cross-Site Replication
Affects Versions: 5.2.0.Beta3
Reporter: Erik Salter
Assignee: Mircea Markus
We should allow a 1PC optimization for xsite synchronous replication. Since synchronous replication is performed in the originating site's transactional context and the remote locks are acquired at prepare time, we can still guarantee consistency of data.
The default should be to use 1PC. However, this feature will require a new configuration option, <backup ... use2PC="true"/>, to allow the traditional 2PC to be enabled.
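Spelled out, the proposed configuration could look like the snippet below; the site name is a placeholder and attribute names other than use2PC are assumptions based on the existing xsite backup element:
{noformat}
<backup site="NYC" strategy="SYNC" use2PC="true"/>
{noformat}
With the attribute absent or false, the backup call would use the 1PC default described above.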
[JBoss JIRA] (ISPN-1965) Some entries not available during view change
by Michal Linhard (JIRA)
Michal Linhard created ISPN-1965:
------------------------------------
Summary: Some entries not available during view change
Key: ISPN-1965
URL: https://issues.jboss.org/browse/ISPN-1965
Project: Infinispan
Issue Type: Bug
Affects Versions: 5.1.3.FINAL
Reporter: Michal Linhard
Assignee: Manik Surtani
In the 4-node, dist-mode, num-owners=2 elasticity test
http://www.qa.jboss.com/~mlinhard/hyperion/run44-elas-dist/
there is a period of roughly 90 seconds where clients get null responses to GET requests on entries that should exist in the cache.
First occurrence: hyperion1139.log 05:31:01,202 286.409
Last occurrence: hyperion1135.log 05:32:45,441 390.648
Total occurrence count (across all 19 driver nodes): 152241
(This doesn't mean it happens for 152K keys, because each key is retried after an erroneous attempt.)
Data doesn't seem to be lost, because these errors cease after a while and the number of entries returns to normal (see cache_entries.csv).
This happens approximately in the period between node0001 being killed and the cluster {node0002 - node0004} being formed (and shortly after).