[JBoss JIRA] (ISPN-2990) L1ManagerImpl doesn't reliably invalidate with async caches
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-2990?page=com.atlassian.jira.plugin.... ]
Work on ISPN-2990 stopped by William Burns.
> L1ManagerImpl doesn't reliably invalidate with async caches
> -----------------------------------------------------------
>
> Key: ISPN-2990
> URL: https://issues.jboss.org/browse/ISPN-2990
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.1.Final
> Reporter: Sebastian Tusk
> Assignee: William Burns
> Labels: onboard
>
> B is owner of k,v1
> A has k,v1 in L1
> 1. TX: A puts k,v2
> 2. TX: A sends async PrepareCommand k,v2 to B (one-phase-commit)
> 3. TX: A removes k,v1 from L1
> 4. A putForExternalRead k,v1 and has it in L1 again
> 5. TX: B executes PrepareCommand k,v2 but doesn't send invalidation to origin
> Result: A has k,v1 and B has k,v2
> Solution: For async caches, send the invalidation to the origin too.
> The problem is that the owner updates the cache entry asynchronously. This gives the origin of the transaction time to request the entry; an outdated version is received and placed in L1. The owner never invalidates the entry, and as a result the cache is inconsistent.
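As a minimal, single-threaded model of the interleaving described above (plain Java only, no Infinispan classes; the maps and class name are made up for illustration):
{code}
import java.util.concurrent.ConcurrentHashMap;

// aL1 models node A's L1 cache, bStore models owner B's data container.
public class Ispn2990RaceSketch {
   public static void main(String[] args) {
      ConcurrentHashMap<String, String> aL1 = new ConcurrentHashMap<String, String>();
      ConcurrentHashMap<String, String> bStore = new ConcurrentHashMap<String, String>();

      bStore.put("k", "v1");              // B is owner of k,v1
      aL1.put("k", "v1");                 // A has k,v1 in L1

      // 1-2. A's tx puts k,v2 and ships the one-phase PrepareCommand to B
      //      asynchronously; B has not applied it yet.
      // 3. A removes k,v1 from its L1.
      aL1.remove("k");

      // 4. Before B applies the prepare, A re-reads the key (putForExternalRead)
      //    and caches the still-current remote value v1 in L1 again.
      aL1.put("k", bStore.get("k"));

      // 5. B now applies the prepare but never invalidates the origin A.
      bStore.put("k", "v2");

      // Result: A has k,v1 in L1 while B owns k,v2 -> inconsistent.
      System.out.println("A sees " + aL1.get("k") + ", B sees " + bStore.get("k"));
   }
}
{code}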
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-3199) API to manipulate multiple versions of an entry
by Manik Surtani (JIRA)
Manik Surtani created ISPN-3199:
-----------------------------------
Summary: API to manipulate multiple versions of an entry
Key: ISPN-3199
URL: https://issues.jboss.org/browse/ISPN-3199
Project: Infinispan
Issue Type: Feature Request
Components: Core API
Affects Versions: 5.3.0.Final
Reporter: Manik Surtani
Assignee: Mircea Markus
Fix For: 7.0.0.Final
In addition to the existing API, the following will be needed to allow user applications to directly control versioning. E.g.,
{code}
put(K key, V value, EntryVersion version);
get(K key, EntryVersion version);
getLatest(K key, EntryVersion upperBound);
evict(K key, EntryVersion upTo);
{code}
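One hypothetical shape such an API could take, as a sketch only (VersionedCache is not an existing Infinispan interface; only EntryVersion is):
{code}
import org.infinispan.container.versioning.EntryVersion;

// Hypothetical sketch of the proposed API - not an actual Infinispan interface.
public interface VersionedCache<K, V> {
   void put(K key, V value, EntryVersion version);   // write an explicitly versioned entry
   V get(K key, EntryVersion version);               // read an exact version
   V getLatest(K key, EntryVersion upperBound);      // newest version not above upperBound
   void evict(K key, EntryVersion upTo);             // drop all versions up to upTo
}
{code}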
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-3198) Allow multiple versions of entries to be stored
by Manik Surtani (JIRA)
Manik Surtani created ISPN-3198:
-----------------------------------
Summary: Allow multiple versions of entries to be stored
Key: ISPN-3198
URL: https://issues.jboss.org/browse/ISPN-3198
Project: Infinispan
Issue Type: Feature Request
Components: Core API
Affects Versions: 5.3.0.Final
Reporter: Manik Surtani
Assignee: Mircea Markus
Fix For: 7.0.0.Final
Storing multiple versions of a given entry will pave the way for eventual consistency based on vector clocks, as well as for API-controlled versioning.
This will require that each entry be a structure containing multiple actual entries, ordered by version.
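A rough sketch of one possible shape for such a structure, assuming a plain long version ordinal rather than Infinispan's EntryVersion (names are illustrative only):
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative only: one possible shape for a multi-versioned entry.
public class VersionedEntry<V> {
   // All stored versions of the value, ordered by version number.
   private final ConcurrentSkipListMap<Long, V> versions =
         new ConcurrentSkipListMap<Long, V>();

   public void put(long version, V value) {
      versions.put(version, value);                            // keep every version
   }

   public V get(long version) {
      return versions.get(version);                            // exact version, or null
   }

   public V getLatest(long upperBound) {
      Map.Entry<Long, V> e = versions.floorEntry(upperBound);  // newest version <= upperBound
      return e == null ? null : e.getValue();
   }

   public void evictUpTo(long upTo) {
      versions.headMap(upTo, true).clear();                    // drop versions <= upTo
   }
}
{code}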
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-2836) org.jgroups.TimeoutException after invoking MapCombineCommand in Map/Reduce task with 2 nodes
by Pedro Ruivo (JIRA)
[ https://issues.jboss.org/browse/ISPN-2836?page=com.atlassian.jira.plugin.... ]
Pedro Ruivo updated ISPN-2836:
------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request: https://github.com/infinispan/infinispan/pull/1882
Pull Request sent. We don't have control over how long a task takes, so I created a timeout parameter per Map/Reduce task.
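For reference, a sketch of how a WordCount-style job could set a per-task timeout; the timeout(long, TimeUnit) call is assumed from the pull request description rather than verified against the merged API:
{code}
import java.util.Iterator;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.concurrent.TimeUnit;
import org.infinispan.Cache;
import org.infinispan.distexec.mapreduce.Collector;
import org.infinispan.distexec.mapreduce.MapReduceTask;
import org.infinispan.distexec.mapreduce.Mapper;
import org.infinispan.distexec.mapreduce.Reducer;

// Sketch only: WordCount with an explicit per-task timeout.
public class WordCountWithTimeout {

   public static Map<String, Integer> run(Cache<String, String> cache) {
      MapReduceTask<String, String, String, Integer> task =
            new MapReduceTask<String, String, String, Integer>(cache);
      task.timeout(5, TimeUnit.MINUTES);   // per-task timeout (assumed API from the PR)
      return task.mappedWith(new WordMapper())
                 .reducedWith(new WordReducer())
                 .execute();
   }

   static class WordMapper implements Mapper<String, String, String, Integer> {
      public void map(String key, String value, Collector<String, Integer> collector) {
         StringTokenizer tokens = new StringTokenizer(value);
         while (tokens.hasMoreTokens()) {
            collector.emit(tokens.nextToken(), 1);    // one count per word occurrence
         }
      }
   }

   static class WordReducer implements Reducer<String, Integer> {
      public Integer reduce(String reducedKey, Iterator<Integer> values) {
         int sum = 0;
         while (values.hasNext()) {
            sum += values.next();
         }
         return sum;
      }
   }
}
{code}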
> org.jgroups.TimeoutException after invoking MapCombineCommand in Map/Reduce task with 2 nodes
> ---------------------------------------------------------------------------------------------
>
> Key: ISPN-2836
> URL: https://issues.jboss.org/browse/ISPN-2836
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 5.2.1.Final
> Reporter: Alan Field
> Assignee: Pedro Ruivo
> Priority: Blocker
> Labels: onboard
> Fix For: 5.3.0.Final
>
> Attachments: afield-tcp-521-final.txt, benchmark-mapreduce-multifilesize.xml, dist-udp-no-tx.xml, jgroups-udp.xml, udp-edg-perf01.txt, udp-edg-perf02.txt
>
>
> Using RadarGun and two nodes to execute the example WordCount Map/Reduce job against a cache with ~550 keys with a value size of 1MB is producing a thread deadlock. The cache is distributed with transactions disabled.
> TCP transport deadlocks without throwing an exception. Disabling the send queue and setting UNICAST2.conn_expiry_timeout=0 prevents the deadlock, but the job does not complete. The nodes send "are-you-alive" messages back and forth, and I have seen the following exception:
> {noformat}
> 11:44:29,970 ERROR [org.jgroups.protocols.TCP] (OOB-98,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (76 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:352)
> at org.radargun.cachewrappers.InfinispanMapReduceWrapper.executeMapReduceTask(InfinispanMapReduceWrapper.java:98)
> at org.radargun.stages.MapReduceStage.executeOnSlave(MapReduceStage.java:74)
> at org.radargun.Slave$2.run(Slave.java:103)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> at org.infinispan.distexec.mapreduce.MapReduceTask$TaskPart.get(MapReduceTask.java:832)
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhaseWithLocalReduction(MapReduceTask.java:477)
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:350)
> ... 9 more
> Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.infinispan.util.Util.rewrapAsCacheException(Util.java:541)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:186)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
> 11:44:29,978 ERROR [org.jgroups.protocols.TCP] (Timer-3,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (60 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:175)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:197)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:254)
> at org.infinispan.remoting.rpc.RpcManagerImpl.access$000(RpcManagerImpl.java:80)
> at org.infinispan.remoting.rpc.RpcManagerImpl$1.call(RpcManagerImpl.java:288)
> ... 5 more
> Caused by: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:390)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
> 11:44:29,979 ERROR [org.jgroups.protocols.TCP] (Timer-4,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (63 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
> ... 11 more
> {noformat}
> With UDP transport, both threads are deadlocked. I will attach thread dumps from runs using TCP and UDP transport.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-3190) Memcached server throwing NullPointerException if the marshaller is not explicitly set
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-3190?page=com.atlassian.jira.plugin.... ]
Work on ISPN-3190 started by Galder Zamarreño.
> Memcached server throwing NullPointerException if the marshaller is not explicitly set
> --------------------------------------------------------------------------------------
>
> Key: ISPN-3190
> URL: https://issues.jboss.org/browse/ISPN-3190
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 5.3.0.CR1
> Reporter: Martin Gencur
> Assignee: Galder Zamarreño
> Fix For: 5.3.0.Final
>
>
> According to the tutorial at https://docs.jboss.org/author/display/ISPN/Interoperability+between+Embed... it is not necessary to specify a special marshaller under normal circumstances. However, the Memcached server seems to require one in any case, and the Memcached client's "get" operation fails because we did not set any marshaller for compatibility mode, so it remained null. Note that I'm not using SpyMemcached.
> My understanding is that even if I use SpyMemcached to retrieve the entry, it will be found and retrieved, but without the marshaller I simply won't be able to interpret the returned value. Something will still be returned, though. Is my assumption correct?
> {code}
> 2013-06-05 14:12:19,054 TRACE (MemcachedServerWorker-6) [org.infinispan.container.EntryFactoryImpl] Retrieved from container ImmortalCacheEntry{key=4, value=MetadataImmortalCacheValue {value=v1, metadata=EmbeddedMetadata{lifespan=-1, maxIdle=-1, version=ServerEntryVersion(1)}}}
> 2013-06-05 14:12:19,054 TRACE (MemcachedServerWorker-6) [org.infinispan.interceptors.CallInterceptor] Executing command: GetKeyValueCommand {key=4, flags=[OPERATION_MEMCACHED]}.
> 2013-06-05 14:12:19,054 TRACE (MemcachedServerWorker-6) [org.infinispan.commands.read.GetKeyValueCommand] Found entry ImmortalCacheEntry{key=4, value=MetadataImmortalCacheValue {value=v1, metadata=EmbeddedMetadata{lifespan=-1, maxIdle=-1, version=ServerEntryVersion(1)}}}
> 2013-06-05 14:12:19,055 ERROR (MemcachedServerWorker-6) [org.infinispan.interceptors.InvocationContextInterceptor] ISPN000136: Execution error
> java.lang.NullPointerException
> at org.infinispan.server.memcached.MemcachedTypeConverter.marshall(MemcachedTypeConverter.scala:58)
> at org.infinispan.server.memcached.MemcachedTypeConverter.unboxValue(MemcachedTypeConverter.scala:45)
> at org.infinispan.server.memcached.MemcachedTypeConverter.unboxValue(MemcachedTypeConverter.scala:37)
> at org.infinispan.interceptors.compat.TypeConverterInterceptor.visitGetKeyValueCommand(TypeConverterInterceptor.java:93)
> at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:120)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92)
> at org.infinispan.commands.AbstractVisitor.visitGetKeyValueCommand(AbstractVisitor.java:96)
> at org.infinispan.commands.read.GetKeyValueCommand.acceptVisitor(GetKeyValueCommand.java:62)
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343)
> at org.infinispan.CacheImpl.getCacheEntry(CacheImpl.java:399)
> at org.infinispan.DecoratedCache.getCacheEntry(DecoratedCache.java:514)
> at org.infinispan.server.memcached.MemcachedDecoder.get(MemcachedDecoder.scala:122)
> at org.infinispan.server.core.AbstractProtocolDecoder.decodeKey(AbstractProtocolDecoder.scala:121)
> at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:75)
> at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:49)
> at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:500)
> at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
> at org.infinispan.server.core.AbstractProtocolDecoder.messageReceived(AbstractProtocolDecoder.scala:385)
> at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
> at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
> at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
> at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
> at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
> at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
> at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
> at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
> at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> {code}
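A sketch of the kind of configuration the report above expected to be optional: compatibility mode with an explicitly supplied marshaller. CompatibilityMarshaller is a hypothetical placeholder for whatever marshaller matches the client, and the fluent compatibility().enable().marshaller(...) calls are assumed from the referenced interoperability tutorial rather than verified against 5.3.0.CR1:
{code}
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

// Sketch only. CompatibilityMarshaller is a hypothetical placeholder class;
// the compatibility().enable().marshaller(...) chain is assumed, not verified.
public class CompatibilityConfigSketch {
   public static Configuration embeddedCompatConfig() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.compatibility().enable().marshaller(new CompatibilityMarshaller());
      return builder.build();
   }
}
{code}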
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-2965) L1 and early invalidation leaves inconsistent state
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-2965?page=com.atlassian.jira.plugin.... ]
William Burns commented on ISPN-2965:
-------------------------------------
Actually getting a minute to think about your second case - is there a reason the value returned by the ClusteredGetCommand is kept in the tx context rather than written directly to the data container? We aren't updating the value here, so it would be fully consistent. If the tx updates the key afterwards, the update will be contained in the tx context and will eventually overwrite the L1 cached value; if it is rolled back, we still have a consistent L1 cache.
> L1 and early invalidation leaves inconsistent state
> ---------------------------------------------------
>
> Key: ISPN-2965
> URL: https://issues.jboss.org/browse/ISPN-2965
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache, Transactions
> Affects Versions: 5.2.1.Final
> Reporter: Sebastian Tusk
> Assignee: William Burns
> Labels: 5.2.x
> Fix For: 5.3.0.Final
>
>
> In a distributed transactional cache with L1 enabled I can observe the following.
> Prepare cache by adding an entry with Cache.put( k, v1 ).
> 1. Node B starts with adding a changed value. Cache.put( k, v2 )
> 2. Node B TxDistributionInterceptor.visitPrepareCommand flushL1Caches sends invalidations.
> 3. Node A calls Cache.get( k ) retrieves v1 and stores this value in L1.
> 4. Node B proceeds with transaction.
> The result is that Node A answers subsequent Cache.get(k) with v1 and Node B answers with v2.
> It seems the invalidation is either sent too early or must be synchronized in some way with the transaction.
> Cache config:
> <namedCache name="entity">
> <jmxStatistics enabled="true" />
> <clustering mode="dist">
> <stateTransfer fetchInMemoryState="false" timeout="20000" />
> <async />
> <l1 enabled="true" />
> <hash numOwners="1"/>
> </clustering>
> <locking isolationLevel="READ_COMMITTED"
> lockAcquisitionTimeout="15000" useLockStriping="false" />
> <eviction maxEntries="10000" strategy="LRU" />
> <expiration maxIdle="100000" wakeUpInterval="5000"/>
> <storeAsBinary storeKeysAsBinary="true" storeValuesAsBinary="false" enabled="false" />
> <transaction transactionMode="TRANSACTIONAL" autoCommit="false" lockingMode="OPTIMISTIC"/>
> </namedCache>
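For reference, a rough programmatic equivalent of the cache configuration above, untested and written against the 5.2.x ConfigurationBuilder API from memory, so treat the exact fluent calls as an approximation (storeAsBinary is disabled in the XML and therefore omitted):
{code}
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.eviction.EvictionStrategy;
import org.infinispan.transaction.LockingMode;
import org.infinispan.transaction.TransactionMode;
import org.infinispan.util.concurrent.IsolationLevel;

// Approximate programmatic form of the "entity" cache configuration above.
public class EntityCacheConfig {
   public static Configuration build() {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.jmxStatistics().enable();
      builder.clustering().cacheMode(CacheMode.DIST_ASYNC);          // <clustering mode="dist"> + <async/>
      builder.clustering().stateTransfer().fetchInMemoryState(false).timeout(20000);
      builder.clustering().l1().enable();
      builder.clustering().hash().numOwners(1);
      builder.locking().isolationLevel(IsolationLevel.READ_COMMITTED)
             .lockAcquisitionTimeout(15000).useLockStriping(false);
      builder.eviction().maxEntries(10000).strategy(EvictionStrategy.LRU);
      builder.expiration().maxIdle(100000).wakeUpInterval(5000);
      builder.transaction().transactionMode(TransactionMode.TRANSACTIONAL)
             .autoCommit(false).lockingMode(LockingMode.OPTIMISTIC);
      return builder.build();
   }
}
{code}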
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-3197) Message ordering of Get and Invalidation can cause L1 to be inconsistent
by William Burns (JIRA)
William Burns created ISPN-3197:
-----------------------------------
Summary: Message ordering of Get and Invalidation can cause L1 to be inconsistent
Key: ISPN-3197
URL: https://issues.jboss.org/browse/ISPN-3197
Project: Infinispan
Issue Type: Bug
Reporter: William Burns
Assignee: Mircea Markus
This is based off of discussion here: https://issues.jboss.org/browse/ISPN-2990?focusedCommentId=12779491&page=...
This can occur with a synchronous cache.
1. A reads k1. This is an OOB call.
2. B processes the read message and sends back the response
3. C updates k1, at this stage B sends the invalidation message to A (OOB call)
4. A processes(ignores) the invalidation message
5. A puts the stale value sent at 2 in L1
It doesn't actually matter that these calls are OOB: even if B's messages were ordered, B could still process the get and the update in a different order, since they originate from different nodes.
The thought is to solve this with some type of tombstone to signal the removal of the L1 entry, but this still doesn't catch the problem if A did not have key k1 in its L1 cache to receive an invalidation message.
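A minimal model of the tombstone idea, again plain Java with made-up names; as noted above it does not address the case where A never had k1 in L1 and so receives no invalidation at all:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal model only - no Infinispan classes. On invalidation the L1 keeps a
// TOMBSTONE marker instead of simply removing the key, so a stale get response
// that arrives afterwards is not re-cached.
public class L1TombstoneSketch {
   private static final Object TOMBSTONE = new Object();
   private final ConcurrentMap<Object, Object> l1 =
         new ConcurrentHashMap<Object, Object>();

   public void onInvalidation(Object key) {
      l1.put(key, TOMBSTONE);                 // remember that the key was invalidated
   }

   public void onRemoteGetResponse(Object key, Object value) {
      // Only cache the remote value if no invalidation (tombstone) arrived first.
      l1.putIfAbsent(key, value);
   }

   public Object get(Object key) {
      Object v = l1.get(key);
      return v == TOMBSTONE ? null : v;       // a tombstone behaves like a miss
   }
}
{code}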
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-3197) Message ordering of Get and Invalidation can cause L1 to be inconsistent
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-3197?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-3197:
--------------------------------
Description:
This is based off of discussion here: https://issues.jboss.org/browse/ISPN-2990?focusedCommentId=12779491&page=...
This can occur with a synchronous cache.
1. A reads k1. This is an OOB call.
2. B processes the read message and sends back the response
3. C updates k1, at this stage B sends the invalidation message to A (OOB call)
4. A processes(ignores) the invalidation message
5. A puts the stale value sent at 2 in L1
It doesn't actually matter that these calls are OOB: even if B's messages were ordered, B could still process the get and the update in a different order, since they originate from different nodes.
The initial thought is to solve this with some type of tombstone to signal the removal of the L1 entry, but this still doesn't catch the problem if A did not have key k1 in its L1 cache to receive an invalidation message.
was:
This is based off of discussion here: https://issues.jboss.org/browse/ISPN-2990?focusedCommentId=12779491&page=...
This can occur with a synchronous cache.
1. A reads k1. This is an OOB call.
2. B processes the read message and sends back the response
3. C updates k1, at this stage B sends the invalidation message to A (OOB call)
4. A processes(ignores) the invalidation message
5. A puts the stale value sent at 2 in L1
It doesn't actually matter that these calls are OOB: even if B's messages were ordered, B could still process the get and the update in a different order, since they originate from different nodes.
The thought is to solve this with some type of tombstone to signal the removal of the L1 entry, but this still doesn't catch the problem if A did not have key k1 in its L1 cache to receive an invalidation message.
> Message ordering of Get and Invalidation can cause L1 to be inconsistent
> ------------------------------------------------------------------------
>
> Key: ISPN-3197
> URL: https://issues.jboss.org/browse/ISPN-3197
> Project: Infinispan
> Issue Type: Bug
> Reporter: William Burns
> Assignee: Mircea Markus
>
> This is based off of discussion here: https://issues.jboss.org/browse/ISPN-2990?focusedCommentId=12779491&page=...
> This can occur with a synchronous cache.
> 1. A reads k1. This is an OOB call.
> 2. B processes the read message and sends back the response
> 3. C updates k1, at this stage B sends the invalidation message to A (OOB call)
> 4. A processes(ignores) the invalidation message
> 5. A puts the stale value sent at 2 in L1
> It doesn't actually matter that these calls are OOB: even if B's messages were ordered, B could still process the get and the update in a different order, since they originate from different nodes.
> The initial thought is to solve this with some type of tombstone to signal the removal of the L1 entry, but this still doesn't catch the problem if A did not have key k1 in its L1 cache to receive an invalidation message.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months