[JBoss JIRA] (ISPN-3209) Server operation to suppress state transfer
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-3209?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-3209:
----------------------------------
Description: This issue is to provide a Server operation for the functionality implemented in ISPN-3140 (was: This feature request is to expose a JMX operation on each node, to suppress state transfer for a period of time. This flag would be {{false}} by default.
The use case of this flag would be to ease bringing down (and up) a cluster for maintenance work. A typical workflow would be:
1) Shut down application requests to the data grid
2) Suppress state transfer on all nodes via JMX
3) Bring down all nodes
4) Perform maintenance work
5) Bring up nodes, one at a time. As each node comes up, disable state transfer for the node via JMX.
6) Once all nodes are up, enable state transfer for each node again via JMX
7) Allow application requests to reach the grid again.
The purpose of this is to allow smooth and fast shutdown and startup, and to remove the risk of OOM errors when bringing a grid down.
This is a small but useful subset of full manual state transfer as defined in ISPN-1394.)
> Server operation to suppress state transfer
> -------------------------------------------
>
> Key: ISPN-3209
> URL: https://issues.jboss.org/browse/ISPN-3209
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Cache, State transfer
> Affects Versions: 5.2.6.Final
> Reporter: Manik Surtani
> Assignee: Mircea Markus
> Fix For: 5.3.0.Final
>
>
> This issue is to provide a Server operation for the functionality implemented in ISPN-3140
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 7 months
[JBoss JIRA] (ISPN-3209) Server operation to suppress state transfer
by Tristan Tarrant (JIRA)
Tristan Tarrant created ISPN-3209:
-------------------------------------
Summary: Server operation to suppress state transfer
Key: ISPN-3209
URL: https://issues.jboss.org/browse/ISPN-3209
Project: Infinispan
Issue Type: Feature Request
Components: Distributed Cache, State transfer
Affects Versions: 5.2.6.Final
Reporter: Manik Surtani
Assignee: Mircea Markus
Fix For: 5.3.0.Final
This feature request is to expose a JMX operation on each node, to suppress state transfer for a period of time. This flag would be {{false}} by default.
The use case of this flag would be to ease bringing down (and up) a cluster for maintenance work. A typical workflow would be:
1) Shut down application requests to the data grid
2) Suppress state transfer on all nodes via JMX
3) Bring down all nodes
4) Perform maintenance work
5) Bring up nodes, one at a time. As each node comes up, disable state transfer for the node via JMX.
6) Once all nodes are up, enable state transfer for each node again via JMX
7) Allow application requests to reach the grid again.
The purpose of this is to allow smooth and fast shutdown and startup, and to remove the risk of OOM errors when bringing a grid down.
This is a small but useful subset of full manual state transfer as defined in ISPN-1394.
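The workflow above amounts to flipping a boolean on each node through JMX. A minimal sketch of that interaction, using only the JDK's javax.management API, might look like the following. The StateTransferSwitch MBean, its StateTransferSuppressed attribute and the example.grid object name are hypothetical stand-ins, not the names Infinispan actually exposes.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Toy stand-in for a per-node switch; every name here is hypothetical,
// not Infinispan's actual MBean or attribute.
class StateTransferSwitchDemo {

    // Standard MBean contract: the interface name is the class name plus "MBean".
    public interface StateTransferSwitchMBean {
        boolean isStateTransferSuppressed();
        void setStateTransferSuppressed(boolean suppressed);
    }

    public static class StateTransferSwitch implements StateTransferSwitchMBean {
        // false by default, matching the default proposed in the request
        private volatile boolean suppressed;

        @Override public boolean isStateTransferSuppressed() { return suppressed; }
        @Override public void setStateTransferSuppressed(boolean suppressed) { this.suppressed = suppressed; }
    }

    // Hypothetical object name for the switch
    static final String NAME = "example.grid:type=Node,component=StateTransferSwitch";

    // Registers the switch once and returns its name.
    static ObjectName register(MBeanServer server) {
        try {
            ObjectName name = new ObjectName(NAME);
            if (!server.isRegistered(name)) {
                server.registerMBean(new StateTransferSwitch(), name);
            }
            return name;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes the flag the way a JMX console or script would.
    static void setSuppressed(MBeanServer server, ObjectName name, boolean value) {
        try {
            server.setAttribute(name, new Attribute("StateTransferSuppressed", value));
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean isSuppressed(MBeanServer server, ObjectName name) {
        try {
            return (Boolean) server.getAttribute(name, "StateTransferSuppressed");
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = register(server);
        setSuppressed(server, name, true);   // step 2: before bringing nodes down
        System.out.println("suppressed=" + isSuppressed(server, name));
        setSuppressed(server, name, false);  // step 6: once all nodes are back up
        System.out.println("suppressed=" + isSuppressed(server, name));
    }
}
```

In a real cluster the same attribute write would go through a remote JMX connection to each node in turn. The local platform MBean server is used here only to keep the sketch self-contained.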
[JBoss JIRA] (ISPN-3163) Replacing entry via HotRod which was initially stored via Memcached does not change CAS
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-3163?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-3163:
----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 5.3.0.CR2
Resolution: Done
> Replacing entry via HotRod which was initially stored via Memcached does not change CAS
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-3163
> URL: https://issues.jboss.org/browse/ISPN-3163
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.3.0.CR1
> Reporter: Martin Gencur
> Assignee: Galder Zamarreño
> Fix For: 5.3.0.CR2, 5.3.0.Final
>
>
> Users might expect the CAS (check-and-set) operation to work even in compatibility mode, which is currently not the case in the following scenario:
> 1) store a key/value via Memcached
> 2) change the value via HotRod or Embedded
> 3) use Memcached's CAS operation
> In step #3 the memcached client will update the value even though the value was changed by another client in the meantime. The memcached client was supposed to change it only if it had not been changed in the meantime.
> The following test snippet shows the problem:
> {code:java}
> public void testMemcachedPutHotRodEmbbeddedReplaceMemcachedCASTest() throws Exception {
>    final String key1 = "5";
>
>    // 1. Put with Memcached
>    Future<Boolean> f = cacheFactory.getMemcachedClient().set(key1, 0, "v1");
>    assertTrue(f.get(60, TimeUnit.SECONDS));
>    CASValue oldValue = cacheFactory.getMemcachedClient().gets(key1);
>
>    // 2. Replace with Hot Rod
>    VersionedValue versioned = cacheFactory.getHotRodCache().getVersioned(key1);
>    assertTrue(cacheFactory.getHotRodCache().replaceWithVersion(key1, "v2", versioned.getVersion()));
>
>    // 3. Replace with Embedded
>    assertTrue(cacheFactory.getEmbeddedCache().replace(key1, "v2", "v3"));
>
>    // 4. Get with Memcached and verify value/CAS
>    CASValue newValue = cacheFactory.getMemcachedClient().gets(key1);
>    assertEquals("v3", newValue.getValue());
>    assertTrue("The version (CAS) should have changed", oldValue.getCas() != newValue.getCas()); // <---- fails here
> }
> {code}
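The behaviour the test expects can be illustrated with a toy model: a single store in which every successful write, whichever endpoint it comes from, is assigned a fresh version, so a CAS issued with a stale version is rejected. The sketch below is only an illustration under that assumption. VersionedStoreSketch and its methods are hypothetical and do not reflect how Infinispan implements compatibility mode.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, not Infinispan code: one store shared by several endpoints.
class VersionedStoreSketch {

    static final class Versioned {
        final String value;
        final long version;
        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> data = new HashMap<>();
    private long nextVersion = 1;

    // Every successful write, from any endpoint, gets a fresh version.
    private void store(String key, String value) {
        data.put(key, new Versioned(value, nextVersion++));
    }

    // Unconditional write.
    synchronized void set(String key, String value) {
        store(key, value);
    }

    // Read the value together with its current version.
    synchronized Versioned gets(String key) {
        return data.get(key);
    }

    // Value-based replace, as an embedded caller would use.
    synchronized boolean replace(String key, String expected, String newValue) {
        Versioned current = data.get(key);
        if (current == null || !current.value.equals(expected)) {
            return false;
        }
        store(key, newValue);
        return true;
    }

    // Version-based replace (CAS): applied only if nobody wrote in the meantime.
    synchronized boolean cas(String key, long expectedVersion, String newValue) {
        Versioned current = data.get(key);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        store(key, newValue);
        return true;
    }

    public static void main(String[] args) {
        VersionedStoreSketch store = new VersionedStoreSketch();
        store.set("5", "v1");
        long stale = store.gets("5").version;
        store.replace("5", "v1", "v2"); // another client changes the value
        System.out.println("stale CAS applied: " + store.cas("5", stale, "v3"));
    }
}
```

The point of the model is that the version is bumped in the shared write path rather than per endpoint, which is the property the failing assertion checks.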
[JBoss JIRA] (ISPN-3089) The first write to a joiner in invalidation mode can be ignored
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-3089?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-3089:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 5.3.0.CR2
Resolution: Done
> The first write to a joiner in invalidation mode can be ignored
> ---------------------------------------------------------------
>
> Key: ISPN-3089
> URL: https://issues.jboss.org/browse/ISPN-3089
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.6.Final, 5.3.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 5.3.0.CR2, 5.3.0.Final
>
>
> In invalidation mode we don't wait for the initial state transfer to finish, or even for the joiner to become a member in the "write" CH, before returning to the user from {{getCache()}}.
> Writes to the joiner before it becomes a member of the write CH are not committed to the local cache, because of a check in {{EntryWrappingInterceptor.shouldWrap()}}. So it's possible for the joiner to ignore the first write completely.
> If {{StateTransferConfigurationBuilder}} enabled {{awaitInitialTransfer}} by default in invalidation mode, like it does in distributed/replicated mode, this wouldn't happen.
> The fact that it will still be possible for the user to disable {{awaitInitialTransfer}} manually shouldn't be a problem, because in invalidation mode the user should expect values to be invalidated at any time. This is just about improving the default configuration.
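The race described above can be pictured with a small toy model. In the sketch below, a joiner drops writes until it has joined, and getCache() either returns immediately or waits for the join, depending on a flag that plays the role of awaitInitialTransfer. JoinerSketch and all of its methods are hypothetical and greatly simplified. The sketch does not use Infinispan's classes or reproduce the actual check in EntryWrappingInterceptor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Toy model, not Infinispan code: a node that ignores writes until it has joined.
class JoinerSketch {

    private final Map<String, String> local = new ConcurrentHashMap<>();
    // Released once the node counts as a member for writes.
    private final CountDownLatch joined = new CountDownLatch(1);

    // Called by the (simulated) cluster when the join completes.
    void completeJoin() {
        joined.countDown();
    }

    // Models getCache(): optionally waits for the join before returning.
    JoinerSketch getCache(boolean awaitInitialTransfer) {
        if (awaitInitialTransfer) {
            try {
                joined.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to join", e);
            }
        }
        return this;
    }

    // Returns true if the write was committed locally, false if it was dropped.
    boolean write(String key, String value) {
        if (joined.getCount() > 0) {
            return false; // not yet a member: the write is silently ignored
        }
        local.put(key, value);
        return true;
    }

    String read(String key) {
        return local.get(key);
    }

    public static void main(String[] args) {
        JoinerSketch eager = new JoinerSketch();
        System.out.println("committed without waiting: " + eager.getCache(false).write("k", "v"));

        JoinerSketch patient = new JoinerSketch();
        new Thread(patient::completeJoin).start(); // join finishes in the background
        System.out.println("committed after waiting: " + patient.getCache(true).write("k", "v"));
    }
}
```

With the flag set, the caller cannot obtain the cache before the join has completed, so the first write is never dropped. That is the effect the proposed default change aims for.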
[JBoss JIRA] (ISPN-2836) org.jgroups.TimeoutException after invoking MapCombineCommand in Map/Reduce task with 2 nodes
by Alan Field (JIRA)
[ https://issues.jboss.org/browse/ISPN-2836?page=com.atlassian.jira.plugin.... ]
Alan Field updated ISPN-2836:
-----------------------------
Workaround Description: Pedro is adding the ability to set a timeout on the MapReduceTask object in Infinispan 5.3. In previous versions of Infinispan, the timeout can be increased using the Sync.replTimeout value in the cache configuration.
Workaround: Workaround Exists
Affects: Documentation (Ref Guide, User Guide, etc.)
> org.jgroups.TimeoutException after invoking MapCombineCommand in Map/Reduce task with 2 nodes
> ---------------------------------------------------------------------------------------------
>
> Key: ISPN-2836
> URL: https://issues.jboss.org/browse/ISPN-2836
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 5.2.1.Final
> Reporter: Alan Field
> Assignee: Pedro Ruivo
> Priority: Blocker
> Labels: onboard
> Fix For: 5.3.0.Final
>
> Attachments: afield-tcp-521-final.txt, benchmark-mapreduce-multifilesize.xml, dist-udp-no-tx.xml, jgroups-udp.xml, udp-edg-perf01.txt, udp-edg-perf02.txt
>
>
> Using RadarGun and two nodes to execute the example WordCount Map/Reduce job against a cache with ~550 keys with a value size of 1MB is producing a thread deadlock. The cache is distributed with transactions disabled.
> TCP transport deadlocks without throwing an exception. Disabling the send queue and setting UNICAST2.conn_expiry_timeout=0 prevents the deadlock, but the job does not complete. The nodes send "are-you-alive" messages back and forth, and I have seen the following exception:
> {noformat}
> 11:44:29,970 ERROR [org.jgroups.protocols.TCP] (OOB-98,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (76 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:352)
> at org.radargun.cachewrappers.InfinispanMapReduceWrapper.executeMapReduceTask(InfinispanMapReduceWrapper.java:98)
> at org.radargun.stages.MapReduceStage.executeOnSlave(MapReduceStage.java:74)
> at org.radargun.Slave$2.run(Slave.java:103)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> at org.infinispan.distexec.mapreduce.MapReduceTask$TaskPart.get(MapReduceTask.java:832)
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhaseWithLocalReduction(MapReduceTask.java:477)
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:350)
> ... 9 more
> Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.infinispan.util.Util.rewrapAsCacheException(Util.java:541)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:186)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
> 11:44:29,978 ERROR [org.jgroups.protocols.TCP] (Timer-3,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (60 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:175)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:197)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:254)
> at org.infinispan.remoting.rpc.RpcManagerImpl.access$000(RpcManagerImpl.java:80)
> at org.infinispan.remoting.rpc.RpcManagerImpl$1.call(RpcManagerImpl.java:288)
> ... 5 more
> Caused by: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:390)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
> 11:44:29,979 ERROR [org.jgroups.protocols.TCP] (Timer-4,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (63 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
> ... 11 more
> {noformat}
> With UDP transport, both threads are deadlocked. I will attach thread dumps from runs using TCP and UDP transport.