[JBoss JIRA] (ISPN-2871) All nodes are not replicated when eviction is enabled
by Chris Beer (JIRA)
Chris Beer created ISPN-2871:
--------------------------------
Summary: All nodes are not replicated when eviction is enabled
Key: ISPN-2871
URL: https://issues.jboss.org/browse/ISPN-2871
Project: Infinispan
Issue Type: Bug
Components: Eviction
Affects Versions: 5.2.1.Final
Reporter: Chris Beer
Assignee: Mircea Markus
When I enable replication and eviction, it appears that not all nodes are replicated to all hosts. This problem was discovered when clustering ModeShape with eviction enabled: critical nodes were not being properly replicated.
I've modified the clustered-cache quick-start to (hopefully) demonstrate this problem:
https://github.com/cbeer/infinispan-quickstart/tree/replication-eviction-...
Node1 creates 100 cache entries (key0 through key99). When eviction is disabled, the final cache size on Node0 is 100; when eviction is enabled, the final cache size is only 78.
This seems suspiciously similar to ISPN-2712.
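For reference, a minimal sketch of the repro (assuming Infinispan 5.2's programmatic configuration API; the eviction strategy and cap below are illustrative, the actual settings are in the quickstart linked above):
{code}
// Replicated cache with eviction enabled (strategy/cap illustrative)
ConfigurationBuilder cfg = new ConfigurationBuilder();
cfg.clustering().cacheMode(CacheMode.REPL_SYNC)
   .eviction().strategy(EvictionStrategy.LRU).maxEntries(1000);
EmbeddedCacheManager cm = new DefaultCacheManager(
      GlobalConfigurationBuilder.defaultClusteredBuilder().build(), cfg.build());
Cache<String, String> cache = cm.getCache();

// Node1 writes 100 entries...
for (int i = 0; i < 100; i++)
   cache.put("key" + i, "value" + i);

// ...and on Node0, cache.size() ends up at 100 with eviction disabled,
// but only 78 with eviction enabled.
{code}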
[JBoss JIRA] (ISPN-2836) Thread deadlock in Map/Reduce with 2 nodes
by Alan Field (JIRA)
[ https://issues.jboss.org/browse/ISPN-2836?page=com.atlassian.jira.plugin.... ]
Alan Field commented on ISPN-2836:
----------------------------------
OK, I am running the job again in Jenkins using the TCP configuration. (https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/afield-radargun-mapr...)
It is not generating a deadlock, but the Map/Reduce job with two nodes still does not complete. With one node executing the job, I see this sequence of log messages (from https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/afield-radargun-mapr...):
{noformat}
11:09:54,934 INFO [org.radargun.Slave] (pool-1-thread-1) Executing stage: MapReduce {collatorFqn=null, exitBenchmarkOnSlaveFailure=false, mapperFqn=org.infinispan.demo.mapreduce.WordCountMapper, reducerFqn=org.infinispan.demo.mapreduce.WordCountReducer, runOnAllSlaves=false, slaves=null, useSmartClassLoading=true }
11:09:54,935 INFO [org.radargun.stages.MapReduceStage] (pool-1-thread-1) --------------------
11:09:54,944 INFO [org.radargun.utils.ClassLoadHelper] (pool-1-thread-1) Creating newInstance org.infinispan.demo.mapreduce.WordCountMapper with classloader java.net.URLClassLoader@4bf54c5f
11:09:54,948 INFO [org.radargun.utils.ClassLoadHelper] (pool-1-thread-1) Creating newInstance org.infinispan.demo.mapreduce.WordCountReducer with classloader java.net.URLClassLoader@4bf54c5f
11:09:55,188 DEBUG [org.infinispan.distexec.mapreduce.MapReduceTask] (transport-thread-0) Invoking MapCombineCommand [keys=[], taskId=4d6ddf79-d919-4db6-ae34-20a25a158434] locally
11:11:10,174 DEBUG [org.infinispan.distexec.mapreduce.MapReduceTask] (transport-thread-0) Invoked MapCombineCommand [keys=[], taskId=4d6ddf79-d919-4db6-ae34-20a25a158434] locally
11:11:47,849 INFO [org.radargun.stages.MapReduceStage] (pool-1-thread-1) MapReduce task completed in 112.91 seconds
{noformat}
The RadarGun stage starts, instantiates the Mapper and Reducer classes, and the task completes after the MapCombineCommand is invoked.
When two nodes execute the Map/Reduce job, an exception occurs after the MapCombineCommand is invoked on the second node in the cluster, and the "MapReduce task completed" message from the RadarGun stage never appears.
{noformat}
11:12:45,617 INFO [org.radargun.Slave] (pool-1-thread-1) Executing stage: MapReduce {collatorFqn=null, exitBenchmarkOnSlaveFailure=false, mapperFqn=org.infinispan.demo.mapreduce.WordCountMapper, reducerFqn=org.infinispan.demo.mapreduce.WordCountReducer, runOnAllSlaves=false, slaves=null, useSmartClassLoading=true }
11:12:45,618 INFO [org.radargun.stages.MapReduceStage] (pool-1-thread-1) --------------------
11:12:45,618 INFO [org.radargun.utils.ClassLoadHelper] (pool-1-thread-1) Creating newInstance org.infinispan.demo.mapreduce.WordCountMapper with classloader java.net.URLClassLoader@4bf54c5f
11:12:45,618 INFO [org.radargun.utils.ClassLoadHelper] (pool-1-thread-1) Creating newInstance org.infinispan.demo.mapreduce.WordCountReducer with classloader java.net.URLClassLoader@4bf54c5f
11:12:45,622 DEBUG [org.infinispan.distexec.mapreduce.MapReduceTask] (transport-thread-5) Invoking MapCombineCommand [keys=[], taskId=262e0a68-ee5c-4534-8458-97340158b129] locally
11:12:45,622 DEBUG [org.infinispan.distexec.mapreduce.MapReduceTask] (pool-1-thread-1) Invoking MapCombineCommand [keys=[], taskId=262e0a68-ee5c-4534-8458-97340158b129] on edg-perf02-43110
11:12:45,625 DEBUG [org.infinispan.distexec.mapreduce.MapReduceTask] (pool-1-thread-1) Invoked MapCombineCommand [keys=[], taskId=262e0a68-ee5c-4534-8458-97340158b129] on edg-perf02-43110
11:12:46,228 DEBUG [org.jgroups.protocols.FD] (Timer-4,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:12:49,230 DEBUG [org.jgroups.protocols.FD] (Timer-2,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:12:52,232 DEBUG [org.jgroups.protocols.FD] (Timer-2,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:12:55,233 DEBUG [org.jgroups.protocols.FD] (Timer-4,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:12:58,233 DEBUG [org.jgroups.protocols.FD] (Timer-3,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:01,235 DEBUG [org.jgroups.protocols.FD] (Timer-2,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:04,237 DEBUG [org.jgroups.protocols.FD] (Timer-5,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:07,237 DEBUG [org.jgroups.protocols.FD] (Timer-5,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:10,238 DEBUG [org.jgroups.protocols.FD] (Timer-5,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:13,963 DEBUG [org.jgroups.protocols.FD] (Timer-5,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:13,964 DEBUG [org.jgroups.protocols.FD] (Timer-5,default,edg-perf01-54809) heartbeat missing from edg-perf02-43110 (number=0)
11:13:16,965 DEBUG [org.jgroups.protocols.FD] (Timer-3,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:19,966 DEBUG [org.jgroups.protocols.FD] (Timer-4,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:22,289 DEBUG [org.infinispan.distexec.mapreduce.MapReduceTask] (transport-thread-5) Invoked MapCombineCommand [keys=[], taskId=262e0a68-ee5c-4534-8458-97340158b129] locally
11:13:22,966 DEBUG [org.jgroups.protocols.FD] (Timer-4,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:25,967 DEBUG [org.jgroups.protocols.FD] (Timer-5,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:28,969 DEBUG [org.jgroups.protocols.FD] (Timer-5,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:13:31,970 DEBUG [org.jgroups.protocols.FD] (Timer-4,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
11:15:03,446 DEBUG [org.jgroups.protocols.FD] (Timer-4,default,edg-perf01-54809) sending are-you-alive msg to edg-perf02-43110 (own address=edg-perf01-54809)
Exception in thread "main" java.util.concurrent.ExecutionException: org.infinispan.CacheException: java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-43110
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
at java.util.concurrent.FutureTask.get(FutureTask.java:91)
at org.radargun.Slave.startCommunicationWithMaster(Slave.java:124)
at org.radargun.Slave.start(Slave.java:67)
at org.radargun.Slave.main(Slave.java:230)
Caused by: org.infinispan.CacheException: java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-43110
at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:352)
at org.radargun.cachewrappers.InfinispanMapReduceWrapper.executeMapReduceTask(InfinispanMapReduceWrapper.java:98)
at org.radargun.stages.MapReduceStage.executeOnSlave(MapReduceStage.java:74)
at org.radargun.Slave$2.run(Slave.java:103)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-43110
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at org.infinispan.distexec.mapreduce.MapReduceTask$TaskPart.get(MapReduceTask.java:829)
at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhaseWithLocalReduction(MapReduceTask.java:474)
at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:350)
... 9 more
Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-43110
at org.infinispan.util.Util.rewrapAsCacheException(Util.java:532)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:185)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:180)
11:15:03,459 WARN [org.jgroups.protocols.FD] (OOB-90,default,edg-perf01-54809) I was suspected by edg-perf02-43110; ignoring the SUSPECT message and sending back a HEARTBEAT_ACK
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:202)
at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:259)
at org.infinispan.remoting.rpc.RpcManagerImpl.access$000(RpcManagerImpl.java:82)
at org.infinispan.remoting.rpc.RpcManagerImpl$1.call(RpcManagerImpl.java:293)
... 5 more
Caused by: org.jgroups.TimeoutException: timeout sending message to edg-perf02-43110
at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:390)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:299)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:178)
... 11 more
{noformat}
After this exception occurs, a number of message sends fail, then a MERGE event and a VIEW_CHANGE occur. From that point on, the two nodes send a stream of "are-you-alive" messages to each other and the RadarGun stage never completes; Jenkins eventually kills the job. I will run this job again using the UDP configuration, but I expect the same behavior, since the org.jgroups.TimeoutException also appears above.
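For anyone reproducing this outside RadarGun, the failing call corresponds to plain MapReduceTask usage against the 5.2 API; a sketch wiring up the demo classes named in the logs:
{code}
Cache<String, String> cache = cacheManager.getCache();
MapReduceTask<String, String, String, Integer> task =
      new MapReduceTask<String, String, String, Integer>(cache);
Map<String, Integer> wordCounts = task
      .mappedWith(new WordCountMapper())
      .reducedWith(new WordCountReducer())
      .execute(); // blocks; this is the MapReduceTask.execute() in the trace above
{code}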
> Thread deadlock in Map/Reduce with 2 nodes
> ------------------------------------------
>
> Key: ISPN-2836
> URL: https://issues.jboss.org/browse/ISPN-2836
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 5.2.1.Final
> Reporter: Alan Field
> Assignee: Vladimir Blagojevic
> Attachments: afield-tcp-521-final.txt, udp-edg-perf01.txt, udp-edg-perf02.txt
>
>
> Using RadarGun and two nodes to execute the example WordCount Map/Reduce job against a cache with ~550 keys, each with a 1MB value, produces a thread deadlock. The cache is distributed with transactions disabled.
> TCP transport deadlocks without throwing an exception. Disabling the send queue and setting UNICAST2.conn_expiry_timeout=0 prevents the deadlock, but the job does not complete. The nodes send "are-you-alive" messages back and forth, and I have seen the following exception:
> {noformat}
> 11:44:29,970 ERROR [org.jgroups.protocols.TCP] (OOB-98,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (76 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:352)
> at org.radargun.cachewrappers.InfinispanMapReduceWrapper.executeMapReduceTask(InfinispanMapReduceWrapper.java:98)
> at org.radargun.stages.MapReduceStage.executeOnSlave(MapReduceStage.java:74)
> at org.radargun.Slave$2.run(Slave.java:103)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> at org.infinispan.distexec.mapreduce.MapReduceTask$TaskPart.get(MapReduceTask.java:832)
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhaseWithLocalReduction(MapReduceTask.java:477)
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:350)
> ... 9 more
> Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.infinispan.util.Util.rewrapAsCacheException(Util.java:541)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:186)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
> 11:44:29,978 ERROR [org.jgroups.protocols.TCP] (Timer-3,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (60 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:175)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:197)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:254)
> at org.infinispan.remoting.rpc.RpcManagerImpl.access$000(RpcManagerImpl.java:80)
> at org.infinispan.remoting.rpc.RpcManagerImpl$1.call(RpcManagerImpl.java:288)
> ... 5 more
> Caused by: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:390)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
> 11:44:29,979 ERROR [org.jgroups.protocols.TCP] (Timer-4,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (63 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
> ... 11 more
> {noformat}
> With UDP transport, both threads are deadlocked. I will attach thread dumps from runs using TCP and UDP transport.
[JBoss JIRA] (ISPN-2787) NPE after ReplaceCommand
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2787?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2787:
--------------------------------
Fix Version/s: 5.3.0.Final
(was: 5.2.2.Final)
> NPE after ReplaceCommand
> ------------------------
>
> Key: ISPN-2787
> URL: https://issues.jboss.org/browse/ISPN-2787
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.0.Final
> Reporter: Michal Linhard
> Assignee: Adrian Nistor
> Priority: Critical
> Fix For: 5.2.3.Final, 5.3.0.Final
>
>
> (from https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/view/EDG-REPOR...)
> {code}
> 05:11:10,804 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host].[/].[Resteasy]] (http-/172.18.1.7:8080-15) Servlet.service() for servlet Resteasy threw exception: org.jboss.resteasy.spi.UnhandledException: java.lang.NullPointerException
> at org.jboss.resteasy.core.SynchronousDispatcher.handleApplicationException(SynchronousDispatcher.java:351) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.SynchronousDispatcher.handleException(SynchronousDispatcher.java:220) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.SynchronousDispatcher.handleInvokerException(SynchronousDispatcher.java:196) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.SynchronousDispatcher.getResponse(SynchronousDispatcher.java:551) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:513) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:125) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:208) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:55) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:50) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:847) [jboss-servlet-api_3.0_spec-1.0.1.Final-redhat-2.jar:1.0.1.Final-redhat-2]
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:329) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:372) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:877) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:679) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:931) [jbossweb-7.0.17.Final-redhat-1.jar:]
> at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38]
> Caused by: java.lang.NullPointerException
> at org.infinispan.CacheImpl.replaceInternal(CacheImpl.java:828) [infinispan-core-5.2.0.CR3-redhat-1.jar:5.2.0.CR3-redhat-1]
> at org.infinispan.CacheImpl.replace(CacheImpl.java:822) [infinispan-core-5.2.0.CR3-redhat-1.jar:5.2.0.CR3-redhat-1]
> at org.infinispan.CacheImpl.replace(CacheImpl.java:817) [infinispan-core-5.2.0.CR3-redhat-1.jar:5.2.0.CR3-redhat-1]
> at org.infinispan.AbstractDelegatingCache.replace(AbstractDelegatingCache.java:153) [infinispan-core-5.2.0.CR3-redhat-1.jar:5.2.0.CR3-redhat-1]
> at org.infinispan.rest.Server.putOrReplace(Server.scala:186) [infinispan-server-rest-5.2.0.CR3-redhat-1-classes.jar:]
> at org.infinispan.rest.Server.org$infinispan$rest$Server$$putInCache(Server.scala:157) [infinispan-server-rest-5.2.0.CR3-redhat-1-classes.jar:]
> at org.infinispan.rest.Server$$anonfun$putEntry$1.apply(Server.scala:133) [infinispan-server-rest-5.2.0.CR3-redhat-1-classes.jar:]
> at org.infinispan.rest.Server$$anonfun$putEntry$1.apply(Server.scala:120) [infinispan-server-rest-5.2.0.CR3-redhat-1-classes.jar:]
> at org.infinispan.rest.Server.protectCacheNotFound(Server.scala:254) [infinispan-server-rest-5.2.0.CR3-redhat-1-classes.jar:]
> at org.infinispan.rest.Server.putEntry(Server.scala:120) [infinispan-server-rest-5.2.0.CR3-redhat-1-classes.jar:]
> at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source) [:1.6.0_38]
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_38]
> at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_38]
> at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:167) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.ResourceMethod.invokeOnTarget(ResourceMethod.java:257) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.ResourceMethod.invoke(ResourceMethod.java:222) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.ResourceMethod.invoke(ResourceMethod.java:211) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> at org.jboss.resteasy.core.SynchronousDispatcher.getResponse(SynchronousDispatcher.java:536) [resteasy-jaxrs-2.3.4.Final-redhat-2.jar:2.3.4.Final-redhat-2]
> ... 18 more
> {code}
> Seems like the NPE is caused by ReplaceCommand.perform returning null:
> https://github.com/infinispan/infinispan/blob/5.2.0.Final/core/src/main/j...
> Made possible here:
> https://github.com/infinispan/infinispan/blob/5.2.0.Final/core/src/main/j...
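If that reading is correct, the NPE comes from auto-unboxing the command's null result, roughly like this (a paraphrase with illustrative identifiers, not the actual CacheImpl source):
{code}
// ReplaceCommand.perform() can return null per the code linked above...
Object result = invoker.invoke(ctx, replaceCommand);
// ...and unboxing that null into a primitive boolean throws the NPE
boolean replaced = (Boolean) result;
{code}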
[JBoss JIRA] (ISPN-2836) Thread deadlock in Map/Reduce with 2 nodes
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/ISPN-2836?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on ISPN-2836:
--------------------------------
IRC with Alan:
{quote}
[4:01pm] bela: There's no deadlock: most threads are parked
[4:01pm] bela: and the multicast and unicast receiver are waiting for data to be received
[4:01pm] bela: ah, ok
[4:02pm] bela: This is a *completely idle* system !
[4:02pm] afield: OK, so do you think the issue is somewhere in the Map/Reduce code?
[4:02pm] bela: yes
[4:02pm] afield: Because somehow it isn't thinking the job has completed
[4:03pm] bela: I don't see a main thread ?
[4:03pm] afield: OK, I will see if I can do some remote debugging to see where it is stuck
[4:03pm] bela: I do see gang workers, but there's no useful info there
[4:03pm] bela: ok, cool
[4:03pm] bela: add comments to the case
[4:03pm] bela: I believe the JGroups issue is only happening in TCP mode
[4:03pm] afield: And the TCP configuration *does* show a deadlock?
[4:04pm] afield: OK, sorry typing past each other!
[4:04pm] bela: no, but it shows writers are blocking on a send queue
[4:04pm] bela: which isn't serviced by a reader
[4:04pm] bela: use_send_queues was off ?
[4:04pm] afield: Yes it was
[4:05pm] bela: Sorry, Alan, I don't believe you !
[4:06pm] afield: Uh-oh, what is your clue?
[4:06pm] afield: I can look for my config file to verify
[4:06pm] bela: in
[4:06pm] bela: afield-tcp-521-final.txt:
[4:06pm] bela: at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:306)
[4:06pm] bela: at org.jgroups.blocks.TCPConnectionMap$TCPConnection$Sender.addToQueue(TCPConnectionMap.java:615)
[4:06pm] bela: at org.jgroups.blocks.TCPConnectionMap$TCPConnection.send(TCPConnectionMap.java:451)
[4:06pm] bela: at org.jgroups.blocks.TCPConnectionMap.send(TCPConnectionMap.java:174)
[4:07pm] bela: This does use a send queue
[4:07pm] bela: I assume with send queues being disabled, this *should* work
[4:07pm] bela: and also with UDP
[4:07pm] afield: OK, I'll check my config file and rerun
[4:08pm] bela: BTW: the gang workers are all in RUNNABLE states, so they're doing *something*, but I can't see what as there's only 1 line on the trace
[4:08pm] bela: ok, thx
[4:09pm] afield: I see use_send_queues="false" in the config file. That's what I need, right?
[4:09pm] bela: yes, but that's not what *was* used ha ha
[4:10pm] afield: OK, I'll run again. Thanks
[4:10pm] bela: ok. I'll copy this conv into the case, please update the case
[4:10pm] bela: cheers
{quote}
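For reference, the two settings discussed above live in the JGroups stack XML; a fragment might look like this (attribute values are illustrative, not Alan's actual configuration):
{code}
<config xmlns="urn:org:jgroups">
    <!-- use_send_queues="false": senders write on the caller's thread instead
         of queueing for a per-connection sender thread (the queue the blocked
         writers in afield-tcp-521-final.txt were parked on) -->
    <TCP bind_port="7800" use_send_queues="false"/>
    <!-- ...discovery/FD/NAKACK2 protocols elided... -->
    <!-- conn_expiry_timeout="0" disables expiry of idle UNICAST2 connections -->
    <UNICAST2 conn_expiry_timeout="0"/>
</config>
{code}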
> Thread deadlock in Map/Reduce with 2 nodes
> ------------------------------------------
>
> Key: ISPN-2836
> URL: https://issues.jboss.org/browse/ISPN-2836
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 5.2.1.Final
> Reporter: Alan Field
> Assignee: Vladimir Blagojevic
> Attachments: afield-tcp-521-final.txt, udp-edg-perf01.txt, udp-edg-perf02.txt
>
>
> Using RadarGun and two nodes to execute the example WordCount Map/Reduce job against a cache with ~550 keys, each with a 1MB value, produces a thread deadlock. The cache is distributed with transactions disabled.
> TCP transport deadlocks without throwing an exception. Disabling the send queue and setting UNICAST2.conn_expiry_timeout=0 prevents the deadlock, but the job does not complete. The nodes send "are-you-alive" messages back and forth, and I have seen the following exception:
> {noformat}
> 11:44:29,970 ERROR [org.jgroups.protocols.TCP] (OOB-98,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (76 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:352)
> at org.radargun.cachewrappers.InfinispanMapReduceWrapper.executeMapReduceTask(InfinispanMapReduceWrapper.java:98)
> at org.radargun.stages.MapReduceStage.executeOnSlave(MapReduceStage.java:74)
> at org.radargun.Slave$2.run(Slave.java:103)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
> at java.util.concurrent.FutureTask.get(FutureTask.java:83)
> at org.infinispan.distexec.mapreduce.MapReduceTask$TaskPart.get(MapReduceTask.java:832)
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhaseWithLocalReduction(MapReduceTask.java:477)
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:350)
> ... 9 more
> Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.infinispan.util.Util.rewrapAsCacheException(Util.java:541)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:186)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:515)
> 11:44:29,978 ERROR [org.jgroups.protocols.TCP] (Timer-3,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (60 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:175)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:197)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:254)
> at org.infinispan.remoting.rpc.RpcManagerImpl.access$000(RpcManagerImpl.java:80)
> at org.infinispan.remoting.rpc.RpcManagerImpl$1.call(RpcManagerImpl.java:288)
> ... 5 more
> Caused by: org.jgroups.TimeoutException: timeout sending message to edg-perf02-32536
> at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:390)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:301)
> 11:44:29,979 ERROR [org.jgroups.protocols.TCP] (Timer-4,default,edg-perf01-1907) failed sending message to edg-perf02-32536 (63 bytes): java.net.SocketException: Socket closed, cause: null
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:179)
> ... 11 more
> {noformat}
> With UDP transport, both threads are deadlocked. I will attach thread dumps from runs using TCP and UDP transport.
[JBoss JIRA] (ISPN-2869) Optimize GridInputStream.skip()
by Marko Lukša (JIRA)
Marko Lukša created ISPN-2869:
---------------------------------
Summary: Optimize GridInputStream.skip()
Key: ISPN-2869
URL: https://issues.jboss.org/browse/ISPN-2869
Project: Infinispan
Issue Type: Enhancement
Reporter: Marko Lukša
Assignee: Marko Lukša
{{GridInputStream.skip()}} is currently very inefficient, especially when skipping past the currently loaded chunk.
The method also has a small edge-case bug: when the parameter is negative, the method should not skip any bytes, but it actually skips/reads all the remaining bytes of the stream.
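A fixed skip() could simply clamp the argument and advance the stream index without reading chunks; a sketch, assuming illustrative field names (position, streamLength) rather than the actual GridInputStream internals:
{code}
@Override
public long skip(long n) {
   if (n <= 0)
      return 0;                                 // negative/zero: skip nothing
   long skipped = Math.min(n, streamLength - position);
   position += skipped;   // just move the index; the chunk containing the new
                          // position is only fetched on the next read()
   return skipped;
}
{code}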