[infinispan-issues] [JBoss JIRA] (ISPRK-22) InfinispanRDD is not fault tolerant

Gustavo Fernandes (JIRA) issues at jboss.org
Fri May 6 04:31:00 EDT 2016


    [ https://issues.jboss.org/browse/ISPRK-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201913#comment-13201913 ] 

Gustavo Fernandes edited comment on ISPRK-22 at 5/6/16 4:30 AM:
----------------------------------------------------------------

[~vjuranek] After a closer inspection, the data loss that happens in the RDD Write Failover test is not related to the connector; it can be reproduced by simply doing a series of PutAll operations on a cache and shutting down a server during the puts.


was (Author: gustavonalle):
[~vjuranek] After a closer inspection, the data loss that happens in the RDD Failover test is not related to the connector; it can be reproduced by simply doing a series of PutAll to a cache and shutting down a server during the puts

> InfinispanRDD is not fault tolerant
> -----------------------------------
>
>                 Key: ISPRK-22
>                 URL: https://issues.jboss.org/browse/ISPRK-22
>             Project: Infinispan Spark
>          Issue Type: Bug
>          Components: RDD
>    Affects Versions: 0.3
>            Reporter: Vojtech Juranek
>            Assignee: Gustavo Fernandes
>             Fix For: 0.4
>
>
> When the primary ISPN server fails while an InfinispanRDD is being processed, Spark is not able to recover from the failure.
> This is caused by re-creating the {{RemoteCacheManager}} with the pre-configured ISPN server address (for reads [here|https://github.com/infinispan/infinispan-spark/blob/master/src/main/scala/org/infinispan/spark/rdd/InfinispanRDD.scala#L66], for writes [here|https://github.com/infinispan/infinispan-spark/blob/master/src/main/scala/org/infinispan/spark/package.scala#L33]). When this server fails during RDD processing and Spark calls a function that creates a {{RemoteCacheManager}} under the hood, the call fails with a connection refused exception.
> [Here|https://github.com/vjuranek/infinispan-spark/commit/fa56b5f072ce24e055036064291f9ac0b2a195c8] are some basic tests and an example of the exception thrown by the HR client:
> {noformat}
> org.infinispan.client.hotrod.exceptions.TransportException:: Could not fetch transport
>         at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.borrowTransportFromPool(TcpTransportFactory.java:395)
>         at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.getTransport(TcpTransportFactory.java:241)
>         at org.infinispan.client.hotrod.impl.operations.FaultTolerantPingOperation.getTransport(FaultTolerantPingOperation.java:26)
>         at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:53)
>         at org.infinispan.client.hotrod.impl.RemoteCacheImpl.ping(RemoteCacheImpl.java:490)
>         at org.infinispan.client.hotrod.impl.RemoteCacheImpl.resolveCompatibility(RemoteCacheImpl.java:551)
>         at org.infinispan.client.hotrod.RemoteCacheManager.createRemoteCache(RemoteCacheManager.java:341)
>         at org.infinispan.client.hotrod.RemoteCacheManager.getCache(RemoteCacheManager.java:222)
>         at org.infinispan.client.hotrod.RemoteCacheManager.getCache(RemoteCacheManager.java:217)
>         at org.infinispan.spark.rdd.InfinispanRDD$$anonfun$1.apply(InfinispanRDD.scala:52)
>         at org.infinispan.spark.rdd.InfinispanRDD$$anonfun$1.apply(InfinispanRDD.scala:52)
>         at scala.Option.map(Option.scala:146)
>         at org.infinispan.spark.rdd.InfinispanRDD.compute(InfinispanRDD.scala:52)
>         at org.infinispan.spark.rdd.InfinispanRDD.compute(InfinispanRDD.scala:66)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.infinispan.client.hotrod.exceptions.TransportException:: Could not connect to server: /127.0.0.1:11222
>         at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.<init>(TcpTransport.java:78)
>         at org.infinispan.client.hotrod.impl.transport.tcp.TransportObjectFactory.makeObject(TransportObjectFactory.java:35)
>         at org.infinispan.client.hotrod.impl.transport.tcp.TransportObjectFactory.makeObject(TransportObjectFactory.java:16)
>         at org.apache.commons.pool.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:1220)
>         at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.borrowTransportFromPool(TcpTransportFactory.java:390)
>         ... 21 more
> Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>         at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111)
>         at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.<init>(TcpTransport.java:68)
>         ... 25 more
> {noformat}
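
The failure mode described in the issue (a client that always reconnects to one pre-configured address) can be illustrated with a minimal sketch using plain TCP sockets. This is not the connector's code and does not use the Hot Rod client API; the class {{FailoverSketch}} and the method {{firstReachable}} are hypothetical names invented for illustration. It only shows that a single fixed address fails with a connection error once that server is down, while a list of candidate addresses lets the client fall back to a live one.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: plain sockets stand in for server connections.
public class FailoverSketch {

    // Tries each candidate address in order and returns the first one
    // that accepts a TCP connection; throws if none is reachable.
    static InetSocketAddress firstReachable(List<InetSocketAddress> servers, int timeoutMs)
            throws IOException {
        IOException last = null;
        for (InetSocketAddress server : servers) {
            try (Socket socket = new Socket()) {
                socket.connect(server, timeoutMs);
                return server;
            } catch (IOException e) {
                last = e; // remember the failure and try the next candidate
            }
        }
        throw new IOException("No server reachable out of " + servers, last);
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();

        // Simulate a failed server: bind a port, then close it so connections are refused.
        int deadPort;
        try (ServerSocket closed = new ServerSocket(0, 1, loopback)) {
            deadPort = closed.getLocalPort();
        }
        InetSocketAddress dead = new InetSocketAddress(loopback, deadPort);

        // Simulate a surviving server on another port.
        try (ServerSocket liveSocket = new ServerSocket(0, 1, loopback)) {
            InetSocketAddress live = new InetSocketAddress(loopback, liveSocket.getLocalPort());

            // Single pre-configured address: fails once that server is gone.
            try {
                firstReachable(Collections.singletonList(dead), 1000);
            } catch (IOException e) {
                System.out.println("Fixed address failed: " + e.getMessage());
            }

            // List of candidates: the client falls back to the live server.
            InetSocketAddress chosen = firstReachable(Arrays.asList(dead, live), 1000);
            System.out.println("Fell back to live server: " + chosen.equals(live));
        }
    }
}
```

Whether the connector should carry a server list or reuse topology information already known to the client is a design choice the issue leaves open; the sketch assumes the former purely for simplicity.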



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)

