[JBoss JIRA] (ISPN-6388) Spark integration - TimeoutException: Replication timeout on application execution - infinispan-issues

Friday, 22 April 2016

    [
https://issues.jboss.org/browse/ISPN-6388?page=com.atlassian.jira.plugin....
] 

Gustavo Fernandes commented on ISPN-6388:
-----------------------------------------

Here's my theory of what happened in your test.

There were failures during the iteration: a server was either down or had stopped
responding, perhaps due to GC (the reason does not matter).
When such failures occur, the client retries with the segments that were not yet done.
According to the logs, you were using Hot Rod client version 8.1.0.Final, which is affected
by https://issues.jboss.org/browse/ISPN-6234: after a failover it would retry with the
wrong segments. Since the segments were wrong, the iteration was no longer confined to the
server the client contacted, so that server had to issue remote RPCs to obtain the
segments, ultimately provoking a cascade effect that resulted in timeouts.
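
To illustrate the mechanism, here is a toy sketch in Python. It is not Infinispan code:
the ownership table, the server names A/B/C, and the "wrong" segment set are all made up
for illustration, and the real ISPN-6234 behaviour may differ in detail. It only shows why
retrying with segments the contacted server does not own forces remote fetches.

```python
from typing import Dict, Set

# Hypothetical ownership table: segment id -> servers holding a copy.
Owners = Dict[int, Set[str]]

OWNERS: Owners = {
    0: {"A", "C"},
    1: {"A", "B"},
    2: {"B", "C"},
    3: {"B", "C"},
}


def remaining_segments(assigned: Set[int], completed: Set[int]) -> Set[int]:
    """Segments a retry should ask for: assigned but not yet finished."""
    return assigned - completed


def remote_segments(server: str, requested: Set[int], owners: Owners) -> Set[int]:
    """Requested segments that `server` does not own and would need RPC to obtain."""
    return {seg for seg in requested if server not in owners[seg]}


# The client was iterating segments {2, 3} on server B and finished segment 2
# before B stopped responding. It fails over to server C.
assigned = {2, 3}
completed = {2}

correct_retry = remaining_segments(assigned, completed)
wrong_retry = {0, 1}  # assumption: stands in for the wrong segments after failover

print("correct retry, remote segments:", remote_segments("C", correct_retry, OWNERS))
print("wrong retry, remote segments:  ", remote_segments("C", wrong_retry, OWNERS))
```

In this toy model the correct retry stays local to server C, while the wrong retry makes
C fetch a segment it does not own from another server.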

I believe the timeouts should no longer occur (I was not able to reproduce them). Could
you test again with Infinispan 8.2.1.Final (both client and server) and the Spark
connector 0.3?

...
 Spark integration - TimeoutException: Replication timeout on
application execution 
 -----------------------------------------------------------------------------------

                 Key: ISPN-6388
                 URL: https://issues.jboss.org/browse/ISPN-6388
             Project: Infinispan
          Issue Type: Bug
          Components: Spark
    Affects Versions: 8.2.0.Final
            Reporter: Matej Čimbora
            Assignee: Gustavo Fernandes
         Attachments: app_0.txt, driver.txt, server.txt

 The issue occurs sporadically while an application is executing (e.g. the WordCount
example). To some degree it seems to be affected by the number of partitions used (i.e. the
higher the count, the less likely the issue is to occur).
 Using an 8-node cluster (1 worker/1 ISPN server per physical node), connector v. 0.2.
 Attached are sample driver, server, and application logs.

--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
