[infinispan-issues] [JBoss JIRA] (ISPN-3947) HotRod client keep trying recover connections to a failed cluster

Thursday, 30 January 2014

    [ https://issues.jboss.org/browse/ISPN-3947?page=com.atlassian.jira.plugin.... ]

Dan Berindei commented on ISPN-3947:
------------------------------------

I don't think configuring a number of retries per server is a good idea: it would mean
that the actual timeout increases linearly with the number of servers, just like it does now.
I think a total number of retries and/or a total timeout would be much better.
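
To make the suggestion concrete, here is a minimal sketch of a retry loop bounded by a total number of attempts and a total timeout, so the limit does not grow with the number of servers. This is not the HotRod client's code: `BoundedRetry`, `IoCall`, `maxTotalAttempts` and `totalTimeoutMillis` are hypothetical names chosen for illustration.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not part of the HotRod client API. */
final class BoundedRetry {

    /** An operation that may fail with an I/O error, e.g. a remote cache call. */
    interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Runs op until it succeeds, maxTotalAttempts attempts have failed,
     * or totalTimeoutMillis has elapsed, whichever comes first.
     * Both limits are independent of how many servers exist.
     */
    static <T> T call(IoCall<T> op, int maxTotalAttempts, long totalTimeoutMillis)
            throws IOException {
        if (maxTotalAttempts < 1) {
            throw new IllegalArgumentException("maxTotalAttempts must be >= 1");
        }
        long deadline = System.nanoTime() + totalTimeoutMillis * 1_000_000L;
        IOException last = null;
        for (int attempt = 0; attempt < maxTotalAttempts; attempt++) {
            // The deadline is only checked between attempts; an attempt that is
            // already in flight is not interrupted by this sketch.
            if (attempt > 0 && System.nanoTime() - deadline >= 0) {
                break;
            }
            try {
                return op.call();
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // never null here: the first attempt always runs
    }
}
```

Because the deadline is checked only between attempts, the caller can still block for up to one per-attempt socket timeout beyond `totalTimeoutMillis`; a real implementation would probably also shrink the per-attempt timeout to the time remaining.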

Note that with the default configuration (i.e. `testOnBorrow == false`), only the first
attempt is made on the primary owner of the key. All the other attempts use a random
server. (Actually it's round-robin, but the state is shared, so to an individual
thread it would look random.) A setting named `retriesPerServer` would make me think
that it's the number of retries on one server before trying another.
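
The parenthetical about round-robin with shared state can be illustrated with a small sketch. It assumes a single counter shared by all threads; `SharedRoundRobin` and `nextServer` are hypothetical names, not the client's actual balancing classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin picker whose position is shared by all callers. */
final class SharedRoundRobin {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    SharedRoundRobin(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers);
    }

    /** Returns the next server in rotation; every caller advances the same counter. */
    String nextServer() {
        // floorMod keeps the index valid even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```

With servers a, b and c, a thread that calls `nextServer()` twice sees a and then c if another thread took b in between, which is why the sequence looks random from inside any one thread.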

Also, I haven't tested this, but with `testOnBorrow == true` I think the pool will
catch the timeout in the ping operation and retry `maxActive` times internally, before the
HotRod client does its own retrying in `RetryOnFailureOperation`. We can probably ignore
it, since enabling `testOnBorrow` would be bad for performance anyway, but we should
probably document it.

...
 HotRod client keep trying recover connections to a failed cluster
 -----------------------------------------------------------------

                 Key: ISPN-3947
                 URL: https://issues.jboss.org/browse/ISPN-3947
             Project: Infinispan
          Issue Type: Feature Request
          Components: Remote Protocols
    Affects Versions: 6.0.1.Final, 7.0.0.Alpha1
            Reporter: Wolf-Dieter Fink
            Assignee: Galder Zamarreño
              Labels: hotrod, hotrod-java-client

 If an infinispan-server cluster is no longer reachable for some reason, e.g. a network
disconnect, the HotRod client tries to re-establish the lost connections.
 The client library retries a fixed number of times, calculated from the maximum number of
connections in the pool or 10 multiplied by the number of available servers.
 This can lead to a very long delay before the application can continue and react, as it
waits for the read or connect timeout on each try.
 To improve this behaviour there should be a configurable limit of retries per server
and/or a total timeout.
 This would give the application the chance to handle a remote-cache failure and reply to
the user instead of hanging for minutes (with the default settings).
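
 As a rough illustration of why the wait grows with cluster size, the sketch below multiplies the "10 per server" case from the description by a per-attempt timeout. The 60-second timeout and the server counts are assumed values for the example, not figures taken from this issue, and `WorstCaseWait` is a hypothetical name.

```java
/** Hypothetical worst-case estimate; the numbers are illustrative assumptions. */
final class WorstCaseWait {

    /** Upper bound on blocking time if every attempt runs into the timeout. */
    static long worstCaseMillis(int servers, int attemptsPerServer, long timeoutMillis) {
        return (long) servers * attemptsPerServer * timeoutMillis;
    }

    public static void main(String[] args) {
        long timeoutMillis = 60_000L; // assumed per-attempt read/connect timeout
        for (int servers = 1; servers <= 4; servers++) {
            long ms = worstCaseMillis(servers, 10, timeoutMillis);
            System.out.println(servers + " server(s): up to " + (ms / 60_000L) + " min");
        }
    }
}
```

 Under these assumptions, 3 servers give 3 x 10 x 60 s = 30 minutes of blocking in the worst case, and each added server adds another 10 minutes.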
