Reconnecting a client automatically *without losing state*

tsuna tsunanet at gmail.com
Sun Aug 1 21:59:11 EDT 2010


Hi,
so I have a simple 2-handler pipeline I use to communicate with a
remote server.  When the connection gets closed unexpectedly (or when
I can't connect for the first time), I want to try to re-connect
automatically a few times.  Now this seems to be some kind of a FAQ,
and I know you're gonna point me to the example/uptime code, but this
code is too simplistic and doesn't work in non-trivial examples, and
the same goes for the answers I found previously given on this list.

Here's the problem: it seems that the recommended way to implement
re-connection is to keep a reference to the ClientBootstrap object
associated with the client and use it to reconnect.  The uptime
example stores the reference in the UptimeClientHandler.  The
reference gets all the way to there because the ChannelPipelineFactory
has access to it and gives it to UptimeClientHandler's constructor.
Unfortunately this approach has two problems for me:
  1. My ChannelPipelineFactory doesn't have access to the ClientBootstrap.
  2. Using the ClientBootstrap to reconnect loses all the state of
stateful handlers because it recreates a new ChannelPipeline.

The reason I have problem 1. is because I have a single
ChannelPipelineFactory instance and I re-use it many times to create
multiple clients that talk to different remote servers.  I could
always work around this problem by creating a new
ChannelPipelineFactory every time, but it sort of defeats the purpose
of having a factory in the first place, doesn't it?  Plus, what if I
have some shared state or handler in that factory that I need to give
to all my ChannelPipelines?  It's gonna get messy.
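
To make that workaround concrete, here's a minimal sketch.  All the
types in it (SharedState, ClientHandler, PerClientFactory) are
hypothetical stand-ins made up for this example, not Netty classes.
Each client gets its own small factory, and anything shared has to be
threaded through every one of them by hand:

```java
import java.net.SocketAddress;
import java.util.function.Supplier;

// All types below are hypothetical stand-ins, not Netty classes.

// State (or handlers) that every client is supposed to share.
final class SharedState {
    final String protocolVersion = "1";
}

// A handler that knows where to reconnect because it was told at construction.
final class ClientHandler {
    final SharedState shared;
    final SocketAddress remote;

    ClientHandler(SharedState shared, SocketAddress remote) {
        this.shared = shared;
        this.remote = remote;
    }
}

// One factory instance per remote server, instead of a single shared factory.
final class PerClientFactory implements Supplier<ClientHandler> {
    private final SharedState shared;
    private final SocketAddress remote;

    PerClientFactory(SharedState shared, SocketAddress remote) {
        this.shared = shared;
        this.remote = remote;
    }

    @Override
    public ClientHandler get() {
        // Every handler built here carries both the shared state and its own target.
        return new ClientHandler(shared, remote);
    }
}
```

It works, but the "factory" is now really just a per-client closure,
and every new piece of shared state means another constructor argument.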

The reason problem 2. exists is because what everyone seems to do to
reconnect is to call the connect() method on the ClientBootstrap and
give it the SocketAddress stored in the ClientBootstrap's options map
(the "remoteAddress" option).  However, the implementation of
ClientBootstrap#connect(SocketAddress, SocketAddress) – which is where
we end up ultimately – is really weird.  It creates a new
ChannelPipeline using the ChannelPipelineFactory but it stores the
result in a local variable named `pipeline' that shadows the
`pipeline' attribute declared in its parent class (Bootstrap).  So
first of all, the parent class is unnecessarily always allocating a
useless ChannelPipeline and a useless ChannelPipelineFactory.  Then it
creates a new Channel and connects it to the remote side.  Because it
created a new ChannelPipeline, we now have a new Channel with a new
set of handlers.  If one of the handlers is stateful, then that state
was lost.
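
Here's a toy model of what I mean, assuming nothing about Netty's
internals beyond what's described above.  CountingHandler and
ToyBootstrap are hypothetical names invented for this sketch; the only
point is that a connect() which asks the factory for a fresh handler
every time can't carry state over:

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a stateful handler; not a Netty class.
final class CountingHandler {
    private int messagesSeen;  // the state we'd like to keep across reconnections

    void onMessage() {
        messagesSeen++;
    }

    int messagesSeen() {
        return messagesSeen;
    }
}

// Hypothetical stand-in for a bootstrap that builds a new handler per connect().
final class ToyBootstrap {
    private final Supplier<CountingHandler> pipelineFactory;

    ToyBootstrap(Supplier<CountingHandler> pipelineFactory) {
        this.pipelineFactory = pipelineFactory;
    }

    CountingHandler connect() {
        // A fresh handler on every call: whatever the previous one knew is gone.
        return pipelineFactory.get();
    }
}
```

Calling connect() twice on the same ToyBootstrap hands back two
distinct CountingHandler instances, and the second one starts counting
from zero again.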

In my handler, I keep some state I want to preserve across
reconnections.  Let's say there's a small network blip, I wanna try to
see if I can reconnect to the remote side and keep going before I do
some really expensive error recovery.  So in the exceptionCaught
method of my handler, what I wanna do is reconnect that very same
channel I had before.  The scenario that's the most complicated to
deal with is if the channel was never successfully connected in the
first place.  In that case, when my exceptionCaught is invoked, I
can't use the Channel's getRemoteAddress method to get the address of
the remote side I need to reconnect to, because it will return null
(since the Channel was never properly connected).  So the only
location where I can find the SocketAddress I need is in the
ClientBootstrap's options map, which I don't have easy access to from
my handler due to problem 1.
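
As an analogy only: a plain java.nio SocketChannel behaves the same
way.  This sketch doesn't use Netty's Channel at all, assumes Java 7 or
later for SocketChannel#getRemoteAddress, and UnconnectedAddressDemo is
a made-up name.  A channel that was opened but never connected has no
remote address to give back:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.SocketAddress;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; uses only java.nio, not Netty.
final class UnconnectedAddressDemo {
    // Opens a channel, never connects it, and asks for its remote address.
    static SocketAddress remoteOfFreshChannel() {
        try (SocketChannel channel = SocketChannel.open()) {
            return channel.getRemoteAddress();  // null, since we never connected
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

So whatever wants to retry the connection has to have learned the
target address from somewhere other than the channel itself.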

This is kind of confusing and not very easy to explain but I hope it
makes sense.  Generally speaking, what's the proper way of trying to
recover from transient connection failures with stateful handlers that
need to try to preserve their state as much as possible across
reconnections?  For instance, how would you change the uptime client
example so that the client doesn't reset the uptime to 0 when there's
a disconnection, so it'd keep track of cumulative uptime and downtime,
while avoiding problem 1 (that is, the ChannelPipelineFactory is
shared and you want to be able to use it to create a number of clients
at the same time, so you can't really give it the reference to the
ClientBootstrap)?

One thing I tried to do is to create a ClientBootstrap, give it the
shared ChannelPipelineFactory, connect(), then get the ChannelPipeline
(connect() returns a ChannelFuture on which you can
.getChannel().getPipeline()), then .get() the handler I want from the
ChannelPipeline, and give it the SocketAddress to where it should
reconnect.  The problem is that this is racy: if the other side is
down, I'm going to get a ConnectException which may trigger my
exceptionCaught in a Netty IO thread before the other thread had a
chance to give it the SocketAddress.  I can work around this by
storing extra state and using extra synchronization in my handler, but
this is really messy and sounds unnecessarily complicated.  Surely
there must be a better way.
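
For the record, the extra-state-plus-synchronization workaround looks
roughly like this.  It's a sketch with a hypothetical ReconnectingHandler
(not a Netty class), and reconnect() is just a counter standing in for
whatever the real reconnection logic would be:

```java
import java.net.SocketAddress;

// Hypothetical handler; not a Netty class.  All fields are guarded by "this".
final class ReconnectingHandler {
    private SocketAddress remote;    // set late, by the thread that called connect()
    private boolean failurePending;  // a failure arrived before we knew the address
    private int reconnectAttempts;

    // Called by the connecting thread, after connect() has already been issued.
    synchronized void setRemoteAddress(SocketAddress address) {
        remote = address;
        if (failurePending) {
            // The I/O thread lost the race earlier; do the reconnect it couldn't.
            failurePending = false;
            reconnect();
        }
    }

    // Called by an I/O thread, possibly before setRemoteAddress().
    synchronized void exceptionCaught(Throwable cause) {
        if (remote == null) {
            failurePending = true;  // remember the failure until the address shows up
            return;
        }
        reconnect();
    }

    // Placeholder for the real reconnection; callers already hold the lock.
    private void reconnect() {
        reconnectAttempts++;
    }

    synchronized int reconnectAttempts() {
        return reconnectAttempts;
    }
}
```

Both orderings end up reconnecting exactly once per failure, but the
handler now needs a lock and a flag just to find out where it was
supposed to be connected to.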

PS: I Stumbled Upon this: http://su.pr/1Nk3MT – hopefully it'll help
solve similar problems in the future ;-)

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com


