Hi Manik,
On 1/10/12 7:02 PM, Manik Surtani wrote:
> Sorry for the late response on this thread.
> I think solution #2 is overkill - this could be a huge bottleneck. Something like
> solution #1 will work, but that's still expensive (in terms of memory consumption, and
> consequently GC).

Yes

> Are the communications between relay coordinators (RC) asynchronous?

Yes, everything is asynchronous. Synchronous constructs, such as RPCs,
are then run over this async 'transport'. This is the same model JGroups
has in a purely local cluster.
Bridging is done at the transport; this can be compared to a switch,
bridging subnets: whenever the switch is down, any IP packets sent
during this time are not forwarded to the other subnet and a higher-up
layer has to do repair. TCP for instance does this, whereas UDP doesn't.
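
To illustrate the sync-over-async model mentioned above, here is a minimal sketch of a blocking call layered on a fire-and-forget send. This is not JGroups code: `SyncOverAsync`, `call`, `onResponse` and the `asyncSend` callback are hypothetical names made up for this example, and messages are plain strings for brevity.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

// Hypothetical illustration only; none of these names are JGroups API.
class SyncOverAsync {
    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final BiConsumer<Long, String> asyncSend; // fire-and-forget transport (assumed)

    SyncOverAsync(BiConsumer<Long, String> asyncSend) {
        this.asyncSend = asyncSend;
    }

    // Blocking call: register a future under a fresh request id, send
    // asynchronously, then wait until the response arrives or the timeout expires.
    String call(String request, long timeoutMs) {
        long id = nextId.incrementAndGet();
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(id, reply);
        try {
            asyncSend.accept(id, request);
            // join() throws CompletionException (cause: TimeoutException) on timeout
            return reply.orTimeout(timeoutMs, TimeUnit.MILLISECONDS).join();
        } finally {
            pending.remove(id);
        }
    }

    // Invoked by the transport's receiver thread when a response comes in.
    void onResponse(long id, String response) {
        CompletableFuture<String> reply = pending.get(id);
        if (reply != null) {
            reply.complete(response);
        }
    }
}
```

If the transport drops the message (the "switch is down" case), the caller only sees a timeout; any repair has to happen in a layer above, as described.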

> I realise the communication is not on the critical path of a
> transaction in either data centre, but it could still be sync. If this
> were the case, we could have something like this:

That would slow things down though! Plus, it would require an ack, so
that's additional traffic (not much though). The big issue here is that
latency is usually high between sites, so the round trip time would kill
us!

> Assume {A, B, C} and {X, Y, Z}. A --> X synchronously, but offline (in a
> separate thread) so calling transactions aren't blocked. Assume key K is
> on {A, B, Z}. A change to K would synchronously update A and B, and put
> the update on the RC (and backup RC)'s processing queue. A then flushes
> to X, and on receiving the ack from X, informs B that the message was
> delivered. If X crashes/doesn't ack, A keeps retrying, potentially
> batching queued updates.

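
To make the proposal above concrete, here is a rough sketch of the queue-flush-ack-retry loop as I read it. It is an illustration under stated assumptions, not RELAY or Infinispan code: `RelayQueue`, `sendAndAwaitAck` and `notifyBackup` are hypothetical names, updates are plain strings, and a single flusher thread is assumed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch; not RELAY or Infinispan code.
class RelayQueue {
    private final Deque<String> queue = new ArrayDeque<>();
    private final Predicate<List<String>> sendAndAwaitAck; // true if the remote RC (X) acked the batch
    private final Consumer<List<String>> notifyBackup;     // tells the backup RC (B) what was delivered

    RelayQueue(Predicate<List<String>> sendAndAwaitAck, Consumer<List<String>> notifyBackup) {
        this.sendAndAwaitAck = sendAndAwaitAck;
        this.notifyBackup = notifyBackup;
    }

    // Called after the local synchronous update; only takes a short local lock.
    synchronized void enqueue(String update) {
        queue.addLast(update);
    }

    // Run from a separate (single) flusher thread. Everything queued goes out
    // as one batch; without an ack the updates stay queued, so the next
    // attempt retries them together with any newer ones.
    boolean flush() {
        List<String> batch;
        synchronized (this) {
            if (queue.isEmpty()) {
                return true;
            }
            batch = new ArrayList<>(queue);
        }
        if (!sendAndAwaitAck.test(batch)) {
            return false; // no ack: keep the updates for the next retry
        }
        synchronized (this) {
            for (int i = 0; i < batch.size(); i++) {
                queue.removeFirst();
            }
        }
        notifyBackup.accept(batch);
        return true;
    }

    synchronized int pending() {
        return queue.size();
    }
}
```

Note that the sketch only covers the case where the remote side fails to ack; it says nothing about what happens when the sending RC itself goes down with updates still queued.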
While this could certainly be done, what happens if X itself crashes?
Also, I recall you sending the update for K as 3 unicasts: 1 to A, 1 to
B and 1 to Z. The one to Z is relayed by A to X, and sent from X to Z.

> WDYT?
> Cheers
> Manik
>
> On 14 Dec 2011, at 11:51, Bela Ban wrote:
>> Tobias made me aware of a bug that can happen when we use RELAY and one
>> of the relay coordinators crashes: messages sent between the crash of the
>> relay coordinator and the switch of the backup relay to full relay are
>> not relayed.
>>
>> This can lead to inconsistencies, see [1] for details. If I implement
>> solution #1, then the chances of this happening are vastly reduced.
>>
>> I wanted to ask the bright folks on this list though, if you see a
>> solution that only involves Infinispan (rebalancing)?
>>
>> Cheers,
>>
>> [1] https://issues.jboss.org/browse/JGRP-1401
>>
>> --
>> Bela Ban

--
Bela Ban
Lead JGroups (http://www.jgroups.org)
JBoss / Red Hat