[infinispan-dev] Distribution across multiple sites

Wed Jan 11 04:33:50 EST 2012

Hi Manik,

On 1/10/12 7:02 PM, Manik Surtani wrote:
> Sorry for the late response on this thread.
>
> I think solution #2 is overkill - this could be a huge bottleneck.  Something like solution #1 will work, but that's still expensive (in terms of memory consumption, and consequently GC).

Yes

> Are the communications between relay coordinators (RC) asynchronous?

Yes, everything is asynchronous. Synchronous constructs, such as RPCs, 
are then run over this async 'transport'. This is the same model JGroups 
has in a purely local cluster.
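
To make the "sync over async" point concrete, here is a minimal sketch of a blocking call layered on a fire-and-forget transport. It does not use the JGroups API: the names SyncOverAsync, AsyncTransport, call() and onResponse() are made up for illustration, and the String payload is a placeholder.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: a blocking request/response built on an async send. */
final class SyncOverAsync {
    /** Hypothetical fire-and-forget transport; send() returns immediately. */
    interface AsyncTransport {
        void send(long requestId, String payload);
    }

    // Requests that have been sent but not yet answered, keyed by request id.
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();
    private final AsyncTransport transport;

    SyncOverAsync(AsyncTransport transport) {
        this.transport = transport;
    }

    /** Sends asynchronously, then blocks until the matching response or a timeout. */
    String call(String payload, long timeoutMs) throws Exception {
        long id = nextId.incrementAndGet();
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(id, future);
        try {
            transport.send(id, payload);
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            pending.remove(id); // also cleans up after a timeout
        }
    }

    /** Called by the transport's receiver thread when a response arrives. */
    void onResponse(long requestId, String response) {
        CompletableFuture<String> future = pending.get(requestId);
        if (future != null) {
            future.complete(response);
        }
    }
}
```

The transport itself never blocks; only the caller of call() waits, and a lost response shows up as a timeout that the caller has to handle.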

Bridging is done at the transport; this can be compared to a switch, 
bridging subnets: whenever the switch is down, any IP packets sent 
during this time are not forwarded to the other subnet and a higher-up 
layer has to do repair. TCP for instance does this, whereas UDP doesn't.

> I realise the communication is not on the critical path of a
> transaction in either data centre, but it could still be sync.  If this
> were the case, we could have something like this:

That would slow things down, though! Plus, it would require an ack, so 
that's additional traffic (not much, though). The big issue here is that 
latency is usually high between sites, so the round-trip time would kill 
us!

> Assume {A, B, C} and {X, Y, Z}.  A -->  X synchronously, but offline (in a separate thread) so calling transactions aren't blocked.  Assume key K is on {A, B, Z}.  A change to K would synchronously update A and B, and put the update on the RC (and backup RC)'s processing queue.  A then flushes to X, and on receiving the ack from X, informs B that the message was delivered.  If X crashes/doesn't ack, A keeps retrying, potentially batching queued updates.

While this could certainly be done, what happens if X itself crashes?

Also, I recall you sending the update for K as 3 unicasts: 1 to A, 1 to 
B and 1 to Z. The one to Z is relayed by A to X, and sent from X to Z.
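
For discussion, the queue-and-retry scheme quoted above could be sketched roughly as follows. This is not how RELAY works today and it uses no JGroups or Infinispan API: RelayQueue, RemoteLink, BackupNotifier and maxBatch are hypothetical names, updates are plain strings, and flushOnce() is assumed to be called repeatedly from a background thread so that calling transactions are not blocked.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative only: relay coordinator A queues updates for remote coordinator X. */
final class RelayQueue {
    /** Hypothetical link to the remote relay coordinator; true means X acked the batch. */
    interface RemoteLink {
        boolean sendBatch(List<String> updates);
    }

    /** Hypothetical callback telling the backup relay coordinator what was delivered. */
    interface BackupNotifier {
        void delivered(List<String> updates);
    }

    private final Deque<String> queue = new ArrayDeque<>();
    private final RemoteLink link;
    private final BackupNotifier backup;
    private final int maxBatch;

    RelayQueue(RemoteLink link, BackupNotifier backup, int maxBatch) {
        this.link = link;
        this.backup = backup;
        this.maxBatch = maxBatch;
    }

    /** Called on the local update path; only queues, never waits for the remote site. */
    synchronized void enqueue(String update) {
        queue.addLast(update);
    }

    /**
     * One flush attempt. Sends up to maxBatch queued updates in order. Without
     * an ack, nothing is removed, so the next call retries the same updates
     * plus anything queued in the meantime (this is the batching). Holding the
     * lock during the send is a simplification of this sketch.
     * Returns the number of updates delivered.
     */
    synchronized int flushOnce() {
        if (queue.isEmpty()) {
            return 0;
        }
        List<String> batch = new ArrayList<>();
        for (String update : queue) {
            if (batch.size() == maxBatch) {
                break;
            }
            batch.add(update);
        }
        if (!link.sendBatch(batch)) {
            return 0; // no ack: keep everything queued for the next attempt
        }
        for (int i = 0; i < batch.size(); i++) {
            queue.removeFirst();
        }
        backup.delivered(batch); // the backup may now drop its copies
        return batch.size();
    }

    synchronized int pending() {
        return queue.size();
    }
}
```

The sketch only covers the sending side. It does not answer the question of what happens when X itself crashes, and an ack lost after X has applied a batch would lead to a resend, so X would need to tolerate duplicates.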

> WDYT?
>
> Cheers
> Manik
>
> On 14 Dec 2011, at 11:51, Bela Ban wrote:
>
>> Tobias made me aware of a bug that can happen when we use RELAY and one
>> of the relay coordinators crashes: messages sent between the crash of the
>> relay coordinator and the switch of the backup relay to full relay are
>> not relayed.
>>
>> This can lead to inconsistencies, see [1] for details. If I implement
>> solution #1, then the chances of this happening are vastly reduced.
>>
>> I wanted to ask the bright folks on this list, though, whether you see a
>> solution that only involves Infinispan (rebalancing)?
>> Cheers,
>>
>> [1] https://issues.jboss.org/browse/JGRP-1401
>>
>> --
>> Bela Ban

-- 
Bela Ban
Lead JGroups (http://www.jgroups.org)
JBoss / Red Hat