On 1 Feb 2011, at 09:11, Bela Ban wrote:
RELAY [1] bridges separate local clusters into large virtual global
clusters, e.g. {A,B,C} and {X,Y,Z} into {A,B,C,X,Y,Z}.
This new global view has local members A, B and C and proxies X, Y, Z on
the {A,B,C} cluster, and vice versa.
When A sends a message, it is forwarded to the other cluster, but the
sender is wrapped into a ProxyAddress A/X. This means that the original
sender is A, but the local address in {X,Y,Z} is X. However, there is a
problem!
A ProxyAddress A/X's hashCode(), equals() and compareTo() use *X*, which
means that if we add X, Y, Z and A/X to a *HashSet* (or HashMap), X and
A/X map to the same address! So if you have a View (on X) with
{A/X, B/X, C/X, X, Y, Z} and add all members to a HashMap, it will only
have a size of 3!
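To illustrate the collision, here is a minimal, runnable sketch; Addr and
ProxyAddress below are simplified stand-ins for the real JGroups types,
assuming only that equals()/hashCode() delegate to the local proxy address:

import java.util.HashSet;
import java.util.Set;

public class ProxyCollisionDemo {

    // Simplified stand-in for a JGroups Address.
    static class Addr {
        final String name;
        Addr(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Addr && ((Addr) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
        @Override public String toString() { return name; }
    }

    // A proxy address pairs the original sender with its local proxy, but
    // delegates equals()/hashCode() to the proxy only -- hence the collision.
    static class ProxyAddress extends Addr {
        final Addr original;
        ProxyAddress(Addr original, Addr proxy) {
            super(proxy.name);
            this.original = original;
        }
        @Override public String toString() { return original + "/" + name; }
    }

    public static void main(String[] args) {
        Addr a = new Addr("A"), b = new Addr("B"), c = new Addr("C");
        Addr x = new Addr("X"), y = new Addr("Y"), z = new Addr("Z");

        Set<Addr> view = new HashSet<>();
        view.add(new ProxyAddress(a, x)); // A/X
        view.add(new ProxyAddress(b, x)); // B/X
        view.add(new ProxyAddress(c, x)); // C/X
        view.add(x);
        view.add(y);
        view.add(z);

        // Prints 3, not 6: A/X, B/X, C/X and X all collapse onto one entry.
        System.out.println(view.size());
    }
}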
If we used *A* in A/X for hashCode(), equals() and compareTo(), this
problem would not exist. However, we would then 'know' about the other
cluster, and therefore digest handling, flow control etc. would happen
across both clusters, which is something we don't want: we want the two
local clusters to be completely autonomous! E.g. we don't want cluster
{X,Y,Z} to block on credits from B in the other cluster...
So I was thinking of passing the local view {X,Y,Z} up to Infinispan
instead of the global view {A/X,B/X,C/X,X,Y,Z}. This would mean
Infinispan would know only about A, B and C in the cluster {A,B,C}, and
about X, Y and Z in {X,Y,Z}.
Now, I want to be able to have backups of keys from {A,B,C} in {X,Y,Z}
in DIST mode. E.g. with numOwners=3, key "name" should be stored on A and
C in the local cluster, and on Z in the remote cluster.
To do this, the consistent hash function would know about the local
cluster {A,B,C} and the remote cluster {X,Y,Z}. It would get view
changes by hooking into RELAY.
So a locate("name", 3) call would return A, C and A/Z, causing
Infinispan to fetch the data from, or store the data to, A, C and Z.
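As a hedged sketch of what that could look like (the String addresses, the
wheel arithmetic and the proxy-address formatting are all illustrative
assumptions, not Infinispan's or JGroups' actual APIs):

import java.util.ArrayList;
import java.util.List;

class TwoClusterHash {
    private final List<String> localView;   // e.g. [A, B, C], updated via RELAY view callbacks
    private final List<String> remoteView;  // e.g. [X, Y, Z], updated via RELAY view callbacks
    private final String localRelay;        // local node that bridges to the remote site

    TwoClusterHash(List<String> localView, List<String> remoteView, String localRelay) {
        this.localView = localView;
        this.remoteView = remoteView;
        this.localRelay = localRelay;
    }

    // Returns numOwners owners: (numOwners - 1) from the local wheel plus one
    // remote backup, wrapped as a proxy address (e.g. "A/Z") so that RELAY
    // forwards messages addressed to it across the bridge.
    List<String> locate(Object key, int numOwners) {
        List<String> owners = new ArrayList<>(numOwners);
        int h = spread(key.hashCode());
        for (int i = 0; i < numOwners - 1; i++) {
            owners.add(localView.get((h + i) % localView.size()));
        }
        owners.add(localRelay + "/" + remoteView.get(h % remoteView.size()));
        return owners;
    }

    private static int spread(int h) { return (h ^ (h >>> 16)) & 0x7fffffff; }
}

With localView = [A, B, C], remoteView = [X, Y, Z] and localRelay = "A",
locate("name", 3) returns two local owners plus one proxied remote owner
such as A/Z.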
This should work fine I guess, because when Infinispan tries to send
data to A/Z, JGroups's RELAY will forward the message to the remote Z.
Q: does Infinispan assert that A, C and Z are in the local view when
distributing data? Because if so, my scheme above wouldn't work...
The other question I have is: can I force Infinispan to do a rehash?
For example, when the consistent hash function in {A,B,C} gets a view
change for the remote view, going from {X} to {X,Y}, I'd like
Infinispan to do a rehash, checking whether all keys are in the
correct location and, if not, calling into the consistent hash function
to compute the new locations...
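To make the ask concrete, a purely hypothetical sketch of such a hook;
triggerRehash() stands for the capability being requested, not an existing
Infinispan API:

import java.util.List;

class RemoteViewHook {
    private List<String> remoteView = List.of("X");

    // Called from RELAY whenever the remote site's view changes, e.g. {X} -> {X,Y}.
    void onRemoteViewChange(List<String> newView) {
        if (!newView.equals(remoteView)) {
            remoteView = newView;
            // Desired behaviour: Infinispan walks its keys, recomputes owners
            // via the consistent hash, and moves any keys that are misplaced.
            triggerRehash();
        }
    }

    void triggerRehash() { /* would call into Infinispan's rehash machinery */ }
}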
My plan re: RELAY was actually to implement a delegating ConsistentHash where I
maintain two hash wheels, one for 'lan' nodes and one for 'wan' nodes, and, of a
key's numOwners owners, pick N of them (configurable) to be in the remote
datacentre. It would be transparent to the rest of Infinispan, but you would have
to 'configure' Infinispan to be aware of RELAY so that it uses the appropriate
ConsistentHash implementation.
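Roughly this shape, as a sketch; the ConsistentHash interface below is a
hypothetical minimal one (the real Infinispan interface is larger), and the
two wheel implementations are assumed to exist:

import java.util.ArrayList;
import java.util.List;

interface ConsistentHash {
    List<String> locate(Object key, int numOwners);
}

class DelegatingConsistentHash implements ConsistentHash {
    private final ConsistentHash lan;   // wheel over local-site nodes
    private final ConsistentHash wan;   // wheel over remote-site nodes
    private final int remoteOwners;     // N: how many owners to place remotely

    DelegatingConsistentHash(ConsistentHash lan, ConsistentHash wan, int remoteOwners) {
        this.lan = lan;
        this.wan = wan;
        this.remoteOwners = remoteOwners;
    }

    @Override
    public List<String> locate(Object key, int numOwners) {
        List<String> owners = new ArrayList<>(numOwners);
        owners.addAll(lan.locate(key, numOwners - remoteOwners));
        owners.addAll(wan.locate(key, remoteOwners));
        return owners;
    }
}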
Also I'd need to change the RPC dispatcher a bit, to force async comms for remote
datacentre nodes, and to de-prioritise them when doing remote GETs.
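Sketched out, the routing decision might look like this; isRemote(),
sendSync() and sendAsync() are placeholders rather than the real
RpcDispatcher API, and the "local/remote" address test is an assumption:

import java.util.Comparator;
import java.util.List;

class SiteAwareRpcRouting {

    static boolean isRemote(String target) {
        // Assumption: remote owners carry a proxy address of the form "local/remote".
        return target.contains("/");
    }

    // Writes: send synchronously to local owners, fire-and-forget to remote ones,
    // so the local cluster never blocks on WAN latency.
    void invokeWrite(List<String> targets, Object command) {
        for (String t : targets) {
            if (isRemote(t)) {
                sendAsync(t, command);
            } else {
                sendSync(t, command);
            }
        }
    }

    // Remote GETs: order owners so local nodes are queried before remote ones.
    List<String> orderForGet(List<String> owners) {
        owners.sort(Comparator.comparing(SiteAwareRpcRouting::isRemote));
        return owners;
    }

    void sendSync(String target, Object command)  { /* placeholder */ }
    void sendAsync(String target, Object command) { /* placeholder */ }
}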
It's been on my plate for a while now; it just keeps getting overtaken by other
stuff. :-)
Cheers
Manik
--
Manik Surtani
manik@jboss.org
twitter.com/maniksurtani
Lead, Infinispan
http://www.infinispan.org