#1 Virtual cluster
Rehashing:
I believe we'll have to come up with a custom consistent hash that knows
about the 2 sites, and places all of the data owners into the local site.
E.g. it would be bad if we create a key K in LON and make the primary
owner T (in SFO) and the backup owner B in LON!
I think this approach works very well if you use SFO as a hot standby for LON: the
purpose of SFO being to pick up the load in case LON goes down for whatever reason.
In a follow-the-sun approach, in which LON is up for 12h and then SFO takes over for
the next 12h, this won't work.
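To make the local-placement idea concrete, here is a rough sketch of such a site-aware
hash (the names SiteAwareHash and Node are made up for illustration; this is not the
actual Infinispan ConsistentHash interface): all owners of a key are chosen from the
site the key is written in, so a key created in LON never gets a primary or backup
owner in SFO.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: owners of a key are always picked from the local
// site, so a key created in LON never ends up with its primary owner in SFO.
public class SiteAwareHash {

    public record Node(String name, String site) {}

    private final List<Node> members;

    public SiteAwareHash(List<Node> members) {
        this.members = members;
    }

    // First node returned is the primary owner, the rest are backups.
    public List<Node> locateOwners(Object key, String localSite, int numOwners) {
        List<Node> local = new ArrayList<>();
        for (Node n : members)
            if (n.site().equals(localSite))
                local.add(n);
        List<Node> owners = new ArrayList<>();
        if (local.isEmpty())
            return owners;
        int start = Math.floorMod(key.hashCode(), local.size());
        for (int i = 0; i < Math.min(numOwners, local.size()); i++)
            owners.add(local.get((start + i) % local.size()));
        return owners;
    }

    public static void main(String[] args) {
        SiteAwareHash ch = new SiteAwareHash(List.of(
                new Node("A", "LON"), new Node("B", "LON"),
                new Node("X", "SFO"), new Node("Y", "SFO")));
        System.out.println(ch.locateOwners("K", "LON", 2)); // both owners in LON
    }
}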
This should also minimize rehashing across sites, and prefer rehashing
within sites.
Not sure about that: AFAIK the way rehashing is implemented, it is always the primary
owner that generates the state. So if the joiner is in SFO then the burden of the join
would be much higher.
In terms of data access, the idea is that all writes would only go
through 1 site, e.g. LON being active and SFO being a backup (in which
reads could happen); my design didn't assume concurrent writes in
multiple sites (at least not to the same data).
Yes, indeed this would work for the master-slave (or hot-standby) scenario. IMO we
should also consider the master-master (follow-the-sun) approach.
Locking:
Same as above
Configuration:
numOwners would have to be extended, e.g. we could introduce a property
numPrimarySiteOwners=2 and numBackupSiteOwners=1, TBD
Yes, indeed these attributes would be specific to the new CH function you added.
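For example, the new CH function could consume those two proposed attributes roughly
like this (a minimal sketch; the attribute names are the ones proposed above and the
class is hypothetical, not existing configuration code):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of how numPrimarySiteOwners/numBackupSiteOwners could drive owner
// selection: pick that many nodes from each site, primary site first.
public class TwoSiteOwnerPicker {

    public static List<String> owners(Object key,
                                      Map<String, List<String>> nodesBySite,
                                      String primarySite, int numPrimarySiteOwners,
                                      String backupSite, int numBackupSiteOwners) {
        List<String> owners = new ArrayList<>();
        owners.addAll(pick(key, nodesBySite.get(primarySite), numPrimarySiteOwners));
        owners.addAll(pick(key, nodesBySite.get(backupSite), numBackupSiteOwners));
        return owners; // first entry = primary owner, always in the primary site
    }

    private static List<String> pick(Object key, List<String> nodes, int count) {
        List<String> picked = new ArrayList<>();
        int start = Math.floorMod(key.hashCode(), nodes.size());
        for (int i = 0; i < Math.min(count, nodes.size()); i++)
            picked.add(nodes.get((start + i) % nodes.size()));
        return picked;
    }

    public static void main(String[] args) {
        // prints 2 owners in LON followed by 1 backup owner in SFO
        System.out.println(owners("K",
                Map.of("LON", List.of("A", "B", "C"), "SFO", List.of("X", "Y")),
                "LON", 2, "SFO", 1));
    }
}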
#2 Hot Rod based
Having to connect to potentially all nodes is bad, as customers possibly
only want to open 1 port for cross-datacenter traffic.
+1.
I assume the RemoteCacheStore would have to be enabled in all nodes in a
given site for this to work?
Yes. That's a con of this approach.
How would you handle requests sent during the crash of a HotRod endpoint?
Would they get queued, similar to https://issues.jboss.org/browse/JGRP-1401 ?
That would indeed need to be re-implemented. It's the first in the list of cons for
this approach.
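Roughly what would need to be re-implemented, in the spirit of JGRP-1401 (purely a
hypothetical sketch, not existing Infinispan/JGroups code): buffer the forwarded
writes while the remote Hot Rod endpoint is down and replay them when it comes back.

import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: while the remote Hot Rod endpoint is unreachable,
// forwarded writes are queued; once the endpoint is back they are replayed
// in order.
public class ForwardingQueue {

    private final Deque<Runnable> pending = new ArrayDeque<>();
    private boolean remoteUp = true;

    public synchronized void forward(Runnable writeToRemoteSite) {
        if (remoteUp) {
            try {
                writeToRemoteSite.run();
                return;
            } catch (RuntimeException connectionFailed) {
                remoteUp = false; // endpoint crashed, start queueing
            }
        }
        pending.add(writeToRemoteSite);
    }

    // Called when the remote endpoint becomes reachable again.
    public synchronized void remoteSiteBackUp() {
        remoteUp = true;
        while (!pending.isEmpty())
            pending.poll().run(); // replay queued writes in order
    }
}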
How would initial state transfer be done between sites? E.g. LON has
been up for 2 days and now we start SFO. Does that mean we will
effectively have to transfer *all* of the data in LON --> SFO?
Transferring all the data is required, as the data must be mirrored between the sites.
Of course we'll only transfer every entry once, and not numOwners times.
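The "every entry once" part could simply mean that only the primary owner of a key
ships it to the other site, e.g. (illustrative sketch only; the Topology interface
and method names are made up):

import java.util.Map;
import java.util.function.BiConsumer;

// Sketch: during the initial LON --> SFO transfer, each node walks its local
// data but only ships the keys it is primary owner of, so every entry crosses
// the WAN exactly once instead of numOwners times.
public class InitialStateTransfer {

    interface Topology {
        boolean isPrimaryOwner(Object key, String localNode);
    }

    public static void pushToRemoteSite(String localNode,
                                        Map<Object, Object> localEntries,
                                        Topology topology,
                                        BiConsumer<Object, Object> remoteSitePut) {
        for (Map.Entry<Object, Object> e : localEntries.entrySet())
            if (topology.isPrimaryOwner(e.getKey(), localNode))
                remoteSitePut.accept(e.getKey(), e.getValue());
    }
}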
#3 Custom bridge
I like the fact that both sites are configured with possibly different
numOwners, e.g. LON=2 and SFO=1
This will not work if you need to invoke blocking RPCs between sites:
the copying of traffic to A is always assumed to be asynchronous.
Why is that? Async replication is intended as the default, but sync replication should
be possible as well; the bridge would just make the invocation in a synchronous way.
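In other words, the bridge could expose both modes, along these lines (hypothetical
sketch, names are made up):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: the same forwarding call can be issued asynchronously (the default,
// the local write doesn't wait) or synchronously (the caller blocks until the
// other site has applied the change).
public class SiteBridge {

    private final ExecutorService relayPool = Executors.newSingleThreadExecutor();

    public void replicateAsync(Runnable applyOnRemoteSite) {
        relayPool.submit(applyOnRemoteSite); // fire and forget
    }

    public void replicateSync(Runnable applyOnRemoteSite) throws Exception {
        Future<?> ack = relayPool.submit(applyOnRemoteSite);
        ack.get(); // block the originating write until the remote site confirms
    }
}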
Plus, the recipient in SFO wouldn't know the original sender; in the
example, the sender would always be X.
Not sure why this is bad from a functional perspective, as long as the sites end up
being in sync.
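If the original sender ever does matter (e.g. for replies or auditing), the bridge
could simply carry it inside the forwarded command, roughly like this (illustrative
only, not an existing class):

import java.io.Serializable;

// Sketch: the relayed message carries the original sender as metadata, so even
// though the JGroups-level sender seen in SFO is always the bridge node X, the
// receiving side can still tell which LON node originated the write.
public class RelayedCommand implements Serializable {

    private final String originalSender; // e.g. "LON-node-B"
    private final Object command;        // the actual replication command

    public RelayedCommand(String originalSender, Object command) {
        this.originalSender = originalSender;
        this.command = command;
    }

    public String originalSender() { return originalSender; }
    public Object command()        { return command; }
}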
How do you handle the case where the relay (A or X) crashes and messages
are sent during that time, before a new relay is elected?
A subset of RELAY can be used for this, together with JGRP-1401.
How do you do state transfer, e.g. bootstrap SFO from LON? I guess you
transfer the entire state from LON --> SFO, right?
Indeed, state transfer seems to be the biggest challenge with this approach. I have
some ideas, but I'd also like to get Dan's input on this.
SUMMARY:
I would definitely *not* do #2.
+1
I do like #3 and #1; perhaps we need to focus a bit on #3, to see if
there are other deficiencies we haven't seen yet, as I'm already
familiar with #1.
Thanks Bela!