Hi Mircea,

I think you're missing an intro section with the use cases we want to
handle in 5.2.
The requirements for having a backup datacenter that isn't accessed by
any clients are pretty different from the requirements for multiple
"primary" datacenters that are all handling requests at the same time,
so we should have a clear picture of what we are trying to achieve.
On Fri, Feb 10, 2012 at 5:47 PM, Bela Ban <bban@redhat.com> wrote:
> I'm going to comment via this mailing list, and we can later summarize
> and append to the wiki (I don't like discussions on the wiki... :-))...
> #1 Virtual cluster
> Rehashing:
> I believe we'll have to come up with a custom consistent hash that knows
> about the 2 sites, and places all of the data owners into the local
> site. E.g. it would be bad if we create a key K in LON and make the
> primary owner T (in SFO) and the backup owner B in LON!
> This should also minimize rehashing across sites, and prefer rehashing
> within sites.
For some use cases I would think it's essential that the data is
replicated in both sites (async, of course).
But I second the idea that there must be at least one backup in the
same site as the primary owner, or rehashes become really expensive.
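To make the owner-placement rule concrete, here's a rough sketch of the
kind of hash I have in mind (the interface and all the names are made up
for illustration, this is not the real ConsistentHash SPI):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch: all the "real" owners of a key come from its
    // primary site, plus a few extra nodes in the backup site for the
    // async copy. A node joining or leaving SFO then never rehashes
    // LON-owned keys.
    public class SiteAwareHash {
        private final Map<String, List<String>> membersBySite; // site -> nodes
        private final int numOwners;           // owners in the primary site
        private final int numBackupSiteOwners; // owners in the backup site

        public SiteAwareHash(Map<String, List<String>> membersBySite,
                             int numOwners, int numBackupSiteOwners) {
            this.membersBySite = membersBySite;
            this.numOwners = numOwners;
            this.numBackupSiteOwners = numBackupSiteOwners;
        }

        public List<String> locate(Object key, String primarySite,
                                   String backupSite) {
            List<String> owners = new ArrayList<String>();
            owners.addAll(pick(membersBySite.get(primarySite), key, numOwners));
            owners.addAll(pick(membersBySite.get(backupSite), key,
                               numBackupSiteOwners));
            return owners;
        }

        private List<String> pick(List<String> nodes, Object key, int count) {
            // plain modular hashing stands in for the real hash function
            int start = Math.floorMod(key.hashCode(), nodes.size());
            List<String> picked = new ArrayList<String>();
            for (int i = 0; i < count && i < nodes.size(); i++)
                picked.add(nodes.get((start + i) % nodes.size()));
            return picked;
        }
    }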
> In terms of data access, the idea is that all writes would only go
> through 1 site, e.g. LON being active and SFO being a backup (in which
> reads could happen); my design didn't assume concurrent writes in
> multiple sites (at least not to the same data).
I like the idea of designating one site as the "master", but I'm
pretty sure we want to handle use cases where there is more than one
master.
Perhaps we could incorporate Erik's suggestion on the wiki and allow
the selection of a master site both at a cache level and at a key
level.
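Something along these lines, maybe (all the names here are made up, just
to show a cache-level default plus a key-level override):

    // Hypothetical: the cache config names a default master site, and a
    // key class can override it, e.g. by implementing an interface.
    public interface MasterSiteAware {
        String masterSite(); // e.g. "LON" or "SFO"
    }

    public class MasterSiteResolver {
        private final String cacheDefaultMasterSite; // from the cache config

        public MasterSiteResolver(String cacheDefaultMasterSite) {
            this.cacheDefaultMasterSite = cacheDefaultMasterSite;
        }

        // key-level override wins, otherwise fall back to the cache default
        public String masterSiteFor(Object key) {
            if (key instanceof MasterSiteAware)
                return ((MasterSiteAware) key).masterSite();
            return cacheDefaultMasterSite;
        }
    }

All writes (and possibly locks, see below) for a key would then be routed
to its master site.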
> Locking:
> Same as above
Having only one master site would certainly make locking easier.
In the other scenarios I'm not sure if we should always designate a
master site for a key and always lock there, or if we should only lock
on a "local primary owner" that's in the same site as the transaction
originator.
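Spelling the two options out (building on the SiteAwareHash and
MasterSiteResolver sketches above, so still entirely hypothetical):

    // Option 1: always lock on the primary owner in the key's master
    // site. Correct with concurrent writers in both sites, but every
    // lock on a remotely-mastered key pays a cross-site round trip.
    public String globalLockOwner(Object key) {
        String master = resolver.masterSiteFor(key);
        String backup = master.equals(localSite) ? remoteSite : localSite;
        return hash.locate(key, master, backup).get(0);
    }

    // Option 2: lock only on the primary owner in the originator's own
    // site. No cross-site round trip, but transactions in different
    // sites won't see each other's locks, so conflicting writes would
    // have to be reconciled after the fact.
    public String localLockOwner(Object key) {
        return hash.locate(key, localSite, remoteSite).get(0);
    }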
> Configuration:
> numOwners would have to be extended, e.g. we could introduce properties
> numPrimarySiteOwners=2 and numBackupSiteOwners=1, TBD
Should we allow the user to specify the number of backup sites as
well? That's assuming we are going to support more than two sites...
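I'd hope this can slot into the existing fluent configuration API,
something like the following (only numOwners() exists today; everything
under sites() is a hypothetical extension for the sake of discussion):

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;

    Configuration c = new ConfigurationBuilder()
       .clustering()
          .cacheMode(CacheMode.DIST_SYNC)
          .hash()
             .numOwners(2)               // exists today
          // .sites()                    // proposed, does not exist yet:
          //    .localSite("LON")
          //    .numPrimarySiteOwners(2) // owners in the key's master site
          //    .numBackupSiteOwners(1)  // async copies per backup site
          //    .maxBackupSites(1)       // if we support more than 2 sites
       .build();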
> #2 Hot Rod based
> Having to connect to potentially all nodes is bad, as customers possibly
> only want to open 1 port for cross-datacenter traffic.
I'm not sure I agree. It would be bad if we expect to connect over the
internet to the remote site, but I assume that everyone would use a
VPN to communicate between sites anyway.
> I assume the RemoteCacheStore would have to be enabled in all nodes in a
> given site for this to work?
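On each node the forwarding would boil down to roughly this (the
RemoteCacheManager client and the listener annotations are the real
client/notification APIs; the wiring around them is just a sketch of
what RemoteCacheStore would do for us):

    import org.infinispan.Cache;
    import org.infinispan.client.hotrod.RemoteCache;
    import org.infinispan.client.hotrod.RemoteCacheManager;
    import org.infinispan.notifications.Listener;
    import org.infinispan.notifications.cachelistener.annotation.CacheEntryModified;
    import org.infinispan.notifications.cachelistener.event.CacheEntryModifiedEvent;

    // Sketch: every node pushes the writes that originate on it to the
    // backup site over the normal Hot Rod client.
    @Listener
    public class BackupSiteForwarder {
        private final RemoteCache<Object, Object> backupSite;

        public BackupSiteForwarder(String sfoHost, int sfoPort) {
            RemoteCacheManager rcm = new RemoteCacheManager(sfoHost, sfoPort);
            this.backupSite = rcm.getCache();
        }

        @CacheEntryModified
        public void entryModified(CacheEntryModifiedEvent e) {
            if (e.isPre() || !e.isOriginLocal())
                return; // forward each write once, from its originator
            backupSite.put(e.getKey(), e.getValue());
        }

        public static void register(Cache<Object, Object> cache,
                                    String sfoHost, int sfoPort) {
            cache.addListener(new BackupSiteForwarder(sfoHost, sfoPort));
        }
    }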
> How would you handle requests sent during the crash of a HotRod
> endpoint? Would they get queued, similar to
> https://issues.jboss.org/browse/JGRP-1401 ?
> How would initial state transfer be done between sites? E.g. LON has
> been up for 2 days and now we start SFO. Does that mean we will
> effectively have to transfer *all* of the data in LON --> SFO?
> #3 Custom bridge
> I like the fact that both sites are configured with possibly different
> numOwners, e.g. LON=2 and SFO=1
+1
> This will not work if you need to invoke blocking RPCs between sites:
> the copying of traffic to A is always assumed to be asynchronous. Plus,
> the recipient in SFO wouldn't know the original sender; in the example,
> the sender would always be X.
Mircea was saying that we'd only send over committed modifications, so
no locking RPCs that need to be synchronous.
I'm not sure if the original sender is still relevant in this case.
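To make sure I understand the bridge end of #3, it would be roughly this
(plain JGroups 3.x API; the bridge stack file, the cluster name and the
payload format are all placeholders):

    import org.jgroups.JChannel;
    import org.jgroups.Message;
    import org.jgroups.ReceiverAdapter;

    // Sketch of the relay node: only the current site coordinator runs
    // this. It forwards batches of committed modifications over a
    // separate bridge channel and applies incoming batches locally.
    public class SiteBridge extends ReceiverAdapter {
        private final JChannel bridge;

        public SiteBridge(String bridgeProps, String bridgeCluster)
                throws Exception {
            bridge = new JChannel(bridgeProps); // e.g. a TCP stack for the WAN
            bridge.setReceiver(this);
            bridge.connect(bridgeCluster);      // e.g. "LON-SFO-bridge"
        }

        // async send: a crash between commit and send is exactly where
        // messages can get lost, hence the retransmission question below
        public void forward(byte[] committedMods) throws Exception {
            bridge.send(new Message(null, committedMods)); // null = everyone
        }

        @Override
        public void receive(Message msg) {
            applyLocally(msg.getBuffer());
        }

        private void applyLocally(byte[] committedMods) {
            // placeholder: deserialize and put() into the local cache,
            // flagged as remote-site updates so they don't bounce back
        }
    }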
> How do you handle the case where the relay (A or X) crashes and messages
> are sent during that time, before a new relay is elected?
I suspect this one is the tricky bit... Mircea did mention that he'd
like to reuse the retransmission logic from RELAY, but I'm not sure
how easy that would be.
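At a minimum I think we need a seqno per forwarded batch and acks from
the other site, so a newly elected relay can resend whatever is still
unacked. A bare-bones sketch of that bookkeeping (names made up, nothing
JGroups-specific in it):

    import java.util.Map;
    import java.util.concurrent.ConcurrentSkipListMap;
    import java.util.concurrent.atomic.AtomicLong;

    // Each batch keeps its seqno until the remote site acks it. This
    // only survives a relay crash if the pending map is itself kept on
    // the relay's successor, e.g. in a replicated cache.
    public class PendingBatches {
        private final AtomicLong seqno = new AtomicLong();
        private final ConcurrentSkipListMap<Long, byte[]> pending =
                new ConcurrentSkipListMap<Long, byte[]>();

        public long add(byte[] batch) {
            long s = seqno.incrementAndGet();
            pending.put(s, batch);
            return s; // send the batch tagged with s over the bridge
        }

        public void ack(long s) {
            pending.headMap(s, true).clear(); // remote got everything <= s
        }

        // a newly elected relay resends these
        public Iterable<Map.Entry<Long, byte[]>> unacked() {
            return pending.entrySet();
        }
    }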
> How do you do state transfer, e.g. bootstrap SFO from LON? I guess you
> transfer the entire state from LON --> SFO, right?
For 5.2 I think this would be acceptable.
> SUMMARY:
> I would definitely *not* do #2.
> I do like #3 and #1, perhaps we need to focus a bit on #3, to see if
> there are other deficiencies we haven't seen yet, as I'm already
> familiar with #1.
> Cheers,
> On 2/10/12 4:02 PM, Mircea Markus wrote:
>> Hi,
>>
>> I've started a document[1] that contains 3 possible approaches for
>> implementing the X-datacentre replication functionality.
>> This is a highly requested bit of functionality for 5.2 and involves
>> interaction between several components: e.g. state transfer,
>> transactions, hotrod, jgroups etc. Can you please take a look and
>> comment?
>>
>> [1] https://community.jboss.org/wiki/CrossDatacenterReplication-Design
> --
> Bela Ban
> Lead JGroups (http://www.jgroups.org)
> JBoss / Red Hat
_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev