General comments:
- It seems that this is a mechanism where you replicate to the
repl_count instances 'next' to you, i.e. similar to Buddy Replication ?
- Are you tying your rebalancing mechanism to the consistent hash
implementation ? This would IMO be bad because it doesn't allow for
plugging in of a different CH algorithm !
More comments inline
Manik Surtani wrote:
DIST, take 2!
Previous design failed due to two flaws:
1. Transaction logs maintained on sender, for each recipient.
Plenty of scope for races, or heavily synchronized.
2. Consistent hash attempted to be overly fair in evenly dispersing
nodes across a hash space. Meant that there was an often large and
unnecessary amount of rehashing to do, which exacerbated the problem
in 1.
So we have a new approach, based on the following premises.
1. Consistent hash (CH) based on fixed positions in the hash space
rather than relative ones.
Do you have a description of how the fixed-positions CH works ? From the
bullets below it seems this is like Buddy Replication, where you store
the data on the N buddies next to you.
1.1. Pros: limited and finite rehashing, particularly when there
is a leave.
1.1.1. If the leaver is L, only node (L - 1) and node (L + 1)
will have to push state, and only (L + 1) and (L + replCount) will
have to receive state.
1.2. Cons: uneven spread of load (mitigated with grid size)
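
To make the "fixed positions" idea and the leave case in 1.1.1 concrete, here is a minimal sketch. It is not the proposed implementation: the class name, the HASH_SPACE size, the position() helper and the rule "owners are the first replCount nodes at or after the key's position, wrapping around" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of a fixed-position consistent hash (names are illustrative).
class FixedPositionHashSketch {
    // Assumed size of the hash space; the design does not state one.
    static final int HASH_SPACE = 1024;

    // Fixed position -> node name, sorted by position (the "ring").
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int replCount;

    FixedPositionHashSketch(int replCount) {
        this.replCount = replCount;
    }

    // Assumption: a position depends only on the node's own name,
    // never on which other nodes are present.
    static int position(String nodeName) {
        return Math.floorMod(nodeName.hashCode(), HASH_SPACE);
    }

    // A joiner takes a position and keeps it; nobody else moves.
    void join(String node, int pos) {
        ring.put(pos, node);
    }

    // A leaver frees its position; nobody else moves.
    void leave(String node) {
        ring.values().remove(node);
    }

    // Owners of a key: the first replCount nodes at or after keyPos, wrapping around.
    List<String> locate(int keyPos) {
        List<String> owners = new ArrayList<>();
        int wanted = Math.min(replCount, ring.size());
        Integer pos = ring.ceilingKey(keyPos);
        while (owners.size() < wanted) {
            if (pos == null) {
                pos = ring.firstKey(); // wrap around the end of the hash space
            }
            owners.add(ring.get(pos));
            pos = ring.higherKey(pos);
        }
        return owners;
    }

    public static void main(String[] args) {
        FixedPositionHashSketch ch = new FixedPositionHashSketch(2);
        for (String n : new String[] {"A", "B", "C", "D", "E"}) {
            ch.join(n, position(n));
        }
        System.out.println("owners before leave: " + ch.locate(450));
        ch.leave("C");
        System.out.println("owners after leave:  " + ch.locate(450));
    }
}
```

In this toy ring, removing a node only changes the owner lists that contained it; keys whose owners did not include the leaver resolve to exactly the same nodes as before, which is the "limited and finite rehashing" property claimed above.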
4. Implementation notes:
4.4. InstallConsistentHashCommand - an RPC command that
"installs" a consistent hash instance on remote nodes.
What does 'installing a consistent hash instance' mean ?
4.5. GetConsistentHashCommand - an RPC command that
"asks" a node
to serialize and transmit across its current CH impl.
Serialize what ? The state (I assume) ? Or the consistent hash ? How can
you serialize a CH in the latter case ?
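
One possible reading, offered only as an assumption: if a CH is fully described by its position-to-node table, then "serializing the CH" could mean shipping that table, and "installing" it could mean rebuilding the same table on the receiver. The sketch below shows that round trip; the class name and wire layout are hypothetical and not taken from the design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical wire format for a CH whose whole state is a position -> node table.
class ConsistentHashWireSketch {

    // Sender side: write the entry count, then each (position, node name) pair.
    static byte[] serialize(SortedMap<Integer, String> positions) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(positions.size());
            for (Map.Entry<Integer, String> e : positions.entrySet()) {
                out.writeInt(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Receiver side: rebuild an identical table, i.e. "install" the same CH state.
    static TreeMap<Integer, String> deserialize(byte[] data) {
        TreeMap<Integer, String> positions = new TreeMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            for (int i = 0; i < size; i++) {
                int pos = in.readInt();
                positions.put(pos, in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return positions;
    }
}
```

Under this reading only the table travels, not the cached data, so the payload stays small; whether the design intends this or something else is exactly the open question.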
--
Bela Ban
Lead JGroups / Clustering Team
JBoss