[infinispan-issues] [JBoss JIRA] Issue Comment Edited: (ISPN-1000) PUSH based rehashing
Manik Surtani (JIRA)
jira-events at lists.jboss.org
Wed Apr 20 09:55:33 EDT 2011
[ https://issues.jboss.org/browse/ISPN-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596804#comment-12596804 ]
Manik Surtani edited comment on ISPN-1000 at 4/20/11 9:54 AM:
--------------------------------------------------------------
Some design ideas:
{code}
# PUSH based rehashing
The purpose behind this is to provide a better rehashing scheme that:
* is more robust
* uses fewer RPC messages
* performs better
The current rehashing scheme is RPC-heavy and attempts to be non-blocking, resulting in brittleness.
## Overview
* Every node registers a view change listener
* On view change,
    * determine if current node is affected and kick off a RehashTask
    * If related to a JOIN, RehashTask blocks on JOIN message to grab TopologyInfo (if config is topology-aware).
    * Enter READ_ONLY mode
    * if affected, loop through keyset in data container and non-shared cache store. For each key:
        * if LAST_OWNER on OLD_CH and key has a new owner in NEW_CH, add to push set
        * if LAST_OWNER is no longer an owner based on NEW_CH, add key to invalidation set (deals with JOINs and LEAVEs in the same way now!)
    * Push state to new owners
    * Apply NEW_CH to local node, exit READ_ONLY mode.
    * If successful, invalidate keys in invalidation set
### Determining if node is affected by ViewChange:
* If JOIN:
    * If idx(JOINER) is within + or - numOwners of idx(self)
* Or if LEAVE:
    * If idx(LEAVER) is within + or - (numOwners + 1) of idx(self)
_TODO_: How would this affect vnodes? LEAVER or JOINER will have multiple positions to check, and self will have multiple positions too. Can this be encapsulated in the ConsistentHash impl?
### READ_ONLY mode:
I think attempting a non-blocking rehash at this stage is way too complex. We have it with the current scheme and we can see how brittle it is. IMO we should go for a blocking model for now and make sure rehashing is stable and robust, and consider non-blocking later.
READ_ONLY mode is achieved by using the TransactionLogger. But we won't actually use this to *log* any transactions for now. Instead we can use it to block new transactions and non-transactional writes since it exists in all the appropriate places.
### Parallel joiners and leavers
How will this work with parallel joiners and leavers?
{code}
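A minimal sketch of the key scan and the affected-node check from the notes above, in plain Java. None of these names are Infinispan API: `PushRehashSketch`, `Plan`, `plan`, `affected` and `ringDistance` are hypothetical. A consistent hash is modelled as a function from a key to its ordered owner list, and the wrap-around distance and the inclusive window are assumptions the notes do not spell out. The sketch also reads "LAST_OWNER is no longer an owner" as applying to the local node when it is the last owner in OLD_CH.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Illustrative only: hypothetical names, not Infinispan classes. */
final class PushRehashSketch {

    /** Result of scanning local keys: what to push to whom, and what to invalidate. */
    static final class Plan {
        final Map<String, List<String>> pushSet = new LinkedHashMap<>();
        final Set<String> invalidationSet = new LinkedHashSet<>();
    }

    /**
     * Classifies each local key. oldCh and newCh stand in for OLD_CH and NEW_CH:
     * each maps a key to its ordered list of owners (an assumption of this sketch).
     */
    static Plan plan(String self, Iterable<String> localKeys,
                     Function<String, List<String>> oldCh,
                     Function<String, List<String>> newCh) {
        Plan plan = new Plan();
        for (String key : localKeys) {
            List<String> oldOwners = oldCh.apply(key);
            List<String> newOwners = newCh.apply(key);
            // Only the last owner in OLD_CH acts, so a key is pushed by one node only.
            boolean lastOwner = !oldOwners.isEmpty()
                    && oldOwners.get(oldOwners.size() - 1).equals(self);
            if (!lastOwner) {
                continue;
            }
            // Owners present in NEW_CH but not in OLD_CH need the state pushed to them.
            List<String> added = new ArrayList<>(newOwners);
            added.removeAll(oldOwners);
            if (!added.isEmpty()) {
                plan.pushSet.put(key, added);
            }
            // If this node lost ownership, the key is invalidated after a successful push.
            if (!newOwners.contains(self)) {
                plan.invalidationSet.add(key);
            }
        }
        return plan;
    }

    /** Shortest distance between two positions on a hash wheel of the given size. */
    static int ringDistance(int a, int b, int size) {
        int d = Math.abs(a - b) % size;
        return Math.min(d, size - d);
    }

    /** JOIN uses a window of numOwners; LEAVE uses numOwners + 1 (inclusive, by assumption). */
    static boolean affected(int selfIdx, int changedIdx, int wheelSize,
                            int numOwners, boolean isLeave) {
        int window = isLeave ? numOwners + 1 : numOwners;
        return ringDistance(selfIdx, changedIdx, wheelSize) <= window;
    }

    public static void main(String[] args) {
        // Node C joins and replaces B as an owner of k1; B was the last old owner.
        Map<String, List<String>> oldCh = Map.of("k1", List.of("A", "B"));
        Map<String, List<String>> newCh = Map.of("k1", List.of("A", "C"));
        Plan p = plan("B", oldCh.keySet(), oldCh::get, newCh::get);
        System.out.println("push=" + p.pushSet + " invalidate=" + p.invalidationSet);
    }
}
```

The vnodes TODO is not addressed here: with multiple positions per node, `affected` would have to be evaluated for every pair of positions, which is one argument for hiding the check behind the ConsistentHash implementation.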
was (Author: manik):
Some design ideas:
{code}
# PUSH based rehashing
The purpose behind this is to provide a better rehashing scheme that:
* is more robust
* uses fewer RPC messages
* performs better
The current rehashing scheme is RPC-heavy and attempts to be non-blocking, resulting in brittleness.
## Overview
* Every node registers a view change listener
* On JOIN, broadcast JOIN message containing optional topology info (if config is topology aware).
* On view change,
    * determine if current node is affected and kick off a RehashTask
    * If related to a JOIN, RehashTask blocks on JOIN message to grab TopologyInfo (if config is topology-aware).
    * Enter READ_ONLY mode
    * if affected, loop through keyset in data container and non-shared cache store. For each key:
        * if LAST_OWNER on OLD_CH and key has a new owner in NEW_CH, add to push set
        * if LAST_OWNER is no longer an owner based on NEW_CH, add key to invalidation set (deals with JOINs and LEAVEs in the same way now!)
    * Push state to new owners
    * Apply NEW_CH to local node, exit READ_ONLY mode.
    * If successful, invalidate keys in invalidation set
### Determining if node is affected by ViewChange:
* If JOIN:
    * If idx(JOINER) is within + or - numOwners of idx(self)
* Or if LEAVE:
    * If idx(LEAVER) is within + or - (numOwners + 1) of idx(self)
_TODO_: How would this affect vnodes? LEAVER or JOINER will have multiple positions to check, and self will have multiple positions too. Can this be encapsulated in the ConsistentHash impl?
### READ_ONLY mode:
I think attempting a non-blocking rehash at this stage is way too complex. We have it with the current scheme and we can see how brittle it is. IMO we should go for a blocking model for now and make sure rehashing is stable and robust, and consider non-blocking later.
READ_ONLY mode is achieved by using the TransactionLogger. But we won't actually use this to *log* any transactions for now. Instead we can use it to block new transactions and non-transactional writes since it exists in all the appropriate places.
### Parallel joiners and leavers
How will this work with parallel joiners and leavers?
{code}
> PUSH based rehashing
> --------------------
>
> Key: ISPN-1000
> URL: https://issues.jboss.org/browse/ISPN-1000
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Cache
> Affects Versions: 4.2.0.Final
> Reporter: Manik Surtani
> Assignee: Manik Surtani
> Labels: rehash
> Fix For: 5.0.0.CR1, 5.0.0.FINAL
>
>
> Current rehash schemes are based on a PULL of state. Joiners (and new owners after a leave) pull state from their neighbours. This JIRA is to reimplement this as a PUSH based scheme, where all nodes detect new joiners (or leavers) and analyse their internal state and determine what needs to be pushed where.
> The scheme should be more robust, involving far fewer RPCs and coordination, and would work better for merge views detected when partitions heal.
> Based on Bela's prototype on https://github.com/belaban/infinispan/tree/rebalance-changes
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira