[infinispan-dev] Server location hints in Infinispan

Mon May 17 10:04:25 EDT 2010

I have created a wiki page with some early thoughts around this.

	http://community.jboss.org/wiki/DesigningServerHinting

So far, defining such hints in XML is easy enough, and sharing them cluster-wide is also easy, since they can be added to the join handshake process in DIST.

This info can be attached to the Address of each node and used when hashing each Address to place it on the hash wheel.  However, every technique I have seen so far would only increase spread and *reduce* the chances of colocated nodes being adjacent on the hash wheel; none would *guarantee* that they are kept apart.  The thing is, such placement needs to be computed deterministically by any node in the grid, so the only inputs to such a function can be an Address and the set of hints.
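
To make the concern concrete, here is a minimal sketch, not Infinispan code.  It assumes a hypothetical NodeInfo holder for the address plus the three hints, and uses CRC32 purely as a stand-in hash.  The position depends only on the address and its hints, so every node computes the same wheel, yet nothing in it prevents two nodes on the same machine from ending up adjacent.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

public class HintedWheelSketch {

    /** Hypothetical holder for an address and its location hints (not an Infinispan class). */
    public static final class NodeInfo {
        public final String address, machineId, rackId, siteId;

        public NodeInfo(String address, String machineId, String rackId, String siteId) {
            this.address = address;
            this.machineId = machineId;
            this.rackId = rackId;
            this.siteId = siteId;
        }
    }

    /** Wheel position derived from nothing but the address and its hints (CRC32 is an assumption). */
    public static long position(NodeInfo n) {
        CRC32 crc = new CRC32();
        String input = n.siteId + '|' + n.rackId + '|' + n.machineId + '|' + n.address;
        crc.update(input.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Orders nodes by position; ties are broken by address so the result is fully deterministic. */
    public static List<NodeInfo> wheel(List<NodeInfo> nodes) {
        List<NodeInfo> sorted = new ArrayList<>(nodes);
        sorted.sort((a, b) -> {
            int c = Long.compare(position(a), position(b));
            return c != 0 ? c : a.address.compareTo(b.address);
        });
        return sorted;
    }

    /** Counts neighbouring pairs on the wheel (wrapping around) that share a machine id. */
    public static int adjacentColocatedPairs(List<NodeInfo> wheel) {
        int n = wheel.size();
        if (n < 2) {
            return 0;
        }
        int pairsToCheck = (n == 2) ? 1 : n; // with two nodes there is only one distinct pair
        int count = 0;
        for (int i = 0; i < pairsToCheck; i++) {
            NodeInfo a = wheel.get(i);
            NodeInfo b = wheel.get((i + 1) % n);
            if (a.machineId.equals(b.machineId)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<NodeInfo> nodes = Arrays.asList(
                new NodeInfo("node-a", "m1", "r1", "s1"),
                new NodeInfo("node-b", "m1", "r1", "s1"),
                new NodeInfo("node-c", "m2", "r1", "s1"),
                new NodeInfo("node-d", "m2", "r1", "s1"));
        List<NodeInfo> w = wheel(nodes);
        for (NodeInfo n : w) {
            System.out.println(position(n) + " " + n.address + " on " + n.machineId);
        }
        System.out.println("adjacent colocated pairs: " + adjacentColocatedPairs(w));
    }
}
```

Mixing the hints into the hash input changes where each node lands, but whether colocated nodes end up as neighbours is still down to how the hash values happen to fall.  A real guarantee would need something beyond the position function, and that is outside what this sketch attempts.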

I'm not sure how useful or acceptable this is though.  Thoughts?

Cheers
Manik

On 22 Mar 2010, at 16:25, Manik Surtani wrote:

> This relates to https://jira.jboss.org/jira/browse/ISPN-180.
> 
> In JBoss Cache, we had a provision to allow for pluggable buddy selection algorithms.  By default, the buddy selection process would first try and pick a buddy in the same buddy group, failing which any buddy *not* on the same physical machine, failing which any buddy not in the same JVM, and finally any buddy at all.  Further, being pluggable, people could write their own buddy selection algorithms to pick buddies based on any additional metrics, such as machine performance by hooking into monitoring tools, etc.
> 
> In Infinispan we do not have an equivalent as yet.  The consistent hash approach to distribution takes a hash of each server's address and uses this to place the server on a consistent hash wheel.  Owners for keys are picked based on consecutive places on the wheel.  So there is every possibility that nodes on the same physical host or rack are selected to back each other up, which is not optimal for data durability.  
> 
> One approach is for each node to provide additional hints as to where it is - hints including "machine id", "rack id" and maybe even "site id".  The hash function that calculates an address's position on the hash wheel would take these 3 metrics into account, so this should be robust and pretty efficient.  The only drawback with this approach is that for each address, this additional data needs to be globally available, since CHs need to work globally and deterministically.  This information could be a part of a DIST JOIN request, which would work well.
> 
> What do people think?  Any interesting alternate approaches to this problem?
> 
> Cheers
> Manik
> 
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
