[infinispan-dev] default value for virtualNodes

Mircea Markus mircea.markus at jboss.com
Fri Jan 27 08:35:28 EST 2012


The tool you built might be very useful for capacity planning. I think it would be great if you document it somewhere and add it to the code base.

On 27 Jan 2012, at 08:41, Dan Berindei wrote:
> Hi guys
> 
> I've been working on a test to search for an optimal default value here:
> https://github.com/danberindei/infinispan/commit/983c0328dc40be9609fcabb767dd46f9b98af464
> 
> I'm measuring both the number of keys for which a node is primary
> owner and the number of keys for which it is one of the owners
> compared to the ideal distribution (K/N keys on each node). The former
> tells us how much more work the node could be expected to do, the
> latter how much memory the node is likely to need.
> 
> I'm only running 10000 loops, so the max figure is not the absolute
> maximum. But it's certainly bigger than the 0.9999 percentile.
> 
> The full results are here:
> http://fpaste.org/cI1r/
> 
> The uniformity of the distribution goes up with the number of virtual
> nodes but down with the number of physical nodes. I think we should go
> with a default of 48 nodes (or 50 if you prefer decimal). With 32
> nodes, there's only a 0.1% chance that a node will hold more than 1.35
> * K/N keys, and a 0.1% chance that the node will be primary owner for
> more than 1.5 * K/N keys.
> 
> We could go higher, but we run against the risk of node addresses
> colliding on the hash wheel. According to the formula on the Birthday
> Paradox page (http://en.wikipedia.org/wiki/Birthday_problem), we only
> need 2072 addresses on our 2^31 hash wheel to get a 0.1% chance of
> collision. That means 21 nodes * 96 virtual nodes, 32 nodes * 64
> virtual nodes or 43 nodes * 48 virtual nodes.




More information about the infinispan-dev mailing list