Re: [infinispan-dev] Proposal: ISPN-1394 Manual rehashing in 5.2

Saturday, 4 February 2012

On 1 Feb 2012, at 12:23, Dan Berindei wrote:

...
 Bela, you're right, this is essentially what we talked about in
 Lisbon:
 https://community.jboss.org/wiki/AsymmetricCachesAndManualRehashingDesign

 For joins I actually started working on a policy of coalescing joins
 that happen one after the other in a short time interval. The current
 implementation is very primitive, as I shifted focus to stability, but
 it does coalesce joins 1 second after another join started (or while
 that join is still running).
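
A minimal sketch of the time-window part of such a coalescing policy, in Java. The class JoinCoalescer and its methods are hypothetical names for illustration, not Infinispan API, and the sketch does not model the "while that join is still running" case. Times are passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: batches join requests that arrive within a fixed window. */
public class JoinCoalescer {
    private final long windowMillis;
    private final List<String> pending = new ArrayList<>();
    private long firstJoinAt = -1;

    public JoinCoalescer(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Records a join request; the window starts with the first pending join. */
    public synchronized void requestJoin(String node, long nowMillis) {
        if (pending.isEmpty()) {
            firstJoinAt = nowMillis;
        }
        pending.add(node);
    }

    /** Returns all pending joiners once the window has elapsed, else an empty list. */
    public synchronized List<String> drainIfDue(long nowMillis) {
        if (pending.isEmpty() || nowMillis - firstJoinAt < windowMillis) {
            return new ArrayList<>();
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        firstJoinAt = -1;
        return batch;
    }
}
```

With a 1000 ms window, joins requested at t=0 and t=400 would be handed out together by the first drainIfDue call at or after t=1000, so a single view change could cover both.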

 I don't quite agree with Sanne's assessment that it's fine for
 getCache() to block for 5 minutes until the administrator allows the
 new node to join. We should modify startCaches() instead to signal to
 the coordinator that we are ready to receive data for one or all of
 the defined caches, and wait with a customizable time limit until the
 caches have properly joined the cluster.
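
The "wait with a customizable time limit" part could look roughly like the sketch below. JoinGate, markJoined and awaitJoined are hypothetical names, not part of Infinispan; only java.util.concurrent.CountDownLatch is a real API here. How the coordinator would actually signal admission is left open.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: lets a starting node wait, with a limit, until it has joined. */
public class JoinGate {
    private final CountDownLatch joined = new CountDownLatch(1);

    /** Would be called once the coordinator has admitted this node (assumption). */
    public void markJoined() {
        joined.countDown();
    }

    /** Waits up to the given limit; returns true only if the node joined in time. */
    public boolean awaitJoined(long timeout, TimeUnit unit) {
        try {
            return joined.await(timeout, unit);
        } catch (InterruptedException e) {
            // Preserve the interrupt for the caller and report "not joined".
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller that gets false back can decide for itself whether to fail the start or carry on without state, rather than blocking indefinitely.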

 The getCache() timeout should not be increased at all. Instead I would
 propose that getCache() returns a functional cache immediately, even
 if the cache didn't receive any data, and it works solely as an L1
 cache until the administrator allows it to join. I'd even make it
 possible to designate a cache as an L1-only cache, so it's never an
 owner for any key. 
I presume this would be encoded in the Address?  That would make sense for a node
permanently designated as an L1 node.  But then how would this work for a node temporarily
acting as L1 only, until it has been allowed to join?  Change the Address instance on the
fly?  A delegating Address?  :/

...
 For leaves, the main problem is that every node has to compute the
 same primary owner for a key, at all times. So we need a 2PC cache
 view installation immediately after any leave to ensure that every
 node determines the primary owner in the same way - we can't coalesce
 or postpone leaves. 
Yes, manual rehashing would probably just be for joins.  Controlled shutdown in itself is
manual, and crashes, well, need to be dealt with immediately IMO.
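
To illustrate why every node needs the same view before computing a primary owner, here is a deliberately simplified Java sketch. It uses a sorted member list and a modulo of the key's hash; this is NOT Infinispan's consistent hash, and PrimaryOwnerSketch is a hypothetical name. The point is only that the result is a pure function of (key, view).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: primary owner as a pure function of the key and the view. */
public class PrimaryOwnerSketch {

    public static String primaryOwner(Object key, List<String> members) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("empty view");
        }
        // Sort a copy so the result does not depend on the order members were listed in.
        List<String> sorted = new ArrayList<>(members);
        Collections.sort(sorted);
        // floorMod keeps the index non-negative for negative hash codes.
        int index = Math.floorMod(key.hashCode(), sorted.size());
        return sorted.get(index);
    }
}
```

For the key "k", a node that still sees the view [A, B, C] picks C, while a node that has already dropped C and sees [A, B] picks B. Until both nodes agree on the view, they disagree on the primary owner, which is the situation the 2PC view installation after a leave is meant to rule out.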

...

 For 5.2 I will try to decouple the cache view installation from the
 state transfer, so in theory we will be able to coalesce/postpone the
 state transfer for leaves as well
 (https://issues.jboss.org/browse/ISPN-1827). I kind of need it for
 non-blocking state transfer, because with the current implementation a
 leave forces us to cancel any state transfer in progress and restart
 with the updated cache view - a state transfer rollback will be very
 expensive with NBST.

 Erik does raise a valid point - with TACH, if we bring up a node with
 a different siteId, then it will be an owner for all the keys in the
 cache. That node probably isn't provisioned to hold all the keys, so
 it would very likely run out of memory or evict much of the data. I
 guess that makes it a 5.2 issue? 
Yes.

...
 Shutting down a site should be possible even with what we have now -
 just insert a DISCARD protocol in the JGroups stack of all the nodes
 that are shutting down, and when FD finally times out on the nodes in
 the surviving datacenter they won't have any state transfer to do
 (although it may cause a few failed state transfer attempts). We could
 make it simpler though.

 Cheers
 Dan

 On Tue, Jan 31, 2012 at 6:21 PM, Erik Salter <an1310(a)hotmail.com> wrote:
> ...such as bringing up a backup data center.
> 
> -----Original Message-----
> From: infinispan-dev-bounces(a)lists.jboss.org
> [mailto:infinispan-dev-bounces@lists.jboss.org] On Behalf Of Bela Ban
> Sent: Tuesday, January 31, 2012 11:18 AM
> To: infinispan-dev(a)lists.jboss.org
> Subject: Re: [infinispan-dev] Proposal: ISPN-1394 Manual rehashing in 5.2
> 
> I cannot volunteer either, but I find it important to be done in 5.2 !
> 
> Unless rehashing works flawlessly with a large number of nodes joining
> at the same time, I think manual rehashing is crucial...
> 
> 
> 
> On 1/31/12 5:13 PM, Sanne Grinovero wrote:
>> On 31 January 2012 16:06, Bela Ban <bban(a)redhat.com> wrote:
>>> This is essentially what I suggested at the Lisbon meeting, right ?
>> 
>> Yes!
>> 
>>> I think Dan had a design wiki on this somewhere...
>> 
>> Just raising it here as it was moved to 6.0, while I think it deserves
>> a dedicated thread to better think about it. If it's not hard, I think
>> it should be done sooner.
>> But while I started the thread to wake up the brilliant minds, I can't
>> volunteer for this to make it happen.
>> 
>> Sanne
>> 
>>> 
>>> 
>>> On 1/31/12 4:53 PM, Sanne Grinovero wrote:
>>>> I think this is an important feature to have soon;
>>>> 
>>>> My understanding of it:
>>>> 
>>>> We default with the feature off, and newly discovered nodes are
>>>> added/removed as usual. With a JMX-operable switch, one can disable
>>>> this:
>>>> 
>>>> If a remote node is joining the JGroups view, but rehash is off: it
>>>> will be added to a to-be-installed view, but this won't be installed
>>>> until rehash is enabled again. This gives time to add more changes
>>>> before starting the rehash, and would help a lot to start larger
>>>> clusters.
>>>> 
>>>> If the [self] node is booting and joining a cluster with manual rehash
>>>> off, the start process and any getCache() invocation should block and
>>>> wait for it to be enabled. This would need of course to override the
>>>> usually low timeouts.
>>>> 
>>>> When a node is suspected it's a bit of a different story as we need to
>>>> make sure no data is lost. The principle is the same, but maybe we
>>>> should have two flags: one which is a "soft request" to avoid rehashes
>>>> of less than N members (and refuse N>=numOwners ?), one which is just
>>>> disable it and don't care: data might be in a cachestore, data might
>>>> not be important. Which reminds me, we should consider as well a JMX
>>>> command to flush the container to the CacheLoader.
>>>> 
>>>> --Sanne
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev(a)lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> 
>>> --
>>> Bela Ban
>>> Lead JGroups (http://www.jgroups.org)
>>> JBoss / Red Hat
> 
> --
> Bela Ban
> Lead JGroups (http://www.jgroups.org)
> JBoss / Red Hat

--
Manik Surtani
manik(a)jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
