[infinispan-dev] Suppressing state transfer via JMX

Fri May 31 12:40:53 EDT 2013

Yes, ISPN-1394 has a broader scope, but the proposed solution for 
ISPN-3140 covers quite a lot of ISPN-1394 and is not complex. We might 
not even need ISPN-1394 soon unless somebody really wants to control 
data ownership down to segment granularity. If we only want to batch 
joins/leaves and manually kick out nodes with or without losing their 
data, then this proposal should be enough. This solution should not 
prevent the implementation of ISPN-1394 in the future and will not need 
to be removed/undone.

Here are the details:

1. /Add a JMX writable attribute (or operation?) to 
ClusterTopologyManager (name it suppressRehashing?) that is false by 
default but should also be configurable via API or XML. While this 
attribute is true, the ClusterTopologyManager queues all 
join/leave/exclude (see below) requests instead of executing them on the 
spot as it normally would. The value of this attribute is ignored 
on all nodes but the coordinator. When it is set back to false, all 
queued operations (except the ones that cancel each other out) are 
executed. The setter should be synchronous, so when setting it back to 
false it does not return until the queue is empty and all rehashing has 
been processed./

2. /We add a JMX operation excludeNodes(list of addresses) to 
ClusterTopologyManager. Calling this method on any node but the 
coordinator is a no-op. This operation removes the given nodes from the 
topology (almost as if they had left) and forces a rebalance./ An 
excluded node is still present in the current CH but not in the pending 
CH. It effectively gives up ownership of all its data, which is then 
transferred to other (not excluded) nodes. At the end of the rebalance 
the node is removed from the topology for good and can be shut down 
without losing data. Note that if suppressRehashing==true, 
excludeNodes(..) just queues the nodes for later removal. We can batch 
multiple such exclusions and then re-activate the rehashing.

The parts that need to be implemented are written in italic above. 
Everything else is already there.
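
To make points 1 and 2 more concrete, here is a rough sketch of the 
queueing behaviour. It is not the real ClusterTopologyManager: the class 
name, the Request type and the rebalance counter are all made up for 
illustration, and "cancel each other out" is assumed to mean only that a 
queued join followed by a leave/exclude of the same node is dropped.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; not the real ClusterTopologyManager. */
class SuppressibleTopologyManager {
    enum Kind { JOIN, LEAVE, EXCLUDE }

    private static final class Request {
        final Kind kind;
        final String node;
        Request(Kind kind, String node) { this.kind = kind; this.node = node; }
    }

    private final Set<String> members = new LinkedHashSet<>();
    private final List<Request> queue = new ArrayList<>();
    private boolean suppressRehashing = false;
    private int rebalanceCount = 0;

    public synchronized void join(String node) { enqueue(Kind.JOIN, node); flushIfActive(); }

    public synchronized void leave(String node) { enqueue(Kind.LEAVE, node); flushIfActive(); }

    public synchronized void excludeNodes(List<String> nodes) {
        for (String node : nodes) enqueue(Kind.EXCLUDE, node);
        flushIfActive();
    }

    /** Synchronous: when set back to false, returns only after the queue is drained. */
    public synchronized void setSuppressRehashing(boolean suppress) {
        suppressRehashing = suppress;
        flushIfActive();
    }

    private void enqueue(Kind kind, String node) {
        // Assumption: a queued join followed by a leave/exclude is a net no-op.
        boolean cancelled = kind != Kind.JOIN
                && queue.removeIf(r -> r.kind == Kind.JOIN && r.node.equals(node));
        if (!cancelled) queue.add(new Request(kind, node));
    }

    private void flushIfActive() {
        if (suppressRehashing) return;
        boolean changed = false;
        for (Request r : queue) {
            changed |= (r.kind == Kind.JOIN) ? members.add(r.node) : members.remove(r.node);
        }
        queue.clear();
        if (changed) rebalanceCount++; // stand-in for one rebalance over the net change
    }

    public synchronized List<String> getMembers() { return new ArrayList<>(members); }

    public synchronized int getRebalanceCount() { return rebalanceCount; }

    public static void main(String[] args) {
        SuppressibleTopologyManager m = new SuppressibleTopologyManager();
        m.join("A");
        m.join("B");
        m.setSuppressRehashing(true);
        m.join("C");
        m.join("D");
        m.leave("D");                                   // cancels the queued join of D
        m.excludeNodes(java.util.Arrays.asList("A"));
        m.setSuppressRehashing(false);                  // one rebalance for the whole batch
        System.out.println(m.getMembers() + " after " + m.getRebalanceCount() + " rebalances");
    }
}
```

In this sketch the batch of four requests costs a single rebalance 
instead of four, which is the whole point of the attribute.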

excludeNodes is a way of achieving a soft shutdown and should be used 
only if we care about preserving data in the extreme case where the 
nodes are the last/single owners. We can just kill the node directly if 
we do not care about its data.
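
As an illustration of the "last/single owners" case, the sketch below 
lists the segments whose owners are all among the nodes we want to stop. 
The map of segment id to owner addresses is a simplified stand-in for a 
consistent hash, not the actual Infinispan ConsistentHash API, and the 
class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; the owner map is a simplified model of a CH. */
class ExclusionSafetyCheck {
    /** Returns the ids of segments that would have no owner left if 'leaving' were killed. */
    static List<Integer> segmentsLosingAllOwners(Map<Integer, List<String>> ownersBySegment,
                                                 Set<String> leaving) {
        List<Integer> atRisk = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> entry : ownersBySegment.entrySet()) {
            // Every owner of this segment is about to go away.
            if (leaving.containsAll(entry.getValue())) {
                atRisk.add(entry.getKey());
            }
        }
        return atRisk;
    }
}
```

If the returned list is empty, killing the nodes directly loses nothing; 
otherwise those segments are the ones excludeNodes would need to move 
first.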

suppressRehashing is a way of achieving some kind of batching of 
topology changes. This should speed up state transfer a lot because it 
avoids a lot of pointless reshuffling of data segments when we have many 
successive joiners/leavers.

So what happens if the current coordinator dies for whatever reason? The 
new one will take control and will not know about the existing rehash 
queue or the previous value of the suppressRehashing attribute, so it 
will just get the current cache membership status from all members of 
the current view and proceed with the rehashing as usual. If the user 
does not want this, they can configure a default value of true for 
suppressRehashing. The admin now has to interact with the new 
coordinator via JMX. But that's not as bad as the alternative where all 
the nodes are involved in this JMX scheme :) I think having only the 
coordinator involved in this is a plus.
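
For the admin side, this is roughly what driving such an MBean through 
the standard javax.management API could look like. The MBean interface, 
its implementation and the ObjectName are invented for the example (the 
real attribute and operation would live on ClusterTopologyManager), and 
the MBean is registered in the local platform MBeanServer only so that 
the snippet runs on its own. Against a real coordinator the same 
setAttribute/invoke calls would go through an MBeanServerConnection 
obtained from a JMX connector.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class RehashControlJmxDemo {
    /** Hypothetical management interface; the real MBean may look different. */
    public interface RehashControlMBean {
        boolean isSuppressRehashing();
        void setSuppressRehashing(boolean suppress);
        void excludeNodes(String[] addresses);
    }

    /** Trivial implementation that only records what it was asked to do. */
    public static class RehashControl implements RehashControlMBean {
        private volatile boolean suppressRehashing;
        final List<String> excluded = new CopyOnWriteArrayList<>();

        public boolean isSuppressRehashing() { return suppressRehashing; }
        public void setSuppressRehashing(boolean suppress) { suppressRehashing = suppress; }
        public void excludeNodes(String[] addresses) { excluded.addAll(Arrays.asList(addresses)); }
    }

    /** Suppress, queue two exclusions, re-enable; returns the nodes the MBean was told to exclude. */
    static List<String> batchExclude(String... nodes) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Hypothetical name, chosen only for this example.
            ObjectName name = new ObjectName("example.infinispan:type=RehashControl");
            if (server.isRegistered(name)) server.unregisterMBean(name);
            RehashControl control = new RehashControl();
            server.registerMBean(control, name);

            server.setAttribute(name, new Attribute("SuppressRehashing", true));
            server.invoke(name, "excludeNodes",
                    new Object[] { nodes },
                    new String[] { String[].class.getName() });
            server.setAttribute(name, new Attribute("SuppressRehashing", false));

            server.unregisterMBean(name);
            return control.excluded;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(batchExclude("node-3", "node-4"));
    }
}
```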

Manik, how does this fit for the full and partial shutdown?

Cheers
Adi

On 05/31/2013 04:20 PM, Manik Surtani wrote:
>
> On 31 May 2013, at 13:52, Dan Berindei <dan.berindei at gmail.com 
> <mailto:dan.berindei at gmail.com>> wrote:
>
>> If we only want to deal with full cluster shutdown, then I think 
>> stopping all application requests, calling Cache.clear() on one node, 
>> and then shutting down all the nodes should be simpler. On start, 
>> assuming no cache store, the caches will start empty, so starting all 
>> the nodes at once and only allowing application requests when they've 
>> all joined should also work without extra work.
>>
>> If we only want to stop a part of the cluster, suppressing 
>> rebalancing would be better, because we wouldn't lose all the data. 
>> But we'd still lose the keys whose owners are all among the nodes we 
>> want to stop. I've discussed this with Adrian, and we think if we 
>> want to stop a part of the cluster without losing data we need a JMX 
>> operation on the coordinator that will "atomically" remove a set of 
>> nodes from the CH. After the operation completes, the user will know 
>> it's safe to stop those nodes without losing data.
>
> I think the no-data-loss option is bigger scope, perhaps part of 
> ISPN-1394.  And that's not what I am asking about.
>
>> When it comes to starting a part of the cluster, a "pause 
>> rebalancing" option would probably be better - but again, on the 
>> coordinator, not on each joining node. And clearly, if more than 
>> numOwner nodes leave while rebalancing is suspended, data will be lost.
>
> Yup.  This sort of option would only be used where data loss isn't an 
> issue (such as a distributed cache).  Where data loss is an issue, 
> we'd need more control - ISPN-1394.
>
>>
>> Cheers
>> Dan
>>
>>
>>
>> On Fri, May 31, 2013 at 12:17 PM, Manik Surtani <msurtani at redhat.com 
>> <mailto:msurtani at redhat.com>> wrote:
>>
>>     Guys
>>
>>     We've discussed ISPN-3140 elsewhere before, I'm bringing it to
>>     this forum now.
>>
>>     https://issues.jboss.org/browse/ISPN-3140
>>
>>     Any thoughts/concerns?  Particularly looking to hear from Dan or
>>     Adrian about viability, complexity, ease of implementation.
>>
>>     Thanks
>>     Manik
>>     --
>>     Manik Surtani
>>     manik at jboss.org <mailto:manik at jboss.org>
>>     twitter.com/maniksurtani <http://twitter.com/maniksurtani>
>>
>>     Platform Architect, JBoss Data Grid
>>     http://red.ht/data-grid
>>
>>
>>     _______________________________________________
>>     infinispan-dev mailing list
>>     infinispan-dev at lists.jboss.org
>>     <mailto:infinispan-dev at lists.jboss.org>
>>     https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>>
>
> --
> Manik Surtani
> manik at jboss.org <mailto:manik at jboss.org>
> twitter.com/maniksurtani <http://twitter.com/maniksurtani>
>
> Platform Architect, JBoss Data Grid
> http://red.ht/data-grid
>
>
>
