[infinispan-dev] Suppressing state transfer via JMX
Adrian Nistor
anistor at redhat.com
Fri May 31 13:26:03 EDT 2013
A small correction to my previous email is needed. When I said "Note that
if suppressRehashing==false operation excludeNodes(..) just queues ... "
I actually meant to say suppressRehashing==true.
On 05/31/2013 07:40 PM, Adrian Nistor wrote:
> Yes, ISPN-1394 has a broader scope, but the proposed solution for
> ISPN-3140 solves quite a lot of ISPN-1394 and is not complex. We
> might not even need ISPN-1394 soon, unless somebody really wants to
> control data ownership down to segment granularity. If we only want
> to batch joins/leaves and manually kick out nodes, with or without
> losing their data, then this proposal should be enough. This solution
> does not prevent implementing ISPN-1394 in the future and would not
> need to be removed/undone.
>
> Here are the details:
>
> 1. /Add a JMX writable attribute (or operation?) to
> ClusterTopologyManager (name it suppressRehashing?) that is false by
> default but should also be configurable via API or XML. While this
> attribute is true, the ClusterTopologyManager queues all
> join/leave/exclude (see below) requests instead of executing them on
> the spot, as would normally happen. The value of this attribute is
> ignored on all nodes but the coordinator. When it is set back to
> false, all queued operations (except the ones that cancel each other
> out) are executed. The setter should be synchronous, so setting it
> back to false does not return until the queue is empty and all
> rehashing has been processed./ (A toy sketch of this queueing
> behaviour follows after point 2.)
>
> 2. /Add a JMX operation excludeNodes(list of addresses) to
> ClusterTopologyManager. Calling this method on any node but the
> coordinator is a no-op. This operation removes the nodes from the
> topology (almost as if they left) and forces a rebalance./ An
> excluded node is still present in the current CH but not in the
> pending CH; it basically disowns all its data, which is then
> transferred to other (non-excluded) nodes. At the end of the
> rebalance the node is removed from the topology for good and can be
> shut down without losing data. Note that if suppressRehashing==false,
> excludeNodes(..) just queues the nodes for later removal. We can
> batch multiple such exclusions and then re-activate rehashing.
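>
> To make the semantics concrete, here is a minimal, self-contained toy
> model of the proposed coordinator-side behaviour (not Infinispan
> code; the class and method names are made up, and coalescing of
> requests that cancel each other out is omitted):
>
>     import java.util.ArrayDeque;
>     import java.util.Queue;
>
>     final class RehashSuppressingCoordinator {
>         private final Queue<Runnable> pending = new ArrayDeque<Runnable>();
>         private boolean suppressRehashing;
>
>         // The JMX setter: synchronous, so switching suppression off
>         // only returns once every queued request has been applied.
>         synchronized void setSuppressRehashing(boolean suppress) {
>             suppressRehashing = suppress;
>             if (!suppress) {
>                 Runnable request;
>                 while ((request = pending.poll()) != null) {
>                     request.run();
>                 }
>             }
>         }
>
>         // Called for every join/leave/exclude request that reaches
>         // the coordinator.
>         synchronized void onTopologyRequest(Runnable rebalance) {
>             if (suppressRehashing) {
>                 pending.add(rebalance);  // batch it for later
>             } else {
>                 rebalance.run();         // normal immediate rebalance
>             }
>         }
>     }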
>
> The parts that need to be implemented are written in italic above.
> Everything else is already there.
>
> excludeNodes is a way of achieving a soft shutdown and should be used
> only if we care about preserving data in the extreme case where the
> nodes to be stopped are the last/single owners of some keys. We can
> just kill a node directly if we do not care about its data.
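>
> For illustration, invoking it from a plain JMX client could look like
> the sketch below. The service URL and the ObjectName are hypothetical
> (the real name would follow Infinispan's JMX domain/component naming
> once the operation exists), and addresses are passed as strings just
> for the example:
>
>     import javax.management.MBeanServerConnection;
>     import javax.management.ObjectName;
>     import javax.management.remote.JMXConnector;
>     import javax.management.remote.JMXConnectorFactory;
>     import javax.management.remote.JMXServiceURL;
>
>     public class SoftShutdown {
>         public static void main(String[] args) throws Exception {
>             JMXServiceURL url = new JMXServiceURL(
>                   "service:jmx:rmi:///jndi/rmi://coordinator-host:9999/jmxrmi");
>             JMXConnector connector = JMXConnectorFactory.connect(url);
>             try {
>                 MBeanServerConnection conn = connector.getMBeanServerConnection();
>                 ObjectName topologyManager = new ObjectName(
>                       "org.infinispan:type=CacheManager,name=\"DefaultCacheManager\","
>                       + "component=ClusterTopologyManager");
>                 // Ask the coordinator to rebalance node-3's data away.
>                 conn.invoke(topologyManager, "excludeNodes",
>                       new Object[] { new String[] { "node-3" } },
>                       new String[] { String[].class.getName() });
>                 // When invoke() returns, node-3 owns no unique data
>                 // and can be shut down safely.
>             } finally {
>                 connector.close();
>             }
>         }
>     }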
>
> suppressRehashing is a way of achieving some kind of batching of
> topology changes. It should speed up state transfer a lot, because it
> avoids a lot of pointless reshuffling of data segments when there are
> many successive joiners/leavers.
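>
> Continuing the client sketch above, batching a planned round of node
> restarts into a single rebalance might look like this (the attribute
> name is the proposed one; the actual JMX capitalisation may differ):
>
>     import javax.management.Attribute;
>     import javax.management.MBeanServerConnection;
>     import javax.management.ObjectName;
>
>     class BatchedRestart {
>         static void restartNodes(MBeanServerConnection conn,
>               ObjectName topologyManager) throws Exception {
>             conn.setAttribute(topologyManager,
>                   new Attribute("suppressRehashing", Boolean.TRUE));
>             // ... restart/upgrade nodes here; their join/leave
>             // requests only queue up on the coordinator ...
>             // Synchronous: returns only after the queued changes are
>             // applied and the single resulting rebalance completes.
>             conn.setAttribute(topologyManager,
>                   new Attribute("suppressRehashing", Boolean.FALSE));
>         }
>     }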
>
> So what happens if the current coordinator dies for whatever reason?
> The new one will take over without any knowledge of the existing
> rehash queue or the previous value of the suppressRehashing
> attribute, so it will just get the current cache membership from all
> members of the current view and proceed with the rehashing as usual.
> If the user does not want this, he can configure a default value of
> true for suppressRehashing. The admin now has to interact via JMX
> with the new coordinator, but that's not as bad as the alternative,
> where all the nodes are involved in this JMX scheme :) I think having
> only the coordinator involved is a plus.
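>
> Purely as an illustration, that default could surface in the XML as
> an extra attribute on the existing stateTransfer element (the
> placement and the attribute name are hypothetical, nothing like this
> exists yet):
>
>     <clustering mode="dist">
>        <stateTransfer suppressRehashing="true"/>
>     </clustering>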
>
> Manik, how does this fit with the full and partial shutdown scenarios?
>
> Cheers
> Adi
>
>
> On 05/31/2013 04:20 PM, Manik Surtani wrote:
>>
>> On 31 May 2013, at 13:52, Dan Berindei <dan.berindei at gmail.com> wrote:
>>
>>> If we only want to deal with full cluster shutdown, then I think
>>> stopping all application requests, calling Cache.clear() on one
>>> node, and then shutting down all the nodes should be simpler. On
>>> start, assuming no cache store, the caches will start empty, so
>>> starting all the nodes at once and only allowing application
>>> requests once they have all joined should also work without any
>>> extra machinery.
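>>>
>>> In embedded mode that recipe is roughly the following (illustrative
>>> only; clear() is a cluster-wide operation, so running it on one
>>> node wipes the data everywhere):
>>>
>>>     import org.infinispan.Cache;
>>>     import org.infinispan.manager.DefaultCacheManager;
>>>
>>>     public class FullShutdown {
>>>         public static void shutdown(DefaultCacheManager manager,
>>>               String cacheName) {
>>>             Cache<Object, Object> cache = manager.getCache(cacheName);
>>>             cache.clear();   // run on one node, after requests stop
>>>             manager.stop();  // then stop this node; repeat on the rest
>>>         }
>>>     }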
>>>
>>> If we only want to stop a part of the cluster, suppressing
>>> rebalancing would be better, because we wouldn't lose all the data.
>>> But we'd still lose the keys whose owners are all among the nodes
>>> we want to stop. I've discussed this with Adrian, and we think that
>>> if we want to stop a part of the cluster without losing data, we
>>> need a JMX operation on the coordinator that "atomically" removes a
>>> set of nodes from the CH. After the operation completes, the user
>>> will know it's safe to stop those nodes without losing data.
>>
>> I think the no-data-loss option is bigger scope, perhaps part of
>> ISPN-1394. And that's not what I am asking about.
>>
>>> When it comes to starting a part of the cluster, a "pause
>>> rebalancing" option would probably be better - but again, on the
>>> coordinator, not on each joining node. And clearly, if numOwners or
>>> more nodes leave while rebalancing is suspended, data can be lost.
>>
>> Yup. This sort of option would only be used where data loss isn't an
>> issue (such as a distributed cache). Where data loss is an issue,
>> we'd need more control - ISPN-1394.
>>
>>>
>>> Cheers
>>> Dan
>>>
>>>
>>>
>>> On Fri, May 31, 2013 at 12:17 PM, Manik Surtani <msurtani at redhat.com> wrote:
>>>
>>> Guys
>>>
>>> We've discussed ISPN-3140 elsewhere before; I'm bringing it to
>>> this forum now.
>>>
>>> https://issues.jboss.org/browse/ISPN-3140
>>>
>>> Any thoughts/concerns? Particularly looking to hear from Dan or
>>> Adrian about viability, complexity, ease of implementation.
>>>
>>> Thanks
>>> Manik
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> twitter.com/maniksurtani
>>>
>>> Platform Architect, JBoss Data Grid
>>> http://red.ht/data-grid
>>>
>>
>> --
>> Manik Surtani
>> manik at jboss.org
>> twitter.com/maniksurtani
>>
>> Platform Architect, JBoss Data Grid
>> http://red.ht/data-grid