[infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

Galder Zamarreño galder at redhat.com
Wed Oct 26 05:46:56 EDT 2011


On Oct 24, 2011, at 2:42 PM, Sanne Grinovero wrote:

> On 24 October 2011 12:58, Dan Berindei <dan.berindei at gmail.com> wrote:
>> Hi Galder
>> 
>> On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño <galder at redhat.com> wrote:
>>> 
>>> On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:
>>> 
>>>> ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
>>>> interesting question: if the preloading happens before joining, the
>>>> preloading code won't know anything about the consistent hash. It will
>>>> load everything from the cache store, including the keys that are
>>>> owned by other nodes.
>>> 
>>> It's been defined to work that way:
>>> https://docs.jboss.org/author/display/ISPN/CacheLoaders
>>> 
>>> Tbh, that will only happen in shared cache stores. In non-shared ones, you'll only have data that belongs to that node.
>>> 
>> 
>> Not really... in distributed mode, every time the cache starts it will
>> have another position on the hash wheel.
>> That means even with a non-shared cache store, it's likely most of the
>> stored keys will no longer be local.
>> 
>> Actually I just noticed that you've fixed ISPN-1404, which looks like
>> it would solve my problem when the cache is created by a HotRod
>> server. I would like to extend it to work like this by default, e.g.
>> by using the transport's nodeName as the seed.
>> 
>>>> I think there is a check in place already so that the joiner won't
>>>> push stale data from its cache store to the other nodes, but we should
>>>> also discard the keys that don't map locally or we'll have stale data
>>>> (since we don't have a way to check if those keys are stale and
>>>> register to receive invalidations for those keys).
>>> 
>>> +1, only for shared cache stores.
>>> 
>>>> 
>>>> What do you think, should I discard the non-local keys with the fix
>>>> for ISPN-1470 or should I let them be and warn the user about
>>>> potentially stale data?
>>> 
>>> Discard only for shared cache stores.
>>> 
>>> Cache configurations should be symmetrical, so if other nodes preload, they'll preload only data local to them with your change.
>>> 
>> 
>> Discarding works fine from the correctness POV, but for performance
>> it's not that great: we may do a lot of work to preload keys and have
>> nothing to show for it at the end.
> 
> Can't you just skip loading state and be happy with the state you
> receive from peers? More data will be lazily loaded.
> This applies, of course, only when you're not the only/first node in
> the grid; if you are, you have to load.
> 
> The only alternative I see is to be able to find the boundaries of
> keys you own, and change the CacheLoader API to load keys by the
> identified range - should work with multiple boundaries too for
> virtualnodes, but this is something that not all CacheLoaders will be
> able to implement, so it should be an optional API; for now I'd stick
> with the first option above as I don't see how we can be more
> efficient in loading the state from CacheLoaders than via JGroups.

Previously, when state transfer meant that all state came from a single node, that node could become overloaded, so reading from the cache loader might have been more efficient, particularly with a non-shared one that's available on your own machine.

The benefit of loading state from the cache loader is that the rest of the nodes don't have to stop what they're doing, whereas in the current design they do have to stop when state is loaded from other nodes.
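
To make the "discard non-local keys on preload" option from earlier in the thread concrete, here is a minimal sketch in Java. It is not Infinispan code: HashWheel, primaryOwner and preloadLocalOnly are hypothetical names, the wheel is a toy with one owner per key (no numOwners, no virtual nodes), and a plain Map stands in for the cache store.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class PreloadFilterSketch {

    // Toy hash wheel (hypothetical, not Infinispan's ConsistentHash):
    // each node sits at hash(nodeName), and a key is owned by the first
    // node at or after the key's position, wrapping around at the end.
    static final class HashWheel {
        private final TreeMap<Integer, String> positions = new TreeMap<>();

        HashWheel(Collection<String> nodeNames) {
            for (String name : nodeNames) {
                positions.put(position(name), name);
            }
        }

        // Map any object to a non-negative position on the wheel.
        static int position(Object o) {
            return o.hashCode() & Integer.MAX_VALUE;
        }

        String primaryOwner(Object key) {
            Map.Entry<Integer, String> e = positions.ceilingEntry(position(key));
            return (e != null ? e : positions.firstEntry()).getValue();
        }
    }

    // Walk the whole store but keep only the entries that map to 'self'.
    // Assumption: the store is a Map standing in for a cache loader.
    static <K, V> Map<K, V> preloadLocalOnly(Map<K, V> store, HashWheel wheel, String self) {
        Map<K, V> local = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : store.entrySet()) {
            if (self.equals(wheel.primaryOwner(e.getKey()))) {
                local.put(e.getKey(), e.getValue());
            }
        }
        return local;
    }

    public static void main(String[] args) {
        HashWheel wheel = new HashWheel(Arrays.asList("A", "B"));
        Map<Integer, String> store = new LinkedHashMap<>();
        store.put(10, "x");
        store.put(66, "y");
        store.put(100, "z");
        System.out.println("A preloads: " + preloadLocalOnly(store, wheel, "A"));
        System.out.println("B preloads: " + preloadLocalOnly(store, wheel, "B"));
    }
}
```

Note that the filter still reads every entry from the store before throwing the non-local ones away, which is exactly the performance concern Dan raised; it only helps correctness, and only once the node already knows the hash wheel, i.e. after joining.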

> 
> Sanne

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



