[infinispan-dev] Preloading from disk versus state transfer Re: ISPN-1384 - InboundInvocationHandlerImpl should wait for cache to be started? (not just defined)

Sanne Grinovero sanne at infinispan.org
Mon Oct 24 09:42:40 EDT 2011


On 24 October 2011 12:58, Dan Berindei <dan.berindei at gmail.com> wrote:
> Hi Galder
>
> On Mon, Oct 24, 2011 at 1:46 PM, Galder Zamarreño <galder at redhat.com> wrote:
>>
>> On Oct 24, 2011, at 12:04 PM, Dan Berindei wrote:
>>
>>> ISPN-1470 (https://issues.jboss.org/browse/ISPN-1470) raises an
>>> interesting question: if the preloading happens before joining, the
>>> preloading code won't know anything about the consistent hash. It will
>>> load everything from the cache store, including the keys that are
>>> owned by other nodes.
>>
>> It's been defined to work that way:
>> https://docs.jboss.org/author/display/ISPN/CacheLoaders
>>
>> Tbh, that will only happen in shared cache stores. In non-shared ones, you'll only have data that belongs to that node.
>>
>
> Not really... in distributed mode, every time the cache starts it will
> have another position on the hash wheel.
> That means even with a non-shared cache store, it's likely most of the
> stored keys will no longer be local.
>
> Actually I just noticed that you've fixed ISPN-1404, which looks like
> it would solve my problem when the cache is created by a HotRod
> server. I would like to extend it to work like this by default, e.g.
> by using the transport's nodeName as the seed.
>
>>> I think there is a check in place already so that the joiner won't
>>> push stale data from its cache store to the other nodes, but we should
>>> also discard the keys that don't map locally or we'll have stale data
>>> (since we don't have a way to check if those keys are stale and
>>> register to receive invalidations for those keys).
>>
>> +1, only for shared cache stores.
>>
>>>
>>> What do you think, should I discard the non-local keys with the fix
>>> for ISPN-1470 or should I let them be and warn the user about
>>> potentially stale data?
>>
>> Discard only for shared cache stores.
>>
>> Cache configurations should be symmetrical, so if other nodes preload, they'll preload only data local to them with your change.
>>
>
> Discarding works fine from the correctness POV, but for performance
> it's not that great: we may do a lot of work to preload keys and have
> nothing to show for it at the end.

Can't you just skip preloading and be happy with the state you
receive from peers? Any further data will be loaded lazily.
This applies, of course, only when you're not the only/first node in
the grid; in that case you have to load.
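To make the first option concrete, here is a minimal sketch of the decision, assuming the joiner can see the current cluster view as a list of member names. The class PreloadDecision and the method shouldPreload are hypothetical names for illustration, not existing Infinispan code.

```java
import java.util.List;

public class PreloadDecision {

    /**
     * Hypothetical helper, not an Infinispan API: preload from the cache
     * store only when this node is the sole member of the cluster view.
     * Otherwise rely on state transfer from peers plus lazy loading.
     */
    static boolean shouldPreload(List<String> clusterMembers, String self) {
        return clusterMembers.size() == 1 && clusterMembers.contains(self);
    }

    public static void main(String[] args) {
        // First node in the grid: nobody to transfer state from, so preload.
        System.out.println(shouldPreload(List.of("nodeA"), "nodeA"));
        // Joining an existing grid: skip preload, take state from peers.
        System.out.println(shouldPreload(List.of("nodeA", "nodeB"), "nodeB"));
    }
}
```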

The only alternative I see is to find the boundaries of the keys you
own, and change the CacheLoader API to load keys by the identified
range. That should work with multiple boundaries too, for virtual
nodes, but not all CacheLoaders will be able to implement it, so it
should be an optional API. For now I'd stick with the first option
above, as I don't see how we can be more efficient in loading the
state from CacheLoaders than via JGroups.
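
For reference, a rough sketch of what such an optional range-based API could look like, assuming keys map to integer positions on a fixed-size hash wheel. Everything here (RangeAwareLoader, HashRange, loadKeysInRanges, the wheel size and the hashCode-based position function) is hypothetical and only illustrates the idea; it is not the CacheLoader API.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RangeLoadSketch {

    static final int WHEEL_SIZE = 1024; // assumed wheel size, for illustration only

    /** Segment (start, end] on the wheel; wraps around when start >= end. */
    static final class HashRange {
        final int start;
        final int end;

        HashRange(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(int position) {
            return start < end
                    ? position > start && position <= end
                    : position > start || position <= end;
        }
    }

    /** Hypothetical optional interface a store may implement. */
    interface RangeAwareLoader {
        Set<String> loadKeysInRanges(List<HashRange> ranges);
    }

    /** Assumed mapping from key to wheel position. */
    static int position(String key) {
        return Math.floorMod(key.hashCode(), WHEEL_SIZE);
    }

    /** Toy store that filters its keys by the owned ranges. */
    static final class InMemoryStore implements RangeAwareLoader {
        private final Set<String> keys;

        InMemoryStore(Set<String> keys) {
            this.keys = keys;
        }

        @Override
        public Set<String> loadKeysInRanges(List<HashRange> ranges) {
            Set<String> loaded = new TreeSet<>();
            for (String key : keys) {
                int p = position(key);
                for (HashRange range : ranges) {
                    if (range.contains(p)) {
                        loaded.add(key);
                        break; // one matching range is enough
                    }
                }
            }
            return loaded;
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore(Set.of("a", "b", "c"));
        // Two owned segments, as a node with two virtual nodes would have.
        List<HashRange> owned = List.of(new HashRange(96, 97), new HashRange(98, 99));
        System.out.println(store.loadKeysInRanges(owned));
    }
}
```

A store that cannot filter by range would simply not implement the optional interface, and the caller would fall back to the first option.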

Sanne


