[infinispan-issues] [JBoss JIRA] (ISPN-5515) Preload only on the node that starts up the first

Tue Jun 2 04:40:02 EDT 2015

    [ https://issues.jboss.org/browse/ISPN-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072986#comment-13072986 ] 

Dan Berindei commented on ISPN-5515:
------------------------------------

If the cluster is running and a new node is being started, it will clear its store and use the data from the running members, so it doesn't matter how long it was shut down. It would be the admin's responsibility to make sure the last node to shut down is the first node to start back up.

Clearly, this is not [graceful shutdown and restore|https://github.com/infinispan/infinispan/wiki/Graceful-shutdown-&-restore], but it would automate the advice we already give users for cluster restart in order to avoid stale entries: remove the store file on all but the last node to stop, start that node, and only afterwards start the other nodes.
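
As a rough illustration of the rule being proposed, here is a minimal sketch. It is not Infinispan code: the class `StartupSketch`, the method `startCache`, and the map standing in for the store are all hypothetical, and state transfer is left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StartupSketch {
    /**
     * Returns the in-memory contents a node would start with (hypothetical model).
     *
     * @param runningMembers nodes that already have the cache running
     * @param localStore     this node's persistent store, modeled as a plain map
     */
    static Map<String, String> startCache(List<String> runningMembers,
                                          Map<String, String> localStore) {
        if (runningMembers.isEmpty()) {
            // First node to start: preload everything from the local store.
            return new HashMap<>(localStore);
        }
        // Joining a running cluster: discard possibly stale local data.
        // The entries would then come from the running members via state transfer.
        localStore.clear();
        return new HashMap<>();
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("k", "possibly-stale");

        // No running members: the store is preloaded and left untouched.
        System.out.println(startCache(Collections.<String>emptyList(), store));

        // A member is already running: the store is cleared, nothing is preloaded.
        System.out.println(startCache(Arrays.asList("NodeA"), store));
        System.out.println(store);
    }
}
```

With this rule, the only manual step left from the restart advice is the ordering: the node holding the freshest store has to be started first.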

It's true that we start caches lazily, at least in the embedded case, and this can mean the wrong cache starts first if the application doesn't explicitly create the cache on startup. But I still think it's worth it to simplify the procedure for avoiding stale entries.

> Preload only on the node that starts up the first
> -------------------------------------------------
>
>                 Key: ISPN-5515
>                 URL: https://issues.jboss.org/browse/ISPN-5515
>             Project: Infinispan
>          Issue Type: Enhancement
>          Components: Core, Loaders and Stores
>    Affects Versions: 7.2.2.Final, 8.0.0.Alpha1
>            Reporter: Dan Berindei
>            Assignee: Dan Berindei
>             Fix For: 8.0.0.Alpha2
>
>
> Preloading happens before communicating with other nodes that might already have the cache running. When joining the existing members, the cache then waits to receive the first consistent hash (CH) in which it is a member, and then deletes only the entries in the segments that it doesn't own in that CH.
> The intention of this was to remove as little as possible from the existing data, e.g. if the first node to start up is not the one that was stopped last. But the preloaded entries are not replicated to the other nodes, so this can lead to inconsistencies.
> It would be better to delay preloading until we know we are the first node to start up, but failing that we could clear the data container and the store before receiving the initial state.
> Note that this will only allow preloading data from one node. Restoring data from more nodes is harder to do, and we will implement it as part of graceful restart.
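
The difference between the current cleanup and the proposed fallback can be sketched as follows. This is a simplified model, not Infinispan code: `JoinCleanupSketch`, its methods, and the `hashCode`-based key-to-segment mapping are assumptions made for illustration, and state transfer is reduced to a `putAll`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class JoinCleanupSketch {
    // Hypothetical key-to-segment mapping; the real consistent hash is different.
    static int segmentOf(String key, int numSegments) {
        return Math.floorMod(key.hashCode(), numSegments);
    }

    // Behavior described in the issue: drop only entries in segments the joiner does not own.
    static void removeUnownedSegments(Map<String, String> data, Set<Integer> owned, int numSegments) {
        data.keySet().removeIf(k -> !owned.contains(segmentOf(k, numSegments)));
    }

    // Proposed fallback: clear everything before the initial state arrives.
    static void clearBeforeStateTransfer(Map<String, String> data) {
        data.clear();
    }

    // Simplified state transfer: entries from the running members overwrite local ones.
    static void applyState(Map<String, String> data, Map<String, String> incoming) {
        data.putAll(incoming);
    }

    public static void main(String[] args) {
        // "a" and "c" fall in segment 1, "b" in segment 0 (with 2 segments).
        Map<String, String> preloaded = new HashMap<>();
        preloaded.put("a", "old"); // removed from the cluster while this node was down
        preloaded.put("b", "old");
        Map<String, String> clusterState = Collections.singletonMap("c", "new");
        Set<Integer> owned = Collections.singleton(1);

        Map<String, String> current = new HashMap<>(preloaded);
        removeUnownedSegments(current, owned, 2);
        applyState(current, clusterState);
        // The stale key "a" survives only on the joiner.
        System.out.println("stale entry kept: " + current.containsKey("a"));

        Map<String, String> proposed = new HashMap<>(preloaded);
        clearBeforeStateTransfer(proposed);
        applyState(proposed, clusterState);
        System.out.println("matches cluster: " + proposed.equals(clusterState));
    }
}
```

In this model the stale entry survives in an owned segment because nothing in the incoming state overwrites it, which is the kind of inconsistency the description refers to.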

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)