[infinispan-issues] [JBoss JIRA] (ISPN-5515) Purge store if there is another node already running
Dan Berindei (JIRA)
issues at jboss.org
Tue Jun 2 06:53:02 EDT 2015
[ https://issues.jboss.org/browse/ISPN-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073053#comment-13073053 ]
Dan Berindei commented on ISPN-5515:
------------------------------------
> I think these are really tricky ideas which should be discussed on the mailing list, I noticed this JIRA by pure luck and find it concerning that such decisions are made without any wider discussion.
We already discussed this on the mailing list, and the conclusion was to implement [graceful restart|https://github.com/infinispan/infinispan/wiki/Graceful-shutdown-&-restore]. This issue is not really about implementing new functionality, it's about automating a recommendation we already have _for users who want it_.
> it's possible the new starting node starts while "thinking it's first", but then actually merge with a running cluster. The cluster detection protocols aren't foolproof, and you're relying on timeouts to be configured safely (when are they ever?).
If a node starts in a separate partition by itself, the behaviour with "purge on join" enabled will be exactly as it is now - not better, but not worse either.
> it's unrealistic to push such a requirement to "admin's responsibility" especially but not least because node restarts might not be under their control
A node restart will not affect nodes that are already running in any way.
Also, this option will be disabled by default, and if the admin can't control the order in which nodes start, he should definitely not enable it.
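For context, the existing manual recommendation this issue would automate amounts to purging the store on startup and avoiding preload. A hedged sketch using the Infinispan 7.x XML schema (cache and path names are illustrative only):

```xml
<replicated-cache name="example">
   <persistence>
      <!-- purge="true" clears this node's store on startup, so stale
           entries cannot be served; preload="false" avoids loading
           entries into the data container before joining the cluster -->
      <file-store path="example-store" purge="true" preload="false"/>
   </persistence>
</replicated-cache>
```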
> even with this design, the majority of cachestores are cleared so there is an assumption that "data loss is fine" for the user: so why even bother trying to keep a small portion of it at risk of consistency trouble?
In a replicated cache, it's not a small portion of the data, it's all the data.
I agree that it makes a lot less sense in a distributed cache: in theory you could shut down the cluster such that all the state is transferred to a single node and all the data is preserved in that node's store, but it's definitely not something you'd want to do on a regular basis.
> this design seems to favour something else above correctness, and I'm not sure what "something else" you're aiming at.. why work hard to not wipe a single cachestore?
These are my assumptions for using this option:
* The cache is replicated and the number of nodes is small
* Losing data is not fatal, as there is a backing store
* Reading a stale value *is* fatal
* Reading data from the canonical store is slow
I realize these assumptions are quite narrow, and most users will not use it. But for applications that do fit these assumptions, I think this will help. And it would be back-portable to 7.2.x, unlike the graceful restart work.
>I agree with you that this is an improvement over the current state, but I don't see why you would implement tricky code to provide a tricky solution when all what's needed is remove the preloading option from configuration. You'll be done in much less work and get a better reliable solution.
I renamed the issue, since it's not really about preload - having preload does complicate the code, but stale values are possible with or without preload enabled.
> Purge store if there is another node already running
> ----------------------------------------------------
>
> Key: ISPN-5515
> URL: https://issues.jboss.org/browse/ISPN-5515
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core, Loaders and Stores
> Affects Versions: 7.2.2.Final, 8.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 8.0.0.Alpha2
>
>
> Preloading happens before communicating with other nodes that might already have the cache running. When joining the existing members, the cache then waits to receive the first consistent hash (CH) in which it is a member, and then deletes only the entries in the segments that it doesn't own in that CH.
> The intention of this was to remove as little as possible from the existing data, e.g. if the first node to start up is not the one that was stopped last. But the preloaded entries are not replicated to the other nodes, so this can lead to inconsistencies.
> It would be better to delay preloading until we know we are the first node to start up, but failing that we could clear the data container and the store before receiving the initial state.
> Note that this will only allow preloading data from one node. Restoring data from more nodes is harder to do, and we will implement it as part of graceful restart.
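The proposed behaviour above can be sketched in plain Java. This is a hypothetical illustration, not Infinispan code: the class `JoiningNode`, the maps standing in for the data container and the store, and the `firstNodeInCluster` flag are all illustrative names, and the real check would go through the cluster topology rather than a boolean.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the proposed join-time purge (not Infinispan code).
public class JoiningNode {
    final Map<String, String> dataContainer = new HashMap<>();
    final Map<String, String> store = new HashMap<>();

    void start(boolean firstNodeInCluster) {
        // Preload runs before we can talk to other nodes,
        // so stale entries may end up in the data container...
        dataContainer.putAll(store);
        if (!firstNodeInCluster) {
            // ...and if another node is already running, clear both the
            // data container and the store before receiving the initial
            // state, instead of keeping potentially stale preloaded entries.
            dataContainer.clear();
            store.clear();
        }
    }
}
```

A joiner that is not first thus starts empty and receives all its state via state transfer; only the first node keeps its preloaded data.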
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)