I've been thinking more about this issue, after talking with Sanne, and
here's my (possibly faulty) analysis:
I don't think this is so dramatic or urgent that we need a solution
(i.e. a distinct SPI for embedded cachestores) in place by 8.0. This is
something that we can design and introduce as a private-only SPI during
the 8.x series and migrate our stores to use it accordingly. Note that
such an SPI would be more closely tied to the DataContainer, so it may not
even have a relationship with the PersistenceManager.
What I would like to see in the current SPI for 8.0, however, is an
extensible way for cachestores to expose "capabilities" so that not only
can we prevent potentially broken configurations, but we can also
declare support for advanced functionality (shared, transactional,
schema-aware, etc). I'm not fond of marker-only interfaces (see
org.infinispan.persistence.spi.LocalOnlyCacheLoader), so I'd prefer an
annotation-based approach.
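
To make the annotation idea concrete, here is a minimal Java sketch of
how a store could declare capabilities and how a configuration check
could read them. Every name in it (StoreCapabilities, Capability,
CapabilityCheck and the two example store classes) is hypothetical and
not part of the current SPI; the capability names simply mirror the
examples listed above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical capability names, mirroring the examples in the message.
enum Capability { SHARED, LOCAL_ONLY, TRANSACTIONAL, SCHEMA_AWARE }

// Hypothetical annotation: a store class declares what it supports.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface StoreCapabilities {
    Capability[] value();
}

// Stand-in store classes; real stores would also implement the persistence SPI.
@StoreCapabilities({Capability.SHARED, Capability.TRANSACTIONAL})
class ExampleSharedStore { }

@StoreCapabilities({Capability.LOCAL_ONLY})
class ExampleLocalStore { }

final class CapabilityCheck {
    private CapabilityCheck() { }

    // Reads the declared capabilities; an unannotated class declares none.
    static Set<Capability> capabilitiesOf(Class<?> storeClass) {
        StoreCapabilities declared = storeClass.getAnnotation(StoreCapabilities.class);
        Set<Capability> result = EnumSet.noneOf(Capability.class);
        if (declared != null) {
            result.addAll(Arrays.asList(declared.value()));
        }
        return result;
    }

    // Rejects a "shared" configuration for a store that does not declare SHARED.
    static void validate(Class<?> storeClass, boolean configuredAsShared) {
        if (configuredAsShared && !capabilitiesOf(storeClass).contains(Capability.SHARED)) {
            throw new IllegalStateException(
                storeClass.getSimpleName() + " does not declare the SHARED capability");
        }
    }
}
```

With this shape, adding a new capability means adding an enum constant
rather than a new marker interface, and the validation can run when the
configuration is built, before the store is started.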
Tristan
On 06/08/2015 10:39, Radim Vansa wrote:
I understand that shared cache stores will be implemented more often,
but I don't think that non-shared stores should be considered a
'private interface'. Still, separating them would really give us the
opportunity to change the non-shared SPI more often if needed without
breaking the shared one.
However, hot-gluing a cool new interface without a reference
implementation that supports transactions and solves the ton of issues
described in [1] is not a wise move, IMO. And there's no time to
implement this before 8.0.0.Final.
Radim
[1] https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-I...
On 08/05/2015 11:57 PM, Sanne Grinovero wrote:
> I don't doubt Radim's code :) but I'm pretty confident that even that
> implementation is limited by the constraints of the general-purpose
> API.
>
> For example it seems Bela will soon allow more flexibility in JGroups
> regarding buffer representations. We need to commit to a stable API
> for end user integrations (shared cachestore implementors), but we
> also need to keep options open to soon play with other approaches.
>
> That's why I think this separation should be done before Infinispan
> 8.0.0.Final even if I don't have a concrete proposal for what this
> other API should look like: I don't presume to be able to anticipate
> which API exactly will be best, but I think we can all see that we
> will want to change it. There should be a private internal contract
> which we can change even in micro versions without compatibility
> concerns, so as to allow R&D progress in the most performance-sensitive
> areas without this being a problem for integrators and users.
>
> Better configuration validations are additional (strong) benefits:
> we've seen lots of misunderstandings about which CacheStores /
> configuration combinations are valid.
>
> Thanks,
> Sanne
>
> On 5 August 2015 at 22:13, Dan Berindei <dan.berindei(a)gmail.com> wrote:
>> On Fri, Jul 31, 2015 at 3:30 PM, Sanne Grinovero <sanne(a)infinispan.org> wrote:
>>> On 20 July 2015 at 11:02, Dan Berindei <dan.berindei(a)gmail.com> wrote:
>>>> Sanne, I think changing the cache store API is actually the most
>>>> painful part, so we should only do it if we gain a concrete advantage
>>>> from doing it. From a compatibility point of view, implementing a new
>>>> interface vs implementing the same interface with completely different
>>>> methods is just as bad.
>>> Right, from that perspective it's a quite horrible proposal.
>>>
>>> But I think we can agree that only the "SharedCacheStore" deserves to
>>> be considered an SPI, right?
>>> That's the one people will normally customize to map stuff to other
>>> stores one might have.
>>>
>>> I think it's important that beyond the Infinispan 8.0 API freeze, we can
>>> make any change to the non-shared SPI
>>> without affecting users who implement a custom shared cachestore.
>>>
>>> I highly doubt someone will implement a high-performance custom
>>> off-heap swap strategy, but if someone does, they should contribute it
>>> and will probably need to make integration level changes.
>>>
>>> We probably won't have the time to implement a new super efficient
>>> local-only cachestore to replace the leveldb one, but I'd like to keep
>>> the possibility open to do that beyond 8.0, *especially* without
>>> breaking compatibility for other people.
>> We already have a new super efficient local-only cachestore :)
>>
>> https://github.com/infinispan/infinispan/tree/master/persistence/soft-index
>>
>>> Sanne
>>>
>>>
>>>> On Mon, Jul 20, 2015 at 12:41 PM, Sanne Grinovero <sanne(a)infinispan.org> wrote:
>>>>> +1 for incremental changes..
>>>>>
>>>>> I'd see the first step as defining two different interfaces;
>>>>> essentially we need to choose two good names.
>>>>>
>>>>> Then we could have both interfaces still implement the same identical
>>>>> methods, but go through each implementation and decide to "mark" it as
>>>>> shared-only or never-shared.
>>>>>
>>>>> That would make it simpler to make concrete change proposals on each
>>>>> of them and start taking some advantage from the split. I think you'll
>>>>> need the two different interfaces to implement the validations you
>>>>> mentioned.
>>>>>
>>>>> For Infinispan 8's goals, I'd be happy enough to keep the
>>>>> "shared-only" interface quite similar to the current one, but mark the
>>>>> never-shared one as a private or experimental SPI to allow ourselves
>>>>> some more flexibility in performance oriented changes.
>>>>>
>>>>> Thanks,
>>>>> Sanne
>>>>>
>>>>> On 20 July 2015 at 10:07, Tristan Tarrant <ttarrant(a)redhat.com> wrote:
>>>>>> Sanne, well written.
>>>>>> Before actually implementing any of the optimizations/changes you
>>>>>> mention, I think the lowest-hanging fruit we should grab now is just to
>>>>>> add checks to all of our cachestores to actually throw an exception when
>>>>>> they are being enabled in unsupported configurations.
>>>>>>
>>>>>> I've created [1] to get us started
>>>>>>
>>>>>> Tristan
>>>>>>
>>>>>> [1] https://issues.jboss.org/browse/ISPN-5617
>>>>>>
>>>>>> On 16/07/2015 15:32, Sanne Grinovero wrote:
>>>>>>> I would like to propose a clear-cut separation between our shared and
>>>>>>> non-shared CacheStores, in all respects, such as:
>>>>>>> - Configuration options
>>>>>>> - Integration contracts (Split the CacheStore SPI)
>>>>>>> - Implementations
>>>>>>> - Terminology, to avoid any further confusion around valid
>>>>>>> configurations and sensible architectures
>>>>>>>
>>>>>>> We have loads of examples of users who get in trouble by configuring
>>>>>>> one incorrectly, but also there are plenty of efficiency improvements
>>>>>>> we could take advantage of by clearly splitting the integration points
>>>>>>> and the implementations in two categories.
>>>>>>>
>>>>>>> Not least, it's a very common and dangerous pitfall to assume that
>>>>>>> Infinispan is able to restore a consistent state after having stopped
>>>>>>> a DIST cluster which passivated into non-shared CacheStore instances,
>>>>>>> or even REPL clusters when they don't shut down all at the same exact
>>>>>>> time (and "exact same time" is a strange concept, to say the least). We
>>>>>>> need to clarify the different options, tradeoffs and their consequences
>>>>>>> to users and ourselves, as a clearly defined use case will avoid bugs
>>>>>>> and simplify implementations.
>>>>>>>
>>>>>>> # The purpose of each
>>>>>>> I think that people should use a non-shared (local?) CacheStore for
>>>>>>> the sole purpose of expanding the storage capacity of each single
>>>>>>> node, be it because you don't have enough memory at all, or be it
>>>>>>> because you prefer some extra safety margin because either your
>>>>>>> estimates are complex, or maybe because we live in a real world where
>>>>>>> the hashing function might not be perfect in practice. I hope we all
>>>>>>> agree that Infinispan should be able to take such situations with at
>>>>>>> worst a graceful performance degradation, rather than complain by
>>>>>>> sending OOMs to the admin and setting the service on strike.
>>>>>>>
>>>>>>> A Shared CacheStore is useful for very different purposes; primarily
>>>>>>> to implement a Cache on some other service - for example your (single,
>>>>>>> shared) RDBMS, a slow (or expensive) webservice your organization has
>>>>>>> to call frequently, etc. Or it's useful even as a write-through cache
>>>>>>> on a similar service, maybe internal but not able to handle the high
>>>>>>> variation of load spikes which Infinispan can handle better.
>>>>>>> Finally, a great use case is to have a consistent backup of all your
>>>>>>> data-grid content, possibly in some "reference" form such as JPA
>>>>>>> mapped entities.
>>>>>>>
>>>>>>> # Benefits of a Non-Shared
>>>>>>> A non-shared CacheStore implementor should be able to take advantage
>>>>>>> of *its purpose*; among the big ones I see:
>>>>>>> - Exclusive usage -> locking of a specific entry can be handled at
>>>>>>> datacontainer level, which can simplify quite some internal code.
>>>>>>> - Reliability -> since a clustered node needs to wipe its state at
>>>>>>> reboot (after a crash), it's much simpler to code any such CacheStore
>>>>>>> to avoid any form of disk sync or persistence guarantees.
>>>>>>> - Encoding format -> this can be controlled entirely by Infinispan,
>>>>>>> with no need to keep factors like rolling-upgrade-compatible encodings
>>>>>>> in mind. JBoss Marshalling would be good enough, or some
>>>>>>> implementations might not need to serialize at all.
>>>>>>>
>>>>>>> Our non-shared CacheStore implementation(s) could take advantage of
>>>>>>> lower level, more complex code optimisations and interfaces, as users
>>>>>>> would rarely want to customize one of these, while the use case of
>>>>>>> mapping data to a shared service needs a more user friendly SPI so as
>>>>>>> to keep it simple to plug in custom stores: custom data formats, custom
>>>>>>> connectors, get some help in implementing concurrency correctly.
>>>>>>> Proper Transaction integration for the CacheStore has been on our
>>>>>>> wishlist for some time too. I suspect that accepting that we have been
>>>>>>> mixing up two different things under the same name so far would make it
>>>>>>> simpler to implement further improvements such as transactions: the
>>>>>>> way to do such a thing is very different in each of these use cases,
>>>>>>> so it would help at least to implement it on a subset first, or maybe
>>>>>>> on a subset only, if it turns out there's no need for such things in
>>>>>>> the context of the local-only dedicated "swapfile".
>>>>>>>
>>>>>>> # Mixed types should be killed
>>>>>>> I'm aware that some of our current implementations _could_ work both as
>>>>>>> shared or non-shared, for example the JDBC or JPACacheStore or the
>>>>>>> Remote Cachestore.. but in most cases it doesn't make much sense. Why
>>>>>>> would you ever want to use the JPACacheStore if not to share data with
>>>>>>> a _shared_ database?
>>>>>>>
>>>>>>> We should take such options away, and by doing so focus on the use
>>>>>>> cases which actually matter and simplify the implementations and
>>>>>>> improve the configuration validations.
>>>>>>>
>>>>>>> If ever a compelling storage technology is identified which we'd like to
>>>>>>> offer as an option for both shared or non-shared, I would still
>>>>>>> recommend to make two different implementations, as there certainly are
>>>>>>> different requirements and assumptions when coding such a thing.
>>>>>>>
>>>>>>> Not least, I would very much like to see a default local CacheStore:
>>>>>>> picking one for local "emergency swapping" should be a no-brainer for
>>>>>>> users; we could set one up by default and not bother newcomers with
>>>>>>> complex choices.
>>>>>>>
>>>>>>> If we simplify the requirement of such a thing, it should be easy to
>>>>>>> write one on standard Java NIO2 APIs and get rid of the complexities of
>>>>>>> maintaining the native integration with things like LevelDB, not least
>>>>>>> the inefficiency of Java to make such native calls.
>>>>>>>
>>>>>>> Then as a second step, we should attack the other use case: backups;
>>>>>>> from a *purpose driven perspective* I'd then see us revive the Cassandra
>>>>>>> integration; obviously as a shared-only option.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Sanne
>>>>>>> _______________________________________________
>>>>>>> infinispan-dev mailing list
>>>>>>> infinispan-dev(a)lists.jboss.org
>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>>>
>>>>>> --
>>>>>> Tristan Tarrant
>>>>>> Infinispan Lead
>>>>>> JBoss, a division of Red Hat