I understand that shared cache stores will be implemented more often,
but I don't think that non-shared stores should be considered a
'private interface'. Still, separating them would really give us the
opportunity to change this non-shared SPI more often if needed, without
breaking the shared one.
However, hot-gluing a new cool interface without a reference
implementation that supports transactions and solves the ton of issues
described in [1] is not a wise move, IMO. And there's no time to
implement this before 8.0.0.Final.
Radim
[1]
I don't doubt Radim's code :) but I'm pretty confident that even that
implementation is limited by the constraints of the general-purpose
API.
For example it seems Bela will soon allow more flexibility in JGroups
regarding buffer representations. We need to commit to a stable API
for end user integrations (shared cachestore implementors), but we
also need to keep options open to soon play with other approaches.
That's why I think this separation should be done before Infinispan
8.0.0.Final even if I don't have a concrete proposal for what this
other API should look like: I don't presume to be able to anticipate
which API exactly will be best, but I think we can all see that we
will want to change that. There should be a private internal contract
which we can change even in micro versions without concerns of
compatibility, so as to allow R&D progress in the most performance
sensitive areas w/o this being a problem for integrators and users.
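
To make the separation itself concrete (and only the separation, this is not
a proposal for what the internal API should look like), here is a minimal
Java sketch. The names SharedCacheStore, LocalCacheStore and
InMemoryLocalStore are hypothetical and are not existing Infinispan types;
both contracts start with the same methods, and only the shared one would be
treated as a stable SPI.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stable SPI: what integrators of a shared store would implement. */
interface SharedCacheStore<K, V> {
    V load(K key);
    void write(K key, V value);
    boolean delete(K key);
}

/**
 * Hypothetical internal contract for node-local stores. It starts out with
 * the same methods, but lives apart so it can change without touching the SPI.
 */
interface LocalCacheStore<K, V> {
    V load(K key);
    void write(K key, V value);
    boolean delete(K key);
}

/** Trivial never-shared implementation, only to show which side it sits on. */
final class InMemoryLocalStore<K, V> implements LocalCacheStore<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    @Override
    public V load(K key) {
        return entries.get(key);
    }

    @Override
    public void write(K key, V value) {
        entries.put(key, value);
    }

    @Override
    public boolean delete(K key) {
        // true only if an entry was actually removed
        return entries.remove(key) != null;
    }
}
```

Since each implementation picks exactly one of the two types, any later
change to LocalCacheStore would not be visible to code written against
SharedCacheStore.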
Better configuration validations are additional (strong) benefits:
we've seen lots of misunderstandings about which CacheStores /
configuration combinations are valid.
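
A rough sketch of the kind of validation this would enable, in Java. It
assumes each store implementation declares whether it is shared-only or
never-shared; StoreKind and StoreConfigValidator are made-up names for
illustration, and the single rule checked here (the declared kind must match
the configured "shared" flag) is an assumption, not the actual Infinispan
configuration logic.

```java
/** Hypothetical classification that each store implementation would declare. */
enum StoreKind { SHARED_ONLY, NEVER_SHARED }

/** Hypothetical fail-fast check, run when a cache configuration is built. */
final class StoreConfigValidator {
    private StoreConfigValidator() {
    }

    /**
     * Rejects a store whose declared kind contradicts the configuration.
     *
     * @param storeName          name used in the error message only
     * @param kind               what the implementation declares it supports
     * @param configuredAsShared the value of the "shared" setting in the config
     */
    static void validate(String storeName, StoreKind kind, boolean configuredAsShared) {
        if (kind == StoreKind.SHARED_ONLY && !configuredAsShared) {
            throw new IllegalArgumentException(
                    storeName + " can only be used as a shared store");
        }
        if (kind == StoreKind.NEVER_SHARED && configuredAsShared) {
            throw new IllegalArgumentException(
                    storeName + " cannot be configured as a shared store");
        }
    }
}
```

With something like this, a misconfiguration would fail at startup with a
clear message instead of surfacing later as inconsistent data.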
Thanks,
Sanne
On 5 August 2015 at 22:13, Dan Berindei <dan.berindei(a)gmail.com> wrote:
> On Fri, Jul 31, 2015 at 3:30 PM, Sanne Grinovero <sanne(a)infinispan.org> wrote:
>> On 20 July 2015 at 11:02, Dan Berindei <dan.berindei(a)gmail.com> wrote:
>>> Sanne, I think changing the cache store API is actually the most
>>> painful part, so we should only do it if we gain a concrete advantage
>>> from doing it. From a compatibility point of view, implementing a new
>>> interface vs implementing the same interface with completely different
>>> methods is just as bad.
>> Right, from that perspective it's a quite horrible proposal.
>>
>> But I think we can agree that only the "SharedCacheStore" deserves to
>> be considered an SPI, right?
>> That's the one people will normally customize to map stuff to other
>> stores one might have.
>>
>> I think it's important that beyond the Infinispan 8.0 API freeze, we can
>> make any change to the non-shared SPI without affecting users who
>> implement a custom shared cachestore.
>>
>> I highly doubt someone will implement a high-performance custom off
>> heap swap strategy, but if someone does he should contribute it and
>> will probably need to make integration level changes.
>>
>> We probably won't have the time to implement a new super efficient
>> local-only cachestore to replace the leveldb one, but I'd like to keep
>> the possibility open to do that beyond 8.0, *especially* without
>> breaking compatibility for other people.
> We already have a new super efficient local-only cachestore :)
>
> https://github.com/infinispan/infinispan/tree/master/persistence/soft-index
>
>> Sanne
>>
>>
>>> On Mon, Jul 20, 2015 at 12:41 PM, Sanne Grinovero <sanne(a)infinispan.org> wrote:
>>>> +1 for incremental changes..
>>>>
>>>> I'd see the first step as defining two different interfaces;
>>>> essentially we need to choose two good names.
>>>>
>>>> Then we could have both interfaces still implement the same identical
>>>> methods, but go through each implementation and decide to "mark" it as
>>>> shared-only or never-shared.
>>>>
>>>> That would make it simpler to make concrete change proposals on each
>>>> of them and start taking some advantage from the split. I think you'll
>>>> need the two different interfaces to implement the validations you
>>>> mentioned.
>>>>
>>>> For Infinispan 8's goals, I'd be happy enough to keep the
>>>> "shared-only" interface quite similar to the current one, but mark the
>>>> never-shared one as a private or experimental SPI to allow ourselves
>>>> some more flexibility in performance oriented changes.
>>>>
>>>> Thanks,
>>>> Sanne
>>>>
>>>>> On 20 July 2015 at 10:07, Tristan Tarrant <ttarrant(a)redhat.com> wrote:
>>>>> Sanne, well written.
>>>>> Before actually implementing any of the optimizations/changes you
>>>>> mention, I think the lowest-hanging fruit we should grab now is just to
>>>>> add checks to all of our cachestores to actually throw an exception when
>>>>> they are being enabled in unsupported configurations.
>>>>>
>>>>> I've created [1] to get us started
>>>>>
>>>>> Tristan
>>>>>
>>>>> [1] https://issues.jboss.org/browse/ISPN-5617
>>>>>
>>>>> On 16/07/2015 15:32, Sanne Grinovero wrote:
>>>>>> I would like to propose a clear-cut separation between our shared and
>>>>>> non-shared CacheStores, in all terms such as:
>>>>>> - Configuration options
>>>>>> - Integration contracts (Split the CacheStore SPI)
>>>>>> - Implementations
>>>>>> - Terminology, to avoid any further confusion around valid
>>>>>> configurations and sensible architectures
>>>>>>
>>>>>> We have loads of examples of users who get in trouble by configuring
>>>>>> one incorrectly, but also there are plenty of efficiency improvements
>>>>>> we could take advantage of by clearly splitting the integration points
>>>>>> and the implementations in two categories.
>>>>>>
>>>>>> Not least, it's a very common and dangerous pitfall to assume that
>>>>>> Infinispan is able to restore a consistent state after having stopped
>>>>>> a DIST cluster which passivated into non-shared CacheStore instances,
>>>>>> or even REPL clusters when they don't shut down all at the exact same
>>>>>> time (and "exact same time" is a strange concept at least..). We need
>>>>>> to clarify the different options, tradeoffs and their consequences
>>>>>> to users and ourselves, as a clearly defined use case will avoid bugs
>>>>>> and simplify implementations.
>>>>>>
>>>>>> # The purpose of each
>>>>>> I think that people should use a non-shared (local?) CacheStore for
>>>>>> the sole purpose of expanding the storage capacity of each single
>>>>>> node.. be it because you don't have enough memory at all, or be it
>>>>>> because you prefer some extra safety margin because either your
>>>>>> estimates are complex, or maybe because we live in a real world where
>>>>>> the hashing function might not be perfect in practice. I hope we all
>>>>>> agree that Infinispan should be able to take such situations with at
>>>>>> worst a graceful performance degradation, rather than complain by
>>>>>> sending OOMs to the admin and setting the service on strike.
>>>>>>
>>>>>> A Shared CacheStore is useful for very different purposes; primarily
>>>>>> to implement a Cache on some other service - for example your (single,
>>>>>> shared) RDBMS, a slow (or expensive) webservice your organization has
>>>>>> to call frequently, etc.. Or it's useful even as a write-through cache
>>>>>> on a similar service, maybe internal but not able to handle the high
>>>>>> variation of load spikes which Infinispan can handle better.
>>>>>> Finally, a great use case is to have a consistent backup of all your
>>>>>> data-grid content, possibly in some "reference" form such as JPA
>>>>>> mapped entities.
>>>>>>
>>>>>> # Benefits of a Non-Shared
>>>>>> A non-shared CacheStore implementor should be able to take advantage
>>>>>> of *its purpose*, among the big ones I see:
>>>>>> - Exclusive usage -> locking of a specific entry can be handled at
>>>>>> datacontainer level, can simplify quite some internal code.
>>>>>> - Reliability -> since a clustered node needs to wipe its state at
>>>>>> reboot (after a crash), it's much simpler to code any such CacheStore
>>>>>> to avoid any form of disk sync or persistence guarantees.
>>>>>> - Encoding format -> this can be controlled entirely by Infinispan,
>>>>>> and no need to take factors like rolling upgrade compatible encodings
>>>>>> in mind. JBoss Marshalling would be good enough, or some
>>>>>> implementations might not need to serialize at all.
>>>>>>
>>>>>> Our non-shared CacheStore implementation(s) could take advantage of
>>>>>> lower level more complex code optimisations and interfaces, as users
>>>>>> would rarely want to customize one of these, while the use case of
>>>>>> mapping data to a shared service needs a more user friendly SPI so as to
>>>>>> keep it simple to plug in custom stores: custom data formats, custom
>>>>>> connectors, get some help in implementing concurrency correctly.
>>>>>> Proper Transaction integration for the CacheStore has been on our
>>>>>> wishlist for some time too. I suspect that accepting that we have been
>>>>>> mixing up two different things under the same name so far would make it
>>>>>> simpler to implement further improvements such as transactions: the
>>>>>> way to do such a thing is very different in each of these use cases,
>>>>>> so it would help at least to implement it on a subset first, or maybe
>>>>>> only on that subset if it turns out there's no need for such things in
>>>>>> the context of the local-only, dedicated "swapfile".
>>>>>>
>>>>>> # Mixed types should be killed
>>>>>> I'm aware that some of our current implementations _could_ work both as
>>>>>> shared or non-shared, for example the JDBC or JPACacheStore or the
>>>>>> Remote Cachestore.. but in most cases it doesn't make much sense. Why
>>>>>> would you ever want to use the JPACacheStore if not to share data with
>>>>>> a _shared_ database?
>>>>>>
>>>>>> We should take such options away, and by doing so focus on the use
>>>>>> cases which actually matter and simplify the implementations and
>>>>>> improve the configuration validations.
>>>>>>
>>>>>> If ever a compelling storage technology is identified which we'd like to
>>>>>> offer as an option for both shared or non-shared, I would still
>>>>>> recommend to make two different implementations, as there certainly are
>>>>>> different requirements and assumptions when coding such a thing.
>>>>>>
>>>>>> Not least, I would very much like to see a default local CacheStore:
>>>>>> picking one for local "emergency swapping" should be a no-brainer for
>>>>>> users; we could set up one by default and not bother newcomers with
>>>>>> complex choices.
>>>>>>
>>>>>> If we simplify the requirement of such a thing, it should be easy to
>>>>>> write one on standard Java NIO2 APIs and get rid of the complexities of
>>>>>> maintaining the native integration with things like LevelDB, not least
>>>>>> the inefficiency of Java to make such native calls.
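
For illustration, the NIO2 idea quoted above could look roughly like the Java
sketch below: one file per key in a fresh temporary directory, with no fsync
and no attempt to survive a restart. FileSwapStore and its methods are
hypothetical names, not Infinispan code, and the one-file-per-key layout is
an assumption chosen for brevity rather than performance.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical node-local "swap" store built only on java.nio.file. */
final class FileSwapStore {
    private final Path dir;

    FileSwapStore() {
        try {
            // A new directory per start: old state is never read back.
            this.dir = Files.createTempDirectory("swap-store");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps a key to a file name that is safe on common file systems. */
    private Path fileFor(String key) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("key must not be empty");
        }
        String name = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(key.getBytes(StandardCharsets.UTF_8));
        return dir.resolve(name);
    }

    void write(String key, byte[] value) {
        try {
            // Default options create or truncate the file; nothing is synced.
            Files.write(fileFor(key), value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the stored bytes, or null when the key is not present. */
    byte[] load(String key) {
        Path file = fileFor(key);
        try {
            return Files.exists(file) ? Files.readAllBytes(file) : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean delete(String key) {
        try {
            return Files.deleteIfExists(fileFor(key));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch does no locking of its own, on the assumption stated in the
quoted message that exclusive usage lets entry locking happen at the data
container level.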
>>>>>>
>>>>>> Then as a second step, we should attack the other use case: backups;
>>>>>> from a *purpose driven perspective* I'd then see us revive the Cassandra
>>>>>> integration; obviously as a shared-only option.
>>>>>>
>>>>>> Cheers,
>>>>>> Sanne
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev(a)lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>>
>>>>> --
>>>>> Tristan Tarrant
>>>>> Infinispan Lead
>>>>> JBoss, a division of Red Hat