Re: [infinispan-dev] Write-only commands

Thursday, 29 June 2017

On 06/29/2017 10:23 AM, Dan Berindei wrote:
...
 On Wed, Jun 28, 2017 at 5:25 PM, Radim Vansa
<rvansa(a)redhat.com&gt; wrote:
> On 06/28/2017 01:17 PM, Radim Vansa wrote:
>> On 06/28/2017 10:40 AM, Dan Berindei wrote:
>>> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com&gt;
wrote:
>>>> On 06/27/2017 03:54 PM, Dan Berindei wrote:
>>>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor
<anistor(a)redhat.com&gt; wrote:
>>>>>> I've said this in a previous thread on this same issue, I
will repeat myself
>>>>>> as many times as needed.
>>>>>>
>>>>>> Continuous queries require the previous value itself, not just
knowledge of
>>>>>> the type of the previous value. Strongly typed caches solve no
problem here.
>>>>>>
>>>>>> So if we half-fix query but leave CQ broken I will be half-happy
(ie. very
>>>>>> depressed) :)
>>>>>>
>>>>>> I'd remove these commands completely or possibly remove them
just from
>>>>>> public API and keep them internal.
>>>>>>
>>>>> +1 to remove the flags from the public API. Most of them are not
safe
>>>>> for applications to use, and ignoring them when they can lead to
>>>>> inconsistencies would make them useless.
>>>>>
>>>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache
doesn't
>>>>> know when it is safe to skip the delete statement, and it relies on
>>>>> the application making a (possibly wrong) choice.
>>>>>
>>>>> IGNORE_RETURN_VALUES should be safe to use, and we actually
recommend
>>>>> that applications use it right now. If query or listeners need the
>>>>> previous value, then we should load it internally, but hide it from
>>>>> the user.
>>>>>
>>>>> But removing it opens another discussion: should we replace it in
the
>>>>> public API with a new method AdvancedCache.ignoreReturnValues(), or
>>>>> should we make it the default and add a method
>>>>> AdvancedCache.forceReturnPreviousValues()?
>>>> Please don't derail the thread.
>>>>
>>> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also
>>> breaks the previous value for listeners, even if the QueryInterceptor
>>> removes it from write commands. And it is public (+recommended) API,
>>> in fact most if not all of our performance tests use it.
>> That's just a flawed implementation. IPV is documented to be a
'safe'
>> flag that should affect mostly primary -> origin replication, all the
>> other is implementation. And we can fix that. Users should *not* expect
>> that it e.g. skips loading from a cache store. We have already removed
>> the modes that would be broken-by-design.
>>
>> On the other hand, write-only commands are not about *returning* the
>> value but about (not) *reading* it, therefore (in my eyes) user could
>> make that assumption and would like to enforce it this way. Even some
>> docs explaining PersistenceMode.SKIP suggest that.
>>
>> I don't want to talk about flags, because I see all flags but IPV as
>> 'effectively internal'. Let's discuss it more high-level. Some API
>> exposes non-reading operation - we can see that under some circumstances
>> this is not possible so we have options to 1) break stuff 2) break API
>> assumptions 3) sometimes break API assumptions 4) remove such API (to
>> not allow the user to make such assumptions). There's also an option 5)
>> to fail the operation if the API assumption would be broken. Though, I
>> don't fancy getting exception from a WriteOnlyMap.eval just because
>> someone has registered a listener.
>>
>>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the
>>> previous value on backup owners for most write commands
>>> (LoadType.PRIMARY), we'd need to change that as well.
>> Yes, all commands will have to load current value on all owners.
> Btw., this has definitely some impact on shared cachestores. It's not
> just that you have to load the value from cachestore - you also need to
> store the previous value along with the current value (in metadata),
> because if the primary owner crashes and the backups don't have it in
> memory, these need to check the persistence to see if the command was
> already executed.
>
 I don't think this would work with the triangle algorithm. The primary
 owner updates the shared store without waiting for acks from the
 backups, so a backup trying to load the previous value is very likely
 to find the updated value instead. 
With entry version history you'll be able to find the previous value 
even if it was updated, as long as the operation did not complete. 
That's why it's called a 'history'.

...

 I'll hazard a guess that none of the operations using LoadType.OWNER
 currently works with triangle + shared store. 
IIRC triangle does not implement functional commands yet, and these 
don't work during retries anyway. ApplyDelta and put with delta-aware 
flag are about to go (and are probably broken in other interesting ways) 
and then there's the invalidation command that's set to OWNER to have 
more reliable listeners.

...

> This means that the metadata could grow up to several times the size of
> the value itself (if there's high concurrency). And even worse with
> async caches, as these don't know when the command was actually finished
> and therefore rely on time-based expiration of the previous values :-/
>
> Non-shared cachestores don't need to store previous values as long if we
> can guarantee (flag) the entry as not eligible for eviction.
>
>>>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 27 Jun 2017 10:13, "Radim Vansa"
<rvansa(a)redhat.com&gt; wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am working on entry version history (again). In Como we've
discussed
>>>>>> that previous values are needed for (continuous) query and
reliable
>>>>>> listeners,
>>>>>>
>>>>>>
>>>>>> Index based queries also require the previous value on a write -
unless we
>>>>>> can get "strongly typed caches" giving guarantees about
the class to
>>>>>> represent the content of a cache to be unique.
>>>>>>
>>>>>> Essentially we only need to know the type of the previous object.
It might
>>>>>> be worth having a way to load the type metadata if the previous
value only.
>>>>>>
>>>>>> so I wonder what should we do with functional write-only
>>>>>> commands. These are different to commands with flags, because
flags
>>>>>> (other than ignore return value) are expected to break
something.
>>>>>>
>>>>>>
>>>>>> Sorry I hope to not derail the thread but let's remind that
we hope to
>>>>>> evolve beyond "flags are expected to break stuff" ; we
never got to it but
>>>>>> search the mailing list.
>>>>>>
>>>>>> Since flags are exposed to the user I would rather they're
not allowed to
>>>>>> break things.
>>>>>> Could they be treated as hints? Ignore the flag (and warn?) if
the used
>>>>>> configuration/integrations veto them.
>>>>>>
>>>>>> Alternatively, let's remove them from API. Remember "The
Jokre" POC was
>>>>>> intentionally designed to explore pushing the limits on
performance w/o end
>>>>>> users having to solve puzzles, such as learning details about
these flags
>>>>>> and their possible side effects.
>>>>>>
>>>>>> So assuming they become either "safe" or internal,
maybe you can take
>>>>>> advantage of them?
>>>>>>
>>>>>> I see
>>>>>> the available options as:
>>>>>>
>>>>>> 1) run write-only commands 'optimized', ignoring any
querying and such
>>>>>> (warn user that he will break it)
>>>>>>
>>>>>> 2) run write-only without any optimization, rendering them
useless
>>>>>>
>>>>>> 3) detect when querying is set up (ignoring listeners and maybe
other
>>>>>> stuff that could get broken)
>>>>>>
>>>>>>
>>>>>> Might be useful for making a POC work, but I believe query will
be very
>>>>>> likely to be often enabled.
>>>>>> Having an either / or switch for different features in Infinispan
will make
>>>>>> it harder to use and understand, so I'd rather see work on
the right design
>>>>>> as taking temporary shortcuts risks baking into stone features
which we
>>>>>> later struggle to fix or maintain.
>>>>>>
>>>>> I vote for this option.
>>>>>
>>>>> Query, listeners, and other components that need the previous value
>>>>> should not just assume that the application knows better, they
should
>>>>> be able to change how operations works based on their needs. Of
>>>>> course, the reverse is also true: if the application uses write-only
>>>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it
should
>>>>> be possible for the user to detect why the previous values are still
>>>>> loaded.
>>>> If it were just query (static configuration), I would be okay with this
>>>> idea. But as per listeners - besides tainting the design (event source
>>>> should not check if there's a listener) you'd need to check
*before*
>>> The source wouldn't check for listeners explicitly, the notifier would
>>> have an isPreviousValueNeeded() method and precompute that before a
>>> listener is added or after a listener is removed. I was am assuming
>>> some listeners will not need the previous value, e.g. the listeners
>>> installed by streams.
>> You can cover your warts with a make-up but you'll still have warts :)
>>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform,
>>>> EWI). So this is a space for race conditions or weird handling (if
>>>> there's a listener when I am about to call notify and my flags are
not
>>>> cleared, skip the notification and pretend that this code was invoked
>>>> before the listener was registered...). Or do you have another solution
>>>> in mind (config option to disable listeners && all features using
those?).
>>>>
>>> I was definitely going for the weird handling...
>>>
>>> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when
>>> it's loaded, and check that before invoking a listener that needs the
>>> previous value. It is missing one edge case: if one thread starts a
>>> write operation, then another thread installs a listener that requires
>>> the previous value and iterates over the cache, the second thread may
>>> not see the value written by the first thread.
>> If the operations overlap, you could pretend that the write has finished
>> before the listener was invoked and simply not notify the listener. If I
>> am missing it please write it down in code. But handling this in any way
>> is still clumsy.
>>> So now I'm thinking we should retry the write commands when
>>> isPreviousValueNeeded() changes... Not very appealing, but I think the
>>> performance difference is worth it.
>>>
>>>> R.
>>>>
>>>>>> 4) remove write-only commands completely (and probably
functional
>>>>>> listeners as well because these will lose their purpose)
>>>>>>
>>>>>>
>>>>>> +1 to remove "unconditional writes", at least an entry
version check should
>>>>>> be applied.
>>>>>> I believe we had already pointed out this would eventually
happen, pretty
>>>>>> much for the reasons you're hitting now.
>>>>>>
>>>>> IMO version checks should be done internally, we shouldn't force
the
>>>>> users of the functional API to deal with versions themselves because
>>>>> we know how hard making write skew checks work is for us :)
>>>>>
>>>>> And I wouldn't go as far as to remove the functional listeners,
>>>>> instead I would change them so that read-write listeners are invoked
>>>>> on write-only operations and they force the loading of the previous
>>>>> value. I would also add a way for the regular listeners to say
whether
>>>>> they need the previous value or not.
>>>>>
>>>>>> Right now I am inclined towards 4). There could be some internal
use
>>>>>> (e.g. multimaps) that could use 1) which is ran without a fancy
setup,
>>>>>> though, but it's asking for trouble.
>>>>>>
>>>>>>
>>>>>> I agree!
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>> WDYT?
>>>>>>
>>>>>> Radim
>>>>>>
>>>>> Cheers
>>>>> Dan
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev(a)lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Radim Vansa <rvansa(a)redhat.com&gt;
> JBoss Performance Team
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev 

-- 
Radim Vansa <rvansa(a)redhat.com&gt;
JBoss Performance Team

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [infinispan-dev] Write-only commands