Write-only commands

Feedback for PR 5233 needed

Jenkins SSL and signup

Radim Vansa

Tuesday, 27 June 2017 Tue, 27 Jun '17

2:31 a.m.

Hi, I am working on entry version history (again). In Como we've discussed that previous values are needed for (continuous) query and reliable listeners, so I wonder what should we do with functional write-only commands. These are different to commands with flags, because flags (other than ignore return value) are expected to break something. I see the available options as: 1) run write-only commands 'optimized', ignoring any querying and such (warn user that he will break it) 2) run write-only without any optimization, rendering them useless 3) detect when querying is set up (ignoring listeners and maybe other stuff that could get broken) 4) remove write-only commands completely (and probably functional listeners as well because these will lose their purpose) Right now I am inclined towards 4). There could be some internal use (e.g. multimaps) that could use 1) which is ran without a fancy setup, though, but it's asking for trouble. WDYT? Radim -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Show replies by date

Sanne Grinovero

Tuesday, 27 June Tue, 27 Jun

5:28 a.m.

On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: Hi, I am working on entry version history (again). In Como we've discussed that previous values are needed for (continuous) query and reliable listeners, Index based queries also require the previous value on a write - unless we can get "strongly typed caches" giving guarantees about the class to represent the content of a cache to be unique. Essentially we only need to know the type of the previous object. It might be worth having a way to load the type metadata if the previous value only. so I wonder what should we do with functional write-only commands. These are different to commands with flags, because flags (other than ignore return value) are expected to break something. Sorry I hope to not derail the thread but let's remind that we hope to evolve beyond "flags are expected to break stuff" ; we never got to it but search the mailing list. Since flags are exposed to the user I would rather they're not allowed to break things. Could they be treated as hints? Ignore the flag (and warn?) if the used configuration/integrations veto them. Alternatively, let's remove them from API. Remember "The Jokre" POC was intentionally designed to explore pushing the limits on performance w/o end users having to solve puzzles, such as learning details about these flags and their possible side effects. So assuming they become either "safe" or internal, maybe you can take advantage of them? I see the available options as: 1) run write-only commands 'optimized', ignoring any querying and such (warn user that he will break it) 2) run write-only without any optimization, rendering them useless 3) detect when querying is set up (ignoring listeners and maybe other stuff that could get broken) Might be useful for making a POC work, but I believe query will be very likely to be often enabled. Having an either / or switch for different features in Infinispan will make it harder to use and understand, so I'd rather see work on the right design as taking temporary shortcuts risks baking into stone features which we later struggle to fix or maintain. 4) remove write-only commands completely (and probably functional listeners as well because these will lose their purpose) +1 to remove "unconditional writes", at least an entry version check should be applied. I believe we had already pointed out this would eventually happen, pretty much for the reasons you're hitting now. Right now I am inclined towards 4). There could be some internal use (e.g. multimaps) that could use 1) which is ran without a fancy setup, though, but it's asking for trouble. I agree! Thanks WDYT? Radim -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Adrian Nistor

6:43 a.m.

I've said this in a previous thread on this same issue, I will repeat myself as many times as needed. Continuous queries require the previous value itself, not just knowledge of the type of the previous value. Strongly typed caches solve no problem here. So if we half-fix query but leave CQ broken I will be half-happy (ie. very depressed) :) I'd remove these commands completely or possibly remove them just from public API and keep them internal. Adrian On 06/27/2017 01:28 PM, Sanne Grinovero wrote:

...

On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com <mailto:rvansa@redhat.com>> wrote: Hi, I am working on entry version history (again). In Como we've discussed that previous values are needed for (continuous) query and reliable listeners, Index based queries also require the previous value on a write - unless we can get "strongly typed caches" giving guarantees about the class to represent the content of a cache to be unique. Essentially we only need to know the type of the previous object. It might be worth having a way to load the type metadata if the previous value only. so I wonder what should we do with functional write-only commands. These are different to commands with flags, because flags (other than ignore return value) are expected to break something. Sorry I hope to not derail the thread but let's remind that we hope to evolve beyond "flags are expected to break stuff" ; we never got to it but search the mailing list. Since flags are exposed to the user I would rather they're not allowed to break things. Could they be treated as hints? Ignore the flag (and warn?) if the used configuration/integrations veto them. Alternatively, let's remove them from API. Remember "The Jokre" POC was intentionally designed to explore pushing the limits on performance w/o end users having to solve puzzles, such as learning details about these flags and their possible side effects. So assuming they become either "safe" or internal, maybe you can take advantage of them? I see the available options as: 1) run write-only commands 'optimized', ignoring any querying and such (warn user that he will break it) 2) run write-only without any optimization, rendering them useless 3) detect when querying is set up (ignoring listeners and maybe other stuff that could get broken) Might be useful for making a POC work, but I believe query will be very likely to be often enabled. Having an either / or switch for different features in Infinispan will make it harder to use and understand, so I'd rather see work on the right design as taking temporary shortcuts risks baking into stone features which we later struggle to fix or maintain. 4) remove write-only commands completely (and probably functional listeners as well because these will lose their purpose) +1 to remove "unconditional writes", at least an entry version check should be applied. I believe we had already pointed out this would eventually happen, pretty much for the reasons you're hitting now. Right now I am inclined towards 4). There could be some internal use (e.g. multimaps) that could use 1) which is ran without a fancy setup, though, but it's asking for trouble. I agree! Thanks WDYT? Radim -- Radim Vansa <rvansa(a)redhat.com <mailto:rvansa@redhat.com>> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org <mailto:infinispan-dev@lists.jboss.org> https://lists.jboss.org/mailman/listinfo/infinispan-dev <https://lists.jboss.org/mailman/listinfo/infinispan-dev> _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Dan Berindei

8:54 a.m.

On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote:

...

+1 to remove the flags from the public API. Most of them are not safe for applications to use, and ignoring them when they can lead to inconsistencies would make them useless. E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't know when it is safe to skip the delete statement, and it relies on the application making a (possibly wrong) choice. IGNORE_RETURN_VALUES should be safe to use, and we actually recommend that applications use it right now. If query or listeners need the previous value, then we should load it internally, but hide it from the user. But removing it opens another discussion: should we replace it in the public API with a new method AdvancedCache.ignoreReturnValues(), or should we make it the default and add a method AdvancedCache.forceReturnPreviousValues()?

...

On 06/27/2017 01:28 PM, Sanne Grinovero wrote: On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: Hi, I am working on entry version history (again). In Como we've discussed that previous values are needed for (continuous) query and reliable listeners, Index based queries also require the previous value on a write - unless we can get "strongly typed caches" giving guarantees about the class to represent the content of a cache to be unique. Essentially we only need to know the type of the previous object. It might be worth having a way to load the type metadata if the previous value only. so I wonder what should we do with functional write-only commands. These are different to commands with flags, because flags (other than ignore return value) are expected to break something. Sorry I hope to not derail the thread but let's remind that we hope to evolve beyond "flags are expected to break stuff" ; we never got to it but search the mailing list. Since flags are exposed to the user I would rather they're not allowed to break things. Could they be treated as hints? Ignore the flag (and warn?) if the used configuration/integrations veto them. Alternatively, let's remove them from API. Remember "The Jokre" POC was intentionally designed to explore pushing the limits on performance w/o end users having to solve puzzles, such as learning details about these flags and their possible side effects. So assuming they become either "safe" or internal, maybe you can take advantage of them? I see the available options as: 1) run write-only commands 'optimized', ignoring any querying and such (warn user that he will break it) 2) run write-only without any optimization, rendering them useless 3) detect when querying is set up (ignoring listeners and maybe other stuff that could get broken) Might be useful for making a POC work, but I believe query will be very likely to be often enabled. Having an either / or switch for different features in Infinispan will make it harder to use and understand, so I'd rather see work on the right design as taking temporary shortcuts risks baking into stone features which we later struggle to fix or maintain.

I vote for this option. Query, listeners, and other components that need the previous value should not just assume that the application knows better, they should be able to change how operations works based on their needs. Of course, the reverse is also true: if the application uses write-only commands (or IGNORE_RETURN_VALUES) for performance reasons, it should be possible for the user to detect why the previous values are still loaded.

...

4) remove write-only commands completely (and probably functional listeners as well because these will lose their purpose) +1 to remove "unconditional writes", at least an entry version check should be applied. I believe we had already pointed out this would eventually happen, pretty much for the reasons you're hitting now.

IMO version checks should be done internally, we shouldn't force the users of the functional API to deal with versions themselves because we know how hard making write skew checks work is for us :) And I wouldn't go as far as to remove the functional listeners, instead I would change them so that read-write listeners are invoked on write-only operations and they force the loading of the previous value. I would also add a way for the regular listeners to say whether they need the previous value or not.

...

Right now I am inclined towards 4). There could be some internal use (e.g. multimaps) that could use 1) which is ran without a fancy setup, though, but it's asking for trouble. I agree! Thanks WDYT? Radim

Cheers Dan

Radim Vansa

Wednesday, 28 June Wed, 28 Jun

2:17 a.m.

On 06/27/2017 03:54 PM, Dan Berindei wrote:

...

On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: > I've said this in a previous thread on this same issue, I will repeat myself > as many times as needed. > > Continuous queries require the previous value itself, not just knowledge of > the type of the previous value. Strongly typed caches solve no problem here. > > So if we half-fix query but leave CQ broken I will be half-happy (ie. very > depressed) :) > > I'd remove these commands completely or possibly remove them just from > public API and keep them internal. > +1 to remove the flags from the public API. Most of them are not safe for applications to use, and ignoring them when they can lead to inconsistencies would make them useless. E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't know when it is safe to skip the delete statement, and it relies on the application making a (possibly wrong) choice. IGNORE_RETURN_VALUES should be safe to use, and we actually recommend that applications use it right now. If query or listeners need the previous value, then we should load it internally, but hide it from the user. But removing it opens another discussion: should we replace it in the public API with a new method AdvancedCache.ignoreReturnValues(), or should we make it the default and add a method AdvancedCache.forceReturnPreviousValues()?

Please don't derail the thread.

...

> > > On 06/27/2017 01:28 PM, Sanne Grinovero wrote: > > > > On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: > > Hi, > > I am working on entry version history (again). In Como we've discussed > that previous values are needed for (continuous) query and reliable > listeners, > > > Index based queries also require the previous value on a write - unless we > can get "strongly typed caches" giving guarantees about the class to > represent the content of a cache to be unique. > > Essentially we only need to know the type of the previous object. It might > be worth having a way to load the type metadata if the previous value only. > > so I wonder what should we do with functional write-only > commands. These are different to commands with flags, because flags > (other than ignore return value) are expected to break something. > > > Sorry I hope to not derail the thread but let's remind that we hope to > evolve beyond "flags are expected to break stuff" ; we never got to it but > search the mailing list. > > Since flags are exposed to the user I would rather they're not allowed to > break things. > Could they be treated as hints? Ignore the flag (and warn?) if the used > configuration/integrations veto them. > > Alternatively, let's remove them from API. Remember "The Jokre" POC was > intentionally designed to explore pushing the limits on performance w/o end > users having to solve puzzles, such as learning details about these flags > and their possible side effects. > > So assuming they become either "safe" or internal, maybe you can take > advantage of them? > > I see > the available options as: > > 1) run write-only commands 'optimized', ignoring any querying and such > (warn user that he will break it) > > 2) run write-only without any optimization, rendering them useless > > 3) detect when querying is set up (ignoring listeners and maybe other > stuff that could get broken) > > > Might be useful for making a POC work, but I believe query will be very > likely to be often enabled. > Having an either / or switch for different features in Infinispan will make > it harder to use and understand, so I'd rather see work on the right design > as taking temporary shortcuts risks baking into stone features which we > later struggle to fix or maintain. > I vote for this option. Query, listeners, and other components that need the previous value should not just assume that the application knows better, they should be able to change how operations works based on their needs. Of course, the reverse is also true: if the application uses write-only commands (or IGNORE_RETURN_VALUES) for performance reasons, it should be possible for the user to detect why the previous values are still loaded.

If it were just query (static configuration), I would be okay with this idea. But as per listeners - besides tainting the design (event source should not check if there's a listener) you'd need to check *before* (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, EWI). So this is a space for race conditions or weird handling (if there's a listener when I am about to call notify and my flags are not cleared, skip the notification and pretend that this code was invoked before the listener was registered...). Or do you have another solution in mind (config option to disable listeners && all features using those?). R.

...

> 4) remove write-only commands completely (and probably functional > listeners as well because these will lose their purpose) > > > +1 to remove "unconditional writes", at least an entry version check should > be applied. > I believe we had already pointed out this would eventually happen, pretty > much for the reasons you're hitting now. > IMO version checks should be done internally, we shouldn't force the users of the functional API to deal with versions themselves because we know how hard making write skew checks work is for us :) And I wouldn't go as far as to remove the functional listeners, instead I would change them so that read-write listeners are invoked on write-only operations and they force the loading of the previous value. I would also add a way for the regular listeners to say whether they need the previous value or not. > Right now I am inclined towards 4). There could be some internal use > (e.g. multimaps) that could use 1) which is ran without a fancy setup, > though, but it's asking for trouble. > > > I agree! > > Thanks > > > WDYT? > > Radim > Cheers Dan _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Dan Berindei

3:40 a.m.

On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote:

...

On 06/27/2017 03:54 PM, Dan Berindei wrote: > On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >> I've said this in a previous thread on this same issue, I will repeat myself >> as many times as needed. >> >> Continuous queries require the previous value itself, not just knowledge of >> the type of the previous value. Strongly typed caches solve no problem here. >> >> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >> depressed) :) >> >> I'd remove these commands completely or possibly remove them just from >> public API and keep them internal. >> > +1 to remove the flags from the public API. Most of them are not safe > for applications to use, and ignoring them when they can lead to > inconsistencies would make them useless. > > E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't > know when it is safe to skip the delete statement, and it relies on > the application making a (possibly wrong) choice. > > IGNORE_RETURN_VALUES should be safe to use, and we actually recommend > that applications use it right now. If query or listeners need the > previous value, then we should load it internally, but hide it from > the user. > > But removing it opens another discussion: should we replace it in the > public API with a new method AdvancedCache.ignoreReturnValues(), or > should we make it the default and add a method > AdvancedCache.forceReturnPreviousValues()? Please don't derail the thread.

I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also breaks the previous value for listeners, even if the QueryInterceptor removes it from write commands. And it is public (+recommended) API, in fact most if not all of our performance tests use it. For that matter, ClusteredCacheLoaderInterceptor also doesn't load the previous value on backup owners for most write commands (LoadType.PRIMARY), we'd need to change that as well.

...

> >> >> >> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >> >> >> >> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >> >> Hi, >> >> I am working on entry version history (again). In Como we've discussed >> that previous values are needed for (continuous) query and reliable >> listeners, >> >> >> Index based queries also require the previous value on a write - unless we >> can get "strongly typed caches" giving guarantees about the class to >> represent the content of a cache to be unique. >> >> Essentially we only need to know the type of the previous object. It might >> be worth having a way to load the type metadata if the previous value only. >> >> so I wonder what should we do with functional write-only >> commands. These are different to commands with flags, because flags >> (other than ignore return value) are expected to break something. >> >> >> Sorry I hope to not derail the thread but let's remind that we hope to >> evolve beyond "flags are expected to break stuff" ; we never got to it but >> search the mailing list. >> >> Since flags are exposed to the user I would rather they're not allowed to >> break things. >> Could they be treated as hints? Ignore the flag (and warn?) if the used >> configuration/integrations veto them. >> >> Alternatively, let's remove them from API. Remember "The Jokre" POC was >> intentionally designed to explore pushing the limits on performance w/o end >> users having to solve puzzles, such as learning details about these flags >> and their possible side effects. >> >> So assuming they become either "safe" or internal, maybe you can take >> advantage of them? >> >> I see >> the available options as: >> >> 1) run write-only commands 'optimized', ignoring any querying and such >> (warn user that he will break it) >> >> 2) run write-only without any optimization, rendering them useless >> >> 3) detect when querying is set up (ignoring listeners and maybe other >> stuff that could get broken) >> >> >> Might be useful for making a POC work, but I believe query will be very >> likely to be often enabled. >> Having an either / or switch for different features in Infinispan will make >> it harder to use and understand, so I'd rather see work on the right design >> as taking temporary shortcuts risks baking into stone features which we >> later struggle to fix or maintain. >> > I vote for this option. > > Query, listeners, and other components that need the previous value > should not just assume that the application knows better, they should > be able to change how operations works based on their needs. Of > course, the reverse is also true: if the application uses write-only > commands (or IGNORE_RETURN_VALUES) for performance reasons, it should > be possible for the user to detect why the previous values are still > loaded. If it were just query (static configuration), I would be okay with this idea. But as per listeners - besides tainting the design (event source should not check if there's a listener) you'd need to check *before*

The source wouldn't check for listeners explicitly, the notifier would have an isPreviousValueNeeded() method and precompute that before a listener is added or after a listener is removed. I was am assuming some listeners will not need the previous value, e.g. the listeners installed by streams.

...

(DistributionI, CacheLoaderI) you have to call notify (cmd.perform, EWI). So this is a space for race conditions or weird handling (if there's a listener when I am about to call notify and my flags are not cleared, skip the notification and pretend that this code was invoked before the listener was registered...). Or do you have another solution in mind (config option to disable listeners && all features using those?).

I was definitely going for the weird handling... My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when it's loaded, and check that before invoking a listener that needs the previous value. It is missing one edge case: if one thread starts a write operation, then another thread installs a listener that requires the previous value and iterates over the cache, the second thread may not see the value written by the first thread. So now I'm thinking we should retry the write commands when isPreviousValueNeeded() changes... Not very appealing, but I think the performance difference is worth it.

...

R. > >> 4) remove write-only commands completely (and probably functional >> listeners as well because these will lose their purpose) >> >> >> +1 to remove "unconditional writes", at least an entry version check should >> be applied. >> I believe we had already pointed out this would eventually happen, pretty >> much for the reasons you're hitting now. >> > IMO version checks should be done internally, we shouldn't force the > users of the functional API to deal with versions themselves because > we know how hard making write skew checks work is for us :) > > And I wouldn't go as far as to remove the functional listeners, > instead I would change them so that read-write listeners are invoked > on write-only operations and they force the loading of the previous > value. I would also add a way for the regular listeners to say whether > they need the previous value or not. > >> Right now I am inclined towards 4). There could be some internal use >> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >> though, but it's asking for trouble. >> >> >> I agree! >> >> Thanks >> >> >> WDYT? >> >> Radim >> > Cheers > Dan

Radim Vansa

6:17 a.m.

On 06/28/2017 10:40 AM, Dan Berindei wrote:

...

On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: > On 06/27/2017 03:54 PM, Dan Berindei wrote: >> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>> I've said this in a previous thread on this same issue, I will repeat myself >>> as many times as needed. >>> >>> Continuous queries require the previous value itself, not just knowledge of >>> the type of the previous value. Strongly typed caches solve no problem here. >>> >>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>> depressed) :) >>> >>> I'd remove these commands completely or possibly remove them just from >>> public API and keep them internal. >>> >> +1 to remove the flags from the public API. Most of them are not safe >> for applications to use, and ignoring them when they can lead to >> inconsistencies would make them useless. >> >> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >> know when it is safe to skip the delete statement, and it relies on >> the application making a (possibly wrong) choice. >> >> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >> that applications use it right now. If query or listeners need the >> previous value, then we should load it internally, but hide it from >> the user. >> >> But removing it opens another discussion: should we replace it in the >> public API with a new method AdvancedCache.ignoreReturnValues(), or >> should we make it the default and add a method >> AdvancedCache.forceReturnPreviousValues()? > Please don't derail the thread. > I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also breaks the previous value for listeners, even if the QueryInterceptor removes it from write commands. And it is public (+recommended) API, in fact most if not all of our performance tests use it.

That's just a flawed implementation. IPV is documented to be a 'safe' flag that should affect mostly primary -> origin replication, all the other is implementation. And we can fix that. Users should *not* expect that it e.g. skips loading from a cache store. We have already removed the modes that would be broken-by-design. On the other hand, write-only commands are not about *returning* the value but about (not) *reading* it, therefore (in my eyes) user could make that assumption and would like to enforce it this way. Even some docs explaining PersistenceMode.SKIP suggest that. I don't want to talk about flags, because I see all flags but IPV as 'effectively internal'. Let's discuss it more high-level. Some API exposes non-reading operation - we can see that under some circumstances this is not possible so we have options to 1) break stuff 2) break API assumptions 3) sometimes break API assumptions 4) remove such API (to not allow the user to make such assumptions). There's also an option 5) to fail the operation if the API assumption would be broken. Though, I don't fancy getting exception from a WriteOnlyMap.eval just because someone has registered a listener.

...

For that matter, ClusteredCacheLoaderInterceptor also doesn't load the previous value on backup owners for most write commands (LoadType.PRIMARY), we'd need to change that as well.

Yes, all commands will have to load current value on all owners.

...

>> >>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>> >>> >>> >>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>> >>> Hi, >>> >>> I am working on entry version history (again). In Como we've discussed >>> that previous values are needed for (continuous) query and reliable >>> listeners, >>> >>> >>> Index based queries also require the previous value on a write - unless we >>> can get "strongly typed caches" giving guarantees about the class to >>> represent the content of a cache to be unique. >>> >>> Essentially we only need to know the type of the previous object. It might >>> be worth having a way to load the type metadata if the previous value only. >>> >>> so I wonder what should we do with functional write-only >>> commands. These are different to commands with flags, because flags >>> (other than ignore return value) are expected to break something. >>> >>> >>> Sorry I hope to not derail the thread but let's remind that we hope to >>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>> search the mailing list. >>> >>> Since flags are exposed to the user I would rather they're not allowed to >>> break things. >>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>> configuration/integrations veto them. >>> >>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>> intentionally designed to explore pushing the limits on performance w/o end >>> users having to solve puzzles, such as learning details about these flags >>> and their possible side effects. >>> >>> So assuming they become either "safe" or internal, maybe you can take >>> advantage of them? >>> >>> I see >>> the available options as: >>> >>> 1) run write-only commands 'optimized', ignoring any querying and such >>> (warn user that he will break it) >>> >>> 2) run write-only without any optimization, rendering them useless >>> >>> 3) detect when querying is set up (ignoring listeners and maybe other >>> stuff that could get broken) >>> >>> >>> Might be useful for making a POC work, but I believe query will be very >>> likely to be often enabled. >>> Having an either / or switch for different features in Infinispan will make >>> it harder to use and understand, so I'd rather see work on the right design >>> as taking temporary shortcuts risks baking into stone features which we >>> later struggle to fix or maintain. >>> >> I vote for this option. >> >> Query, listeners, and other components that need the previous value >> should not just assume that the application knows better, they should >> be able to change how operations works based on their needs. Of >> course, the reverse is also true: if the application uses write-only >> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >> be possible for the user to detect why the previous values are still >> loaded. > If it were just query (static configuration), I would be okay with this > idea. But as per listeners - besides tainting the design (event source > should not check if there's a listener) you'd need to check *before* The source wouldn't check for listeners explicitly, the notifier would have an isPreviousValueNeeded() method and precompute that before a listener is added or after a listener is removed. I was am assuming some listeners will not need the previous value, e.g. the listeners installed by streams.

You can cover your warts with a make-up but you'll still have warts :)

...

> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, > EWI). So this is a space for race conditions or weird handling (if > there's a listener when I am about to call notify and my flags are not > cleared, skip the notification and pretend that this code was invoked > before the listener was registered...). Or do you have another solution > in mind (config option to disable listeners && all features using those?). > I was definitely going for the weird handling... My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when it's loaded, and check that before invoking a listener that needs the previous value. It is missing one edge case: if one thread starts a write operation, then another thread installs a listener that requires the previous value and iterates over the cache, the second thread may not see the value written by the first thread.

If the operations overlap, you could pretend that the write has finished before the listener was invoked and simply not notify the listener. If I am missing it please write it down in code. But handling this in any way is still clumsy.

...

So now I'm thinking we should retry the write commands when isPreviousValueNeeded() changes... Not very appealing, but I think the performance difference is worth it. > R. > >>> 4) remove write-only commands completely (and probably functional >>> listeners as well because these will lose their purpose) >>> >>> >>> +1 to remove "unconditional writes", at least an entry version check should >>> be applied. >>> I believe we had already pointed out this would eventually happen, pretty >>> much for the reasons you're hitting now. >>> >> IMO version checks should be done internally, we shouldn't force the >> users of the functional API to deal with versions themselves because >> we know how hard making write skew checks work is for us :) >> >> And I wouldn't go as far as to remove the functional listeners, >> instead I would change them so that read-write listeners are invoked >> on write-only operations and they force the loading of the previous >> value. I would also add a way for the regular listeners to say whether >> they need the previous value or not. >> >>> Right now I am inclined towards 4). There could be some internal use >>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>> though, but it's asking for trouble. >>> >>> >>> I agree! >>> >>> Thanks >>> >>> >>> WDYT? >>> >>> Radim >>> >> Cheers >> Dan _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Dan Berindei

9:20 a.m.

On Wed, Jun 28, 2017 at 2:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote:

...

I think you're confusing IGNORE_RETURN_VALUES with SKIP_REMOTE_LOOKUP here. The IVR javadoc doesn't say anything about remote lookups, only SRL does. And I agree that the current status is far from ideal, but there is one more valid alternative: we can decide that the previous value is only reliable in clustered listeners, and local listeners don't always have it. Document that, make sure continuous query uses clustered listeners, and we're done :)

...

On the other hand, write-only commands are not about *returning* the value but about (not) *reading* it, therefore (in my eyes) user could make that assumption and would like to enforce it this way. Even some docs explaining PersistenceMode.SKIP suggest that.

To me the purpose the same, there is no difference between returning the previous value to the application or providing the previous value via EntryView. Applying this logic to the JCache API, it would mean put() should never read the previous value, because some users could assume that only getAndPut() reads it. In the old times we didn't have IGNORE_RETURN_VALUES, only SKIP_REMOTE_LOOKUP+SKIP_CACHE_LOAD, and they would sometimes be ignored (e.g. if the write was conditional). I think that's what Galder had in mind when he wrote the PersistenceMode api note, not the current behaviour of SKIP_CACHE_LOAD. I'll let Galder clarify this himself, but I'll be very disappointed if he says he designed the write-only operations so that they'll never work with query.

...

I don't want to talk about flags, because I see all flags but IPV as 'effectively internal'. Let's discuss it more high-level. Some API exposes non-reading operation - we can see that under some circumstances this is not possible so we have options to 1) break stuff 2) break API assumptions 3) sometimes break API assumptions 4) remove such API (to not allow the user to make such assumptions). There's also an option 5) to fail the operation if the API assumption would be broken. Though, I don't fancy getting exception from a WriteOnlyMap.eval just because someone has registered a listener.

I disagree with the premise: there's no good reason for the user to assume that write-only commands are *guaranteed* to never load the previous value from a store. We just need to add a clarification to the write-only operations' javadoc, no need to break anything.

...

> > For that matter, ClusteredCacheLoaderInterceptor also doesn't load the > previous value on backup owners for most write commands > (LoadType.PRIMARY), we'd need to change that as well. Yes, all commands will have to load current value on all owners. > >>> >>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>> >>>> >>>> >>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>> >>>> Hi, >>>> >>>> I am working on entry version history (again). In Como we've discussed >>>> that previous values are needed for (continuous) query and reliable >>>> listeners, >>>> >>>> >>>> Index based queries also require the previous value on a write - unless we >>>> can get "strongly typed caches" giving guarantees about the class to >>>> represent the content of a cache to be unique. >>>> >>>> Essentially we only need to know the type of the previous object. It might >>>> be worth having a way to load the type metadata if the previous value only. >>>> >>>> so I wonder what should we do with functional write-only >>>> commands. These are different to commands with flags, because flags >>>> (other than ignore return value) are expected to break something. >>>> >>>> >>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>> search the mailing list. >>>> >>>> Since flags are exposed to the user I would rather they're not allowed to >>>> break things. >>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>> configuration/integrations veto them. >>>> >>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>> intentionally designed to explore pushing the limits on performance w/o end >>>> users having to solve puzzles, such as learning details about these flags >>>> and their possible side effects. >>>> >>>> So assuming they become either "safe" or internal, maybe you can take >>>> advantage of them? >>>> >>>> I see >>>> the available options as: >>>> >>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>> (warn user that he will break it) >>>> >>>> 2) run write-only without any optimization, rendering them useless >>>> >>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>> stuff that could get broken) >>>> >>>> >>>> Might be useful for making a POC work, but I believe query will be very >>>> likely to be often enabled. >>>> Having an either / or switch for different features in Infinispan will make >>>> it harder to use and understand, so I'd rather see work on the right design >>>> as taking temporary shortcuts risks baking into stone features which we >>>> later struggle to fix or maintain. >>>> >>> I vote for this option. >>> >>> Query, listeners, and other components that need the previous value >>> should not just assume that the application knows better, they should >>> be able to change how operations works based on their needs. Of >>> course, the reverse is also true: if the application uses write-only >>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>> be possible for the user to detect why the previous values are still >>> loaded. >> If it were just query (static configuration), I would be okay with this >> idea. But as per listeners - besides tainting the design (event source >> should not check if there's a listener) you'd need to check *before* > The source wouldn't check for listeners explicitly, the notifier would > have an isPreviousValueNeeded() method and precompute that before a > listener is added or after a listener is removed. I was am assuming > some listeners will not need the previous value, e.g. the listeners > installed by streams. You can cover your warts with a make-up but you'll still have warts :)

Cutting them off doesn't necessarily work, either :)

...

> >> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >> EWI). So this is a space for race conditions or weird handling (if >> there's a listener when I am about to call notify and my flags are not >> cleared, skip the notification and pretend that this code was invoked >> before the listener was registered...). Or do you have another solution >> in mind (config option to disable listeners && all features using those?). >> > I was definitely going for the weird handling... > > My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when > it's loaded, and check that before invoking a listener that needs the > previous value. It is missing one edge case: if one thread starts a > write operation, then another thread installs a listener that requires > the previous value and iterates over the cache, the second thread may > not see the value written by the first thread. If the operations overlap, you could pretend that the write has finished before the listener was invoked and simply not notify the listener. If I am missing it please write it down in code. But handling this in any way is still clumsy.

I hope pseudo-code is fine... 1. cache.put(k, v1) starts, doesn't load the previous value v0 in the context 2. cache.addListener(l) runs, doesn't block 3. cache.entrySet().forEach() runs, finds k->v0 4. cache.put(k, v1) commits k->v1, should notify the listener but doesn't have the previous value 5. cache.put(k, v0) returns, but the code that installed the listener thinks the value of k is still v0

...

> So now I'm thinking we should retry the write commands when > isPreviousValueNeeded() changes... Not very appealing, but I think the > performance difference is worth it. > >> R. >> >>>> 4) remove write-only commands completely (and probably functional >>>> listeners as well because these will lose their purpose) >>>> >>>> >>>> +1 to remove "unconditional writes", at least an entry version check should >>>> be applied. >>>> I believe we had already pointed out this would eventually happen, pretty >>>> much for the reasons you're hitting now. >>>> >>> IMO version checks should be done internally, we shouldn't force the >>> users of the functional API to deal with versions themselves because >>> we know how hard making write skew checks work is for us :) >>> >>> And I wouldn't go as far as to remove the functional listeners, >>> instead I would change them so that read-write listeners are invoked >>> on write-only operations and they force the loading of the previous >>> value. I would also add a way for the regular listeners to say whether >>> they need the previous value or not. >>> >>>> Right now I am inclined towards 4). There could be some internal use >>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>> though, but it's asking for trouble. >>>> >>>> >>>> I agree! >>>> >>>> Thanks >>>> >>>> >>>> WDYT? >>>> >>>> Radim >>>> >>> Cheers >>> Dan > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Radim Vansa

Thursday, 29 June Thu, 29 Jun

3:53 a.m.

On 06/28/2017 04:20 PM, Dan Berindei wrote:

...

On Wed, Jun 28, 2017 at 2:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote: > On 06/28/2017 10:40 AM, Dan Berindei wrote: >> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>> as many times as needed. >>>>> >>>>> Continuous queries require the previous value itself, not just knowledge of >>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>> >>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>> depressed) :) >>>>> >>>>> I'd remove these commands completely or possibly remove them just from >>>>> public API and keep them internal. >>>>> >>>> +1 to remove the flags from the public API. Most of them are not safe >>>> for applications to use, and ignoring them when they can lead to >>>> inconsistencies would make them useless. >>>> >>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>> know when it is safe to skip the delete statement, and it relies on >>>> the application making a (possibly wrong) choice. >>>> >>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>> that applications use it right now. If query or listeners need the >>>> previous value, then we should load it internally, but hide it from >>>> the user. >>>> >>>> But removing it opens another discussion: should we replace it in the >>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>> should we make it the default and add a method >>>> AdvancedCache.forceReturnPreviousValues()? >>> Please don't derail the thread. >>> >> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >> breaks the previous value for listeners, even if the QueryInterceptor >> removes it from write commands. And it is public (+recommended) API, >> in fact most if not all of our performance tests use it. > That's just a flawed implementation. IPV is documented to be a 'safe' > flag that should affect mostly primary -> origin replication, all the > other is implementation. And we can fix that. Users should *not* expect > that it e.g. skips loading from a cache store. We have already removed > the modes that would be broken-by-design. > I think you're confusing IGNORE_RETURN_VALUES with SKIP_REMOTE_LOOKUP here. The IVR javadoc doesn't say anything about remote lookups, only SRL does.

No, I am not; While IRV does not mention the replication, it's said to be 'safe'. So omitting the primary -> origin replication is basically all it can do when listeners are in place. You're right that I have missed the second part in SRL talking about put()s; I took it as a flag prohibiting any remote lookup (as the RPC operation in its whole) any time the remote value is needed. Yes, the second part seems equal to my understanding of IRV.

...

And I agree that the current status is far from ideal, but there is one more valid alternative: we can decide that the previous value is only reliable in clustered listeners, and local listeners don't always have it. Document that, make sure continuous query uses clustered listeners, and we're done :)

Unreliable return values are worse than none; I would rather remove them if we can't guarantee that these are right. Though, clustered listeners are based on regular listeners, so you'd need some means to make them reliable.

...

> On the other hand, write-only commands are not about *returning* the > value but about (not) *reading* it, therefore (in my eyes) user could > make that assumption and would like to enforce it this way. Even some > docs explaining PersistenceMode.SKIP suggest that. > To me the purpose the same, there is no difference between returning the previous value to the application or providing the previous value via EntryView.

There is a difference between what's provided locally and what's send over the network.

...

Applying this logic to the JCache API, it would mean put() should never read the previous value, because some users could assume that only getAndPut() reads it.

OK, this is a valid point.

...

In the old times we didn't have IGNORE_RETURN_VALUES, only SKIP_REMOTE_LOOKUP+SKIP_CACHE_LOAD, and they would sometimes be ignored (e.g. if the write was conditional). I think that's what Galder had in mind when he wrote the PersistenceMode api note, not the current behaviour of SKIP_CACHE_LOAD. I'll let Galder clarify this himself, but I'll be very disappointed if he says he designed the write-only operations so that they'll never work with query. > I don't want to talk about flags, because I see all flags but IPV as > 'effectively internal'. Let's discuss it more high-level. Some API > exposes non-reading operation - we can see that under some circumstances > this is not possible so we have options to 1) break stuff 2) break API > assumptions 3) sometimes break API assumptions 4) remove such API (to > not allow the user to make such assumptions). There's also an option 5) > to fail the operation if the API assumption would be broken. Though, I > don't fancy getting exception from a WriteOnlyMap.eval just because > someone has registered a listener. > I disagree with the premise: there's no good reason for the user to assume that write-only commands are *guaranteed* to never load the previous value from a store. We just need to add a clarification to the write-only operations' javadoc, no need to break anything.

OK then, though it diminishes the value of write-only commands a lot.

...

>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >> previous value on backup owners for most write commands >> (LoadType.PRIMARY), we'd need to change that as well. > Yes, all commands will have to load current value on all owners. > >>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>> >>>>> >>>>> >>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am working on entry version history (again). In Como we've discussed >>>>> that previous values are needed for (continuous) query and reliable >>>>> listeners, >>>>> >>>>> >>>>> Index based queries also require the previous value on a write - unless we >>>>> can get "strongly typed caches" giving guarantees about the class to >>>>> represent the content of a cache to be unique. >>>>> >>>>> Essentially we only need to know the type of the previous object. It might >>>>> be worth having a way to load the type metadata if the previous value only. >>>>> >>>>> so I wonder what should we do with functional write-only >>>>> commands. These are different to commands with flags, because flags >>>>> (other than ignore return value) are expected to break something. >>>>> >>>>> >>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>> search the mailing list. >>>>> >>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>> break things. >>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>> configuration/integrations veto them. >>>>> >>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>> users having to solve puzzles, such as learning details about these flags >>>>> and their possible side effects. >>>>> >>>>> So assuming they become either "safe" or internal, maybe you can take >>>>> advantage of them? >>>>> >>>>> I see >>>>> the available options as: >>>>> >>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>> (warn user that he will break it) >>>>> >>>>> 2) run write-only without any optimization, rendering them useless >>>>> >>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>> stuff that could get broken) >>>>> >>>>> >>>>> Might be useful for making a POC work, but I believe query will be very >>>>> likely to be often enabled. >>>>> Having an either / or switch for different features in Infinispan will make >>>>> it harder to use and understand, so I'd rather see work on the right design >>>>> as taking temporary shortcuts risks baking into stone features which we >>>>> later struggle to fix or maintain. >>>>> >>>> I vote for this option. >>>> >>>> Query, listeners, and other components that need the previous value >>>> should not just assume that the application knows better, they should >>>> be able to change how operations works based on their needs. Of >>>> course, the reverse is also true: if the application uses write-only >>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>> be possible for the user to detect why the previous values are still >>>> loaded. >>> If it were just query (static configuration), I would be okay with this >>> idea. But as per listeners - besides tainting the design (event source >>> should not check if there's a listener) you'd need to check *before* >> The source wouldn't check for listeners explicitly, the notifier would >> have an isPreviousValueNeeded() method and precompute that before a >> listener is added or after a listener is removed. I was am assuming >> some listeners will not need the previous value, e.g. the listeners >> installed by streams. > You can cover your warts with a make-up but you'll still have warts :) Cutting them off doesn't necessarily work, either :)

Yep, some people tend to fix w/ hacks instead of designing :)

...

>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>> EWI). So this is a space for race conditions or weird handling (if >>> there's a listener when I am about to call notify and my flags are not >>> cleared, skip the notification and pretend that this code was invoked >>> before the listener was registered...). Or do you have another solution >>> in mind (config option to disable listeners && all features using those?). >>> >> I was definitely going for the weird handling... >> >> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >> it's loaded, and check that before invoking a listener that needs the >> previous value. It is missing one edge case: if one thread starts a >> write operation, then another thread installs a listener that requires >> the previous value and iterates over the cache, the second thread may >> not see the value written by the first thread. > If the operations overlap, you could pretend that the write has finished > before the listener was invoked and simply not notify the listener. If I > am missing it please write it down in code. But handling this in any way > is still clumsy. I hope pseudo-code is fine... 1. cache.put(k, v1) starts, doesn't load the previous value v0 in the context 2. cache.addListener(l) runs, doesn't block 3. cache.entrySet().forEach() runs, finds k->v0 4. cache.put(k, v1) commits k->v1, should notify the listener but doesn't have the previous value 5. cache.put(k, v0) returns, but the code that installed the listener thinks the value of k is still v0

Oh OK, I should have drawn that myself when considering the scenario. You're right, here we'll have to retry. All in all, I think this discussion is done. We'll tell users to stick their flags where the sun doesn't shine and remove any inconvenient ones. Should we issue a warning any time we're removing the flag? Radim

...

>> So now I'm thinking we should retry the write commands when >> isPreviousValueNeeded() changes... Not very appealing, but I think the >> performance difference is worth it. >> >>> R. >>> >>>>> 4) remove write-only commands completely (and probably functional >>>>> listeners as well because these will lose their purpose) >>>>> >>>>> >>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>> be applied. >>>>> I believe we had already pointed out this would eventually happen, pretty >>>>> much for the reasons you're hitting now. >>>>> >>>> IMO version checks should be done internally, we shouldn't force the >>>> users of the functional API to deal with versions themselves because >>>> we know how hard making write skew checks work is for us :) >>>> >>>> And I wouldn't go as far as to remove the functional listeners, >>>> instead I would change them so that read-write listeners are invoked >>>> on write-only operations and they force the loading of the previous >>>> value. I would also add a way for the regular listeners to say whether >>>> they need the previous value or not. >>>> >>>>> Right now I am inclined towards 4). There could be some internal use >>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>> though, but it's asking for trouble. >>>>> >>>>> >>>>> I agree! >>>>> >>>>> Thanks >>>>> >>>>> >>>>> WDYT? >>>>> >>>>> Radim >>>>> >>>> Cheers >>>> Dan >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- > Radim Vansa <rvansa(a)redhat.com> > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Dan Berindei

4:16 a.m.

On Thu, Jun 29, 2017 at 11:53 AM, Radim Vansa <rvansa(a)redhat.com> wrote:

...

On 06/28/2017 04:20 PM, Dan Berindei wrote: > On Wed, Jun 28, 2017 at 2:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote: >> On 06/28/2017 10:40 AM, Dan Berindei wrote: >>> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>>> as many times as needed. >>>>>> >>>>>> Continuous queries require the previous value itself, not just knowledge of >>>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>>> >>>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>>> depressed) :) >>>>>> >>>>>> I'd remove these commands completely or possibly remove them just from >>>>>> public API and keep them internal. >>>>>> >>>>> +1 to remove the flags from the public API. Most of them are not safe >>>>> for applications to use, and ignoring them when they can lead to >>>>> inconsistencies would make them useless. >>>>> >>>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>>> know when it is safe to skip the delete statement, and it relies on >>>>> the application making a (possibly wrong) choice. >>>>> >>>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>>> that applications use it right now. If query or listeners need the >>>>> previous value, then we should load it internally, but hide it from >>>>> the user. >>>>> >>>>> But removing it opens another discussion: should we replace it in the >>>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>>> should we make it the default and add a method >>>>> AdvancedCache.forceReturnPreviousValues()? >>>> Please don't derail the thread. >>>> >>> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >>> breaks the previous value for listeners, even if the QueryInterceptor >>> removes it from write commands. And it is public (+recommended) API, >>> in fact most if not all of our performance tests use it. >> That's just a flawed implementation. IPV is documented to be a 'safe' >> flag that should affect mostly primary -> origin replication, all the >> other is implementation. And we can fix that. Users should *not* expect >> that it e.g. skips loading from a cache store. We have already removed >> the modes that would be broken-by-design. >> > I think you're confusing IGNORE_RETURN_VALUES with SKIP_REMOTE_LOOKUP > here. The IVR javadoc doesn't say anything about remote lookups, only > SRL does. No, I am not; While IRV does not mention the replication, it's said to be 'safe'. So omitting the primary -> origin replication is basically all it can do when listeners are in place. You're right that I have missed the second part in SRL talking about put()s; I took it as a flag prohibiting any remote lookup (as the RPC operation in its whole) any time the remote value is needed. Yes, the second part seems equal to my understanding of IRV. > > And I agree that the current status is far from ideal, but there is > one more valid alternative: we can decide that the previous value is > only reliable in clustered listeners, and local listeners don't always > have it. Document that, make sure continuous query uses clustered > listeners, and we're done :) Unreliable return values are worse than none; I would rather remove them if we can't guarantee that these are right. Though, clustered listeners are based on regular listeners, so you'd need some means to make them reliable.

We could change the clustered listeners so that they're not based on the regular listeners... I've been pestering Will about this ever since the clustered listeners landed! But I should have been clearer: I didn't mean that the listeners on the backups should receive the previous value whenever we feel like it, I meant we should document and enforce that the previous value is only included in the event for listeners on the primary owner.

...

>> On the other hand, write-only commands are not about *returning* the >> value but about (not) *reading* it, therefore (in my eyes) user could >> make that assumption and would like to enforce it this way. Even some >> docs explaining PersistenceMode.SKIP suggest that. >> > To me the purpose the same, there is no difference between returning > the previous value to the application or providing the previous value > via EntryView. There is a difference between what's provided locally and what's send over the network. > Applying this logic to the JCache API, it would mean > put() should never read the previous value, because some users could > assume that only getAndPut() reads it. OK, this is a valid point. > > In the old times we didn't have IGNORE_RETURN_VALUES, only > SKIP_REMOTE_LOOKUP+SKIP_CACHE_LOAD, and they would sometimes be > ignored (e.g. if the write was conditional). I think that's what > Galder had in mind when he wrote the PersistenceMode api note, not the > current behaviour of SKIP_CACHE_LOAD. I'll let Galder clarify this > himself, but I'll be very disappointed if he says he designed the > write-only operations so that they'll never work with query. > > >> I don't want to talk about flags, because I see all flags but IPV as >> 'effectively internal'. Let's discuss it more high-level. Some API >> exposes non-reading operation - we can see that under some circumstances >> this is not possible so we have options to 1) break stuff 2) break API >> assumptions 3) sometimes break API assumptions 4) remove such API (to >> not allow the user to make such assumptions). There's also an option 5) >> to fail the operation if the API assumption would be broken. Though, I >> don't fancy getting exception from a WriteOnlyMap.eval just because >> someone has registered a listener. >> > I disagree with the premise: there's no good reason for the user to > assume that write-only commands are *guaranteed* to never load the > previous value from a store. We just need to add a clarification to > the write-only operations' javadoc, no need to break anything. OK then, though it diminishes the value of write-only commands a lot. > > >>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >>> previous value on backup owners for most write commands >>> (LoadType.PRIMARY), we'd need to change that as well. >> Yes, all commands will have to load current value on all owners. >> >>>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I am working on entry version history (again). In Como we've discussed >>>>>> that previous values are needed for (continuous) query and reliable >>>>>> listeners, >>>>>> >>>>>> >>>>>> Index based queries also require the previous value on a write - unless we >>>>>> can get "strongly typed caches" giving guarantees about the class to >>>>>> represent the content of a cache to be unique. >>>>>> >>>>>> Essentially we only need to know the type of the previous object. It might >>>>>> be worth having a way to load the type metadata if the previous value only. >>>>>> >>>>>> so I wonder what should we do with functional write-only >>>>>> commands. These are different to commands with flags, because flags >>>>>> (other than ignore return value) are expected to break something. >>>>>> >>>>>> >>>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>>> search the mailing list. >>>>>> >>>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>>> break things. >>>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>>> configuration/integrations veto them. >>>>>> >>>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>>> users having to solve puzzles, such as learning details about these flags >>>>>> and their possible side effects. >>>>>> >>>>>> So assuming they become either "safe" or internal, maybe you can take >>>>>> advantage of them? >>>>>> >>>>>> I see >>>>>> the available options as: >>>>>> >>>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>>> (warn user that he will break it) >>>>>> >>>>>> 2) run write-only without any optimization, rendering them useless >>>>>> >>>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>>> stuff that could get broken) >>>>>> >>>>>> >>>>>> Might be useful for making a POC work, but I believe query will be very >>>>>> likely to be often enabled. >>>>>> Having an either / or switch for different features in Infinispan will make >>>>>> it harder to use and understand, so I'd rather see work on the right design >>>>>> as taking temporary shortcuts risks baking into stone features which we >>>>>> later struggle to fix or maintain. >>>>>> >>>>> I vote for this option. >>>>> >>>>> Query, listeners, and other components that need the previous value >>>>> should not just assume that the application knows better, they should >>>>> be able to change how operations works based on their needs. Of >>>>> course, the reverse is also true: if the application uses write-only >>>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>>> be possible for the user to detect why the previous values are still >>>>> loaded. >>>> If it were just query (static configuration), I would be okay with this >>>> idea. But as per listeners - besides tainting the design (event source >>>> should not check if there's a listener) you'd need to check *before* >>> The source wouldn't check for listeners explicitly, the notifier would >>> have an isPreviousValueNeeded() method and precompute that before a >>> listener is added or after a listener is removed. I was am assuming >>> some listeners will not need the previous value, e.g. the listeners >>> installed by streams. >> You can cover your warts with a make-up but you'll still have warts :) > Cutting them off doesn't necessarily work, either :) Yep, some people tend to fix w/ hacks instead of designing :) > >>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>>> EWI). So this is a space for race conditions or weird handling (if >>>> there's a listener when I am about to call notify and my flags are not >>>> cleared, skip the notification and pretend that this code was invoked >>>> before the listener was registered...). Or do you have another solution >>>> in mind (config option to disable listeners && all features using those?). >>>> >>> I was definitely going for the weird handling... >>> >>> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >>> it's loaded, and check that before invoking a listener that needs the >>> previous value. It is missing one edge case: if one thread starts a >>> write operation, then another thread installs a listener that requires >>> the previous value and iterates over the cache, the second thread may >>> not see the value written by the first thread. >> If the operations overlap, you could pretend that the write has finished >> before the listener was invoked and simply not notify the listener. If I >> am missing it please write it down in code. But handling this in any way >> is still clumsy. > I hope pseudo-code is fine... > > 1. cache.put(k, v1) starts, doesn't load the previous value v0 in the context > 2. cache.addListener(l) runs, doesn't block > 3. cache.entrySet().forEach() runs, finds k->v0 > 4. cache.put(k, v1) commits k->v1, should notify the listener but > doesn't have the previous value > 5. cache.put(k, v0) returns, but the code that installed the listener > thinks the value of k is still v0 Oh OK, I should have drawn that myself when considering the scenario. You're right, here we'll have to retry. All in all, I think this discussion is done. We'll tell users to stick their flags where the sun doesn't shine and remove any inconvenient ones. Should we issue a warning any time we're removing the flag?

If you mean that we should remove the flags from the public API, I agree. If you mean we should just ignore them, then no, because most of the flags were added for internal components that really need their semantics. Dan

...

Radim > > >>> So now I'm thinking we should retry the write commands when >>> isPreviousValueNeeded() changes... Not very appealing, but I think the >>> performance difference is worth it. >>> >>>> R. >>>> >>>>>> 4) remove write-only commands completely (and probably functional >>>>>> listeners as well because these will lose their purpose) >>>>>> >>>>>> >>>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>>> be applied. >>>>>> I believe we had already pointed out this would eventually happen, pretty >>>>>> much for the reasons you're hitting now. >>>>>> >>>>> IMO version checks should be done internally, we shouldn't force the >>>>> users of the functional API to deal with versions themselves because >>>>> we know how hard making write skew checks work is for us :) >>>>> >>>>> And I wouldn't go as far as to remove the functional listeners, >>>>> instead I would change them so that read-write listeners are invoked >>>>> on write-only operations and they force the loading of the previous >>>>> value. I would also add a way for the regular listeners to say whether >>>>> they need the previous value or not. >>>>> >>>>>> Right now I am inclined towards 4). There could be some internal use >>>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>>> though, but it's asking for trouble. >>>>>> >>>>>> >>>>>> I agree! >>>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>> WDYT? >>>>>> >>>>>> Radim >>>>>> >>>>> Cheers >>>>> Dan >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa <rvansa(a)redhat.com> >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Radim Vansa

6:19 a.m.

On 06/29/2017 11:16 AM, Dan Berindei wrote:

...

On Thu, Jun 29, 2017 at 11:53 AM, Radim Vansa <rvansa(a)redhat.com> wrote: > On 06/28/2017 04:20 PM, Dan Berindei wrote: >> On Wed, Jun 28, 2017 at 2:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote: >>> On 06/28/2017 10:40 AM, Dan Berindei wrote: >>>> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>>>> as many times as needed. >>>>>>> >>>>>>> Continuous queries require the previous value itself, not just knowledge of >>>>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>>>> >>>>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>>>> depressed) :) >>>>>>> >>>>>>> I'd remove these commands completely or possibly remove them just from >>>>>>> public API and keep them internal. >>>>>>> >>>>>> +1 to remove the flags from the public API. Most of them are not safe >>>>>> for applications to use, and ignoring them when they can lead to >>>>>> inconsistencies would make them useless. >>>>>> >>>>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>>>> know when it is safe to skip the delete statement, and it relies on >>>>>> the application making a (possibly wrong) choice. >>>>>> >>>>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>>>> that applications use it right now. If query or listeners need the >>>>>> previous value, then we should load it internally, but hide it from >>>>>> the user. >>>>>> >>>>>> But removing it opens another discussion: should we replace it in the >>>>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>>>> should we make it the default and add a method >>>>>> AdvancedCache.forceReturnPreviousValues()? >>>>> Please don't derail the thread. >>>>> >>>> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >>>> breaks the previous value for listeners, even if the QueryInterceptor >>>> removes it from write commands. And it is public (+recommended) API, >>>> in fact most if not all of our performance tests use it. >>> That's just a flawed implementation. IPV is documented to be a 'safe' >>> flag that should affect mostly primary -> origin replication, all the >>> other is implementation. And we can fix that. Users should *not* expect >>> that it e.g. skips loading from a cache store. We have already removed >>> the modes that would be broken-by-design. >>> >> I think you're confusing IGNORE_RETURN_VALUES with SKIP_REMOTE_LOOKUP >> here. The IVR javadoc doesn't say anything about remote lookups, only >> SRL does. > No, I am not; While IRV does not mention the replication, it's said to > be 'safe'. So omitting the primary -> origin replication is basically > all it can do when listeners are in place. You're right that I have > missed the second part in SRL talking about put()s; I took it as a flag > prohibiting any remote lookup (as the RPC operation in its whole) any > time the remote value is needed. Yes, the second part seems equal to my > understanding of IRV. > >> And I agree that the current status is far from ideal, but there is >> one more valid alternative: we can decide that the previous value is >> only reliable in clustered listeners, and local listeners don't always >> have it. Document that, make sure continuous query uses clustered >> listeners, and we're done :) > Unreliable return values are worse than none; I would rather remove them > if we can't guarantee that these are right. Though, clustered listeners > are based on regular listeners, so you'd need some means to make them > reliable. We could change the clustered listeners so that they're not based on the regular listeners... I've been pestering Will about this ever since the clustered listeners landed! But I should have been clearer: I didn't mean that the listeners on the backups should receive the previous value whenever we feel like it, I meant we should document and enforce that the previous value is only included in the event for listeners on the primary owner. >>> On the other hand, write-only commands are not about *returning* the >>> value but about (not) *reading* it, therefore (in my eyes) user could >>> make that assumption and would like to enforce it this way. Even some >>> docs explaining PersistenceMode.SKIP suggest that. >>> >> To me the purpose the same, there is no difference between returning >> the previous value to the application or providing the previous value >> via EntryView. > There is a difference between what's provided locally and what's send > over the network. > >> Applying this logic to the JCache API, it would mean >> put() should never read the previous value, because some users could >> assume that only getAndPut() reads it. > OK, this is a valid point. > >> In the old times we didn't have IGNORE_RETURN_VALUES, only >> SKIP_REMOTE_LOOKUP+SKIP_CACHE_LOAD, and they would sometimes be >> ignored (e.g. if the write was conditional). I think that's what >> Galder had in mind when he wrote the PersistenceMode api note, not the >> current behaviour of SKIP_CACHE_LOAD. I'll let Galder clarify this >> himself, but I'll be very disappointed if he says he designed the >> write-only operations so that they'll never work with query. >> >> >>> I don't want to talk about flags, because I see all flags but IPV as >>> 'effectively internal'. Let's discuss it more high-level. Some API >>> exposes non-reading operation - we can see that under some circumstances >>> this is not possible so we have options to 1) break stuff 2) break API >>> assumptions 3) sometimes break API assumptions 4) remove such API (to >>> not allow the user to make such assumptions). There's also an option 5) >>> to fail the operation if the API assumption would be broken. Though, I >>> don't fancy getting exception from a WriteOnlyMap.eval just because >>> someone has registered a listener. >>> >> I disagree with the premise: there's no good reason for the user to >> assume that write-only commands are *guaranteed* to never load the >> previous value from a store. We just need to add a clarification to >> the write-only operations' javadoc, no need to break anything. > OK then, though it diminishes the value of write-only commands a lot. > >> >>>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >>>> previous value on backup owners for most write commands >>>> (LoadType.PRIMARY), we'd need to change that as well. >>> Yes, all commands will have to load current value on all owners. >>> >>>>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I am working on entry version history (again). In Como we've discussed >>>>>>> that previous values are needed for (continuous) query and reliable >>>>>>> listeners, >>>>>>> >>>>>>> >>>>>>> Index based queries also require the previous value on a write - unless we >>>>>>> can get "strongly typed caches" giving guarantees about the class to >>>>>>> represent the content of a cache to be unique. >>>>>>> >>>>>>> Essentially we only need to know the type of the previous object. It might >>>>>>> be worth having a way to load the type metadata if the previous value only. >>>>>>> >>>>>>> so I wonder what should we do with functional write-only >>>>>>> commands. These are different to commands with flags, because flags >>>>>>> (other than ignore return value) are expected to break something. >>>>>>> >>>>>>> >>>>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>>>> search the mailing list. >>>>>>> >>>>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>>>> break things. >>>>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>>>> configuration/integrations veto them. >>>>>>> >>>>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>>>> users having to solve puzzles, such as learning details about these flags >>>>>>> and their possible side effects. >>>>>>> >>>>>>> So assuming they become either "safe" or internal, maybe you can take >>>>>>> advantage of them? >>>>>>> >>>>>>> I see >>>>>>> the available options as: >>>>>>> >>>>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>>>> (warn user that he will break it) >>>>>>> >>>>>>> 2) run write-only without any optimization, rendering them useless >>>>>>> >>>>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>>>> stuff that could get broken) >>>>>>> >>>>>>> >>>>>>> Might be useful for making a POC work, but I believe query will be very >>>>>>> likely to be often enabled. >>>>>>> Having an either / or switch for different features in Infinispan will make >>>>>>> it harder to use and understand, so I'd rather see work on the right design >>>>>>> as taking temporary shortcuts risks baking into stone features which we >>>>>>> later struggle to fix or maintain. >>>>>>> >>>>>> I vote for this option. >>>>>> >>>>>> Query, listeners, and other components that need the previous value >>>>>> should not just assume that the application knows better, they should >>>>>> be able to change how operations works based on their needs. Of >>>>>> course, the reverse is also true: if the application uses write-only >>>>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>>>> be possible for the user to detect why the previous values are still >>>>>> loaded. >>>>> If it were just query (static configuration), I would be okay with this >>>>> idea. But as per listeners - besides tainting the design (event source >>>>> should not check if there's a listener) you'd need to check *before* >>>> The source wouldn't check for listeners explicitly, the notifier would >>>> have an isPreviousValueNeeded() method and precompute that before a >>>> listener is added or after a listener is removed. I was am assuming >>>> some listeners will not need the previous value, e.g. the listeners >>>> installed by streams. >>> You can cover your warts with a make-up but you'll still have warts :) >> Cutting them off doesn't necessarily work, either :) > Yep, some people tend to fix w/ hacks instead of designing :) > >>>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>>>> EWI). So this is a space for race conditions or weird handling (if >>>>> there's a listener when I am about to call notify and my flags are not >>>>> cleared, skip the notification and pretend that this code was invoked >>>>> before the listener was registered...). Or do you have another solution >>>>> in mind (config option to disable listeners && all features using those?). >>>>> >>>> I was definitely going for the weird handling... >>>> >>>> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >>>> it's loaded, and check that before invoking a listener that needs the >>>> previous value. It is missing one edge case: if one thread starts a >>>> write operation, then another thread installs a listener that requires >>>> the previous value and iterates over the cache, the second thread may >>>> not see the value written by the first thread. >>> If the operations overlap, you could pretend that the write has finished >>> before the listener was invoked and simply not notify the listener. If I >>> am missing it please write it down in code. But handling this in any way >>> is still clumsy. >> I hope pseudo-code is fine... >> >> 1. cache.put(k, v1) starts, doesn't load the previous value v0 in the context >> 2. cache.addListener(l) runs, doesn't block >> 3. cache.entrySet().forEach() runs, finds k->v0 >> 4. cache.put(k, v1) commits k->v1, should notify the listener but >> doesn't have the previous value >> 5. cache.put(k, v0) returns, but the code that installed the listener >> thinks the value of k is still v0 > Oh OK, I should have drawn that myself when considering the scenario. > You're right, here we'll have to retry. > > All in all, I think this discussion is done. We'll tell users to stick > their flags where the sun doesn't shine and remove any inconvenient > ones. Should we issue a warning any time we're removing the flag? > If you mean that we should remove the flags from the public API, I agree. If you mean we should just ignore them, then no, because most of the flags were added for internal components that really need their semantics.

We can't remove them from public API before Infinspan 10, and I think that it will be a quite an unpopular step even after that. But until 10, I think that the common agreement was to not break query, that is ignore the flags. And make write-only reading. R.

...

Dan > Radim > >> >>>> So now I'm thinking we should retry the write commands when >>>> isPreviousValueNeeded() changes... Not very appealing, but I think the >>>> performance difference is worth it. >>>> >>>>> R. >>>>> >>>>>>> 4) remove write-only commands completely (and probably functional >>>>>>> listeners as well because these will lose their purpose) >>>>>>> >>>>>>> >>>>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>>>> be applied. >>>>>>> I believe we had already pointed out this would eventually happen, pretty >>>>>>> much for the reasons you're hitting now. >>>>>>> >>>>>> IMO version checks should be done internally, we shouldn't force the >>>>>> users of the functional API to deal with versions themselves because >>>>>> we know how hard making write skew checks work is for us :) >>>>>> >>>>>> And I wouldn't go as far as to remove the functional listeners, >>>>>> instead I would change them so that read-write listeners are invoked >>>>>> on write-only operations and they force the loading of the previous >>>>>> value. I would also add a way for the regular listeners to say whether >>>>>> they need the previous value or not. >>>>>> >>>>>>> Right now I am inclined towards 4). There could be some internal use >>>>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>>>> though, but it's asking for trouble. >>>>>>> >>>>>>> >>>>>>> I agree! >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> >>>>>>> WDYT? >>>>>>> >>>>>>> Radim >>>>>>> >>>>>> Cheers >>>>>> Dan >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev(a)lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> -- >>> Radim Vansa <rvansa(a)redhat.com> >>> JBoss Performance Team >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- > Radim Vansa <rvansa(a)redhat.com> > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Dan Berindei

7:36 a.m.

On Thu, Jun 29, 2017 at 2:19 PM, Radim Vansa <rvansa(a)redhat.com> wrote:

...

On 06/29/2017 11:16 AM, Dan Berindei wrote: > On Thu, Jun 29, 2017 at 11:53 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >> On 06/28/2017 04:20 PM, Dan Berindei wrote: >>> On Wed, Jun 28, 2017 at 2:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>> On 06/28/2017 10:40 AM, Dan Berindei wrote: >>>>> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>>>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>>>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>>>>> as many times as needed. >>>>>>>> >>>>>>>> Continuous queries require the previous value itself, not just knowledge of >>>>>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>>>>> >>>>>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>>>>> depressed) :) >>>>>>>> >>>>>>>> I'd remove these commands completely or possibly remove them just from >>>>>>>> public API and keep them internal. >>>>>>>> >>>>>>> +1 to remove the flags from the public API. Most of them are not safe >>>>>>> for applications to use, and ignoring them when they can lead to >>>>>>> inconsistencies would make them useless. >>>>>>> >>>>>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>>>>> know when it is safe to skip the delete statement, and it relies on >>>>>>> the application making a (possibly wrong) choice. >>>>>>> >>>>>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>>>>> that applications use it right now. If query or listeners need the >>>>>>> previous value, then we should load it internally, but hide it from >>>>>>> the user. >>>>>>> >>>>>>> But removing it opens another discussion: should we replace it in the >>>>>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>>>>> should we make it the default and add a method >>>>>>> AdvancedCache.forceReturnPreviousValues()? >>>>>> Please don't derail the thread. >>>>>> >>>>> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >>>>> breaks the previous value for listeners, even if the QueryInterceptor >>>>> removes it from write commands. And it is public (+recommended) API, >>>>> in fact most if not all of our performance tests use it. >>>> That's just a flawed implementation. IPV is documented to be a 'safe' >>>> flag that should affect mostly primary -> origin replication, all the >>>> other is implementation. And we can fix that. Users should *not* expect >>>> that it e.g. skips loading from a cache store. We have already removed >>>> the modes that would be broken-by-design. >>>> >>> I think you're confusing IGNORE_RETURN_VALUES with SKIP_REMOTE_LOOKUP >>> here. The IVR javadoc doesn't say anything about remote lookups, only >>> SRL does. >> No, I am not; While IRV does not mention the replication, it's said to >> be 'safe'. So omitting the primary -> origin replication is basically >> all it can do when listeners are in place. You're right that I have >> missed the second part in SRL talking about put()s; I took it as a flag >> prohibiting any remote lookup (as the RPC operation in its whole) any >> time the remote value is needed. Yes, the second part seems equal to my >> understanding of IRV. >> >>> And I agree that the current status is far from ideal, but there is >>> one more valid alternative: we can decide that the previous value is >>> only reliable in clustered listeners, and local listeners don't always >>> have it. Document that, make sure continuous query uses clustered >>> listeners, and we're done :) >> Unreliable return values are worse than none; I would rather remove them >> if we can't guarantee that these are right. Though, clustered listeners >> are based on regular listeners, so you'd need some means to make them >> reliable. > We could change the clustered listeners so that they're not based on > the regular listeners... I've been pestering Will about this ever > since the clustered listeners landed! > > But I should have been clearer: I didn't mean that the listeners on > the backups should receive the previous value whenever we feel like > it, I meant we should document and enforce that the previous value is > only included in the event for listeners on the primary owner. >>>> On the other hand, write-only commands are not about *returning* the >>>> value but about (not) *reading* it, therefore (in my eyes) user could >>>> make that assumption and would like to enforce it this way. Even some >>>> docs explaining PersistenceMode.SKIP suggest that. >>>> >>> To me the purpose the same, there is no difference between returning >>> the previous value to the application or providing the previous value >>> via EntryView. >> There is a difference between what's provided locally and what's send >> over the network. >> >>> Applying this logic to the JCache API, it would mean >>> put() should never read the previous value, because some users could >>> assume that only getAndPut() reads it. >> OK, this is a valid point. >> >>> In the old times we didn't have IGNORE_RETURN_VALUES, only >>> SKIP_REMOTE_LOOKUP+SKIP_CACHE_LOAD, and they would sometimes be >>> ignored (e.g. if the write was conditional). I think that's what >>> Galder had in mind when he wrote the PersistenceMode api note, not the >>> current behaviour of SKIP_CACHE_LOAD. I'll let Galder clarify this >>> himself, but I'll be very disappointed if he says he designed the >>> write-only operations so that they'll never work with query. >>> >>> >>>> I don't want to talk about flags, because I see all flags but IPV as >>>> 'effectively internal'. Let's discuss it more high-level. Some API >>>> exposes non-reading operation - we can see that under some circumstances >>>> this is not possible so we have options to 1) break stuff 2) break API >>>> assumptions 3) sometimes break API assumptions 4) remove such API (to >>>> not allow the user to make such assumptions). There's also an option 5) >>>> to fail the operation if the API assumption would be broken. Though, I >>>> don't fancy getting exception from a WriteOnlyMap.eval just because >>>> someone has registered a listener. >>>> >>> I disagree with the premise: there's no good reason for the user to >>> assume that write-only commands are *guaranteed* to never load the >>> previous value from a store. We just need to add a clarification to >>> the write-only operations' javadoc, no need to break anything. >> OK then, though it diminishes the value of write-only commands a lot. >> >>> >>>>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >>>>> previous value on backup owners for most write commands >>>>> (LoadType.PRIMARY), we'd need to change that as well. >>>> Yes, all commands will have to load current value on all owners. >>>> >>>>>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I am working on entry version history (again). In Como we've discussed >>>>>>>> that previous values are needed for (continuous) query and reliable >>>>>>>> listeners, >>>>>>>> >>>>>>>> >>>>>>>> Index based queries also require the previous value on a write - unless we >>>>>>>> can get "strongly typed caches" giving guarantees about the class to >>>>>>>> represent the content of a cache to be unique. >>>>>>>> >>>>>>>> Essentially we only need to know the type of the previous object. It might >>>>>>>> be worth having a way to load the type metadata if the previous value only. >>>>>>>> >>>>>>>> so I wonder what should we do with functional write-only >>>>>>>> commands. These are different to commands with flags, because flags >>>>>>>> (other than ignore return value) are expected to break something. >>>>>>>> >>>>>>>> >>>>>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>>>>> search the mailing list. >>>>>>>> >>>>>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>>>>> break things. >>>>>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>>>>> configuration/integrations veto them. >>>>>>>> >>>>>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>>>>> users having to solve puzzles, such as learning details about these flags >>>>>>>> and their possible side effects. >>>>>>>> >>>>>>>> So assuming they become either "safe" or internal, maybe you can take >>>>>>>> advantage of them? >>>>>>>> >>>>>>>> I see >>>>>>>> the available options as: >>>>>>>> >>>>>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>>>>> (warn user that he will break it) >>>>>>>> >>>>>>>> 2) run write-only without any optimization, rendering them useless >>>>>>>> >>>>>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>>>>> stuff that could get broken) >>>>>>>> >>>>>>>> >>>>>>>> Might be useful for making a POC work, but I believe query will be very >>>>>>>> likely to be often enabled. >>>>>>>> Having an either / or switch for different features in Infinispan will make >>>>>>>> it harder to use and understand, so I'd rather see work on the right design >>>>>>>> as taking temporary shortcuts risks baking into stone features which we >>>>>>>> later struggle to fix or maintain. >>>>>>>> >>>>>>> I vote for this option. >>>>>>> >>>>>>> Query, listeners, and other components that need the previous value >>>>>>> should not just assume that the application knows better, they should >>>>>>> be able to change how operations works based on their needs. Of >>>>>>> course, the reverse is also true: if the application uses write-only >>>>>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>>>>> be possible for the user to detect why the previous values are still >>>>>>> loaded. >>>>>> If it were just query (static configuration), I would be okay with this >>>>>> idea. But as per listeners - besides tainting the design (event source >>>>>> should not check if there's a listener) you'd need to check *before* >>>>> The source wouldn't check for listeners explicitly, the notifier would >>>>> have an isPreviousValueNeeded() method and precompute that before a >>>>> listener is added or after a listener is removed. I was am assuming >>>>> some listeners will not need the previous value, e.g. the listeners >>>>> installed by streams. >>>> You can cover your warts with a make-up but you'll still have warts :) >>> Cutting them off doesn't necessarily work, either :) >> Yep, some people tend to fix w/ hacks instead of designing :) >> >>>>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>>>>> EWI). So this is a space for race conditions or weird handling (if >>>>>> there's a listener when I am about to call notify and my flags are not >>>>>> cleared, skip the notification and pretend that this code was invoked >>>>>> before the listener was registered...). Or do you have another solution >>>>>> in mind (config option to disable listeners && all features using those?). >>>>>> >>>>> I was definitely going for the weird handling... >>>>> >>>>> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >>>>> it's loaded, and check that before invoking a listener that needs the >>>>> previous value. It is missing one edge case: if one thread starts a >>>>> write operation, then another thread installs a listener that requires >>>>> the previous value and iterates over the cache, the second thread may >>>>> not see the value written by the first thread. >>>> If the operations overlap, you could pretend that the write has finished >>>> before the listener was invoked and simply not notify the listener. If I >>>> am missing it please write it down in code. But handling this in any way >>>> is still clumsy. >>> I hope pseudo-code is fine... >>> >>> 1. cache.put(k, v1) starts, doesn't load the previous value v0 in the context >>> 2. cache.addListener(l) runs, doesn't block >>> 3. cache.entrySet().forEach() runs, finds k->v0 >>> 4. cache.put(k, v1) commits k->v1, should notify the listener but >>> doesn't have the previous value >>> 5. cache.put(k, v0) returns, but the code that installed the listener >>> thinks the value of k is still v0 >> Oh OK, I should have drawn that myself when considering the scenario. >> You're right, here we'll have to retry. >> >> All in all, I think this discussion is done. We'll tell users to stick >> their flags where the sun doesn't shine and remove any inconvenient >> ones. Should we issue a warning any time we're removing the flag? >> > If you mean that we should remove the flags from the public API, I > agree. If you mean we should just ignore them, then no, because most > of the flags were added for internal components that really need their > semantics. We can't remove them from public API before Infinspan 10, and I think that it will be a quite an unpopular step even after that. But until 10, I think that the common agreement was to not break query, that is ignore the flags. And make write-only reading.

So SKIP_INDEXING should not skip indexing because it can break query??

...

R. > > Dan > > >> Radim >> >>> >>>>> So now I'm thinking we should retry the write commands when >>>>> isPreviousValueNeeded() changes... Not very appealing, but I think the >>>>> performance difference is worth it. >>>>> >>>>>> R. >>>>>> >>>>>>>> 4) remove write-only commands completely (and probably functional >>>>>>>> listeners as well because these will lose their purpose) >>>>>>>> >>>>>>>> >>>>>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>>>>> be applied. >>>>>>>> I believe we had already pointed out this would eventually happen, pretty >>>>>>>> much for the reasons you're hitting now. >>>>>>>> >>>>>>> IMO version checks should be done internally, we shouldn't force the >>>>>>> users of the functional API to deal with versions themselves because >>>>>>> we know how hard making write skew checks work is for us :) >>>>>>> >>>>>>> And I wouldn't go as far as to remove the functional listeners, >>>>>>> instead I would change them so that read-write listeners are invoked >>>>>>> on write-only operations and they force the loading of the previous >>>>>>> value. I would also add a way for the regular listeners to say whether >>>>>>> they need the previous value or not. >>>>>>> >>>>>>>> Right now I am inclined towards 4). There could be some internal use >>>>>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>>>>> though, but it's asking for trouble. >>>>>>>> >>>>>>>> >>>>>>>> I agree! >>>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> >>>>>>>> WDYT? >>>>>>>> >>>>>>>> Radim >>>>>>>> >>>>>>> Cheers >>>>>>> Dan >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev(a)lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> -- >>>> Radim Vansa <rvansa(a)redhat.com> >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev(a)lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa <rvansa(a)redhat.com> >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Radim Vansa

8:51 a.m.

On 06/29/2017 02:36 PM, Dan Berindei wrote:

...

On Thu, Jun 29, 2017 at 2:19 PM, Radim Vansa <rvansa(a)redhat.com> wrote: > On 06/29/2017 11:16 AM, Dan Berindei wrote: >> On Thu, Jun 29, 2017 at 11:53 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>> On 06/28/2017 04:20 PM, Dan Berindei wrote: >>>> On Wed, Jun 28, 2017 at 2:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>>> On 06/28/2017 10:40 AM, Dan Berindei wrote: >>>>>> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>>>>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>>>>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>>>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>>>>>> as many times as needed. >>>>>>>>> >>>>>>>>> Continuous queries require the previous value itself, not just knowledge of >>>>>>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>>>>>> >>>>>>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>>>>>> depressed) :) >>>>>>>>> >>>>>>>>> I'd remove these commands completely or possibly remove them just from >>>>>>>>> public API and keep them internal. >>>>>>>>> >>>>>>>> +1 to remove the flags from the public API. Most of them are not safe >>>>>>>> for applications to use, and ignoring them when they can lead to >>>>>>>> inconsistencies would make them useless. >>>>>>>> >>>>>>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>>>>>> know when it is safe to skip the delete statement, and it relies on >>>>>>>> the application making a (possibly wrong) choice. >>>>>>>> >>>>>>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>>>>>> that applications use it right now. If query or listeners need the >>>>>>>> previous value, then we should load it internally, but hide it from >>>>>>>> the user. >>>>>>>> >>>>>>>> But removing it opens another discussion: should we replace it in the >>>>>>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>>>>>> should we make it the default and add a method >>>>>>>> AdvancedCache.forceReturnPreviousValues()? >>>>>>> Please don't derail the thread. >>>>>>> >>>>>> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >>>>>> breaks the previous value for listeners, even if the QueryInterceptor >>>>>> removes it from write commands. And it is public (+recommended) API, >>>>>> in fact most if not all of our performance tests use it. >>>>> That's just a flawed implementation. IPV is documented to be a 'safe' >>>>> flag that should affect mostly primary -> origin replication, all the >>>>> other is implementation. And we can fix that. Users should *not* expect >>>>> that it e.g. skips loading from a cache store. We have already removed >>>>> the modes that would be broken-by-design. >>>>> >>>> I think you're confusing IGNORE_RETURN_VALUES with SKIP_REMOTE_LOOKUP >>>> here. The IVR javadoc doesn't say anything about remote lookups, only >>>> SRL does. >>> No, I am not; While IRV does not mention the replication, it's said to >>> be 'safe'. So omitting the primary -> origin replication is basically >>> all it can do when listeners are in place. You're right that I have >>> missed the second part in SRL talking about put()s; I took it as a flag >>> prohibiting any remote lookup (as the RPC operation in its whole) any >>> time the remote value is needed. Yes, the second part seems equal to my >>> understanding of IRV. >>> >>>> And I agree that the current status is far from ideal, but there is >>>> one more valid alternative: we can decide that the previous value is >>>> only reliable in clustered listeners, and local listeners don't always >>>> have it. Document that, make sure continuous query uses clustered >>>> listeners, and we're done :) >>> Unreliable return values are worse than none; I would rather remove them >>> if we can't guarantee that these are right. Though, clustered listeners >>> are based on regular listeners, so you'd need some means to make them >>> reliable. >> We could change the clustered listeners so that they're not based on >> the regular listeners... I've been pestering Will about this ever >> since the clustered listeners landed! >> >> But I should have been clearer: I didn't mean that the listeners on >> the backups should receive the previous value whenever we feel like >> it, I meant we should document and enforce that the previous value is >> only included in the event for listeners on the primary owner. >>>>> On the other hand, write-only commands are not about *returning* the >>>>> value but about (not) *reading* it, therefore (in my eyes) user could >>>>> make that assumption and would like to enforce it this way. Even some >>>>> docs explaining PersistenceMode.SKIP suggest that. >>>>> >>>> To me the purpose the same, there is no difference between returning >>>> the previous value to the application or providing the previous value >>>> via EntryView. >>> There is a difference between what's provided locally and what's send >>> over the network. >>> >>>> Applying this logic to the JCache API, it would mean >>>> put() should never read the previous value, because some users could >>>> assume that only getAndPut() reads it. >>> OK, this is a valid point. >>> >>>> In the old times we didn't have IGNORE_RETURN_VALUES, only >>>> SKIP_REMOTE_LOOKUP+SKIP_CACHE_LOAD, and they would sometimes be >>>> ignored (e.g. if the write was conditional). I think that's what >>>> Galder had in mind when he wrote the PersistenceMode api note, not the >>>> current behaviour of SKIP_CACHE_LOAD. I'll let Galder clarify this >>>> himself, but I'll be very disappointed if he says he designed the >>>> write-only operations so that they'll never work with query. >>>> >>>> >>>>> I don't want to talk about flags, because I see all flags but IPV as >>>>> 'effectively internal'. Let's discuss it more high-level. Some API >>>>> exposes non-reading operation - we can see that under some circumstances >>>>> this is not possible so we have options to 1) break stuff 2) break API >>>>> assumptions 3) sometimes break API assumptions 4) remove such API (to >>>>> not allow the user to make such assumptions). There's also an option 5) >>>>> to fail the operation if the API assumption would be broken. Though, I >>>>> don't fancy getting exception from a WriteOnlyMap.eval just because >>>>> someone has registered a listener. >>>>> >>>> I disagree with the premise: there's no good reason for the user to >>>> assume that write-only commands are *guaranteed* to never load the >>>> previous value from a store. We just need to add a clarification to >>>> the write-only operations' javadoc, no need to break anything. >>> OK then, though it diminishes the value of write-only commands a lot. >>> >>>>>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >>>>>> previous value on backup owners for most write commands >>>>>> (LoadType.PRIMARY), we'd need to change that as well. >>>>> Yes, all commands will have to load current value on all owners. >>>>> >>>>>>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I am working on entry version history (again). In Como we've discussed >>>>>>>>> that previous values are needed for (continuous) query and reliable >>>>>>>>> listeners, >>>>>>>>> >>>>>>>>> >>>>>>>>> Index based queries also require the previous value on a write - unless we >>>>>>>>> can get "strongly typed caches" giving guarantees about the class to >>>>>>>>> represent the content of a cache to be unique. >>>>>>>>> >>>>>>>>> Essentially we only need to know the type of the previous object. It might >>>>>>>>> be worth having a way to load the type metadata if the previous value only. >>>>>>>>> >>>>>>>>> so I wonder what should we do with functional write-only >>>>>>>>> commands. These are different to commands with flags, because flags >>>>>>>>> (other than ignore return value) are expected to break something. >>>>>>>>> >>>>>>>>> >>>>>>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>>>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>>>>>> search the mailing list. >>>>>>>>> >>>>>>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>>>>>> break things. >>>>>>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>>>>>> configuration/integrations veto them. >>>>>>>>> >>>>>>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>>>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>>>>>> users having to solve puzzles, such as learning details about these flags >>>>>>>>> and their possible side effects. >>>>>>>>> >>>>>>>>> So assuming they become either "safe" or internal, maybe you can take >>>>>>>>> advantage of them? >>>>>>>>> >>>>>>>>> I see >>>>>>>>> the available options as: >>>>>>>>> >>>>>>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>>>>>> (warn user that he will break it) >>>>>>>>> >>>>>>>>> 2) run write-only without any optimization, rendering them useless >>>>>>>>> >>>>>>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>>>>>> stuff that could get broken) >>>>>>>>> >>>>>>>>> >>>>>>>>> Might be useful for making a POC work, but I believe query will be very >>>>>>>>> likely to be often enabled. >>>>>>>>> Having an either / or switch for different features in Infinispan will make >>>>>>>>> it harder to use and understand, so I'd rather see work on the right design >>>>>>>>> as taking temporary shortcuts risks baking into stone features which we >>>>>>>>> later struggle to fix or maintain. >>>>>>>>> >>>>>>>> I vote for this option. >>>>>>>> >>>>>>>> Query, listeners, and other components that need the previous value >>>>>>>> should not just assume that the application knows better, they should >>>>>>>> be able to change how operations works based on their needs. Of >>>>>>>> course, the reverse is also true: if the application uses write-only >>>>>>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>>>>>> be possible for the user to detect why the previous values are still >>>>>>>> loaded. >>>>>>> If it were just query (static configuration), I would be okay with this >>>>>>> idea. But as per listeners - besides tainting the design (event source >>>>>>> should not check if there's a listener) you'd need to check *before* >>>>>> The source wouldn't check for listeners explicitly, the notifier would >>>>>> have an isPreviousValueNeeded() method and precompute that before a >>>>>> listener is added or after a listener is removed. I was am assuming >>>>>> some listeners will not need the previous value, e.g. the listeners >>>>>> installed by streams. >>>>> You can cover your warts with a make-up but you'll still have warts :) >>>> Cutting them off doesn't necessarily work, either :) >>> Yep, some people tend to fix w/ hacks instead of designing :) >>> >>>>>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>>>>>> EWI). So this is a space for race conditions or weird handling (if >>>>>>> there's a listener when I am about to call notify and my flags are not >>>>>>> cleared, skip the notification and pretend that this code was invoked >>>>>>> before the listener was registered...). Or do you have another solution >>>>>>> in mind (config option to disable listeners && all features using those?). >>>>>>> >>>>>> I was definitely going for the weird handling... >>>>>> >>>>>> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >>>>>> it's loaded, and check that before invoking a listener that needs the >>>>>> previous value. It is missing one edge case: if one thread starts a >>>>>> write operation, then another thread installs a listener that requires >>>>>> the previous value and iterates over the cache, the second thread may >>>>>> not see the value written by the first thread. >>>>> If the operations overlap, you could pretend that the write has finished >>>>> before the listener was invoked and simply not notify the listener. If I >>>>> am missing it please write it down in code. But handling this in any way >>>>> is still clumsy. >>>> I hope pseudo-code is fine... >>>> >>>> 1. cache.put(k, v1) starts, doesn't load the previous value v0 in the context >>>> 2. cache.addListener(l) runs, doesn't block >>>> 3. cache.entrySet().forEach() runs, finds k->v0 >>>> 4. cache.put(k, v1) commits k->v1, should notify the listener but >>>> doesn't have the previous value >>>> 5. cache.put(k, v0) returns, but the code that installed the listener >>>> thinks the value of k is still v0 >>> Oh OK, I should have drawn that myself when considering the scenario. >>> You're right, here we'll have to retry. >>> >>> All in all, I think this discussion is done. We'll tell users to stick >>> their flags where the sun doesn't shine and remove any inconvenient >>> ones. Should we issue a warning any time we're removing the flag? >>> >> If you mean that we should remove the flags from the public API, I >> agree. If you mean we should just ignore them, then no, because most >> of the flags were added for internal components that really need their >> semantics. > We can't remove them from public API before Infinspan 10, and I think > that it will be a quite an unpopular step even after that. But until 10, > I think that the common agreement was to not break query, that is ignore > the flags. And make write-only reading. > So SKIP_INDEXING should not skip indexing because it can break query??

Ehm... Talking about all flags was wrong, and I think that I've also mixed your input on write-only command and on flags. Let's reiterate, until we hide the flags (in 10+): A) how should we treat SKIP_CACHE_LOAD with respect to (clustered) listeners, query, and write skew check? (IIRC we ignore that for purposes of WSC) B) for write-only, will we load the value if necessary (listeners/query/wsc)? (I guess that the answer is yes) C) for write-only, will we treat PersistenceMode.SKIP differently? D) how should we treat SKIP_REMOTE_LOOKUP when the current write-owner is not a read-owner? R.

...

> R. > >> Dan >> >> >>> Radim >>> >>>>>> So now I'm thinking we should retry the write commands when >>>>>> isPreviousValueNeeded() changes... Not very appealing, but I think the >>>>>> performance difference is worth it. >>>>>> >>>>>>> R. >>>>>>> >>>>>>>>> 4) remove write-only commands completely (and probably functional >>>>>>>>> listeners as well because these will lose their purpose) >>>>>>>>> >>>>>>>>> >>>>>>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>>>>>> be applied. >>>>>>>>> I believe we had already pointed out this would eventually happen, pretty >>>>>>>>> much for the reasons you're hitting now. >>>>>>>>> >>>>>>>> IMO version checks should be done internally, we shouldn't force the >>>>>>>> users of the functional API to deal with versions themselves because >>>>>>>> we know how hard making write skew checks work is for us :) >>>>>>>> >>>>>>>> And I wouldn't go as far as to remove the functional listeners, >>>>>>>> instead I would change them so that read-write listeners are invoked >>>>>>>> on write-only operations and they force the loading of the previous >>>>>>>> value. I would also add a way for the regular listeners to say whether >>>>>>>> they need the previous value or not. >>>>>>>> >>>>>>>>> Right now I am inclined towards 4). There could be some internal use >>>>>>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>>>>>> though, but it's asking for trouble. >>>>>>>>> >>>>>>>>> >>>>>>>>> I agree! >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> >>>>>>>>> WDYT? >>>>>>>>> >>>>>>>>> Radim >>>>>>>>> >>>>>>>> Cheers >>>>>>>> Dan >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev(a)lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> -- >>>>> Radim Vansa <rvansa(a)redhat.com> >>>>> JBoss Performance Team >>>>> >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev(a)lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev(a)lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> -- >>> Radim Vansa <rvansa(a)redhat.com> >>> JBoss Performance Team >>> >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- > Radim Vansa <rvansa(a)redhat.com> > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Dan Berindei

Wednesday, 5 July Wed, 5 Jul

4:54 a.m.

On Thu, Jun 29, 2017 at 4:51 PM, Radim Vansa <rvansa(a)redhat.com> wrote:

...

On 06/29/2017 02:36 PM, Dan Berindei wrote: > On Thu, Jun 29, 2017 at 2:19 PM, Radim Vansa <rvansa(a)redhat.com> wrote: >> On 06/29/2017 11:16 AM, Dan Berindei wrote: >>> On Thu, Jun 29, 2017 at 11:53 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>> On 06/28/2017 04:20 PM, Dan Berindei wrote: >>>>> On Wed, Jun 28, 2017 at 2:17 PM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>>>> On 06/28/2017 10:40 AM, Dan Berindei wrote: >>>>>>> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>>>>>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>>>>>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>>>>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>>>>>>> as many times as needed. >>>>>>>>>> >>>>>>>>>> Continuous queries require the previous value itself, not just knowledge of >>>>>>>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>>>>>>> >>>>>>>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>>>>>>> depressed) :) >>>>>>>>>> >>>>>>>>>> I'd remove these commands completely or possibly remove them just from >>>>>>>>>> public API and keep them internal. >>>>>>>>>> >>>>>>>>> +1 to remove the flags from the public API. Most of them are not safe >>>>>>>>> for applications to use, and ignoring them when they can lead to >>>>>>>>> inconsistencies would make them useless. >>>>>>>>> >>>>>>>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>>>>>>> know when it is safe to skip the delete statement, and it relies on >>>>>>>>> the application making a (possibly wrong) choice. >>>>>>>>> >>>>>>>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>>>>>>> that applications use it right now. If query or listeners need the >>>>>>>>> previous value, then we should load it internally, but hide it from >>>>>>>>> the user. >>>>>>>>> >>>>>>>>> But removing it opens another discussion: should we replace it in the >>>>>>>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>>>>>>> should we make it the default and add a method >>>>>>>>> AdvancedCache.forceReturnPreviousValues()? >>>>>>>> Please don't derail the thread. >>>>>>>> >>>>>>> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >>>>>>> breaks the previous value for listeners, even if the QueryInterceptor >>>>>>> removes it from write commands. And it is public (+recommended) API, >>>>>>> in fact most if not all of our performance tests use it. >>>>>> That's just a flawed implementation. IPV is documented to be a 'safe' >>>>>> flag that should affect mostly primary -> origin replication, all the >>>>>> other is implementation. And we can fix that. Users should *not* expect >>>>>> that it e.g. skips loading from a cache store. We have already removed >>>>>> the modes that would be broken-by-design. >>>>>> >>>>> I think you're confusing IGNORE_RETURN_VALUES with SKIP_REMOTE_LOOKUP >>>>> here. The IVR javadoc doesn't say anything about remote lookups, only >>>>> SRL does. >>>> No, I am not; While IRV does not mention the replication, it's said to >>>> be 'safe'. So omitting the primary -> origin replication is basically >>>> all it can do when listeners are in place. You're right that I have >>>> missed the second part in SRL talking about put()s; I took it as a flag >>>> prohibiting any remote lookup (as the RPC operation in its whole) any >>>> time the remote value is needed. Yes, the second part seems equal to my >>>> understanding of IRV. >>>> >>>>> And I agree that the current status is far from ideal, but there is >>>>> one more valid alternative: we can decide that the previous value is >>>>> only reliable in clustered listeners, and local listeners don't always >>>>> have it. Document that, make sure continuous query uses clustered >>>>> listeners, and we're done :) >>>> Unreliable return values are worse than none; I would rather remove them >>>> if we can't guarantee that these are right. Though, clustered listeners >>>> are based on regular listeners, so you'd need some means to make them >>>> reliable. >>> We could change the clustered listeners so that they're not based on >>> the regular listeners... I've been pestering Will about this ever >>> since the clustered listeners landed! >>> >>> But I should have been clearer: I didn't mean that the listeners on >>> the backups should receive the previous value whenever we feel like >>> it, I meant we should document and enforce that the previous value is >>> only included in the event for listeners on the primary owner. >>>>>> On the other hand, write-only commands are not about *returning* the >>>>>> value but about (not) *reading* it, therefore (in my eyes) user could >>>>>> make that assumption and would like to enforce it this way. Even some >>>>>> docs explaining PersistenceMode.SKIP suggest that. >>>>>> >>>>> To me the purpose the same, there is no difference between returning >>>>> the previous value to the application or providing the previous value >>>>> via EntryView. >>>> There is a difference between what's provided locally and what's send >>>> over the network. >>>> >>>>> Applying this logic to the JCache API, it would mean >>>>> put() should never read the previous value, because some users could >>>>> assume that only getAndPut() reads it. >>>> OK, this is a valid point. >>>> >>>>> In the old times we didn't have IGNORE_RETURN_VALUES, only >>>>> SKIP_REMOTE_LOOKUP+SKIP_CACHE_LOAD, and they would sometimes be >>>>> ignored (e.g. if the write was conditional). I think that's what >>>>> Galder had in mind when he wrote the PersistenceMode api note, not the >>>>> current behaviour of SKIP_CACHE_LOAD. I'll let Galder clarify this >>>>> himself, but I'll be very disappointed if he says he designed the >>>>> write-only operations so that they'll never work with query. >>>>> >>>>> >>>>>> I don't want to talk about flags, because I see all flags but IPV as >>>>>> 'effectively internal'. Let's discuss it more high-level. Some API >>>>>> exposes non-reading operation - we can see that under some circumstances >>>>>> this is not possible so we have options to 1) break stuff 2) break API >>>>>> assumptions 3) sometimes break API assumptions 4) remove such API (to >>>>>> not allow the user to make such assumptions). There's also an option 5) >>>>>> to fail the operation if the API assumption would be broken. Though, I >>>>>> don't fancy getting exception from a WriteOnlyMap.eval just because >>>>>> someone has registered a listener. >>>>>> >>>>> I disagree with the premise: there's no good reason for the user to >>>>> assume that write-only commands are *guaranteed* to never load the >>>>> previous value from a store. We just need to add a clarification to >>>>> the write-only operations' javadoc, no need to break anything. >>>> OK then, though it diminishes the value of write-only commands a lot. >>>> >>>>>>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >>>>>>> previous value on backup owners for most write commands >>>>>>> (LoadType.PRIMARY), we'd need to change that as well. >>>>>> Yes, all commands will have to load current value on all owners. >>>>>> >>>>>>>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I am working on entry version history (again). In Como we've discussed >>>>>>>>>> that previous values are needed for (continuous) query and reliable >>>>>>>>>> listeners, >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Index based queries also require the previous value on a write - unless we >>>>>>>>>> can get "strongly typed caches" giving guarantees about the class to >>>>>>>>>> represent the content of a cache to be unique. >>>>>>>>>> >>>>>>>>>> Essentially we only need to know the type of the previous object. It might >>>>>>>>>> be worth having a way to load the type metadata if the previous value only. >>>>>>>>>> >>>>>>>>>> so I wonder what should we do with functional write-only >>>>>>>>>> commands. These are different to commands with flags, because flags >>>>>>>>>> (other than ignore return value) are expected to break something. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>>>>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>>>>>>> search the mailing list. >>>>>>>>>> >>>>>>>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>>>>>>> break things. >>>>>>>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>>>>>>> configuration/integrations veto them. >>>>>>>>>> >>>>>>>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>>>>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>>>>>>> users having to solve puzzles, such as learning details about these flags >>>>>>>>>> and their possible side effects. >>>>>>>>>> >>>>>>>>>> So assuming they become either "safe" or internal, maybe you can take >>>>>>>>>> advantage of them? >>>>>>>>>> >>>>>>>>>> I see >>>>>>>>>> the available options as: >>>>>>>>>> >>>>>>>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>>>>>>> (warn user that he will break it) >>>>>>>>>> >>>>>>>>>> 2) run write-only without any optimization, rendering them useless >>>>>>>>>> >>>>>>>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>>>>>>> stuff that could get broken) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Might be useful for making a POC work, but I believe query will be very >>>>>>>>>> likely to be often enabled. >>>>>>>>>> Having an either / or switch for different features in Infinispan will make >>>>>>>>>> it harder to use and understand, so I'd rather see work on the right design >>>>>>>>>> as taking temporary shortcuts risks baking into stone features which we >>>>>>>>>> later struggle to fix or maintain. >>>>>>>>>> >>>>>>>>> I vote for this option. >>>>>>>>> >>>>>>>>> Query, listeners, and other components that need the previous value >>>>>>>>> should not just assume that the application knows better, they should >>>>>>>>> be able to change how operations works based on their needs. Of >>>>>>>>> course, the reverse is also true: if the application uses write-only >>>>>>>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>>>>>>> be possible for the user to detect why the previous values are still >>>>>>>>> loaded. >>>>>>>> If it were just query (static configuration), I would be okay with this >>>>>>>> idea. But as per listeners - besides tainting the design (event source >>>>>>>> should not check if there's a listener) you'd need to check *before* >>>>>>> The source wouldn't check for listeners explicitly, the notifier would >>>>>>> have an isPreviousValueNeeded() method and precompute that before a >>>>>>> listener is added or after a listener is removed. I was am assuming >>>>>>> some listeners will not need the previous value, e.g. the listeners >>>>>>> installed by streams. >>>>>> You can cover your warts with a make-up but you'll still have warts :) >>>>> Cutting them off doesn't necessarily work, either :) >>>> Yep, some people tend to fix w/ hacks instead of designing :) >>>> >>>>>>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>>>>>>> EWI). So this is a space for race conditions or weird handling (if >>>>>>>> there's a listener when I am about to call notify and my flags are not >>>>>>>> cleared, skip the notification and pretend that this code was invoked >>>>>>>> before the listener was registered...). Or do you have another solution >>>>>>>> in mind (config option to disable listeners && all features using those?). >>>>>>>> >>>>>>> I was definitely going for the weird handling... >>>>>>> >>>>>>> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >>>>>>> it's loaded, and check that before invoking a listener that needs the >>>>>>> previous value. It is missing one edge case: if one thread starts a >>>>>>> write operation, then another thread installs a listener that requires >>>>>>> the previous value and iterates over the cache, the second thread may >>>>>>> not see the value written by the first thread. >>>>>> If the operations overlap, you could pretend that the write has finished >>>>>> before the listener was invoked and simply not notify the listener. If I >>>>>> am missing it please write it down in code. But handling this in any way >>>>>> is still clumsy. >>>>> I hope pseudo-code is fine... >>>>> >>>>> 1. cache.put(k, v1) starts, doesn't load the previous value v0 in the context >>>>> 2. cache.addListener(l) runs, doesn't block >>>>> 3. cache.entrySet().forEach() runs, finds k->v0 >>>>> 4. cache.put(k, v1) commits k->v1, should notify the listener but >>>>> doesn't have the previous value >>>>> 5. cache.put(k, v0) returns, but the code that installed the listener >>>>> thinks the value of k is still v0 >>>> Oh OK, I should have drawn that myself when considering the scenario. >>>> You're right, here we'll have to retry. >>>> >>>> All in all, I think this discussion is done. We'll tell users to stick >>>> their flags where the sun doesn't shine and remove any inconvenient >>>> ones. Should we issue a warning any time we're removing the flag? >>>> >>> If you mean that we should remove the flags from the public API, I >>> agree. If you mean we should just ignore them, then no, because most >>> of the flags were added for internal components that really need their >>> semantics. >> We can't remove them from public API before Infinspan 10, and I think >> that it will be a quite an unpopular step even after that. But until 10, >> I think that the common agreement was to not break query, that is ignore >> the flags. And make write-only reading. >> > So SKIP_INDEXING should not skip indexing because it can break query?? Ehm... Talking about all flags was wrong, and I think that I've also mixed your input on write-only command and on flags.

This is at least partially my fault, because I was thinking of the write-only commands as regular write commands with the IGNORE_RETURN_VALUES flag.

...

Let's reiterate, until we hide the flags (in 10+): A) how should we treat SKIP_CACHE_LOAD with respect to (clustered) listeners, query, and write skew check? (IIRC we ignore that for purposes of WSC)

This is a perfect example of a flag that shouldn't be in the public API. The intent was clearly to skip the cache loader for Cache.put(), but it works at a much lower level than it should, so the interaction with any other operation was undefined before ISPN-5643. In ISPN-5643 I documented it as skipping the load all the time, even for delta writes, in order to push users towards IGNORE_RETURN_VALUES, but I missed WSC (or perhaps there was a test I didn't want to break). For a simpler implementation, I'd still like SKIP_CACHE_LOAD to apply all the time. But it's hard for users to know exactly how other features will be affected, so I'm starting to think it should be like IGNORE_RETURN_VALUES/SKIP_REMOTE_LOOKUP, and the entry should be loaded from the store if it's needed for anything other than the return value (both for reads and for writes). compute() should probably load the previous value all the time. The same goes for read/read-write operations on a functional map created on top of a cache with SKIP_CACHE_LOAD.

...

B) for write-only, will we load the value if necessary (listeners/query/wsc)? (I guess that the answer is yes)

For query/WSC, yes. For listeners, my current thinking is that the previous value should only be available in clustered listeners, and adding a regular listener should not force write commands to load the previous value.

...

C) for write-only, will we treat PersistenceMode.SKIP differently?

The PersistenceMode javadoc says it's only about stores, just like SKIP_CACHE_STORE, so I don't see any reason to treat write-only commands differently.

...

D) how should we treat SKIP_REMOTE_LOOKUP when the current write-owner is not a read-owner?

The javadoc says the flag "will prevent retrieving a remote value [...] to return the overwritten value for {@link Cache#put(Object, Object)}", so SKIP_REMOTE_LOOKUP should be ignored when the previous value is needed for anything other than the return value. Dan

...

R. > >> R. >> >>> Dan >>> >>> >>>> Radim >>>> >>>>>>> So now I'm thinking we should retry the write commands when >>>>>>> isPreviousValueNeeded() changes... Not very appealing, but I think the >>>>>>> performance difference is worth it. >>>>>>> >>>>>>>> R. >>>>>>>> >>>>>>>>>> 4) remove write-only commands completely (and probably functional >>>>>>>>>> listeners as well because these will lose their purpose) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>>>>>>> be applied. >>>>>>>>>> I believe we had already pointed out this would eventually happen, pretty >>>>>>>>>> much for the reasons you're hitting now. >>>>>>>>>> >>>>>>>>> IMO version checks should be done internally, we shouldn't force the >>>>>>>>> users of the functional API to deal with versions themselves because >>>>>>>>> we know how hard making write skew checks work is for us :) >>>>>>>>> >>>>>>>>> And I wouldn't go as far as to remove the functional listeners, >>>>>>>>> instead I would change them so that read-write listeners are invoked >>>>>>>>> on write-only operations and they force the loading of the previous >>>>>>>>> value. I would also add a way for the regular listeners to say whether >>>>>>>>> they need the previous value or not. >>>>>>>>> >>>>>>>>>> Right now I am inclined towards 4). There could be some internal use >>>>>>>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>>>>>>> though, but it's asking for trouble. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I agree! >>>>>>>>>> >>>>>>>>>> Thanks >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> WDYT? >>>>>>>>>> >>>>>>>>>> Radim >>>>>>>>>> >>>>>>>>> Cheers >>>>>>>>> Dan >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> infinispan-dev(a)lists.jboss.org >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>> -- >>>>>> Radim Vansa <rvansa(a)redhat.com> >>>>>> JBoss Performance Team >>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> infinispan-dev(a)lists.jboss.org >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> infinispan-dev(a)lists.jboss.org >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> -- >>>> Radim Vansa <rvansa(a)redhat.com> >>>> JBoss Performance Team >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> infinispan-dev(a)lists.jboss.org >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa <rvansa(a)redhat.com> >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Adrian Nistor

Friday, 30 June Fri, 30 Jun

6:13 a.m.

<<So SKIP_INDEXING should not skip indexing because it can break query??>> Dan, drawing the conversation into absurdity is not useful. On 06/29/2017 03:36 PM, Dan Berindei wrote:

...

Radim Vansa

Wednesday, 28 June Wed, 28 Jun

9:25 a.m.

On 06/28/2017 01:17 PM, Radim Vansa wrote:

...

On 06/28/2017 10:40 AM, Dan Berindei wrote: > On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>> I've said this in a previous thread on this same issue, I will repeat myself >>>> as many times as needed. >>>> >>>> Continuous queries require the previous value itself, not just knowledge of >>>> the type of the previous value. Strongly typed caches solve no problem here. >>>> >>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>> depressed) :) >>>> >>>> I'd remove these commands completely or possibly remove them just from >>>> public API and keep them internal. >>>> >>> +1 to remove the flags from the public API. Most of them are not safe >>> for applications to use, and ignoring them when they can lead to >>> inconsistencies would make them useless. >>> >>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>> know when it is safe to skip the delete statement, and it relies on >>> the application making a (possibly wrong) choice. >>> >>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>> that applications use it right now. If query or listeners need the >>> previous value, then we should load it internally, but hide it from >>> the user. >>> >>> But removing it opens another discussion: should we replace it in the >>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>> should we make it the default and add a method >>> AdvancedCache.forceReturnPreviousValues()? >> Please don't derail the thread. >> > I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also > breaks the previous value for listeners, even if the QueryInterceptor > removes it from write commands. And it is public (+recommended) API, > in fact most if not all of our performance tests use it. That's just a flawed implementation. IPV is documented to be a 'safe' flag that should affect mostly primary -> origin replication, all the other is implementation. And we can fix that. Users should *not* expect that it e.g. skips loading from a cache store. We have already removed the modes that would be broken-by-design. On the other hand, write-only commands are not about *returning* the value but about (not) *reading* it, therefore (in my eyes) user could make that assumption and would like to enforce it this way. Even some docs explaining PersistenceMode.SKIP suggest that. I don't want to talk about flags, because I see all flags but IPV as 'effectively internal'. Let's discuss it more high-level. Some API exposes non-reading operation - we can see that under some circumstances this is not possible so we have options to 1) break stuff 2) break API assumptions 3) sometimes break API assumptions 4) remove such API (to not allow the user to make such assumptions). There's also an option 5) to fail the operation if the API assumption would be broken. Though, I don't fancy getting exception from a WriteOnlyMap.eval just because someone has registered a listener. > For that matter, ClusteredCacheLoaderInterceptor also doesn't load the > previous value on backup owners for most write commands > (LoadType.PRIMARY), we'd need to change that as well. Yes, all commands will have to load current value on all owners.

Btw., this has definitely some impact on shared cachestores. It's not just that you have to load the value from cachestore - you also need to store the previous value along with the current value (in metadata), because if the primary owner crashes and the backups don't have it in memory, these need to check the persistence to see if the command was already executed. This means that the metadata could grow up to several times the size of the value itself (if there's high concurrency). And even worse with async caches, as these don't know when the command was actually finished and therefore rely on time-based expiration of the previous values :-/ Non-shared cachestores don't need to store previous values as long if we can guarantee (flag) the entry as not eligible for eviction.

...

>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>> >>>> >>>> >>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>> >>>> Hi, >>>> >>>> I am working on entry version history (again). In Como we've discussed >>>> that previous values are needed for (continuous) query and reliable >>>> listeners, >>>> >>>> >>>> Index based queries also require the previous value on a write - unless we >>>> can get "strongly typed caches" giving guarantees about the class to >>>> represent the content of a cache to be unique. >>>> >>>> Essentially we only need to know the type of the previous object. It might >>>> be worth having a way to load the type metadata if the previous value only. >>>> >>>> so I wonder what should we do with functional write-only >>>> commands. These are different to commands with flags, because flags >>>> (other than ignore return value) are expected to break something. >>>> >>>> >>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>> search the mailing list. >>>> >>>> Since flags are exposed to the user I would rather they're not allowed to >>>> break things. >>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>> configuration/integrations veto them. >>>> >>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>> intentionally designed to explore pushing the limits on performance w/o end >>>> users having to solve puzzles, such as learning details about these flags >>>> and their possible side effects. >>>> >>>> So assuming they become either "safe" or internal, maybe you can take >>>> advantage of them? >>>> >>>> I see >>>> the available options as: >>>> >>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>> (warn user that he will break it) >>>> >>>> 2) run write-only without any optimization, rendering them useless >>>> >>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>> stuff that could get broken) >>>> >>>> >>>> Might be useful for making a POC work, but I believe query will be very >>>> likely to be often enabled. >>>> Having an either / or switch for different features in Infinispan will make >>>> it harder to use and understand, so I'd rather see work on the right design >>>> as taking temporary shortcuts risks baking into stone features which we >>>> later struggle to fix or maintain. >>>> >>> I vote for this option. >>> >>> Query, listeners, and other components that need the previous value >>> should not just assume that the application knows better, they should >>> be able to change how operations works based on their needs. Of >>> course, the reverse is also true: if the application uses write-only >>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>> be possible for the user to detect why the previous values are still >>> loaded. >> If it were just query (static configuration), I would be okay with this >> idea. But as per listeners - besides tainting the design (event source >> should not check if there's a listener) you'd need to check *before* > The source wouldn't check for listeners explicitly, the notifier would > have an isPreviousValueNeeded() method and precompute that before a > listener is added or after a listener is removed. I was am assuming > some listeners will not need the previous value, e.g. the listeners > installed by streams. You can cover your warts with a make-up but you'll still have warts :) >> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >> EWI). So this is a space for race conditions or weird handling (if >> there's a listener when I am about to call notify and my flags are not >> cleared, skip the notification and pretend that this code was invoked >> before the listener was registered...). Or do you have another solution >> in mind (config option to disable listeners && all features using those?). >> > I was definitely going for the weird handling... > > My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when > it's loaded, and check that before invoking a listener that needs the > previous value. It is missing one edge case: if one thread starts a > write operation, then another thread installs a listener that requires > the previous value and iterates over the cache, the second thread may > not see the value written by the first thread. If the operations overlap, you could pretend that the write has finished before the listener was invoked and simply not notify the listener. If I am missing it please write it down in code. But handling this in any way is still clumsy. > So now I'm thinking we should retry the write commands when > isPreviousValueNeeded() changes... Not very appealing, but I think the > performance difference is worth it. > >> R. >> >>>> 4) remove write-only commands completely (and probably functional >>>> listeners as well because these will lose their purpose) >>>> >>>> >>>> +1 to remove "unconditional writes", at least an entry version check should >>>> be applied. >>>> I believe we had already pointed out this would eventually happen, pretty >>>> much for the reasons you're hitting now. >>>> >>> IMO version checks should be done internally, we shouldn't force the >>> users of the functional API to deal with versions themselves because >>> we know how hard making write skew checks work is for us :) >>> >>> And I wouldn't go as far as to remove the functional listeners, >>> instead I would change them so that read-write listeners are invoked >>> on write-only operations and they force the loading of the previous >>> value. I would also add a way for the regular listeners to say whether >>> they need the previous value or not. >>> >>>> Right now I am inclined towards 4). There could be some internal use >>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>> though, but it's asking for trouble. >>>> >>>> >>>> I agree! >>>> >>>> Thanks >>>> >>>> >>>> WDYT? >>>> >>>> Radim >>>> >>> Cheers >>> Dan > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Dan Berindei

Thursday, 29 June Thu, 29 Jun

3:23 a.m.

On Wed, Jun 28, 2017 at 5:25 PM, Radim Vansa <rvansa(a)redhat.com> wrote:

...

On 06/28/2017 01:17 PM, Radim Vansa wrote: > On 06/28/2017 10:40 AM, Dan Berindei wrote: >> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>> as many times as needed. >>>>> >>>>> Continuous queries require the previous value itself, not just knowledge of >>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>> >>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>> depressed) :) >>>>> >>>>> I'd remove these commands completely or possibly remove them just from >>>>> public API and keep them internal. >>>>> >>>> +1 to remove the flags from the public API. Most of them are not safe >>>> for applications to use, and ignoring them when they can lead to >>>> inconsistencies would make them useless. >>>> >>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>> know when it is safe to skip the delete statement, and it relies on >>>> the application making a (possibly wrong) choice. >>>> >>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>> that applications use it right now. If query or listeners need the >>>> previous value, then we should load it internally, but hide it from >>>> the user. >>>> >>>> But removing it opens another discussion: should we replace it in the >>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>> should we make it the default and add a method >>>> AdvancedCache.forceReturnPreviousValues()? >>> Please don't derail the thread. >>> >> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >> breaks the previous value for listeners, even if the QueryInterceptor >> removes it from write commands. And it is public (+recommended) API, >> in fact most if not all of our performance tests use it. > That's just a flawed implementation. IPV is documented to be a 'safe' > flag that should affect mostly primary -> origin replication, all the > other is implementation. And we can fix that. Users should *not* expect > that it e.g. skips loading from a cache store. We have already removed > the modes that would be broken-by-design. > > On the other hand, write-only commands are not about *returning* the > value but about (not) *reading* it, therefore (in my eyes) user could > make that assumption and would like to enforce it this way. Even some > docs explaining PersistenceMode.SKIP suggest that. > > I don't want to talk about flags, because I see all flags but IPV as > 'effectively internal'. Let's discuss it more high-level. Some API > exposes non-reading operation - we can see that under some circumstances > this is not possible so we have options to 1) break stuff 2) break API > assumptions 3) sometimes break API assumptions 4) remove such API (to > not allow the user to make such assumptions). There's also an option 5) > to fail the operation if the API assumption would be broken. Though, I > don't fancy getting exception from a WriteOnlyMap.eval just because > someone has registered a listener. > >> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >> previous value on backup owners for most write commands >> (LoadType.PRIMARY), we'd need to change that as well. > Yes, all commands will have to load current value on all owners. Btw., this has definitely some impact on shared cachestores. It's not just that you have to load the value from cachestore - you also need to store the previous value along with the current value (in metadata), because if the primary owner crashes and the backups don't have it in memory, these need to check the persistence to see if the command was already executed.

I don't think this would work with the triangle algorithm. The primary owner updates the shared store without waiting for acks from the backups, so a backup trying to load the previous value is very likely to find the updated value instead. I'll hazard a guess that none of the operations using LoadType.OWNER currently works with triangle + shared store.

...

This means that the metadata could grow up to several times the size of the value itself (if there's high concurrency). And even worse with async caches, as these don't know when the command was actually finished and therefore rely on time-based expiration of the previous values :-/ Non-shared cachestores don't need to store previous values as long if we can guarantee (flag) the entry as not eligible for eviction. > >>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>> >>>>> >>>>> >>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am working on entry version history (again). In Como we've discussed >>>>> that previous values are needed for (continuous) query and reliable >>>>> listeners, >>>>> >>>>> >>>>> Index based queries also require the previous value on a write - unless we >>>>> can get "strongly typed caches" giving guarantees about the class to >>>>> represent the content of a cache to be unique. >>>>> >>>>> Essentially we only need to know the type of the previous object. It might >>>>> be worth having a way to load the type metadata if the previous value only. >>>>> >>>>> so I wonder what should we do with functional write-only >>>>> commands. These are different to commands with flags, because flags >>>>> (other than ignore return value) are expected to break something. >>>>> >>>>> >>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>> search the mailing list. >>>>> >>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>> break things. >>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>> configuration/integrations veto them. >>>>> >>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>> users having to solve puzzles, such as learning details about these flags >>>>> and their possible side effects. >>>>> >>>>> So assuming they become either "safe" or internal, maybe you can take >>>>> advantage of them? >>>>> >>>>> I see >>>>> the available options as: >>>>> >>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>> (warn user that he will break it) >>>>> >>>>> 2) run write-only without any optimization, rendering them useless >>>>> >>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>> stuff that could get broken) >>>>> >>>>> >>>>> Might be useful for making a POC work, but I believe query will be very >>>>> likely to be often enabled. >>>>> Having an either / or switch for different features in Infinispan will make >>>>> it harder to use and understand, so I'd rather see work on the right design >>>>> as taking temporary shortcuts risks baking into stone features which we >>>>> later struggle to fix or maintain. >>>>> >>>> I vote for this option. >>>> >>>> Query, listeners, and other components that need the previous value >>>> should not just assume that the application knows better, they should >>>> be able to change how operations works based on their needs. Of >>>> course, the reverse is also true: if the application uses write-only >>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>> be possible for the user to detect why the previous values are still >>>> loaded. >>> If it were just query (static configuration), I would be okay with this >>> idea. But as per listeners - besides tainting the design (event source >>> should not check if there's a listener) you'd need to check *before* >> The source wouldn't check for listeners explicitly, the notifier would >> have an isPreviousValueNeeded() method and precompute that before a >> listener is added or after a listener is removed. I was am assuming >> some listeners will not need the previous value, e.g. the listeners >> installed by streams. > You can cover your warts with a make-up but you'll still have warts :) >>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>> EWI). So this is a space for race conditions or weird handling (if >>> there's a listener when I am about to call notify and my flags are not >>> cleared, skip the notification and pretend that this code was invoked >>> before the listener was registered...). Or do you have another solution >>> in mind (config option to disable listeners && all features using those?). >>> >> I was definitely going for the weird handling... >> >> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >> it's loaded, and check that before invoking a listener that needs the >> previous value. It is missing one edge case: if one thread starts a >> write operation, then another thread installs a listener that requires >> the previous value and iterates over the cache, the second thread may >> not see the value written by the first thread. > If the operations overlap, you could pretend that the write has finished > before the listener was invoked and simply not notify the listener. If I > am missing it please write it down in code. But handling this in any way > is still clumsy. >> So now I'm thinking we should retry the write commands when >> isPreviousValueNeeded() changes... Not very appealing, but I think the >> performance difference is worth it. >> >>> R. >>> >>>>> 4) remove write-only commands completely (and probably functional >>>>> listeners as well because these will lose their purpose) >>>>> >>>>> >>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>> be applied. >>>>> I believe we had already pointed out this would eventually happen, pretty >>>>> much for the reasons you're hitting now. >>>>> >>>> IMO version checks should be done internally, we shouldn't force the >>>> users of the functional API to deal with versions themselves because >>>> we know how hard making write skew checks work is for us :) >>>> >>>> And I wouldn't go as far as to remove the functional listeners, >>>> instead I would change them so that read-write listeners are invoked >>>> on write-only operations and they force the loading of the previous >>>> value. I would also add a way for the regular listeners to say whether >>>> they need the previous value or not. >>>> >>>>> Right now I am inclined towards 4). There could be some internal use >>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>> though, but it's asking for trouble. >>>>> >>>>> >>>>> I agree! >>>>> >>>>> Thanks >>>>> >>>>> >>>>> WDYT? >>>>> >>>>> Radim >>>>> >>>> Cheers >>>> Dan >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Radim Vansa

4 a.m.

On 06/29/2017 10:23 AM, Dan Berindei wrote:

...

On Wed, Jun 28, 2017 at 5:25 PM, Radim Vansa <rvansa(a)redhat.com> wrote: > On 06/28/2017 01:17 PM, Radim Vansa wrote: >> On 06/28/2017 10:40 AM, Dan Berindei wrote: >>> On Wed, Jun 28, 2017 at 10:17 AM, Radim Vansa <rvansa(a)redhat.com> wrote: >>>> On 06/27/2017 03:54 PM, Dan Berindei wrote: >>>>> On Tue, Jun 27, 2017 at 2:43 PM, Adrian Nistor <anistor(a)redhat.com> wrote: >>>>>> I've said this in a previous thread on this same issue, I will repeat myself >>>>>> as many times as needed. >>>>>> >>>>>> Continuous queries require the previous value itself, not just knowledge of >>>>>> the type of the previous value. Strongly typed caches solve no problem here. >>>>>> >>>>>> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >>>>>> depressed) :) >>>>>> >>>>>> I'd remove these commands completely or possibly remove them just from >>>>>> public API and keep them internal. >>>>>> >>>>> +1 to remove the flags from the public API. Most of them are not safe >>>>> for applications to use, and ignoring them when they can lead to >>>>> inconsistencies would make them useless. >>>>> >>>>> E.g. the whole point of SKIP_INDEX_CLEANUP is that the cache doesn't >>>>> know when it is safe to skip the delete statement, and it relies on >>>>> the application making a (possibly wrong) choice. >>>>> >>>>> IGNORE_RETURN_VALUES should be safe to use, and we actually recommend >>>>> that applications use it right now. If query or listeners need the >>>>> previous value, then we should load it internally, but hide it from >>>>> the user. >>>>> >>>>> But removing it opens another discussion: should we replace it in the >>>>> public API with a new method AdvancedCache.ignoreReturnValues(), or >>>>> should we make it the default and add a method >>>>> AdvancedCache.forceReturnPreviousValues()? >>>> Please don't derail the thread. >>>> >>> I don't think I'm derailing the thread: IGNORE_PREVIOUS_VALUES also >>> breaks the previous value for listeners, even if the QueryInterceptor >>> removes it from write commands. And it is public (+recommended) API, >>> in fact most if not all of our performance tests use it. >> That's just a flawed implementation. IPV is documented to be a 'safe' >> flag that should affect mostly primary -> origin replication, all the >> other is implementation. And we can fix that. Users should *not* expect >> that it e.g. skips loading from a cache store. We have already removed >> the modes that would be broken-by-design. >> >> On the other hand, write-only commands are not about *returning* the >> value but about (not) *reading* it, therefore (in my eyes) user could >> make that assumption and would like to enforce it this way. Even some >> docs explaining PersistenceMode.SKIP suggest that. >> >> I don't want to talk about flags, because I see all flags but IPV as >> 'effectively internal'. Let's discuss it more high-level. Some API >> exposes non-reading operation - we can see that under some circumstances >> this is not possible so we have options to 1) break stuff 2) break API >> assumptions 3) sometimes break API assumptions 4) remove such API (to >> not allow the user to make such assumptions). There's also an option 5) >> to fail the operation if the API assumption would be broken. Though, I >> don't fancy getting exception from a WriteOnlyMap.eval just because >> someone has registered a listener. >> >>> For that matter, ClusteredCacheLoaderInterceptor also doesn't load the >>> previous value on backup owners for most write commands >>> (LoadType.PRIMARY), we'd need to change that as well. >> Yes, all commands will have to load current value on all owners. > Btw., this has definitely some impact on shared cachestores. It's not > just that you have to load the value from cachestore - you also need to > store the previous value along with the current value (in metadata), > because if the primary owner crashes and the backups don't have it in > memory, these need to check the persistence to see if the command was > already executed. > I don't think this would work with the triangle algorithm. The primary owner updates the shared store without waiting for acks from the backups, so a backup trying to load the previous value is very likely to find the updated value instead.

With entry version history you'll be able to find the previous value even if it was updated, as long as the operation did not complete. That's why it's called a 'history'.

...

I'll hazard a guess that none of the operations using LoadType.OWNER currently works with triangle + shared store.

IIRC triangle does not implement functional commands yet, and these don't work during retries anyway. ApplyDelta and put with delta-aware flag are about to go (and are probably broken in other interesting ways) and then there's the invalidation command that's set to OWNER to have more reliable listeners.

...

> This means that the metadata could grow up to several times the size of > the value itself (if there's high concurrency). And even worse with > async caches, as these don't know when the command was actually finished > and therefore rely on time-based expiration of the previous values :-/ > > Non-shared cachestores don't need to store previous values as long if we > can guarantee (flag) the entry as not eligible for eviction. > >>>>>> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I am working on entry version history (again). In Como we've discussed >>>>>> that previous values are needed for (continuous) query and reliable >>>>>> listeners, >>>>>> >>>>>> >>>>>> Index based queries also require the previous value on a write - unless we >>>>>> can get "strongly typed caches" giving guarantees about the class to >>>>>> represent the content of a cache to be unique. >>>>>> >>>>>> Essentially we only need to know the type of the previous object. It might >>>>>> be worth having a way to load the type metadata if the previous value only. >>>>>> >>>>>> so I wonder what should we do with functional write-only >>>>>> commands. These are different to commands with flags, because flags >>>>>> (other than ignore return value) are expected to break something. >>>>>> >>>>>> >>>>>> Sorry I hope to not derail the thread but let's remind that we hope to >>>>>> evolve beyond "flags are expected to break stuff" ; we never got to it but >>>>>> search the mailing list. >>>>>> >>>>>> Since flags are exposed to the user I would rather they're not allowed to >>>>>> break things. >>>>>> Could they be treated as hints? Ignore the flag (and warn?) if the used >>>>>> configuration/integrations veto them. >>>>>> >>>>>> Alternatively, let's remove them from API. Remember "The Jokre" POC was >>>>>> intentionally designed to explore pushing the limits on performance w/o end >>>>>> users having to solve puzzles, such as learning details about these flags >>>>>> and their possible side effects. >>>>>> >>>>>> So assuming they become either "safe" or internal, maybe you can take >>>>>> advantage of them? >>>>>> >>>>>> I see >>>>>> the available options as: >>>>>> >>>>>> 1) run write-only commands 'optimized', ignoring any querying and such >>>>>> (warn user that he will break it) >>>>>> >>>>>> 2) run write-only without any optimization, rendering them useless >>>>>> >>>>>> 3) detect when querying is set up (ignoring listeners and maybe other >>>>>> stuff that could get broken) >>>>>> >>>>>> >>>>>> Might be useful for making a POC work, but I believe query will be very >>>>>> likely to be often enabled. >>>>>> Having an either / or switch for different features in Infinispan will make >>>>>> it harder to use and understand, so I'd rather see work on the right design >>>>>> as taking temporary shortcuts risks baking into stone features which we >>>>>> later struggle to fix or maintain. >>>>>> >>>>> I vote for this option. >>>>> >>>>> Query, listeners, and other components that need the previous value >>>>> should not just assume that the application knows better, they should >>>>> be able to change how operations works based on their needs. Of >>>>> course, the reverse is also true: if the application uses write-only >>>>> commands (or IGNORE_RETURN_VALUES) for performance reasons, it should >>>>> be possible for the user to detect why the previous values are still >>>>> loaded. >>>> If it were just query (static configuration), I would be okay with this >>>> idea. But as per listeners - besides tainting the design (event source >>>> should not check if there's a listener) you'd need to check *before* >>> The source wouldn't check for listeners explicitly, the notifier would >>> have an isPreviousValueNeeded() method and precompute that before a >>> listener is added or after a listener is removed. I was am assuming >>> some listeners will not need the previous value, e.g. the listeners >>> installed by streams. >> You can cover your warts with a make-up but you'll still have warts :) >>>> (DistributionI, CacheLoaderI) you have to call notify (cmd.perform, >>>> EWI). So this is a space for race conditions or weird handling (if >>>> there's a listener when I am about to call notify and my flags are not >>>> cleared, skip the notification and pretend that this code was invoked >>>> before the listener was registered...). Or do you have another solution >>>> in mind (config option to disable listeners && all features using those?). >>>> >>> I was definitely going for the weird handling... >>> >>> My plan was to set a HAS_PREVIOUS_VALUE flag on the context entry when >>> it's loaded, and check that before invoking a listener that needs the >>> previous value. It is missing one edge case: if one thread starts a >>> write operation, then another thread installs a listener that requires >>> the previous value and iterates over the cache, the second thread may >>> not see the value written by the first thread. >> If the operations overlap, you could pretend that the write has finished >> before the listener was invoked and simply not notify the listener. If I >> am missing it please write it down in code. But handling this in any way >> is still clumsy. >>> So now I'm thinking we should retry the write commands when >>> isPreviousValueNeeded() changes... Not very appealing, but I think the >>> performance difference is worth it. >>> >>>> R. >>>> >>>>>> 4) remove write-only commands completely (and probably functional >>>>>> listeners as well because these will lose their purpose) >>>>>> >>>>>> >>>>>> +1 to remove "unconditional writes", at least an entry version check should >>>>>> be applied. >>>>>> I believe we had already pointed out this would eventually happen, pretty >>>>>> much for the reasons you're hitting now. >>>>>> >>>>> IMO version checks should be done internally, we shouldn't force the >>>>> users of the functional API to deal with versions themselves because >>>>> we know how hard making write skew checks work is for us :) >>>>> >>>>> And I wouldn't go as far as to remove the functional listeners, >>>>> instead I would change them so that read-write listeners are invoked >>>>> on write-only operations and they force the loading of the previous >>>>> value. I would also add a way for the regular listeners to say whether >>>>> they need the previous value or not. >>>>> >>>>>> Right now I am inclined towards 4). There could be some internal use >>>>>> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >>>>>> though, but it's asking for trouble. >>>>>> >>>>>> >>>>>> I agree! >>>>>> >>>>>> Thanks >>>>>> >>>>>> >>>>>> WDYT? >>>>>> >>>>>> Radim >>>>>> >>>>> Cheers >>>>> Dan >>> _______________________________________________ >>> infinispan-dev mailing list >>> infinispan-dev(a)lists.jboss.org >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- > Radim Vansa <rvansa(a)redhat.com> > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

Sanne Grinovero

Tuesday, 27 June Tue, 27 Jun

4 p.m.

On 27 June 2017 at 13:43, Adrian Nistor <anistor(a)redhat.com> wrote:

...

Why do you need to repeat this, Radim already captured this requirement for CQ in the premise of this very thread?

...

So if we half-fix query but leave CQ broken I will be half-happy (ie. very depressed) :)

There must be a misunderstanding. I merely highlighted the needs for indexed queries, and point out that it might be useful to have *yet another* form of previous-version loading as this specific circumstance would just need some metadata. I'd never suggest to replace or remove the capability to load the "full" previous version, definitely not meaning to suggest to break CQ and many other essential use cases.

...

I'd remove these commands completely or possibly remove them just from public API and keep them internal. Adrian On 06/27/2017 01:28 PM, Sanne Grinovero wrote: On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: Hi, I am working on entry version history (again). In Como we've discussed that previous values are needed for (continuous) query and reliable listeners, Index based queries also require the previous value on a write - unless we can get "strongly typed caches" giving guarantees about the class to represent the content of a cache to be unique. Essentially we only need to know the type of the previous object. It might be worth having a way to load the type metadata if the previous value only. so I wonder what should we do with functional write-only commands. These are different to commands with flags, because flags (other than ignore return value) are expected to break something. Sorry I hope to not derail the thread but let's remind that we hope to evolve beyond "flags are expected to break stuff" ; we never got to it but search the mailing list. Since flags are exposed to the user I would rather they're not allowed to break things. Could they be treated as hints? Ignore the flag (and warn?) if the used configuration/integrations veto them. Alternatively, let's remove them from API. Remember "The Jokre" POC was intentionally designed to explore pushing the limits on performance w/o end users having to solve puzzles, such as learning details about these flags and their possible side effects. So assuming they become either "safe" or internal, maybe you can take advantage of them? I see the available options as: 1) run write-only commands 'optimized', ignoring any querying and such (warn user that he will break it) 2) run write-only without any optimization, rendering them useless 3) detect when querying is set up (ignoring listeners and maybe other stuff that could get broken) Might be useful for making a POC work, but I believe query will be very likely to be often enabled. Having an either / or switch for different features in Infinispan will make it harder to use and understand, so I'd rather see work on the right design as taking temporary shortcuts risks baking into stone features which we later struggle to fix or maintain. 4) remove write-only commands completely (and probably functional listeners as well because these will lose their purpose) +1 to remove "unconditional writes", at least an entry version check should be applied. I believe we had already pointed out this would eventually happen, pretty much for the reasons you're hitting now. Right now I am inclined towards 4). There could be some internal use (e.g. multimaps) that could use 1) which is ran without a fancy setup, though, but it's asking for trouble. I agree! Thanks WDYT? Radim -- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

Adrian Nistor

4:53 p.m.

...

On 27 June 2017 at 13:43, Adrian Nistor <anistor(a)redhat.com> wrote: > I've said this in a previous thread on this same issue, I will repeat myself > as many times as needed. > > Continuous queries require the previous value itself, not just knowledge of > the type of the previous value. Strongly typed caches solve no problem here. Why do you need to repeat this, Radim already captured this requirement for CQ in the premise of this very thread? > So if we half-fix query but leave CQ broken I will be half-happy (ie. very > depressed) :) There must be a misunderstanding. I merely highlighted the needs for indexed queries, and point out that it might be useful to have *yet another* form of previous-version loading as this specific circumstance would just need some metadata. I'd never suggest to replace or remove the capability to load the "full" previous version, definitely not meaning to suggest to break CQ and many other essential use cases. > I'd remove these commands completely or possibly remove them just from > public API and keep them internal. > > Adrian > > > > On 06/27/2017 01:28 PM, Sanne Grinovero wrote: > > > > On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: > > Hi, > > I am working on entry version history (again). In Como we've discussed > that previous values are needed for (continuous) query and reliable > listeners, > > > Index based queries also require the previous value on a write - unless we > can get "strongly typed caches" giving guarantees about the class to > represent the content of a cache to be unique. > > Essentially we only need to know the type of the previous object. It might > be worth having a way to load the type metadata if the previous value only. > > so I wonder what should we do with functional write-only > commands. These are different to commands with flags, because flags > (other than ignore return value) are expected to break something. > > > Sorry I hope to not derail the thread but let's remind that we hope to > evolve beyond "flags are expected to break stuff" ; we never got to it but > search the mailing list. > > Since flags are exposed to the user I would rather they're not allowed to > break things. > Could they be treated as hints? Ignore the flag (and warn?) if the used > configuration/integrations veto them. > > Alternatively, let's remove them from API. Remember "The Jokre" POC was > intentionally designed to explore pushing the limits on performance w/o end > users having to solve puzzles, such as learning details about these flags > and their possible side effects. > > So assuming they become either "safe" or internal, maybe you can take > advantage of them? > > I see > the available options as: > > 1) run write-only commands 'optimized', ignoring any querying and such > (warn user that he will break it) > > 2) run write-only without any optimization, rendering them useless > > 3) detect when querying is set up (ignoring listeners and maybe other > stuff that could get broken) > > > Might be useful for making a POC work, but I believe query will be very > likely to be often enabled. > Having an either / or switch for different features in Infinispan will make > it harder to use and understand, so I'd rather see work on the right design > as taking temporary shortcuts risks baking into stone features which we > later struggle to fix or maintain. > > > 4) remove write-only commands completely (and probably functional > listeners as well because these will lose their purpose) > > > +1 to remove "unconditional writes", at least an entry version check should > be applied. > I believe we had already pointed out this would eventually happen, pretty > much for the reasons you're hitting now. > > > Right now I am inclined towards 4). There could be some internal use > (e.g. multimaps) that could use 1) which is ran without a fancy setup, > though, but it's asking for trouble. > > > I agree! > > Thanks > > > WDYT? > > Radim > > -- > > Radim Vansa <rvansa(a)redhat.com> > JBoss Performance Team > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev(a)lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > >

Radim Vansa

Wednesday, 28 June Wed, 28 Jun

2:03 a.m.

I am with Adrian here, we want the core as simple as possible. The more ifs, the more bugs - it's hard to track all the edge cases. And besides declarative approach (saying that this cache can contain only entities X, Y and Z) and only checking that, I think that there's little space for 'loading just the metadata', because there are (currently) only two sources if the entry is not present: remote nodes and cache store. The actual read operation is cheapish, the cost is in going to the other node/DB machine. The same for maintenance perspective: you need to check for errors & have the correct, the type of value (just metadata/whole prev value) does not change it. R. On 06/27/2017 11:53 PM, Adrian Nistor wrote:

...

The needs for continuous query are stronger than and include the needs of normal query. So by fixing what is needed for continuous query we can all live a happy life without needing to provide "*yet another* form of previous-version loading". The reverse is not true. You'll have to convince me this extra mechanism's added complexity is worth the effort :) On 06/28/2017 12:00 AM, Sanne Grinovero wrote: > On 27 June 2017 at 13:43, Adrian Nistor <anistor(a)redhat.com> wrote: >> I've said this in a previous thread on this same issue, I will repeat myself >> as many times as needed. >> >> Continuous queries require the previous value itself, not just knowledge of >> the type of the previous value. Strongly typed caches solve no problem here. > Why do you need to repeat this, Radim already captured this > requirement for CQ in the premise of this very thread? > >> So if we half-fix query but leave CQ broken I will be half-happy (ie. very >> depressed) :) > There must be a misunderstanding. I merely highlighted the needs for > indexed queries, and point out that it might be useful to have *yet > another* form of previous-version loading as this specific > circumstance would just need some metadata. I'd never suggest to > replace or remove the capability to load the "full" previous version, > definitely not meaning to suggest to break CQ and many other essential > use cases. > >> I'd remove these commands completely or possibly remove them just from >> public API and keep them internal. >> >> Adrian >> >> >> >> On 06/27/2017 01:28 PM, Sanne Grinovero wrote: >> >> >> >> On 27 Jun 2017 10:13, "Radim Vansa" <rvansa(a)redhat.com> wrote: >> >> Hi, >> >> I am working on entry version history (again). In Como we've discussed >> that previous values are needed for (continuous) query and reliable >> listeners, >> >> >> Index based queries also require the previous value on a write - unless we >> can get "strongly typed caches" giving guarantees about the class to >> represent the content of a cache to be unique. >> >> Essentially we only need to know the type of the previous object. It might >> be worth having a way to load the type metadata if the previous value only. >> >> so I wonder what should we do with functional write-only >> commands. These are different to commands with flags, because flags >> (other than ignore return value) are expected to break something. >> >> >> Sorry I hope to not derail the thread but let's remind that we hope to >> evolve beyond "flags are expected to break stuff" ; we never got to it but >> search the mailing list. >> >> Since flags are exposed to the user I would rather they're not allowed to >> break things. >> Could they be treated as hints? Ignore the flag (and warn?) if the used >> configuration/integrations veto them. >> >> Alternatively, let's remove them from API. Remember "The Jokre" POC was >> intentionally designed to explore pushing the limits on performance w/o end >> users having to solve puzzles, such as learning details about these flags >> and their possible side effects. >> >> So assuming they become either "safe" or internal, maybe you can take >> advantage of them? >> >> I see >> the available options as: >> >> 1) run write-only commands 'optimized', ignoring any querying and such >> (warn user that he will break it) >> >> 2) run write-only without any optimization, rendering them useless >> >> 3) detect when querying is set up (ignoring listeners and maybe other >> stuff that could get broken) >> >> >> Might be useful for making a POC work, but I believe query will be very >> likely to be often enabled. >> Having an either / or switch for different features in Infinispan will make >> it harder to use and understand, so I'd rather see work on the right design >> as taking temporary shortcuts risks baking into stone features which we >> later struggle to fix or maintain. >> >> >> 4) remove write-only commands completely (and probably functional >> listeners as well because these will lose their purpose) >> >> >> +1 to remove "unconditional writes", at least an entry version check should >> be applied. >> I believe we had already pointed out this would eventually happen, pretty >> much for the reasons you're hitting now. >> >> >> Right now I am inclined towards 4). There could be some internal use >> (e.g. multimaps) that could use 1) which is ran without a fancy setup, >> though, but it's asking for trouble. >> >> >> I agree! >> >> Thanks >> >> >> WDYT? >> >> Radim >> >> -- >> >> Radim Vansa <rvansa(a)redhat.com> >> JBoss Performance Team >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev(a)lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ infinispan-dev mailing list infinispan-dev(a)lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- Radim Vansa <rvansa(a)redhat.com> JBoss Performance Team

2726

days inactive

2734

days old

infinispan-dev@lists.jboss.org

Manage subscription

20 comments

4 participants

tags (0)

participants (4)

Adrian Nistor
Dan Berindei
Radim Vansa
Sanne Grinovero

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Write-only commands