[infinispan-dev] Stream operations under lock

William Burns mudokonman at gmail.com
Thu Mar 23 09:41:09 EDT 2017


On Wed, Mar 22, 2017 at 5:51 AM Radim Vansa <rvansa at redhat.com> wrote:

> On 03/21/2017 06:50 PM, William Burns wrote:
> >
> >
> > On Tue, Mar 21, 2017 at 1:42 PM William Burns <mudokonman at gmail.com
> > <mailto:mudokonman at gmail.com>> wrote:
> >
> >     On Tue, Mar 21, 2017 at 12:53 PM Radim Vansa <rvansa at redhat.com
> >     <mailto:rvansa at redhat.com>> wrote:
> >
> >         On 03/21/2017 04:37 PM, William Burns wrote:
> >         > Some users have expressed the need to have some sort of forEach
> >         > operation that is performed where the Consumer is called
> >         while holding
> >         > the lock for the given key and subsequently released after the
> >         > Consumer operation completes.
> >
> >         Seconding Dan's question - is that intended to be able to
> >         modify the
> >         entry? In my opinion, sending a function that will work on the
> >         ReadWriteEntryView directly to the node is the only reasonable
> >         request.
> >         I wouldn't like to see blocking operations in there.
> >
> >
> >     Hrmm the user can use the FunctionalMap interface for this then it
> >     seems? I wonder if this should just be the going in API. I will
> >     need to discuss with Galder the semantics of the evalAll/evalMany
> >     methods.
> >
> >
> > Actually looking at evalAll it seems it doesn't scale as it keeps all
> > entries in memory at once, so this is only for caches with a limited
> > amount of entries.
>
> Don't look into the implementation; I think Galder has focused more on
> the API side than having optimal implementation. IMO there's no reason
> evalAll should load all the entries into memory in non-transactional mode.
>
>
I agree that it shouldn't, but there is no guarantee this will be ready any
time soon.
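To make the FunctionalMap idea being discussed concrete, here is a toy, single-map model of an eval(key, function) operation in plain Java. This is a sketch only — the class and method names are assumptions for illustration, not Infinispan's actual FunctionalMap API. The point it shows: the user's function receives a read-write view of one entry and is applied atomically per key, with no blocking work outside the entry itself.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Toy stand-in for a functional map (NOT Infinispan's real API):
// eval() runs a function against a read-write view of a single entry,
// atomically per key, by piggybacking on ConcurrentHashMap.compute().
class ToyFunctionalMap<K, V> {
    private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<>();

    // Minimal read-write view of one entry, modeled on the
    // "ReadWriteEntryView" idea mentioned in the thread.
    interface ReadWriteView<T> {
        T get();
        void set(T value);
    }

    @SuppressWarnings("unchecked")
    <R> R eval(K key, Function<ReadWriteView<V>, R> f) {
        Object[] result = new Object[1];
        data.compute(key, (k, old) -> {
            ReadWriteView<V> view = new ReadWriteView<V>() {
                private V current = old;
                @Override public V get() { return current; }
                @Override public void set(V value) { current = value; }
            };
            result[0] = f.apply(view);
            // Whatever the function left in the view becomes the new value
            // (null removes the entry, matching compute() semantics).
            return view.get();
        });
        return (R) result[0];
    }
}
```

A per-key eval like this never needs the whole key set in memory at once, which is the scalability property evalAll currently lacks.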


> >
> >         >
> >         > Due to the nature of how streams work with retries and
> >         performing the
> >         > operation on the primary owner, this works out quite well
> >         with forEach
> >         > to be done in an efficient way.
> >         >
> >         > The problem is that this only really works well with non tx and
> >         > pessimistic tx. This obviously leaves out optimistic tx,
> >         which at
> >         > first I was a little worried about. But after thinking about
> >         it more,
> >         > this prelocking and optimistic tx don't really fit that well
> >         together
> >         > anyways. So I am thinking whenever this operation is
> >         performed it
> >         > would throw an exception not letting the user use this
> >         feature in
> >         > optimistic transactions.
> >
> >         How exactly reading streams interacts with transactions? Does
> >         it wrap
> >         read entries into context? This would be a scalability issue.
> >
> >
> >     It doesn't wrap read entries into the context for that exact
> >     reason. It does however use existing entries in the context to
> >     override ones in memory/store.
> >
>
> Uuh, so you end up with a copy of the cache in single invocation
> context, without any means to flush it. I think that we need add
>

Maybe I worded it poorly. Streams don't wrap any entries at all. All it
does is read from the current context if there are any entries there, then
read from the data container (skipping entries already read from the
context), and finally read from the store if one is present.

Although the more I think about it, this may be a non-issue when using the
stream with lock; read below.
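To make that read order concrete, here is a minimal plain-Java sketch (toy names, not Infinispan internals) of resolving each key from the invocation context first, then the data container, then the store — with nothing ever wrapped back into the context:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy sketch of the read order described above: context entries win
// over the data container, which wins over the store. Real streams do
// this lazily per key; merging maps here just illustrates precedence.
class ReadOrderSketch {
    static <K, V> Map<K, V> streamView(Map<K, V> context,
                                       Map<K, V> dataContainer,
                                       Map<K, V> store) {
        Map<K, V> view = new LinkedHashMap<>();
        // Lowest precedence first, so later putAll calls override it.
        view.putAll(store);
        view.putAll(dataContainer);
        view.putAll(context);
        return view;
    }
}
```

(This ignores context entries that mark removals; it only shows the override precedence.)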


> InvocationContext.current().forget(key) API (throwing exception if the
> entry was modified) or something like that, even for the regular
> streams. Maybe an override for filter methods, too, because you want to
> pass a nice predicate, but you can't just forget all filtered out entries.
>
> >
> >         I agree that "locking" should not be exposed with optimistic
> >         transactions.
> >
> >
> >     Yeah I can't find a good way to do this really and it seems to be
> >     opposite of what optimistic transactions are.
> >
> >
> >         With pessimistic transactions, how do you expect to handle
> locking
> >         order? For regular operations, user is responsible for setting
> >         up some
> >         locking order in order to not get a deadlock. With pessimistic
> >         transaction, it's the cache itself who will order the calls.
> >         Also, if
> >         you lock anything that is read, you just end up locking
> >         everything (or,
> >         getting a deadlock). If you don't it's the same as issuing the
> >         lock and
> >         reading again (to check the locked value) - but you'd do that
> >         internally
> >         anyway. Therefore, I don't feel well about pessimistic
> >         transactions neither.
> >
> >
> >     The lock is done per key only for each invocation. There is no
> >     ordering as only one is obtained at a time before it goes to the
> >     next. If the user then acquires a lock for another key while in
> >     the Consumer this could cause a deadlock if the inverse occurs on
> >     a different thread/node, but this is on the user. It is the same
> >     as it is today really, except we do the read lock for them before
> >     invoking their Consumer.
> >
>
> In pessimistic mode, you should not release a lock before the end of the
> transaction.
>

I agree. I didn't discuss finer details, but what I have right now doesn't
work with an ongoing pessimistic transaction. And to be honest I am not
sure this can work with an ongoing transaction at all. Even if it did, it
would be horrendously slow, since each remote node performing the Consumer
would need to read from the originator's context (or have it copied over).
Unless someone with more transaction knowledge knows a better way, I
personally feel the stream with lock should run in its own dedicated
transaction.


>
> >
> >         >
> >         > Another question is what does the API for this look like. I was
> >         > debating between 3 options myself:
> >         >
> >         > 1. AdvancedCache.forEachWithLock(BiConsumer<Cache,
> >         CacheEntry<K, V>>
> >         > consumer)
> >         >
> >         > This require the least amount of changes, however the user
> can't
> >         > customize certain parameters that CacheStream currently
> provides
> >         > (listed below - big one being filterKeys).
> >         >
> >         > 2. CacheStream.forEachWithLock(BiConsumer<Cache,
> >         CacheEntry<K, V>>
> >         > consumer)
> >         >
> >         > This method would only be allowed to be invoked on the
> >         Stream if no
> >         > other intermediate operations were invoked, otherwise an
> >         exception
> >         > would be thrown. This still gives us access to all of the
> >         CacheStream
> >         > methods that aren't on the Stream interface (ie.
> >         > sequentialDistribution, parallelDistribution, parallel,
> >         sequential,
> >         > filterKeys, filterKeySegments, distributedBatchSize,
> >         > disableRehashAware, timeout).
> >
> >         For both options, I don't like Cache being passed around. You
> >         should
> >         modify the CacheEntry (or some kind of view) directly.
> >
> >
> >     I don't know for sure if that is sufficient for the user.
> >     Sometimes they may modify another Cache given the value in this
> >     one for example, which they could access from the CacheManager of
> >     that Cache. Maybe Tristan knows more about some use cases.
> >
>
> Rather than guessing what could the user need, the Consumer could be CDI
> enabled.
>

I am not the biggest CDI fan, especially since the benefit of the current
approach is that you can use lambdas (which automatically become
Serializable) and write very concise code. But others can chime in here too.
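As an aside, the "automatically Serializable" part depends on the lambda's target type: a plain java.util.function.Consumer needs an intersection cast, which is presumably what serializable functional-interface overloads spare the user from. A quick standalone illustration of round-tripping a lambda through Java serialization:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Consumer;

// Demonstrates that a lambda survives Java serialization when its
// target type is Serializable, hence the intersection cast below.
class SerializableLambdaDemo {
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T obj) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (T) ois.readObject();
        }
    }
}
```

With a serializable target type the cast disappears and the call site stays a one-line lambda, which is the conciseness argument above.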


>
> >
> >         Radim
> >
> >         >
> >         > 3. LockedStream<CacheEntry<K, V>> AdvancedCache.lockedStream()
> >         >
> >         > This requires the most changes, however the API would be the
> >         most
> >         > explicit. In this case the LockedStream would only have the
> >         methods on
> >         > it that are able to be invoked as noted above and forEach.
> >         >
> >         > I personally feel that #3 might be the cleanest, but obviously
> >         > requires adding more classes. Let me know what you guys
> >         think and if
> >         > you think the optimistic exclusion is acceptable.
> >         >
> >         > Thanks,
> >         >
> >         >  - Will
> >         >
> >         >
> >         > _______________________________________________
> >         > infinispan-dev mailing list
> >         > infinispan-dev at lists.jboss.org
> >         <mailto:infinispan-dev at lists.jboss.org>
> >         > https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >
> >
> >         --
> >         Radim Vansa <rvansa at redhat.com <mailto:rvansa at redhat.com>>
> >         JBoss Performance Team
> >
> >
> >
> >
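Pulling the thread's three API options together, option #3 (a dedicated lockedStream) could be sketched roughly like this — a toy, single-JVM model with assumed names, not a proposed implementation. In the real proposal forEach would run on each key's primary owner under a cluster-wide lock; here a per-key ReentrantLock stands in for that, and a plain Map stands in for Cache/CacheEntry:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BiConsumer;

// Toy single-JVM sketch of option #3: a restricted stream exposing only
// the allowed configuration methods plus forEach. Names are assumptions.
class LockedStreamSketch<K, V> {
    private final Map<K, V> data;
    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();
    private Set<K> filterKeys;  // null means "all keys"

    LockedStreamSketch(Map<K, V> data) { this.data = data; }

    // One of the few configuration methods the proposal would keep
    // (sequentialDistribution, distributedBatchSize, etc. omitted here).
    LockedStreamSketch<K, V> filterKeys(Set<K> keys) {
        this.filterKeys = keys;
        return this;
    }

    // Lock each key, invoke the consumer, release before the next key:
    // only one lock is held at a time, so the stream itself imposes no
    // lock ordering; any further locking inside the consumer is on the user.
    void forEach(BiConsumer<Map<K, V>, Map.Entry<K, V>> consumer) {
        for (Map.Entry<K, V> e : data.entrySet()) {
            if (filterKeys != null && !filterKeys.contains(e.getKey())) continue;
            ReentrantLock lock = locks.computeIfAbsent(e.getKey(), k -> new ReentrantLock());
            lock.lock();
            try {
                consumer.accept(data, e);
            } finally {
                lock.unlock();
            }
        }
    }
}
```

This also shows why the type is the cleanest option: intermediate operations like map or filter simply don't exist on it, so there is nothing to throw an exception about.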