[infinispan-dev] Stream operations under lock

Mon Mar 27 18:50:42 EDT 2017

--
Galder Zamarreño
Infinispan, Red Hat

> On 22 Mar 2017, at 10:51, Radim Vansa <rvansa at redhat.com> wrote:
> 
> On 03/21/2017 06:50 PM, William Burns wrote:
>> 
>> 
>> On Tue, Mar 21, 2017 at 1:42 PM William Burns <mudokonman at gmail.com 
>> <mailto:mudokonman at gmail.com>> wrote:
>> 
>>    On Tue, Mar 21, 2017 at 12:53 PM Radim Vansa <rvansa at redhat.com
>>    <mailto:rvansa at redhat.com>> wrote:
>> 
>>        On 03/21/2017 04:37 PM, William Burns wrote:
>>> Some users have expressed the need to have some sort of forEach
>>> operation that is performed where the Consumer is called
>>        while holding
>>> the lock for the given key and subsequently released after the
>>> Consumer operation completes.
>> 
>>        Seconding Dan's question - is that intended to be able to
>>        modify the
>>        entry? In my opinion, sending a function that will work on the
>>        ReadWriteEntryView directly to the node is the only reasonable
>>        request.
>>        I wouldn't like to see blocking operations in there.
>> 
>> 
>>    Hrmm the user can use the FunctionalMap interface for this then it
>>    seems? I wonder if this should just be the going in API. I will
>>    need to discuss with Galder the semantics of the evalAll/evalMany
>>    methods.
>> 
>> 
>> Actually looking at evalAll it seems it doesn't scale as it keeps all 
>> entries in memory at once, so this is only for caches with a limited 
>> amount of entries.
> 
> Don't look into the implementation; I think Galder has focused more on 
> the API side than having optimal implementation.

That's why it's marked experimental ;p

> IMO there's no reason 
> evalAll should load all the entries into memory in non-transactional mode.
> 
>> 
>>> 
>>> Due to the nature of how streams work with retries and
>>        performing the
>>> operation on the primary owner, this works out quite well
>>        with forEach
>>> to be done in an efficient way.
>>> 
>>> The problem is that this only really works well with non tx and
>>> pessimistic tx. This obviously leaves out optimistic tx,
>>        which at
>>> first I was a little worried about. But after thinking about
>>        it more,
>>> this prelocking and optimistic tx don't really fit that well
>>        together
>>> anyways. So I am thinking whenever this operation is
>>        performed it
>>> would throw an exception not letting the user use this
>>        feature in
>>> optimistic transactions.
>> 
>>        How exactly reading streams interacts with transactions? Does
>>        it wrap
>>        read entries into context? This would be a scalability issue.
>> 
>> 
>>    It doesn't wrap read entries into the context for that exact
>>    reason. It does however use existing entries in the context to
>>    override ones in memory/store.
>> 
> 
> Uuh, so you end up with a copy of the cache in single invocation 
> context, without any means to flush it. I think that we need add 
> InvocationContext.current().forget(key) API (throwing exception if the 
> entry was modified) or something like that, even for the regular 
> streams. Maybe an override for filter methods, too, because you want to 
> pass a nice predicate, but you can't just forget all filtered out entries.
> 
>> 
>>        I agree that "locking" should not be exposed with optimistic
>>        transactions.
>> 
>> 
>>    Yeah I can't find a good way to do this really and it seems to be
>>    opposite of what optimistic transactions are.
>> 
>> 
>>        With pessimistic transactions, how do you expect to handle locking
>>        order? For regular operations, user is responsible for setting
>>        up some
>>        locking order in order to not get a deadlock. With pessimistic
>>        transaction, it's the cache itself who will order the calls.
>>        Also, if
>>        you lock anything that is read, you just end up locking
>>        everything (or,
>>        getting a deadlock). If you don't it's the same as issuing the
>>        lock and
>>        reading again (to check the locked value) - but you'd do that
>>        internally
>>        anyway. Therefore, I don't feel well about pessimistic
>>        transactions neither.
>> 
>> 
>>    The lock is done per key only for each invocation. There is no
>>    ordering as only one is obtained at a time before it goes to the
>>    next. If the user then acquires a lock for another key while in
>>    the Consumer this could cause a deadlock if the inverse occurs on
>>    a different thread/node, but this is on the user. It is the same
>>    as it is today really, except we do the read lock for them before
>>    invoking their Consumer.
>> 
> 
> In pessimistic mode, you should not release a lock before the end of the 
> transaction.
> 
>> 
>>> 
>>> Another question is what does the API for this look like. I was
>>> debating between 3 options myself:
>>> 
>>> 1. AdvancedCache.forEachWithLock(BiConsumer<Cache,
>>        CacheEntry<K, V>>
>>> consumer)
>>> 
>>> This require the least amount of changes, however the user can't
>>> customize certain parameters that CacheStream currently provides
>>> (listed below - big one being filterKeys).
>>> 
>>> 2. CacheStream.forEachWithLock(BiConsumer<Cache,
>>        CacheEntry<K, V>>
>>> consumer)
>>> 
>>> This method would only be allowed to be invoked on the
>>        Stream if no
>>> other intermediate operations were invoked, otherwise an
>>        exception
>>> would be thrown. This still gives us access to all of the
>>        CacheStream
>>> methods that aren't on the Stream interface (ie.
>>> sequentialDistribution, parallelDistribution, parallel,
>>        sequential,
>>> filterKeys, filterKeySegments, distributedBatchSize,
>>> disableRehashAware, timeout).
>> 
>>        For both options, I don't like Cache being passed around. You
>>        should
>>        modify the CacheEntry (or some kind of view) directly.
>> 
>> 
>>    I don't know for sure if that is sufficient for the user.
>>    Sometimes they may modify another Cache given the value in this
>>    one for example, which they could access from the CacheManager of
>>    that Cache. Maybe Tristan knows more about some use cases.
>> 
> 
> Rather than guessing what could the user need, the Consumer could be CDI 
> enabled.
> 
>> 
>>        Radim
>> 
>>> 
>>> 3. LockedStream<CacheEntry<K, V>> AdvancedCache.lockedStream()
>>> 
>>> This requires the most changes, however the API would be the
>>        most
>>> explicit. In this case the LockedStream would only have the
>>        methods on
>>> it that are able to be invoked as noted above and forEach.
>>> 
>>> I personally feel that #3 might be the cleanest, but obviously
>>> requires adding more classes. Let me know what you guys
>>        think and if
>>> you think the optimistic exclusion is acceptable.
>>> 
>>> Thanks,
>>> 
>>> - Will
>>> 
>>> 
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>        <mailto:infinispan-dev at lists.jboss.org>
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> 
>>        --
>>        Radim Vansa <rvansa at redhat.com <mailto:rvansa at redhat.com>>
>>        JBoss Performance Team
>> 
>>        _______________________________________________
>>        infinispan-dev mailing list
>>        infinispan-dev at lists.jboss.org
>>        <mailto:infinispan-dev at lists.jboss.org>
>>        https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> 
>> 
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> 
> -- 
> Radim Vansa <rvansa at redhat.com>
> JBoss Performance Team
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev