[infinispan-dev] Stream operations under lock

Thu Mar 23 10:33:24 EDT 2017

On Wed, Mar 22, 2017 at 11:51 AM, Radim Vansa <rvansa at redhat.com> wrote:
> On 03/21/2017 06:50 PM, William Burns wrote:
>>
>>
>> On Tue, Mar 21, 2017 at 1:42 PM William Burns <mudokonman at gmail.com
>> <mailto:mudokonman at gmail.com>> wrote:
>>
>>     On Tue, Mar 21, 2017 at 12:53 PM Radim Vansa <rvansa at redhat.com
>>     <mailto:rvansa at redhat.com>> wrote:
>>
>>         On 03/21/2017 04:37 PM, William Burns wrote:
>>         > Some users have expressed the need to have some sort of forEach
>>         > operation that is performed where the Consumer is called
>>         while holding
>>         > the lock for the given key and subsequently released after the
>>         > Consumer operation completes.
>>
>>         Seconding Dan's question - is that intended to be able to
>>         modify the
>>         entry? In my opinion, sending a function that will work on the
>>         ReadWriteEntryView directly to the node is the only reasonable
>>         request.
>>         I wouldn't like to see blocking operations in there.
>>
>>
>>     Hrmm the user can use the FunctionalMap interface for this then it
>>     seems? I wonder if this should just be the going in API. I will
>>     need to discuss with Galder the semantics of the evalAll/evalMany
>>     methods.
>>
>>
>> Actually looking at evalAll it seems it doesn't scale as it keeps all
>> entries in memory at once, so this is only for caches with a limited
>> amount of entries.
>
> Don't look into the implementation; I think Galder has focused more on
> the API side than having optimal implementation. IMO there's no reason
> evalAll should load all the entries into memory in non-transactional mode.
>

I'm pretty sure we do need to load all the entries in order to provide
REPEATABLE_READ isolation.

>>
>>         >
>>         > Due to the nature of how streams work with retries and
>>         performing the
>>         > operation on the primary owner, this works out quite well
>>         with forEach
>>         > to be done in an efficient way.
>>         >
>>         > The problem is that this only really works well with non tx and
>>         > pessimistic tx. This obviously leaves out optimistic tx,
>>         which at
>>         > first I was a little worried about. But after thinking about
>>         it more,
>>         > this prelocking and optimistic tx don't really fit that well
>>         together
>>         > anyways. So I am thinking whenever this operation is
>>         performed it
>>         > would throw an exception not letting the user use this
>>         feature in
>>         > optimistic transactions.
>>
>>         How exactly reading streams interacts with transactions? Does
>>         it wrap
>>         read entries into context? This would be a scalability issue.
>>
>>
>>     It doesn't wrap read entries into the context for that exact
>>     reason. It does however use existing entries in the context to
>>     override ones in memory/store.
>>
>
> Uuh, so you end up with a copy of the cache in single invocation
> context, without any means to flush it. I think that we need add
> InvocationContext.current().forget(key) API (throwing exception if the
> entry was modified) or something like that, even for the regular
> streams. Maybe an override for filter methods, too, because you want to
> pass a nice predicate, but you can't just forget all filtered out entries.
>

I think Will said he *doesn't* want to wrap the entries read by the consumer :)

IMO there's no "good" way to provide repeatable read isolation for a
transaction that reads all the keys in the cache, so this API should
create a separate transaction for each entry. I wouldn't try to make
the consumers see the current transaction's modifications if started
from a transaction either, I'd throw an exception if started from a
transaction instead.

>>
>>         I agree that "locking" should not be exposed with optimistic
>>         transactions.
>>
>>
>>     Yeah I can't find a good way to do this really and it seems to be
>>     opposite of what optimistic transactions are.
>>

Ok, the name forEachWithLock doesn't really fit with optimistic
locking, but I think with a more neutral name it could work for
optimistic caches as well.

>>
>>         With pessimistic transactions, how do you expect to handle locking
>>         order? For regular operations, user is responsible for setting
>>         up some
>>         locking order in order to not get a deadlock. With pessimistic
>>         transaction, it's the cache itself who will order the calls.
>>         Also, if
>>         you lock anything that is read, you just end up locking
>>         everything (or,
>>         getting a deadlock). If you don't it's the same as issuing the
>>         lock and
>>         reading again (to check the locked value) - but you'd do that
>>         internally
>>         anyway. Therefore, I don't feel well about pessimistic
>>         transactions neither.
>>
>>
>>     The lock is done per key only for each invocation. There is no
>>     ordering as only one is obtained at a time before it goes to the
>>     next. If the user then acquires a lock for another key while in
>>     the Consumer this could cause a deadlock if the inverse occurs on
>>     a different thread/node, but this is on the user. It is the same
>>     as it is today really, except we do the read lock for them before
>>     invoking their Consumer.
>>
>
> In pessimistic mode, you should not release a lock before the end of the
> transaction.
>

Exactly. Each consumer needs to have its own transaction, otherwise
the transaction's lockedKeys collection would have to grow to include
all the keys in the cache.

>>
>>         >
>>         > Another question is what does the API for this look like. I was
>>         > debating between 3 options myself:
>>         >
>>         > 1. AdvancedCache.forEachWithLock(BiConsumer<Cache,
>>         CacheEntry<K, V>>
>>         > consumer)
>>         >
>>         > This require the least amount of changes, however the user can't
>>         > customize certain parameters that CacheStream currently provides
>>         > (listed below - big one being filterKeys).
>>         >
>>         > 2. CacheStream.forEachWithLock(BiConsumer<Cache,
>>         CacheEntry<K, V>>
>>         > consumer)
>>         >
>>         > This method would only be allowed to be invoked on the
>>         Stream if no
>>         > other intermediate operations were invoked, otherwise an
>>         exception
>>         > would be thrown. This still gives us access to all of the
>>         CacheStream
>>         > methods that aren't on the Stream interface (ie.
>>         > sequentialDistribution, parallelDistribution, parallel,
>>         sequential,
>>         > filterKeys, filterKeySegments, distributedBatchSize,
>>         > disableRehashAware, timeout).
>>
>>         For both options, I don't like Cache being passed around. You
>>         should
>>         modify the CacheEntry (or some kind of view) directly.
>>
>>
>>     I don't know for sure if that is sufficient for the user.
>>     Sometimes they may modify another Cache given the value in this
>>     one for example, which they could access from the CacheManager of
>>     that Cache. Maybe Tristan knows more about some use cases.
>>
>
> Rather than guessing what could the user need, the Consumer could be CDI
> enabled.
>

If the user actually needs to work with more than one entry at a time,
I think it would be much cleaner for him to use regular forEach() and
start an explicit transaction in the consumer.

>>
>>         Radim
>>
>>         >
>>         > 3. LockedStream<CacheEntry<K, V>> AdvancedCache.lockedStream()
>>         >
>>         > This requires the most changes, however the API would be the
>>         most
>>         > explicit. In this case the LockedStream would only have the
>>         methods on
>>         > it that are able to be invoked as noted above and forEach.
>>         >
>>         > I personally feel that #3 might be the cleanest, but obviously
>>         > requires adding more classes. Let me know what you guys
>>         think and if
>>         > you think the optimistic exclusion is acceptable.
>>         >
>>         > Thanks,
>>         >
>>         >  - Will
>>         >