<div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Mon, Mar 27, 2017 at 9:02 PM Galder Zamarreño <<a href="mailto:galder@redhat.com">galder@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br class="gmail_msg">
--<br class="gmail_msg">
Galder Zamarreño<br class="gmail_msg">
Infinispan, Red Hat<br class="gmail_msg">
<br class="gmail_msg">
> On 21 Mar 2017, at 17:16, Dan Berindei <<a href="mailto:dan.berindei@gmail.com" class="gmail_msg" target="_blank">dan.berindei@gmail.com</a>> wrote:<br class="gmail_msg">
><br class="gmail_msg">
> I'm leaning towards option 1.<br class="gmail_msg">
><br class="gmail_msg">
> Are you thinking about also allowing the consumer to modify the entry,<br class="gmail_msg">
> like JCache's EntryProcessors? For a consumer that can only modify the<br class="gmail_msg">
> current entry, we could even "emulate" locking in an optimistic cache<br class="gmail_msg">
> by catching the WriteSkewException and running the consumer again.<br class="gmail_msg">
><br class="gmail_msg">
> I wouldn't allow this to be mixed with other operations in a stream,<br class="gmail_msg">
> because then you may have to run filters/mappers/sorting while holding<br class="gmail_msg">
> the lock as well.<br class="gmail_msg">
<br class="gmail_msg">
^ Would forEach w/ lock still run for all entries in originator? If so, not being able to filter could be a pain. IOW, you'd be forcing all entries to be shipped to a node and user to do its own filtering. Not ideal :\<br class="gmail_msg"></blockquote><div><br></div><div>No, the primary owner would run the operation per entry. In my proposal above I was thinking we would have two levels of filtering.</div><div><br></div><div>The first uses the filterKeys method on CacheStream. This requires serializing the keys sent to each node (each node is only sent the keys it is the primary owner of). Although serialization has a cost, it is made up for by constant-time lookups for those keys: the stream is populated using Cache.get, so there is no iterating over memory or stores.</div><div><br></div><div>The second is to support the filter method from the Stream API, which takes a Predicate, so you don't have to serialize keys. You wouldn't want to embed keys in this Predicate, though, as all the keys would be serialized to all nodes and each node would still have to iterate over and check its entire data container/store.</div><div><br></div><div>You could also combine the two. So if you only want a subset of known keys whose values match a Predicate, that can be done too.</div><div><br></div><div>cache.lockedStream().filterKeys(keys).filter(predicate).forEach(consumer);</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br class="gmail_msg">
<br class="gmail_msg">
><br class="gmail_msg">
> Cheers<br class="gmail_msg">
> Dan<br class="gmail_msg">
><br class="gmail_msg">
><br class="gmail_msg">
> On Tue, Mar 21, 2017 at 5:37 PM, William Burns <<a href="mailto:mudokonman@gmail.com" class="gmail_msg" target="_blank">mudokonman@gmail.com</a>> wrote:<br class="gmail_msg">
>> Some users have expressed the need to have some sort of forEach operation<br class="gmail_msg">
>> that is performed where the Consumer is called while holding the lock for<br class="gmail_msg">
>> the given key and subsequently released after the Consumer operation<br class="gmail_msg">
>> completes.<br class="gmail_msg">
>><br class="gmail_msg">
>> Due to the nature of how streams work with retries and performing the<br class="gmail_msg">
>> operation on the primary owner, this works out quite well with forEach to be<br class="gmail_msg">
>> done in an efficient way.<br class="gmail_msg">
>><br class="gmail_msg">
>> The problem is that this only really works well with non tx and pessimistic<br class="gmail_msg">
>> tx. This obviously leaves out optimistic tx, which at first I was a little<br class="gmail_msg">
>> worried about. But after thinking about it more, this prelocking and<br class="gmail_msg">
>> optimistic tx don't really fit that well together anyways. So I am thinking<br class="gmail_msg">
>> whenever this operation is performed it would throw an exception not letting<br class="gmail_msg">
>> the user use this feature in optimistic transactions.<br class="gmail_msg">
>><br class="gmail_msg">
>> Another question is what does the API for this look like. I was debating<br class="gmail_msg">
>> between 3 options myself:<br class="gmail_msg">
>><br class="gmail_msg">
>> 1. AdvancedCache.forEachWithLock(BiConsumer<Cache, CacheEntry<K, V>><br class="gmail_msg">
>> consumer)<br class="gmail_msg">
>><br class="gmail_msg">
>> This requires the least amount of changes, however the user can't customize<br class="gmail_msg">
>> certain parameters that CacheStream currently provides (listed below - big<br class="gmail_msg">
>> one being filterKeys).<br class="gmail_msg">
>><br class="gmail_msg">
>> 2. CacheStream.forEachWithLock(BiConsumer<Cache, CacheEntry<K, V>> consumer)<br class="gmail_msg">
>><br class="gmail_msg">
>> This method would only be allowed to be invoked on the Stream if no other<br class="gmail_msg">
>> intermediate operations were invoked, otherwise an exception would be<br class="gmail_msg">
>> thrown. This still gives us access to all of the CacheStream methods that<br class="gmail_msg">
>> aren't on the Stream interface (ie. sequentialDistribution,<br class="gmail_msg">
>> parallelDistribution, parallel, sequential, filterKeys, filterKeySegments,<br class="gmail_msg">
>> distributedBatchSize, disableRehashAware, timeout).<br class="gmail_msg">
>><br class="gmail_msg">
>> 3. LockedStream<CacheEntry<K, V>> AdvancedCache.lockedStream()<br class="gmail_msg">
>><br class="gmail_msg">
>> This requires the most changes, however the API would be the most explicit.<br class="gmail_msg">
>> In this case the LockedStream would only have the methods on it that are<br class="gmail_msg">
>> able to be invoked as noted above and forEach.<br class="gmail_msg">
>><br class="gmail_msg">
>> I personally feel that #3 might be the cleanest, but obviously requires<br class="gmail_msg">
>> adding more classes. Let me know what you guys think and if you think the<br class="gmail_msg">
>> optimistic exclusion is acceptable.<br class="gmail_msg">
>><br class="gmail_msg">
>> Thanks,<br class="gmail_msg">
>><br class="gmail_msg">
>> - Will<br class="gmail_msg">
>><br class="gmail_msg">
>> _______________________________________________<br class="gmail_msg">
>> infinispan-dev mailing list<br class="gmail_msg">
>> <a href="mailto:infinispan-dev@lists.jboss.org" class="gmail_msg" target="_blank">infinispan-dev@lists.jboss.org</a><br class="gmail_msg">
>> <a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" rel="noreferrer" class="gmail_msg" target="_blank">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br class="gmail_msg">
<br class="gmail_msg">
<br class="gmail_msg">
</blockquote></div></div>
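<div>The two levels of filtering described in the reply can be sketched with a small single-node model in plain Java. This is only an illustration under stated assumptions: LockedForEachSketch, put, get and lockedForEach are hypothetical names, the per-key ReentrantLock merely stands in for the cache's own locking, and nothing here is Infinispan's implementation or API. In this sketch the predicate is evaluated while the lock is held.</div>

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

// Hypothetical single-node model of "forEach while holding the key lock".
// It is NOT Infinispan code; all names are assumptions made for illustration.
final class LockedForEachSketch {
    private final Map<String, Integer> data = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    void put(String key, int value) { data.put(key, value); }

    Integer get(String key) { return data.get(key); }

    // Level 1: only the given keys are visited, via direct lookups (no full scan).
    // Level 2: the predicate decides whether the consumer runs for an entry.
    // The consumer is invoked while the lock for that entry's key is held.
    void lockedForEach(Set<String> keys,
                       Predicate<Map.Entry<String, Integer>> filter,
                       BiConsumer<Map<String, Integer>, Map.Entry<String, Integer>> consumer) {
        for (String key : keys) {
            ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
            lock.lock();
            try {
                Integer value = data.get(key);
                if (value == null) {
                    continue; // key not present; the finally block still unlocks
                }
                Map.Entry<String, Integer> entry =
                        new AbstractMap.SimpleImmutableEntry<>(key, value);
                if (filter.test(entry)) {
                    consumer.accept(data, entry);
                }
            } finally {
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) {
        LockedForEachSketch cache = new LockedForEachSketch();
        cache.put("a", 1);
        cache.put("b", 20);
        cache.put("c", 30);
        // Visit only "a" and "b", and increment values greater than 10.
        cache.lockedForEach(new HashSet<>(Arrays.asList("a", "b")),
                e -> e.getValue() > 10,
                (map, e) -> map.put(e.getKey(), e.getValue() + 1));
        System.out.println(cache.get("a") + " " + cache.get("b") + " " + cache.get("c"));
    }
}
```

<div>In the sketch, "c" is never looked at because it is not in the key set, and "a" is looked up but skipped by the predicate, which mirrors the idea of narrowing by known keys first and by value second.</div>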