[infinispan-dev] New API to iterate over current entries in cache

William Burns mudokonman at gmail.com
Thu May 1 09:50:51 EDT 2014


On Thu, May 1, 2014 at 8:59 AM, Sanne Grinovero <sanne at infinispan.org> wrote:
> On 30 April 2014 15:06, William Burns <mudokonman at gmail.com> wrote:
>> Was wondering if anyone had any opinions on the API for this.
>>
>> These are a few options that Dan and I mulled over:
>>
>> Note the CloseableIterable inteface mentioned is just an interface
>> that extends both Closeable and Iterable.
>>
>> 1. The API that is very similar to previously proposed in this list
>> but slightly changed:
>>
>> Methods on AdvancedCache
>>
>> CloseableIterable<CacheEntry<K, V>> entryIterable(KeyValueFilter<?
>> super K, ? super V> filter);
>>
>> <C> CloseableIterable<CacheEntry<K, C>> entryIterable(KeyValueFilter<?
>> super K, ? super V> filter, Converter<? super K, ? super V, C>
>> converter);
>>
>> Note the difference here is that it would return an Iterable instead
>> of Iterator, which would allow for it being used in a for loop.
>>
>> Example usage would be (types omitted)
>>
>> for (CacheEntry entry : cache.entryIterable(someFilter, someConverter)) {
>> // Do something
>> }
>
>
> If it's important to close the Iterable, this example highlights a
> problem of the API.
> Ideally I think you might want to drop the need for the #close()
> method, but I'm guessing that's not an option so I'd avoid the
> Iterable API in that case.

Good point, I totally forgot to cover the Closeable aspect in the first email.

Unfortunately changing it to be Iterable does pose a slight issue.  I
was thinking we do something along the lines that Dan was thinking of
by preventing the Iterable from producing more than 1 Iterable (maybe
throw IllegalStateException).  This way when we close the Iterable it
would also close the underlying Iterator.


try (EntryIterable<K, C> entries =
advancedCache.entryIterable(someFilter, someConverter)) {
  for (Entry<K, C> e : entries) {
    ...
  }
}

> You could think of an intermediary place-holder to still allow for
> natural iteration:
>
>  try ( CacheEntryIteratorContext ctx = cache.entryIterable(someFilter,
> someConverter) ) {
>     for (CacheEntry entry : ctx.asIterable()) {
>        // Do something
>     }
>  }
>
> But I'm not liking the names I used above, as I would expect to be
> able to reuse the same iterator for multiple invocations of
> iterable(), and have each to restart the iteration from the beginning.

Obviously from above this wouldn't be possible if we made those
changes.  Do you think this is reason to prevent those changes?  Or do
you think we should allow multiple iterators but closing the Iterable
would also close down each of the Iterators?  I am worried this might
be a bit cumbersome/surprising, but documenting it might be
sufficient.

> Can this be solved with better name choices?
>
>
>> 2. An API that returns a new type EntryIterable for example that can
>> chain methods to provide a filter and converter.
>>
>> on AdvancedCache
>>
>> EntryIterable<K, V> entryIterable();
>>
>> where EntryIterable is defined as:
>>
>> public interface EntryIterable<K, V> extends
>> CloseableIterable<CacheEntry<K, V>> {
>>
>>    public EntryIterable<K, V> filter(KeyValueFilter<? super K, ? super
>> V> filter);
>>
>>    public EntryIterable<K, V> converter(Converter<? super K, ? super
>> V, ? extends V> converter);
>>
>>    public <C> CloseableIterable<CacheEntry<K, C>>
>> projection(Converter<? super K, ? super V, C> converter);
>> }
>>
>> Note that there are 2 methods that take a Converter, this is to
>> preserve the typing, since the method would return a different
>> EntryIterable instance.  However I can also see removing one of the
>> converter method and just rename projection to converter instead.
>>
>> This API would allow for providing optional fields more cleanly or not
>> if all if desired.
>>
>> Example usage would be (types omitted)
>>
>> for (CacheEntry entry :
>> cache.entryIterable().filter(someFilter).converter(someConverter)) {
>> // Do something
>> }
>
> This looks very nice, assuming you fix the missing close().

So in that case you like #2 I assume?  That is what I was leaning
towards as well.

> Am I missing a catch?

Yes it would be very similar to above with the outer try block and
then passing in the CloseableIterable into the inner for loop.

>
> Also it's quite trivial for the user to do his own filtering and
> conversion in the "do something block", so I'm wondering if there is a
> reason beyond API shugar to expose this.

The filters and converters are sent remotely to reduce resulting
network traffic.  The filter is applied on each node to determine what
data it sends back and the converter is applied before sending back
the value as well.

> It would be lovely if for example certain filters could affect loading
> from CacheStores - narrowing down a relational database select for
> example - and I guess the same concept could apply to the converted if
> you'd allow to select a subset of fields.

There are plans to have a JPA string filter/converter that will live
in a new forthcoming module.  We could look into enhancing this in the
future to have better integration with the JPACacheStore.  A possible
issue I can think of is if the projection doesn't contain the key
value, because currently rehash can cause duplicate values that is
detected by the key.

>
> I don't think these optimisations need to be coded right now, but it
> would be nice to keep the option open for future enhancement.

I agree that would be nice

>
>
>> 3. An API that requires the filter up front in the AdvancedCache
>> method.  This also brings up the point should we require a filter to
>> always be provided?  Unfortuantely this doesn't prevent a user from
>> querying every entry as they can just use a filter that accepts all
>> key/value pairs.
>
> Why is that unfortunate?

I was saying along the lines if we wanted to make sure a user doesn't
query all data.  But if we do want them to do this, then it is great.
I know some people are on the fence about it.

>
>>
>> on AdvancedCache
>>
>> EntryIterable<K, V> entryIterable(Filter<? super K, ? super V> filter)
>>
>> where EntryIterable is defined as:
>>
>> public interface EntryIterable<K, V> extends
>> CloseableIterable<CacheEntry<K, V>> {
>>
>>    public <C> CloseableIterable<CacheEntry<K, C>>
>> converter(Converter<? super K, ? super V, C> converter);
>> }
>>
>> The usage would be identical to #2 except the filter is always provided.
>
> I wouldn't mandate it, but in case you overload the method it probably
> is a good idea to internally apply an accept-all filter so you have a
> single implementation. Could be a singleton, which also implies an
> efficient Externalizer.

Although for performance I guess it would be better to not provide a
Filter for the retrieve all case.

>
>>
>>
>> Let me know what you guys think or if you have any other suggestions.
>>
>>  Thanks,
>>
>>  - Will
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


More information about the infinispan-dev mailing list