On 7 Aug 2013, at 14:57, Galder Zamarreño <galder(a)redhat.com> wrote:
A few things to note that have not been mentioned:
- BulkCacheLoader.bulk* methods: if what you're doing is return an iterator, why not
call that method iterator(), just like Collection defines iterator?
- BulkCacheLoader and Iterator classes: I find it a bit confusing that we return
Infinispan specific iterators from bulk* methods but removeAll and others take a different
Iterator… I don't have an answer for this… :|
This will change based on Sanne's suggestion of parallel processing.
However the reason I preferred an custom iterator over the one in java are:
- j.u.Teartor.remove method won't be supported so not nice having it there
- when iterating over the CacheEntries, with the j.u.Itearator I have to create an
Map.Entry object for each entry I process. This object doesn't need to be created on
an native iterator that has the key(), value() and next() methods.
- We're moving away from the cache loader vs cache store
interface separation? I kinda liked the way JSR-107 had defined these, keeping them
separate and keeping them independent [1]. A read-only cache store would implement only
cache loader. A read-write one would implement both interfaces.
Yeah people seem to prefer this for some reason :-)
Here's my though on this : I think CacheLoader + CacheWriter is a nice OOP design, but
is rather theoretical. In all external users scenarios I know, the interaction with the
store is read+write so most of the people thing about a store along this lines. Having a
distinction between loads and stores seems unnatural and creates confusion. For the few
that only need a loader they can simply leave the store() empty - as simple as that.
More below...
[1]
https://github.com/jsr107/jsr107spec/tree/v0.8/src/main/java/javax/cache/...
On Aug 6, 2013, at 7:13 PM, William Burns <wburns(a)redhat.com> wrote:
>
> ----- Original Message -----
>> From: "Manik Surtani" <msurtani(a)redhat.com>
>> To: "Mircea Markus" <mmarkus(a)redhat.com>
>> Cc: "Infinispan Core Devs" <infinispan-core-dev(a)infinispan.org>
>> Sent: Monday, August 5, 2013 9:23:41 AM
>> Subject: Re: the new CacheLoader API
>>
>>
>> On 1 Aug 2013, at 11:50, Mircea Markus <mmarkus(a)redhat.com> wrote:
>>
>>> Hi,
>>>
>>> Based on the feedback I received and after doing some prototyping,
here's
>>> the new CacheLoader API I came up
>>>
with:https://github.com/mmarkus/infinispan/tree/t_ISPN-3290/CS_redesign/c...
>>>
>>> I know everybody's quite busy but can you please take some time and
review
>>> this? It's a very important API chance and this would help to getting it
>>> off on the right foot.
>>>
>>> It should cover all the gathered requirements:
>>>
>>> - support for non-distributed transaction cache stores (1PC) and support
>>> for XA capable cache store
>>> [Decision: all the transaction support is delegated to Infinispan. The
>>> cache store SPI is much simplified.]
>>
>> +1
>>
>>>
>>> - support iteration over all the keys/entries in the store
>>> - needed for efficient Map/Reduce integration
>>> - needed for efficient implementation of Cache.keySet(), Cache.entrySet(),
>>> Cache.values() methods
^ We all agree that these methods are problematic. I don't think we should work
towards making them efficient. Ultimately we should phase them out...
>>> [look in BulkCacheLoader]
>>
>> I think you meant for bulkLoadKeys() to take in a KeyFilter, not a
>> Collection?
>
> My guess is Mircea was going for an overloaded method for bulkLoadKeys where we want
both one that takes a Collection and one that takes a KeyFilter? If so it seems to me it
would be simpler to just have a single method that takes KeyFilter only, but then have
another class like KeyFilters that has various static factory methods that can take a
Iterable or Iterator for example so we don't have too many methods on the loader
itself.
+1
> Also removeAll should probably take a KeyFilter similarly.
+1
> I noticed there is an overloaded bulkLoad which takes no arguments, which I assume
just streams/loads all entries from the cache store. Do we need to have similarly
overloaded methods for bulkLoadKeys? If not than do we even need the no arg bulkLoad
method and just have null/AllKeyFilter be passed in instead? Also similarly would we need
clear method then? Just trying to think if we can make the interface have as few methods
as needed.
+1, and +100 to keeping as few methods as needed.
> Some methods don't have types on them. I am guessing we want to put Object type
just to prevent random warnings for user and CacheStore implementations.
+1
>
>>
>>>
>>> - a simple read(k) + write(k,v) interface to be implemented by users that
>>> just want to position ISPN as a cache between an app and a legacy system
>>> and which don't need/want to be bothered with all the other complex
>>> features
>>> [look at BasicCacheLoader]
>
> We should move to the immutable CacheStoreConfiguration class now instead of
CacheStoreConfig, right?
+1
>
>>
>> Shouldn't remove() return a boolean?
>
> It looks like we don't use the return value in ISPN code currently, but I would
agree for future usage we should probably have it return a boolean. Also to be more
inline with Collection interface do we want a boolean to be returned from the removeAll
method if something was removed?
It all depends what the guarantees should be at the end of the remove… if the method has
no return, implementations are more free to do the remove asynchronously and not having to
wait to see if anything was removed. The return forces some kind of result out of that
removal, which could help Infinsipan makes decisions in the future...
>
>>
>>>
>>> - support for expiration notification (ISPN-3064)
>>> [look at ExpiryCacheLoader.purgeExpired]
>
> I am wondering if there is some way to have this be more of a streaming approach
instead of returning a Set will all of the expired keys in memory. What do you think if
we passed in a callback of some type to the purgeExpired method instead. This would make
it the burden of the CacheLoader to properly invoke the method for each one, but it would
allow for more efficient use of memory when the cache loader can implement a streaming
approach to eviction.
That sounds like an good idea :). Also, why does ExpiryCacheLoader need to extend
BasicCacheLoader? Couldn't it be treated like a mixin? IOW, have a cache loader that
can optionally purgeExpired entries.
>
>>
>> Returning the set of purged keys could be unnecessarily expensive... e.g., in
>> a JDBC-backed store, expiration could be a simple DELETE WHERE... statement,
>> whereas if the keys need to be returned, this becomes far more complex.
>>>
>>> - support for size (efficient implementation of the cache.size() method)
>>> [look at BulkCacheLoader.size()]
>>
>> I'm not so sure if I buy the separate interfaces approach. E.g., what
>> happens to calling cache.size() when the configured cache loader is not a
>> BulkCacheLoader? YOu effectively change the contract of cache.size()?
>
> Actually that is an interesting point. I would normally say it has to fallback to the
bulkLoadKeys method but that is also on the BulkCacheLoader interface. Actually without
defining your Cache to implement BulkCacheLoader you can't even have preload enabled
either. It seems we maybe need another interface or move some of the methods down to the
BasicCacheLoader
I also agree that this separation of bulk vs basic cache loader interface is a bit weird…
, particularly since the difference between the two is not only configuration (whether
preload is enabled or not), but also the fact that some cache operations currently require
an iterable cache store (size, keys, values, entries and iterator!!).
To reiterate what I said above, we shouldn't fixate ourselves with size, keys, values
and entries methods, since they're difficult to implement and potentially expensive,
but iterator() is something we're always going to be needing in the Cache interface.
So, iterator() should really be living in CacheLoader (or BasicCacheLoader) interface.
With all these in mind, I can see the point of separating any cache loader methods that
support cache methods that we'll be phasing out (keys, values, size) to a separate
inteface, i.e. BulkCacheLoader. This way, when we phase them out, we don't need to
modify the base cache loader interface. By simply removing the BulkCacheLoader interface
you've achieved that.
Cheers,
>
>>
>>>
>>> Cheers,
>>> --
>>> Mircea Markus
>>> Infinispan lead (
www.infinispan.org)
>>>
>>>
>>>
>>>
>>
>> --
>> Manik Surtani
>>
>>
>>
>>
--
Galder Zamarreño
galder(a)redhat.com
twitter.com/galderz
Project Lead, Escalante
http://escalante.io
Engineer, Infinispan
http://infinispan.org
Cheers,
--
Mircea Markus
Infinispan lead (
www.infinispan.org)