[infinispan-dev] Multi get API

Pedro Ruivo pedro at infinispan.org
Tue Mar 24 10:38:15 EDT 2015


Hi,

comments inline...

On 03/24/2015 01:57 PM, Radim Vansa wrote:
> On 03/24/2015 02:22 PM, William Burns wrote:
>> I am nearing completion of the new multi get command [1], allowing for
>> more efficient retrieval of multiple keys at the same time.  Thanks
>> again to Radim for a lot of the initial work.
>>
>> In doing so though I want to make sure I get feedback on how we want
>> the API to actually look, since this is much more difficult to change
>> down the line.
>>
>> There are a few things in particular I wanted to discuss.
>>
>> 1. The actual name of the method.  The main suggestions I have seen
>> are getAll or getMany.  I do like the naming of the former, however it
>> seems a bit misleading (remember API is getAll(Set) since we are
>> really getting a subset.  So I am thinking the possibilities at this
>> point are getAllOf, getAllFor or getMany.  I am leaning maybe towards
>> the last one (getMany).  I am open to any suggestions though.
>
> Actually Tristan suggested the name multiGet() - I would prefer this one
> in the end, and adding multiPut() that would just do putAll (to have the
> API symmetric) and deprecate the putAll() method. multiRemove and others
> can follow later, and the naming is straightforward.
> I would object against getAll() since this sounds like retrieving all
> entries from the cache, not just the specified keys.
>

Why a Set as parameter?

If we have a Collection as parameter, it would allow a user to specify 
easily the keys he want, like multiGet(Arrays.asList(k1, k2, k3)).

>>
>> 2. What should the return value be for this method.  Currently this
>> returns a Map<K, V> which makes sense when we retain these values in
>> the local transactional context and is pretty nice and clean for end
>> users.
>>
>> The other idea is to use a streaming API possibly returning an
>> Iterable<CacheEntry<K, V>>.  The streaming return value doesn't seem
>> as intuitive to me, but opens us up for more efficient usage
>> especially when there may be a lot of keys passed in (also results can
>> be processed concurrently while retrieval is still occurring).
>>
>> I would lean towards returning the Map<K, V> value, however the next
>> point could change that.
>
> I think that Iterable<CacheEntry<K, V>> is too confusing for the end
> user, I would stick to the Map. If you want lazy loading, the Map (and
> it's EntrySet) could be made lazy by a flag.

my vote is to use the Map<K,V> interface. The Iterable will force the 
user to fetch the data using the iterator order, while the Map allow him 
to use any order he wants (i.e. fetch particular keys via map.get(()).

BTW, it can be good to wrap the Map in a unmodifiableMap().

>
>>
>> 3. Where this method should live.  Right now this method is on the
>> BasicCache interface which is a parent of both Cache (embedded) and
>> RemoteCache (remote).  Unfortunately for remote clients to be as
>> efficient as possible we really need a Streaming API so that we don't
>> have multiple copies of data in memory at multiple places at the same
>> time.  For that reason I suggest we use the streaminig API for both or
>> have a different API for remote vs embedded.
>>

IMO, we should have different API for embedded and remote.

BasicCache does not seems a good place to put it. It can be placed in 
AdvancedCache.

Cheers,
Pedro

>
> Radim
>


More information about the infinispan-dev mailing list