[infinispan-dev] Multi get API

William Burns mudokonman at gmail.com
Tue Mar 24 12:45:44 EDT 2015


On Tue, Mar 24, 2015 at 12:14 PM, Tristan Tarrant <ttarrant at redhat.com> wrote:
> On 24/03/2015 16:22, William Burns wrote:
>> On Tue, Mar 24, 2015 at 11:06 AM, Radim Vansa <rvansa at redhat.com> wrote:
>>> On 03/24/2015 03:38 PM, Pedro Ruivo wrote:
>>>> Hi,
>>>>
>>>> comments inline...
>>>>
>>>> On 03/24/2015 01:57 PM, Radim Vansa wrote:
>>>>> On 03/24/2015 02:22 PM, William Burns wrote:
>>>>>> I am nearing completion of the new multi get command [1], allowing for
>>>>>> more efficient retrieval of multiple keys at the same time.  Thanks
>>>>>> again to Radim for a lot of the initial work.
>>>>>>
>>>>>> In doing so though I want to make sure I get feedback on how we want
>>>>>> the API to actually look, since this is much more difficult to change
>>>>>> down the line.
>>>>>>
>>>>>> There are a few things in particular I wanted to discuss.
>>>>>>
>>>>>> 1. The actual name of the method.  The main suggestions I have seen
>>>>>> are getAll or getMany.  I do like the naming of the former, however it
>>>>>> seems a bit misleading (remember API is getAll(Set) since we are
>>>>>> really getting a subset.  So I am thinking the possibilities at this
>>>>>> point are getAllOf, getAllFor or getMany.  I am leaning maybe towards
>>>>>> the last one (getMany).  I am open to any suggestions though.
>>>>> Actually Tristan suggested the name multiGet() - I would prefer this one
>>>>> in the end, and adding multiPut() that would just do putAll (to have the
>>>>> API symmetric) and deprecate the putAll() method. multiRemove and others
>>>>> can follow later, and the naming is straightforward.
>>>>> I would object against getAll() since this sounds like retrieving all
>>>>> entries from the cache, not just the specified keys.
>>>>>
>>>> Why a Set as parameter?
>>>>
>>>> If we have a Collection as parameter, it would allow a user to specify
>>>> easily the keys he want, like multiGet(Arrays.asList(k1, k2, k3)).
>>> It's Set as the collection should not contain duplicities. However, it's
>>> a bit questionable whether the collection should be copied for (async)
>>> internal processing or whether Infinispan takes ownership of this map
>>> (first option is more safe, second is faster).
>> I would say it shouldn't be copied, but I could put a comment on the
>> Javadoc if the Set is modified concurrently while the operation is
>> running the results returned will be nondeterministic.
> The JCache contract for this is:
>
> Map<K, V> getAll(Set<? extends K> keys);

Ah, yes I had forgotten this was already on JCache, in that case we
have a clear precedent and full API definition.  I am thinking to be
consistent using this would be best and later we can look into Stream
API as was mentioned.

>
> Nothing is said about whether "keys" should be "safe", and I think that
> is absolutely fine for us as well (i.e. no copying, if you change the
> set, you get what you deserve).
>
>>
>>> Radim
>>>
>>>>>> 2. What should the return value be for this method.  Currently this
>>>>>> returns a Map<K, V> which makes sense when we retain these values in
>>>>>> the local transactional context and is pretty nice and clean for end
>>>>>> users.
>>>>>>
>>>>>> The other idea is to use a streaming API possibly returning an
>>>>>> Iterable<CacheEntry<K, V>>.  The streaming return value doesn't seem
>>>>>> as intuitive to me, but opens us up for more efficient usage
>>>>>> especially when there may be a lot of keys passed in (also results can
>>>>>> be processed concurrently while retrieval is still occurring).
>>>>>>
>>>>>> I would lean towards returning the Map<K, V> value, however the next
>>>>>> point could change that.
>>>>> I think that Iterable<CacheEntry<K, V>> is too confusing for the end
>>>>> user, I would stick to the Map. If you want lazy loading, the Map (and
>>>>> it's EntrySet) could be made lazy by a flag.
>> The point wasn't that it was lazy, but rather that we could limit how
>> many values are maintained on the caller side to prevent excess memory
>> usage.
>>
>>>> my vote is to use the Map<K,V> interface. The Iterable will force the
>>>> user to fetch the data using the iterator order, while the Map allow him
>>>> to use any order he wants (i.e. fetch particular keys via map.get(()).
>>>>
>>>> BTW, it can be good to wrap the Map in a unmodifiableMap().
>> I wonder if this really matters.  Each invocation would return its own
>> map, so if they modified it, it shouldn't matter.  I would make sure
>> to document that the Map is not backing map in that updates to the
>> cache or this map are not reflected in the other.
> Agree on all points. An iterable variant would probably more closely
> resemble the Java 8 streams, so I wouldn't overcomplicated this specific
> API.
>
>>>>>> 3. Where this method should live.  Right now this method is on the
>>>>>> BasicCache interface which is a parent of both Cache (embedded) and
>>>>>> RemoteCache (remote).  Unfortunately for remote clients to be as
>>>>>> efficient as possible we really need a Streaming API so that we don't
>>>>>> have multiple copies of data in memory at multiple places at the same
>>>>>> time.  For that reason I suggest we use the streaminig API for both or
>>>>>> have a different API for remote vs embedded.
>>>>>>
>>>> IMO, we should have different API for embedded and remote.
>> +1 That is what I was thinking as well that we make the embedded API
>> nicer to use.  We can try to figure out the specifics of the remote
>> API later then.
>>
>>>> BasicCache does not seems a good place to put it. It can be placed in
>>>> AdvancedCache.
>> +1 I was thinking this may be the safest for now and we can always
>> move it to a higher class later if needed, however we can't move it
>> down easily.
> +1 for AdvancedCache for now, for Infinispan 8 Galder and I will discuss
> interfaces next week.
>
> Tristan
>
> --
> Tristan Tarrant
> Infinispan Lead
> JBoss, a division of Red Hat
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


More information about the infinispan-dev mailing list