On Tue, Mar 24, 2015 at 12:45 PM, William Burns <mudokonman(a)gmail.com> wrote:
On Tue, Mar 24, 2015 at 12:14 PM, Tristan Tarrant
<ttarrant(a)redhat.com> wrote:
> On 24/03/2015 16:22, William Burns wrote:
>> On Tue, Mar 24, 2015 at 11:06 AM, Radim Vansa <rvansa(a)redhat.com> wrote:
>>> On 03/24/2015 03:38 PM, Pedro Ruivo wrote:
>>>> Hi,
>>>>
>>>> comments inline...
>>>>
>>>> On 03/24/2015 01:57 PM, Radim Vansa wrote:
>>>>> On 03/24/2015 02:22 PM, William Burns wrote:
>>>>>> I am nearing completion of the new multi get command [1],
allowing for
>>>>>> more efficient retrieval of multiple keys at the same time.
Thanks
>>>>>> again to Radim for a lot of the initial work.
>>>>>>
>>>>>> In doing so though I want to make sure I get feedback on how we
want
>>>>>> the API to actually look, since this is much more difficult to
change
>>>>>> down the line.
>>>>>>
>>>>>> There are a few things in particular I wanted to discuss.
>>>>>>
>>>>>> 1. The actual name of the method. The main suggestions I have
seen
>>>>>> are getAll or getMany. I do like the naming of the former,
however it
>>>>>> seems a bit misleading (remember API is getAll(Set) since we are
>>>>>> really getting a subset. So I am thinking the possibilities at
this
>>>>>> point are getAllOf, getAllFor or getMany. I am leaning maybe
towards
>>>>>> the last one (getMany). I am open to any suggestions though.
>>>>> Actually Tristan suggested the name multiGet() - I would prefer this
one
>>>>> in the end, and adding multiPut() that would just do putAll (to have
the
>>>>> API symmetric) and deprecate the putAll() method. multiRemove and
others
>>>>> can follow later, and the naming is straightforward.
>>>>> I would object against getAll() since this sounds like retrieving
all
>>>>> entries from the cache, not just the specified keys.
>>>>>
>>>> Why a Set as parameter?
>>>>
>>>> If we have a Collection as parameter, it would allow a user to specify
>>>> easily the keys he want, like multiGet(Arrays.asList(k1, k2, k3)).
>>> It's Set as the collection should not contain duplicities. However,
it's
>>> a bit questionable whether the collection should be copied for (async)
>>> internal processing or whether Infinispan takes ownership of this map
>>> (first option is more safe, second is faster).
>> I would say it shouldn't be copied, but I could put a comment on the
>> Javadoc if the Set is modified concurrently while the operation is
>> running the results returned will be nondeterministic.
> The JCache contract for this is:
>
> Map<K, V> getAll(Set<? extends K> keys);
Ah, yes I had forgotten this was already on JCache, in that case we
have a clear precedent and full API definition. I am thinking to be
consistent using this would be best and later we can look into Stream
API as was mentioned.
Actually thinking about this more, the JCache API for getAll seems
wrong. It seems it should be Map<K, V> getAll(Set<?> keys). Putting
an upper or lower bounds on the type is too restrictive. By doing
Set<? extends K> I can't pass in a Set<Object> for example. Looking
at the various Collection classes they do a similar thing with
containsAll, retainAll, removeAll etc. I am thinking I am going to
have ours be Set<?> then instead. We will still have compatibility
with JCache as well.
>
> Nothing is said about whether "keys" should be "safe", and I
think that
> is absolutely fine for us as well (i.e. no copying, if you change the
> set, you get what you deserve).
>
>>
>>> Radim
>>>
>>>>>> 2. What should the return value be for this method. Currently
this
>>>>>> returns a Map<K, V> which makes sense when we retain these
values in
>>>>>> the local transactional context and is pretty nice and clean for
end
>>>>>> users.
>>>>>>
>>>>>> The other idea is to use a streaming API possibly returning an
>>>>>> Iterable<CacheEntry<K, V>>. The streaming return
value doesn't seem
>>>>>> as intuitive to me, but opens us up for more efficient usage
>>>>>> especially when there may be a lot of keys passed in (also
results can
>>>>>> be processed concurrently while retrieval is still occurring).
>>>>>>
>>>>>> I would lean towards returning the Map<K, V> value, however
the next
>>>>>> point could change that.
>>>>> I think that Iterable<CacheEntry<K, V>> is too confusing
for the end
>>>>> user, I would stick to the Map. If you want lazy loading, the Map
(and
>>>>> it's EntrySet) could be made lazy by a flag.
>> The point wasn't that it was lazy, but rather that we could limit how
>> many values are maintained on the caller side to prevent excess memory
>> usage.
>>
>>>> my vote is to use the Map<K,V> interface. The Iterable will force
the
>>>> user to fetch the data using the iterator order, while the Map allow him
>>>> to use any order he wants (i.e. fetch particular keys via map.get(()).
>>>>
>>>> BTW, it can be good to wrap the Map in a unmodifiableMap().
>> I wonder if this really matters. Each invocation would return its own
>> map, so if they modified it, it shouldn't matter. I would make sure
>> to document that the Map is not backing map in that updates to the
>> cache or this map are not reflected in the other.
> Agree on all points. An iterable variant would probably more closely
> resemble the Java 8 streams, so I wouldn't overcomplicated this specific
> API.
>
>>>>>> 3. Where this method should live. Right now this method is on
the
>>>>>> BasicCache interface which is a parent of both Cache (embedded)
and
>>>>>> RemoteCache (remote). Unfortunately for remote clients to be as
>>>>>> efficient as possible we really need a Streaming API so that we
don't
>>>>>> have multiple copies of data in memory at multiple places at the
same
>>>>>> time. For that reason I suggest we use the streaminig API for
both or
>>>>>> have a different API for remote vs embedded.
>>>>>>
>>>> IMO, we should have different API for embedded and remote.
>> +1 That is what I was thinking as well that we make the embedded API
>> nicer to use. We can try to figure out the specifics of the remote
>> API later then.
>>
>>>> BasicCache does not seems a good place to put it. It can be placed in
>>>> AdvancedCache.
>> +1 I was thinking this may be the safest for now and we can always
>> move it to a higher class later if needed, however we can't move it
>> down easily.
> +1 for AdvancedCache for now, for Infinispan 8 Galder and I will discuss
> interfaces next week.
>
> Tristan
>
> --
> Tristan Tarrant
> Infinispan Lead
> JBoss, a division of Red Hat
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/infinispan-dev