On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus <mmarkus(a)redhat.com> wrote:
On Oct 3, 2014, at 9:30, Radim Vansa <rvansa(a)redhat.com> wrote:
> Hi,
>
> recently we had a discussion about what size() returns, but I've
> realized there are more things that users would like to know. My
> question is whether you think that they would really appreciate it, or
> whether it's just my QA point of view, where I sometimes compute
> 'checksums' of the cache to see whether I have lost anything.
>
> These are the sizes in question:
> A) number of owned entries
> B) number of entries stored locally in memory
> C) number of entries stored in each local cache store
> D) number of entries stored in each shared cache store
> E) total number of entries in cache
>
> So far, we can get
> B via withFlags(SKIP_CACHE_LOAD).size()
> (passivation ? B : 0) + firstNonZero(C, D) via size()
> E via distributed iterators / MR
> A via data container iteration + distribution manager query, but only
> without cache store
> C or D through
> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores()
>
> I think that it would go along with users' expectations if size()
> returned E and for the rest we should have special methods on
> AdvancedCache. That would of course change the meaning of size(), but
> I'd say it would finally be something that has a firm meaning.
>
> WDYT?
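To make the difference between these numbers concrete, here is a toy, single-JVM sketch of sizes B to E and of the formula quoted above for what size() returns. It is only an illustration under stated assumptions: `CacheSizeModel`, `Node`, `reportedSize` and `totalEntries` are hypothetical names, not Infinispan classes or methods, and size A is left out because it needs ownership information that this model does not track.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy single-JVM model; none of these types are Infinispan classes.
final class CacheSizeModel {

    /** Hypothetical stand-in for one cluster node. */
    static final class Node {
        final Set<String> inMemory;   // keys held in the local data container (B)
        final Set<String> localStore; // keys held in this node's local cache store (C)

        Node(Set<String> inMemory, Set<String> localStore) {
            this.inMemory = inMemory;
            this.localStore = localStore;
        }
    }

    /** Returns the first non-zero argument, or 0 if all are zero. */
    static int firstNonZero(int... values) {
        for (int v : values) {
            if (v != 0) {
                return v;
            }
        }
        return 0;
    }

    /** The formula from the thread: (passivation ? B : 0) + firstNonZero(C, D). */
    static int reportedSize(Node node, Set<String> sharedStore, boolean passivation) {
        int b = node.inMemory.size();
        int c = node.localStore.size();
        int d = sharedStore.size(); // entries in the shared cache store (D)
        return (passivation ? b : 0) + firstNonZero(c, d);
    }

    /** E: distinct keys across every node's memory and local store, plus the shared store. */
    static int totalEntries(List<Node> nodes, Set<String> sharedStore) {
        Set<String> all = new HashSet<>(sharedStore);
        for (Node n : nodes) {
            all.addAll(n.inMemory);
            all.addAll(n.localStore);
        }
        return all.size();
    }

    public static void main(String[] args) {
        // Two nodes without passivation: each local store also holds the in-memory keys.
        Node n1 = new Node(Set.of("k1", "k2"), Set.of("k1", "k2", "k3"));
        Node n2 = new Node(Set.of("k3"), Set.of("k3", "k4"));
        Set<String> shared = Set.of();

        System.out.println("reported on n1: " + reportedSize(n1, shared, false));
        System.out.println("reported on n2: " + reportedSize(n2, shared, false));
        System.out.println("total (E): " + totalEntries(List.of(n1, n2), shared));
    }
}
```

In this made-up layout the two nodes report different values and neither matches the cluster-wide total, which is the kind of mismatch the proposal is about.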
There have been a lot of arguments in the past about whether size() and
other methods that operate over all the elements (keySet, values) are
useful, because:
- they are approximate (data changes during iteration)
- they are very resource-consuming and might be misused (this is the
reason we chose to give size() its current local semantics)
These methods (size, keys, values) are useful to people, and I think we
were not wise to implement them only on top of the local data: this is like
preferring efficiency over correctness. It has also created a lot of
confusion among our users; questions like "size() doesn't return the
correct value" are asked regularly. I totally agree that size() should
return E (i.e. everything that is stored within the grid, including
persistence), with its performance implications documented accordingly.
For keySet and values, we should stop implementing them (throw an
exception) and point users to Will's distributed iterator, which is a
nicer way to achieve the desired behavior.
We can also implement keySet() and values() on top of the distributed entry
iterator and document that using the iterator directly is better.
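As a rough illustration of that idea, the sketch below builds keySet() and size() on top of a single entry iterator that walks every node and skips keys it has already seen on another node (replicas). It is a toy, single-JVM model under stated assumptions: `ClusterWideView` and its methods are hypothetical names, each node is represented by a plain `Map`, and nothing here reflects how Infinispan's distributed iterator is actually implemented.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

// Toy single-JVM model; this is not Infinispan's distributed iterator.
final class ClusterWideView {

    /** Lazily walks each node's local entries, skipping keys already seen on another node. */
    static <K, V> Iterator<Map.Entry<K, V>> entryIterator(List<Map<K, V>> nodes) {
        final Set<K> seen = new HashSet<>();
        final Iterator<Map<K, V>> nodeIt = nodes.iterator();
        return new Iterator<Map.Entry<K, V>>() {
            private Iterator<Map.Entry<K, V>> current = Collections.emptyIterator();
            private Map.Entry<K, V> pending;

            // Moves forward until an unseen entry is buffered or all nodes are exhausted.
            private void advance() {
                while (pending == null) {
                    if (current.hasNext()) {
                        Map.Entry<K, V> e = current.next();
                        if (seen.add(e.getKey())) {
                            pending = e; // first copy of a key wins
                        }
                    } else if (nodeIt.hasNext()) {
                        current = nodeIt.next().entrySet().iterator();
                    } else {
                        return;
                    }
                }
            }

            @Override
            public boolean hasNext() {
                advance();
                return pending != null;
            }

            @Override
            public Map.Entry<K, V> next() {
                advance();
                if (pending == null) {
                    throw new NoSuchElementException();
                }
                Map.Entry<K, V> e = pending;
                pending = null;
                return e;
            }
        };
    }

    /** keySet() expressed on top of the entry iterator. */
    static <K, V> Set<K> keySet(List<Map<K, V>> nodes) {
        Set<K> keys = new HashSet<>();
        Iterator<Map.Entry<K, V>> it = entryIterator(nodes);
        while (it.hasNext()) {
            keys.add(it.next().getKey());
        }
        return keys;
    }

    /** size() expressed on top of the entry iterator: counts distinct keys. */
    static <K, V> int size(List<Map<K, V>> nodes) {
        int count = 0;
        Iterator<Map.Entry<K, V>> it = entryIterator(nodes);
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Key "b" is replicated on both nodes and must be counted once.
        List<Map<String, Integer>> nodes =
                List.of(Map.of("a", 1, "b", 2), Map.of("b", 2, "c", 3));
        System.out.println("size: " + size(nodes));
        System.out.println("keys: " + keySet(nodes));
    }
}
```

Both helpers materialize or count everything the iterator yields, which is why consuming the iterator directly is the cheaper option when the caller only needs to stream over the entries. The result is still approximate if data changes while the iteration is in progress.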
>
> Radim
>
> --
> Radim Vansa <rvansa(a)redhat.com>
> JBoss DataGrid QA
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)