[infinispan-dev] About size()

Dan Berindei dan.berindei at gmail.com
Wed Oct 8 10:11:59 EDT 2014


On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus <mmarkus at redhat.com> wrote:

> On Oct 3, 2014, at 9:30, Radim Vansa <rvansa at redhat.com> wrote:
>
> > Hi,
> >
> > recently we had a discussion about what size() returns, but I've
> > realized there are more things that users would like to know. My
> > question is whether you think that they would really appreciate it, or
> > whether it's just my QA point of view where I sometimes compute the
> > 'checksums' of the cache to see if I haven't lost anything.
> >
> > There are those sizes:
> > A) number of owned entries
> > B) number of entries stored locally in memory
> > C) number of entries stored in each local cache store
> > D) number of entries stored in each shared cache store
> > E) total number of entries in cache
> >
> > So far, we can get
> > B via withFlags(SKIP_CACHE_LOAD).size()
> > (passivation ? B : 0) + firstNonZero(C, D) via size()
> > E via distributed iterators / MR
> > A via data container iteration + distribution manager query, but only
> > without cache store
> > C or D through
> > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores()
> >
> > I think that it would go along with users' expectations if size()
> > returned E and for the rest we should have special methods on
> > AdvancedCache. That would of course change the meaning of size(), but
> > I'd say it would finally be something that has a firm meaning.
> >
> > WDYT?
>
> There were a lot of arguments in the past about whether size() and other
> methods that operate over all the elements (keySet, values) are useful,
> because:
> - they are approximate (data changes during iteration)
> - they are very resource consuming and might be misused (this is the
> reason we chose to give size() its current local semantics)
>
> These methods (size, keys, values) are useful for people and I think we
> were not wise to implement them only on top of the local data: this is like
> preferring efficiency over correctness. This also created a lot of
> confusion among our users, with questions like "size() doesn't return the
> correct value" being asked regularly. I totally agree that size() should
> return E (i.e. everything that is stored within the grid, including
> persistence) and that its performance implications should be documented
> accordingly. For keySet and values, we should stop implementing them (throw
> an exception) and point users to Will's distributed iterator, which is a
> nicer way to achieve the desired behavior.
>

We can also implement keySet() and values() on top of the distributed entry
iterator and document that using the iterator directly is better.
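
A minimal sketch of that idea, assuming a plain supplier of entry iterators
stands in for the distributed entry iterator. IteratorBackedKeySet and
entryIterators are hypothetical names for illustration, not Infinispan API:

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical read-only key-set view built on an entry iterator.
 * The supplier is an assumed stand-in for whatever hands out a fresh
 * (possibly distributed) entry iterator on each call.
 */
class IteratorBackedKeySet<K, V> extends AbstractSet<K> {
    private final Supplier<Iterator<Map.Entry<K, V>>> entryIterators;

    IteratorBackedKeySet(Supplier<Iterator<Map.Entry<K, V>>> entryIterators) {
        this.entryIterators = entryIterators;
    }

    @Override
    public Iterator<K> iterator() {
        // Each call starts a new pass over the entries and exposes only the keys.
        Iterator<Map.Entry<K, V>> entries = entryIterators.get();
        return new Iterator<K>() {
            @Override
            public boolean hasNext() {
                return entries.hasNext();
            }

            @Override
            public K next() {
                return entries.next().getKey();
            }
        };
    }

    @Override
    public int size() {
        // Counting requires a full iteration, so this is as expensive as
        // walking every entry and only approximate under concurrent writes.
        long count = 0;
        Iterator<Map.Entry<K, V>> entries = entryIterators.get();
        while (entries.hasNext()) {
            entries.next();
            count++;
        }
        return (int) Math.min(count, Integer.MAX_VALUE);
    }
}
```

contains() is inherited from AbstractSet and also walks the iterator, which
is one reason to document that using the iterator directly is better.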


>
> >
> > Radim
> >
> > --
> > Radim Vansa <rvansa at redhat.com>
> > JBoss DataGrid QA
> >
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)
>