Re: [infinispan-dev] About size()

Wednesday, 8 October 2014

On Wed, Oct 8, 2014 at 5:03 PM, Mircea Markus <mmarkus(a)redhat.com> wrote:

...
 On Oct 3, 2014, at 9:30, Radim Vansa <rvansa(a)redhat.com> wrote:

 > Hi,
 >
 > recently we had a discussion about what size() returns, but I've
 > realized there are more things that users would like to know. My
 > question is whether you think that they would really appreciate it, or
 > whether it's just my QA point of view where I sometimes compute the
 > 'checksums' of the cache to see if I haven't lost anything.
 >
 > There are these sizes:
 > A) number of owned entries
 > B) number of entries stored locally in memory
 > C) number of entries stored in each local cache store
 > D) number of entries stored in each shared cache store
 > E) total number of entries in cache
 >
 > So far, we can get
 > B via withFlags(SKIP_CACHE_LOAD).size()
 > (passivation ? B : 0) + firstNonZero(C, D) via size()
 > E via distributed iterators / MR
 > A via data container iteration + distribution manager query, but only
 > without cache store
 > C or D through
 > getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores()
 >
 > I think that it would go along with users' expectations if size()
 > returned E and for the rest we should have special methods on
 > AdvancedCache. That would of course change the meaning of size(), but
 > I'd say it would finally give it a firm meaning.
 >
 > WDYT?
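
As a worked illustration of the formula quoted above for what size() returned at the time, here is a toy calculation. It is not Infinispan code: the class name, method names, and the sample counts are all made up for the example, and only the expression (passivation ? B : 0) + firstNonZero(C, D) comes from the message.

```java
public class SizeSketch {
    // Returns the first non-zero argument, or 0 if every argument is zero.
    static int firstNonZero(int... counts) {
        for (int c : counts) {
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // B = entries in local memory, C = entries in a local store,
    // D = entries in a shared store (letters as in the list above).
    // Hypothetical helper mirroring the quoted expression only.
    static int reportedSize(boolean passivation, int b, int c, int d) {
        return (passivation ? b : 0) + firstNonZero(c, d);
    }

    public static void main(String[] args) {
        // Passivation on: memory and store are counted together.
        System.out.println(reportedSize(true, 10, 90, 0));
        // Passivation off: only the store count is used.
        System.out.println(reportedSize(false, 10, 100, 0));
        // No local store: the shared store count is used instead.
        System.out.println(reportedSize(false, 10, 0, 250));
    }
}
```

In every case the result describes one node's view and is generally different from E, the total number of entries in the cache.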

 There were a lot of arguments in the past about whether size() and other
 methods that operate over all the elements (keySet, values) are useful,
 because:
 - they are approximate (data changes during iteration)
 - they are very resource consuming and might be misused (this is the
 reason we chose to use size() with its current local semantics)

 These methods (size, keys, values) are useful for people and I think we
 were not wise to implement them only on top of the local data: this is like
 preferring efficiency over correctness. This also created a lot of
 confusion with our users, with questions like "size() doesn't return the
 correct value" being asked regularly. I totally agree that size() should
 return E (i.e. everything that is stored within the grid, including
 persistence) and that its performance implications should be documented
 accordingly. For keySet and values, we should stop implementing them (throw
 an exception) and point users to Will's distributed iterator, which is a
 nicer way to achieve the desired behavior.

We can also implement keySet() and values() on top of the distributed entry
iterator and document that using the iterator directly is better.
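
A minimal sketch of that idea in plain Java, assuming the entry iterator can be reopened on demand. The Supplier here is a hypothetical stand-in for the distributed entry iterator, and IteratorBackedKeySet is an illustrative name, not an Infinispan class.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

public class IteratorBackedKeySet<K, V> extends AbstractSet<K> {
    // Hypothetical source of entries: each call starts a fresh pass
    // over all entries (in a real grid, a remote iteration).
    private final Supplier<Iterator<Map.Entry<K, V>>> entries;

    public IteratorBackedKeySet(Supplier<Iterator<Map.Entry<K, V>>> entries) {
        this.entries = entries;
    }

    @Override
    public Iterator<K> iterator() {
        Iterator<Map.Entry<K, V>> it = entries.get();
        // Read-only view: remove() is left unsupported.
        return new Iterator<K>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public K next() {
                return it.next().getKey();
            }
        };
    }

    @Override
    public int size() {
        // Counts by walking every entry, so it is as expensive as a
        // full iteration and only approximate under concurrent writes.
        int n = 0;
        for (Iterator<Map.Entry<K, V>> it = entries.get(); it.hasNext(); it.next()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Map<String, Integer> data = new TreeMap<>();
        data.put("a", 1);
        data.put("b", 2);
        IteratorBackedKeySet<String, Integer> keys =
                new IteratorBackedKeySet<>(() -> data.entrySet().iterator());
        System.out.println(keys.size() + " keys, contains b: " + keys.contains("b"));
    }
}
```

Every inherited operation (contains, toArray, and so on) goes through a full pass over the entries, which is why documenting the iterator as the preferred route makes sense; a values() view would follow the same pattern with getValue().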

...

 >
 > Radim
 >
 > --
 > Radim Vansa <rvansa(a)redhat.com>
 > JBoss DataGrid QA
 >
 > _______________________________________________
 > infinispan-dev mailing list
 > infinispan-dev(a)lists.jboss.org
 > https://lists.jboss.org/mailman/listinfo/infinispan-dev

 Cheers,
 --
 Mircea Markus
 Infinispan lead (www.infinispan.org)

