[infinispan-dev] About size()

Dan Berindei dan.berindei at gmail.com
Wed Oct 8 10:02:22 EDT 2014


On Tue, Oct 7, 2014 at 4:17 PM, William Burns <mudokonman at gmail.com> wrote:

> On Tue, Oct 7, 2014 at 8:43 AM, Radim Vansa <rvansa at redhat.com> wrote:
> > On 10/07/2014 02:21 PM, William Burns wrote:
> >> On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa <rvansa at redhat.com> wrote:
> >>> If you have one local and one shared cache store, how should the
> >>> command behave?
> >>>
> >>> a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP,
> >>> SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no
> >>> SKIP_BACKUP_ENTRIES flag right now), where this method returns
> >>> localStore.size() for first non-shared cache store + passivation ?
> >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0)
> >> Calling the size method in either distexec or MR will give you
> >> inflated numbers as you need to pay attention to the numOwners to get
> >> a proper count.
> >
> > That's what I meant by the SKIP_BACKUP_ENTRIES - dataContainer should be
> > able to report only primary-owned entries, or we have to iterate and
> > apply the filtering outside.
>
> If we added this functionality then yes, it would be promoted up to the
> status of MR counting entries, though it would still have issues with
> rehash, as well as with concurrent activations and passivations.
>

I think we can use something like OutdatedTopologyException to make sure we
count each segment once, on the primary owner. But in order to verify that
a particular node is the primary owner we'd have to load each cache store
entry, so performance with cache stores will be pretty bad.

Dealing with concurrent activations/passivations is even trickier.
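
To make the in-memory half concrete, the primary-owner filtering could look
something like the sketch below. This is only a sketch: SKIP_BACKUP_ENTRIES
doesn't exist, and the DistributedCallable/ConsistentHash calls are the 7.x
distexec API as I remember it, so treat the exact method names as
assumptions. It also shows why stores are the hard part: anything that lives
only in a store never shows up in the data container, so we'd have to load
it just to ask who the primary owner is.

import java.io.Serializable;
import java.util.Set;
import java.util.concurrent.Future;

import org.infinispan.Cache;
import org.infinispan.container.entries.InternalCacheEntry;
import org.infinispan.distexec.DefaultExecutorService;
import org.infinispan.distexec.DistributedCallable;
import org.infinispan.distribution.ch.ConsistentHash;
import org.infinispan.remoting.transport.Address;

public class PrimaryOwnedSizeTask
      implements DistributedCallable<Object, Object, Long>, Serializable {

   private transient Cache<Object, Object> cache;

   @Override
   public void setEnvironment(Cache<Object, Object> cache, Set<Object> inputKeys) {
      this.cache = cache;
   }

   @Override
   public Long call() throws Exception {
      Address self = cache.getAdvancedCache().getRpcManager().getAddress();
      ConsistentHash ch =
            cache.getAdvancedCache().getDistributionManager().getReadConsistentHash();
      long count = 0;
      // Only count entries this node primarily owns, so that summing the
      // per-node results does not multiply the total by numOwners.
      // Entries that only live in a cache store are never seen here.
      for (InternalCacheEntry<Object, Object> entry :
            cache.getAdvancedCache().getDataContainer()) {
         if (self.equals(ch.locatePrimaryOwner(entry.getKey()))) {
            count++;
         }
      }
      return count;
   }

   // Sums the per-node primary-owned counts; a topology change between the
   // individual calls can still lose or double-count a segment.
   public static long primaryOwnedSize(Cache<Object, Object> cache) throws Exception {
      long total = 0;
      for (Future<Long> future : new DefaultExecutorService(cache)
            .submitEverywhere(new PrimaryOwnedSizeTask())) {
         total += future.get();
      }
      return total;
   }
}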


>
> >
> >>
> >>> b) distexec/MR sum of sharedStore.size() + passivation ? sum of
> >>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0
> >> Calling size() on a shared cache store actually should work somewhat
> >> well (assuming all entries are stored in the shared store).  The
> >> problem is if passivation is enabled, as you point out, because you
> >> also have to check the data container, which means you can also have
> >> an issue with concurrent activations and passivations (which you can't
> >> verify properly in either case without knowing the keys).
> >>
> >>> c) MR that would count the entries
> >> This is the only reliable way to do this with MR.  And unfortunately,
> >> if a rehash occurs I am not sure whether you would get inconsistent
> >> numbers or an Exception.  In the latter case at least you should be
> >> able to make sure that you have the proper number when it does return
> >> without an exception.  I can't say how it works with multiple loaders
> >> though; my guess is that it may process the entry more than once, so
> >> it depends on whether your mapper is smart enough to realize it.
> >
> > I don't think that reporting an incorrect size is *that* harmful - even
> > the ConcurrentMap interface says that it's just a wild guess and when
> > things are changing, you can't rely on it.
>
> ConcurrentMap doesn't actually say anything about the size method.
> ConcurrentHashMap has some verbiage saying that it might not be
> completely correct under concurrent modification, though.
>

> It isn't really a wild guess for ConcurrentHashMap, though.  The worst
> case is that you count a value that was there but has since been
> removed, or you miss a value that was recently added.  Really the
> guarantee from CHM is that it counts each individual segment properly
> at some instant in time for that segment; the problem is that each
> segment could change (since they are counted at different times).  But
> the values missing in ConcurrentHashMap are totally different than
> losing an entire segment due to a rehash.  You could theoretically
> have a rehash occur right after MR started iterating and see no values
> for that segment, or only a very small subset.  There is a much larger
> margin of error in this case for what values are seen and which are
> not.
>
>
Interesting... the Map javadoc seems to assume linearizability, maybe
because the original implementation was Hashtable :)

So there is precedent for relaxing the definition of size(). But of course
some users will still expect a zero error margin when there are no concurrent
writes, so I agree we don't get a free pass to ignore rehashes and
activations during size().
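
For what it's worth, the most relaxed definition that still roughly matches
expectations is probably "sum every node's in-memory size and divide by
numOwners". A minimal sketch of that is below (again using the distexec API,
with a hypothetical LocalSizeTask helper); it's only exact with a stable
topology, no passivation and every entry held by exactly numOwners nodes,
which is precisely the error margin we'd have to document.

import java.io.Serializable;
import java.util.Set;
import java.util.concurrent.Future;

import org.infinispan.Cache;
import org.infinispan.context.Flag;
import org.infinispan.distexec.DefaultExecutorService;
import org.infinispan.distexec.DistributedCallable;

public class LocalSizeTask
      implements DistributedCallable<Object, Object, Integer>, Serializable {

   private transient Cache<Object, Object> cache;

   @Override
   public void setEnvironment(Cache<Object, Object> cache, Set<Object> inputKeys) {
      this.cache = cache;
   }

   @Override
   public Integer call() {
      // In-memory entries on this node only, backup copies included.
      return cache.getAdvancedCache()
            .withFlags(Flag.CACHE_MODE_LOCAL, Flag.SKIP_CACHE_LOAD).size();
   }

   public static long approximateSize(Cache<Object, Object> cache) throws Exception {
      int numOwners = cache.getCacheConfiguration().clustering().hash().numOwners();
      long sum = 0;
      for (Future<Integer> future : new DefaultExecutorService(cache)
            .submitEverywhere(new LocalSizeTask())) {
         sum += future.get();
      }
      // Each entry was counted roughly numOwners times, so divide it back out.
      return sum / numOwners;
   }
}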


> >
> >>
> >>> d) wrapper on distributed entry iteration with converters set to return
> >>> 0-sized entries
> >> The entry iterator can't return 0-sized entries (it can only omit the
> >> values).  The keys are required to make sure that the count is correct
> >> and also to ensure that if a rehash happens in the middle it can
> >> properly continue to operate without having to start over.  The entry
> >> iterator should work properly irrespective of the number of
> >> stores/loaders that are configured, since it keeps track of already
> >> seen keys (so duplicates are ignored).
> >
> > Ok, I was simplifying that a bit. And by the way, I don't really like
> > the fact that for distributed entry iteration you need to be able to
> > keep all keys from one segment at one moment in memory. But fine -
> > distributed entry iteration is probably not the right way.
>
> I agree it is annoying to have to keep the keys, but it is one of the
> few ways to reliably get all the values without losing one.  Actually
> this approach provides a much closer approximation to what
> ConcurrentHashMap provides for its size implementation, since it can't
> drop a segment.  It is pretty much required to do it this way for
> keySet, entrySet, and values, where you don't have the luxury of
> dropping whole swaths of entries like you can when calling the size()
> method (even if the values were there the entire time).
>

If we decide to improve size(), I'd vote to use distributed entry iterators.

We may be able to avoid sending all the keys to the originator when the
cache doesn't have any stores. But with a store it looks like we can't
avoid reading all the keys from the store, so skipping the transfer of the
keys wouldn't help that much.
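
For reference, a size() built on the entry iterator would be roughly the
sketch below. The names (filterEntries, AcceptAllKeyValueFilter,
EntryIterable) are what I recall from the 7.0 iteration API, so treat them
as assumptions; in practice we'd also plug in a converter so that only the
keys travel to the originator.

import org.infinispan.Cache;
import org.infinispan.container.entries.CacheEntry;
import org.infinispan.filter.AcceptAllKeyValueFilter;
import org.infinispan.iteration.EntryIterable;

public final class IteratorSize {

   // Cluster-wide count via the distributed entry iterator. The iterator
   // remembers the keys it has already emitted per segment, so a rehash or a
   // duplicate hit from a second loader does not double-count an entry, at
   // the cost of keeping those keys in memory while a segment is processed.
   public static long size(Cache<Object, Object> cache) throws Exception {
      EntryIterable<Object, Object> iterable =
            cache.getAdvancedCache().filterEntries(AcceptAllKeyValueFilter.getInstance());
      long count = 0;
      try {
         for (CacheEntry<Object, Object> ignored : iterable) {
            count++;
         }
      } finally {
         iterable.close();
      }
      return count;
   }
}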


>
> >
> >>
> >>
> >>> And what about nodes with different configuration?
> >> Hard to know without knowing what the differences are.
> >
> > I had in mind different loaders and passivation configuration (e.g.
> > some nodes could use a shared store and some don't - do we want to
> > handle such obscure configs? Can we design that without needing
> > complicated decision trees for what to include and what not?).
>
> Well the last sentence means we have to use MR or Entry Iterator since
> we can't call size on the shared loader.  I would think that it should
> still work irrespective of the loader configuration (except for MR
> with multiple loaders).  The main issue I can think of is that if not
> every node is using the shared loader, you could end up with stale
> values in it whenever no node that uses the shared loader is up
> (assuming purge at startup isn't enabled).
>

We really shouldn't support different store/loader configurations on each
node, except for minor stuff like paths.



>
> >
> > Radim
> >
> >>
> >>> Radim
> >>>
> >>> On 10/06/2014 01:57 PM, Sanne Grinovero wrote:
> >>>> On 6 October 2014 12:44, Tristan Tarrant <ttarrant at redhat.com> wrote:
> >>>>> I think we should provide correct implementations of size() (and
> >>>>> others) and provide shortcut implementations using our usual Flag
> >>>>> API (e.g. SKIP_REMOTE_LOOKUP).
> >>>> Right, that would be very nice. Same for CacheStore interaction: all
> >>>> cache stores should be included unless skipped explicitly.
> >>>>
> >>>> Sanne
> >>>>
> >>>>> Tristan
> >>>>>
> >>>>> On 06/10/14 12:57, Sanne Grinovero wrote:
> >>>>>> On 3 October 2014 18:38, Dennis Reed <dereed at redhat.com> wrote:
> >>>>>>> Since size() is defined by the ConcurrentMap interface, it
> >>>>>>> already has a precisely defined meaning.  The only "correct"
> >>>>>>> implementation is E.
> >>>>>> +1
> >> This is one of the things I have been wanting to do: actually
> >> implement the other Map methods across the entire cache.  However, to
> >> do a lot of these in a memory-conscious way they would need to be run
> >> ignoring any ongoing transactions.  Actually, having this requirement
> >> allows these methods to be implemented quite easily, especially in
> >> conjunction with the EntryIterator.  I almost made a PR for it a while
> >> back, but it seemed a little zealous to do at the same time and it
> >> didn't seem that people were pushing for it very hard (maybe that was
> >> a wrong assumption).  Also, I wasn't quite sure whether the
> >> transactional part no longer being functional would be a deterrent.
> >>
> >>>>>>> The current non-correct implementation was just because it's
> >>>>>>> expensive to calculate correctly.  I'm not sure the current impl
> >>>>>>> is really that useful for anything.
> >>>>>> +1
> >>>>>>
> >>>>>> And not just size() but many others from ConcurrentMap.
> >>>>>> The question is whether we should drop the interface and all the
> >>>>>> methods which aren't efficiently implementable, or fix all those
> >>>>>> methods.
> >>>>>>
> >>>>>> In the past I loved that I could inject "Infinispan superpowers"
> >>>>>> into an application making extensive use of Map and ConcurrentMap
> >>>>>> without changes, but that has been deceiving and required great
> >>>>>> care, such as verifying that these features would not be used
> >>>>>> anywhere in the code.
> >>>>>> I ended up wrapping the Cache implementation in a custom adapter
> >>>>>> which would also implement ConcurrentMap but would throw a
> >>>>>> RuntimeException if any of the "unallowed" methods was called, so
> >>>>>> at least I would detect violations safely.
> >>>>>>
> >>>>>> I still think that for the time being - until a better solution is
> >>>>>> planned - we should throw exceptions... alas, that's an old
> >>>>>> conversation and it was never done.
> >>>>>>
> >>>>>> Sanne
> >>>>>>
> >>>>>>
> >>>>>>> -Dennis
> >>>>>>>
> >>>>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> recently we had a discussion about what size() returns, but I've
> >>>>>>>> realized there are more things that users would like to know. My
> >>>>>>>> question is whether you think that they would really appreciate
> >>>>>>>> it, or whether it's just my QA point of view where I sometimes
> >>>>>>>> compute the 'checksums' of the cache to see if I didn't lose
> >>>>>>>> anything.
> >>>>>>>>
> >>>>>>>> There are those sizes:
> >>>>>>>> A) number of owned entries
> >>>>>>>> B) number of entries stored locally in memory
> >>>>>>>> C) number of entries stored in each local cache store
> >>>>>>>> D) number of entries stored in each shared cache store
> >>>>>>>> E) total number of entries in cache
> >>>>>>>>
> >>>>>>>> So far, we can get
> >>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size()
> >>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size()
> >>>>>>>> E via distributed iterators / MR
> >>>>>>>> A via data container iteration + distribution manager query, but
> >>>>>>>> only without cache store
> >>>>>>>> C or D through
> >>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores()
> >>>>>>>>
> >>>>>>>> I think that it would go along with users' expectations if size()
> >>>>>>>> returned E and for the rest we should have special methods on
> >>>>>>>> AdvancedCache. That would of course change the meaning of size(),
> >>>>>>>> but I'd say finally to something that has a firm meaning.
> >>>>>>>>
> >>>>>>>> WDYT?
> >>>>>>>>
> >>>>>>>> Radim
> >>>>>>>>
> >>>
> >>> --
> >>> Radim Vansa <rvansa at redhat.com>
> >>> JBoss DataGrid QA
> >>>
> >
> >
> > --
> > Radim Vansa <rvansa at redhat.com>
> > JBoss DataGrid QA
> >