[infinispan-dev] About size()

Fri Oct 10 14:01:52 EDT 2014

On Oct 10, 2014, at 17:30, William Burns <mudokonman at gmail.com> wrote:

>>>>> Also we didn't really talk about the fact that these methods would
>>>>> ignore ongoing transactions and if that is a concern or not.
>>>>> 
>>>> 
>>>> It might be a concern for the Hibernate 2LC impl, it was their TCK that
>>>> prompted the last round of discussions about clear().
>>> 
>>> Although I wonder how much these methods are even used since they only
>>> work for Local, Replication or Invalidation caches in their current
>>> state (and didn't even use loaders until 6.0).
>> 
>> 
>> There is some more information about the test in the mailing list discussion
>> [1]
>> There's also a JIRA for clear() [2]
>> 
>> I think 2LC almost never uses distribution, so size() being local-only
>> didn't matter, but making it non-tx could cause problems - at least for that
>> particular test.
> 
> I had toyed around with the following idea before, though never in the
> scope of the size method alone: I have a solution that would mostly
> work for transactional caches.  Essentially the size method would
> always operate in a READ_COMMITTED-like state; REPEATABLE_READ doesn't
> seem feasible since we can't keep all the contents in memory.  The
> iterator would be run, and for each key it finds it checks the context
> to see if the key is there.  If the context entry is marked as removed
> it doesn't count the key; if the key is there and not removed it marks
> the key as found and counts it; and if the key is not in the context at
> all it counts it.  After iteration it finds all the keys in the context
> that were not marked as found and adds them to the count.  This way it
> doesn't need additional memory (besides iteration costs), as all the
> context information is already in memory.

sounds good to me.
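
To make sure we mean the same thing, here is a minimal sketch of the counting pass described above. It is not Infinispan code: TxAwareSize, CtxEntry and the plain Map standing in for the transaction context are hypothetical names I made up for illustration. I am also assuming that a context entry marked as removed which never shows up during iteration should not be counted.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these types exist in Infinispan.
final class TxAwareSize {

    /** Stand-in for an entry held in the current transaction's context. */
    static final class CtxEntry {
        final boolean removed;
        CtxEntry(boolean removed) { this.removed = removed; }
    }

    /**
     * Counts entries as the current transaction would see them:
     * committed keys from the iterator, adjusted by the tx context.
     */
    static <K> int size(Iterator<K> committedKeys, Map<K, CtxEntry> txContext) {
        int count = 0;
        // Context keys that the iteration has already seen.
        Set<K> found = new HashSet<>();

        while (committedKeys.hasNext()) {
            K key = committedKeys.next();
            CtxEntry entry = txContext.get(key);
            if (entry == null) {
                count++;                 // untouched by this tx: count it
            } else {
                found.add(key);          // seen, so skip it in the second pass
                if (!entry.removed) {
                    count++;             // updated in this tx: still present
                }
            }
        }

        // Keys written by this tx that are not yet in the committed data.
        for (Map.Entry<K, CtxEntry> e : txContext.entrySet()) {
            // Assumption: a removal of a key that was never committed adds nothing.
            if (!found.contains(e.getKey()) && !e.getValue().removed) {
                count++;
            }
        }
        return count;
    }
}
```

The only extra memory is the set of context keys seen during iteration, which is bounded by the size of the context and not by the size of the cache.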

> 
> My original thought was to also make the EntryIterator transactional
> in the same way, which means the keySet, entrySet and values methods
> could do the same thing.  The main stumbling block I had was the fact
> that the iterator and the various collections returned could be used
> outside of the ongoing transaction, which didn't seem to make much
> sense to me.  But maybe these should be changed to be more like the
> backing collections that HashMap, ConcurrentHashMap etc. return from
> these methods, where each call would pick up the transaction if there
> is one in the current thread, and if there is no transaction just
> start an implicit one.

Or, if they are used outside of a transaction, deny progress.

> This, however, would be a big change from how these collections work
> currently, in that they are in-memory copies only.
> 
> What do you guys think?

I think that keeping track of the context entries is a better way of iterating, so +1. As you mentioned, we should also make it clear that READ_COMMITTED semantics apply.
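
For the backing-collection idea, the two policies discussed (start an implicit transaction, or deny progress) could look roughly like the sketch below. Everything in it is hypothetical: TxBoundKeyView is a made-up name, the ThreadLocal is only a stand-in for however the current thread's transaction would really be looked up, and the "transaction" is a bare Object marker with no isolation behind it.

```java
import java.util.Set;

// Hypothetical illustration only; not an Infinispan class.
final class TxBoundKeyView<K> {

    // Stand-in for "the transaction associated with the current thread".
    static final ThreadLocal<Object> CURRENT_TX = new ThreadLocal<>();

    private final Set<K> store;          // live data, not an in-memory copy
    private final boolean denyOutsideTx; // true: refuse calls made without a tx

    TxBoundKeyView(Set<K> store, boolean denyOutsideTx) {
        this.store = store;
        this.denyOutsideTx = denyOutsideTx;
    }

    int size() {
        if (CURRENT_TX.get() != null) {
            // A transaction is already running on this thread: join it.
            return store.size();
        }
        if (denyOutsideTx) {
            throw new IllegalStateException("no transaction on the current thread");
        }
        // No transaction: wrap this single call in an implicit one.
        CURRENT_TX.set(new Object());
        try {
            return store.size();
        } finally {
            CURRENT_TX.remove();
        }
    }
}
```

Because the view resolves the transaction on each call instead of when it is created, it can safely outlive the transaction that produced it, which was the stumbling block mentioned above.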

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)