[infinispan-dev] New API to iterate over current entries in cache

William Burns mudokonman at gmail.com
Mon Mar 17 13:57:59 EDT 2014


On Mon, Mar 17, 2014 at 1:43 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
> I do realize you need such a feature, still as I pointed out when we
> first drafted it I'm skeptical because of the complexities you
> mention.
>
> It highly depends on what use cases we want to address, but as a
> general purpose solution I think this "initial state" received by the
> clients needs to be accurate, which implies that providing similar
> iteration guarantees of ConcurrentMap is not fit for the purpose.

Definitely, this feature alone is not enough to make the listeners
consistent.  This is why I had to implement additional queueing on the
listener node: events raised concurrently with the iteration are held
back and only raised once the initial state for their key has been
applied.  Unfortunately, without some sort of snapshot the only
consistency guarantee we can offer is per key.
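
The per-key queueing described above could be sketched roughly as
follows.  This is only an illustration, not the ISPN-4068 code: the
class PerKeyEventQueue and its method names are hypothetical, and it
assumes a single listener and coarse locking for simplicity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;

/** Hypothetical helper: holds back events for a key until its initial state was delivered. */
final class PerKeyEventQueue<K, V> {
    private final Map<K, List<V>> pending = new HashMap<>();
    private final Set<K> stateApplied = new HashSet<>();
    private final BiConsumer<K, V> listener;
    private boolean iterationComplete;

    PerKeyEventQueue(BiConsumer<K, V> listener) {
        this.listener = listener;
    }

    /** Called for every concurrent cache event while (or after) the initial state is sent. */
    synchronized void onEvent(K key, V value) {
        if (iterationComplete || stateApplied.contains(key)) {
            listener.accept(key, value);          // state already applied: raise directly
        } else {
            pending.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
    }

    /** Called when the iteration delivers the current value for a key. */
    synchronized void onInitialState(K key, V value) {
        listener.accept(key, value);
        stateApplied.add(key);
        List<V> queued = pending.remove(key);
        if (queued != null) {
            for (V v : queued) {
                listener.accept(key, v);          // replay held-back events in arrival order
            }
        }
    }

    /** Called once the iteration is done; flushes events for keys it never returned. */
    synchronized void onIterationComplete() {
        iterationComplete = true;
        pending.forEach((k, values) -> values.forEach(v -> listener.accept(k, v)));
        pending.clear();
    }
}
```

Note that ordering is only preserved within a single key, which matches
the "by key" guarantee; nothing here orders events across keys.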

>
> In the context of the Hadoop integration discussions, the need for a
> fully consistent iteration was also mentioned; in that specific case
> we're confident that the feature would still be quite useful even if
> the user has to somehow ensure that the state "shall not be changed"
> while the map/reduce jobs make progress.
>
> Probably not something that you should be working on in the context of
> this specific feature, but I'm getting convinced that we should also
> work on "read consistent iteration" which works on a globally
> consistent snapshot; it could be implemented on TOA, I guess.
>
> Another use case which comes to mind, is the recent discussion about
> the need for a consistent and accurate "count" implementation.
>
> Note the Snapshot capability would need to take into account CacheStores.

I agree; unfortunately, that would make the stores quite a bit more
complex than they are currently.

>
> -- Sanne
>
>
>
>
> On 17 March 2014 15:02, William Burns <mudokonman at gmail.com> wrote:
>> On Mon, Mar 17, 2014 at 10:45 AM, Radim Vansa <rvansa at redhat.com> wrote:
>>> Why are listeners not invoked? JCache iterator() notifies the listeners.
>>
>> Like I mentioned this can be changed.  However, I have not seen a
>> cache entry visitor listener in JCache.  The only listeners I am aware
>> of are for created, removed, expired and updated which wouldn't be
>> affected by this.
>>
>>>
>>> Can the iterator remove entries?
>>
>> Sorry, I forgot to mention this, but no, it would not be able to
>> remove with what I was planning.  This could be added easily, but it
>> would be problematic when using repeatable read.  Would the remove
>> take part in the transaction?  If so, it causes an inconsistency,
>> since we aren't storing all the values that were read.  I figured it
>> would be easier to just not support it; if the user wants, they can
>> simply call cache.remove(key), which this would do anyway.
>>
>>>
>>> I assume there's no ordering guarantee, but behaviour under concurrent
>>> changes would be rather tricky. I don't like the idea of iterating over
>>> changing structure.
>>
>> There is no ordering of elements.  The guarantees would be similar
>> to ConcurrentMap entrySet iteration, in that you are guaranteed to
>> see a value that was valid for the given key, which may or may not
>> be the most up to date.  This is similar to how our Cache already
>> behaves today.
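
For reference, the ConcurrentMap behaviour being compared against can
be seen with a plain java.util.concurrent.ConcurrentHashMap.  This is
only a small illustration of weakly consistent iteration in the JDK,
not Infinispan code; the class name and the key "late" are made up.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WeaklyConsistentIteration {
    /** Iterates while mutating the same map; returns how many entries the iterator produced. */
    static int iterateWhileMutating(ConcurrentHashMap<String, Integer> map) {
        int seen = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            // Writing during iteration is legal here: no ConcurrentModificationException.
            // Whether the iterator later returns "late" is unspecified.
            map.put("late", entry.getValue());
            seen++;
        }
        return seen;
    }
}
```

Entries present for the whole iteration are returned once; an entry
added while iterating may or may not show up, so the returned count
is not fixed.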
>>
>>>
>>> If you need to iterate through all entries, I'd rather introduce the
>>> snapshot ability and then iterate over the snapshot. Then, you wouldn't
>>> have to mess with tx and introduce non-tx operation on tx cache.
>>
>> That would definitely be helpful, but we don't have support for
>> snapshots atm.  With the way this is currently implemented, it would
>> be pretty simple to add an option for this, as all you would need to
>> do is pass along the version of the snapshot in the request
>> command.
>>
>> The reason I suggested not supporting this for tx right now is
>> repeatable read: there is no way we can hold all the values of the
>> cache in the current context.
>>
>>>
>>> My 2c
>>>
>>> Radim
>>>
>>> On 03/17/2014 02:30 PM, William Burns wrote:
>>>> While working on ISPN-4068 to add the current state to listeners that
>>>> were added I found that what I essentially needed was a way to iterate
>>>> over the entries of the cache.  I am thinking of adding this to the
>>>> public API available on the AdvancedCache interface.
>>>>
>>>> I wanted to get your opinions on whether we should add it, and
>>>> any changes you might suggest.
>>>>
>>>> My thought was to add 2 overloaded methods:
>>>>
>>>> <C> Iterator<Map.Entry<K, C>> entryIterator(
>>>>     KeyValueFilter<? super K, ? super V> filter,
>>>>     Converter<? super K, ? super V, C> converter);
>>>>
>>>> and
>>>>
>>>> Iterator<Map.Entry<K, V>> entryIterator(
>>>>     KeyValueFilter<? super K, ? super V> filter);
>>>>
>>>> The method would return almost immediately after invocation and the
>>>> iterator would queue entries and block as entries are required to be
>>>> returned.  The filter and converter are applied on each of the remote
>>>> nodes and are required to be serializable or have an externalizer
>>>> registered.
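
To make the shape of the proposed call concrete, here is a local,
single-node sketch.  Everything in it is an assumption for
illustration: KeyValueFilter and Converter are simplified two-argument
stand-ins (the real Infinispan interfaces may take more parameters),
and entryIterator is a static helper over a plain Map rather than a
method on AdvancedCache.  It also collects eagerly instead of queueing
and blocking as described above.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical simplified stand-ins for the filter and converter types.
interface KeyValueFilter<K, V> {
    boolean accept(K key, V value);
}

interface Converter<K, V, C> {
    C convert(K key, V value);
}

final class LocalEntryIteration {
    /** Applies the filter, then the converter, to each entry of a local map. */
    static <K, V, C> Iterator<Map.Entry<K, C>> entryIterator(
            Map<K, V> data,
            KeyValueFilter<? super K, ? super V> filter,
            Converter<? super K, ? super V, C> converter) {
        List<Map.Entry<K, C>> result = new ArrayList<>();
        for (Map.Entry<K, V> e : data.entrySet()) {
            if (filter.accept(e.getKey(), e.getValue())) {
                C converted = converter.convert(e.getKey(), e.getValue());
                result.add(new SimpleImmutableEntry<>(e.getKey(), converted));
            }
        }
        return result.iterator();
    }
}
```

A caller would pass, for example, a filter such as (k, v) -> v >= 10
and a converter such as (k, v) -> k + ":" + v.  In the clustered case
both would run on the remote nodes, which is why they need to be
serializable or have an externalizer.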
>>>>
>>>> Internally the iterator would use chunking to help prevent memory
>>>> saturation.  The max memory usage would be (chunkSize * N) + local
>>>> entries where N is the number of nodes.
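
The (chunkSize * N) part of that bound can be modelled with a small
sketch.  This is not the proposed implementation: NodeSource and
ChunkedIterator are hypothetical names, the "remote nodes" are plain
in-memory lists, fetching is synchronous, and the "local entries" term
is not modelled.  The point is only that at most one chunk per node is
buffered at a time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical model of a remote node handing out its entries chunk by chunk. */
final class NodeSource<E> {
    private final List<E> entries;
    private int position;

    NodeSource(List<E> entries) {
        this.entries = entries;
    }

    /** Returns up to chunkSize further entries; an empty list once exhausted. */
    List<E> nextChunk(int chunkSize) {
        int end = Math.min(position + chunkSize, entries.size());
        List<E> chunk = new ArrayList<>(entries.subList(position, end));
        position = end;
        return chunk;
    }
}

/** Buffers at most one chunk per node, so at most chunkSize * N entries are held. */
final class ChunkedIterator<E> implements Iterator<E> {
    private final List<NodeSource<E>> nodes;
    private final List<Deque<E>> buffers = new ArrayList<>();
    private final int chunkSize;
    private int peakBuffered;

    ChunkedIterator(List<NodeSource<E>> nodes, int chunkSize) {
        this.nodes = nodes;
        this.chunkSize = chunkSize;
        for (NodeSource<E> node : nodes) {
            buffers.add(new ArrayDeque<>(node.nextChunk(chunkSize)));
        }
        recordPeak();
    }

    @Override
    public boolean hasNext() {
        for (int i = 0; i < buffers.size(); i++) {
            Deque<E> buffer = buffers.get(i);
            if (buffer.isEmpty()) {
                // Only a drained buffer is refilled, which keeps the bound intact.
                buffer.addAll(nodes.get(i).nextChunk(chunkSize));
                recordPeak();
            }
            if (!buffer.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    @Override
    public E next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        for (Deque<E> buffer : buffers) {
            if (!buffer.isEmpty()) {
                return buffer.poll();
            }
        }
        throw new IllegalStateException("hasNext() returned true but no buffer has entries");
    }

    /** Largest number of entries buffered at any one time so far. */
    int peakBuffered() {
        return peakBuffered;
    }

    private void recordPeak() {
        int total = 0;
        for (Deque<E> buffer : buffers) {
            total += buffer.size();
        }
        peakBuffered = Math.max(peakBuffered, total);
    }
}
```

As a worked example of the bound itself: with chunkSize = 1000 and
N = 4 nodes, at most 4000 remote entries are buffered on the
requesting node, in addition to its local entries.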
>>>>
>>>> These methods would differ from other methods on the
>>>> Cache/AdvancedCache in the following ways:
>>>>
>>>> 1. This operation is treated as nontx: the returned entries are not
>>>> stored in the context, so repeatable read semantics would not be
>>>> guaranteed.  This doesn't preclude manually adding values to the
>>>> context.  Also, prior writes in the current context would be ignored
>>>> (current data returned), although this could be changed if desired.
>>>> 2. Values are not activated from loaders and visited listeners would
>>>> not be notified of access.  The latter could be sensibly changed if
>>>> desired.
>>>>
>>>>   - Will
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>>
>>> --
>>> Radim Vansa <rvansa at redhat.com>
>>> JBoss DataGrid QA
>>>

