[infinispan-dev] Map/Reduce or other batch processing on CacheLoader stored entries

Sanne Grinovero sanne at infinispan.org
Fri May 25 07:02:42 EDT 2012


On 25 May 2012 11:33, Manik Surtani <manik at jboss.org> wrote:
> Yes, as a one-off, but there should be a mechanism to set up internal structures and clean up/send finalisation messages to Hibernate Search or completion RPCs, etc.

Ah, got it. Maybe we need it, but I was initially (maybe naively)
expecting to deal with initialization myself:

MassIndexingWorkCollector pwc = new MassIndexingWorkCollector(); // implements Processor
pwc.initialize(.custom stuff..); // not defined on Processor
// Blocking! So we know when we have finished loading all entries.
[cacheLoader?].processEntriesWith(pwc);
pwc.shutdownWorkers(); // not defined on Processor

minimal API ;-)
But I guess when implementing for real I might need something like that.
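
To make the one-off idea a bit more concrete, here is a rough, self-contained sketch of how I picture it. Everything in it is an assumption made up for illustration: the single-method Processor, the InMemoryStore standing in for the cache loader, and the WorkCollector are hypothetical and are not existing Infinispan or Hibernate Search API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class OneOffProcessingSketch {

    /** Hypothetical callback: one method, no lifecycle hooks. */
    interface Processor<K, V> {
        void process(K key, V value);
    }

    /** Hypothetical stand-in for a cache store that can visit all stored entries. */
    static final class InMemoryStore<K, V> {
        private final Map<K, V> entries = new LinkedHashMap<>();

        void put(K key, V value) {
            entries.put(key, value);
        }

        /** Blocking: returns only after each entry was passed to the processor once. */
        void processEntriesWith(Processor<K, V> processor) {
            for (Map.Entry<K, V> e : entries.entrySet()) {
                processor.process(e.getKey(), e.getValue());
            }
        }
    }

    /** Collector that owns its setup and teardown, outside the Processor contract. */
    static final class WorkCollector implements Processor<String, String> {
        private final AtomicInteger handled = new AtomicInteger();
        private ExecutorService workers;

        void initialize(int threads) { // not defined on Processor
            workers = Executors.newFixedThreadPool(threads);
        }

        @Override
        public void process(String key, String value) {
            // Placeholder for the real work (e.g. building an index update).
            workers.execute(() -> handled.incrementAndGet());
        }

        void shutdownWorkers() { // not defined on Processor
            workers.shutdown();
            try {
                if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("workers did not finish in time");
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ie);
            }
        }

        int handledCount() {
            return handled.get();
        }
    }

    /** Runs the whole one-off pass and returns how many entries were handled. */
    static int run() {
        InMemoryStore<String, String> store = new InMemoryStore<>();
        store.put("k1", "alpha");
        store.put("k2", "beta");
        store.put("k3", "gamma");

        WorkCollector pwc = new WorkCollector();
        pwc.initialize(2);
        store.processEntriesWith(pwc); // blocks until all entries were visited
        pwc.shutdownWorkers();         // waits for queued work to complete
        return pwc.handledCount();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Since the call blocks and the collector is thrown away afterwards, the caller can do its own setup and cleanup around it, which is why the sketch gets by without start() and stop() on the interface.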

Sanne

>
> On 25 May 2012, at 11:31, Sanne Grinovero wrote:
>
>> On 25 May 2012 10:57, Manik Surtani <manik at jboss.org> wrote:
>>> #processEntriesWith(Processor p)
>>>
>>> Processor extends Lifecycle { // Lifecycle for start() and stop() methods…
>>>   void process(CacheEntry e);
>>>   void process(Collection<CacheEntry> e);
>>>   boolean processMoreEntries();
>>> }
>>
>> Why the Lifecycle start/stop?
>> I expect to use it as a one-off, not as something which is
>> "permanently hooked": looks like you're thinking about a different
>> problem?
>>
>> The use case I'm thinking of is when we need to iterate on all entries
>> in the cachestore, such as:
>> - Map/Reduce
>>  - evaluating the average value of some attribute
>>  - word counting
>> - MassIndexer
>>
>> In all the use cases I have in mind, you want to process all
>> entries, and only once.
>> So #processMoreEntries would be redundant, and I think we should
>> choose just one of CacheEntry or Collection<CacheEntry>. Let's
>> go with the simple CacheEntry?
>> That should avoid creating short-lived collections, and when
>> passing collections one would likely need to iterate on each element
>> anyway to route the invocation to some internal process(CacheEntry e);
>>
>> -- Sanne
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Manik Surtani
> manik at jboss.org
> twitter.com/maniksurtani
>
> Lead, Infinispan
> http://www.infinispan.org
>


