[infinispan-dev] Distributed tasks - specifying task input

Fri Dec 17 14:50:40 EST 2010

On 10-12-17 12:10 PM, Manik Surtani wrote:
>> Finally, I might consider it very useful to be able to declare usage
>> on keys which don't exist yet, hope that's not a problem?
>> So in case of Lucene I might know that I'm going to write and rewrite
>> several times a file called "A",
>> so I send the operation to the node which is going to host A, though
>> this key A might exist, or be created
>> by the task during the process.
> Should be fine, since the CH works on non-existent keys just as well.  But then it would mean that we can't call map() for each entry if the entry does not yet exist.  Perhaps then, to satisfy this case, Vladimir's suggestion of calling map() once per node makes more sense.  E.g.,
>
> 	void map(Set<Cache<K, V>> caches, DistributedTaskContext<K, V, R> ctx);
>
>

Cool. Except I would possibly "stuff" Set<Cache<K,V>> into
DistributedTaskContext, since the caches may be needed for the reduce
call as well. To use caches in the reduce phase, users might otherwise
be tempted to "save" references to them in a field of their
DistributedTask implementation. That would create problems as we
migrate instances of DT across JVMs. Giving DistributedTask access to
Set<Cache<K,V>> through the context signals to users more clearly the
general life cycle and proper use of these references.
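
To make the idea concrete, here is a minimal Java sketch. It is not the
actual Infinispan API: TaskContext, Task, CountEntries and getCaches()
are hypothetical names invented for illustration, and java.util.Map
stands in for Cache<K,V> so the example runs without Infinispan on the
classpath. The point is only that the task keeps no cache references in
its own fields; both phases obtain them from the context.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ContextSketch {

    /** Hypothetical context: owns the node-local caches and collects partial results. */
    static final class TaskContext<K, V, R> {
        // Map is only a stand-in for Cache<K, V> in this sketch.
        private final Set<Map<K, V>> caches;
        private final List<R> partials = new ArrayList<>();

        TaskContext(Set<Map<K, V>> caches) {
            this.caches = caches;
        }

        Set<Map<K, V>> getCaches() { return caches; }

        void emit(R partial) { partials.add(partial); }

        List<R> getPartials() { return partials; }
    }

    /** Hypothetical task shape: map() and reduce() both receive the context. */
    interface Task<K, V, R> {
        void map(TaskContext<K, V, R> ctx);
        R reduce(TaskContext<K, V, R> ctx);
    }

    /** Example task with no cache fields, so nothing node-local travels with it. */
    static final class CountEntries implements Task<String, String, Integer> {
        @Override
        public void map(TaskContext<String, String, Integer> ctx) {
            // Called once per node: emit one partial count per local cache.
            for (Map<String, String> cache : ctx.getCaches()) {
                ctx.emit(cache.size());
            }
        }

        @Override
        public Integer reduce(TaskContext<String, String, Integer> ctx) {
            // Caches would still be reachable here via ctx.getCaches() if needed.
            int total = 0;
            for (int n : ctx.getPartials()) {
                total += n;
            }
            return total;
        }
    }

    /** Runs the task against two small stand-in caches and returns the reduced value. */
    static int runDemo() {
        Map<String, String> first = new HashMap<>();
        first.put("A", "segment-1");
        Map<String, String> second = new HashMap<>();
        second.put("B", "segment-2");
        second.put("C", "segment-3");

        Set<Map<String, String>> caches = new LinkedHashSet<>();
        caches.add(first);
        caches.add(second);

        TaskContext<String, String, Integer> ctx = new TaskContext<>(caches);
        Task<String, String, Integer> task = new CountEntries();
        task.map(ctx);
        return task.reduce(ctx);
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints 3
    }
}
```

Because CountEntries holds no state of its own, an instance could be
serialized and moved to another JVM without dragging along references
to caches that only make sense on the originating node.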

Cheers