On 10-12-17 12:10 PM, Manik Surtani wrote:
> Finally, I might consider it very useful to be able to declare usage
> on keys which don't exist yet, hope that's not a problem?
> So in case of Lucene I might know that I'm going to write and rewrite
> several times a file called "A",
> so I send the operation to the node which is going to host A, though
> this key A might exist, or be created
> by the task during the process.
Should be fine, since the CH works on non-existent keys just as well. But then it would
mean that we can't call map() for each entry if the entry doesn't exist yet.
Perhaps then, to satisfy this case, Vladimir's suggestion of calling map() once per
node makes more sense. E.g.,
void map(Set<Cache<K, V>> caches, DistributedTaskContext<K, V, R> ctx);
Cool. Except I would possibly "stuff" the Set<Cache<K,V>> into
DistributedTaskContext, since the caches may be needed for the reduce
call as well. To use caches in the reduce phase, users might otherwise
be tempted to "save" references to them in a field of their
DistributedTask implementation. That would create problems as we migrate
instances of DT across JVMs. By giving DistributedTask access to the
Set<Cache<K,V>> through the context, we better signal to users the
general life cycle and proper use of these references.
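
A rough sketch of the shape I have in mind, under loud assumptions: every
type below (SketchCache, SketchTaskContext, SketchDistributedTask, and the
reduce() signature) is a hypothetical stand-in invented for illustration,
not the real Infinispan API or a settled design. It only shows map() being
called once per node, the task holding no cache fields, and both phases
reaching the caches through the context, including for a key ("A") that
does not exist before the task runs.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a cache; NOT the real Infinispan Cache interface.
interface SketchCache<K, V> {
    V get(K key);
    void put(K key, V value);
}

// Hypothetical context: the only place a task can obtain the node-local caches.
interface SketchTaskContext<K, V, R> {
    Set<SketchCache<K, V>> getCaches();
}

// Hypothetical task shape: map() runs once per node; the reduce() signature
// is an assumption made for this sketch.
interface SketchDistributedTask<K, V, R> {
    void map(SketchTaskContext<K, V, R> ctx);
    R reduce(SketchTaskContext<K, V, R> ctx);
}

// Trivial in-memory cache so the sketch can run without a cluster.
final class MapBackedCache<K, V> implements SketchCache<K, V> {
    private final Map<K, V> store = new HashMap<>();
    public V get(K key) { return store.get(key); }
    public void put(K key, V value) { store.put(key, value); }
}

// Context implementation that simply hands out the caches it was given.
final class LocalContext<K, V, R> implements SketchTaskContext<K, V, R> {
    private final Set<SketchCache<K, V>> caches;
    LocalContext(Set<SketchCache<K, V>> caches) { this.caches = caches; }
    public Set<SketchCache<K, V>> getCaches() { return caches; }
}

// The task keeps no cache references in fields, so nothing node-specific
// would travel with it if the instance were shipped to another JVM.
final class RewriteFileTask implements SketchDistributedTask<String, String, Integer> {
    public void map(SketchTaskContext<String, String, Integer> ctx) {
        for (SketchCache<String, String> cache : ctx.getCaches()) {
            // Key "A" need not exist yet; it is created, then rewritten.
            cache.put("A", "v1");
            cache.put("A", "v2");
        }
    }

    public Integer reduce(SketchTaskContext<String, String, Integer> ctx) {
        // Count the caches that now hold "A", again reached via the context.
        int found = 0;
        for (SketchCache<String, String> cache : ctx.getCaches()) {
            if (cache.get("A") != null) {
                found++;
            }
        }
        return found;
    }
}

final class ContextSketchDemo {
    // Simulates a single node with a single cache.
    static Integer runOnce() {
        Set<SketchCache<String, String>> caches = new LinkedHashSet<>();
        caches.add(new MapBackedCache<>());
        SketchTaskContext<String, String, Integer> ctx = new LocalContext<>(caches);
        RewriteFileTask task = new RewriteFileTask();
        task.map(ctx);
        return task.reduce(ctx);
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

The point of the shape is that the context, not the task, owns the cache
references, so their lifetime is tied to one invocation on one node.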
Cheers