[infinispan-dev] Distributed tasks - specifying task input

Fri Dec 17 10:10:01 EST 2010

On 17 Dec 2010, at 14:58, Sanne Grinovero wrote:

> About
> void map(String cacheName, Map.Entry<K, V> e, DistributedTaskContext
> <K, V, R> ctx);
> 
> To satisfy a Lucene query it should be able to at least read multiple
> caches at the same time.
> 
> It would still do map/reduce on a single cache, just I need to be able
> to invoke getCache() during both
> the map and reduce operations - you might think that would defeat the
> purpose, but the situation is that one cache is
> distributed and the other replicated, so I won't do remote get
> operations as we obviously use the keys of the
> distributed one.
> 
> So the API seems fine (I hope, as I don't have a prototype), provided
> there is some means to get a reference to the
> EmbeddedCacheManager.
> 
> Wouldn't it be more useful to have a reference to the Cache object
> instead of its name?

Makes sense.  Perhaps we could pass in a Set<Cache> of all caches declared as required in cacheNames().
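A minimal sketch of that idea, under stated assumptions: the task declares the names it needs through cacheNames(), and the framework resolves them to cache references before invoking the task. SimpleCache, Task and resolveCaches are hypothetical stand-ins invented for illustration, not Infinispan types; in practice the lookup function would presumably delegate to the cache manager's getCache(name).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class RequiredCachesSketch {

    // Stand-in for a cache reference (hypothetical, NOT the Infinispan Cache interface).
    static final class SimpleCache<K, V> {
        final String name;
        final Map<K, V> data = new ConcurrentHashMap<>();
        SimpleCache(String name) { this.name = name; }
    }

    // Hypothetical task contract: the task declares up front which caches it needs.
    interface Task {
        Set<String> cacheNames();
    }

    // Hypothetical helper: turn the declared names into cache references,
    // failing fast if a declared cache is unknown to the lookup function.
    static Set<SimpleCache<String, String>> resolveCaches(
            Task task, Function<String, SimpleCache<String, String>> lookup) {
        Set<SimpleCache<String, String>> caches = new LinkedHashSet<>();
        for (String name : task.cacheNames()) {
            SimpleCache<String, String> cache = lookup.apply(name);
            if (cache == null) {
                throw new IllegalArgumentException("Unknown cache: " + name);
            }
            caches.add(cache);
        }
        return caches;
    }

    public static void main(String[] args) {
        // A plain map plays the role of the cache manager in this sketch.
        Map<String, SimpleCache<String, String>> registry = new ConcurrentHashMap<>();
        registry.put("index", new SimpleCache<>("index"));
        registry.put("metadata", new SimpleCache<>("metadata"));
        registry.put("unrelated", new SimpleCache<>("unrelated"));

        // The task only asks for two of the three caches.
        Task task = () -> new LinkedHashSet<>(Arrays.asList("index", "metadata"));

        for (SimpleCache<String, String> cache : resolveCaches(task, registry::get)) {
            System.out.println("Task receives cache: " + cache.name);
        }
    }
}
```

Handing over references rather than names keeps the task from needing the cache manager at all, which is the concern raised in the quoted text.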

> Finally, I might consider it very useful to be able to declare usage
> on keys which don't exist yet, hope that's not a problem?
> So in case of Lucene I might know that I'm going to write and rewrite
> several times a file called "A",
> so I send the operation to the node which is going to host A, though
> this key A might exist, or be created
> by the task during the process.

Should be fine, since the consistent hash (CH) works on non-existent keys just as well.  But it would mean that we can't call map() for each entry, since the entry may not exist yet.  To satisfy this case, Vladimir's suggestion of calling map() once per node perhaps makes more sense.  E.g.,

	void map(Set<Cache<K, V>> caches, DistributedTaskContext <K, V, R> ctx);
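To illustrate how a once-per-node map() could cope with a key that does not exist yet, here is a rough sketch. Everything in it is an assumption made for illustration: SimpleCache, Context, PerNodeTask and ownerOf are hypothetical names, and the modulo placement is a deliberately simplified stand-in for a real consistent hash. The point is only that the owner is computed from the key alone, so the task can be routed before the entry is created.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class PerNodeMapSketch {

    // Stand-in for a cache reference (hypothetical, NOT the Infinispan Cache interface).
    static final class SimpleCache<K, V> {
        final String name;
        final Map<K, V> data = new ConcurrentHashMap<>();
        SimpleCache(String name) { this.name = name; }
    }

    // Hypothetical stand-in for the task context: it only collects emitted results.
    static final class Context<R> {
        final List<R> emitted = new ArrayList<>();
        void emit(R result) { emitted.add(result); }
    }

    // Hypothetical per-node contract, mirroring the signature proposed above.
    interface PerNodeTask<K, V, R> {
        void map(Set<SimpleCache<K, V>> caches, Context<R> ctx);
    }

    // Simplified placement (assumption: NOT a real consistent hash).
    // It depends only on the key, never on whether an entry exists.
    static String ownerOf(Object key, List<String> nodes) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3");
        // Key "A" has no entry yet, but its owner can already be computed.
        System.out.println("Task for key A goes to " + ownerOf("A", nodes));

        SimpleCache<String, String> files = new SimpleCache<>("files");
        Set<SimpleCache<String, String>> caches = new LinkedHashSet<>();
        caches.add(files);

        // Invoked once on the owning node: creates "A", then rewrites it.
        PerNodeTask<String, String, Integer> task = (cs, ctx) -> {
            for (SimpleCache<String, String> cache : cs) {
                for (int i = 1; i <= 3; i++) {
                    cache.data.put("A", "revision " + i);
                }
                ctx.emit(cache.data.size());
            }
        };

        Context<Integer> ctx = new Context<>();
        task.map(caches, ctx);
        System.out.println("A = " + files.data.get("A") + ", emitted = " + ctx.emitted);
    }
}
```

With a per-entry map() the framework would have had nothing to iterate over on the first run, whereas the per-node form leaves it to the task to decide which keys to read or create.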

Cheers
Manik

--
Manik Surtani
manik at jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org