On 4 Jan 2011, at 16:12, Eduardo Martins wrote:

4) AtomicMap doubt

I read in the Infinispan blog that AtomicMap provides colocation of
all entries. Is that idea outdated? If not, we may need a way to turn
that off :) For instance, would that not mean the Tree API does not
work well with distribution mode? I apologize in advance if I'm
missing something, but if AtomicMap defines colocation, AtomicMap is
good for the node's data map, but not for the node's child FQNs.
Shouldn't each child FQN be freely distributed, being colocated
instead with the related node cache entry and data (atomic) map? Our
impl is a kind of "hybrid" of the Tree API: it allows cache entry
references (similar to children) but no data map, and storing the
references through AtomicMap in the same way as the Tree API worries me.
Please clarify.

Each FQN has an AtomicMap underneath, so you won't find k,v pairs belonging to a particular Fqn on different nodes.

We make no guarantees wrt child FQN nodes. So just because FQN B is a child of FQN A, it does not mean that the atomic map of B will be on the same node as the atomic map of A.

Before I proceed with my reasoning, please clarify: is the colocation
within AtomicMap real? If I store data there, will all data be
colocated?

Yup, all *data* stored in the AtomicMap will be located in the same node. It's treated as a single entity.
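
To make the "single entity" point concrete, here is a toy sketch in plain Java. It is not Infinispan code: ownerOf is a hypothetical stand-in for the real consistent hash, and the node names and attributes are made up. It only shows that when a whole map is stored as the value of one cache key, the owner is chosen from that key alone, so every k,v pair inside the map ends up on the same node.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Infinispan code. It illustrates why a map stored as one
// cache value is colocated: ownership is decided per cache key, not per map entry.
class AtomicMapColocationSketch {

    // Hypothetical stand-in for a consistent hash: picks an owner from the key alone.
    static String ownerOf(Object cacheKey, List<String> clusterNodes) {
        return clusterNodes.get(Math.floorMod(cacheKey.hashCode(), clusterNodes.size()));
    }

    public static void main(String[] args) {
        List<String> clusterNodes = List.of("N1", "N2", "N3");

        // One cache entry: key = "/a/b/c", value = the whole attribute map.
        Map<String, String> attributes = new HashMap<>();
        attributes.put("colour", "red");
        attributes.put("size", "large");

        String owner = ownerOf("/a/b/c", clusterNodes);
        for (String attributeName : attributes.keySet()) {
            // The attribute name never enters the hash, so every pair shares one owner.
            System.out.println(attributeName + " lives on " + owner);
        }
    }
}
```

Whichever node the hash picks, it is picked once per cache key, which is the property described above.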


Ok, then let's think about the Tree API: typically/optimally you add,
get and remove a specific node in the same cluster node/zone; iterating
through a node's children is rare and usually without much perf
constraints. With the current impl, the node's child map entries are
colocated, since each node does that through an AtomicMap keyed by the
child's last FQN element. IMHO this is not great for performance:

1. Consider parent node P in node N1
2. In node N2, add a P child; this goes to N1. Adding the P child also
creates the child cache entries; all 3 seem to be colocated through
hashCode(), correct me if I'm wrong. Let's assume all hash ideally to
local node N2.
3. In node N2, get the P child; this may go to N1 if we go through P,
and skips it if we use Cache.get(...)
4. In node N2, remove the P child; this needs to go to N1

Are you following the issue? If P is a popular parent node (which
happens a lot with root children), N1 will be hammered by other nodes!

What you said above is not exactly true.  A TreeNode contains 2 AtomicMaps - one for data and one for structure.  The DataMap contains the K/V attribute pairs on the node.  The StructureMap contains information about children.  Firstly, these 2 AtomicMaps aren't necessarily colocated.  This is OK, since you rarely update structure (adding/removing children) and data (attributes on a node) in the same transaction.  Secondly, there is nothing that says parents and children are colocated since they use different keys.  So,

/a/b/c <-- could be on N1
/a/b/c/d <-- could be on N2

so transactions doing stuff on /a/b/c and /a/b/c/d won't be affecting the same node - unless it is structure that is changing.  So as per your example above, in step 2, all 3 don't necessarily go to the same node.  Also if you do something like TreeCache.getNode() with an FQN, you don't walk through parents.

https://github.com/infinispan/infinispan/blob/master/tree/src/main/java/org/infinispan/tree/TreeCache.java#L181
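
A rough way to picture the above, again as a toy model in plain Java rather than the actual Tree module code. The names cacheKey, Kind, ownerOf, getData and keysTouchedByAddChild are all hypothetical, and the key format is invented for illustration. The assumptions are: each tree node owns two cache entries (data and structure) under distinct keys, an owner is derived from each key independently, and adding a child touches the parent's structure entry plus the child's own two entries.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Infinispan's implementation. All names here are
// hypothetical and exist only to illustrate "different keys, different owners".
class TreeKeySketch {

    enum Kind { DATA, STRUCTURE }

    // Assumption: the data map and the structure map of one FQN use distinct keys.
    static String cacheKey(String fqn, Kind kind) {
        return kind.name() + ":" + fqn;
    }

    // Hypothetical stand-in for a consistent hash over the cache key.
    static String ownerOf(String cacheKey, List<String> clusterNodes) {
        return clusterNodes.get(Math.floorMod(cacheKey.hashCode(), clusterNodes.size()));
    }

    // Direct lookup: one get on the node's own key, no walk through /a, /a/b, ...
    static Map<String, String> getData(Map<String, Map<String, String>> cache, String fqn) {
        return cache.get(cacheKey(fqn, Kind.DATA));
    }

    // Assumption: adding a child to a non-root parent updates the parent's
    // structure entry and creates the child's two entries; the parent's data
    // entry is not touched.
    static List<String> keysTouchedByAddChild(String parentFqn, String childName) {
        String childFqn = parentFqn + "/" + childName;
        return List.of(
                cacheKey(parentFqn, Kind.STRUCTURE),
                cacheKey(childFqn, Kind.DATA),
                cacheKey(childFqn, Kind.STRUCTURE));
    }

    public static void main(String[] args) {
        List<String> clusterNodes = List.of("N1", "N2", "N3");
        Map<String, Map<String, String>> cache = new HashMap<>();
        cache.put(cacheKey("/a/b/c", Kind.DATA), new HashMap<>(Map.of("colour", "red")));
        cache.put(cacheKey("/a/b/c", Kind.STRUCTURE), new HashMap<>());

        // Each touched key is hashed on its own, so the three owners may differ.
        for (String key : keysTouchedByAddChild("/a/b/c", "d")) {
            System.out.println(key + " -> " + ownerOf(key, clusterNodes));
        }
        System.out.println(getData(cache, "/a/b/c"));
    }
}
```

In this model only the structure entry of the parent is shared between a parent and its children, which matches the "unless it is structure that is changing" caveat.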

HTH
Manik
--
Manik Surtani
manik@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org