[infinispan-dev] Let me understand DIST

Bela Ban bban at redhat.com
Sat Mar 10 12:07:28 EST 2012


Can you confirm that my understanding of how DIST works is correct ?


#1 Non transactional modifications
- E.g. a PUT
- The PUT is sent to the primary owner (e.g. B) and all backup owners 
(e.g. C) and is applied immediately, with only local lock acquisition 
(lock-put-unlock)
(or is the PUT only sent to B, which in turn then updates C?)


#2 Transactional modifications
- The modifications involve a bunch of keys
- When the TX commits:
   - A PREPARE message with the relevant keys is sent to all primary 
owners P (to the backup owners as well?)
   - All primary owners try to apply the modifications, acquiring locks
   - If all primary owners can successfully apply all of the 
modifications, the TX commits, else it rolls back
   - After a successful TX, the primary owners update the backup owners: 
here, I'm probably wrong, and this is done *inside* of the TX scope, right ?
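
To make the flow in #2 concrete, here is a toy sketch of it in Python. It is only a model of the steps as I describe them above, not Infinispan's actual implementation: the `Node` class, `run_tx` and the `primary_of` callback are hypothetical names, and backup owners are left out on purpose because their handling is exactly the open question.

```python
class Node:
    """Toy primary owner: a key/value store plus per-key locks (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.locks = {}  # key -> id of the TX holding the lock
        self.data = {}

    def prepare(self, tx_id, mods):
        # Refuse if any key is already locked by a different TX.
        if any(self.locks.get(key, tx_id) != tx_id for key in mods):
            return False
        for key in mods:
            self.locks[key] = tx_id
        return True

    def commit(self, tx_id, mods):
        self.data.update(mods)
        self.release(tx_id)

    def release(self, tx_id):
        self.locks = {k: t for k, t in self.locks.items() if t != tx_id}


def run_tx(tx_id, mods, primary_of):
    """PREPARE on every primary owner, then COMMIT only if all of them agreed."""
    by_node = {}
    for key, value in mods.items():
        by_node.setdefault(primary_of(key), {})[key] = value

    prepared = []
    for node, node_mods in by_node.items():
        if node.prepare(tx_id, node_mods):
            prepared.append(node)
        else:
            # One primary owner could not lock its keys: roll back everywhere.
            for p in prepared:
                p.release(tx_id)
            return False

    for node, node_mods in by_node.items():
        node.commit(tx_id, node_mods)
    return True
```

In this model a single lock conflict on any one primary owner rolls back the whole TX, which is why the number of nodes a TX touches matters for contention.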


So is my understanding of #1 and #2 correct ?

If so, then I can assume that a transactional modification touching a 
number of keys will almost always touch *all* nodes ? Example:
- We have 10 nodes
- numOwners = 2
- If we have a good consistent hash, I can assume that I have to modify 
5 different keys (10 / 2) on average in a TX to touch *all* nodes in the 
cluster with the PREPARE/COMMIT phase, correct ?
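
A quick way to sanity-check this estimate is to compute how many distinct nodes a TX touches. The sketch below assumes each key's primary owner is uniformly distributed over the nodes and that the backups are the next nodes on the ring; both are simplifying assumptions, not a description of the real consistent hash, and the function names are made up for illustration.

```python
import random


def expected_nodes_touched(num_nodes, num_owners, num_keys):
    """Closed form: a given node owns a given key with probability
    num_owners / num_nodes, so it is missed by all keys with
    probability (1 - num_owners / num_nodes) ** num_keys."""
    p_miss = (1 - num_owners / num_nodes) ** num_keys
    return num_nodes * (1 - p_miss)


def simulate_nodes_touched(num_nodes, num_owners, num_keys,
                           trials=20000, seed=42):
    """Monte Carlo check of the closed form with ring-style placement."""
    rng = random.Random(seed)
    total = 0
    for _trial in range(trials):
        touched = set()
        for _key in range(num_keys):
            primary = rng.randrange(num_nodes)
            # Owners: the primary plus the next num_owners - 1 ring neighbours.
            for offset in range(num_owners):
                touched.add((primary + offset) % num_nodes)
        total += len(touched)
    return total / trials


if __name__ == "__main__":
    print(expected_nodes_touched(10, 2, 5))
    print(simulate_nodes_touched(10, 2, 5))
```

Under these assumptions, 5 keys on 10 nodes with numOwners = 2 touch about 6.7 nodes on average; 5 is the minimum number of keys that can cover all 10 nodes rather than the average, so covering the whole cluster typically takes more keys than that.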

If my last statement is correct, is it safe to assume that with DIST and 
transactional modifications, I will have a lot of TX contention / 
collisions ?

If this is correct, it would IMO place even more importance on the 
work done by the Cloud-TM team, replacing 2PC with total order. Also, if 
we touch almost all nodes, would it make sense to use SEQUENCER for 
*all* updates ? Would this obviate the need for TOM (total order for 
partial replication) ?

Well, probably not, because we only want to send keys to nodes that 
actually need to store them...

Thoughts ?

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)

