[infinispan-dev] Let me understand DIST

Pedro Ruivo pruivo at gsd.inesc-id.pt
Mon Mar 12 14:13:04 EDT 2012


On 3/10/12 5:07 PM, Bela Ban wrote:
> If so, then I can assume that a transactional modification touching a
> number of keys will almost always touch *all* nodes ? Example:
> - We have 10 nodes
> - numOwners = 2
> - If we have a good consistent hash, I can assume that I have to modify
> 5 different keys (10 / 2) on average in a TX to touch *all* nodes in the
> cluster with the PREPARE/COMMIT phase, correct ?
>
> If my last statement is correct, is it safe to assume that with DIST and
> transactional modifications, I will have a lot of TX contention /
> collisions ?
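
As a rough check of the arithmetic in the example above, here is a
minimal Python sketch. It assumes an idealized hash in which every key
gets numOwners distinct owners chosen uniformly at random and
independently of the other keys; that is a simplification, not the
actual Infinispan consistent hash, and the function names are made up
for illustration.

```python
import random


def expected_nodes_touched(num_nodes, num_owners, keys_per_tx):
    """Closed form under the uniform-owners assumption.

    A given node is NOT an owner of one key with probability
    1 - num_owners / num_nodes, so it is missed by all keys with that
    probability raised to keys_per_tx.
    """
    miss_all = (1 - num_owners / num_nodes) ** keys_per_tx
    return num_nodes * (1 - miss_all)


def simulate_nodes_touched(num_nodes, num_owners, keys_per_tx,
                           trials=20000, seed=1):
    """Monte Carlo estimate of the same quantity (hypothetical helper)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        touched = set()
        for _ in range(keys_per_tx):
            # Owners of one key: num_owners distinct nodes, chosen uniformly.
            touched.update(rng.sample(range(num_nodes), num_owners))
        total += len(touched)
    return total / trials


if __name__ == "__main__":
    print(expected_nodes_touched(10, 2, 5))
    print(simulate_nodes_touched(10, 2, 5))
```

Under this assumption, a transaction writing 5 keys on 10 nodes with
numOwners = 2 touches about 6.7 nodes on average, since owner sets
overlap. That is a large share of the cluster, but not all of it.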

We have run experiments with ISPN 5.2 and TPC-C (1 warehouse, which 
gives a high probability of contention among transactions), and compared 
it with ISPN 5.0 (where locks were acquired on all replicas of a key, 
not only on the primary).

The results, running without the write skew check on 10 nodes of our
cluster (numOwners = 2), follow:

Version    Tx/sec    Abort Rate
5.2        12        15
5.0        3         30
5.0-TOM    60        0

Our understanding is that acquiring locks only on the primary owner did
reduce the contention probability and the abort rate. However, even if
transactions update only a small number of keys on average (with the
configuration parameters we used, TPC-C should update around 5 keys),
contention may still have a big impact on performance.
> If this is correct, this would IMO lay even more importance onto the
> work done by the Cloud-TM team, replacing 2PC with total order.

Thanks :)

>   Also, if we touch almost all nodes, would it make sense to use SEQUENCER for
> *all* updates ? Would this obviate the need for TOM (total order for
> partial replication) ?
This could be done, you are right; it is what is sometimes called
"non-genuine" partial replication. Our take is that this works well on
small-scale clusters, but not on large ones. And on small-scale
clusters, unless memory is a concern, full replication normally works
better (as all reads can be served locally)... so, we are not the
biggest fans of this approach :-)
> Well, probably not, because we only want to send keys to nodes that
> actually need to store them...

Yes, and this cost will likely be prohibitive with large-scale clusters
(>10 nodes).
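
To make the scaling argument concrete, here is a small Python sketch
comparing how many nodes must receive a transaction when it is
broadcast to everyone (SEQUENCER for all updates) versus delivered only
to the owners of the written keys. The function name and the cluster
sizes are illustrative assumptions; the owner count is an upper bound
(keys times numOwners), not a measurement.

```python
def delivery_targets(num_nodes, num_owners, keys_per_tx):
    """Return (owners_only_upper_bound, broadcast) node counts for one tx."""
    # At most num_owners distinct nodes per key, never more than the cluster.
    owners_only = min(num_nodes, num_owners * keys_per_tx)
    # A non-genuine protocol delivers every transaction to every node.
    broadcast = num_nodes
    return owners_only, broadcast


if __name__ == "__main__":
    for n in (10, 50, 100):
        owners_only, broadcast = delivery_targets(n, 2, 5)
        print(n, owners_only, broadcast)
```

With 5 keys and numOwners = 2, the owner-only bound stays at 10 nodes
as the cluster grows, while the broadcast count grows with the cluster.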
> Thoughts ?
>
Cheers,
Pedro

