On 06/07/2012 17:12, Vladimir Blagojevic wrote:
Mircea,
> Privately to you as I am not sure this makes sense for wider
> distribution - yet.
Adding infinispan-dev to this as I think it's interesting for a wider
audience.
> Say a user has a bunch of keys/values to insert into the cache. He could
> do it one key/value at a time, all in one tx, or in tx batches. If he
> wants to do it in batches of transactions then it would make sense to
> group the keys by the primary Address assigned on the hash wheel.
Very interesting point. Besides the locking, grouping keys has another
significant advantage: during the prepare phase each node receives the
complete list of modifications in that transaction, not only the
modifications pertaining to it.
E.g. say we have the following key->node mapping:
k1 -> A
k2 -> B
k3 -> C
Where k1, k2 and k3 are keys; A, B and C are nodes.
If Tx1 writes (k1, k2, k3) then during the prepare A, B and C will each
receive the same package containing all the modifications - namely
(k1, k2, k3). There are several reasons for this (apparently)
unoptimized approach: the prepare is serialized only once, and recovery
information is easier to handle.
Now if you group transactions/batches based on key distribution, as you
suggested, the amount of redundant traffic is significantly reduced -
and that translates into better performance, especially when the dataset
you're inserting is quite large.
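To put an illustrative number on that redundancy (the figures below are made-up inputs, not measurements): if a prepare of B modifications touches d distinct primary owners and every owner receives the full list, d copies of the list go on the wire; grouped by primary owner, the list is shipped once.

```java
public class PrepareTraffic {
    // Total modification entries shipped at prepare time when one tx of
    // batchSize writes touches ownersTouched distinct primary owners and
    // every owner receives the complete modification list, as described above.
    static int ungroupedEntries(int batchSize, int ownersTouched) {
        return batchSize * ownersTouched;
    }

    // Grouped by primary owner: the same writes split into per-owner txs,
    // so each modification list is shipped to exactly one node.
    static int groupedEntries(int batchSize) {
        return batchSize;
    }

    public static void main(String[] args) {
        int redundancy = ungroupedEntries(100, 10) / groupedEntries(100);
        System.out.println("redundancy factor: " + redundancy); // prints 10
    }
}
```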
> Therefore each tx batch would lock keys only on the primary node and
> nowhere else - call it tx node pinning if you want! Now imagine a
> cluster with a bunch of concurrent txs initiated from all nodes. If I am
> not mistaken, this tx pinning algorithm would not only increase
> throughput but also minimize deadlocks.
Yes. With optimistic tx caches, the only possibility for deadlock is
between transactions touching multiple nodes [1]. As long as your
transactions only write to the same node, even if they do so on the same
key set, the possibility of deadlock is (almost [2]) zero.
> Does this make sense? If so, why not support it somehow on the API level
> or do we already? ;-)
We don't have a service like this for now. I think your best option is
to fetch the CH from the advanced cache
(cache.getAdvancedCache().getDistributionManager().getConsistentHash())
and use it to group the inserts.
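A minimal, self-contained sketch of that grouping step. The `PrimaryOwnerLookup` interface and the hashCode-modulo "hash wheel" below are stand-ins for illustration only; in a real application the primary owner would come from the ConsistentHash obtained via the DistributionManager as above.

```java
import java.util.*;

public class GroupByPrimaryOwner {
    // Stand-in for a primary-owner lookup backed by Infinispan's
    // ConsistentHash (obtained from the advanced cache's DistributionManager).
    interface PrimaryOwnerLookup {
        String primaryOwner(Object key); // node address, modelled as a String here
    }

    /** Groups keys by primary owner so each tx batch touches a single node. */
    static Map<String, List<Object>> groupByPrimary(Collection<?> keys,
                                                    PrimaryOwnerLookup ch) {
        Map<String, List<Object>> batches = new HashMap<>();
        for (Object k : keys)
            batches.computeIfAbsent(ch.primaryOwner(k), a -> new ArrayList<>()).add(k);
        return batches;
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("A", "B", "C");
        // Toy hash wheel: route each key by hashCode modulo the cluster size.
        PrimaryOwnerLookup ch = k -> nodes.get(Math.floorMod(k.hashCode(), nodes.size()));

        Map<String, List<Object>> batches =
            groupByPrimary(List.of("k1", "k2", "k3", "k4", "k5", "k6"), ch);

        // Every key in a batch maps back to that batch's node.
        for (Map.Entry<String, List<Object>> e : batches.entrySet())
            for (Object k : e.getValue())
                if (!ch.primaryOwner(k).equals(e.getKey())) throw new AssertionError();
        System.out.println("batches: " + batches.size());
    }
}
```

Each resulting batch can then be inserted in its own transaction, so the prepare for that transaction only ever travels to one primary owner.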
Thinking about it, it might be worth having a blog entry describing this,
as it can really boost performance when you need to load a large initial
data set into Infinispan.
Regards,
Vladimir
[1] this DLD situation will be fixed once we have incremental locking in
place:
https://issues.jboss.org/browse/ISPN-1219
[2] we use the key's CH value to induce an order over the keys written in
a transaction - that is in order to avoid deadlocks. If there are
collisions between these values then there's still a chance of deadlock.
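The ordering trick in [2] can be sketched as follows. The `chValue` function is a toy stand-in for the real per-key consistent-hash value; the point is only that all transactions sort their keys by the same value before acquiring locks.

```java
import java.util.*;

public class LockOrdering {
    // Stand-in for the consistent-hash value Infinispan derives per key;
    // the real implementation uses the cache's hash function, not hashCode.
    static int chValue(Object key) {
        return Math.floorMod(key.hashCode(), 1 << 10); // toy 10-bit hash wheel
    }

    /** Orders keys by CH value so concurrent txs acquire locks in one global order. */
    static List<Object> lockOrder(Collection<?> keys) {
        List<Object> ordered = new ArrayList<>(keys);
        ordered.sort(Comparator.comparingInt(LockOrdering::chValue));
        return ordered;
    }

    public static void main(String[] args) {
        // Two txs writing the same keys in different orders...
        List<Object> tx1 = lockOrder(List.of("k1", "k2", "k3"));
        List<Object> tx2 = lockOrder(List.of("k3", "k1", "k2"));
        // ...acquire locks in the same order, so neither can wait on the other
        // in a cycle. Caveat from [2]: if chValue collides for two keys, their
        // relative order is unspecified and a small deadlock window remains.
        if (!tx1.equals(tx2)) throw new AssertionError();
    }
}
```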