[infinispan-dev] [infinispan-internal] Performance based on contention

Radim Vansa rvansa at redhat.com
Fri Jan 4 11:10:49 EST 2013


| The effect on near-CPU caches is an interesting hypothesis; it would
| be nice to measure it. If you happen to run the tests again, could
| you use perf?
| Last time I ran them - almost a year ago - the counters showed very
| large opportunities for improvement (a nice way to say it was a
| disaster), so I guess any change in the locking strategy could have
| improved that.

Can you be more specific about what kind of opportunities you saw in the performance counters? Too many context switches, or many cache misses? I don't think we have the "luxury" of being able to reorganize memory structures in Java, do we?
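
The closest thing I know of is manual field padding that keeps
independently updated hot fields on separate cache lines - a sketch of
the general trick, not anything Infinispan actually does as far as I
know:

    // Two counters updated by different threads. Without the padding
    // they could land on the same cache line and "false-share"; the
    // seven long fields push counterB onto another line (assuming
    // 64-byte lines; the JVM may still reorder fields, so this is a
    // best-effort trick, not a guarantee).
    public final class PaddedCounters {
        public volatile long counterA;
        private long p1, p2, p3, p4, p5, p6, p7;
        public volatile long counterB;
    }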

| 
| Would also be interesting to run both scenarios under OProfile, so we
| can see which area is slower.

I don't have any experience with OProfile, but recently I tried to profile cross-site (XS) replication with JProfiler. Instrumentation slowed everything down so much that the results made no sense, and sampling cannot prove anything because the methods execute too quickly. Have you used OProfile on Infinispan yet, with any useful results?
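
The crudest alternative I can think of is a DIY sampler on top of
ThreadMXBean - just a sketch, mainly to show why sampling misses fast
methods: anything shorter than the sampling period is almost never
caught on top of a stack.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;
    import java.util.HashMap;
    import java.util.Map;

    // Dumps all stacks every 10 ms and counts the topmost frames.
    // Methods finishing in well under 10 ms will rarely show up.
    public class PoorMansSampler implements Runnable {
        public void run() {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            Map<String, Integer> hits = new HashMap<String, Integer>();
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    for (ThreadInfo ti : mx.dumpAllThreads(false, false)) {
                        StackTraceElement[] stack = ti.getStackTrace();
                        if (stack.length == 0) continue;
                        String top = stack[0].toString();
                        Integer n = hits.get(top);
                        hits.put(top, n == null ? 1 : n + 1);
                    }
                    Thread.sleep(10);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println(hits);
        }
    }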

| 
| Let's not forget that there are many internal structures, so even
| when using different keys there is still plenty of contention... I
| just wouldn't guess which structures without using the above-mentioned
| tools ;-)
| 
| thanks a lot,
| Sanne
| 
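To illustrate the point about internal structures: if lock striping is
in play somewhere (a guess on my side, I haven't checked which
structures are striped in the current code), two distinct keys can
still land on the same lock. A sketch of the general idea, not the
actual Infinispan code:

    import java.util.concurrent.locks.ReentrantLock;

    // Sketch of lock striping: a fixed pool of locks guards the whole
    // key space, so two distinct keys can hash to the same stripe and
    // contend even though the application never shares a key.
    public class StripedLocks {
        private final ReentrantLock[] stripes;

        public StripedLocks(int concurrencyLevel) {
            stripes = new ReentrantLock[concurrencyLevel];
            for (int i = 0; i < stripes.length; i++)
                stripes[i] = new ReentrantLock();
        }

        public ReentrantLock lockFor(Object key) {
            // e.g. "key_1" and "key_42" may map to the same stripe
            return stripes[(key.hashCode() & 0x7fffffff) % stripes.length];
        }
    }
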
| On 4 January 2013 15:26, Radim Vansa <rvansa at redhat.com> wrote:
| > |
| > | As for the low-contention cases, I think they're all about the
| > | same, as the probability of contention is very low given the
| > | number of threads used. Perhaps if these were cranked up to maybe
| > | 50 or 100 threads, you'd see a bigger difference?
| >
| > Right, I think that there should be no difference between 24k and
| > 80k keys, but as you can see there was some strange behaviour in
| > the no-tx case, so I added one more case to see the trend in the
| > rest of the results. Perflab was bored during xmas anyway ;-)
| > Nevertheless, the overall result is pretty surprising to me:
| > Infinispan handles contention so smoothly that, except for really
| > extreme cases (such as number of keys == number of threads),
| > contention actually improves the performance. (Is there any
| > explanation for this? Some memory-caching effect that keeps the
| > often-accessed entries in near-CPU caches? Can this really affect
| > such a high-level thing as Infinispan?)
| >
| > Radim
| >
| > | On 4 Jan 2013, at 09:23, Radim Vansa <rvansa at redhat.com> wrote:
| > |
| > | > Hi,
| > | >
| > | > I have created a comparison of how JDG (library mode) behaves
| > | > depending on the contention on keys. The test runs a standard
| > | > (20% puts, 80% gets) stress test against different numbers of
| > | > keys (while there are always 80k keys loaded into the cache),
| > | > with 10 concurrent threads on each of 8 nodes, for 10 minutes.
| > | > To warm up the JVM there was a 10-minute warmup on 80k shared
| > | > keys; afterwards the tests were executed in the order given in
| > | > the table below. TCP was used as the JGroups stack base.
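| > | >
| > | > To be concrete, each stresser thread runs something along these
| > | > lines (a simplified sketch of the test loop, not the actual
| > | > harness code; the real one also counts the operations):
| > | >
| > | >     import java.util.Map;
| > | >     import java.util.Random;
| > | >
| > | >     // Sketch of one stresser thread: 20% puts, 80% gets against
| > | >     // a configurable number of keys. 'cache' stands in for the
| > | >     // tested cache; a plain Map keeps the sketch self-contained.
| > | >     class Stresser implements Runnable {
| > | >         private final Map<String, String> cache;
| > | >         private final int numKeys;
| > | >         volatile boolean finished;
| > | >
| > | >         Stresser(Map<String, String> cache, int numKeys) {
| > | >             this.cache = cache;
| > | >             this.numKeys = numKeys;
| > | >         }
| > | >
| > | >         public void run() {
| > | >             Random rnd = new Random();
| > | >             while (!finished) {
| > | >                 String key = "key_" + rnd.nextInt(numKeys);
| > | >                 if (rnd.nextInt(100) < 20)
| > | >                     cache.put(key, "value");  // 20% writes
| > | >                 else
| > | >                     cache.get(key);           // 80% reads
| > | >             }
| > | >         }
| > | >     }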
| > | >
| > | > The variants below use pessimistic transactions (one request
| > | > per transaction), or no transactions in the 6.1.0 case (running
| > | > without transactions on JDG 6.0.1 under high contention wouldn't
| > | > make any sense). The last 'disjunct' case uses a slightly
| > | > different key format that avoids any contention. In each cell,
| > | > the number before the slash is reads per node (summed over all
| > | > 10 threads) per second; the number after it is writes.
| > | >
| > | > Accessed keys | JDG 6.0.1 TX | JDG 6.1.0 TX | JDG 6.1.0 NO TX
| > | > 80            |  18824/2866  |  21671/3542  |  22264/5978
| > | > 800           |  18740/3625  |  23655/4971  |  20772/6018
| > | > 8000          |  18694/3583  |  21086/4478  |  19882/5748
| > | > 24000         |  18674/3493  |  19342/4078  |  19757/5630
| > | > 80000         |  18680/3459  |  18567/3799  |  22617/6416
| > | > 80k disjunct  |  19023/3670  |  20941/4527  |  20877/6154
| > | >
| > | > I can't quite explain why the disjunct sets of keys perform so
| > | > much better than the low-contention cases; the key format is
| > | > really just key_(number) for shared keys and
| > | > key_(node)_(thread)_(number) for disjunct ones, and the rest of
| > | > the code path is the same. The exceptionally good performance
| > | > for 80k keys in the non-tx case is also very strange, but I
| > | > keep getting these results consistently.
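| > | >
| > | > (For clarity, the two key generators differ only in this - a
| > | > sketch:
| > | >
| > | >     class KeyGenerator {
| > | >         // Shared keys: all nodes and threads draw from the
| > | >         // same set of names.
| > | >         String sharedKey(int i) {
| > | >             return "key_" + i;
| > | >         }
| > | >
| > | >         // Disjunct keys: each (node, thread) pair owns a
| > | >         // private range, so no two threads ever share a key.
| > | >         String disjunctKey(int node, int thread, int i) {
| > | >             return "key_" + node + "_" + thread + "_" + i;
| > | >         }
| > | >     }
| > | >
| > | > so the only difference visible to the cache is the key string
| > | > itself.)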
| > | > Nevertheless, I am really happy about the big performance
| > | > increase between 6.0.1 and 6.1.0, and that the no-tx case works
| > | > intuitively (no deadlocks) and really fast :)
| > | >
| > | > Radim
| > | >
| > | > -----------------------------------------------------------
| > | > Radim Vansa
| > | > Quality Assurance Engineer
| > | > JBoss Datagrid
| > | > tel. +420532294559 ext. 62559
| > | >
| > | > Red Hat Czech, s.r.o.
| > | > Brno, Purkyňova 99/71, PSČ 612 45
| > | > Czech Republic
| > | >