Here are the results of some quick tests I ran on Friday using
PostgreSQL. Note that this is the total time spent saving the session
after 500 pages, not the time spent indexing.
1.15: 112.916s, 113.433s, 110.835s
1.14: 69.711s, 81.118s, 81.680s
For 400 pages (I only ran this test once):
1.15: 61.503s
1.14: 57.856s
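To put the figures above on a single scale, the runs can be averaged and expressed as a slowdown ratio. The following is only a minimal sketch: the class and method names (SaveTimings, mean, slowdown) are made up for illustration and are not part of GateIn or JCR, and three runs per version is far too few to say anything about variance.

```java
import java.util.Arrays;

public class SaveTimings {
    // Arithmetic mean of a set of wall-clock timings, in seconds.
    public static double mean(double[] seconds) {
        return Arrays.stream(seconds).average().orElse(Double.NaN);
    }

    // How many times slower `candidate` is than `baseline` (1.0 means equal).
    public static double slowdown(double[] candidate, double[] baseline) {
        return mean(candidate) / mean(baseline);
    }

    public static void main(String[] args) {
        // 500-page timings copied from the figures above.
        double[] v115 = {112.916, 113.433, 110.835};
        double[] v114 = {69.711, 81.118, 81.680};

        System.out.printf("1.15 mean: %.3fs%n", mean(v115));
        System.out.printf("1.14 mean: %.3fs%n", mean(v114));
        System.out.printf("500 pages: 1.15 is %.2fx the time of 1.14%n",
                slowdown(v115, v114));

        // The 400-page test was run once per version, so there is nothing to average.
        System.out.printf("400 pages: 1.15 is %.2fx the time of 1.14%n",
                61.503 / 57.856);
    }
}
```

With these numbers the ratio comes out at about 1.45x for 500 pages and about 1.06x for 400 pages.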
The fewer pages there are, the closer 1.15 seems to perform to
1.14. This is a very small sample, and my tests might not paint an
accurate picture, but it does seem that 1.14 consistently
outperforms 1.15. So take it as you will. I'm also curious what
performance would be like if I saved the session after every page,
instead of doing one large commit of 500 pages.
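One way to compare the two commit strategies is a small harness that counts only the time spent inside the save call. This is a rough sketch under stated assumptions: PageStore, CountingStore and timeSaves are hypothetical names invented here, not the GateIn DataStorage or JCR Session API, and the in-memory store exists only so the example runs on its own. A real test would wrap the actual page creation and session save behind the same interface.

```java
import java.util.ArrayList;
import java.util.List;

public class SaveStrategyTimer {
    // Hypothetical stand-in for whatever creates pages and commits them.
    // NOT the GateIn DataStorage or JCR Session API.
    public interface PageStore {
        void addPage(int index);
        void save();
    }

    // Adds `pages` pages, calling save() after every `batchSize` pages and
    // once more at the end if a partial batch remains.
    // Returns the total nanoseconds spent inside save() only.
    public static long timeSaves(PageStore store, int pages, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        long nanosInSave = 0;
        int pending = 0;
        for (int i = 0; i < pages; i++) {
            store.addPage(i);
            pending++;
            if (pending == batchSize) {
                long start = System.nanoTime();
                store.save();
                nanosInSave += System.nanoTime() - start;
                pending = 0;
            }
        }
        if (pending > 0) {
            long start = System.nanoTime();
            store.save();
            nanosInSave += System.nanoTime() - start;
        }
        return nanosInSave;
    }

    // Trivial in-memory store so the sketch is runnable; it only counts calls.
    public static class CountingStore implements PageStore {
        private final List<Integer> pending = new ArrayList<>();
        public int saves = 0;
        public int committed = 0;

        @Override
        public void addPage(int index) {
            pending.add(index);
        }

        @Override
        public void save() {
            committed += pending.size();
            pending.clear();
            saves++;
        }
    }

    public static void main(String[] args) {
        CountingStore oneCommit = new CountingStore();
        long oneCommitNanos = timeSaves(oneCommit, 500, 500);

        CountingStore perPage = new CountingStore();
        long perPageNanos = timeSaves(perPage, 500, 1);

        System.out.printf("one commit: %d save(s), %d ns in save%n",
                oneCommit.saves, oneCommitNanos);
        System.out.printf("per page:   %d save(s), %d ns in save%n",
                perPage.saves, perPageNanos);
    }
}
```

Timing only the save call keeps the measurement in line with the figures above, which exclude everything except the time spent saving the session.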
- Nick
On 10/19/2012 12:13 PM, Nicolas Filotto wrote:
Performance tests should be done on supported DBs only. Can you
confirm that on PostgreSQL the performance with 1.14 and with 1.15 is
similar? As for H2, I was not aware that it is the default
embedded DB, so I created a JIRA for this: https://jira.exoplatform.org/browse/JCR-1982
For these tests it seemed like the bottleneck was with Lucene.
For 500 pages, IndexWriter.addDocument took 18s in the test using
1.14 and 58s in the test using 1.15. The DBs used were
HSQLDB and H2.
Obviously I don't think saving 500 pages is a very common use
case; however, the issue came up when I was testing our import,
which can do a fair share of reads and writes. In that scenario,
though, the bottleneck seemed to be with H2 and the fact that
CQJDBCStorageConnection.traverseQPath takes quite a while on
H2. After switching to PostgreSQL this call is irrelevant and
import performance seems comparable to before.
Since it seems we've switched to H2 for GateIn on AS7 (which
was news to me), I'm wondering if there are some
optimizations that can be done, similar to what has been
done with HSQLDB.
I will note, however, that I never saw the H2 traverseQPath
bottleneck when running either Matt's tests or my own
(which were very similar).
- Nick
On 10/19/2012 05:33 AM, Nicolas Filotto wrote:
Which DB do you use?
How many threads do you have?
Can you profile to see where the slowdown is actually coming
from?
Finally, is it something that you need to do very
frequently? As you probably know, in terms of performance
it is rarely possible to be better in every use case,
so it is important to focus on the most frequent use
cases only.
Is DataStorage.save expected to be slower with JCR 1.15 (a
trade-off for some performance gain somewhere else), or is
there something which needs to be done differently when using
DataStorage and JCR 1.15? Or maybe my test is completely
wrong :)