[infinispan-dev] Performance validation of Remote & Embedded Query functionality

Radim Vansa rvansa at redhat.com
Thu Jun 5 09:54:47 EDT 2014


Hi Sanne,

yes, I do plan to do (Remote|Embedded) Query performance testing, and 
also compare the results with some competing solutions.

I definitely plan to use RadarGun for that: so far the development of 
tests in it has proved really effective, and I already have all the 
infrastructure set up to run in a distributed environment, etc. There is 
some support for embedded query performance testing, but it is really 
basic and I'll probably have to rewrite most of it to widen the capabilities.

As RadarGun aims to be a generic framework (although Infinispan features 
are naturally covered more thoroughly), I'll have to build some basic 
API common to remote, embedded and other querying frameworks, and 
translate that into the Infinispan APIs. Alternatively, we could use just 
a query string, which would differ between Infinispan and the other 
solutions. Of course, I could find myself tailoring that a bit more for 
Infinispan :)
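
To give an idea, here is a minimal sketch of such an abstraction. All 
names are hypothetical - this is not an existing RadarGun or Infinispan 
interface, just the shape I have in mind:

// Hypothetical RadarGun-style abstraction over different query frameworks.
// A plugin (Infinispan embedded, Hot Rod remote, Hibernate OGM, ...) would
// provide its own implementation; the test stages only see this interface.
public interface Queryable {

   QueryBuilder newQuery(Class<?> entityType);

   interface QueryBuilder {
      QueryBuilder eq(String attribute, Object value);              // exact match
      QueryBuilder between(String attribute, Object lo, Object hi); // range
      QueryBuilder offset(int offset);                              // pagination
      QueryBuilder maxResults(int maxResults);
      Query build();
   }

   interface Query {
      QueryResult execute();
   }

   interface QueryResult {
      int size();
      java.util.List<Object> values();
   }
}

The Infinispan plugins would translate these calls into the native query 
APIs, while other plugins could map them to whatever their framework offers.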

 >> break down response time of different phases: be able to generate a 
histogram of a specific phase only.

What do you mean by the phases?

 >> Count the number of RPCs generated by a specific operation

Huh, Infinispan does not report RPC stats - although I agree that 
would be cool! There's another thread regarding MapReduce where we 
discuss that more collectors of performance data should be scattered 
around Infinispan. Created [2].

 >> Count the number of CacheStore writes/reads being triggered

Same as above. Created [1].

Regarding data: specifying the size is not enough. Could you draft several 
example objects, with some interesting annotations describing the various 
indexing/querying settings?
I'd probably start with generated data - the deployment is simpler, and 
we need to generate the data somehow anyway. Fixing the seeds is simple.
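
To illustrate what I mean, here is a minimal sketch of one benchmark 
entity plus deterministic generation - the class and its fields are made 
up, only the Hibernate Search annotations are real:

import java.util.Random;

import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;

// Hypothetical benchmark entity for the embedded case. The interesting part
// is which fields are indexed, since that is what drives query performance.
@Indexed
public class Person {
   @Field
   String name;      // exact-match queries
   @Field
   int age;          // range queries
   String address;   // not indexed, only retrieved

   // Deterministic generation: the same seed and target size always produce
   // the same data set, so results stay comparable across runs.
   public static Person generate(long seed, long index) {
      Random rnd = new Random(seed + index);
      Person p = new Person();
      p.name = "name-" + rnd.nextInt(1_000_000);
      p.age = rnd.nextInt(100);
      p.address = "street-" + rnd.nextInt(10_000);
      return p;
   }
}

For the remote case the same entity would additionally need a protobuf 
description and marshaller, but the generation logic can stay identical.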

Regarding CapeDwarf & Hibernate OGM:
I don't have any experience with CapeDwarf, but think about RadarGun 
this way: it has two main functions:
1) it orchestrates the test: starting nodes with some configuration, 
running test stages, gathering and presenting results, and so on
2) it abstracts the implementation of any functionality behind a 
unified (maybe simplified) interface [3] - this matters only if we 
want to compare against competitors using the same test (see the sketch below)
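
For illustration, a test stage driving such an interface could look 
roughly like this - purely hypothetical code, none of these classes 
exist in RadarGun, and a real stage would feed the timings into 
RadarGun's statistics instead of printing them:

import java.util.Random;
import java.util.concurrent.TimeUnit;

// Hypothetical stage: issues a weighted mix of query and write operations
// and records per-operation latency for later histogram output.
public class MixedQueryStage {
   private final Queryable queryable;          // the abstraction sketched above
   private final Random rnd = new Random(42);  // fixed seed

   public MixedQueryStage(Queryable queryable) {
      this.queryable = queryable;
   }

   public void run(long durationMillis, double writeRatio) {
      long deadline = System.currentTimeMillis() + durationMillis;
      while (System.currentTimeMillis() < deadline) {
         long start = System.nanoTime();
         if (rnd.nextDouble() < writeRatio) {
            // writes would go through a basic put/remove abstraction, omitted here
         } else {
            queryable.newQuery(Person.class)
                  .between("age", 18, 65)
                  .maxResults(20)
                  .build()
                  .execute();
         }
         record(System.nanoTime() - start);
      }
   }

   private void record(long elapsedNanos) {
      // placeholder for aggregation into response-time histograms
      System.out.printf("%d us%n", TimeUnit.NANOSECONDS.toMicros(elapsedNanos));
   }
}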

The test itself has yet to be written.
I can't tell whether the CapeDwarf guys will be willing to use RadarGun - 
probably not, since people are usually reluctant to learn a new tool, 
especially one that has no GUI. I'll focus on the first benchmarks, and 
let's see how that ends up.
PS: For Hibernate OGM, I can easily imagine RadarGun wrapping a JPA 
EntityManager and routing to Hibernate OGM or generally to any JPA 
implementation, so that could work - something along the lines of the 
sketch below.
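
A rough sketch of what that wrapper could look like, assuming the 
persistence unit named in persistence.xml decides whether Hibernate OGM 
or any other JPA provider answers (class and method names are made up; 
whether OGM supports a particular JP-QL construct is a separate question):

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

// Hypothetical JPA-backed service along the lines of the Queryable sketch above.
public class JpaQueryableService {
   private final EntityManagerFactory emf;

   public JpaQueryableService(String persistenceUnit) {
      this.emf = Persistence.createEntityManagerFactory(persistenceUnit);
   }

   public <T> java.util.List<T> findByRange(Class<T> type, String attribute,
                                            int lo, int hi, int maxResults) {
      EntityManager em = emf.createEntityManager();
      try {
         // JP-QL keeps the benchmark code identical for any JPA implementation;
         // string concatenation is acceptable here since the benchmark controls both inputs
         return em.createQuery(
                     "SELECT e FROM " + type.getSimpleName() + " e" +
                     " WHERE e." + attribute + " BETWEEN :lo AND :hi", type)
               .setParameter("lo", lo)
               .setParameter("hi", hi)
               .setMaxResults(maxResults)
               .getResultList();
      } finally {
         em.close();
      }
   }
}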

Radim

[1] https://issues.jboss.org/browse/ISPN-4352
[2] https://issues.jboss.org/browse/ISPN-4353
[3] http://xkcd.com/927/

On 06/05/2014 12:53 PM, Sanne Grinovero wrote:
> Hi Radim, all,
> I'm in a face to face meeting with Adrian and Gustavo, to make plans
> for the next steps of Query development.
> One thing which is essential is of course having some idea of its
> scalability and general performance characteristics, both to identify
> the low hanging fruits which might be in the code, to be able to give
> some guidance to users on expectations, and also to know which
> configuration options are better for each use case: I have some
> expectations but these haven't been validated on the new-gen Query
> functionality.
>
> I was assuming that we would have to develop it: we need it to be able
> to get working on our laptops as a first step to use to identify the
> most obvious mistakes, as well as make it possible to validate in the
> QA lab on more realistic hardware configurations, when the most
> obvious issues will have been resolved.
> Martin suggested that you have this task too as one of the next goals,
> so let's develop it together?
>
> We couldn't think of a cool example to use as a model, but roughly
> this is what we aim to cover, and the data we aim to collect;
> suggestions are very welcome:
>
> ## Benchmark 1: Remote Queries (over Hot Rod)
> Should perform a (weighted) mixture of the following Read and Write operations:
>   (R) Ranges (with and without pagination)
>   (R) Exact
>   (R) All the above, combined (with and without pagination)
>   (W) Insert an entry
>   (W) Delete an entry
>   (W) Update an entry
>
>   Configuration options
> - data sizes: let's aim at having a consistent data set of at least 4GB.
> - 1 node / 4 nodes / 8 nodes
>          - REPL/DIST for the Data storing Cache
> - variable ratio of results out of the index (Query retrieves just 5
> entries out of a million vs. half a million)
> - control ratio of operations; e.g.: no writes / mostly writes / just
> Range queries
>          - for write operations: make sure to trigger some Merge events
> - SYNC / ASYNC indexing backend and control IndexWriting tuning
> - NRT / Non-NRT backends (Infinispan IndexManager only available as non-NRT)
> - FSDirectory / InfinispanDirectory
>            -- Infinispan Directory: Stored in REPL / DIST independently
> from the Data Cache
>                                   : With/Without CacheStore
> - Have an option to run "Index-Less" (the tablescan approach)
> - Have an option to validate that the queries are returning the expected results
>
> Track:
>   - response time: all samples, build distribution of outliers, output
> histograms.
>   - break down response time of different phases: be able to generate
> a histogram of a specific phase only.
>   - Count the number of RPCs generated by a specific operation
>   - Count the number of CacheStore writes/reads being triggered
>   - number of parallel requests it can handle
>
> Data:
> It could be randomly generated, but in that case let's have it use a
> fixed seed and make sure it generates the same data set at each run,
> probably depending just on the target size.
> We should also test for distribution of properties of the searched
> fields, since we want to be able to predict results to validate them
> (or find a different way to validate).
> Having a random generator makes preparation faster and allows us to
> generate a specific data size, but alternatively we could download
> some known public data set; assertions on validity of queries would be
> much simpler.
>
> I would like to set specific goals to be reached for each metric, but
> let's see the initial results first. We should then also narrow down
> the configuration option combinations that we actually want to run in
> a set of defined profiles to match common use cases, but let's have
> the code ready to run any combination.
>
> ## Benchmark 2: Embedded Queries
>
> Same tests as Remote Queries (using the same API, so no full-text).
> We might want to develop this one first for simplicity, but results
> for the Remote Query functionality are more urgent.
>
> ## Benchmark 3: CapeDwarf & Objectify
>
> Help the CapeDwarf team by validating embedded queries; it's useful
> for us to have a benchmark running a more complex application. I'm not
> too familiar with RadarGun, do you think this could be created as a
> RadarGun job, so as to have the benchmark run regularly and simplify
> setup?
>
> ## Benchmark 4: Hibernate OGM
>
> Another great use case for a more complex test ;-)
> The remote support for OGM still needs some coding though, but we
> could start looking at the embedded mode.
>
> Priorities?
> Some of these are totally independent, but we don't have many hands to
> work on it.
>
> I'm going to move this to a wiki, unless I get some "revolutionary" suggestions.
>
> Cheers,
> Sanne


-- 
Radim Vansa <rvansa at redhat.com>
JBoss DataGrid QA


