Uber Jars
by Tristan Tarrant
Dear all,
on Thursday I issued a PR [1] to introduce Uber Jars, i.e. single jars
which wrap our multitude of jars and some of the transitive
dependencies, but it was (rightly) pointed out that we should have a
little discussion here first.
Firstly, I'm using the maven shade plugin which repackages multiple jars
into one with:
- automatic transitive resolution
- the ability to include/exclude certain jars
- the ability to move classes to other packages, if necessary, to avoid
conflicts
- rewriting of the POM with the new dependencies.
Here is my global strategy:
- define a set of uber-jars (see below)
- include all non-optional dependencies in each uber-jar except for the
specification Jars (e.g. javax.transaction and javax.persistence)
- rename some internal-only dependencies to avoid conflicts
- uber jars MUST NOT inherit from infinispan-parent (too much cruft in
there) but only from infinispan-bom.
The Uber Jars
- infinispan-embedded-all (infinispan-commons, infinispan-core, jgroups,
jboss-marshalling-osgi, jboss-logging, infinispan-cachestore-jdbc,
infinispan-cachestore-jpa, infinispan-cachestore-leveldb)
- infinispan-remote-all (infinispan-commons, infinispan-client-hotrod,
commons-pool, jboss-marshalling-osgi, jboss-logging,
infinispan-remote-query-client, infinispan-query-dsl,
infinispan-protostream, protobuf-java)
- infinispan-query-all (infinispan-query, infinispan-query-dsl,
hibernate-hql-parser, antlr, stringtemplate, hibernate-hql-lucene,
hibernate-search-engine, infinispan-lucene-v4,
hibernate-search-analyzers, lucene*, solr*, avro, jackson-core,
jackson-mapper, paranamer, apache-compress, infinispan-lucene-directory,
hibernate-search-infinispan, hibernate-commons-annotations). This
package will depend on infinispan-embedded-all, and we should only make a
Lucene v4 version.
Discuss :)
Tristan
[1] https://github.com/infinispan/infinispan/pull/2589
10 years, 6 months
OGM, Hot Rod and Grouping API
by Davide D'Alto
Hi all,
some time ago we talked on the mailing list about the integration between
Hibernate OGM and Hot Rod.
To achieve this we would need to include the grouping API in the Hot Rod
protocol and to add a couple of methods in the grouping API:
- to get the keys in a group
- to remove the keys in a group
Mircea created an experimental stub where the method " <G, KG> Set<KG>
getGroupKeys(G group) " is added to the Cache interface.
I've rebased the branch onto the latest changes (I might have introduced some
errors): https://github.com/DavideD/infinispan/compare/ISPN-3981
I should have implemented the methods but I haven't had the time to work on
these features.
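To make the two proposed operations more concrete, here is a minimal
in-memory sketch. Only the name getGroupKeys comes from Mircea's stub;
GroupAwareStore, removeGroup and the key-to-group function are hypothetical
names chosen for illustration, and this is not how Infinispan implements
grouping.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical in-memory store; not Infinispan's Cache or its grouping implementation.
class GroupAwareStore<K, V, G> {
    private final Map<K, V> data = new ConcurrentHashMap<>();
    private final Function<K, G> groupOf; // assumption: the group is derived from the key

    GroupAwareStore(Function<K, G> groupOf) {
        this.groupOf = groupOf;
    }

    void put(K key, V value) {
        data.put(key, value);
    }

    V get(K key) {
        return data.get(key);
    }

    int size() {
        return data.size();
    }

    // Returns a snapshot of the keys that belong to the given group.
    Set<K> getGroupKeys(G group) {
        Set<K> keys = new HashSet<>();
        for (K key : data.keySet()) {
            if (group.equals(groupOf.apply(key))) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Hypothetical name: removes every entry of the group and returns how many were removed.
    int removeGroup(G group) {
        int removed = 0;
        for (K key : getGroupKeys(group)) {
            if (data.remove(key) != null) {
                removed++;
            }
        }
        return removed;
    }
}
```

A full scan of the key set is the simplest possible approach; a real
implementation would presumably exploit the fact that all keys of a group
live on the same node.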
There are also two issues to keep track of this:
https://issues.jboss.org/browse/ISPN-3732
https://issues.jboss.org/browse/ISPN-3981
As far as I know, the API for Infinispan 7 is going to be frozen soon.
I was wondering whether these changes have been taken into account and,
if not, whether it is possible to include them.
Thanks,
Davide
10 years, 6 months
Cutting down on dependencies
by Sanne Grinovero
I'm starting from the most obvious place: looking for unused dependencies.
Some dependencies are declared in the parent pom.xml, but it turns out
nobody actually needed them :-)
I'm trying to remove them all, one by one: a slow background process,
as I want the full testsuite to complete for each removal.
It would be easier if no dependency whatsoever was added to the parent
pom; for example, there are situations we would like to test:
# whether things still work without any TransactionManager available on the classpath
# that Log4J is really optional (will it blow up at runtime if it's not there?)
# I need a specific module to _not_ have TestNG on the classpath because
of conflicts
None of these are possible as these dependencies (Log4j, Narayana,
TestNG) are mandated globally.
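As an illustration of the "is it really optional" kind of test, a module
that deliberately omits a dependency could first probe the classpath. This
is only a sketch: OptionalDependencyCheck is a hypothetical helper name, and
it uses nothing beyond the standard Class.forName lookup.

```java
// Hypothetical helper for a test that must pass with an optional dependency absent.
class OptionalDependencyCheck {
    // Looks the class up without initializing it, so the probe itself has no side effects.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, OptionalDependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

A test in such a module could assert that
isPresent("org.apache.log4j.Logger") is false before starting a cache, so
that a pass really means "works without Log4j" rather than "Log4j sneaked
in through the parent pom".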
I'm working to gradually remove all global dependencies from the
parent pom: modules which need them should declare them explicitly.
Please support me by never adding any global dependency to the parent pom.
Essentially I think no jar belongs there, but we could make an
exception for things like net.jcip:jcip-annotations.
Cheers,
Sanne
10 years, 6 months
Performance validation of Remote & Embedded Query functionality
by Sanne Grinovero
Hi Radim, all,
I'm in a face to face meeting with Adrian and Gustavo, to make plans
for the next steps of Query development.
One thing which is essential is of course having some idea of its
scalability and general performance characteristics: to identify
the low-hanging fruit which might be in the code, to be able to give
some guidance to users on expectations, and also to know which
configuration options are better for each use case. I have some
expectations but these haven't been validated on the new-gen Query
functionality.
I was assuming that we would have to develop a benchmark for this: as a
first step it needs to run on our laptops, so that we can identify the
most obvious mistakes; once those are resolved, it should also make it
possible to validate in the QA lab on more realistic hardware
configurations.
Martin suggested that you have this task too as one of the next goals,
so let's develop it together?
We couldn't think of a cool example to use as a model, but roughly
this is what we aim to cover, and the data we aim to collect;
suggestions are very welcome:
## Benchmark 1: Remote Queries (over Hot Rod)
Should perform a (weighted) mixture of the following Read and Write operations:
(R) Ranges (with and without pagination)
(R) Exact
(R) All the above, combined (with and without pagination)
(W) Insert an entry
(W) Delete an entry
(W) Update an entry
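The weighted mixture could be driven by something as simple as the sketch
below. WeightedOperationMix and the Op names are hypothetical, the weights
are whatever a given profile configures, and the seed is there only to make
runs repeatable.

```java
import java.util.Map;
import java.util.Random;

// Hypothetical helper for a benchmark driver; names and weights are illustrative.
class WeightedOperationMix {
    enum Op { RANGE_QUERY, EXACT_QUERY, COMBINED_QUERY, INSERT, DELETE, UPDATE }

    private final Op[] ops;
    private final int[] cumulative; // running sum of the weights, in iteration order
    private final int total;
    private final Random random;

    WeightedOperationMix(Map<Op, Integer> weights, long seed) {
        ops = weights.keySet().toArray(new Op[0]);
        cumulative = new int[ops.length];
        int sum = 0;
        for (int i = 0; i < ops.length; i++) {
            int w = weights.get(ops[i]);
            if (w < 0) {
                throw new IllegalArgumentException("negative weight for " + ops[i]);
            }
            sum += w;
            cumulative[i] = sum;
        }
        if (sum == 0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        total = sum;
        random = new Random(seed);
    }

    // Picks the next operation with probability weight / total.
    Op next() {
        int r = random.nextInt(total);
        for (int i = 0; i < ops.length; i++) {
            if (r < cumulative[i]) {
                return ops[i];
            }
        }
        throw new IllegalStateException("unreachable: r is always below the total weight");
    }
}
```

A "no writes" profile is then just a map in which the three write
operations have weight 0.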
Configuration options
- data sizes: let's aim at having a consistent data set of at least 4GB.
- 1 node / 4 nodes / 8 nodes
- REPL/DIST for the Data storing Cache
- variable ratio of results out of the index (Query retrieves just 5
entries out of a million vs. half a million)
- control ratio of operations, e.g. no writes / mostly writes / just
Range queries
- for write operations: make sure to trigger some Merge events
- SYNC / ASYNC indexing backend and control IndexWriting tuning
- NRT / Non-NRT backends (Infinispan IndexManager only available as non-NRT)
- FSDirectory / InfinispanDirectory
-- InfinispanDirectory: stored in REPL / DIST, independently
from the Data Cache
-- InfinispanDirectory: with / without CacheStore
- Have an option to run "Index-Less" (the tablescan approach)
- Have an option to validate that the queries are returning the expected results
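For the validation option, one possibility is to use the tablescan approach
as the reference: evaluate the same condition against every entry and
compare the keys with what the query under test returned. The sketch below
is plain Java over a Map; TableScanOracle is a hypothetical name and nothing
here uses the Infinispan Query API.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical validation helper; it does not use the Infinispan Query API.
class TableScanOracle {
    // Evaluates the predicate against every entry: slow, but needs no index.
    static <K, V> Set<K> scan(Map<K, V> entries, Predicate<V> predicate) {
        Set<K> matches = new HashSet<>();
        for (Map.Entry<K, V> e : entries.entrySet()) {
            if (predicate.test(e.getValue())) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    // True when the query under test returned exactly the scanned keys, with no duplicates.
    static <K> boolean sameKeys(Set<K> expected, Collection<K> actual) {
        return actual.size() == expected.size() && new HashSet<>(actual).equals(expected);
    }
}
```

This only makes sense on data sets small enough to scan, so it would
probably be a separate, optional phase rather than part of the timed run.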
Track:
- response time: all samples, build distribution of outliers, output
histograms.
- break down response time of different phases: be able to generate
a histogram of a specific phase only.
- Count the number of RPCs generated by a specific operation
- Count the number of CacheStore writes/reads being triggered
- number of parallel requests it can handle
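To keep all samples and still be able to look at outliers, the recording
side can stay very simple. The following is a sketch only: LatencyRecorder
is a hypothetical name, the fixed-width bucket layout is an assumption, and
it is not RadarGun's API. Samples are assumed to be non-negative
nanosecond values.

```java
import java.util.Arrays;

// Hypothetical recorder; bucket layout and names are illustrative, not RadarGun's API.
class LatencyRecorder {
    private long[] samples = new long[1024];
    private int count;

    // Stores every sample so distributions can be rebuilt later.
    void record(long nanos) {
        if (count == samples.length) {
            samples = Arrays.copyOf(samples, count * 2);
        }
        samples[count++] = nanos;
    }

    int count() {
        return count;
    }

    // Nearest-rank percentile, for p in (0, 100].
    long percentile(double p) {
        if (count == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * count);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Fixed-width buckets; the last bucket collects every sample beyond the range (the outliers).
    int[] histogram(long bucketWidthNanos, int buckets) {
        int[] h = new int[buckets];
        for (int i = 0; i < count; i++) {
            int b = (int) Math.min(samples[i] / bucketWidthNanos, buckets - 1);
            h[b]++;
        }
        return h;
    }
}
```

Using one recorder per phase (for example one for the query execution and
one for loading the results) would give the per-phase histograms mentioned
above.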
Data:
It could be randomly generated, but in that case let's have it use a
fixed seed and make sure it generates the same data set at each run,
probably depending just on the target size.
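A fixed-seed generator along these lines would do. It is a sketch under
assumptions: SeededDataSet and the Item fields (id, price, category) are
made up for illustration and are not a proposed data model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical generator; the record shape (id, price, category) is an illustrative assumption.
class SeededDataSet {
    static final class Item {
        final int id;
        final int price;
        final String category;

        Item(int id, int price, String category) {
            this.id = id;
            this.price = price;
            this.category = category;
        }
    }

    // The output depends only on the seed and the requested size.
    static List<Item> generate(long seed, int size) {
        Random random = new Random(seed);
        String[] categories = {"book", "music", "video", "game"};
        List<Item> items = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            int price = random.nextInt(10_000);
            String category = categories[random.nextInt(categories.length)];
            items.add(new Item(i, price, category));
        }
        return items;
    }
}
```

java.util.Random documents its algorithm, so the same seed should produce
the same sequence on every JVM, which is what makes the expected query
results predictable.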
We should also test for distribution of properties of the searched
fields, since we want to be able to predict results to validate them
(or find a different way to validate).
Having a random generator makes preparation faster and allows us to
generate a specific data size; alternatively we could download
some known public data set, in which case assertions on the validity of
queries would be much simpler.
I would like to set specific goals to be reached for each metric, but
let's see the initial results first. We should then also narrow down
the configuration option combinations that we actually want to run in
a set of defined profiles to match common use cases, but let's have
the code ready to run any combination.
## Benchmark 2: Embedded Queries
Same tests as Remote Queries (using the same API, so no full-text).
We might want to develop this one first for simplicity, but results
for the Remote Query functionality are more urgent.
## Benchmark 3: CapeDwarf & Objectify
Help the CapeDwarf team by validating embedded queries; it's useful
for us to have a benchmark running a more complex application. I'm not
too familiar with RadarGun: do you think this could be created as a
RadarGun job, so as to have the benchmark run regularly and simplify
setup?
## Benchmark 4: Hibernate OGM
Another great use case for a more complex test ;-)
The remote support for OGM still needs some coding, but we
could start looking at the embedded mode.
Priorities?
Some of these are totally independent, but we don't have many hands to
work on it.
I'm going to move this to a wiki, unless I get some "revolutionary" suggestions.
Cheers,
Sanne
10 years, 6 months
"Unknown Yet"
by Manik Surtani
Really? Do we not like beer anymore? :-)
10 years, 6 months