[hibernate-dev] [infinispan-dev] Feedback on Infinispan patch

Sat Sep 12 13:15:23 EDT 2009

Hi Łukasz,
a web application is certainly a good use case for your work, but it's
very hard to properly stress test your code this way.
JMeter will give you nice graphs, but you would not be focusing on the
interesting code: you'd be measuring a much larger stack, including
HTTP communication, which is not your goal.
What is your goal now?
There are several interesting aspects to test; some approaches:
1) one node reading/writing tests, from one thread
2) one node reading/writing tests, from several (100+) threads
3) more nodes (at least 4?), several threads each, one node writing
stuff and the others finding it
4) more nodes, each one making changes and finding stuff (using your
jgroups backend)

The difficulty is mostly in defining what a correct answer is: after
node A has written something ("x") and you want to test whether it's
going to be found by a thread in node B, we're going to need something
like a coordinator (a database?) to define whether "x" should be found
or not.

So a first step could be running some tests without this coordination:
start your classes directly and spawn many threads, just to verify the
system doesn't crash or deadlock. This is also a good setup to take
some timings and measure the needed resources, such as memory
consumption, to detect memory leaks.
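
A minimal sketch of such an uncoordinated test, in Java. Everything in
it is hypothetical: the class and method names are made up, and a
ConcurrentHashMap stands in for the component under test, so the real
directory reads and writes would have to be swapped in. The timeout is
only a crude way to notice a deadlock, and the memory figure is a rough
reading, not a leak detector.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness: the map below is a stand-in for the code under test.
class UncoordinatedStressTest {

    /**
     * Starts `threads` workers, each doing `opsPerThread` random
     * writes, deletes and reads. Returns true only if every worker
     * finished within the timeout and none of them threw.
     */
    static boolean run(int threads, int opsPerThread, long timeoutSeconds) {
        final Map<Integer, String> store = new ConcurrentHashMap<>();
        final AtomicInteger failures = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                try {
                    for (int i = 0; i < opsPerThread; i++) {
                        int key = rnd.nextInt(1000);
                        switch (rnd.nextInt(3)) {
                            case 0: store.put(key, "x" + i); break;
                            case 1: store.remove(key); break;
                            default: store.get(key); break;
                        }
                    }
                } catch (RuntimeException e) {
                    failures.incrementAndGet(); // a crash in the code under test
                }
            });
        }

        pool.shutdown();
        boolean finished;
        try {
            // Workers still running after the timeout hint at a deadlock.
            finished = pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = false;
        }
        if (!finished) {
            pool.shutdownNow();
        }

        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        Runtime rt = Runtime.getRuntime();
        long usedKb = (rt.totalMemory() - rt.freeMemory()) / 1024;
        System.out.println("elapsed ms: " + elapsedMs + ", used heap KB: " + usedKb
                + ", failures: " + failures.get() + ", finished: " + finished);
        return finished && failures.get() == 0;
    }

    public static void main(String[] args) {
        System.out.println(run(100, 10_000, 60) ? "OK" : "FAILED");
    }
}
```

Running it repeatedly and watching whether the used heap keeps growing
between runs would give a first hint of a leak.
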
After that, using more code from the Hibernate Search stack, we could
try using Hibernate entities and verify that the index and the DB
content are always in sync: I hope it's safe to assume that a database
will be able to play the coordination role quite well, as each change
is going to be a transaction.
Currently Hibernate Search silently skips results found in the index
but not found in the database (as the index changes could be applied
asynchronously this is not an error), so we need to bypass Search's
query API and compare database and index directly. Basically we could
spawn 100 threads doing writes and deletes, stop the threads, flush all
changes and then verify the index and database are in sync. Then repeat
this in several JVMs, changing only the configuration so that they use
an Infinispan shared index, and a shared database of course: at the end
each node has to verify that its own view of the index contains exactly
the entities found in the DB, no more, no less. It's probably safe to
assume then that the views are all equal, without the need to execute
queries.
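
The final check could be sketched like this in Java. It assumes the
entity ids have already been collected into two sets, one read from the
database and one read directly from the index; how they are collected
is left out, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: compares the ids found in the DB with those in the index.
class IndexDbComparison {

    /** Ids present in the database but not in the index (lost writes). */
    static <T> Set<T> missingFromIndex(Set<T> dbIds, Set<T> indexIds) {
        Set<T> missing = new HashSet<>(dbIds);
        missing.removeAll(indexIds);
        return missing;
    }

    /** Ids present in the index but not in the database (lost deletes). */
    static <T> Set<T> staleInIndex(Set<T> dbIds, Set<T> indexIds) {
        Set<T> stale = new HashSet<>(indexIds);
        stale.removeAll(dbIds);
        return stale;
    }

    /** True when the index holds exactly the database ids, no more, no less. */
    static <T> boolean inSync(Set<T> dbIds, Set<T> indexIds) {
        return dbIds.equals(indexIds);
    }

    public static void main(String[] args) {
        // Made-up sample data, only to show the calls.
        Set<Long> db = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        Set<Long> index = new HashSet<>(Arrays.asList(2L, 3L, 5L));
        System.out.println("in sync: " + inSync(db, index));
        System.out.println("missing from index: " + missingFromIndex(db, index));
        System.out.println("stale in index: " + staleInIndex(db, index));
    }
}
```

Reporting the two differences separately, rather than a single boolean,
shows whether a failure comes from a lost write or from a lost delete.
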

WDYT?

I'm available in chat if you prefer; we could write this last part
together or divide the tasks, but I'm going to need to see your code
;-)

cheers,
Sanne

2009/9/12 Łukasz Moreń <lukasz.moren at gmail.com>:
> Hi Sanne,
>
> I think Lucene dir works ok. I improved it following suggestions and fixed
> bugs that came up.
> Now I'm doing some performance tests, and would like to compare it with
> other Directories.
> To do it, I've created a web app that inserts and queries data (one insert
> and one query per request).
> With JMeter I'm simulating dozens of requests inserting and querying data
> simultaneously.
> I don't have experience with performance tests. If you have suggestions
> on how to do it better, please send them to me.
>
> I hope to send the patch on Monday.
>
> Cheers,
> Lukasz
>
>
> 2009/9/12 Sanne Grinovero <sanne.grinovero at gmail.com>
>>
>> Hi Łukasz,
>> what are the news about the Lucene Directory?
>> I'm very eager to test it, and have some time to help you if needed.
>>
>> Is there an updated patch to see?
>>
>> How are you testing it? Maybe I could help on that?
>>
>> Also we're going to need some "glue" to integrate the first part of your
>> work (the jgroups backend) with the second part (the directory), as the
>> jgroups backend will need to choose a single node to be used for index
>> writes; if this node is removed from the cluster a new one should be
>> elected.
>> Manik had commented about this:
>> "One way to do this is to use the JGroups coordinator as the master..."
>> but since then the discussion on this was discontinued.
>>
>> Sanne
>>
>>
>> 2009/8/26 Łukasz Moreń <lukasz.moren at gmail.com>:
>> > Hi, thanks for the comments and tips. I'm improving it.
>> > Yes, I checked with a profiler tool: hashcode, even though it is not so
>> > heavy, was called so often that in total it took some time.
>> > There is one test where multiple threads can read or write from/to
>> > different cache instances. However, I think it would be good to do some
>> > real test, e.g. with JMeter on a sample app.
>> >
>> >
>> > 2009/8/26 Manik Surtani <manik at jboss.org>
>> >>
>> >> Hi there - all looks good.  Some comments:
>> >>
>> >> Summary documentation - is this going to be published on a wiki page or
>> >> something somewhere?  Especially the Infinispan bit?  I think people
>> >> will find this info very useful...
>> >> CacheKey - if this class is going to be used as the key for everything
>> >> in the cache, for performance you should cache the hashcode.  Calculate
>> >> it once and then cache it as an instance variable.  If this class is
>> >> immutable it can even be done on construction.  Infinispan uses
>> >> hashcode() a lot.  :-)  But then again, depending on how many entries
>> >> live in the cache, the overhead of an extra int for every entry may be
>> >> heavy ...
>> >> LockCacheKey - is probably more performant if this is implemented as a
>> >> boolean flag on CacheKey.  Then you won't need to look at the class
>> >> type when working out hashcodes
>> >> Have you written any stress or performance tests?
>> >>
>> >> Cheers
>> >> Manik
>> >> On 23 Aug 2009, at 22:53, Łukasz Moreń wrote:
>> >>
>> >> Hi,
>> >> Yes, I can adjust the patch in the next few days. I've just noticed that
>> >> I sent the summary in an unfriendly format :), a better one is now
>> >> attached.
>> >> Explanations for your questions are below.
>> >>
>> >>
>> >> 2009/8/23 Emmanuel Bernard <emmanuel at hibernate.org>
>> >>>
>> >>> Hey Lukasz,
>> >>> Your patch looks quite good and passes the tests on my side.
>> >>> I encourage others to check out the patch before we apply it (ideally
>> >>> another person from HSearch and one person from infinispan).
>> >>>
>> >>> Lukasz, I have a few questions/remarks though before applying it. Can
>> >>> you answer / adjust the patch?
>> >>> IndexWriterSetting
>> >>> Why move to return Object in parsing from the initial int?
>> >>
>> >> IndexWriterSetting has to set up the MergeScheduler in IndexWriter.
>> >> Before, parsing was only responsible for number conversion from String
>> >> to int. Now I have to parse a class name, and build/return a
>> >> MergeScheduler from it.
>> >>
>> >>>
>> >>> Move DPHelper#createInfinispanCacheManager to IDP
>> >>> this is not something that can be shared as it creates a hard
>> >>> dependency on infinispan otherwise.
>> >>> in createInfinispanCacheManager
>> >>> Don't log in error the fact that xml is not used if a default config
>> >>> is used. Just log in trace at best.
>> >>> Rename InfinispanCacheManagerConfigurationImpl to
>> >>> DefaultInfinispanCacheManagerConfiguration or even better with a name
>> >>> describing nicely the behavior of the infinispan config.
>> >>> in InfinispanIndexOutput, is it possible to get writeBytes bigger than
>> >>> buffer size? If yes, does newCheck create the appropriate number of
>> >>> chunks?
>> >>
>> >> Yes, it is possible. The writing process is divided into stages; during
>> >> each stage at most buffer_size bytes can be written. At the end of each
>> >> stage it's checked whether a new chunk is necessary, and if so a new
>> >> chunk is created.
>> >>
>> >>>
>> >>> InfinispanDirectoryProvider
>> >>> put the configuration properties available in the
>> >>> InfinispanDirectoryProvider javadoc.
>> >>> I think the default cache name should be "Hibernate Search" instead of
>> >>> "HSInfinispanCache". We know it's in infinispan :)
>> >>> what's the try catch opening and closing an IW about? It looks weird.
>> >>
>> >> The first IW is opened with the create=true parameter, because the
>> >> index has to be initialized/created first. Every later IW is opened
>> >> with the create=false parameter, so data is appended to the existing
>> >> index. Similar things are done in other DPs.
>> >>>
>> >>> in stop()
>> >>> you don't close the CacheManager? How is that?
>> >>
>> >> Yes. Should be closed.
>> >>
>> >>>
>> >>> InfinispanCacheManagerConfigurationImpl
>> >>> What does "Infinispan-Cluster" correspond to? Why this name? Shouldn't
>> >>> it
>> >>> be "Hibernate Search cluster"?
>> >>> Is it safe to override the GlobalConfiguration? What if JBoss AS use
>> >>> infinispan to run?
>> >>
>> >> This name is used to distinguish the cluster used by HSearch: all nodes
>> >> with the same name form a group. Yes, "Hibernate Search cluster" is
>> >> rather a better name. It is safe to modify GlobalConfiguration: it
>> >> holds the configuration for the CacheManager, like the communication
>> >> mechanism (JGroups or something else), the stack configuration for
>> >> JGroups, etc., whereas Configuration is used to configure a specific
>> >> cache. I think the infinispan cluster name on JBoss AS just has to be
>> >> different from the HSearch one, then they will be independent.
>> >>
>> >>>
>> >>> Why the use of DummyTransactionManagerLookup? Doesn't Infinispan guess
>> >>> the right TM depending on the environment, e.g. in JBoss AS use the
>> >>> JBoss one, etc.?  I think GenericTransactionManagerLookup does that.
>> >>
>> >> Yes right, I was testing it with DummyTM and forgot to change it later.
>> >>
>> >>>
>> >>> InfinispanCacheManagerConfiguration
>> >>> some javadoc on the methods would be useful. I don't know what to
>> >>> implement here.
>> >>>
>> >>> Is there a better name for Metadata? Like FileMetadata maybe?
>> >>
>> >> Better FileMetadata or maybe FileHeader.
>> >>>
>> >>> Where is ispn-cache-default-conf.xml used? For tests only? If not: is
>> >>> it possible to use a programmatic version instead? And what is "It's a
>> >>> movie cache"?
>> >>
>> >> Yes, in tests only so far. However, it can be used as a provided default
>> >> configuration. I will maybe ask the infinispan group about the best
>> >> configuration parameters. "It's a movie cache" is the name of the cache
>> >> configured in ispn-cache-default-conf.xml. In the tests, the indexes for
>> >> the Movie entity are stored in this cache. Indexes for all other
>> >> entities are stored in the default HSInfinispanCache.
>> >>>
>> >>> Emmanuel
>> >>>
>> >>>
>> >>> Begin forwarded message:
>> >>>
>> >>> From: Łukasz Moreń <lukasz.moren at gmail.com>
>> >>> Date: 21 August 2009 02:11:03 CEST
>> >>> To: Emmanuel Bernard <emmanuel at hibernate.org>
>> >>> Subject: GSoC patch with Infinispan Directory Provider
>> >>> I'm sending the patch and a piece of documentation - not much, but the
>> >>> necessary information is included.
>> >>> There are some todos that I didn't manage to finish yet.
>> >>> I changed the maven jgroups dependency to 2.8.beta2; the previous
>> >>> version clashed with the one used by infinispan.
>> >>> In the pom file there was a dependency on hibernate common annotations
>> >>> 3.2.snapshot. Shouldn't it be 3.5?
>> >>> Cheers,
>> >>> Lukasz
>> >>>
>> >>>
>> >>>
>> >>
>> >> <GSoC2009_summary.pdf>_______________________________________________
>> >> infinispan-dev mailing list
>> >> infinispan-dev at lists.jboss.org
>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>
>> >> --
>> >> Manik Surtani
>> >> manik at jboss.org
>> >> Lead, Infinispan
>> >> Lead, JBoss Cache
>> >> http://www.infinispan.org
>> >> http://www.jbosscache.org
>> >>
>> >>
>> >>
>> >
>> >
>> >
>
>