For monitoring, I would be interested in anything you may already have; going through
logs at this scale is tedious. I created a monitor web page that shows the following for each node:
Node name, coordinator, transport cluster name, first heard, last heard,
entries, and evictions.
This tells me right away how many nodes are up and whether they are all using the same
coordinator. I am running Infinispan in embedded mode, so it’s my own code that
interrogates each cache and reports the information to the monitor server. It has
been pretty handy because we can see which nodes either have not started or have not
joined the single cluster.
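To make the idea concrete, here is a rough sketch of the bookkeeping a monitor server like this might do. It is not my actual code and it uses no Infinispan calls: ClusterMonitor, NodeReport, accept, coordinators and missing are all hypothetical names, and it assumes each node periodically sends its name, coordinator, cluster name, entry count and eviction count.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical monitor-side bookkeeping; none of these names are Infinispan APIs.
class ClusterMonitor {

    /** Latest report from one node, with the fields shown on the monitor page. */
    static final class NodeReport {
        final String nodeName, coordinator, clusterName;
        final long firstHeardMillis, lastHeardMillis, entries, evictions;

        NodeReport(String nodeName, String coordinator, String clusterName,
                   long firstHeardMillis, long lastHeardMillis,
                   long entries, long evictions) {
            this.nodeName = nodeName;
            this.coordinator = coordinator;
            this.clusterName = clusterName;
            this.firstHeardMillis = firstHeardMillis;
            this.lastHeardMillis = lastHeardMillis;
            this.entries = entries;
            this.evictions = evictions;
        }
    }

    // Keyed by node name; a TreeMap keeps the page in a stable order.
    private final Map<String, NodeReport> latest = new TreeMap<>();

    /** Stores a node's report, preserving the time it was first heard from. */
    public void accept(String nodeName, String coordinator, String clusterName,
                       long nowMillis, long entries, long evictions) {
        NodeReport previous = latest.get(nodeName);
        long firstHeard = (previous == null) ? nowMillis : previous.firstHeardMillis;
        latest.put(nodeName, new NodeReport(nodeName, coordinator, clusterName,
                firstHeard, nowMillis, entries, evictions));
    }

    /** Returns the latest report for a node, or null if it never reported. */
    public NodeReport reportFor(String nodeName) {
        return latest.get(nodeName);
    }

    /** Distinct coordinators reported; more than one suggests separate clusters. */
    public Set<String> coordinators() {
        Set<String> result = new TreeSet<>();
        for (NodeReport report : latest.values()) {
            result.add(report.coordinator);
        }
        return result;
    }

    /** Expected nodes that have never reported, i.e. probably not started. */
    public Set<String> missing(Collection<String> expectedNodes) {
        Set<String> result = new TreeSet<>(expectedNodes);
        result.removeAll(latest.keySet());
        return result;
    }

    public static void main(String[] args) {
        ClusterMonitor monitor = new ClusterMonitor();
        monitor.accept("node-a", "node-a", "demo-cluster", 1_000L, 120, 3);
        monitor.accept("node-b", "node-a", "demo-cluster", 1_500L, 118, 0);
        // node-c names itself as coordinator, so it formed its own cluster.
        monitor.accept("node-c", "node-c", "demo-cluster", 2_000L, 40, 0);
        System.out.println("coordinators: " + monitor.coordinators());
        System.out.println("missing: " + monitor.missing(
                Arrays.asList("node-a", "node-b", "node-c", "node-d")));
    }
}
```

The two checks map onto the two things the page is useful for: a coordinator set with more than one member means the nodes have not formed one cluster, and a non-empty missing set means some nodes have not started.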
Do you have a number for the JIRA issue?
Dave Marion
From: manik(a)jboss.org
Date: Sat, 19 Mar 2011 10:07:32 +0000
To: infinispan-dev(a)lists.jboss.org
Subject: Re: [infinispan-dev] Infinispan Large Scale support
On 18 Mar 2011, at 21:35, Dave wrote:
I won’t be able to get CR4 uploaded; policy dictates that I wait until the final release.
However, I was able to get 431 nodes up and running as a replicated cluster and 115 nodes
up as a distributed cluster. For the 430 node cache, I was able to get it started with no
problems about 50% of the time. When they formed multiple clusters they merged together
only some of the time. It really does appear to be a startup issue at this point. We have
not pushed it hard enough yet to see what happens at this scale under load.
Any idea when CR4 will be FINAL?
Hopefully some time next week.
I have documented the system property on the JIRA.
Are there any tools to help diagnose problems / performance at this scale (I ended up
writing my own monitor program)?
Specifically what tools are you after?
Cheers
Manik
--
Manik Surtani
manik(a)jboss.org
twitter.com/maniksurtani
Lead, Infinispan
http://www.infinispan.org