<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Tahoma
}
--></style>
</head>
<body class='hmmessage'>
Bela,<BR>
<BR>
Yes, it is a replicated cache, and I used your udp-largecluster.xml file with only slight modifications. The distributed cache does appear to be deadlocked (or hitting a race condition): the coordinator comes up, but the other caches do not; they just sit there and wait. I was able to get a distributed cache up and running on 100+ nodes, but now I cannot get 5 of them running.<BR><BR>
> Date: Tue, 5 Apr 2011 11:09:54 +0200<BR>> From: bban@redhat.com<BR>> To: infinispan-dev@lists.jboss.org<BR>> Subject: Re: [infinispan-dev] Infinispan Large Scale support<BR>> <BR>> <BR>> <BR>> On 4/4/11 5:45 PM, david marion wrote:<BR>> ><BR>> ><BR>> > Good news! I was able to use the system property from ISPN-83 and remove the FLUSH from the jgroups config with 4.2.1.FINAL, and start-up times are much, much better. We have a replicated cache on about 420+ nodes up in under 2 minutes.<BR>> <BR>> <BR>> Great! Just to confirm: this is 420+ Infinispan instances, with <BR>> replication enabled, correct?<BR>> <BR>> Did you use a specific JGroups config (e.g. udp-largecluster.xml)?<BR>> <BR>> <BR>> > I am seeing an issue with the distributed cache though with as few as 5 nodes.<BR>> ><BR>> > In the coordinator log I see:<BR>> ><BR>> > org.infinispan.distribution.DistributionManagerImpl: Detected a view change. Member list changed.......<BR>> > org.infinispan.distribution.DistributionManagerImpl: This is a JOIN event! Wait for notification from new joiner&lt;name&gt;<BR>> ><BR>> > In the log from the joining node I see:<BR>> ><BR>> > org.infinispan.distribution.JoinTask: Commencing rehash on node:&lt;name&gt;. Before start, distributionManager.joinComplete=false<BR>> > org.infinispan.distribution.JoinTask: Requesting old consistent hash from coordinator<BR>> ><BR>> > I jstack'd the joiner: the DefaultCacheManager.getCache() method is waiting on org.infinispan.distribution.DistributionManagerImpl.waitForJoinToComplete() and the Rehasher thread<BR>> > is waiting on:<BR>> ><BR>> > at org.infinispan.util.concurrent.ReclosableLatch.await(ReclosableLatch.java:75)<BR>> > at org.infinispan.remoting.transport.jgroups.JGroupsDistSync.blockUntilNoJoinsInProgress(JGroupsDistSync.java:113)<BR>> ><BR>> > Any thoughts?<BR>> <BR>> <BR>> I recently took a look at the distribution code, and this part is very <BR>> brittle with respect to parallel startup and merging. 
Plus, I believe <BR>> the (blocking) RPC to fetch the old CH from the coordinator might <BR>> deadlock in certain cases...<BR>> <BR>> I've got a pull request for push-based rebalancing versus pull-based <BR>> rebalancing pending. It'll likely make it into 5.x; as a matter of fact, <BR>> I've got a chat about this this afternoon.<BR>> <BR>> <BR>> <BR>> <BR>> >> Date: Wed, 23 Mar 2011 15:58:19 +0100<BR>> >> From: bban@redhat.com<BR>> >> To: infinispan-dev@lists.jboss.org<BR>> >> Subject: Re: [infinispan-dev] Infinispan Large Scale support<BR>> >><BR>> >><BR>> >><BR>> >> On 3/23/11 2:39 PM, david marion wrote:<BR>> >>><BR>> >>> Bela,<BR>> >>><BR>> >>> Is there a way to start up the JGroups stack on every node without using Infinispan?<BR>> >><BR>> >><BR>> >> You could use ViewDemo [1] or Draw. Or write your own small test<BR>> >> program; if you take a look at ViewDemo's src, you'll see that it's only<BR>> >> a page of code.<BR>> >><BR>> >><BR>> >>> Is there some functional test that I can run or something? 
I know I can't remove the FLUSH from Infinispan until 5.0.0, and I don't know if I can upgrade the underlying<BR>> >>> JGroups jar.<BR>> >><BR>> >><BR>> >> I suggest testing with the latest JGroups (2.12.0) and +FLUSH and -FLUSH.<BR>> >> The +FLUSH config should be less painful now, with the introduction of<BR>> >> view bundling: we need to run flush fewer times than before.<BR>> >><BR>> >><BR>> >> [1] http://community.jboss.org/wiki/TestingJBoss<BR>> >><BR>> >> --<BR>> >> Bela Ban<BR>> >> Lead JGroups / Clustering Team<BR>> >> JBoss<BR>> >> _______________________________________________<BR>> >> infinispan-dev mailing list<BR>> >> infinispan-dev@lists.jboss.org<BR>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev<BR>> ><BR>> <BR>> -- <BR>> Bela Ban<BR>> Lead JGroups / Clustering Team<BR>> JBoss<BR></body>
</html>