[infinispan-dev] Infinispan Large Scale support

david marion dlmarion at hotmail.com
Tue Apr 5 08:51:44 EDT 2011


Bela,
 
  Yes, it is a replicated cache, and I used your udp-largecluster.xml file with only slight modifications. The distributed cache does appear to be deadlocked (or there is a race condition): the coordinator comes up, but the other caches do not; they just sit there and wait. I was able to get a distributed cache up and running on 100+ nodes, but now I cannot get 5 of them running. 
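
  To make the symptom in the jstack output quoted below easier to reason about, here is a minimal sketch in plain java.util.concurrent. It is not Infinispan code: the class JoinHangSketch, the method runJoin and both latches are hypothetical stand-ins for the two waits seen in the stack traces (the getCache() caller waiting for the join to complete, and the Rehasher thread waiting until no joins are in progress). It assumes that the Rehasher's gate is simply never opened; the real cause may well be different.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the hang; none of these names exist in Infinispan.
class JoinHangSketch {

    // Returns true if the simulated join completed within the timeout.
    static boolean runJoin(boolean releaseJoinGate, long timeoutMillis) {
        // Stand-in for the latch the Rehasher thread blocks on.
        CountDownLatch noJoinsInProgress = new CountDownLatch(1);
        // Stand-in for the latch the getCache() caller blocks on.
        CountDownLatch joinComplete = new CountDownLatch(1);

        Thread rehasher = new Thread(() -> {
            try {
                // The rehash can only proceed once the gate is opened.
                if (noJoinsInProgress.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    joinComplete.countDown();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Rehasher");
        rehasher.start();

        if (releaseJoinGate) {
            noJoinsInProgress.countDown();
        }
        try {
            // A timed wait turns a silent hang into a visible failure.
            boolean joined = joinComplete.await(timeoutMillis, TimeUnit.MILLISECONDS);
            rehasher.join();
            return joined;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("gate released:       joined=" + runJoin(true, 2000));
        System.out.println("gate never released: joined=" + runJoin(false, 500));
    }
}
```

  With the gate released the caller returns promptly; with the gate left closed both threads wait out their timeouts, which is the same shape as the two blocked threads in the dump. An untimed await() in either place would block forever instead.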
 
> Date: Tue, 5 Apr 2011 11:09:54 +0200
> From: bban at redhat.com
> To: infinispan-dev at lists.jboss.org
> Subject: Re: [infinispan-dev] Infinispan Large Scale support
> 
> 
> 
> On 4/4/11 5:45 PM, david marion wrote:
> >
> >
> > Good news! I was able to use the system property from ISPN-83 and remove the FLUSH from the jgroups config with 4.2.1.FINAL, and start-up times are much much better. We have a replicated cache on about 420+ nodes up in under 2 minutes.
> 
> 
> Great ! Just to confirm: this is 420+ Infinispan instances, with 
> replication enabled, correct ?
> 
> Did you use a specific JGroups config (e.g. udp-largecluster.xml) ?
> 
> 
> > I am seeing an issue with the distributed cache though with as little as 5 nodes.
> >
> > In the coordinator log I see
> >
> > org.infinispan.distribution.DistributionManagerImpl: Detected a view change. Member list changed.......
> > org.infinispan.distribution.DistributionManagerImpl: This is a JOIN event! Wait for notification from new joiner<name>
> >
> > In the log from the joining node I see:
> >
> > org.infinispan.distribution.JoinTask: Commencing rehash on node:<name>. Before start, distributionManager.joinComplete=false
> > org.infinispan.distribution.JoinTask: Requesting old consistent hash from coordinator
> >
> > I jstack'd the joiner: the DefaultCacheManager.getCache() method is waiting on org.infinispan.distribution.DistributionManagerImpl.waitForJoinToComplete() and the Rehasher thread
> > is waiting on:
> >
> > at org.infinispan.util.concurrent.ReclosableLatch.await(ReclosableLatch.java:75)
> > at org.infinispan.remoting.transport.jgroups.JGroupsDistSync.blockUntilNoJoinsInProgress(JGroupsDistSync.java:113)
> >
> > Any thoughts?
> 
> 
> I recently took a look at the distribution code, and this part is very 
> brittle with respect to parallel startup and merging. Plus, I believe 
> the (blocking) RPC to fetch the old CH from the coordinator might 
> deadlock in certain cases...
> 
> I've got a pull request for a push based rebalancing versus pull based 
> rebalancing pending. It'll likely make it into 5.x, as a matter of fact 
> I've got a chat about this this afternoon.
> 
> 
> 
> 
> >> Date: Wed, 23 Mar 2011 15:58:19 +0100
> >> From: bban at redhat.com
> >> To: infinispan-dev at lists.jboss.org
> >> Subject: Re: [infinispan-dev] Infinispan Large Scale support
> >>
> >>
> >>
> >> On 3/23/11 2:39 PM, david marion wrote:
> >>>
> >>> Bela,
> >>>
> >>> Is there a way to start up the JGroups stack on every node without using Infinispan?
> >>
> >>
> >> You could use ViewDemo [1] or Draw. Or write your own small test
> >> program; if you take a look at ViewDemo's src, you'll see that it's only
> >> a page of code.
> >>
> >>
> >>> Is there some functional test that I can run or something? I know I can't remove the FLUSH from Infinispan until 5.0.0 and I don't know if I can upgrade the underlying
> >>> JGroups jar.
> >>
> >>
> >> I suggest testing with the latest JGroups (2.12.0) and +FLUSH and -FLUSH.
> >> The +FLUSH config should be less painful now, with the introduction of
> >> view bundling: we need to run flush fewer times than before.
> >>
> >>
> >> [1] http://community.jboss.org/wiki/TestingJBoss
> >>
> >> --
> >> Bela Ban
> >> Lead JGroups / Clustering Team
> >> JBoss
> >> _______________________________________________
> >> infinispan-dev mailing list
> >> infinispan-dev at lists.jboss.org
> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> > 
> >
> >
> >
> 
> -- 
> Bela Ban
> Lead JGroups / Clustering Team
> JBoss
 		 	   		  