[infinispan-dev] 9.2 EmbeddedCacheManager blocked at shutdown
Pedro Ruivo
pedro at infinispan.org
Tue Mar 27 05:08:21 EDT 2018
On 27-03-2018 09:03, Sebastian Laskawiec wrote:
> At the moment, the cluster health status checker enumerates all caches
> in the cache manager [1] and checks whether those caches are running and
> not in degraded mode [2].
>
> I'm not sure how counter caches have been implemented. One thing is for
> sure - they should be taken into account in this loop [3].
The private caches aren't listed by CacheManager.getCacheNames(). We
have to check them via InternalCacheRegistry.getInternalCacheNames().
I'll open a JIRA if you don't mind :)
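For illustration, a minimal sketch (not the actual ClusterHealthImpl code) of how the loop could also cover the private caches; it assumes the InternalCacheRegistry component can be looked up through the cache manager's global component registry in 9.x:

import java.util.HashSet;
import java.util.Set;

import org.infinispan.manager.EmbeddedCacheManager;
import org.infinispan.registry.InternalCacheRegistry;

public class CacheNamesForHealthCheck {

   // Sketch only: collect the public cache names plus the internal/private
   // ones (e.g. ___counter_configuration, org.infinispan.LOCKS) so the
   // health status loop can inspect all of them.
   static Set<String> cachesToCheck(EmbeddedCacheManager cacheManager) {
      Set<String> names = new HashSet<>(cacheManager.getCacheNames());
      InternalCacheRegistry internalCaches = cacheManager.getGlobalComponentRegistry()
            .getComponent(InternalCacheRegistry.class);
      if (internalCaches != null) {
         names.addAll(internalCaches.getInternalCacheNames());
      }
      return names;
   }
}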
>
> [1]
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L22
> [2]
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/CacheHealthImpl.java#L25
> [3]
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L23-L24
>
> On Mon, Mar 26, 2018 at 1:59 PM Thomas SEGISMONT <tsegismont at gmail.com> wrote:
>
>     2018-03-26 13:16 GMT+02:00 Pedro Ruivo <pedro at infinispan.org>:
>
>
>
> On 23-03-2018 15:06, Thomas SEGISMONT wrote:
> > Hi Pedro,
> >
> > 2018-03-23 13:25 GMT+01:00 Pedro Ruivo <pedro at infinispan.org>:
> >
> > Hi Thomas,
> >
> > Is the test in question using any counter/lock?
> >
> >
> > I have seen the problem on a test for counters, on another one for
> > locks, as well as on tests using caches only.
> > But Vert.x starts the ClusteredLockManager and the CounterManager in all
> > cases (even if no lock/counter is created/used).
> >
> >
> > I did see similar behavior with the counters in our server test suite.
> > The partition handling makes the cache degraded because nodes are
> > starting and stopping concurrently.
> >
> >
> > As for me, I was able to observe the problem even when stopping nodes one
> > after the other and waiting for the cluster to go back to HEALTHY status.
> > Is it possible that the status of the counter and lock caches is not
> > taken into account in the cluster health?
>
> The counter and lock caches are private. So, they aren't included in the
> cluster health, nor are their names returned by the getCacheNames() method.
>
>
> Thanks for the details.
>
> I'm not concerned with these internal caches not being listed when
> calling getCacheNames.
>
> However, the cluster health status should include their status as well.
> Cluster status testing is the recommended way to implement readiness
> checks on Kubernetes, for example.
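(For reference, such a readiness check with the 9.x Health API could look roughly like the sketch below; the helper name and polling/timeout handling are purely illustrative.)

import org.infinispan.health.HealthStatus;
import org.infinispan.manager.EmbeddedCacheManager;

public final class ReadinessProbe {

   // Sketch: block until the cache manager reports a HEALTHY cluster, or
   // fail after the given timeout. Used between node starts/stops.
   static void awaitHealthy(EmbeddedCacheManager cacheManager, long timeoutMillis)
         throws InterruptedException {
      long deadline = System.currentTimeMillis() + timeoutMillis;
      while (cacheManager.getHealth().getClusterHealth().getHealthStatus() != HealthStatus.HEALTHY) {
         if (System.currentTimeMillis() > deadline) {
            throw new IllegalStateException("cluster did not become HEALTHY in time");
         }
         Thread.sleep(100);
      }
   }
}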
>
> What do you think Sebastian?
>
>
> >
> >
> > I'm not sure if there is any JIRA tracking this. Ryan, Dan, do you know?
> > If there is none, it should be created.
> >
> > I improved the counters by making the cache start lazily when you first
> > get or define a counter [1]. This workaround solved the issue for us.
> >
> > As a workaround for your test suite, I suggest making sure the caches
> > (___counter_configuration and org.infinispan.LOCKS) have finished their
> > state transfer before stopping the cache managers, by invoking
> > DefaultCacheManager.getCache(cache-name) in all the cache managers.
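(In code, that workaround could look roughly like this sketch; the class and method names are illustrative, and the cache names are the ones mentioned above.)

import org.infinispan.manager.DefaultCacheManager;

public final class ShutdownWorkaround {

   // Touch the counter and lock caches on every cache manager so their state
   // transfer has completed before the managers are stopped.
   static void warmInternalCaches(Iterable<DefaultCacheManager> cacheManagers) {
      for (DefaultCacheManager cacheManager : cacheManagers) {
         cacheManager.getCache("___counter_configuration");
         cacheManager.getCache("org.infinispan.LOCKS");
      }
   }
}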
> >
> > Sorry for the inconvenience and the delay in replying.
> >
> >
> > No problem.
> >
> >
> > Cheers,
> > Pedro
> >
> > [1] https://issues.jboss.org/browse/ISPN-8860
> >
> > On 21-03-2018 16:16, Thomas SEGISMONT wrote:
> > > Hi everyone,
> > >
> > > I am working on integrating Infinispan 9.2.Final in vertx-infinispan.
> > > Before merging I wanted to make sure the test suite passed, but it
> > > doesn't. It's not always the same test involved.
> > >
> > > In the logs, I see a lot of messages like "After merge (or coordinator
> > > change), cache still hasn't recovered a majority of members and must
> > > stay in degraded mode."
> > > The caches involved are "___counter_configuration" and
> > > "org.infinispan.LOCKS".
> > >
> > > Most often it's harmless but, sometimes, I also see this exception:
> > > "ISPN000210: Failed to request state of cache".
> > > Again the cache involved is either "___counter_configuration" or
> > > "org.infinispan.LOCKS".
> > > After this exception, the cache manager is unable to stop. It blocks in
> > > method "terminate" (join on cache future).
> > >
> > > I thought the test suite was too rough (we stop all nodes at the same
> > > time). So I changed it to make sure that:
> > > - nodes start one after the other
> > > - a new node is started only when the previous one indicates HEALTHY status
> > > - nodes stop one after the other
> > > - a node is stopped only when it indicates HEALTHY status
> > > Pretty much what we do on Kubernetes for the readiness check, actually.
> > > But it didn't get any better.
> > >
> > > Attached are the logs of such a failing test.
> > >
> > > Note that the Vert.x test itself does not fail, it's only when closing
> > > nodes that we have issues.
> > >
> > > Here's our XML config:
> > >
> > > https://github.com/vert-x3/vertx-infinispan/blob/ispn92/src/main/resources/default-infinispan.xml
> > >
> > > Does that ring a bell? Do you need more info?
> > >
> > > Regards,
> > > Thomas