[infinispan-dev] 9.2 EmbeddedCacheManager blocked at shutdown

Pedro Ruivo pedro at infinispan.org
Fri Mar 23 08:25:34 EDT 2018


Hi Thomas,

Is the test in question using any counter/lock?

I've seen similar behavior with the counters in our server test suite. 
Partition handling puts the cache in degraded mode because nodes are 
starting and stopping concurrently.

I'm not sure if there is a JIRA tracking this. Ryan, Dan, do you know?
If there is none, one should be created.

I improved the counters by making their cache start lazily, on the first 
get or define of a counter [1]. This workaround solved the issue for us.

As a workaround for your test suite, I suggest making sure the caches 
(___counter_configuration and org.infinispan.LOCKS) have finished their 
state transfer before stopping the cache managers, by invoking 
DefaultCacheManager.getCache(*cache-name*) on all the cache managers.
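In code, the workaround could look something like this (a minimal sketch, 
not tested against your suite; it assumes Infinispan 9.2's 
EmbeddedCacheManager API and a hypothetical helper that receives the 
list of started managers):

```java
import java.util.List;

import org.infinispan.manager.EmbeddedCacheManager;

public class OrderedShutdown {

   /**
    * Touch the internal counter/lock caches on every manager so their
    * state transfer completes, then stop the managers one at a time.
    */
   static void stopAll(List<EmbeddedCacheManager> cacheManagers) {
      for (EmbeddedCacheManager cm : cacheManagers) {
         // getCache() blocks until the named cache has started,
         // which includes joining the cluster and receiving state.
         cm.getCache("___counter_configuration");
         cm.getCache("org.infinispan.LOCKS");
      }
      // Only once all caches are up do we start stopping managers.
      for (EmbeddedCacheManager cm : cacheManagers) {
         cm.stop();
      }
   }
}
```

The point is to avoid stopping a manager while one of those internal 
caches is still mid-rebalance on another node.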

Sorry for the inconvenience and the delay in replying.

Cheers,
Pedro

[1] https://issues.jboss.org/browse/ISPN-8860

On 21-03-2018 16:16, Thomas SEGISMONT wrote:
> Hi everyone,
> 
> I am working on integrating Infinispan 9.2.Final in vertx-infinispan. 
> Before merging I wanted to make sure the test suite passed but it 
> doesn't. It's not always the same test that fails.
> 
> In the logs, I see a lot of messages like "After merge (or coordinator 
> change), cache still hasn't recovered a majority of members and must 
> stay in degraded mode."
> The caches involved are "___counter_configuration" and 
> "org.infinispan.LOCKS".
> 
> Most often it's harmless but, sometimes, I also see this exception: 
> "ISPN000210: Failed to request state of cache"
> Again, the cache involved is either "___counter_configuration" or 
> "org.infinispan.LOCKS".
> After this exception, the cache manager is unable to stop. It blocks in 
> the "terminate" method (joining on the cache future).
> 
> I thought the test suite was too rough (we stopped all nodes at the 
> same time), so I changed it to make sure that:
> - nodes start one after the other
> - a new node is started only when the previous one indicates HEALTHY status
> - nodes stop one after the other
> - a node is stopped only when it indicates HEALTHY status
> Pretty much what we do on Kubernetes for the readiness check actually.
> But it didn't get any better.
> 
> Attached are the logs of such a failing test.
> 
> Note that the Vert.x test itself does not fail, it's only when closing 
> nodes that we have issues.
> 
> Here's our XML config: 
> https://github.com/vert-x3/vertx-infinispan/blob/ispn92/src/main/resources/default-infinispan.xml
> 
> Does that ring a bell? Do you need more info?
> 
> Regards,
> Thomas
> 
> 
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
