<div dir="ltr">Thanks Sebastian. Is there a JIRA for this already?<br></div><div class="gmail_extra"><br><div class="gmail_quote">2018-03-27 10:03 GMT+02:00 Sebastian Laskawiec <span dir="ltr"><<a href="mailto:slaskawi@redhat.com" target="_blank">slaskawi@redhat.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">At the moment, the cluster health status checker enumerates all caches in the cache manager [1] and checks whether those cashes are running and not in degraded more [2].<div><br></div><div>I'm not sure how counter caches have been implemented. One thing is for sure - they should be taken into account in this loop [3].<br><div><br></div><div>[1] <a href="https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L22" target="_blank">https://github.com/<wbr>infinispan/infinispan/blob/<wbr>master/core/src/main/java/org/<wbr>infinispan/health/impl/<wbr>ClusterHealthImpl.java#L22</a></div><div>[2] <a href="https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/CacheHealthImpl.java#L25" target="_blank">https://github.com/<wbr>infinispan/infinispan/blob/<wbr>master/core/src/main/java/org/<wbr>infinispan/health/impl/<wbr>CacheHealthImpl.java#L25</a></div></div><div>[3] <a href="https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L23-L24" target="_blank">https://github.com/<wbr>infinispan/infinispan/blob/<wbr>master/core/src/main/java/org/<wbr>infinispan/health/impl/<wbr>ClusterHealthImpl.java#L23-L24</a></div></div><div class="HOEnZb"><div class="h5"><br><div class="gmail_quote"><div dir="ltr">On Mon, Mar 26, 2018 at 1:59 PM Thomas SEGISMONT <<a href="mailto:tsegismont@gmail.com" target="_blank">tsegismont@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 
.8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">2018-03-26 13:16 GMT+02:00 Pedro Ruivo <span dir="ltr"><<a href="mailto:pedro@infinispan.org" target="_blank">pedro@infinispan.org</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span><br>
<br>
On 23-03-2018 15:06, Thomas SEGISMONT wrote:<br>
> Hi Pedro,<br>
><br>
> 2018-03-23 13:25 GMT+01:00 Pedro Ruivo <<a href="mailto:pedro@infinispan.org" target="_blank">pedro@infinispan.org</a><br>
</span>> <mailto:<a href="mailto:pedro@infinispan.org" target="_blank">pedro@infinispan.org</a>>><wbr>:<br>
<span>><br>
> Hi Thomas,<br>
><br>
> Is the test in question using any counter/lock?<br>
><br>
><br>
> I have seen the problem on a test for counters, on another one for<br>
> locks, as well as with caches only.<br>
> But Vert.x starts the ClusteredLockManager and the CounterManager in all<br>
> cases (even if no lock/counter is created/used)<br>
><br>
><br>
> I did see similar behavior with the counters in our server test suite.<br>
> The partition handling makes the cache degraded because nodes are<br>
> starting and stopping concurrently.<br>
><br>
><br>
> As for me, I was able to observe the problem even when stopping nodes one<br>
> after the other and waiting for the cluster to go back to HEALTHY status.<br>
> Is it possible that the status of the counter and lock caches is not<br>
> taken into account in the cluster health?<br>
<br>
</span>The counter and lock caches are private. So, they aren't included in the cluster<br>
health, nor are their names returned by the getCacheNames() method.<br></blockquote><div><br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Thanks for the details.<br><br></div><div>I'm not concerned with these internal caches not being listed when calling getCacheNames.<br><br></div><div>However, the cluster health should include their status as well.<br></div><div>Checking the cluster status is the recommended way to implement readiness checks on Kubernetes, for example.<br><br></div><div>What do you think, Sebastian?<br></div></div></div></div><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div class="m_-2735177905698863571m_-1628290978142246603h5"><br>
><br>
><br>
> I'm not sure if there is a JIRA tracking this. Ryan, Dan, do you know?<br>
> If there is none, it should be created.<br>
><br>
> I improved the counters by making the cache start lazily when you first<br>
> get or define a counter [1]. This workaround solved the issue for us.<br>
><br>
> As a workaround for your test suite, I suggest making sure the caches<br>
> (___counter_configuration and org.infinispan.LOCKS) have finished their<br>
> state transfer before stopping the cache managers, by invoking<br>
> DefaultCacheManager.getCache(<wbr>*cache-name*) in all the cache managers.<br>
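[Editor's note] A rough sketch of the workaround Pedro describes. The `CacheManager` interface below is a stub standing in for Infinispan's `EmbeddedCacheManager` (only the two calls the workaround needs), so the snippet is self-contained; the internal cache names are the ones mentioned in this thread.

```java
import java.util.List;

public class DrainInternalCaches {

    // Stub for org.infinispan.manager.EmbeddedCacheManager; in real code,
    // getCache(name) blocks until the named cache has started.
    interface CacheManager {
        void getCache(String name);
        void stop();
    }

    // Internal cache names mentioned in this thread.
    static final String[] INTERNAL_CACHES =
            { "___counter_configuration", "org.infinispan.LOCKS" };

    // Touch the internal caches on every manager first, so their state
    // transfer has finished before any manager is stopped.
    static void stopAll(List<CacheManager> managers) {
        for (CacheManager manager : managers) {
            for (String name : INTERNAL_CACHES) {
                manager.getCache(name);
            }
        }
        for (CacheManager manager : managers) {
            manager.stop();
        }
    }
}
```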
><br>
> Sorry for the inconvenience and the delay in replying.<br>
><br>
><br>
> No problem.<br>
><br>
><br>
> Cheers,<br>
> Pedro<br>
><br>
> [1] <a href="https://issues.jboss.org/browse/ISPN-8860" rel="noreferrer" target="_blank">https://issues.jboss.org/<wbr>browse/ISPN-8860</a><br>
> <<a href="https://issues.jboss.org/browse/ISPN-8860" rel="noreferrer" target="_blank">https://issues.jboss.org/<wbr>browse/ISPN-8860</a>><br>
><br>
> On 21-03-2018 16:16, Thomas SEGISMONT wrote:<br>
> > Hi everyone,<br>
> ><br>
> > I am working on integrating Infinispan 9.2.Final in vertx-infinispan.<br>
> > Before merging, I wanted to make sure the test suite passed, but it<br>
> > doesn't. It's not always the same test that fails.<br>
> ><br>
> > In the logs, I see a lot of messages like "After merge (or coordinator<br>
> > change), cache still hasn't recovered a majority of members and must<br>
> > stay in degraded mode."<br>
> > The caches involved are "___counter_configuration" and<br>
> > "org.infinispan.LOCKS".<br>
> ><br>
> > Most often it's harmless but, sometimes, I also see this exception:<br>
> > "ISPN000210: Failed to request state of cache"<br>
> > Again, the cache involved is either "___counter_configuration" or<br>
> > "org.infinispan.LOCKS".<br>
> > After this exception, the cache manager is unable to stop. It blocks in<br>
> > the "terminate" method (joining on the cache future).<br>
> ><br>
> > I thought the test suite was too rough (we stop all nodes at the same<br>
> > time). So I changed it to make sure that:<br>
> > - nodes start one after the other<br>
> > - a new node is started only when the previous one indicates HEALTHY status<br>
> > - nodes stop one after the other<br>
> > - a node is stopped only when it indicates HEALTHY status<br>
> > Pretty much what we do on Kubernetes for the readiness check, actually.<br>
> > But it didn't get any better.<br>
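[Editor's note] The start/stop sequencing described above boils down to a polling loop like the one below. The health check itself is abstracted as a `Supplier<String>`; the real call would use Infinispan's Health API discussed in this thread (something along the lines of `cacheManager.getHealth().getClusterHealth().getHealthStatus()` — treat that exact call chain as an assumption).

```java
import java.util.function.Supplier;

public class AwaitHealthy {

    // Poll the given health status supplier until it reports "HEALTHY" or
    // the timeout expires. Returns true if HEALTHY was observed in time.
    static boolean awaitHealthy(Supplier<String> healthStatus,
                                long timeoutMs, long pollIntervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        do {
            if ("HEALTHY".equals(healthStatus.get())) {
                return true;
            }
            Thread.sleep(pollIntervalMs);
        } while (System.currentTimeMillis() < deadline);
        return false;
    }
}
```

A node would only be started or stopped after `awaitHealthy` returns true for the previous one.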
> ><br>
> > Attached are the logs of such a failing test.<br>
> ><br>
> > Note that the Vert.x test itself does not fail; it's only when closing<br>
> > nodes that we have issues.<br>
> ><br>
> > Here's our XML config:<br>
> ><br>
> <a href="https://github.com/vert-x3/vertx-infinispan/blob/ispn92/src/main/resources/default-infinispan.xml" rel="noreferrer" target="_blank">https://github.com/vert-x3/<wbr>vertx-infinispan/blob/ispn92/<wbr>src/main/resources/default-<wbr>infinispan.xml</a><br>
> <<a href="https://github.com/vert-x3/vertx-infinispan/blob/ispn92/src/main/resources/default-infinispan.xml" rel="noreferrer" target="_blank">https://github.com/vert-x3/<wbr>vertx-infinispan/blob/ispn92/<wbr>src/main/resources/default-<wbr>infinispan.xml</a>><br>
> ><br>
> > Does that ring a bell? Do you need more info?<br>
> ><br>
> > Regards,<br>
> > Thomas<br>
> ><br>
> ><br>
> ><br>
> > ______________________________<wbr>_________________<br>
> > infinispan-dev mailing list<br>
> > <a href="mailto:infinispan-dev@lists.jboss.org" target="_blank">infinispan-dev@lists.jboss.org</a><br>
</div></div>> <mailto:<a href="mailto:infinispan-dev@lists.jboss.org" target="_blank">infinispan-dev@lists.<wbr>jboss.org</a>><br>
> > <a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" rel="noreferrer" target="_blank">https://lists.jboss.org/<wbr>mailman/listinfo/infinispan-<wbr>dev</a><br>
<span>> <<a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" rel="noreferrer" target="_blank">https://lists.jboss.org/<wbr>mailman/listinfo/infinispan-<wbr>dev</a>><br>
> ><br>
</span><div class="m_-2735177905698863571m_-1628290978142246603HOEnZb"><div class="m_-2735177905698863571m_-1628290978142246603h5">
><br>
</div></div></blockquote></div></div></div>
</blockquote></div>
</div></div><br></blockquote></div><br></div>