<div dir="ltr">Hi Pedro,<br><div><div class="gmail_extra"><br><div class="gmail_quote">2018-03-23 13:25 GMT+01:00 Pedro Ruivo <span dir="ltr">&lt;<a href="mailto:pedro@infinispan.org" target="_blank">pedro@infinispan.org</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Thomas,<br>

<br>

Is the test in question using any counter/lock?<br></blockquote><div><br></div><div>I have seen the problem on a test for counters, on another one for locks, as well as on tests using caches only.<br></div><div>But Vert.x starts the ClusteredLockManager and the CounterManager in all cases (even if no lock/counter is created or used).<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

I did see similar behavior with the counters in our server test suite.<br>

The partition handling makes the cache degraded because nodes are<br>

starting and stopping concurrently.<br></blockquote><div><br></div><div>For my part, I was able to observe the problem even when stopping nodes one after the other and waiting for the cluster to go back to HEALTHY status.<br></div><div>Is it possible that the status of the counter and lock caches is not taken into account in cluster health?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

I&#39;m not sure if there is a JIRA issue tracking this. Ryan, Dan, do you know?<br>

If there is none, it should be created.<br>

<br>

I improved the counters by making the cache start lazily when you first<br>

get or define a counter [1]. This workaround solved the issue for us.<br>

<br>

As a workaround for your test suite, I suggest making sure the caches<br>

(___counter_configuration and org.infinispan.LOCKS) have finished their<br>

state transfer before stopping the cache managers, by invoking<br>

DefaultCacheManager.getCache(*<wbr>cache-name*) on all the cache managers.<br>

<br>
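A minimal sketch of the workaround described above, for illustration only. The class name InternalCacheWarmup, the CacheLookup interface and the method touchInternalCaches are hypothetical and not part of Infinispan; each CacheLookup stands in for one cache manager&#39;s getCache(String) call. The lock cache name is spelled as it appears in the log messages quoted further down (org.infinispan.LOCKS).<br>

```java
public class InternalCacheWarmup {

    /** Hypothetical stand-in for one cache manager's getCache(String) method. */
    interface CacheLookup {
        Object getCache(String cacheName);
    }

    // Internal cache names as they appear in this thread's log messages.
    static final String[] INTERNAL_CACHES = {
        "___counter_configuration",
        "org.infinispan.LOCKS"
    };

    /**
     * Requests every internal cache once on every cache manager, so that each
     * cache has been started (and, per the suggestion above, has finished its
     * state transfer) before any manager is stopped.
     * Returns the number of lookups performed.
     */
    static int touchInternalCaches(CacheLookup... lookups) {
        int touched = 0;
        for (CacheLookup lookup : lookups) {
            for (String name : INTERNAL_CACHES) {
                // Defensive check for this sketch only: fail fast on a missing cache.
                if (lookup.getCache(name) == null) {
                    throw new IllegalStateException("Cache not available: " + name);
                }
                touched++;
            }
        }
        return touched;
    }
}
```

In a test teardown, one lookup per cache manager would be passed in, for example a lambda such as name -&gt; manager.getCache(name), and the managers would be stopped only after the call returns.<br>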

Sorry for the inconvenience and the delay in replying.<br></blockquote><div><br></div><div>No problem.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Cheers,<br>

Pedro<br>

<br>

[1] <a href="https://issues.jboss.org/browse/ISPN-8860" rel="noreferrer" target="_blank">https://issues.jboss.org/<wbr>browse/ISPN-8860</a><br>

<div><div class="gmail-h5"><br>

On 21-03-2018 16:16, Thomas SEGISMONT wrote:<br>

&gt; Hi everyone,<br>

&gt;<br>

&gt; I am working on integrating Infinispan 9.2.Final in vertx-infinispan.<br>

&gt; Before merging I wanted to make sure the test suite passed but it<br>

&gt; doesn&#39;t. It&#39;s not always the same test involved.<br>

&gt;<br>

&gt; In the logs, I see a lot of messages like &quot;After merge (or coordinator<br>

&gt; change), cache still hasn&#39;t recovered a majority of members and must<br>

&gt; stay in degraded mode.&quot;<br>

&gt; The caches involved are &quot;___counter_configuration&quot; and<br>

&gt; &quot;org.infinispan.LOCKS&quot;<br>

&gt;<br>

&gt; Most often it&#39;s harmless but, sometimes, I also see this exception<br>

&gt; &quot;ISPN000210: Failed to request state of cache&quot;<br>

&gt; Again the cache involved is either &quot;___counter_configuration&quot; or<br>

&gt; &quot;org.infinispan.LOCKS&quot;<br>

&gt; After this exception, the cache manager is unable to stop. It blocks in<br>

&gt; method &quot;terminate&quot; (join on cache future).<br>

&gt;<br>

&gt; I thought the test suite was too rough (we stop all nodes at the same<br>

&gt; time). So I changed it to make sure that:<br>

&gt; - nodes start one after the other<br>

&gt; - a new node is started only when the previous one indicates HEALTHY status<br>

&gt; - nodes stop one after the other<br>

&gt; - a node is stopped only when it indicates HEALTHY status<br>

&gt; Pretty much what we do on Kubernetes for the readiness check actually.<br>

&gt; But it didn&#39;t get any better.<br>

&gt;<br>

&gt; Attached are the logs of such a failing test.<br>

&gt;<br>

&gt; Note that the Vert.x test itself does not fail, it&#39;s only when closing<br>

&gt; nodes that we have issues.<br>

&gt;<br>

&gt; Here&#39;s our XML config:<br>

&gt; <a href="https://github.com/vert-x3/vertx-infinispan/blob/ispn92/src/main/resources/default-infinispan.xml" rel="noreferrer" target="_blank">https://github.com/vert-x3/<wbr>vertx-infinispan/blob/ispn92/<wbr>src/main/resources/default-<wbr>infinispan.xml</a><br>

&gt;<br>

&gt; Does that ring a bell? Do you need more info?<br>

&gt;<br>

&gt; Regards,<br>

&gt; Thomas<br>

&gt;<br>

&gt;<br>

&gt;<br>

</div></div>&gt; ______________________________<wbr>_________________<br>

&gt; infinispan-dev mailing list<br>

&gt; <a href="mailto:infinispan-dev@lists.jboss.org">infinispan-dev@lists.jboss.org</a><br>

&gt; <a href="https://lists.jboss.org/mailman/listinfo/infinispan-dev" rel="noreferrer" target="_blank">https://lists.jboss.org/<wbr>mailman/listinfo/infinispan-<wbr>dev</a><br>

&gt;<br>


</blockquote></div><br></div></div></div>