Thanks Sebastian. Is there a JIRA for this already?
2018-03-27 10:03 GMT+02:00 Sebastian Laskawiec <slaskawi(a)redhat.com>:
At the moment, the cluster health status checker enumerates all caches in
the cache manager [1] and checks whether those caches are running and not
in degraded mode [2].
I'm not sure how the counter caches have been implemented. One thing is
for sure - they should be taken into account in this loop [3].
[1] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L22
[2] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/CacheHealthImpl.java#L25
[3] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L23-L24
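
For illustration, a minimal sketch of what including the internal caches in
that loop could look like (the InternalCacheRegistry accessor used here is an
assumption for the example, not the current implementation):

import java.util.LinkedHashSet;
import java.util.Set;

import org.infinispan.AdvancedCache;
import org.infinispan.lifecycle.ComponentStatus;
import org.infinispan.manager.EmbeddedCacheManager;
import org.infinispan.partitionhandling.AvailabilityMode;
import org.infinispan.registry.InternalCacheRegistry;

public class ClusterHealthSketch {

   // Healthy only if every cache, internal ones included, is running and available.
   public static boolean allCachesHealthy(EmbeddedCacheManager cacheManager,
                                          InternalCacheRegistry internalCacheRegistry) {
      Set<String> names = new LinkedHashSet<>(cacheManager.getCacheNames());
      // Private caches such as ___counter_configuration and org.infinispan.LOCKS
      // are not returned by getCacheNames(), so they are added explicitly here.
      names.addAll(internalCacheRegistry.getInternalCacheNames());
      for (String name : names) {
         AdvancedCache<?, ?> cache = cacheManager.getCache(name).getAdvancedCache();
         if (cache.getStatus() != ComponentStatus.RUNNING
               || cache.getAvailability() != AvailabilityMode.AVAILABLE) {
            return false;
         }
      }
      return true;
   }
}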
On Mon, Mar 26, 2018 at 1:59 PM Thomas SEGISMONT <tsegismont(a)gmail.com>
wrote:
> 2018-03-26 13:16 GMT+02:00 Pedro Ruivo <pedro(a)infinispan.org>:
>
>>
>>
>> On 23-03-2018 15:06, Thomas SEGISMONT wrote:
>> > Hi Pedro,
>> >
>> > 2018-03-23 13:25 GMT+01:00 Pedro Ruivo <pedro(a)infinispan.org>:
>> >
>> > Hi Thomas,
>> >
>> > Is the test in question using any counter/lock?
>> >
>> >
>> > I have seen the problem on a test for counters, on another one for
>> > locks, as well as caches only.
>> > But Vert.x starts the ClusteredLockManager and the CounterManager in
>> > all cases (even if no lock/counter is created/used).
>> >
>> >
>> > I did see similar behavior with the counters in our server test
>> > suite. The partition handling makes the cache degraded because nodes
>> > are starting and stopping concurrently.
>> >
>> >
>> > As for me, I was able to observe the problem even when stopping nodes
>> > one after the other and waiting for the cluster to go back to HEALTHY
>> > status.
>> > Is it possible that the status of the counter and lock caches is not
>> > taken into account in the cluster health?
>>
>> The counter and lock caches are private. So, they aren't in the cluster
>> health, nor are their names returned by the getCacheNames() method.
>>
>
> Thanks for the details.
>
> I'm not concerned with these internal caches not being listed when
> calling getCacheNames.
>
> However, the cluster health status should include their status as well.
> Checking the cluster status is the recommended way to implement readiness
> checks on Kubernetes, for example.
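>
> As an illustration, a minimal readiness-check sketch based on the embedded
> Health API (assuming the Health API is reachable from the cache manager;
> error handling omitted):
>
> import org.infinispan.health.HealthStatus;
> import org.infinispan.manager.EmbeddedCacheManager;
>
> public class ReadinessCheck {
>
>    // Readiness probe body: report ready only when the whole cluster is HEALTHY.
>    public static boolean isReady(EmbeddedCacheManager cacheManager) {
>       HealthStatus status =
>             cacheManager.getHealth().getClusterHealth().getHealthStatus();
>       return status == HealthStatus.HEALTHY;
>    }
> }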
>
> What do you think Sebastian?
>
>
>>
>> >
>> >
>> > I'm not sure if there is any JIRA tracking this. Ryan, Dan, do you
>> > know? If there is none, it should be created.
>> >
>> > I improved the counters by making the cache start lazily when you
>> > first get or define a counter [1]. This workaround solved the issue
>> > for us.
>> >
>> > As a workaround for your test suite, I suggest making sure the caches
>> > (___counter_configuration and org.infinispan.LOCKS) have finished
>> > their state transfer before stopping the cache managers, by invoking
>> > DefaultCacheManager.getCache(*cache-name*) on all the cache managers.
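>> >
>> > A rough sketch of that workaround (the helper class and method names
>> > here are illustrative only, not part of Infinispan):
>> >
>> > import org.infinispan.manager.DefaultCacheManager;
>> >
>> > public class TestShutdownHelper {
>> >
>> >    private static final String[] INTERNAL_CACHES =
>> >          { "___counter_configuration", "org.infinispan.LOCKS" };
>> >
>> >    // Start the internal caches so their state transfer completes
>> >    // before the test stops the cache managers.
>> >    public static void awaitInternalCaches(DefaultCacheManager... cacheManagers) {
>> >       for (DefaultCacheManager cacheManager : cacheManagers) {
>> >          for (String name : INTERNAL_CACHES) {
>> >             cacheManager.getCache(name);
>> >          }
>> >       }
>> >    }
>> > }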
>> >
>> > Sorry for the inconvenience and the delay in replying.
>> >
>> >
>> > No problem.
>> >
>> >
>> > Cheers,
>> > Pedro
>> >
>> > [1] https://issues.jboss.org/browse/ISPN-8860
>> >
>> > On 21-03-2018 16:16, Thomas SEGISMONT wrote:
>> > > Hi everyone,
>> > >
>> > > I am working on integrating Infinispan 9.2.Final in vertx-infinispan.
>> > > Before merging I wanted to make sure the test suite passed, but it
>> > > doesn't. It's not always the same test involved.
>> > >
>> > > In the logs, I see a lot of messages like "After merge (or
>> > > coordinator change), cache still hasn't recovered a majority of
>> > > members and must stay in degraded mode."
>> > > The caches involved are "___counter_configuration" and
>> > > "org.infinispan.LOCKS".
>> > >
>> > > Most often it's harmless but, sometimes, I also see this exception:
>> > > "ISPN000210: Failed to request state of cache".
>> > > Again the cache involved is either "___counter_configuration" or
>> > > "org.infinispan.LOCKS".
>> > > After this exception, the cache manager is unable to stop. It blocks
>> > > in method "terminate" (join on cache future).
>> > >
>> > > I thought the test suite was too rough (we stop all nodes at the
>> > > same time). So I changed it to make sure that:
>> > > - nodes start one after the other
>> > > - a new node is started only when the previous one indicates HEALTHY
>> > >   status
>> > > - nodes stop one after the other
>> > > - a node is stopped only when it indicates HEALTHY status
>> > > Pretty much what we do on Kubernetes for the readiness check,
>> > > actually. But it didn't get any better.
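>> > >
>> > > For reference, the wait-for-HEALTHY step is essentially the following
>> > > (simplified sketch; the polling interval and timeout are arbitrary, and
>> > > it assumes the Health API is reachable from the cache manager):
>> > >
>> > > import org.infinispan.health.HealthStatus;
>> > > import org.infinispan.manager.EmbeddedCacheManager;
>> > >
>> > > public class ClusterHealthWaiter {
>> > >
>> > >    // Poll the cluster health until it reports HEALTHY, or give up after the timeout.
>> > >    public static void awaitHealthy(EmbeddedCacheManager cacheManager, long timeoutMillis)
>> > >          throws InterruptedException {
>> > >       long deadline = System.currentTimeMillis() + timeoutMillis;
>> > >       while (System.currentTimeMillis() < deadline) {
>> > >          HealthStatus status =
>> > >                cacheManager.getHealth().getClusterHealth().getHealthStatus();
>> > >          if (status == HealthStatus.HEALTHY) {
>> > >             return;
>> > >          }
>> > >          Thread.sleep(200);
>> > >       }
>> > >       throw new IllegalStateException("Cluster did not become HEALTHY in time");
>> > >    }
>> > > }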
>> > >
>> > > Attached are the logs of such a failing test.
>> > >
>> > > Note that the Vert.x test itself does not fail; it's only when
>> > > closing nodes that we have issues.
>> > >
>> > > Here's our XML config:
>> > >
>> > > https://github.com/vert-x3/vertx-infinispan/blob/ispn92/src/main/resources/default-infinispan.xml
>> > >
>> > > Does that ring a bell? Do you need more info?
>> > >
>> > > Regards,
>> > > Thomas
>> > >
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev