[JBoss JIRA] (ISPN-7488) Node tries to register component after stopping
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-7488?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec commented on ISPN-7488:
-------------------------------------------
The error message about {{GlobalInboundInvocationHandler}} is absolutely fine. The code [nicely sends a goodbye message|https://github.com/infinispan/infinispan/blob/ad0b1cc84778074769a...].
> Node tries to register component after stopping
> -----------------------------------------------
>
> Key: ISPN-7488
> URL: https://issues.jboss.org/browse/ISPN-7488
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations, Core, Server
> Affects Versions: 9.0.0.Beta2
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Custom Infinispan Server build (based on [these instructions|https://github.com/slaskawi/infinispan-1/tree/custom_image]). SHA1 {{2b0731b21649a88a75ed71d21b9cc06ba365e947}}
> Reporter: Sebastian Łaskawiec
> Assignee: Sebastian Łaskawiec
> Priority: Blocker
>
> While running the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/], I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-4z05w] 18:09:06,122 WARN [org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler] (remote-thread--p2-t5) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=___script_cache, type=REBALANCE_CONFIRM, sender=transactions-repository-1-1f8dx, joinInfo=null, topologyId=9, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, actualMembers=null, throwable=null, viewId=4}: org.infinispan.commons.CacheConfigurationException: Unable to configure component (type: class org.infinispan.topology.CacheTopologyControlCommand, instance CacheTopologyControlCommand{cache=___script_cache, type=REBALANCE_CONFIRM, sender=transactions-repository-1-1f8dx, joinInfo=null, topologyId=9, rebalanceId=0, currentCH=null, pendingCH=null, availabilityMode=null, actualMembers=null, throwable=null, viewId=4})
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.wireDependencies(AbstractComponentRegistry.java:152)
> [transactions-repository-1-4z05w] at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler$2.run(GlobalInboundInvocationHandler.java:160)
> [transactions-repository-1-4z05w] at org.infinispan.util.concurrent.BlockingTaskAwareExecutorServiceImpl$RunnableWrapper.run(BlockingTaskAwareExecutorServiceImpl.java:203)
> [transactions-repository-1-4z05w] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [transactions-repository-1-4z05w] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [transactions-repository-1-4z05w] at java.lang.Thread.run(Thread.java:745)
> [transactions-repository-1-4z05w] Caused by: org.infinispan.IllegalLifecycleStateException: Trying to register a component after stopping: org.infinispan.topology.LocalTopologyManagerFactory
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.registerComponentInternal(AbstractComponentRegistry.java:185)
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.registerComponent(AbstractComponentRegistry.java:172)
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.registerComponent(AbstractComponentRegistry.java:168)
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.createComponentFactoryInternal(AbstractComponentRegistry.java:349)
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.getFactory(AbstractComponentRegistry.java:328)
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.getOrCreateComponent(AbstractComponentRegistry.java:294)
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.invokeInjectionMethod(AbstractComponentRegistry.java:247)
> [transactions-repository-1-4z05w] at org.infinispan.factories.AbstractComponentRegistry.wireDependencies(AbstractComponentRegistry.java:148)
> [transactions-repository-1-4z05w] ... 5 more
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/530241bb695f1f490bcb25eabaf9d676
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Do the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
[JBoss JIRA] (ISPN-7493) Cannot remove cache configuration while performing Rolling Update
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-7493?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec resolved ISPN-7493.
---------------------------------------
Fix Version/s: 9.0.0.CR2
Resolution: Won't Fix
> Cannot remove cache configuration while performing Rolling Update
> -----------------------------------------------------------------
>
> Key: ISPN-7493
> URL: https://issues.jboss.org/browse/ISPN-7493
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations, Core
> Affects Versions: 9.0.0.CR1
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Infinispan Server 9.0.0.Beta2
> Reporter: Sebastian Łaskawiec
> Assignee: Sebastian Łaskawiec
> Fix For: 9.0.0.CR2
>
>
> While running the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/], I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-ds1ks] *** JBossAS process (81) received TERM signal ***
> [transactions-repository-1-ds1ks] 09:50:49,010 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> [transactions-repository-1-ds1ks] 09:50:49,060 WARN [org.jboss.msc.service.fail] (MSC service thread 1-6) MSC000004: Failure during stop of service jboss.datagrid-infinispan.clustered.transactional.config: java.lang.IllegalStateException: ISPN000371: Cannot remove cache configuration 'transactional' because it is in use
> [transactions-repository-1-ds1ks] at org.infinispan.manager.DefaultCacheManager.undefineConfiguration(DefaultCacheManager.java:391)
> [transactions-repository-1-ds1ks] at org.infinispan.manager.impl.AbstractDelegatingEmbeddedCacheManager.undefineConfiguration(AbstractDelegatingEmbeddedCacheManager.java:49)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:120)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:117)
> [transactions-repository-1-ds1ks] at org.infinispan.security.Security.doPrivileged(Security.java:76)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:64)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions.undefineContainerConfiguration(SecurityActions.java:124)
> [transactions-repository-1-ds1ks] at org.jboss.as.clustering.infinispan.subsystem.AbstractCacheConfigurationService.stop(AbstractCacheConfigurationService.java:89)
> [transactions-repository-1-ds1ks] at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2056)
> [transactions-repository-1-ds1ks] at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2017)
> [transactions-repository-1-ds1ks] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [transactions-repository-1-ds1ks] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [transactions-repository-1-ds1ks] at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/2308b4c5e9bbf523fb3e02a7cc45fa24
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Start the Spring Session demo: `cd session-demo && mvn fabric8:run`
> * Create a client that inserts data (`watch -n 0.5 curl http://<spring-session-demo-pod-ip>/sessions`) and, at the same time, invoke the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
[JBoss JIRA] (ISPN-7493) Cannot remove cache configuration while performing Rolling Update
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-7493?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec commented on ISPN-7493:
-------------------------------------------
After fixing ISPN-7494 and ISPN-7487, this issue no longer occurs. Marking it as resolved.
> Cannot remove cache configuration while performing Rolling Update
> -----------------------------------------------------------------
>
> Key: ISPN-7493
> URL: https://issues.jboss.org/browse/ISPN-7493
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations, Core
> Affects Versions: 9.0.0.CR1
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Infinispan Server 9.0.0.Beta2
> Reporter: Sebastian Łaskawiec
> Assignee: Sebastian Łaskawiec
>
> While running the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/], I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-ds1ks] *** JBossAS process (81) received TERM signal ***
> [transactions-repository-1-ds1ks] 09:50:49,010 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> [transactions-repository-1-ds1ks] 09:50:49,060 WARN [org.jboss.msc.service.fail] (MSC service thread 1-6) MSC000004: Failure during stop of service jboss.datagrid-infinispan.clustered.transactional.config: java.lang.IllegalStateException: ISPN000371: Cannot remove cache configuration 'transactional' because it is in use
> [transactions-repository-1-ds1ks] at org.infinispan.manager.DefaultCacheManager.undefineConfiguration(DefaultCacheManager.java:391)
> [transactions-repository-1-ds1ks] at org.infinispan.manager.impl.AbstractDelegatingEmbeddedCacheManager.undefineConfiguration(AbstractDelegatingEmbeddedCacheManager.java:49)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:120)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:117)
> [transactions-repository-1-ds1ks] at org.infinispan.security.Security.doPrivileged(Security.java:76)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:64)
> [transactions-repository-1-ds1ks] at org.infinispan.server.infinispan.SecurityActions.undefineContainerConfiguration(SecurityActions.java:124)
> [transactions-repository-1-ds1ks] at org.jboss.as.clustering.infinispan.subsystem.AbstractCacheConfigurationService.stop(AbstractCacheConfigurationService.java:89)
> [transactions-repository-1-ds1ks] at org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2056)
> [transactions-repository-1-ds1ks] at org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2017)
> [transactions-repository-1-ds1ks] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [transactions-repository-1-ds1ks] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [transactions-repository-1-ds1ks] at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/2308b4c5e9bbf523fb3e02a7cc45fa24
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Start the Spring Session demo: `cd session-demo && mvn fabric8:run`
> * Create a client that inserts data (`watch -n 0.5 curl http://<spring-session-demo-pod-ip>/sessions`) and, at the same time, invoke the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
[JBoss JIRA] (ISPN-7509) TotalOrderStateTransferInterceptor doesn't handle OutdatedTopologyException for read commands
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-7509?page=com.atlassian.jira.plugin.... ]
Dan Berindei reassigned ISPN-7509:
----------------------------------
Assignee: Pedro Ruivo
> TotalOrderStateTransferInterceptor doesn't handle OutdatedTopologyException for read commands
> ---------------------------------------------------------------------------------------------
>
> Key: ISPN-7509
> URL: https://issues.jboss.org/browse/ISPN-7509
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.0.0.CR1
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 9.0.0.CR2
>
>
> Since ISPN-6859, read commands can also trigger {{OutdatedTopologyException}}, but these are only handled in the non-TO {{StateTransferInterceptor}}. When a read should be retried because of a topology change in a TO cache, the {{OutdatedTopologyException}} is instead thrown to the user:
> {noformat}
> 16:50:40,595 DEBUG (jgroups-9,Test-NodeA-483:[]) [InvocationContextInterceptor] ISPN000311: Received a command from an outdated topology, returning the exception to caller
> org.infinispan.statetransfer.OutdatedTopologyException: Did not get any successful response, got {Test-NodeC-46564=UnsuccessfulResponse}
> 16:50:40,595 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.tx.totalorder.statetransfer.DistTotalOrderVersionedStateTransferTest.testStateTransfer
> org.infinispan.statetransfer.OutdatedTopologyException: Did not get any successful response, got {Test-NodeC-46564=UnsuccessfulResponse}
> {noformat}
> This causes random failures in {{DistTotalOrderVersionedStateTransferTest}}.
--
[JBoss JIRA] (ISPN-7509) TotalOrderStateTransferInterceptor doesn't handle OutdatedTopologyException for read commands
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-7509?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-7509:
-------------------------------
Status: Open (was: New)
> TotalOrderStateTransferInterceptor doesn't handle OutdatedTopologyException for read commands
> ---------------------------------------------------------------------------------------------
>
> Key: ISPN-7509
> URL: https://issues.jboss.org/browse/ISPN-7509
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.0.0.CR1
> Reporter: Dan Berindei
> Assignee: Pedro Ruivo
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 9.0.0.CR2
>
>
> Since ISPN-6859, read commands can also trigger {{OutdatedTopologyException}}, but these are only handled in the non-TO {{StateTransferInterceptor}}. When a read should be retried because of a topology change in a TO cache, the {{OutdatedTopologyException}} is instead thrown to the user:
> {noformat}
> 16:50:40,595 DEBUG (jgroups-9,Test-NodeA-483:[]) [InvocationContextInterceptor] ISPN000311: Received a command from an outdated topology, returning the exception to caller
> org.infinispan.statetransfer.OutdatedTopologyException: Did not get any successful response, got {Test-NodeC-46564=UnsuccessfulResponse}
> 16:50:40,595 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.tx.totalorder.statetransfer.DistTotalOrderVersionedStateTransferTest.testStateTransfer
> org.infinispan.statetransfer.OutdatedTopologyException: Did not get any successful response, got {Test-NodeC-46564=UnsuccessfulResponse}
> {noformat}
> This causes random failures in {{DistTotalOrderVersionedStateTransferTest}}.
--
[JBoss JIRA] (ISPN-7494) Prevent Kubernetes from killing 2 nodes at the same time
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-7494?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec resolved ISPN-7494.
---------------------------------------
Fix Version/s: 9.0.0.CR2
Resolution: Won't Fix
> Prevent Kubernetes from killing 2 nodes at the same time
> --------------------------------------------------------
>
> Key: ISPN-7494
> URL: https://issues.jboss.org/browse/ISPN-7494
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations
> Affects Versions: 9.0.0.Beta2
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Infinispan Server 9.0.0.Beta2
> Reporter: Sebastian Łaskawiec
> Assignee: Sebastian Łaskawiec
> Priority: Blocker
> Fix For: 9.0.0.CR2
>
>
> While running the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/], I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-hqz3v] *** JBossAS process (83) received TERM signal ***
> [transactions-repository-1-dwl81] 09:52:09,522 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> [transactions-repository-1-dwl81] *** JBossAS process (80) received TERM signal ***
> [transactions-repository-1-hqz3v] 09:52:09,526 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/2308b4c5e9bbf523fb3e02a7cc45fa24
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Start the Spring Session demo: `cd session-demo && mvn fabric8:run`
> * Create a client that inserts data (`watch -n 0.5 curl http://<spring-session-demo-pod-ip>/sessions`) and, at the same time, invoke the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
[JBoss JIRA] (ISPN-7494) Prevent Kubernetes from killing 2 nodes at the same time
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-7494?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec commented on ISPN-7494:
-------------------------------------------
I found the solution. It was a matter of misconfigured liveness and readiness probes.
The main idea is that we need to perform multiple checks and mark the probe as failed if several of them fail. Here's an example:
{code}
livenessProbe:
  exec:
    command:
    - /usr/local/bin/is_running.sh
  initialDelaySeconds: 10
  timeoutSeconds: 80
  periodSeconds: 60
  successThreshold: 1
  failureThreshold: 5
readinessProbe:
  exec:
    command:
    - /usr/local/bin/is_healthy.sh
{code}
> Prevent Kubernetes from killing 2 nodes at the same time
> --------------------------------------------------------
>
> Key: ISPN-7494
> URL: https://issues.jboss.org/browse/ISPN-7494
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations
> Affects Versions: 9.0.0.Beta2
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Infinispan Server 9.0.0.Beta2
> Reporter: Sebastian Łaskawiec
> Assignee: Sebastian Łaskawiec
> Priority: Blocker
>
> While running the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/], I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-hqz3v] *** JBossAS process (83) received TERM signal ***
> [transactions-repository-1-dwl81] 09:52:09,522 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> [transactions-repository-1-dwl81] *** JBossAS process (80) received TERM signal ***
> [transactions-repository-1-hqz3v] 09:52:09,526 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/2308b4c5e9bbf523fb3e02a7cc45fa24
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Start the Spring Session demo: `cd session-demo && mvn fabric8:run`
> * Create a client that inserts data (`watch -n 0.5 curl http://<spring-session-demo-pod-ip>/sessions`) and, at the same time, invoke the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
[JBoss JIRA] (ISPN-7494) Prevent Kubernetes from killing 2 nodes at the same time
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-7494?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec edited comment on ISPN-7494 at 2/22/17 8:15 AM:
--------------------------------------------------------------------
I found the solution. It was a matter of misconfigured liveness and readiness probes.
The main idea is that we need to perform multiple checks and mark the probe as failed if several of them fail. Here's an example:
{code}
livenessProbe:
  exec:
    command:
    - /usr/local/bin/is_running.sh
  initialDelaySeconds: 10
  timeoutSeconds: 80
  periodSeconds: 60
  successThreshold: 1
  failureThreshold: 5
readinessProbe:
  exec:
    command:
    - /usr/local/bin/is_healthy.sh
  initialDelaySeconds: 10
  timeoutSeconds: 40
  periodSeconds: 30
  successThreshold: 2
  failureThreshold: 5
{code}
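As a rough sanity check of the thresholds in this comment, the time a pod may keep failing a probe before Kubernetes acts on it is approximately {{periodSeconds * failureThreshold}}. The following is only a minimal sketch under that simplified model (it ignores {{initialDelaySeconds}} and the per-probe {{timeoutSeconds}}); the class and method names are hypothetical and not part of Infinispan or Kubernetes.

```java
// Hypothetical helper for illustration only; not an Infinispan or Kubernetes API.
final class ProbeWindow {
    private ProbeWindow() {}

    /**
     * Approximate number of seconds of consecutive probe failures that are
     * tolerated before the probe is considered failed (simplified model).
     */
    static int failureWindowSeconds(int periodSeconds, int failureThreshold) {
        return periodSeconds * failureThreshold;
    }

    public static void main(String[] args) {
        // Values taken from the probe configuration in this comment.
        System.out.println("liveness:  " + failureWindowSeconds(60, 5) + " s");  // 60 * 5 = 300
        System.out.println("readiness: " + failureWindowSeconds(30, 5) + " s");  // 30 * 5 = 150
    }
}
```

Under this model a single slow or failed check during a rolling update does not mark the pod as failed; only a sustained run of failures does.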
was (Author: sebastian.laskawiec):
I found the solution. It was a matter of misconfigured liveness and readiness probes.
The main idea is that we need to perform multiple checks and mark the probe as failed if several of them fail. Here's an example:
{code}
livenessProbe:
  exec:
    command:
    - /usr/local/bin/is_running.sh
  initialDelaySeconds: 10
  timeoutSeconds: 80
  periodSeconds: 60
  successThreshold: 1
  failureThreshold: 5
readinessProbe:
  exec:
    command:
    - /usr/local/bin/is_healthy.sh
{code}
> Prevent Kubernetes from killing 2 nodes at the same time
> --------------------------------------------------------
>
> Key: ISPN-7494
> URL: https://issues.jboss.org/browse/ISPN-7494
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations
> Affects Versions: 9.0.0.Beta2
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Infinispan Server 9.0.0.Beta2
> Reporter: Sebastian Łaskawiec
> Assignee: Sebastian Łaskawiec
> Priority: Blocker
>
> While running the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/], I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-hqz3v] *** JBossAS process (83) received TERM signal ***
> [transactions-repository-1-dwl81] 09:52:09,522 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> [transactions-repository-1-dwl81] *** JBossAS process (80) received TERM signal ***
> [transactions-repository-1-hqz3v] 09:52:09,526 INFO [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/2308b4c5e9bbf523fb3e02a7cc45fa24
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Start the Spring Session demo: `cd session-demo && mvn fabric8:run`
> * Create a client that inserts data (`watch -n 0.5 curl http://<spring-session-demo-pod-ip>/sessions`) and, at the same time, invoke the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
[JBoss JIRA] (ISPN-7489) org.jgroups.protocols.TCP emits errors when node leaves the cluster
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-7489?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec edited comment on ISPN-7489 at 2/22/17 8:13 AM:
--------------------------------------------------------------------
According to [Kubernetes documentation|https://kubernetes.io/docs/user-guide/pods/#termination-of-pods]:
{quote}
(simultaneous with 3), Pod is removed from endpoints list for service, and are no longer considered part of the set of running pods for replication controllers. Pods that shutdown slowly can continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.
{quote}
Kube doesn't seal the pod off from the network. Perhaps we're doing it on our own?
was (Author: sebastian.laskawiec):
According to [Kubernetes documentation|https://kubernetes.io/docs/user-guide/pods/#termination-of-pods]:
{quote}
(simultaneous with 3), Pod is removed from endpoints list for service, and are no longer considered part of the set of running pods for replication controllers. Pods that shutdown slowly can continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.
{quote}
So Kube doesn't seal the pod off from the network. Perhaps we're doing it on our own?
> org.jgroups.protocols.TCP emits errors when node leaves the cluster
> -------------------------------------------------------------------
>
> Key: ISPN-7489
> URL: https://issues.jboss.org/browse/ISPN-7489
> Project: Infinispan
> Issue Type: Bug
> Components: Cloud Integrations, Core
> Affects Versions: 9.0.0.CR1
> Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Custom Infinispan Server build (based on [these instructions|https://github.com/slaskawi/infinispan-1/tree/custom_image]). SHA1 {{2b0731b21649a88a75ed71d21b9cc06ba365e947}}
> Reporter: Sebastian Łaskawiec
>
> While running the [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/], I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-04x09] 18:09:12,193 ERROR [org.jgroups.protocols.TCP] (jgroups-30,transactions-repository-1-04x09) JGRP000029: transactions-repository-1-04x09: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=5262, TP: [cluster_name=cluster]
> [transactions-repository-1-1f8dx] 18:09:12,310 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-1-1f8dx) JGRP000029: transactions-repository-1-1f8dx: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=6259, TP: [cluster_name=cluster]
> [transactions-repository-1-04x09] 18:09:12,997 ERROR [org.jgroups.protocols.TCP] (jgroups-22,transactions-repository-1-04x09) JGRP000029: transactions-repository-1-04x09: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=5262, TP: [cluster_name=cluster]
> [transactions-repository-1-1f8dx] 18:09:13,113 ERROR [org.jgroups.protocols.TCP] (jgroups-16,transactions-repository-1-1f8dx) JGRP000029: transactions-repository-1-1f8dx: failed sending message to transactions-repository-1-4z05w (71 bytes): java.net.SocketTimeoutException: connect timed out, headers: GMS: GmsHeader[VIEW_ACK], UNICAST3: DATA, seqno=6259, TP: [cluster_name=cluster]
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/530241bb695f1f490bcb25eabaf9d676
> Steps to reproduce:
> * Start a local OpenShift cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Do the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`
--
[JBoss JIRA] (ISPN-7509) TotalOrderStateTransferInterceptor doesn't handle OutdatedTopologyException for read commands
by Dan Berindei (JIRA)
Dan Berindei created ISPN-7509:
----------------------------------
Summary: TotalOrderStateTransferInterceptor doesn't handle OutdatedTopologyException for read commands
Key: ISPN-7509
URL: https://issues.jboss.org/browse/ISPN-7509
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 9.0.0.CR1
Reporter: Dan Berindei
Priority: Critical
Fix For: 9.0.0.CR2
Since ISPN-6859, read commands can also trigger {{OutdatedTopologyException}}, but these are only handled in the non-TO {{StateTransferInterceptor}}. When a read should be retried because of a topology change in a TO cache, the {{OutdatedTopologyException}} is instead thrown to the user:
{noformat}
16:50:40,595 DEBUG (jgroups-9,Test-NodeA-483:[]) [InvocationContextInterceptor] ISPN000311: Received a command from an outdated topology, returning the exception to caller
org.infinispan.statetransfer.OutdatedTopologyException: Did not get any successful response, got {Test-NodeC-46564=UnsuccessfulResponse}
16:50:40,595 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.tx.totalorder.statetransfer.DistTotalOrderVersionedStateTransferTest.testStateTransfer
org.infinispan.statetransfer.OutdatedTopologyException: Did not get any successful response, got {Test-NodeC-46564=UnsuccessfulResponse}
{noformat}
This causes random failures in {{DistTotalOrderVersionedStateTransferTest}}.
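The expected behaviour (retry the read after a topology change instead of surfacing the exception) can be sketched generically. This is only an illustration of the retry idea, not the actual interceptor code: {{OutdatedTopologyException}} below is a local stand-in class, and {{TopologyRetry}}, {{readWithRetry}} and {{maxRetries}} are hypothetical names.

```java
import java.util.function.Supplier;

// Local stand-in for illustration; NOT org.infinispan.statetransfer.OutdatedTopologyException.
class OutdatedTopologyException extends RuntimeException {
    OutdatedTopologyException(String message) {
        super(message);
    }
}

// Hypothetical helper showing the retry-on-topology-change idea.
final class TopologyRetry {
    private TopologyRetry() {}

    /**
     * Runs the read. If it fails because the topology is outdated, the read is
     * retried (up to maxRetries extra attempts) instead of the exception being
     * thrown to the caller straight away.
     */
    static <T> T readWithRetry(Supplier<T> read, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        OutdatedTopologyException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return read.get();
            } catch (OutdatedTopologyException e) {
                // Assumption: a real implementation would wait for the newer
                // topology to be installed before retrying.
                last = e;
            }
        }
        // All attempts failed with an outdated topology; give up.
        throw last;
    }
}
```

For example, a read that fails twice with the stand-in exception and then succeeds returns normally when called with {{maxRetries = 3}}, so the caller never sees the exception.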
--