]
William DeCoste commented on ISPN-5869:
---------------------------------------
Node 1 and 2 didn't seem to fail. It's not the logging but doing all this on my
laptop makes it more likely than in our lab env. I can recreate it 20-25% of scale
up/downs without the extra logging.
I will retest with your suggested config. I think EAP needs to recover better than it does
now.
Timeout in initial state replication when EAP cluster scaled up/down
--------------------------------------------------------------------
Key: ISPN-5869
URL:
https://issues.jboss.org/browse/ISPN-5869
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 5.2.11.Final
Reporter: William DeCoste
Assignee: Dan Berindei
Attachments: down.sh, haproxy.cfg.down, haproxy.cfg.up, node1.log.gz,
node2.log.gz, node3.log.gz, node4.log.gz, node5.log.gz, server.log,
SocketConnectionTimeoutTest.java, up.sh
When scaling an EAP cluster between 1 and 5 active instances with active traffic the
below exception occurs fairly frequently which results in a dead EAP instance. This issue
also results in a dead Pod when running in OpenShift. Active traffic is created state
replication for HTTPSession and SFSB.
13:27:42,553 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 66)
MSC000001: Failed to start service jboss.infinispan.web.default-host/sfsbTest:
org.jboss.msc.service.StartException in service
jboss.infinispan.web.default-host/sfsbTest: org.infinispan.CacheException: Unable to
invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.InterruptedException on object of type StateTransferManagerImpl
at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:91)
[jboss-as-clustering-common-7.5.0.Final-redhat-21.jar:7.5.0.Final-redhat-21]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[rt.jar:1.7.0_45]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[rt.jar:1.7.0_45]
at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_45]
at org.jboss.threads.JBossThread.run(JBossThread.java:122)
[jboss-threads-2.1.2.Final-redhat-1.jar:2.1.2.Final-redhat-1]
Caused by: org.infinispan.CacheException: Unable to invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.InterruptedException on object of type StateTransferManagerImpl
at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
at
org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:886)
at
org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:657)
at
org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:646)
at
org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:549)
at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:217)
at org.infinispan.CacheImpl.start(CacheImpl.java:582)
at
org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:686)
at
org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:649)
at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:545)
at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:559)
at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:113)
at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:104)
at
org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:78)
at org.jboss.as.clustering.msc.AsynchronousService$1.run(AsynchronousService.java:86)
[jboss-as-clustering-common-7.5.0.Final-redhat-21.jar:7.5.0.Final-redhat-21]
... 4 more
Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache
default-host/sfsbTest on node2/web
at
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:216)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_45]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
[rt.jar:1.7.0_45]
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[rt.jar:1.7.0_45]
at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_45]
at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
... 18 more
13:27:42,559 ERROR [org.jboss.as.controller.management-operation] (Controller Boot
Thread) JBAS014612: Operation ("deploy") failed - address:
([("deployment" => "sfsbTest.war")]) - failure description:
{"JBAS014671: Failed services" =>
{"jboss.infinispan.web.default-host/sfsbTest" =>
"org.jboss.msc.service.StartException in service
jboss.infinispan.web.default-host/sfsbTest: org.infinispan.CacheException: Unable to
invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.InterruptedException on object of type StateTransferManagerImpl
Caused by: org.infinispan.CacheException: Unable to invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.InterruptedException on object of type StateTransferManagerImpl
Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache
default-host/sfsbTest on node2/web"}}
13:27:42,606 INFO [org.jboss.as.server] (ServerService Thread Pool -- 35) JBAS015859:
Deployed "sfsbTest.war" (runtime-name : "sfsbTest.war")
13:27:42,607 INFO [org.jboss.as.controller] (Controller Boot Thread) JBAS014774: Service
status report
JBAS014777: Services which failed to start: service
jboss.infinispan.web.default-host/sfsbTest: org.jboss.msc.service.StartException in
service jboss.infinispan.web.default-host/sfsbTest: org.infinispan.CacheException: Unable
to invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.InterruptedException on object of type StateTransferManagerImpl
13:27:42,613 INFO [org.jboss.as] (Controller Boot Thread) JBAS015961: Http management
interface listening on
http://127.0.0.1:10000/management
13:27:42,613 INFO [org.jboss.as] (Controller Boot Thread) JBAS015951: Admin console
listening on
http://127.0.0.1:10000
13:27:42,613 ERROR [org.jboss.as] (Controller Boot Thread) JBAS015875: JBoss EAP 6.4.0.GA
(AS 7.5.0.Final-redhat-21) started (with errors) in 72734ms - Started 340 of 411 services
(4 services failed or missing dependencies, 126 services are lazy, passive or on-demand)
13:27:42,850 INFO [org.jboss.as.jpa] (ServerService Thread Pool -- 23) JBAS011403:
Stopping Persistence Unit Service 'sfsbTest.war#jpa-test'
13:27:42,858 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 23)
JBAS010282: Stopped sfsbTest.war#jpa-test.org.jboss.jndiTest.TestEntity cache from
hibernate container
13:27:42,872 INFO [org.infinispan.jmx.CacheJmxRegistration] (ServerService Thread Pool
-- 23) ISPN000031: MBeans were successfully registered to the platform MBean server.
13:27:42,873 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 23)
JBAS010281: Started local-query cache from hibernate container
13:27:42,873 INFO [org.hibernate.tool.hbm2ddl.SchemaExport] (ServerService Thread Pool
-- 23) HHH000227: Running hbm2ddl schema export
13:27:42,878 INFO [org.hibernate.tool.hbm2ddl.SchemaExport] (ServerService Thread Pool
-- 23) HHH000230: Schema export complete
13:27:42,894 INFO [org.jboss.as.clustering.infinispan] (MSC service thread 1-5)
JBAS010282: Stopped pending-puts cache from hibernate container
13:27:42,898 INFO [org.jboss.as.clustering.infinispan] (MSC service thread 1-5)
JBAS010282: Stopped local-query cache from hibernate container
13:27:42,899 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC
service thread 1-5) ISPN000080: Disconnecting and closing JGroups Channel
13:27:42,905 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread
Pool -- 23) ISPN000029: Passivating all entries to disk
13:27:42,906 INFO [org.infinispan.eviction.PassivationManagerImpl] (ServerService Thread
Pool -- 23) ISPN000030: Passivated 0 entries in 1 milliseconds
13:27:42,915 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 23)
JBAS010282: Stopped repl cache from web container
13:27:42,918 INFO [org.infinispan.eviction.PassivationManagerImpl] (MSC service thread
1-4) ISPN000029: Passivating all entries to disk
13:27:42,920 INFO [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877:
Stopped deployment sfsbTest.war (runtime-name: sfsbTest.war) in 85ms
13:27:43,263 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC
service thread 1-5) ISPN000082: Stopping the RpcDispatcher
13:27:47,210 INFO [org.infinispan.eviction.PassivationManagerImpl] (MSC service thread
1-4) ISPN000030: Passivated 3955 entries in 4.29 seconds
13:27:47,223 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC
service thread 1-4) ISPN000080: Disconnecting and closing JGroups Channel
13:27:47,587 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC
service thread 1-4) ISPN000082: Stopping the RpcDispatcher
13:27:47,662 INFO [org.jboss.as.server] (DeploymentScanner-threads - 2) JBAS015858:
Undeployed "sfsbTest.war" (runtime-name: "sfsbTest.war")
13:27:47,664 INFO [org.jboss.as.controller] (DeploymentScanner-threads - 2) JBAS014774:
Service status report
JBAS014775: New missing/unsatisfied dependencies:
service
jboss.deployment.unit."sfsbTest.war".component.EntityTesterBean.START (missing)
dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component.StatefulBean1.START (missing)
dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component.StatelessBean1.START (missing)
dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component.StatelessBean2.START (missing)
dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service jboss.deployment.unit."sfsbTest.war".component.TimerBean.START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component."com.sun.faces.config.ConfigureListener".START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component."javax.faces.webapp.FacetTag".START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component."javax.servlet.jsp.jstl.tlv.PermittedTaglibsTLV".START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component."javax.servlet.jsp.jstl.tlv.ScriptFreeTLV".START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component."org.apache.catalina.servlets.DefaultServlet".START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component."org.apache.jasper.servlet.JspServlet".START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service
jboss.deployment.unit."sfsbTest.war".component."org.jboss.sfsbTest.SfsbServlet".START
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service jboss.deployment.unit."sfsbTest.war".jndiDependencyService
(missing) dependents: [service jboss.web.deployment.default-host./sfsbTest]
service jboss.infinispan.web.default-host/sfsbTest (missing) dependents: [service
jboss.web.deployment.default-host./sfsbTest.session]
service jboss.persistenceunit."sfsbTest.war#jpa-test" (missing)
dependents: [service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service jboss.web.deployment.default-host./sfsbTest (missing) dependents: [service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
service jboss.web.deployment.default-host./sfsbTest.realm (missing) dependents:
[service jboss.web.deployment.default-host./sfsbTest, service
jboss.deployment.unit."sfsbTest.war".deploymentCompleteService]
JBAS014777: Services which failed to start: service
jboss.infinispan.web.default-host/sfsbTest