]
Thiago Presa closed WFLY-5831.
------------------------------
Resolution: Rejected
I found it! Let num_apps be the number of servers in your domain cluster, and consider the
property port-range of the jgroups configuration XML. If num_apps > port-range, then
some servers will not be able to join the jgroups cluster, and hence the timeouts.
Initial state transfer timed out for cache dist on WF10 CR4
-----------------------------------------------------------
Key: WFLY-5831
URL:
https://issues.jboss.org/browse/WFLY-5831
Project: WildFly
Issue Type: Bug
Components: Clustering
Affects Versions: 10.0.0.CR4
Environment: Wildfly 10.0.0.CR4 running on Debian 8. Running in domain mode with
two slave nodes and hosting multiple deployments in multiple server-groups.
Reporter: Thiago Presa
Assignee: Paul Ferraro
When I boot a server with a deployment, I get:
2015-12-10 08:31:29,845 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool --
67) MSC000001: Failed to start service jboss.infinispan.ejb.dist:
org.jboss.msc.service.StartException in service jboss.infinispan.ejb.dist:
org.infinispan.commons.CacheException: Unable to invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.Exception on object of type StateTransferManagerImpl
at
org.wildfly.clustering.service.AsynchronousServiceBuilder$1.run(AsynchronousServiceBuilder.java:107)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at org.jboss.threads.JBossThread.run(JBossThread.java:320)
Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.Exception on object of type StateTransferManagerImpl
at
org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:172)
at
org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
at
org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
at
org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
at
org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:218)
at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:850)
at
org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:629)
at
org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:580)
at
org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:445)
at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:117)
at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:112)
at
org.wildfly.clustering.infinispan.spi.service.CacheBuilder.start(CacheBuilder.java:80)
at
org.wildfly.clustering.service.AsynchronousServiceBuilder$1.run(AsynchronousServiceBuilder.java:102)
... 4 more
Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for
cache dist on <node-name>:<server-name>
at
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:225)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
... 17 more
2015-12-10 08:50:08,902 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool --
67) MSC000001: Failed to start service
jboss.infinispan.web."<war-name>.war":
org.jboss.msc.service.StartException in service
jboss.infinispan.web."<war-name>.war":
org.infinispan.commons.CacheException: Unable to invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.Exception on object of type StateTransferManagerImpl
at
org.wildfly.clustering.service.AsynchronousServiceBuilder$1.run(AsynchronousServiceBuilder.java:107)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at org.jboss.threads.JBossThread.run(JBossThread.java:320)
Caused by: org.infinispan.commons.CacheException: Unable to invoke method public void
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete()
throws java.lang.Exception on object of type StateTransferManagerImpl
at
org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:172)
at
org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
at
org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
at
org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
at
org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:218)
at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:850)
at
org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:629)
at
org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:580)
at
org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:445)
at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:117)
at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.getCache(DefaultCacheContainer.java:112)
at
org.wildfly.clustering.infinispan.spi.service.CacheBuilder.start(CacheBuilder.java:80)
at
org.wildfly.clustering.service.AsynchronousServiceBuilder$1.run(AsynchronousServiceBuilder.java:102)
... 4 more
Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for
cache <war-name>.war on <node-name>:<server-name>
at
org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:225)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
2015-12-10 08:50:08,908 ERROR [stderr] (ServerService Thread Pool -- 67) Exception in
thread "ServerService Thread Pool -- 67" java.lang.IllegalStateException:
ISPN000371: Cannot remove cache configuration '<war-name>.war' because it is
in use
2015-12-10 08:50:08,908 ERROR [stderr] (ServerService Thread Pool -- 67) at
org.infinispan.manager.DefaultCacheManager.undefineConfiguration(DefaultCacheManager.java:396)
2015-12-10 08:50:08,908 ERROR [stderr] (ServerService Thread Pool -- 67) at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.undefineConfiguration(DefaultCacheContainer.java:88)
2015-12-10 08:50:08,909 ERROR [stderr] (ServerService Thread Pool -- 67) at
org.wildfly.clustering.infinispan.spi.service.ConfigurationBuilder.stop(ConfigurationBuilder.java:80)
2015-12-10 08:50:08,909 ERROR [stderr] (ServerService Thread Pool -- 67) at
org.wildfly.clustering.service.AsynchronousServiceBuilder$2.run(AsynchronousServiceBuilder.java:130)
2015-12-10 08:50:08,909 ERROR [stderr] (ServerService Thread Pool -- 67) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2015-12-10 08:50:08,909 ERROR [stderr] (ServerService Thread Pool -- 67) at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2015-12-10 08:50:08,910 ERROR [stderr] (ServerService Thread Pool -- 67) at
java.lang.Thread.run(Thread.java:745)
2015-12-10 08:50:08,910 ERROR [stderr] (ServerService Thread Pool -- 67) at
org.jboss.threads.JBossThread.run(JBossThread.java:320)
2015-12-10 08:50:08,911 ERROR [stderr] (ServerService Thread Pool -- 64) Exception in
thread "ServerService Thread Pool -- 64" java.lang.IllegalStateException:
ISPN000371: Cannot remove cache configuration 'dist' because it is in use
2015-12-10 08:50:08,911 ERROR [stderr] (ServerService Thread Pool -- 64) at
org.infinispan.manager.DefaultCacheManager.undefineConfiguration(DefaultCacheManager.java:396)
2015-12-10 08:50:08,911 ERROR [stderr] (ServerService Thread Pool -- 64) at
org.jboss.as.clustering.infinispan.DefaultCacheContainer.undefineConfiguration(DefaultCacheContainer.java:88)
2015-12-10 08:50:08,911 ERROR [stderr] (ServerService Thread Pool -- 64) at
org.wildfly.clustering.infinispan.spi.service.ConfigurationBuilder.stop(ConfigurationBuilder.java:80)
2015-12-10 08:50:08,911 ERROR [stderr] (ServerService Thread Pool -- 64) at
org.wildfly.clustering.service.AsynchronousServiceBuilder$2.run(AsynchronousServiceBuilder.java:130)
2015-12-10 08:50:08,912 ERROR [stderr] (ServerService Thread Pool -- 64) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2015-12-10 08:50:08,912 ERROR [stderr] (ServerService Thread Pool -- 64) at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2015-12-10 08:50:08,912 ERROR [stderr] (ServerService Thread Pool -- 64) at
java.lang.Thread.run(Thread.java:745)
2015-12-10 08:50:08,912 ERROR [stderr] (ServerService Thread Pool -- 64) at
org.jboss.threads.JBossThread.run(JBossThread.java:320)
2015-12-10 08:50:08,916 WARN [org.infinispan.statetransfer.StateConsumerImpl]
(transport-thread--p16-t3) ISPN000209: Failed to retrieve transactions for segments [0, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
70, 71, 72, 73, 74, 75, 76, 77, 78, 79] of cache <war-name>.war from node
4276f493-0b50-d3f2-5b58-bfbabcf4d05a: org.infinispan.util.concurrent.TimeoutException:
Replication timeout for 4276f493-0b50-d3f2-5b58-bfbabcf4d05a
at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:755)
at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$80(JGroupsTransport.java:589)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:46)
at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.call(SingleResponseFuture.java:17)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
This seems to trigger the server shutdown, which also writes the following stack traces:
2015-12-10 08:50:08,983 WARN [org.infinispan.statetransfer.StateConsumerImpl]
(transport-thread--p16-t2) ISPN000209: Failed to retrieve transactions for segments [0, 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
70, 71, 72, 73, 74, 75, 76, 77, 78, 79] of cache dist from node
4276f493-0b50-d3f2-5b58-bfbabcf4d05a: org.infinispan.remoting.RemoteException: ISPN000217:
Received exception from 4276f493-0b50-d3f2-5b58-bfbabcf4d05a, see cause for remote stack
trace
at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.checkRsp(JGroupsTransport.java:747)
at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.lambda$invokeRemotelyAsync$80(JGroupsTransport.java:589)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
at
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
at
org.infinispan.remoting.transport.jgroups.SingleResponseFuture.futureDone(SingleResponseFuture.java:30)
at org.jgroups.blocks.Request.checkCompletion(Request.java:169)
at org.jgroups.blocks.UnicastRequest.transportClosed(UnicastRequest.java:182)
at org.jgroups.blocks.RequestCorrelator.stop(RequestCorrelator.java:270)
at org.jgroups.blocks.MessageDispatcher.stop(MessageDispatcher.java:163)
at
org.jgroups.blocks.MessageDispatcher.channelDisconnected(MessageDispatcher.java:534)
at org.jgroups.Channel.notifyChannelDisconnected(Channel.java:533)
at org.jgroups.fork.ForkChannel.disconnect(ForkChannel.java:186)
at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.stop(JGroupsTransport.java:262)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
at
org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
at
org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:674)
at
org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:552)
at
org.infinispan.factories.GlobalComponentRegistry.stop(GlobalComponentRegistry.java:261)
at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:704)
at
org.jboss.as.clustering.infinispan.subsystem.CacheContainerBuilder.stop(CacheContainerBuilder.java:120)
at
org.jboss.msc.service.ServiceControllerImpl$StopTask.stopService(ServiceControllerImpl.java:2056)
at
org.jboss.msc.service.ServiceControllerImpl$StopTask.run(ServiceControllerImpl.java:2017)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: transport was closed
at org.jgroups.blocks.UnicastRequest.transportClosed(UnicastRequest.java:173)
... 22 more
2015-12-10 08:50:08,985 ERROR [org.infinispan.remoting.rpc.RpcManagerImpl]
(transport-thread--p16-t2) ISPN000073: Unexpected error while replicating:
org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not
connected
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.rethrowAsCacheException(CommandAwareRpcDispatcher.java:128)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:120)
at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotelyAsync(JGroupsTransport.java:568)
at
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotelyAsync(RpcManagerImpl.java:170)
at
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:212)
at
org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:830)
at
org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:749)
at
org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:693)
at
org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:355)
at
org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:200)
at
org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:47)
at
org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:120)
at
org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:415)
at
org.infinispan.topology.LocalTopologyManagerImpl$3.run(LocalTopologyManagerImpl.java:377)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
org.infinispan.executors.SemaphoreCompletionService$QueueingTask.runInternal(SemaphoreCompletionService.java:173)
at
org.infinispan.executors.SemaphoreCompletionService$QueueingTask.run(SemaphoreCompletionService.java:151)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: channel is not connected
at
org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:702)
at
org.jgroups.blocks.RequestCorrelator.sendUnicastRequest(RequestCorrelator.java:205)
at org.jgroups.blocks.UnicastRequest.sendRequest(UnicastRequest.java:43)
at org.jgroups.blocks.Request.execute(Request.java:83)
at
org.jgroups.blocks.MessageDispatcher.sendMessageWithFuture(MessageDispatcher.java:462)
at
org.jgroups.blocks.MessageDispatcher.sendMessageWithFuture(MessageDispatcher.java:479)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:265)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:117)
... 19 more
And then I get a bunch of "ISPN000208: No live owners found for segment xx"
until the server shuts down. Initially I thought this was WFLY-5307, but since this is
already fixed in 10.0.0.CR4 I believe I'm getting another issue altogether.