[infinispan-issues] [JBoss JIRA] (ISPN-1961) View change causes state transfer timeout to be reduced to rpc timeout time
RH Bugzilla Integration (JIRA)
jira-events at lists.jboss.org
Thu Apr 5 05:30:47 EDT 2012
[ https://issues.jboss.org/browse/ISPN-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681969#comment-12681969 ]
RH Bugzilla Integration commented on ISPN-1961:
-----------------------------------------------
Misha H. Ali <mhusnain at redhat.com> made a comment on [bug 809042|https://bugzilla.redhat.com/show_bug.cgi?id=809042]
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
RPC timeout indicates the maximum timeout allowed for remote user operations like GET and PUT. State transfer timeout contains the timeout value allowed for internal remote operations used by the state transfer process (for example, APPLY_STATE). DistributedSyncTimeout is used for cache view installations, which include state tranfers. As a result, the value set for distributedSyncTimeout should be greater than the value of stateTransfer.timeout.
However, the state transfer timeout does not change during the state transfer. As a result, if a user operation fails with a TimeoutException, it is because rpcTimeout milliseconds passes, regardless of whether it occurs during a state transfer or not.
In this instance, there is an unstable cluster in which nodes 2 and 3 consider themselves part of the cluster, while the remaining nodes in the cluster do not view them as part of the cluster. It takes over one minute to establish a common/stable JGroups cluster view and to perform a state transfer, therefore all pending operations fail.
> View change causes state transfer timeout to be reduced to rpc timeout time
> ---------------------------------------------------------------------------
>
> Key: ISPN-1961
> URL: https://issues.jboss.org/browse/ISPN-1961
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.1.3.FINAL
> Reporter: Michal Linhard
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.1.4.CR1, 5.1.4.FINAL
>
>
> Appeared in JDG 6.0.0.ER6
> http://www.qa.jboss.com/~mlinhard/hyperion/run44-elas-dist/
> We've got lot's of these:
> {code}
> 2012-03-30 05:31:42,670 355515 ERROR [org.jboss.smartfrog.edg.loaddriver.DriverNode] (Client-456:) Error doing PUT(key251456) to node node0002 (lastOpTime 63343 ms)
> org.infinispan.client.hotrod.exceptions.HotRodClientException:Request for message id[99888] returned server error (status=0x86): org.infinispan.util.concurrent.TimeoutException: Timed out waiting for the state transfer to end
> at org.infinispan.client.hotrod.impl.protocol.Codec10.checkForErrorsInResponseStatus(Codec10.java:154)
> at org.infinispan.client.hotrod.impl.protocol.Codec10.readHeader(Codec10.java:109)
> at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:78)
> at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:72)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:52)
> at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:41)
> at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:68)
> at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:219)
> at org.infinispan.CacheSupport.put(CacheSupport.java:52)
> at org.jboss.qa.edg.adapter.HotRodAdapter$HotRodRemoteCacheAdapter.put(HotRodAdapter.java:249)
> at org.jboss.qa.edg.adapter.HotRodAdapter$HotRodRemoteCacheAdapter.put(HotRodAdapter.java:234)
> at org.jboss.smartfrog.edg.loaddriver.DriverNodeImpl$ClientThread.makeRequest(DriverNodeImpl.java:244)
> at org.jboss.smartfrog.edg.loaddriver.DriverNodeImpl$ClientThread.run(DriverNodeImpl.java:375)
> {code}
> No state transfer takes more than 10min in this test.
> and state transfer timeout is 10min.
> RPC timeout is 1 min and this one seems to be in play here. see the exceptions on server:
> {code}
> 05:31:39,449 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (MemcachedServerWorker-2-137) ISPN000136: Execution error: org.infinispan.util.concurrent.TimeoutException: Timed out waiting for the state transfer to end
> at org.infinispan.interceptors.StateTransferLockInterceptor.handleWithRetries(StateTransferLockInterceptor.java:218) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.StateTransferLockInterceptor.handleWriteCommand(StateTransferLockInterceptor.java:181) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.StateTransferLockInterceptor.visitPutKeyValueCommand(StateTransferLockInterceptor.java:152) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutKeyValueCommand(CacheMgmtInterceptor.java:124) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:130) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:89) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:61) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:345) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:944) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.CacheImpl.put(CacheImpl.java:657) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.DecoratedCache.put(DecoratedCache.java:179) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.AbstractDelegatingCache.put(AbstractDelegatingCache.java:119) [infinispan-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.server.core.AbstractProtocolDecoder.put(AbstractProtocolDecoder.scala:196) [infinispan-server-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.server.core.AbstractProtocolDecoder.decodeValue(AbstractProtocolDecoder.scala:150) [infinispan-server-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:72) [infinispan-server-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.server.core.AbstractProtocolDecoder.decode(AbstractProtocolDecoder.scala:45) [infinispan-server-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.server.core.transport.CustomReplayingDecoder.callDecode(CustomReplayingDecoder.java:250) [infinispan-server-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.server.core.transport.CustomReplayingDecoder.messageReceived(CustomReplayingDecoder.java:223) [infinispan-server-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.infinispan.server.core.AbstractProtocolDecoder.messageReceived(AbstractProtocolDecoder.scala:360) [infinispan-server-core-5.1.3.ER6-redhat-1.jar:5.1.3.ER6-redhat-1]
> at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:80) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:351) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:282) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:202) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.2.6.Final-redhat-1.jar:]
> at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:44) [netty-3.2.6.Final-redhat-1.jar:]
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_30]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_30]
> at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_30]
> {code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
More information about the infinispan-issues
mailing list