[
https://issues.jboss.org/browse/ISPN-1317?page=com.atlassian.jira.plugin....
]
Galder Zamarreño resolved ISPN-1317.
------------------------------------
Fix Version/s: (was: 5.1.0.FINAL)
Resolution: Won't Fix
This issue is no longer present in 5.1. If anything, it could be fixed for the 5.0.x
series but that's not likely to happen.
Concurrent state transfer requests can lead to premature flush wait closures
----------------------------------------------------------------------------
Key: ISPN-1317
URL:
https://issues.jboss.org/browse/ISPN-1317
Project: Infinispan
Issue Type: Bug
Components: State transfer
Affects Versions: 5.0.0.FINAL
Reporter: Galder Zamarreño
Assignee: Galder Zamarreño
Priority: Critical
Attachments: 1317-analysis.txt
Logs in JBPAPP-6929 show:
{code}15:40:22,698 ERROR [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (STREAMING_STATE_TRANSFER-sender-1,default,michal-linhard-12702) ISPN000095: Caught while responding to state transfer request: org.infinispan.statetransfer.StateTransferException: java.util.concurrent.TimeoutException: Timed out waiting for a cluster-wide sync to be acquired. (timeout = 60 seconds)
    at org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:162) [infinispan-core-5.0.0-SNAPSHOT.jar:5.0.0-SNAPSHOT]
    at org.infinispan.remoting.InboundInvocationHandlerImpl.generateState(InboundInvocationHandlerImpl.java:248) [infinispan-core-5.0.0-SNAPSHOT.jar:5.0.0-SNAPSHOT]
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.getState(JGroupsTransport.java:590) [infinispan-core-5.0.0-SNAPSHOT.jar:5.0.0-SNAPSHOT]
    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.handleUpEvent(MessageDispatcher.java:690) [jgroups-2.12.1.Final.jar:]
    at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:771) [jgroups-2.12.1.Final.jar:]
    at org.jgroups.JChannel.up(JChannel.java:1484) [jgroups-2.12.1.Final.jar:]
    at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1074) [jgroups-2.12.1.Final.jar:]
    at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:477) [jgroups-2.12.1.Final.jar:]
    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderHandler.process(STREAMING_STATE_TRANSFER.java:651) [jgroups-2.12.1.Final.jar:]
    at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER$StateProviderThreadSpawner$1.run(STREAMING_STATE_TRANSFER.java:580) [jgroups-2.12.1.Final.jar:]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [:1.6.0_25]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [:1.6.0_25]
    at java.lang.Thread.run(Thread.java:662) [:1.6.0_25]
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for a cluster-wide sync to be acquired. (timeout = 60 seconds)
    at org.infinispan.remoting.transport.jgroups.JGroupsDistSync.blockUntilAcquired(JGroupsDistSync.java:62) [infinispan-core-5.0.0-SNAPSHOT.jar:5.0.0-SNAPSHOT]
    at org.infinispan.statetransfer.StateTransferManagerImpl.generateTransactionLog(StateTransferManagerImpl.java:196) [infinispan-core-5.0.0-SNAPSHOT.jar:5.0.0-SNAPSHOT]
    at org.infinispan.statetransfer.StateTransferManagerImpl.generateState(StateTransferManagerImpl.java:152) [infinispan-core-5.0.0-SNAPSHOT.jar:5.0.0-SNAPSHOT]
    ... 12 more{code}
Now, what's odd about this is that the JGroupsDistSync.flushWaitGate behind it is
only acquired/released while a state transfer control command is sent, and the logs show
that both the enabling (acquiring) and disabling (releasing) state transfer control
commands were built:
{code}15:39:20,902 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) dests=[michal-linhard-37465], command=StateTransferControlCommand{enabled=true}, mode=SYNCHRONOUS, timeout=480000
...
15:39:21,074 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-4) dests=[michal-linhard-37465], command=StateTransferControlCommand{enabled=false}, mode=SYNCHRONOUS, timeout=480000{code}
There are no other references to StateTransferControlCommand, so how can it be that
flushWaitGate is open?
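
For illustration only, here is a minimal sketch of the kind of interleaving the issue title describes. It is not the Infinispan JGroupsDistSync code: SimpleGate, CountedGate and secondRequesterSeesOpenGate are hypothetical names, and the assumption is that a gate which is toggled by a plain flag (rather than counted per requester) gets closed by the first requester's "disable" command while a second, concurrent requester still needs it.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class FlushGateSketch {

    /** Hypothetical gate: open/close flip a single flag, with no counting. */
    static class SimpleGate {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition opened = lock.newCondition();
        private boolean open;

        void open() {
            lock.lock();
            try { open = true; opened.signalAll(); } finally { lock.unlock(); }
        }

        void close() {
            lock.lock();
            try { open = false; } finally { lock.unlock(); }
        }

        /** Returns true if the gate is (or becomes) open within the timeout. */
        boolean await(long timeout, TimeUnit unit) throws InterruptedException {
            long nanos = unit.toNanos(timeout);
            lock.lock();
            try {
                while (!open) {
                    if (nanos <= 0L) return false;
                    nanos = opened.awaitNanos(nanos);
                }
                return true;
            } finally { lock.unlock(); }
        }
    }

    /** Hypothetical counted variant: closes only after every opener has closed. */
    static class CountedGate extends SimpleGate {
        private int holders;

        @Override synchronized void open() { holders++; super.open(); }

        @Override synchronized void close() {
            if (holders > 0 && --holders == 0) super.close();
        }
    }

    /** Replays two overlapping requesters sequentially, so the result is deterministic. */
    static boolean secondRequesterSeesOpenGate(SimpleGate gate) {
        gate.open();   // requester A: control command with enabled=true
        gate.open();   // requester B: control command with enabled=true
        gate.close();  // requester A: control command with enabled=false
        try {
            // Requester B's state generation now waits on the gate.
            return gate.await(50, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("simple gate open for B:  "
                + secondRequesterSeesOpenGate(new SimpleGate()));
        System.out.println("counted gate open for B: "
                + secondRequesterSeesOpenGate(new CountedGate()));
    }
}
```

With the plain gate, requester B's wait times out, which is the same shape as the TimeoutException above; with the counted gate it does not. Whether this matches what actually happened in the JBPAPP-6929 logs is not established here.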