[
https://issues.jboss.org/browse/ISPN-2240?page=com.atlassian.jira.plugin....
]
Robert Stupp commented on ISPN-2240:
------------------------------------
The method
{noformat}
org.infinispan.interceptors.DistributionInterceptor#handleWriteCommand
{noformat}
unnecessarily submits messages to itself via JGroups.
There seems to be something going wrong in the distribution code.
Our test case is as follows:
* 4 load generators submit 100 requests at once (4 * 100 parallel threads)
* 4 application servers form an Infinispan cluster
* all app servers are allowed to get and put all cache entries - concurrent updates to the same cache keys occur
The result is that I see a lot of TimeoutExceptions, and all of them look like this:
the TimeoutException is raised in either an app server thread or a JGroups OOB thread while it tries to lock a single key.
In both cases the lock cannot be acquired because another local thread already holds it and is blocked waiting for a distribution response.
Origin locking thread (the thread holding the lock):
{noformat}
sun.misc.Unsafe.park(Unsafe.java:-2)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
org.jgroups.blocks.Request.responsesComplete(Request.java:195)
org.jgroups.blocks.Request.execute(Request.java:89)
org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:366)
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:275)
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:165)
org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:489)
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:161)
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:183)
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:240)
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:227)
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:222)
org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:217)
org.infinispan.interceptors.DistributionInterceptor.handleWriteCommand(DistributionInterceptor.java:512)
org.infinispan.interceptors.DistributionInterceptor.visitPutKeyValueCommand(DistributionInterceptor.java:270)
org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77)
...
{noformat}
Waiting thread (fails to acquire the same key lock while handling the incoming command):
{noformat}
java.lang.Thread.getStackTrace(Thread.java:1479)
org.infinispan.util.concurrent.locks.containers.ReentrantPerEntryLockContainer.tryLock(ReentrantPerEntryLockContainer.java:97)
org.infinispan.util.concurrent.locks.containers.ReentrantPerEntryLockContainer.tryLock(ReentrantPerEntryLockContainer.java:34)
org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer.acquireLock(AbstractPerEntryLockContainer.java:76)
org.infinispan.util.concurrent.locks.DeadlockDetectingLockManager.lockAndRecord(DeadlockDetectingLockManager.java:115)
org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:209)
org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLockNoCheck(LockManagerImpl.java:201)
org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockKey(AbstractLockingInterceptor.java:114)
org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor.visitPutKeyValueCommand(NonTransactionalLockingInterceptor.java:67)
org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77)
org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:130)
org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:62)
org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:77)
org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
...
org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:127)
org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:136)
org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithRetry(InboundInvocationHandlerImpl.java:162)
org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:114)
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:226)
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:203)
org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:465)
org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:372)
org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:247)
org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:601)
org.jgroups.JChannel.up(JChannel.java:715)
org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
{noformat}
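To make the failure mode concrete, here is a minimal, self-contained Java sketch of the pattern the two traces show; the class and variable names are hypothetical and this is not Infinispan or JGroups code. The origin thread acquires the per-key lock and then blocks on a synchronous request dispatched to its own node, while the thread that handles that request needs the same key lock and runs into the lock acquisition timeout:
{noformat}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only - none of these names exist in Infinispan or JGroups.
public class SelfRpcLockTimeout {
    public static void main(String[] args) throws Exception {
        ReentrantLock keyLock = new ReentrantLock();                    // the per-key lock
        ExecutorService oobPool = Executors.newSingleThreadExecutor();  // stands in for the JGroups OOB pool

        keyLock.lock();                                                 // origin thread locks the key first
        try {
            // The write command is also dispatched to the local node;
            // its handler needs the very same key lock.
            Future<Boolean> remoteAck = oobPool.submit(() -> {
                boolean locked = keyLock.tryLock(1, TimeUnit.SECONDS);  // lock acquisition timeout
                if (locked) keyLock.unlock();
                return locked;                                          // false == the TimeoutException in the report
            });
            // The origin thread blocks here, still holding the key lock,
            // waiting for the handler's response.
            System.out.println("handler acquired the lock: " + remoteAck.get());
        } finally {
            keyLock.unlock();
            oobPool.shutdown();
        }
    }
}
{noformat}
Run as-is, the handler times out and returns false, which is the local equivalent of the TimeoutExceptions shown above.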
Per-key lock container leads to superfluous TimeoutExceptions on
concurrent access to same key
----------------------------------------------------------------------------------------------
Key: ISPN-2240
URL:
https://issues.jboss.org/browse/ISPN-2240
Project: Infinispan
Issue Type: Bug
Components: Locking and Concurrency
Affects Versions: 5.1.6.FINAL, 5.1.x
Reporter: Robert Stupp
Assignee: Mircea Markus
Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
Hi,
I've encountered a lot of TimeoutExceptions just running a load test against an Infinispan cluster.
I tracked down the reason and found out that the code in
org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock()
causes these superfluous TimeoutExceptions.
A small test case is attached (it just prints out timeouts and too-late timeouts and
"paints" a lot of dots to the console - more dots per second on the console means
better throughput ;-)
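As a rough, hypothetical sketch of that kind of harness (the attached test is the authoritative version; a plain ConcurrentMap stands in for the cache here just to keep the sketch self-contained):
{noformat}
import java.util.concurrent.*;

// Hypothetical sketch of the "dot-printing" load test - illustration only.
public class DotLoadTest {
    public static void main(String[] args) throws Exception {
        // A plain map stands in for the Infinispan cache to keep the sketch self-contained.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        int threads = 100;                                   // 100 parallel requests, all on one key
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                for (int n = 0; n < 1_000; n++) {
                    try {
                        cache.put("hot-key", Thread.currentThread().getName() + "-" + n);
                        System.out.print('.');               // one dot per successful put
                    } catch (RuntimeException e) {           // e.g. a TimeoutException thrown by the cache
                        System.out.println("\nTIMEOUT: " + e);
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
    }
}
{noformat}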
In a short test I extended the class ReentrantPerEntryLockContainer and changed the
implementation of releaseLock() as follows:
{noformat}
public void releaseLock(Object lockOwner, Object key) {
    ReentrantLock l = locks.get(key);
    if (l != null) {
        if (!l.isHeldByCurrentThread())
            throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
        // fully release the (reentrant) lock
        while (l.isHeldByCurrentThread())
            unlock(l, lockOwner);
        // only remove the lock from the map if no other thread is queued on it
        if (!l.hasQueuedThreads())
            locks.remove(key);
    } else {
        throw new IllegalStateException("No lock for [" + key + ']');
    }
}
{noformat}
The main improvement is that the lock is not removed from the concurrent map as long as
other threads are waiting on it.
If the lock is removed from the map while other threads are waiting for it, they may run
into timeouts and surface TimeoutExceptions to the client.
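For illustration, here is a small, self-contained sketch of one way an unconditional remove-on-release can go wrong; it uses a simplified per-key map of ReentrantLocks, not the actual Infinispan classes, and all names are hypothetical. A thread still queued on the old lock instance and a newcomer that maps the key to a fresh instance end up on two different locks for the same key; in Infinispan this kind of race shows up as the superfluous TimeoutExceptions described above:
{noformat}
import java.util.concurrent.*;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified model of a per-entry lock container - illustration only.
public class RemoveWhileWaitingRace {
    static final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    static ReentrantLock acquire(String key, long timeoutMs) throws InterruptedException {
        ReentrantLock l = locks.computeIfAbsent(key, k -> new ReentrantLock());
        return l.tryLock(timeoutMs, TimeUnit.MILLISECONDS) ? l : null;
    }

    // The problematic variant: the key is removed even though threads may still be queued on l.
    static void releaseUnconditionally(String key) {
        ReentrantLock l = locks.remove(key);
        l.unlock();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        String key = "hot-key";

        acquire(key, 100);                                                      // owner takes lock instance L1
        Future<ReentrantLock> waiter = pool.submit(() -> acquire(key, 2_000));  // second thread queues on L1
        Thread.sleep(200);                                                      // let the waiter park on L1

        releaseUnconditionally(key);                                            // L1 unlocked AND removed from the map

        ReentrantLock newcomer = acquire(key, 100);                             // key absent -> fresh instance L2
        ReentrantLock stale = waiter.get();                                     // waiter wakes up holding the stale L1

        // Two threads now "own" the same key via different lock instances.
        System.out.println("waiter and newcomer share one lock instance: " + (stale == newcomer));
        pool.shutdown();
    }
}
{noformat}
With the hasQueuedThreads() guard from the patched releaseLock() above, the key stays in the map while the waiter is queued, so all threads keep contending on the same lock instance.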
The above method "paints more dots per second", i.e. it gives better throughput for
concurrent accesses to the same key.
The re-implemented method should also fix some replication timeout exceptions.
Please, please add this to 5.1.7, if possible.