Robert Stupp commented on ISPN-2240:
------------------------------------
We don't need it at the moment - thanks.
Per-key lock container leads to superfluous TimeoutExceptions on
concurrent access to same key
----------------------------------------------------------------------------------------------
Key: ISPN-2240
URL: https://issues.jboss.org/browse/ISPN-2240
Project: Infinispan
Issue Type: Bug
Security Level: Public(Everyone can see)
Components: Transactions
Affects Versions: 4.0.0.ALPHA1, 5.1.6.FINAL
Reporter: Robert Stupp
Assignee: Mircea Markus
Priority: Critical
Fix For: 7.0.0.Beta1, 7.0.0.Final
Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
Hi,
I've encountered a lot of TimeoutExceptions while running a load test against an
Infinispan cluster.
I tracked down the cause and found that the code in
org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock()
causes these superfluous TimeoutExceptions.
A small test case shows this (it just prints timeouts and too-late timeouts, and
"paints" a lot of dots to the console; more dots per second on the console means
better throughput ;-)).
In a short test I extended the class ReentrantPerEntryLockContainer and changed the
implementation of releaseLock() as follows:
{noformat}
public void releaseLock(Object lockOwner, Object key) {
   ReentrantLock l = locks.get(key);
   if (l != null) {
      if (!l.isHeldByCurrentThread())
         throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
      // Fully release the lock, even if it was acquired reentrantly.
      while (l.isHeldByCurrentThread())
         unlock(l, lockOwner);
      // Keep the map entry while other threads are still queued on this lock.
      if (!l.hasQueuedThreads())
         locks.remove(key);
   }
   else
      throw new IllegalStateException("No lock for [" + key + ']');
}
{noformat}
The main improvement is that locks are not removed from the concurrent map as long as
other threads are waiting on that lock.
If the lock is removed from the map while other threads are waiting for it, they may run
into timeouts, which then surface as TimeoutExceptions at the client.
The modified method above "paints more dots per second", meaning it gives better
throughput for concurrent access to the same key.
The re-implemented method should also fix some replication timeout exceptions.
Please, please add this to 5.1.7, if possible.
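For illustration only, the idea of keeping a per-key lock in the map while threads are queued on it can be sketched as a small standalone class. This is not the Infinispan code: the class name PerKeyLockSketch and its methods are hypothetical, and two details go beyond the change shown above and are assumptions of this sketch: the entry is removed before the final unlock, and acquireLock() re-checks after locking that the instance it holds is still the one in the map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a per-key lock container; NOT the Infinispan class.
class PerKeyLockSketch {
    private final ConcurrentMap<Object, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Tries to lock the key within the timeout; returns false on timeout. */
    boolean acquireLock(Object key, long timeout, TimeUnit unit) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            ReentrantLock l = locks.computeIfAbsent(key, k -> new ReentrantLock());
            if (!l.tryLock(deadline - System.nanoTime(), TimeUnit.NANOSECONDS))
                return false;
            // Assumption of this sketch: accept the lock only if it is still the
            // instance published in the map; otherwise it is stale, so retry.
            if (locks.get(key) == l)
                return true;
            l.unlock();
        }
    }

    /** Fully releases the key's lock; the entry survives while threads are queued. */
    void releaseLock(Object key) {
        ReentrantLock l = locks.get(key);
        if (l == null || !l.isHeldByCurrentThread())
            throw new IllegalStateException(
                "Lock for [" + key + "] not held by current thread " + Thread.currentThread());
        // Drop the entry only when nobody is waiting. Doing this before the
        // unlock (an assumption of this sketch) lets late arrivals detect staleness.
        if (!l.hasQueuedThreads())
            locks.remove(key, l);
        // Release every reentrant hold of the current thread.
        while (l.isHeldByCurrentThread())
            l.unlock();
    }

    /** Number of keys that currently have a lock entry. */
    int size() {
        return locks.size();
    }
}
```

A waiter that is already queued keeps blocking on the same ReentrantLock instance that stays in the map, so it is handed the lock on release instead of waiting on an orphaned instance while newer callers create a fresh one.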