[
https://issues.jboss.org/browse/ISPN-2240?page=com.atlassian.jira.plugin....
]
Dan Berindei commented on ISPN-2240:
------------------------------------
Mircea, it doesn't seem to be the same as ISPN-2381 to me.
It looks like Robert is using a non-transactional cache, and
NonTransactionalLockingInterceptor still acquires the lock on the originator and on all
the owners - not just on the primary owner.
I'm pretty sure the fix in the description is not correct, because
`l.hasQueuedThreads()` can return false even while another thread is busy working with the
key.
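To illustrate the concern, here is a minimal sketch. It is not Infinispan code: the
class name, the plain ConcurrentMap standing in for the lock container, and the latches
used to force the interleaving are all assumptions made for the example. It shows a
thread that has already fetched the lock from the map but has not yet called lock(), so
`hasQueuedThreads()` returns false and the lock gets removed underneath it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class HasQueuedThreadsRace {

    /** Returns true if two threads ended up "owning" the same key at once. */
    public static boolean demo() {
        try {
            return run();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static boolean run() throws InterruptedException {
        // Hypothetical stand-in for the container's per-key lock map.
        ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
        String key = "k";

        // Thread A (the calling thread) holds the lock for the key.
        ReentrantLock lockOfA = new ReentrantLock();
        locks.put(key, lockOfA);
        lockOfA.lock();

        CountDownLatch bHasReference = new CountDownLatch(1);
        CountDownLatch aHasRemoved = new CountDownLatch(1);
        CountDownLatch bHoldsLock = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);

        Thread b = new Thread(() -> {
            // B has fetched the lock but has not called lock() yet,
            // so it is not in the lock's wait queue.
            ReentrantLock seenByB = locks.get(key);
            bHasReference.countDown();
            try {
                aHasRemoved.await();
                seenByB.lock(); // succeeds, but on an instance no longer in the map
                bHoldsLock.countDown();
                finished.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                if (seenByB.isHeldByCurrentThread()) seenByB.unlock();
            }
        });
        b.start();
        bHasReference.await();

        // A releases the way the proposed releaseLock() does.
        lockOfA.unlock();
        boolean removed = false;
        if (!lockOfA.hasQueuedThreads()) { // false: B is not queued yet
            locks.remove(key);
            removed = true;
        }
        aHasRemoved.countDown();
        bHoldsLock.await();

        // A third caller (played by this thread) finds no lock in the map,
        // creates a fresh one, and acquires it while B still holds the old one.
        ReentrantLock fresh = new ReentrantLock();
        ReentrantLock existing = locks.putIfAbsent(key, fresh);
        ReentrantLock lockOfC = existing != null ? existing : fresh;
        boolean cAcquired = lockOfC.tryLock();
        boolean bStillHolds = lockOfA.isLocked();

        if (cAcquired) lockOfC.unlock();
        finished.countDown();
        b.join();
        return removed && cAcquired && bStillHolds && lockOfC != lockOfA;
    }

    public static void main(String[] args) {
        System.out.println("two owners for one key: " + demo());
    }
}
```

In this forced interleaving both B and the third caller hold "the" lock for the same key
at the same time, just on different ReentrantLock instances. The real container has more
moving parts, so treat this only as a picture of the window, not as a reproduction of
the reported timeouts.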
The solution described in
https://issues.jboss.org/browse/ISPN-2240?focusedCommentId=12715426&p...
seems reasonable, mimicking the 2-phase commit in transactional caches. Perhaps doing the
RPCs in the DistributionInterceptor would be more reasonable though.
I'm not sure about the problem with L1 invalidations arriving later. I thought we
explicitly do not send L1 invalidations to the origin of the PutKeyValueCommand, but it
could be that the check is only used when the L1 invalidation threshold is non-zero (0 is
the default). This probably warrants a separate issue.
Per-key lock container leads to superfluous TimeoutExceptions on
concurrent access to same key
----------------------------------------------------------------------------------------------
Key: ISPN-2240
URL: https://issues.jboss.org/browse/ISPN-2240
Project: Infinispan
Issue Type: Bug
Components: Locking and Concurrency
Affects Versions: 5.1.6.FINAL, 5.1.x
Reporter: Robert Stupp
Assignee: Manik Surtani
Fix For: 5.2.0.Final
Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
Hi,
I've encountered a lot of TimeoutExceptions while running a load test against an
Infinispan cluster.
I tracked down the cause and found that the code in
org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock()
causes these superfluous TimeoutExceptions.
There is a small test case, which just prints out timeouts and too-late timeouts and
"paints" a lot of dots to the console (more dots per second on the console means
better throughput ;-)
In a short test I extended the class ReentrantPerEntryLockContainer and changed the
implementation of releaseLock() as follows:
{noformat}
public void releaseLock(Object lockOwner, Object key) {
    ReentrantLock l = locks.get(key);
    if (l != null) {
        if (!l.isHeldByCurrentThread())
            throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
        // Fully release the lock, even if it is held reentrantly.
        while (l.isHeldByCurrentThread())
            unlock(l, lockOwner);
        // Only drop the lock from the map if no other thread is waiting on it.
        if (!l.hasQueuedThreads())
            locks.remove(key);
    }
    else
        throw new IllegalStateException("No lock for [" + key + ']');
}
{noformat}
The main improvement is that locks are not removed from the concurrent map as long as
other threads are waiting on that lock.
If the lock is removed from the map while other threads are waiting for it, they may run
into timeouts and force TimeoutExceptions to the client.
The above method "paints more dots per second", meaning it gives better
throughput for concurrent accesses to the same key.
The re-implemented method should also fix some replication timeout exceptions.
Please, please add this to 5.1.7, if possible.