[infinispan-issues] [JBoss JIRA] (ISPN-2240) Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key

Robert Stupp (JIRA) jira-events at lists.jboss.org
Mon Sep 3 09:37:32 EDT 2012


    [ https://issues.jboss.org/browse/ISPN-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715426#comment-12715426 ] 

Robert Stupp commented on ISPN-2240:
------------------------------------

The solution could look like this:

# Add a new interceptor which is called before any locking interceptor (call it for example ModifyingInterceptor)
# Let the ModifyingInterceptor intercept all write commands
# The interceptor's logic shall be:
## If the local node is the primary master, just go through the normal interceptor chain
## If the local node is not the primary master
### Issue the command against the primary master
### Update the local cache according to the result from the primary master (ignore L1 updates from the primary master for this update)
### Return the result from the primary master

This would ensure that no deadlocks due to concurrent updates against the same cache key can occur because all locking takes place on the primary master.
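The routing logic above can be sketched in plain Java. This is a self-contained simulation only; `Node`, `put()` and `primaryOwner()` are illustrative names, not Infinispan's interceptor or consistent-hash API, and the modulo hash stands in for the real hash wheel:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of "route every write through the primary master".
final class Node {
    final int id;
    final List<Node> cluster;
    final Map<String, String> store = new ConcurrentHashMap<>();

    Node(int id, List<Node> cluster) { this.id = id; this.cluster = cluster; }

    // Steps 3.1/3.2 above: only the primary owner applies the write directly.
    String put(String key, String value) {
        Node primary = cluster.get(primaryOwner(key, cluster.size()));
        if (primary == this) {
            // Local node is the primary master: proceed with the normal
            // (locking) chain, here reduced to a plain map write.
            return store.put(key, value);
        }
        // Not the primary: issue the command against the primary master...
        String previous = primary.put(key, value);
        // ...then update the local copy from the primary's result
        // (ignoring any later L1 update for this write).
        store.put(key, value);
        return previous;
    }

    // Stand-in for the consistent hash: pick one owner per key.
    static int primaryOwner(String key, int clusterSize) {
        return Math.floorMod(key.hashCode(), clusterSize);
    }
}

public class ForwardingSketch {
    public static void main(String[] args) {
        List<Node> cluster = new ArrayList<>();
        for (int i = 0; i < 3; i++) cluster.add(new Node(i, cluster));
        // Write via a node that is NOT the primary owner of "k".
        Node nonPrimary = cluster.get((Node.primaryOwner("k", 3) + 1) % 3);
        nonPrimary.put("k", "v1");
        // Both the primary and the forwarding node now see the value.
        Node primary = cluster.get(Node.primaryOwner("k", 3));
        if (!"v1".equals(primary.store.get("k"))) throw new AssertionError();
        if (!"v1".equals(nonPrimary.store.get("k"))) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Because every write funnels through one owner, two nodes updating the same key never lock it on two nodes in opposite order, which is exactly the deadlock scenario ruled out above.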

                
> Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
> ----------------------------------------------------------------------------------------------
>
>                 Key: ISPN-2240
>                 URL: https://issues.jboss.org/browse/ISPN-2240
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Locking and Concurrency
>    Affects Versions: 5.1.6.FINAL, 5.1.x
>            Reporter: Robert Stupp
>            Assignee: Mircea Markus
>         Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
>
>
> Hi,
> I've encountered a lot of TimeoutExceptions just running a load test against an infinispan cluster.
> I tracked down the reason and found that the code in org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock() causes these superfluous TimeoutExceptions.
> A small test case is attached; it just prints out timeouts and too-late timeouts and "paints" a lot of dots to the console (more dots per second on the console means better throughput ;-)
> In a short test I extended the class ReentrantPerEntryLockContainer and changed the implementation of releaseLock() as follows:
> {noformat}
>     public void releaseLock(Object lockOwner, Object key) {
>         ReentrantLock l = locks.get(key);
>         if (l != null) {
>             if (!l.isHeldByCurrentThread())
>                 throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
>             // Fully release the (possibly reentrant) lock.
>             while (l.isHeldByCurrentThread())
>                 unlock(l, lockOwner);
>             // Only remove the map entry when no other thread is queued on
>             // this lock; otherwise waiters would be parked on an orphaned
>             // lock instance and time out.
>             if (!l.hasQueuedThreads())
>                 locks.remove(key);
>         }
>         else
>             throw new IllegalStateException("No lock for [" + key + ']');
>     }
> {noformat}
> The main improvement is that locks are not removed from the concurrent map as long as other threads are waiting on that lock.
> If the lock is removed from the map while other threads are waiting for it, they may run into timeouts and force TimeoutExceptions to the client.
> The above method "paints" more dots per second, i.e. it gives better throughput for concurrent accesses to the same key.
> The re-implemented method should also fix some replication timeout exceptions.
> Please, please add this to 5.1.7, if possible.
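To illustrate the race the quoted patch addresses, here is a self-contained sketch in plain java.util.concurrent terms. `LockContainer`, `acquireLock()` and `releaseLock()` are illustrative names, not the actual ReentrantPerEntryLockContainer API; the key point is that the map entry is kept alive while threads are queued on the lock:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Minimal per-key lock container following the idea of the patch above.
final class LockContainer {
    private final Map<Object, ReentrantLock> locks = new ConcurrentHashMap<>();

    boolean acquireLock(Object key, long timeout, TimeUnit unit) throws InterruptedException {
        while (true) {
            ReentrantLock l = locks.computeIfAbsent(key, k -> new ReentrantLock());
            if (!l.tryLock(timeout, unit))
                return false;
            // The entry may have been removed and replaced while we waited;
            // only keep the lock if it is still the mapped instance.
            if (locks.get(key) == l)
                return true;
            l.unlock();
        }
    }

    void releaseLock(Object key) {
        ReentrantLock l = locks.get(key);
        if (l == null)
            throw new IllegalStateException("No lock for [" + key + ']');
        if (!l.isHeldByCurrentThread())
            throw new IllegalStateException("Lock for [" + key + "] not held by " + Thread.currentThread());
        // Keep the map entry while other threads are queued on this lock,
        // so no waiter is left parked on an orphaned lock instance.
        if (!l.hasQueuedThreads())
            locks.remove(key);
        while (l.isHeldByCurrentThread())
            l.unlock();
    }
}

public class PerKeyLockDemo {
    public static void main(String[] args) throws InterruptedException {
        LockContainer c = new LockContainer();
        c.acquireLock("k", 1, TimeUnit.SECONDS);
        Thread waiter = new Thread(() -> {
            try {
                if (c.acquireLock("k", 5, TimeUnit.SECONDS))
                    c.releaseLock("k");
            } catch (InterruptedException ignored) { }
        });
        waiter.start();
        Thread.sleep(100);   // let the second thread queue on the lock
        c.releaseLock("k");  // entry survives because a waiter is queued
        waiter.join();
        System.out.println("both threads acquired and released without timeout");
    }
}
```

The re-check in `acquireLock()` covers the remaining window between `locks.remove(key)` and a concurrent `computeIfAbsent`: a waiter that ends up holding a lock that is no longer mapped simply retries instead of timing out.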

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

