[JBoss JIRA] (ISPN-3803) Stale locks during state transfer in non-tx caches
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-3803?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-3803:
-------------------------------
Labels: 621 testsuite_stability (was: testsuite_stability)
> Stale locks during state transfer in non-tx caches
> --------------------------------------------------
>
> Key: ISPN-3803
> URL: https://issues.jboss.org/browse/ISPN-3803
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer, Transactions
> Affects Versions: 6.0.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 621, testsuite_stability
> Fix For: 7.0.0.Alpha1, 7.0.0.Final
>
>
> The problem is that {{SingleKeyNonTxInvocationContext.addLockedKey()}} only sets a {{isLocked}} flag, the actual key is set when the entry is wrapped and inserted in the context. If the topology changes between the lock acquisition and the entry wrapping, {{SingleKeyNonTxInvocationContext.getLockedKeys()}} will return an empty set and the lock won't be released.
> Future commands won't try to acquire a lock on this node, so the stale lock is harmless most of the time. But if there was already a command waiting for the lock, that command will time out (instead of retrying on the new primary owner).
> This causes random failures in NonTxPutIfAbsentDuringJoinStressTest:
> {noformat}
> 12:34:50,960 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key_52, value=value_52_2, flags=null, putIfAbsent=true, valueMatcher=MATCH_EXPECTED, metadata=EmbeddedMetadata{version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@478c9796]
> 12:34:50,960 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [NonTransactionalLockingInterceptor] Are (NodeA-19338) we the lock owners for key 'key_52'? true
> 12:34:50,960 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [LockManagerImpl] Attempting to lock key_52 with acquisition timeout of 10000 millis
> 12:34:50,962 TRACE (remote-thread-1,NodeA:___defaultcache) [NonTransactionalLockingInterceptor] Are (NodeA-19338) we the lock owners for key 'key_52'? true
> 12:34:50,962 TRACE (remote-thread-1,NodeA:___defaultcache) [LockManagerImpl] Attempting to lock key_52 with acquisition timeout of 10000 millis
> 12:34:50,965 TRACE (asyncTransportThread-0,NodeA:___defaultcache) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = false, isMember = true, topology = CacheTopology{id=4, currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-19338, NodeB-49967, NodeC-56763]}, pendingCH=null}
> 12:34:50,966 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [LockManagerImpl] Successfully acquired lock key_52!
> *** 12:34:50,966 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [EntryWrappingInterceptor] Wrapping entry 'key_52'? false
> 12:34:50,966 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [StateTransferInterceptor] Retrying command because of topology change: PutKeyValueCommand{key=key_52, value=value_52_2, flags=null, putIfAbsent=true, valueMatcher=MATCH_EXPECTED, metadata=EmbeddedMetadata{version=null}, successful=true}
> 12:34:50,978 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [NonTransactionalLockingInterceptor] Are (NodeA-19338) we the lock owners for key 'key_52'? false
> 12:34:50,978 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [EntryWrappingInterceptor] Wrapping entry 'key_52'? false
> 12:34:50,978 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [BaseDistributionInterceptor] I'm not the primary owner, so sending the command to the primary owner(NodeC-56763) in order to be forwarded
> 12:34:51,007 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [EntryWrappingInterceptor] The return value is value_52_0
> 12:35:00,963 TRACE (remote-thread-1,NodeA:___defaultcache) [ReentrantPerEntryLockContainer] Timed out attempting to acquire lock for key key_52 after 10 seconds
> 12:35:00,963 DEBUG (remote-thread-1,NodeA:___defaultcache) [LockManagerImpl] Failed to acquire lock key_52, owner is Thread[ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest,5,]
> 12:35:00,963 DEBUG (remote-thread-1,NodeA:___defaultcache) [LockManagerImpl] This transaction (Thread[remote-thread-1,NodeA,5,main]) already owned locks []
> 12:35:00,966 ERROR (testng-NonTxPutIfAbsentDuringJoinStressTest:) [UnitTestTestNGListener] Test testNodeJoiningDuringPutIfAbsent(org.infinispan.distribution.rehash.NonTxPutIfAbsentDuringJoinStressTest) failed.
> java.util.concurrent.ExecutionException: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from NodeA-19338, see cause for remote stack trace
> Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [key_52] for requestor [Thread[remote-thread-1,NodeA,5,main]]! Lock held by [Thread[ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest,5,]]{noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 2 months
[JBoss JIRA] (ISPN-4076) JON server plugin: bytesRead and bytesWritten stats are broken
by Tomas Sykora (JIRA)
Tomas Sykora created ISPN-4076:
----------------------------------
Summary: JON server plugin: bytesRead and bytesWritten stats are broken
Key: ISPN-4076
URL: https://issues.jboss.org/browse/ISPN-4076
Project: Infinispan
Issue Type: Bug
Components: JMX, reporting and management
Affects Versions: 7.0.0.Alpha1
Reporter: Tomas Sykora
Assignee: Tomas Sykora
resource path: subsystem=endpoint
and underlined connectors provides statistics: bytesRead, bytesWritten.
However plugin asks for bytes-read and bytes-written which is wrong. Small change in map in MetricsRemappingComponent is needed.
+ We will remove monitoring of those statistics for REST endpoint, as this endpoint does not provide those statistics at all.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 2 months
[JBoss JIRA] (ISPN-3868) Deadlock in RemoteCache getAsync
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-3868?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-3868:
-----------------------------------------------
Tristan Tarrant <ttarrant(a)redhat.com> changed the Status of [bug 1073331|https://bugzilla.redhat.com/show_bug.cgi?id=1073331] from NEW to POST
> Deadlock in RemoteCache getAsync
> --------------------------------
>
> Key: ISPN-3868
> URL: https://issues.jboss.org/browse/ISPN-3868
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 6.0.0.Final
> Environment: RemoteCahe component of 6.0.0.Final
> Reporter: Alexander Furer
> Assignee: Mircea Markus
> Labels: remote
> Fix For: 7.0.0.Alpha1, 7.0.0.Final
>
>
> Here is the implementation of remoteCahe.getAsync() :
> {code}
> public NotifyingFuture<V> getAsync(final K key) {
> assertRemoteCacheManagerIsStarted();
> final NotifyingFutureImpl<V> result = new NotifyingFutureImpl<V>();
> Future<V> future = executorService.submit(new Callable<V>() {
> @Override
> public V call() throws Exception {
> V toReturn = get(key);
> result.notifyFutureCompletion();
> return toReturn;
> }
> });
> result.setExecuting(future);
> return result;
> }
> {code}
> 2 problems here :
> 1. Callable's call method might be called BEFORE calling client had a chance to add listener (i.e. getAsync is not returned yet), in this case its' listener futureDone method will never be called.
> 2. Even case #1 has not happened and notifyFutureCompletion is called on listener, but the future is not resolved yet : "call" has not returned, that's why the future that is passed to listener is not resolved, and calling future.get from listener blocks forever.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 2 months