[JBoss JIRA] (ISPN-3803) Stale locks during state transfer in non-tx caches
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-3803?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-3803:
-----------------------------------------------
Tristan Tarrant <ttarrant(a)redhat.com> changed the Status of [bug 1073327|https://bugzilla.redhat.com/show_bug.cgi?id=1073327] from MODIFIED to ON_QA
> Stale locks during state transfer in non-tx caches
> --------------------------------------------------
>
> Key: ISPN-3803
> URL: https://issues.jboss.org/browse/ISPN-3803
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer, Transactions
> Affects Versions: 6.0.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 621, testsuite_stability
> Fix For: 7.0.0.Alpha1, 7.0.0.Final
>
>
> The problem is that {{SingleKeyNonTxInvocationContext.addLockedKey()}} only sets a {{isLocked}} flag, the actual key is set when the entry is wrapped and inserted in the context. If the topology changes between the lock acquisition and the entry wrapping, {{SingleKeyNonTxInvocationContext.getLockedKeys()}} will return an empty set and the lock won't be released.
> Future commands won't try to acquire a lock on this node, so the stale lock is harmless most of the time. But if there was already a command waiting for the lock, that command will time out (instead of retrying on the new primary owner).
> This causes random failures in NonTxPutIfAbsentDuringJoinStressTest:
> {noformat}
> 12:34:50,960 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key_52, value=value_52_2, flags=null, putIfAbsent=true, valueMatcher=MATCH_EXPECTED, metadata=EmbeddedMetadata{version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@478c9796]
> 12:34:50,960 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [NonTransactionalLockingInterceptor] Are (NodeA-19338) we the lock owners for key 'key_52'? true
> 12:34:50,960 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [LockManagerImpl] Attempting to lock key_52 with acquisition timeout of 10000 millis
> 12:34:50,962 TRACE (remote-thread-1,NodeA:___defaultcache) [NonTransactionalLockingInterceptor] Are (NodeA-19338) we the lock owners for key 'key_52'? true
> 12:34:50,962 TRACE (remote-thread-1,NodeA:___defaultcache) [LockManagerImpl] Attempting to lock key_52 with acquisition timeout of 10000 millis
> 12:34:50,965 TRACE (asyncTransportThread-0,NodeA:___defaultcache) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = false, isMember = true, topology = CacheTopology{id=4, currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-19338, NodeB-49967, NodeC-56763]}, pendingCH=null}
> 12:34:50,966 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [LockManagerImpl] Successfully acquired lock key_52!
> *** 12:34:50,966 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [EntryWrappingInterceptor] Wrapping entry 'key_52'? false
> 12:34:50,966 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [StateTransferInterceptor] Retrying command because of topology change: PutKeyValueCommand{key=key_52, value=value_52_2, flags=null, putIfAbsent=true, valueMatcher=MATCH_EXPECTED, metadata=EmbeddedMetadata{version=null}, successful=true}
> 12:34:50,978 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [NonTransactionalLockingInterceptor] Are (NodeA-19338) we the lock owners for key 'key_52'? false
> 12:34:50,978 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [EntryWrappingInterceptor] Wrapping entry 'key_52'? false
> 12:34:50,978 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [BaseDistributionInterceptor] I'm not the primary owner, so sending the command to the primary owner(NodeC-56763) in order to be forwarded
> 12:34:51,007 TRACE (ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest:___defaultcache) [EntryWrappingInterceptor] The return value is value_52_0
> 12:35:00,963 TRACE (remote-thread-1,NodeA:___defaultcache) [ReentrantPerEntryLockContainer] Timed out attempting to acquire lock for key key_52 after 10 seconds
> 12:35:00,963 DEBUG (remote-thread-1,NodeA:___defaultcache) [LockManagerImpl] Failed to acquire lock key_52, owner is Thread[ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest,5,]
> 12:35:00,963 DEBUG (remote-thread-1,NodeA:___defaultcache) [LockManagerImpl] This transaction (Thread[remote-thread-1,NodeA,5,main]) already owned locks []
> 12:35:00,966 ERROR (testng-NonTxPutIfAbsentDuringJoinStressTest:) [UnitTestTestNGListener] Test testNodeJoiningDuringPutIfAbsent(org.infinispan.distribution.rehash.NonTxPutIfAbsentDuringJoinStressTest) failed.
> java.util.concurrent.ExecutionException: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from NodeA-19338, see cause for remote stack trace
> Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [key_52] for requestor [Thread[remote-thread-1,NodeA,5,main]]! Lock held by [Thread[ForkThread-3,NonTxPutIfAbsentDuringJoinStressTest,5,]]{noformat}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 10 months
[JBoss JIRA] (ISPN-4083) Do not invalidate null Transport
by Radim Vansa (JIRA)
Radim Vansa created ISPN-4083:
---------------------------------
Summary: Do not invalidate null Transport
Key: ISPN-4083
URL: https://issues.jboss.org/browse/ISPN-4083
Project: Infinispan
Issue Type: Bug
Components: Remote Protocols
Affects Versions: 7.0.0.Alpha1
Reporter: Radim Vansa
Assignee: Galder Zamarreño
Priority: Minor
In RetryOnFailureOperation.execute, when the getTransport throws TransportException, we try to invalidate the transport. However, it may be still null - in that case the invalidation fails and WARNing is logged.
Add a check preventing invalidation with null transport.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 10 months
[JBoss JIRA] (ISPN-4082) TcpTransportFactory shouldn't log error when it cannot fetch transport
by Radim Vansa (JIRA)
Radim Vansa created ISPN-4082:
---------------------------------
Summary: TcpTransportFactory shouldn't log error when it cannot fetch transport
Key: ISPN-4082
URL: https://issues.jboss.org/browse/ISPN-4082
Project: Infinispan
Issue Type: Feature Request
Components: Remote Protocols
Affects Versions: 7.0.0.Alpha1
Reporter: Radim Vansa
Assignee: Galder Zamarreño
Priority: Minor
With most HotRod operations being automatically retried when the connection fails, it's rather undesired to log errors each time one of the server fails - it confuses the users as if the application on client side did some error.
I believe that the verbosity of log.couldNotFetchTransport (used in TcpTransportFactory.borrowTransportFromPool) should be reduced to trace or debug level.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 10 months
[JBoss JIRA] (ISPN-4082) TcpTransportFactory shouldn't log error when it cannot fetch transport
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-4082?page=com.atlassian.jira.plugin.... ]
Radim Vansa updated ISPN-4082:
------------------------------
Issue Type: Bug (was: Feature Request)
> TcpTransportFactory shouldn't log error when it cannot fetch transport
> ----------------------------------------------------------------------
>
> Key: ISPN-4082
> URL: https://issues.jboss.org/browse/ISPN-4082
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Protocols
> Affects Versions: 7.0.0.Alpha1
> Reporter: Radim Vansa
> Assignee: Galder Zamarreño
> Priority: Minor
>
> With most HotRod operations being automatically retried when the connection fails, it's rather undesired to log errors each time one of the server fails - it confuses the users as if the application on client side did some error.
> I believe that the verbosity of log.couldNotFetchTransport (used in TcpTransportFactory.borrowTransportFromPool) should be reduced to trace or debug level.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 10 months
[JBoss JIRA] (ISPN-4081) JBoss Marshalling 1.4.4.Final (ASL)
by Tristan Tarrant (JIRA)
Tristan Tarrant created ISPN-4081:
-------------------------------------
Summary: JBoss Marshalling 1.4.4.Final (ASL)
Key: ISPN-4081
URL: https://issues.jboss.org/browse/ISPN-4081
Project: Infinispan
Issue Type: Component Upgrade
Affects Versions: 6.0.1.Final
Reporter: Tristan Tarrant
Assignee: Tristan Tarrant
Fix For: 6.0.2.Final
JBoss Marshalling 1.4.4.Final has been licensed under the ASL (and fixes a couple of bugs).
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 10 months
[JBoss JIRA] (ISPN-3868) Deadlock in RemoteCache getAsync
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-3868?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-3868:
-----------------------------------------------
Pedro Ruivo <pruivo(a)redhat.com> changed the Status of [bug 1073331|https://bugzilla.redhat.com/show_bug.cgi?id=1073331] from POST to MODIFIED
> Deadlock in RemoteCache getAsync
> --------------------------------
>
> Key: ISPN-3868
> URL: https://issues.jboss.org/browse/ISPN-3868
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 6.0.0.Final
> Environment: RemoteCahe component of 6.0.0.Final
> Reporter: Alexander Furer
> Assignee: Mircea Markus
> Labels: remote
> Fix For: 7.0.0.Alpha1, 7.0.0.Final
>
>
> Here is the implementation of remoteCahe.getAsync() :
> {code}
> public NotifyingFuture<V> getAsync(final K key) {
> assertRemoteCacheManagerIsStarted();
> final NotifyingFutureImpl<V> result = new NotifyingFutureImpl<V>();
> Future<V> future = executorService.submit(new Callable<V>() {
> @Override
> public V call() throws Exception {
> V toReturn = get(key);
> result.notifyFutureCompletion();
> return toReturn;
> }
> });
> result.setExecuting(future);
> return result;
> }
> {code}
> 2 problems here :
> 1. Callable's call method might be called BEFORE calling client had a chance to add listener (i.e. getAsync is not returned yet), in this case its' listener futureDone method will never be called.
> 2. Even case #1 has not happened and notifyFutureCompletion is called on listener, but the future is not resolved yet : "call" has not returned, that's why the future that is passed to listener is not resolved, and calling future.get from listener blocks forever.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 10 months