[JBoss JIRA] (ISPN-8305) PessimisticTxPartitionAndMergeDuringPrepareTest.testPrimaryOwnerIsolatedPartitionWithDiscard[DIST_SYNC] randomly failing
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-8305?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-8305:
-----------------------------------
Description:
Dan said it could be related to ISPN-8232 (Dan had not seen failures since ISPN-6997)
{code}
Error Message
Expected key 'MagicKey#k1{151C/F56895B4/127@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561}' to be locked on cache 'Cache 'pes-cache'@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561'
Stacktrace
java.lang.AssertionError: Expected key 'MagicKey#k1{151C/F56895B4/127@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561}' to be locked on cache 'Cache 'pes-cache'@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561'
at org.infinispan.test.AbstractCacheTest.assertLocked(AbstractCacheTest.java:77)
at org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringPrepareTest.checkLocksDuringPartition(PessimisticTxPartitionAndMergeDuringPrepareTest.java:75)
at org.infinispan.partitionhandling.BasePessimisticTxPartitionAndMergeTest.doTest(BasePessimisticTxPartitionAndMergeTest.java:79)
at org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringPrepareTest.testPrimaryOwnerIsolatedPartitionWithDiscard(PessimisticTxPartitionAndMergeDuringPrepareTest.java:41)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
... Removed 18 stack frames{code}
was:
Dan said it could be related to ISPN-8232 (Dan had not seen failures since ISPN-6997)
{code}
testPrimaryOwnerIsolatedPartitionWithDiscard[DIST_SYNC]
java.lang.AssertionError: Expected key 'MagicKey#k1{151C/F56895B4/127@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561}' to be locked on cache 'Cache 'pes-cache'@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561' at org.infinispan.test.AbstractCacheTest.assertLocked(AbstractCacheTest.java:77) at org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringPrepareTest.checkLocksDuringPartition(PessimisticTxPartitionAndMergeDuringPrepareTest.java:75) at org.infinispan.partitionhandling.BasePessimisticTxPartitionAndMergeTest.doTest(BasePessimisticTxPartitionAndMergeTest.java:79) at org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringPrepareTest.testPrimaryOwnerIsolatedPartitionWithDiscard(PessimisticTxPartitionAndMergeDuringPrepareTest.java:41) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ... Removed 18 stack frames
{code}
> PessimisticTxPartitionAndMergeDuringPrepareTest.testPrimaryOwnerIsolatedPartitionWithDiscard[DIST_SYNC] randomly failing
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-8305
> URL: https://issues.jboss.org/browse/ISPN-8305
> Project: Infinispan
> Issue Type: Bug
> Reporter: Galder Zamarreño
> Labels: testsuite_stability
>
> Dan said it could be related to ISPN-8232 (Dan had not seen failures since ISPN-6997)
> {code}
> Error Message
> Expected key 'MagicKey#k1{151C/F56895B4/127@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561}' to be locked on cache 'Cache 'pes-cache'@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561'
> Stacktrace
> java.lang.AssertionError: Expected key 'MagicKey#k1{151C/F56895B4/127@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561}' to be locked on cache 'Cache 'pes-cache'@PessimisticTxPartitionAndMergeDuringPrepareTest[DIST_SYNC]-NodeB-47561'
> at org.infinispan.test.AbstractCacheTest.assertLocked(AbstractCacheTest.java:77)
> at org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringPrepareTest.checkLocksDuringPartition(PessimisticTxPartitionAndMergeDuringPrepareTest.java:75)
> at org.infinispan.partitionhandling.BasePessimisticTxPartitionAndMergeTest.doTest(BasePessimisticTxPartitionAndMergeTest.java:79)
> at org.infinispan.partitionhandling.PessimisticTxPartitionAndMergeDuringPrepareTest.testPrimaryOwnerIsolatedPartitionWithDiscard(PessimisticTxPartitionAndMergeDuringPrepareTest.java:41)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> ... Removed 18 stack frames{code}
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
8 years, 6 months
[JBoss JIRA] (ISPN-8111) OffHeapBoundedSingleNodeTest.testMultiThreaded fails with trace logging enabled
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-8111?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-8111:
------------------------------------
I'm not closing this yet, because {{OffHeapBoundedSingleNodeTest}} is still using transactions :)
> OffHeapBoundedSingleNodeTest.testMultiThreaded fails with trace logging enabled
> -------------------------------------------------------------------------------
>
> Key: ISPN-8111
> URL: https://issues.jboss.org/browse/ISPN-8111
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.1.0.Final
> Reporter: Dan Berindei
> Assignee: William Burns
> Priority: Critical
> Labels: testsuite_stability
>
> This is related to ISPN-8110, but it may be possible to fix ISPN-8110 and still have {{OffHeapBoundedSingleNodeTest.testMultiThreaded}} take more than 10 seconds.
> 10 seconds seems more than enough for a CPU-bound test, but maybe we need to allow for more time when trace logging is enabled, or maybe we can test only the data container to eliminate the unrelated logging from the (unnecessary, I think) commands and interceptors:
> {noformat}
> 15:52:10,227 TRACE (testng-Test:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=WrappedByteArray{bytes=[B0x010201046B657931, hashCode=1745974967}, value=WrappedByteArray{bytes=[B0x0102010676616C75..[10], hashCode=635645735}, flags=[], commandInvocationId=, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1} and InvocationContext [org.infinispan.context.impl.LocalTxInvocationContext@5f3c06c1]
> 15:52:10,229 TRACE (testng-Test:[]) [InvocationContextInterceptor] Invoked with command PrepareCommand {modifications=[PutKeyValueCommand{key=WrappedByteArray{bytes=[B0x010201046B657931, hashCode=1745974967}, value=WrappedByteArray{bytes=[B0x0102010676616C75..[10], hashCode=635645735}, flags=[], commandInvocationId=, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1}], onePhaseCommit=false, retried=false, gtx=GlobalTx:local:2, cacheName='___defaultcache', topologyId=-1} and InvocationContext [org.infinispan.context.impl.LocalTxInvocationContext@3bc20a52]
> 15:52:10,231 TRACE (testng-Test:[]) [InvocationContextInterceptor] Invoked with command CommitCommand {gtx=GlobalTx:local:2, cacheName='___defaultcache', topologyId=-1} and InvocationContext [org.infinispan.context.impl.LocalTxInvocationContext@2af56f39]
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
8 years, 6 months
[JBoss JIRA] (ISPN-8110) OffHeapBoundedDataContainer.ensureSize() busy loop takes too long
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-8110?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-8110:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> OffHeapBoundedDataContainer.ensureSize() busy loop takes too long
> -----------------------------------------------------------------
>
> Key: ISPN-8110
> URL: https://issues.jboss.org/browse/ISPN-8110
> Project: Infinispan
> Issue Type: Task
> Components: Core
> Affects Versions: 9.1.0.Final
> Reporter: Dan Berindei
> Assignee: William Burns
> Fix For: 9.1.1.Final
>
> Attachments: OffHeapBoundedSingleNodeTest_ISPN-5476_parallel_xsite_20170726.log.gz
>
>
> {{OffHeapBoundedDataContainer.ensureSize()}} needs to acquire the {{lruLock}} in order to check the container size, and then it needs to acquire the write lock of the entry pointed at by the LRU list head ({{firstAddress}}) in order to remove it.
> Currently it tries to acquire the entry write lock with {{locks.getLockFromHashCode(hashCode).writeLock().tryLock()}} while holding the {{lruLock}}. But the entry's read-write lock is actually shared by a bucket of entries, so another thread reading a completely different entry fail the {{tryLock}} and will force {{ensureSize()}} to unlock {{lruLock}} and try again.
> Even worse, a regular put will acquire the entry/bucket's write lock first, and then try to acquire the {{lruLock}} while holding the bucket lock. Because {{lruLock}} isn't fair, it's possible for 2 or more {{ensureSize}} threads to block a writer thread for a long time.
> {{OffHeapBoundedSingleNodeTest.testMultiThreaded}} shows a different pathological scenario, one that I don't have an explanation for yet. At least with trace logging enabled, the {{ensureSize()}} loop can repeat for 100+ milliseconds, but without any other thread logging anything:
> {noformat}
> 15:52:20,012 TRACE (ForkThread-3,Test:[]) [BoundedOffHeapDataContainer] Removing first LRU node at 140339935456768
> 15:52:20,012 TRACE (ForkThread-3,Test:[]) [UnsafeWrapper] Retrieved long value 140339734124528 from address 140339935456784
> 15:52:20,012 TRACE (ForkThread-3,Test:[]) [UnsafeWrapper] Wrote long value 0 to address 140339734124536
> 15:52:20,012 TRACE (ForkThread-3,Test:[]) [UnpooledOffHeapMemoryAllocator] Deallocating off heap memory at 140339935456768 with 28 bytes. Total size: 8727
> 15:52:20,012 TRACE (ForkThread-3,Test:[]) [UnsafeWrapper] Copying memory of object null offset by 140340069670800 to [B@38d5b85f offset by 16 with a total of 17 bytes
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-3,Test:[]) [UnpooledOffHeapMemoryAllocator] Deallocating off heap memory at 140340069670784 with 57 bytes. Total size: 8670
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Retrieved long value 140339868355200 from address 140339734124528
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [BoundedOffHeapDataContainer] Removing entry: 140339868355200 due to eviction due to size 102 being larger than maximum of 100
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Copying memory of object null offset by 140339868355216 to [B@5e23dca1 offset by 16 with a total of 17 bytes
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Copying memory of object null offset by 140339868355233 to [B@50382f4a offset by 16 with a total of 11 bytes
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Retrieved long value 0 from address 140339868355208
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Copying memory of object null offset by 140339868355216 to [B@3b554f41 offset by 16 with a total of 17 bytes
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Copying memory of object null offset by 140339868355233 to [B@472da290 offset by 16 with a total of 11 bytes
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Copying memory of object null offset by 140339868355244 to [B@7aba800 offset by 16 with a total of 0 bytes
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Copying memory of object null offset by 140339868355244 to [B@167885c0 offset by 16 with a total of 13 bytes
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-5,Test:[]) [UnsafeWrapper] Retrieved long value 140339734124528 from address 140339868355200
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,012 TRACE (ForkThread-2,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> ...
> 15:52:20,147 TRACE (ForkThread-3,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,147 TRACE (ForkThread-3,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,147 TRACE (ForkThread-3,Test:[]) [UnsafeWrapper] Retrieved int value -1075141583 from address 140339734124552
> 15:52:20,147 TRACE (ForkThread-5,Test:[]) [BoundedOffHeapDataContainer] Removing first LRU node at 140339734124528
> {noformat}
> Off-topic, it's a bit weird that {{UnsafeWrapper}} logs every address being read from/written to, but the data container doesn't log the keys and values that it reads/writes...
> I propose (after discussing this with Will) that {{ensureSize()}} should acquire the bucket lock after releasing the {{lruLock}}, so it can use {{lock()}} instead of {{tryLock()}}.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
8 years, 6 months
[JBoss JIRA] (ISPN-8263) More Hibernate random size failures
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-8263?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-8263:
-----------------------------------
Git Pull Request: https://github.com/infinispan/infinispan/pull/5417, https://github.com/infinispan/infinispan/pull/5436 (was: https://github.com/infinispan/infinispan/pull/5417)
> More Hibernate random size failures
> -----------------------------------
>
> Key: ISPN-8263
> URL: https://issues.jboss.org/browse/ISPN-8263
> Project: Infinispan
> Issue Type: Bug
> Components: Hibernate Cache
> Affects Versions: 9.1.0.Final
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Labels: testsuite_stability
>
> org.infinispan.test.hibernate.cache.functional.cluster.NaturalIdInvalidationTest.testAll[nonstrict, REPL_SYNC]
> org.infinispan.test.hibernate.cache.functional.cluster.NaturalIdInvalidationTest.testAll[nonstrict, DIST_SYNC]
> {code}
> Error Message
> expected:<1> but was:<2>
> Stacktrace
> java.lang.AssertionError: expected:<1> but was:<2>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at org.infinispan.test.hibernate.cache.functional.cluster.NaturalIdInvalidationTest.testAll(NaturalIdInvalidationTest.java:129)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at org.hibernate.testing.junit4.ExtendedFrameworkMethod.invokeExplosively(ExtendedFrameworkMethod.java:45)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
8 years, 6 months