[infinispan-issues] [JBoss JIRA] Commented: (ISPN-277) LRU data container endlesly looping or exhibiting heavy contention
Manik Surtani (JIRA)
jira-events at lists.jboss.org
Wed Dec 9 11:59:29 EST 2009
[ https://jira.jboss.org/jira/browse/ISPN-277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12499075#action_12499075 ]
Manik Surtani commented on ISPN-277:
------------------------------------
Corrected this. Essentially,
* FIFODataContainer and LRUDataContainer are implementations of a concurrent linked hashmap, similar to the JDK's LinkedHashMap, utilising the Sundell/Tsigas lock-free linked list algorithm [1].
* There is a race in the CorrectPrev() function, which leads to an infinite loop under very specific conditions. It is easy to reproduce with a stress test I have.
* Fixing this is very hard. I now have two alternative implementations (FIFODataContainer and FIFOAMRDataContainer; see the Javadocs on both to understand the differences).
It appears that the problem lies in the concurrent update of a link on get as well as put, something that the algorithm may not support. I am in conversation with the algorithm designers to work through this in more detail.
In the meantime, I have implemented FIFO and LRU data containers using standard JDK collections, but they will perform much worse during iteration, O(N log N), although put, get and remove keep the same O(1) performance. There is a memory overhead too, since last-used or creation timestamps are stored for every entry (an extra 8 bytes per entry for immortal entries, no extra overhead for others).
My plan is to go with these implementations for now, and once we have sorted out the concurrent linked list issues we can look at reverting to the better-performing implementations.
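To illustrate the trade-off of the interim approach, here is a minimal sketch. It is not the actual Infinispan code: the class name TimestampOrderedContainer, the touchOnGet flag and the keysInEvictionOrder() method are hypothetical, and a logical counter stands in for real timestamps so that the ordering is deterministic. It uses current Java syntax for brevity. Each entry carries one long (the extra 8 bytes), put/get/remove are plain hash map operations, and ordering is only computed by sorting when iterating.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a timestamp-ordered container built on JDK collections.
final class TimestampOrderedContainer<K, V> {

    private static final class Entry<K, V> {
        final K key;
        final V value;
        volatile long stamp; // creation (FIFO) or last-used (LRU) stamp: one extra long per entry

        Entry(K key, V value, long stamp) {
            this.key = key;
            this.value = value;
            this.stamp = stamp;
        }
    }

    private final ConcurrentMap<K, Entry<K, V>> entries = new ConcurrentHashMap<>();
    private final AtomicLong clock = new AtomicLong(); // assumption: logical clock instead of wall-clock time
    private final boolean touchOnGet;                  // true gives LRU ordering, false gives FIFO ordering

    TimestampOrderedContainer(boolean touchOnGet) {
        this.touchOnGet = touchOnGet;
    }

    // O(1): no linked list to maintain, so there are no links to update concurrently.
    void put(K key, V value) {
        entries.put(key, new Entry<>(key, value, clock.incrementAndGet()));
    }

    // O(1): in LRU mode a read only overwrites the entry's own stamp.
    V get(K key) {
        Entry<K, V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (touchOnGet) {
            e.stamp = clock.incrementAndGet();
        }
        return e.value;
    }

    // O(1)
    V remove(K key) {
        Entry<K, V> e = entries.remove(key);
        return e == null ? null : e.value;
    }

    // O(N log N): snapshot the stamps first so concurrent gets cannot change
    // values mid-sort, then sort oldest first.
    List<K> keysInEvictionOrder() {
        List<Map.Entry<K, Long>> snapshot = new ArrayList<>();
        for (Entry<K, V> e : entries.values()) {
            snapshot.add(new AbstractMap.SimpleImmutableEntry<>(e.key, e.stamp));
        }
        snapshot.sort(Map.Entry.comparingByValue());
        List<K> keys = new ArrayList<>(snapshot.size());
        for (Map.Entry<K, Long> s : snapshot) {
            keys.add(s.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        TimestampOrderedContainer<String, Integer> lru = new TimestampOrderedContainer<>(true);
        lru.put("a", 1);
        lru.put("b", 2);
        lru.put("c", 3);
        lru.get("a"); // touching "a" makes it the most recently used
        System.out.println(lru.keysInEvictionOrder()); // prints [b, c, a]
    }
}
```

With touchOnGet set to false the same sequence yields [a, b, c], since reads no longer change the stamp. The cost of avoiding the linked list is that every ordered iteration pays for a full sort.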
[1] http://www.md.chalmers.se/~tsigas/papers/Lock-Free-Deques-Doubly-Lists-JPDC.pdf
> LRU data container endlesly looping or exhibiting heavy contention
> ------------------------------------------------------------------
>
> Key: ISPN-277
> URL: https://jira.jboss.org/jira/browse/ISPN-277
> Project: Infinispan
> Issue Type: Bug
> Components: Eviction
> Affects Versions: 4.0.0.CR2
> Reporter: Galder Zamarreno
> Assignee: Manik Surtani
> Priority: Critical
> Fix For: 4.0.0.CR3
>
> Attachments: 3dumps-fail.txt, td2.txt
>
>
> Something around the LRU container is not working properly. The attached log from a concurrency test in the 2nd level cache shows that in 3 thread dumps taken over 30 seconds apart, UserRunnerThread-5 is stuck in:
> "UserRunnerThread-5" prio=10 tid=0x6f65bc00 nid=0xdea runnable [0x05efb000]
> java.lang.Thread.State: RUNNABLE
> at org.infinispan.container.FIFODataContainer$LinkedEntry.casNext(FIFODataContainer.java:180)
> at org.infinispan.container.FIFODataContainer.correctPrev(FIFODataContainer.java:329)
> at org.infinispan.container.FIFODataContainer.linkAtEnd(FIFODataContainer.java:252)
> at org.infinispan.container.LRUDataContainer.put(LRUDataContainer.java:70)
> at org.infinispan.container.entries.ReadCommittedEntry.commit(ReadCommittedEntry.java:161)
> at org.infinispan.interceptors.LockingInterceptor.commitEntry(LockingInterceptor.java:298)
> at org.infinispan.interceptors.LockingInterceptor.cleanupLocks(LockingInterceptor.java:281)
> at org.infinispan.interceptors.LockingInterceptor.doAfterCall(LockingInterceptor.java:243)
> at org.infinispan.interceptors.LockingInterceptor.visitPutKeyValueCommand(LockingInterceptor.java:200)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
> at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:57)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
> at org.infinispan.interceptors.MarshalledValueInterceptor.visitPutKeyValueCommand(MarshalledValueInterceptor.java:93)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
> at org.infinispan.interceptors.TxInterceptor.enlistWriteAndInvokeNext(TxInterceptor.java:185)
> at org.infinispan.interceptors.TxInterceptor.visitPutKeyValueCommand(TxInterceptor.java:132)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:48)
> at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:34)
> at org.infinispan.commands.AbstractVisitor.visitPutKeyValueCommand(AbstractVisitor.java:57)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:76)
> at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:269)
> at org.infinispan.CacheDelegate.putIfAbsent(CacheDelegate.java:422)
> at org.infinispan.CacheDelegate.putIfAbsent(CacheDelegate.java:153)
> at org.infinispan.CacheDelegate.putForExternalRead(CacheDelegate.java:243)
> at org.hibernate.cache.infinispan.util.CacheAdapterImpl.putForExternalRead(CacheAdapterImpl.java:115)
> at org.hibernate.cache.infinispan.access.TransactionalAccessDelegate.putFromLoad(TransactionalAccessDelegate.java:91)
> at org.hibernate.cache.infinispan.collection.TransactionalAccess.putFromLoad(TransactionalAccess.java:44)
> at org.hibernate.engine.loading.CollectionLoadContext.addCollectionToCache(CollectionLoadContext.java:333)
> at org.hibernate.engine.loading.CollectionLoadContext.endLoadingCollection(CollectionLoadContext.java:279)
> at org.hibernate.engine.loading.CollectionLoadContext.endLoadingCollections(CollectionLoadContext.java:245)
> at org.hibernate.engine.loading.CollectionLoadContext.endLoadingCollections(CollectionLoadContext.java:218)
> at org.hibernate.loader.Loader.endCollectionLoad(Loader.java:901)
> at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:886)
> at org.hibernate.loader.Loader.doQuery(Loader.java:750)
> at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:257)
> at org.hibernate.loader.Loader.loadCollection(Loader.java:2019)
> at org.hibernate.loader.collection.CollectionLoader.initialize(CollectionLoader.java:62)
> at org.hibernate.persister.collection.AbstractCollectionPersister.initialize(AbstractCollectionPersister.java:628)
> at org.hibernate.event.def.DefaultInitializeCollectionEventListener.onInitializeCollection(DefaultInitializeCollectionEventListener.java:83)
> at org.hibernate.impl.SessionImpl.initializeCollection(SessionImpl.java:1817)
> at org.hibernate.collection.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:366)
> at org.hibernate.collection.AbstractPersistentCollection.read(AbstractPersistentCollection.java:108)
> at org.hibernate.collection.AbstractPersistentCollection.readSize(AbstractPersistentCollection.java:131)
> at org.hibernate.collection.PersistentSet.isEmpty(PersistentSet.java:169)
> at org.hibernate.test.cache.infinispan.functional.ConcurrentWriteTest.getFirstContact(ConcurrentWriteTest.java:405)
> at org.hibernate.test.cache.infinispan.functional.ConcurrentWriteTest.access$0(ConcurrentWriteTest.java:397)
> at org.hibernate.test.cache.infinispan.functional.ConcurrentWriteTest$UserRunner.contactExists(ConcurrentWriteTest.java:519)
> at org.hibernate.test.cache.infinispan.functional.ConcurrentWriteTest$UserRunner.call(ConcurrentWriteTest.java:540)
> at org.hibernate.test.cache.infinispan.functional.ConcurrentWriteTest$UserRunner.call(ConcurrentWriteTest.java:1)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:636)
> This concurrency test does not get to finish in over a minute. However, once LRU is switched to NONE or FIFO, the test runs in 5 seconds.
> So, something looks fishy with LRU.