[infinispan-dev] Fwd: CloudCacheStore Bug

Manik Surtani manik at jboss.org
Thu Feb 4 10:29:49 EST 2010


On 4 Feb 2010, at 14:42, Philippe Van Dyck wrote:

> 
> 
> I'm trying to think of how this can be.  Worker threads adding data, adding stuff to the async cache store queue for flushing.  The eviction thread removing stuff from the data container *only*.  
> 
> *Perhaps* what you see is a race where you have:
> 
> 1 add item to data container
> 2 enqueue in async cache store for storage
> 3 evict in memory
> 4 attempt a get
> 
> Actually, the more I think about it, the transaction probably fails because the data container has been emptied (get(key) no longer works)... But it is definitely not supposed to die silently!
>  
> 
> where steps 1 - 4 happen *before* the async cache store can flush its queue to disk.  So this would result in the thread in 4 consulting the data container, not finding the entry, then checking the cache store and not finding it there either since it hasn't been flushed yet.  
> 
> Now IMO this is normal behaviour - the price you pay for asynchronously writing to a store.  But perhaps this window can be
> 
> Am I missing something? Losing data is something I cannot afford! I plan to use this store as a *permanent* one... I have no backup! (Actually S3 is the backup) - So, no, I don't want this... at any price ;-)

Then set <async enabled="false" /> in your cache store config.  :-)

>  
> reduced by looking through the async queue as well, before checking the underlying store.  But as I said, this just reduces the size of this window and does not eliminate it altogether, since this is async and there is no guarantee that the cache store has finished writing internally (e.g., an fsync() operation or, in the case of S3, Amazon's eventual consistency model).
> 
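The queue lookup described above could look roughly like the following. This is only a minimal sketch, not Infinispan's actual async store: `PendingAwareStore` and all of its fields and methods are hypothetical names, and a plain in-memory map stands in for the disk or S3 backend.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch (not Infinispan code): an async store whose reads also see pending writes. */
final class PendingAwareStore {
    // Writes that have been enqueued but not yet flushed.
    private final Map<String, String> pending = new ConcurrentHashMap<>();
    // Stands in for the underlying store (file system, S3, ...).
    private final Map<String, String> backing = new ConcurrentHashMap<>();

    /** Enqueue a write; it reaches the backing store only when flush() runs. */
    void storeAsync(String key, String value) {
        pending.put(key, value);
    }

    /** Read path that consults the pending queue first, then the backing store. */
    String load(String key) {
        String queued = pending.get(key);
        return queued != null ? queued : backing.get(key);
    }

    /** Read path without the queue lookup: this is the one that misses in step 4. */
    String loadFromBackingOnly(String key) {
        return backing.get(key);
    }

    /** Stands in for the background flusher thread. */
    void flush() {
        for (Map.Entry<String, String> e : pending.entrySet()) {
            backing.put(e.getKey(), e.getValue());
            // Drop the entry only if it was not overwritten while we were flushing it.
            pending.remove(e.getKey(), e.getValue());
        }
    }
}
```

With this shape, a get that arrives after the in-memory eviction but before the flush still finds the value in `pending`. It does nothing about durability inside the backing store itself, which is the part of the window that cannot be closed this way.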
> 
> Why should eviction be transactional?  I don't need eviction to be an all-or-nothing, reversible event. :)  If an entry gets evicted, cool.  If not (for whatever reason), too bad, move on to the next evictable entry.  
> 
> You are right, we don't want to rollback evictions... but maybe we should use a priority queue to be sure that evictions are done after any other command ? Doesn't it solve it all ? 
> 
> 1) The eviction thread runs (we could lower the priority of this thread too)
> 2) It fills a queue of keys to evict
> 3) The async queue is prioritized and evicts entries ... when there is nothing else to do (suddenly it looks like garbage collecting)
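
The prioritized queue suggested in the three steps above might be sketched like this. It is an illustration under assumptions, not existing Infinispan code: `PrioritizedCommands`, `Kind` and `Command` are hypothetical names, and a single consumer draining the queue is assumed.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: one queue in which evictions always sort behind store commands. */
final class PrioritizedCommands {
    // Declaration order doubles as priority: STORE is served before EVICT.
    enum Kind { STORE, EVICT }

    static final class Command implements Comparable<Command> {
        private static final AtomicLong SEQ = new AtomicLong();
        final Kind kind;
        final String key;
        // Sequence number keeps FIFO order among commands of the same kind.
        final long seq = SEQ.getAndIncrement();

        Command(Kind kind, String key) {
            this.kind = kind;
            this.key = key;
        }

        @Override
        public int compareTo(Command other) {
            int byKind = kind.compareTo(other.kind);
            return byKind != 0 ? byKind : Long.compare(seq, other.seq);
        }
    }

    private final PriorityBlockingQueue<Command> queue = new PriorityBlockingQueue<>();

    /** Called by worker threads (STORE) and by the eviction thread (EVICT). */
    void submit(Kind kind, String key) {
        queue.add(new Command(kind, key));
    }

    /** Returns the next command to run, or null if the queue is empty. */
    Command next() {
        return queue.poll();
    }
}
```

An EVICT is only handed out once no STORE is waiting, so with a single consumer an entry is never evicted while its own write is still queued.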

That is a possibility.  But I don't expect to be making any drastic changes to the existing eviction code anymore.  Don't know if you have been following discussions re: LIRS, lock amortization, etc., but Vladimir is working on some very interesting self-evicting, bounded data containers which would mean that the eviction threads, etc all get ripped out.

>   
> 
> WDYT ?
> 
> 
> 
> Cheers
> Manik
> 
>> 
>> Looks like a design issue ? WDYT ?
>> 
>> 
>> Cheers,
>> 
>> Phil
>> 
>> 
>> On Thu, Feb 4, 2010 at 10:44 AM, Manik Surtani <manik at jboss.org> wrote:
>> That is strange since there is no correlation between eviction and the synchronicity of cache stores.  Have you got a reproducible test for this?
>> 
>> Cheers
>> Manik
>> 
>> On 3 Feb 2010, at 18:37, Philippe Van Dyck wrote:
>> 
>>> Thanks Manik,
>>> 
>>> I have another problem with eviction: it seems to destroy cache entries, but only when I use async.
>>> 
>>> Of course, all updates are transactional.
>>> 
>>> Where should I search for clues ? Any idea ?
>>> 
>>> Here is my config:
>>> 
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> 
>>> <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>> 	xmlns="urn:infinispan:config:4.0">
>>> 	<global>
>>> 		<transport
>>> 			transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
>>> 			<properties>
>>> 				<property name="configurationFile" value="jgroups.xml" />
>>> 			</properties>
>>> 		</transport>
>>> 
>>> 	</global>
>>> 
>>> 	<namedCache name="qi4j">
>>> 		<transaction
>>> 			transactionManagerLookupClass="org.infinispan.transaction.lookup.DummyTransactionManagerLookup" />
>>> 		<clustering mode="distribution">
>>> 			<l1 enabled="true" lifespan="100000" />
>>> 			<hash numOwners="1" rehashRpcTimeout="120000" />
>>> 		</clustering>
>>> 
>>> 		<loaders passivation="false" shared="true" preload="false">
>>> 
>>> 			<loader class="org.infinispan.loaders.file.FileCacheStore"
>>> 				fetchPersistentState="false" ignoreModifications="false"
>>> 				purgeOnStartup="true">
>>> 				<properties>
>>> 					<property name="location" value="/tmp" />
>>> 				</properties>
>>> 				<async enabled="true" threadPoolSize="3" />
>>> 			</loader>
>>> 
>>> 		</loaders>
>>> 		
>>> 		<deadlockDetection enabled="true" spinDuration="1000"></deadlockDetection>
>>> 
>>> 		<eviction strategy="FIFO" wakeUpInterval="1000" maxEntries="10" />
>>> 
>>> 		<unsafe unreliableReturnValues="true" />
>>> 
>>> 	</namedCache>
>>> </infinispan>
>>> 
>>> 
>>> phil
>>> 
>>> 
>>> 
>>> On Wed, Feb 3, 2010 at 6:42 PM, Manik Surtani <manik at jboss.org> wrote:
>>> Ugh, good point.  I thought the unit tests would have trapped a dumb-ass mistake like this.
>>> 
>>> The reason for transforming the name of the bucket is that we usually use hashcodes as the bucket name, which can range from Integer.MIN_VALUE to Integer.MAX_VALUE.  These are then translated into Strings, and this becomes the name of the storage unit, e.g., 12345.bucket in the FileCacheStore.  Now filesystems are happy to accept a -12345.bucket but certain cloud storage providers barf when encountering the '-' character.  Hence the transformation to A12345.bucket in some cases.
>>> 
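The transformation and its inverse can be illustrated as below. This is a sketch of the idea only, not the actual `getBucketName` code in CloudCacheStore: `BucketNames`, `encode` and `decode` are hypothetical names, and the `.bucket` suffix is left out.

```java
/** Hypothetical sketch of the '-' to 'A' bucket-name mapping and the inverse it needs. */
final class BucketNames {
    private BucketNames() {
    }

    /** Turns a hashcode into a name with no '-' in it, e.g. -12345 becomes "A12345". */
    static String encode(int hash) {
        return Integer.toString(hash).replace('-', 'A');
    }

    /** Inverse of encode: "A12345" becomes -12345 again. */
    static int decode(String bucketName) {
        // A decimal integer never contains 'A', so mapping it back is unambiguous.
        return Integer.parseInt(bucketName.replace('A', '-'));
    }
}
```

The point of the bug report is that every code path that maps a stored name back to a hashcode has to apply the inverse, otherwise negative hashcodes are written under one name and looked up under another.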
>>> Cheers
>>> Manik
>>> 
>>> PS: pushing up a new snapshot as I type, containing this fix + lower verbosity on eviction-related lock timeouts.
>>> 
>>> On 3 Feb 2010, at 17:16, Philippe Van Dyck wrote:
>>> 
>>>> And BTW, why do it ?
>>>> 
>>>> p
>>>> 
>>>> ---------- Forwarded message ----------
>>>> From: Philippe Van Dyck <pvdyck at gmail.com>
>>>> Date: Wed, Feb 3, 2010 at 6:15 PM
>>>> Subject: CloudCacheStore Bug
>>>> To: infinispan -Dev List <infinispan-dev at lists.jboss.org>
>>>> 
>>>> 
>>>> Hi all,
>>>> 
>>>> there is a bug in CloudCacheStore that makes me feel like I am the only one using it ;-)
>>>> 
>>>> in CR4 : if you change the "-" sign to "A" in getBucketName ... you need to do the opposite somewhere (or call it every time) ;-)
>>>> 
>>>> WDYT ?
>>>> 
>>>> p
>>>> 
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> 
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> 
>>> --
>>> Manik Surtani
>>> manik at jboss.org
>>> Lead, Infinispan
>>> Lead, JBoss Cache
>>> http://www.infinispan.org
>>> http://www.jbosscache.org
>> 
> 
> 

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org



