[infinispan-dev] ISPN-699 - proper cancellation of cache store operations

Galder Zamarreño galder at redhat.com
Wed Oct 20 09:13:06 EDT 2010


Hi,

Re: https://jira.jboss.org/browse/ISPN-699

I'm trying to figure out the best way to solve this issue. Basically, when the cache manager is stopped, EvictionManagerImpl cancels the evictionTask with interruption, and I'm seeing cacheStore.purgeExpired() not respond to that cancellation properly. As a result, the Marshaller gets stopped while the eviction thread is still trying to purge the cache store. Obviously, once the marshaller is stopped, nothing can be read any more.
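
To make the failure mode concrete, here is a minimal, self-contained sketch (plain java.util.concurrent, not Infinispan code) of a task that keeps running after being cancelled with interruption. The class and method names are made up for illustration, and the spin loop is only a stand-in for a purge that never checks its interrupt status.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; none of these names come from Infinispan.
final class CancelDemo {

    /** Returns true if the task was still running after cancel(true) returned. */
    static boolean taskOutlivesCancel() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean release = new AtomicBoolean(false);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        try {
            Future<?> task = executor.submit(() -> {
                started.countDown();
                // Stand-in for a purge loop that never looks at the interrupt flag.
                while (!release.get()) {
                    Thread.yield();
                }
                sawInterrupt.set(Thread.currentThread().isInterrupted());
                finished.countDown();
            });
            started.await();
            // cancel(true) only interrupts the worker thread; it cannot stop it.
            task.cancel(true);
            boolean stillRunning = finished.getCount() == 1;
            // Let the task end so the executor can shut down cleanly.
            release.set(true);
            finished.await();
            return stillRunning && sawInterrupt.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

The interrupt flag is set on the worker thread, but nothing acts on it, so whatever is stopped right after the cancel call (the marshaller, in our case) can disappear from under the still-running task.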

I've tried simply guarding the cacheStore.purgeExpired() call with a Thread.currentThread().isInterrupted() check, but this is not enough: we could have hundreds of buckets to check for purging, and the interruption could happen while looping through them. I don't see the point of littering the code with Thread.currentThread().isInterrupted() checks. Instead, I wanted to share some other ideas for solving this issue:
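
As a rough illustration of why a single up-front check is not enough, the sketch below simulates a stop request arriving part-way through the bucket loop. Everything here is hypothetical: purgeExpired(int, int) is a toy stand-in for the real cache store method, and the self-interrupt only simulates another thread cancelling the task.

```java
// Hypothetical demo class; the method signatures are invented for illustration.
final class GuardDemo {

    /**
     * Toy stand-in for a cache store purge: visits every bucket and
     * returns how many were visited. A stop request is simulated by
     * interrupting the current thread when bucket interruptAt is reached.
     */
    static int purgeExpired(int buckets, int interruptAt) {
        int visited = 0;
        for (int i = 0; i < buckets; i++) {
            if (i == interruptAt) {
                Thread.currentThread().interrupt(); // simulated cancellation mid-loop
            }
            visited++; // the loop never checks the flag, so it carries on
        }
        return visited;
    }

    /** The guard only helps if the interrupt arrived before the call. */
    static int guardedPurge(int buckets, int interruptAt) {
        if (Thread.currentThread().isInterrupted()) {
            return 0;
        }
        return purgeExpired(buckets, interruptAt);
    }
}
```

With 1000 buckets and the interruption arriving at bucket 10, guardedPurge still visits all 1000 buckets; it only returns early when the thread was already interrupted before the call.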

1. EvictionManagerImpl could wait for any ongoing eviction task to finish. This could be lengthy if, for example, the cache store has hundreds or thousands of buckets, and we don't want stop requests to block at all.

2. The main problem is that the marshaller is being asked to read something but can't do so any more because it's shutting down. An alternative would be for the ConstantObjectTable to return null when it is stopped and the thread is interrupted. This might work, since a bucket read that returns null causes the bucket to be skipped, but it's not optimal: with thousands of buckets, the loop would still continue through all of them, so you'd need something else, and this logic would need to be replicated in all cache stores.

3. The ConstantObjectTable (COT) returning null might not be a good idea. Instead, the ObjectTable readObject() method could be changed to declare that it can throw an InterruptedException. This would force every caller to deal with this situation, including all current cache stores. I think this is the only way we can force cache stores to respond properly to interruption in situations like the one described above. Otherwise, we have to start doing esoteric things in all cache stores to work out what a null return from unmarshalling means, or try to second-guess whether an IOException might be wrapping an InterruptedException.
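
A minimal sketch of what option 3 could look like from a cache store's point of view. The BucketReader interface and its readBucket method are hypothetical and are NOT the real ObjectTable signature; they only illustrate how a checked InterruptedException forces the caller to stop the loop instead of skipping bucket after bucket.

```java
// Hypothetical sketch; BucketReader is an invented stand-in, not the ObjectTable API.
final class Option3Sketch {

    /** Invented reader whose read method declares InterruptedException. */
    interface BucketReader {
        Object readBucket(int id) throws InterruptedException;
    }

    /** Toy reader that behaves as "stopped" after a fixed number of reads. */
    static final class StoppableReader implements BucketReader {
        private final int stopAfter;
        private int reads;

        StoppableReader(int stopAfter) {
            this.stopAfter = stopAfter;
        }

        @Override
        public Object readBucket(int id) throws InterruptedException {
            if (reads >= stopAfter) {
                throw new InterruptedException("reader stopped");
            }
            reads++;
            return "bucket-" + id;
        }
    }

    /** Purge loop: returns the number of buckets processed before stopping. */
    static int purge(BucketReader reader, int buckets) {
        int processed = 0;
        try {
            for (int i = 0; i < buckets; i++) {
                reader.readBucket(i); // the compiler forces us to handle interruption
                processed++;
            }
        } catch (InterruptedException e) {
            // Restore the flag so code further up the stack can see it, then stop.
            Thread.currentThread().interrupt();
        }
        return processed;
    }
}
```

In this toy version, purge(new StoppableReader(3), 1000) stops after 3 buckets rather than walking the remaining 997, and no cache store has to guess what a null return means.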

I'm currently leaning towards 3.

Thoughts? 
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



