[infinispan-dev] Fixing ISPN-2384 (data loss with concurrent activation/passivation)

Galder Zamarreño galder at redhat.com
Tue Oct 30 14:23:44 EDT 2012


Hi all,

Re: https://issues.jboss.org/browse/ISPN-2384

I've created a unit test in https://github.com/galderz/infinispan/commit/01230d40df6f26720039986916c38de8be33b44b

That was the easy part :). How to fix it is not so clear. In pseudo-code, the race condition happens when:

1. T1. passivate entry X to cache store
2. T2. retrieve X from memory
3. T2. activation interceptor removes X from store
4. T1. evicts X from memory

The end result is that X is gone from both memory and cache store.
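
To make the interleaving concrete, here is a minimal single-threaded replay of the four steps above. It is only a sketch: the two maps and the PassivationRace class are hypothetical stand-ins for the data container and the cache store, not Infinispan classes, and the steps are simply executed in the problematic order rather than raced on real threads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model for illustration only, not Infinispan code.
class PassivationRace {
    final Map<String, String> memory = new ConcurrentHashMap<>(); // stands in for the data container
    final Map<String, String> store = new ConcurrentHashMap<>();  // stands in for the cache store

    /** Replays steps 1-4 in the problematic order; returns true if X survives anywhere. */
    boolean replay() {
        memory.put("X", "v");

        // 1. T1 passivates entry X to the cache store
        store.put("X", memory.get("X"));

        // 2. T2 retrieves X from memory
        String seen = memory.get("X");

        // 3. T2's activation step removes X from the store
        if (seen != null) {
            store.remove("X");
        }

        // 4. T1 evicts X from memory
        memory.remove("X");

        return memory.containsKey("X") || store.containsKey("X");
    }

    public static void main(String[] args) {
        System.out.println("X survives: " + new PassivationRace().replay()); // prints "X survives: false"
    }
}
```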

I've thought of several ways of fixing it, but I'm not convinced by any of them:

One way to fix this is to make steps 1 and 4 atomic. I was hoping to piggyback on the segment lock, but for that to work, step 2 (the data container get() op) would need to wait for this segment lock, which would be detrimental to performance.

Another way would be for activation to happen only if the data was retrieved from the cache store (and not from memory), since removing from the cache store when the value came from memory is rather pointless. The problem with this solution is that the source of the data is not currently shipped around in the interceptor stack. IOW, the activation interceptor doesn't know whether the data came from memory or the cache store. This could potentially be recorded in the cache entry stored in the context, but it requires some refactoring and it could still be vulnerable to this sequence of events:

1. T1. passivate entry X to cache store
2. T2. retrieve X from cache store
3. T2. store X in memory
4. T2. activation interceptor removes X from store
5. T1. evicts X from memory
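
As a rough illustration of the "record the source" idea and why it is not enough on its own, the sketch below replays both interleavings single-threaded. The Source enum, the activate() method and the two maps are assumptions made up for this example; they do not correspond to the real interceptor or cache entry API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model for illustration only, not Infinispan code.
class SourceAwareActivation {
    enum Source { MEMORY, STORE }

    final Map<String, String> memory = new ConcurrentHashMap<>();
    final Map<String, String> store = new ConcurrentHashMap<>();

    // Activation removes from the store only when the value was loaded from it.
    void activate(String key, Source source) {
        if (source == Source.STORE) {
            store.remove(key);
        }
    }

    boolean survives(String key) {
        return memory.containsKey(key) || store.containsKey(key);
    }

    /** First interleaving: T2 reads X from memory, so activation is skipped. */
    static boolean replayReadFromMemory() {
        SourceAwareActivation c = new SourceAwareActivation();
        c.memory.put("X", "v");
        c.store.put("X", c.memory.get("X"));  // T1 passivates X
        c.memory.get("X");                    // T2 retrieves X from memory
        c.activate("X", Source.MEMORY);       // T2 activation is a no-op
        c.memory.remove("X");                 // T1 evicts X
        return c.survives("X");               // X is still in the store
    }

    /** Second interleaving: the five steps listed above, taken literally. */
    static boolean replayReadFromStore() {
        SourceAwareActivation c = new SourceAwareActivation();
        c.memory.put("X", "v");
        c.store.put("X", c.memory.get("X"));  // 1. T1 passivates X
        String v = c.store.get("X");          // 2. T2 retrieves X from the cache store
        c.memory.put("X", v);                 // 3. T2 stores X in memory
        c.activate("X", Source.STORE);        // 4. T2 activation removes X from the store
        c.memory.remove("X");                 // 5. T1 evicts X from memory
        return c.survives("X");               // X is gone from both
    }

    public static void main(String[] args) {
        System.out.println("read from memory, X survives: " + replayReadFromMemory()); // true
        System.out.println("read from store, X survives: " + replayReadFromStore());   // false
    }
}
```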

So, to get around this sequence of events, since storing an entry in memory acquires a segment lock, passivating and evicting X could be done within that same segment lock, which would make sure that the evict happens before the storage in memory:

1. T1. acquire segment lock
2. T1. passivate entry X to cache store
3. T2. retrieve X from cache store
4. T1. evicts X from memory
5. T1. releases segment lock
6. T2. acquires segment lock
7. T2. store X in memory
8. T2. activation interceptor removes X from store
9. T2. releases segment lock
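
A minimal sketch of this locking scheme, under loud assumptions: a single ReentrantLock stands in for the segment lock, two maps stand in for the data container and the cache store, and all class and method names are hypothetical. Reads stay lock-free; only passivate+evict and store-in-memory+activation take the lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model for illustration only, not Infinispan code.
class SegmentLockedCache {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final Map<String, String> store = new ConcurrentHashMap<>();
    private final ReentrantLock segmentLock = new ReentrantLock(); // stands in for the segment lock

    void put(String key, String value) {
        segmentLock.lock();
        try {
            memory.put(key, value);
        } finally {
            segmentLock.unlock();
        }
    }

    // T1: passivate and evict as one unit under the lock (steps 1, 2, 4 and 5).
    void passivateAndEvict(String key) {
        segmentLock.lock();
        try {
            String value = memory.get(key);
            if (value != null) {
                store.put(key, value);
                memory.remove(key);
            }
        } finally {
            segmentLock.unlock();
        }
    }

    // T2: lock-free reads (step 3), then store in memory and activate under the lock (steps 6-9).
    String get(String key) {
        String value = memory.get(key);
        if (value != null) {
            return value;
        }
        value = store.get(key);
        if (value == null) {
            return null;
        }
        segmentLock.lock();
        try {
            memory.put(key, value);
            store.remove(key);
        } finally {
            segmentLock.unlock();
        }
        return value;
    }

    boolean containsAnywhere(String key) {
        return memory.containsKey(key) || store.containsKey(key);
    }

    /** Runs passivation and reads concurrently, then reports whether X still exists somewhere. */
    static boolean stress(int iterations) {
        SegmentLockedCache cache = new SegmentLockedCache();
        cache.put("X", "v");
        Thread t1 = new Thread(() -> { for (int i = 0; i < iterations; i++) cache.passivateAndEvict("X"); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < iterations; i++) cache.get("X"); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.containsAnywhere("X");
    }

    public static void main(String[] args) {
        System.out.println("X survives: " + stress(100_000)); // prints "X survives: true"
    }
}
```

Since every mutation of either map happens inside the lock, X is in exactly one of the two places whenever the lock is free, so it cannot be lost. The sketch only checks the final state; a concurrent reader could still see a transient miss, and updates or removals of X are not modelled.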

I think this could work, but I was wondering if you could see other potential solutions?

Cheers,
--
Galder Zamarreño
galder at redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org



