[infinispan-dev] ClusteredGet and direct operations on a cacheloader
Mircea Markus
mircea.markus at jboss.com
Wed Apr 22 10:17:46 EDT 2009
Manik Surtani wrote:
>
> On 22 Apr 2009, at 14:30, Mircea Markus wrote:
>
>> Manik Surtani wrote:
>>> Mircea
>>>
>>> Regarding direct loading from a cache loader in the
>>> ClusteredGetCommand, this is an issue since
>>
>>> a) it does not acquire any locks to put this value back in the data
>>> container, and
>> Cache loaders are read-only, and stores use bucket locking - it's a
>> different locking level though...
>
> I was referring to cache-level locking. E.g., doing a load from a
> loader and then a put into the data container without any locking from
> the LockManager is a big no-no. :-)
>
>>> b) probably as a consequence of a), doesn't bother to put any looked
>>> up value in the data container at all.
>>>
>>> This is inefficient since multiple CGCs on the same key (coming from
>>> different cache instances in a cluster) would cause the owning cache
>>> to load this several times repeatedly from a loader.
>> At the same time, unique calls for different keys would increase the
>> memory footprint... though I agree with your point of view overall:
>> after all, a remote cache can be considered just another cache user.
>>>
>>> I see this problem with DIST since I use CGC to load an entry from a
>>> remote cache and if L1 caching is disabled on the requesting cache,
>>> multiple uses of an entry will result in this entry being read off a
>>> loader repeatedly. Which is no good. :-)
>>>
>>> So I think the CGC's perform() method should actually be doing a
>>> cache.get() - or something very similar - where the entire
>>> interceptor chain is invoked and locks are acquired, entries moved
>>> to the data container if loaded, etc., and metrics on cache
>>> hits/misses properly updated. The challenges here are, of course:
>>>
>>> 1. The ping-pong effect. So we need to make sure any clustered
>>> cache loader or the DistributionInterceptor do not try and load this
>>> from elsewhere in the cluster. Easily solved with the isOriginLocal
>>> flag on the context.
>> yes, +1
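For the record, that guard could be as trivial as the check below - a toy
sketch with made-up names (InvocationContext/shouldFetchRemotely are my
guesses here, not the actual interceptor API):

```java
// Toy sketch of the ping-pong guard - hypothetical names, not the real
// Infinispan interceptor API.
class InvocationContext {
    private final boolean originLocal;
    InvocationContext(boolean originLocal) { this.originLocal = originLocal; }
    boolean isOriginLocal() { return originLocal; }
}

class DistributionInterceptorSketch {
    // Go to the cluster only for locally-originated misses: a
    // ClusteredGetCommand arriving from another node carries
    // originLocal=false, so the lookup is never bounced back out.
    boolean shouldFetchRemotely(InvocationContext ctx, boolean foundLocally) {
        return ctx.isOriginLocal() && !foundLocally;
    }
}
```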
>>> 2. Making sure we get an InternalCacheEntry and not just a value
>>> to return. This is trickier since ICEs are not exposed via public
>>> APIs at all. Perhaps we need a separate, non-replicable Command for
>>> this purpose - maybe a subclass of GetKeyValueCommand
>>> (GetCacheEntryCommand?) which (a) cannot be replicated and (b) will
>>> return the ICE.
>> I see, it gets a bit complicated now... TBH, I think the 'problem' is
>> that our interceptors are a bit too 'fat' - with this interceptor
>> approach we "spread" the ClusteredGet logic across too many places:
>> - in the ClusteredGet command, which does the call
>> - in the DistributionInterceptor, where you have to add an additional
>> if to check that the call is local
>> - (other interceptors will have to do that as well).
>>
>> Having thinner interceptors would keep the logic in one place only
>> (e.g. the LockingInterceptor is thin: all locking logic lives in the
>> LockManager and every call is delegated to it, with no 'fat' logic in
>> the interceptor itself). That would make the code easier to follow and
>> change (and would solve the above issue of not having access to
>> InternalCacheEntry).
>> So, the code in ClusteredGetCommand might look something like this:
>>
>> perform() {
>>   lockManager.lock(key);
>>   try {
>>     InternalCacheEntry ice = dataContainer.get(key);
>>     if (ice != null) return ice;
>>     return loaderManager != null ? loaderManager.get(key) : null;
>>   } finally {
>>     lockManager.release(key); // released on every exit path
>>   }
>> }
>
> I think this is evil (!) because it effectively duplicates the
> sequence of events determined by the interceptors. Plus it introduces
> brittleness and a lack of scalability and customisability since it
> short-circuits what other interceptors do. E.g., what if I introduce
> a new interceptor between LockInterceptor and CacheLoaderInterceptor
> which is meant to decrypt data from the store since what is loaded is
> encrypted, and I cannot do this in the store since the key is on the
> caller's thread (maybe a JAAS Principal)? It also already breaks
> any cache statistics on loads, etc. being collected by the JMX
> interceptors.
>
> In general I strongly disagree with duplicating this sequence
> elsewhere. The interceptors are self-contained and are designed to
> ignore and pass on invocations they are not interested in.
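True - and that pass-on design is easy to show in isolation: each
interceptor either acts on a command or simply delegates to the next one.
Toy code, nothing Infinispan-specific (all names invented):

```java
// Toy interceptor chain: commands flow down the chain, and interceptors
// that are not interested in a command simply delegate to the next one.
interface Command { String name(); }

class Interceptor {
    Interceptor next;
    Object invoke(Command cmd) { return next == null ? null : next.invoke(cmd); }
}

class StatsInterceptor extends Interceptor {
    final StringBuilder seen = new StringBuilder();
    @Override Object invoke(Command cmd) {
        seen.append(cmd.name()).append(';'); // observe every command...
        return super.invoke(cmd);            // ...then pass it on unchanged
    }
}

class GetInterceptor extends Interceptor {
    @Override Object invoke(Command cmd) {
        if ("get".equals(cmd.name())) return "value"; // handles gets only
        return super.invoke(cmd);                     // ignores the rest
    }
}
```

Adding an interceptor (encryption, JMX stats, ...) between two others never
requires touching the commands themselves - which is exactly the point.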
>
>> I think the code is much more self-explanatory than a plain
>> cache.get(), which does lots of things not immediately visible.
>
> That's the whole point. It should *not* be visible as to what happens
> during a get() since this should be decided on configuration, custom
> interceptors being injected, etc. There is no way you know this at
> compile time to hard code this sequence into a single
> command.perform() method.
>
>> Also, if one wants to add another lookup from somewhere else, it's easier.
>
> Why should there ever be other lookups from elsewhere? This only
> leads to potential for breakage.
>
>> There is no code duplication as well, as all the work is delegated to
>> the managers.
>
> Not duplication of work but definite duplication of sequence of events.
>
You are right :)
I guess the GetCacheEntryCommand is the way to go, as you suggested.
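Something along these lines, I imagine - a non-replicable subclass whose
perform() hands back the entry rather than the unwrapped value. All names
below are guesses (the real ICE and command classes live in core):

```java
// Sketch of (a) a non-replicable get that (b) returns the whole entry.
// Hypothetical names - not the actual Infinispan command hierarchy.
import java.util.Map;

class InternalCacheEntrySketch {
    final Object key, value;
    final long lifespan;
    InternalCacheEntrySketch(Object key, Object value, long lifespan) {
        this.key = key; this.value = value; this.lifespan = lifespan;
    }
}

class GetKeyValueCommandSketch {
    final Object key;
    final Map<Object, InternalCacheEntrySketch> container;
    GetKeyValueCommandSketch(Object key, Map<Object, InternalCacheEntrySketch> c) {
        this.key = key; this.container = c;
    }
    Object perform() {                        // normal get: unwraps the value
        InternalCacheEntrySketch e = container.get(key);
        return e == null ? null : e.value;
    }
    boolean shouldReplicate() { return true; }
}

class GetCacheEntryCommandSketch extends GetKeyValueCommandSketch {
    GetCacheEntryCommandSketch(Object key, Map<Object, InternalCacheEntrySketch> c) {
        super(key, c);
    }
    @Override Object perform() {              // (b) returns the ICE itself,
        return container.get(key);            // lifespan and all
    }
    @Override boolean shouldReplicate() { return false; } // (a) local only
}
```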
> Cheers
> --
> Manik Surtani
> manik at jboss.org
> Lead, Infinispan
> Lead, JBoss Cache
> http://www.infinispan.org
> http://www.jbosscache.org
>