[infinispan-dev] Entry grouping feature

Fri Dec 4 06:15:42 EST 2009

On Thu, Dec 3, 2009 at 5:49 PM, Manik Surtani <manik at jboss.org> wrote:

>> Correct me if I'm wrong, but aside from AtomicMap specifics, this is
>> essentially a Map of Maps, isn't it? Which makes this approach similar
>> to what Sanne was offering a couple of emails ago.
>
> Not really - AtomicMaps allow for "group-level" locking.  So the entire entry is locked for
> write if you are updating any single key in the AtomicMap (hence the name). AtomicMaps
> also have another characteristic that they expose Deltas for replication to reduce network
> traffic.

I don't disagree - that's why I said "aside from AtomicMap
specifics". My point was that, since the cache references internal
map(s), this effectively makes a map of maps. But as you have
mentioned, this is irrelevant anyway, so it isn't worth discussing further.

> Interceptors are pluggable - both via XML and via configuration, as well as at runtime.
> When I mentioned these, I was suggesting that you maintain some metadata in the
> cache, which acts as 'group information'.  But this would require your deducing the group
> name on the fly (perhaps a function of the key used?)  If the group name has to be
> explicitly passed in then this approach wont work.

I was thinking of a similar approach myself: each key would always be
wrapped in a "group decorator" object. Each such decorator would
then be tracked in a separate {group name => list of keys} map.
This might be doable, assuming there are facilities to maintain such
a map efficiently.

By "maintain efficiently" I mean, for example, removing the corresponding
group map entry when a cache entry gets evicted from an LRU cache because
of idleness. Moreover, this has to be independent of cache specifics (LRU
vs LFU), clustering and so on.
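
As a rough sketch of the bookkeeping described above (plain Java
collections only, not an existing Infinispan or OSCache API; the names
GroupedKey, GroupIndex, onPut and onRemove are made up for illustration,
and a set is used instead of a list so that removal stays cheap):

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical "group decorator": pairs a cache key with its group name.
final class GroupedKey<K> {
    final String group;
    final K key;

    GroupedKey(String group, K key) {
        this.group = group;
        this.key = key;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof GroupedKey)) return false;
        GroupedKey<?> other = (GroupedKey<?>) o;
        return group.equals(other.group) && key.equals(other.key);
    }

    @Override
    public int hashCode() {
        return 31 * group.hashCode() + key.hashCode();
    }
}

// Hypothetical index kept beside the cache: {group name => set of keys}.
final class GroupIndex<K> {
    private final Map<String, Set<GroupedKey<K>>> groups = new ConcurrentHashMap<>();

    // Assumed to be called whenever an entry is stored in the cache.
    void onPut(GroupedKey<K> k) {
        groups.compute(k.group, (g, keys) -> {
            Set<GroupedKey<K>> set =
                (keys != null) ? keys : ConcurrentHashMap.<GroupedKey<K>>newKeySet();
            set.add(k);
            return set;
        });
    }

    // Assumed to be called whenever an entry leaves the cache
    // (eviction, expiry or explicit removal), whatever the eviction policy.
    void onRemove(GroupedKey<K> k) {
        groups.computeIfPresent(k.group, (g, keys) -> {
            keys.remove(k);
            return keys.isEmpty() ? null : keys; // drop the group once it is empty
        });
    }

    // Read-only view of the keys currently registered under a group.
    Set<GroupedKey<K>> keysOf(String group) {
        Set<GroupedKey<K>> keys = groups.get(group);
        return keys == null ? Collections.emptySet() : Collections.unmodifiableSet(keys);
    }

    boolean hasGroup(String group) {
        return groups.containsKey(group);
    }
}
```

The sketch assumes the cache can notify the index on every removal; how
that notification would be wired up (and how it behaves in a cluster) is
exactly the open question.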

That's more or less how the OSCache grouping feature works. Does it sound
like a sensible scenario for Infinispan?

> But anyway, lets think about why you need such "grouping" in the first place.

I have deliberately tried to avoid burdening this mailing list with my
own problems, but since you've asked :)

I have a large graph of computationally expensive objects (pieces of XML
text) which I store in the cache. The algorithm traverses the graph,
assembles the necessary nodes in the right order and spits out the result.

There can be multiple invariants of the same object in the graph, e.g.
there can be different XML representations of the "John Smith" profile.
When caching all these different invariants, I put them under the same
unique group, so if the underlying object changes, flushing all of these
invariant entries is a matter of calling a single method.
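
To illustrate the "single method" flush, here is a minimal sketch over
plain Java maps. GroupFlushingCache and flushGroup are hypothetical names
for illustration only, not an Infinispan or OSCache API, and the sketch
assumes each key belongs to exactly one group:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical wrapper: stores entries and remembers which group each key is in.
final class GroupFlushingCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Map<String, Set<K>> keysByGroup = new HashMap<>();

    synchronized void put(String group, K key, V value) {
        entries.put(key, value);
        keysByGroup.computeIfAbsent(group, g -> new HashSet<>()).add(key);
    }

    synchronized V get(K key) {
        return entries.get(key);
    }

    // Removes every entry registered under the group and
    // returns the number of entries that were still present.
    synchronized int flushGroup(String group) {
        Set<K> keys = keysByGroup.remove(group);
        if (keys == null) {
            return 0;
        }
        int removed = 0;
        for (K key : keys) {
            if (entries.remove(key) != null) {
                removed++;
            }
        }
        return removed;
    }
}
```

With this, all representations of one profile could be stored under a
group such as "profile:john-smith" and dropped together with
flushGroup("profile:john-smith") when the underlying object changes.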

This is a very simplified description; the whole thing can be much more
complex, e.g.

A ==uses==> B ==uses==> invariant(A) ==uses==> invariant(B)

The current grouping functionality is just what I need and fits the job
perfectly. Maintaining separate caches for different objects/trees
(or possibly using JTA) would add a lot of unnecessary complexity to
an already complicated algorithm.

> separately in the cache but still maintain relational knowledge?  If so, the approach to
> doing this is the JPA-like API [1] which will maintain references for you, cascade
> removals, etc.

Thanks for the pointer. My first impression from looking at the JIRA
issue and the doc is that this isn't entirely what I'm after - but I'll
definitely have a deeper look.

> And if you do wish to help build this, it would be much appreciated. :)

Absolutely! But as I said, I'd need a bit of guidance and advice -
e.g. where to stick the methods, what is the best way to achieve this,
etc. I could probably start prototyping this and see how it goes.

Regards,
Mindaugas