[infinispan-dev] Writing a custom CacheStore: MarshalledEntryFactory

Dan Berindei dan.berindei at gmail.com
Tue Jul 15 16:35:55 EDT 2014


On Tue, Jul 15, 2014 at 5:49 PM, Sanne Grinovero <sanne at infinispan.org>
wrote:

> On 15 July 2014 14:15, Dan Berindei <dan.berindei at gmail.com> wrote:
> >
> >
> >
> > On Tue, Jul 15, 2014 at 12:51 AM, Sanne Grinovero <sanne at infinispan.org>
> > wrote:
> >>
> >> Hi all,
> >> I was toying with a custom CacheStore experiment, and am having some
> >> friction with some of the new SPIs.
> >>
> >> So interface org.infinispan.marshall.core.MarshalledEntryFactory<K, V>
> >> is an helper to use in the CacheStorei implementation, which exposes
> >> three methods:
> >>
> >>    MarshalledEntry<K,V> newMarshalledEntry(ByteBuffer key, ByteBuffer
> >> valueBytes, ByteBuffer metadataBytes);
> >>    MarshalledEntry<K,V> newMarshalledEntry(Object key, ByteBuffer
> >> valueBytes, ByteBuffer metadataBytes);
> >>    MarshalledEntry<K,V> newMarshalledEntry(Object key, Object value,
> >> InternalMetadata im);
> >>
> >> In my CacheStore - and I suspect most efficiency minded
> >> implementations - I don't care about the value Object but I express a
> >> specific physical layout for the metadata, so to run for example an
> >> efficient "purge expired" task.
> >> So, the key is given, the value Object needs to be serialized, but the
> >> InternalMetadata I can map to specific fields.
> >>
> >> Problem is at read time: I don't have a marshalled version of the
> >> Metadata but I need to unmarshall the value.. there is no helper to
> >> cover for this case.
> >>
> >> Wouldn't this interface be more practical if it had:
> >>
> >> Object unMarshallKey(ByteBuffer);
> >> Object unMarshallValue(ByteBuffer);
> >> InternalMetadata unMarshallMetadata(ByteBuffer);
> >> MarshalledEntry newMarshalledEntry(Object key, Object value,
> >> InternalMetadata im);
> >
> >
> > I guess the idea was that MarshalledEntry unmarshalls the key, value, and
> > metadata lazily. Even a purge listener may only be interested in the key,
> > and in that case case we can avoid unmarshalling the value.
> >
> > I think you can do what you want by stuffing the bytes for everything in
> the
> > MarshalledEntry, and unmarshalling the data via
> > MarshalledEntry.getMetadata().
>
>
> That's what I did but it's far from optimal, so I'm proposing the
> improvement.
>

Could you expand a bit on why this is suboptimal? The way I see it, you
have to support any custom Metadata implementation anyway, so you have to
either read and deserialize the entire metadata, or store a duplicate of
the expiration timestamp somewhere else.

In LevelDB we store the timestamps in a separate DB, doing 2 LevelDB writes
for every store write (with expiration), and I would be very interested in
an alternative solution that only wrote to 1 DB.


> >> Also, I'd get rid of generics. They are not helping at all, I can
> >> hardly couple my custom CacheStore implementation to the end user's
> >> domain model, right?
> >
> >
> > I can see the user writing a custom JDBC store that stores the object's
> > properties into separate columns and thus supporting a single type. But
> it's
> > a pretty specialized case, and the user can very well do the casts
> himself.
>
> Exactly
>
> >
> > Maybe Paul and Will have more stuff to add here, they been discussing
> about
> > generics in the cache store SPIs around
> > https://github.com/infinispan/infinispan/pull/2705
> >
> >>
> >> I was also quite surprised that other existing CacheStore
> >> implementations don't have this limitation; peeking in the
> >> JDBCCacheStore to see how this is supposed to work, it seems that
> >> essentially it duplicates the data by serializazing the
> >> InternalMetadata in the BLOB but also stored an Expiry column to query
> >> via SQL. I was interested to see how the Purge method could be
> >> implemented efficiently, and found a "TODO notify listeners" ;-)
> >
> >
> > I believe the reason why we don't support purge listeners in the JDBC
> store
> > is that we don't want to fetch the entries from the database at all. We
> > can't ask CacheNotifier whether there are any listeners registered ATM,
> we
> > need that to avoid the overhead when there are no listeners.
>
> I understand the reason, still it's wrong isn't it ;-)
> Having to load each entry for the "maybe there's a listener" case is
> definitely silly, we should inform the CacheStore instance if
> notifications are needed or not, and if they need just the key or the
> whole entry, no need for metadata.
> And for CacheStore instances which don't respect this we should at
> least document the limitation, or open a JIRA to get it done. Would
> you agree if I opened issues for each weirdness I'm finding in the
> existing CacheStore implementations? I've hit some more in the
> meantime.
>
>
Sure. I'm not too enthusiastic about changing the persistence SPI, but we
should at least have a discussion on the pros and cons of the choices we
are making there.


>
> >
> >>
> >> All other JDBC based stores serialize buckets in groups, REST store
> >> doesn't do purging, LevelDB also does duplication for the metadata,
> >> Cassandra is outdated and doesn't do events on expiry.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20140715/dbda9166/attachment.html 


More information about the infinispan-dev mailing list