[infinispan-dev] [ISPN-78] Alternative interface for writing large objects

Tue Mar 29 06:49:43 EDT 2011

I've started working on ISPN-78 - Large Object Support - closely 
following Manik's design document 
http://community.jboss.org/wiki/LargeObjectSupport. As a starting point 
I'm currently trying to implement

OutputStream writeToKey(K key),

which, for the time being, I chose to declare on AdvancedCache rather 
than on Cache proper.

While thinking about the implications, I stumbled upon a few questions 
which may well be owing to my lack of knowledge about Infinispan's inner 
workings.

1. OutputStream writeToKey(K key) implies that the interaction between 
user code and Infinispan happens in the OutputStream returned. Unlike 
with existing methods, there would be no well-defined *single* point where 
control passes from user code to Infinispan. Instead, a user would write 
a few bytes, passing control to Infinispan. Infinispan would buffer 
those bytes and return control to the user until a preconfigured chunk 
size is reached, whereupon Infinispan would probably issue a more or 
less standard request to store that chunk on some node. Rinse and repeat.
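
A minimal sketch of that buffering behaviour, to make the control flow 
concrete. Everything here is an assumption for illustration: the class 
name ChunkingOutputStream, the "key#index" chunk naming, and the plain 
Map standing in for the cache. It is not Infinispan code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;

/**
 * Hypothetical sketch: buffers user writes and stores one chunk every
 * chunkSize bytes. A plain Map stands in for the cache.
 */
class ChunkingOutputStream extends OutputStream {
    private final Map<String, byte[]> store; // stand-in for the cache
    private final String key;
    private final int chunkSize;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int chunkIndex = 0;

    ChunkingOutputStream(Map<String, byte[]> store, String key, int chunkSize) {
        this.store = store;
        this.key = key;
        this.chunkSize = chunkSize;
    }

    @Override
    public void write(int b) throws IOException {
        // Control passes to "Infinispan" on every byte written ...
        buffer.write(b);
        // ... but a store request is only issued once a chunk is full.
        if (buffer.size() >= chunkSize) {
            flushChunk();
        }
    }

    @Override
    public void close() throws IOException {
        // The final, possibly short, chunk is stored on close.
        if (buffer.size() > 0) {
            flushChunk();
        }
    }

    private void flushChunk() {
        // Chunk key naming ("key#index") is an assumption of this sketch.
        store.put(key + "#" + chunkIndex++, buffer.toByteArray());
        buffer.reset();
    }
}
```

Note that each chunk becomes an ordinary put; nothing in this class 
represents the large object write as a whole.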

This is certainly doable, but it leaves me wondering where the proposed 
ChunkingInterceptor might come into play. As I understand it, 
interceptors intercept commands, and in this scenario there could be no 
PutLargeObjectCommand to encapsulate the whole process because, as noted 
above, there is no single point where control passes from user code to 
Infinispan. Instead, chunking would have to be done directly within the 
CacheOutputStream returned, and the only commands involved would be the 
more or less standard PutKeyValueCommands that store the individual 
chunks.

Does this make sense? If so, I'd prefer to have

void writeToKey(K key, InputStream largeObject)

instead. Thus, after calling this method, control would be handed over to 
Infinispan until that LargeObject is stored, and we could indeed have 
some PutLargeObjectCommand to encapsulate the whole process.
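
A rough sketch of how the InputStream variant would give an interceptor 
something to intercept. The classes below are hypothetical stand-ins 
written for this mail: they do not use Infinispan's real command or 
interceptor API, the Map stands in for the cache, and the "key#index" 
chunk naming is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Map;

/** Hypothetical command: the whole large-object write as one unit. */
final class PutLargeObjectCommand {
    final String key;
    final InputStream largeObject;

    PutLargeObjectCommand(String key, InputStream largeObject) {
        this.key = key;
        this.largeObject = largeObject;
    }
}

/** Hypothetical interceptor that turns one command into per-chunk puts. */
final class ChunkingInterceptor {
    private final Map<String, byte[]> store; // stand-in for the cache
    private final int chunkSize;

    ChunkingInterceptor(Map<String, byte[]> store, int chunkSize) {
        this.store = store;
        this.chunkSize = chunkSize;
    }

    /** Drains the stream chunk by chunk; returns the number of chunks stored. */
    int visitPutLargeObject(PutLargeObjectCommand cmd) throws IOException {
        byte[] buf = new byte[chunkSize];
        int chunks = 0;
        int n;
        while ((n = fill(cmd.largeObject, buf)) > 0) {
            // Each chunk would become an ordinary put in the real thing.
            store.put(cmd.key + "#" + chunks++, Arrays.copyOf(buf, n));
        }
        return chunks;
    }

    /** Reads until buf is full or the stream ends; returns bytes read. */
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }
}
```

Here the caller hands over the stream once, and the loop over chunks 
runs entirely on the Infinispan side of the call.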

2. For the mapping Key -> LargeObjectMetadata I would intuitively choose 
to use a dedicated "system" cache. Is this the Infinispan way of doing 
things? If so, where can I find some code to use as a template? If not, 
what would be an alternative approach that is in keeping with 
Infinispan's architecture?

3. The design suggests using a fresh UUID as the key for each new 
chunk. While this in all likelihood gives us a unique new key for each 
chunk, I currently fail to see how that guarantees that this key maps to 
a node that is different from all the nodes already used to store chunks 
of the same Large Object. But then again, I know next to nothing about 
Infinispan's consistent hashing algorithm.
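
To illustrate the concern, here is a toy hash ring. It is NOT 
Infinispan's actual hashing implementation, only a generic model I am 
assuming for the sake of argument: the owner of a key is the first node 
at or after the key's hash, wrapping around. The class and method names 
are made up.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.UUID;

/** Toy consistent-hash ring; not Infinispan's actual implementation. */
final class ToyHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    ToyHashRing(String... nodes) {
        for (String node : nodes) {
            ring.put(node.hashCode(), node);
        }
    }

    /** Owner is the first node at or after the key's hash, wrapping around. */
    String ownerOf(String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(key.hashCode());
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** Counts the distinct nodes owning chunkCount fresh UUID keys. */
    int distinctOwners(int chunkCount) {
        Set<String> owners = new HashSet<>();
        for (int i = 0; i < chunkCount; i++) {
            owners.add(ownerOf(UUID.randomUUID().toString()));
        }
        return owners.size();
    }
}
```

In this model the owner depends only on the key's hash and the ring 
layout, not on where sibling chunks ended up, so nothing prevents two 
chunks of the same object from landing on the same node.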

4. Finally, the problem regarding eager locking and transactions 
mentioned in Manik's comment seems rather ... hairy. If we indeed forgo 
transactions, readers of a key that is just being written shouldn't be 
affected, provided we write the LargeObjectMetadata object only after all 
chunks have been written. But what about writers?
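
A small sketch of the "metadata last" ordering, again with plain Maps 
standing in for the caches and with made-up names (MetadataLastStore, 
storeChunks, publish). It models no locking at all, which is exactly the 
case I am unsure about.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch: chunks first, metadata last, no locking. */
final class MetadataLastStore {
    final Map<String, byte[]> chunks = new HashMap<>();
    final Map<String, List<String>> metadata = new HashMap<>();

    /** Step 1: store chunks under fresh UUID keys; invisible to readers. */
    List<String> storeChunks(byte[]... parts) {
        List<String> chunkKeys = new ArrayList<>();
        for (byte[] part : parts) {
            String chunkKey = UUID.randomUUID().toString();
            chunks.put(chunkKey, part);
            chunkKeys.add(chunkKey);
        }
        return chunkKeys;
    }

    /** Step 2: publish the metadata; returns the chunk keys it replaced. */
    List<String> publish(String key, List<String> chunkKeys) {
        List<String> previous = metadata.put(key, chunkKeys);
        return previous == null ? Collections.<String>emptyList() : previous;
    }

    /** Readers only follow metadata; returns null if nothing is published. */
    byte[] read(String key) {
        List<String> chunkKeys = metadata.get(key);
        if (chunkKeys == null) {
            return null;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String chunkKey : chunkKeys) {
            byte[] part = chunks.get(chunkKey);
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }
}
```

With two uncoordinated writers on the same key, the last publish wins 
in this model, and the chunks of the other writer stay behind as orphans 
unless someone cleans up the keys returned by publish.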

I hope this makes sense. Otherwise, don't hesitate to point out where I 
went wrong and I will happily retreat to the drawing board.

Cheers,
Olaf