I've started working on ISPN-78 - Large Object Support - closely
following Manik's design document
http://community.jboss.org/wiki/LargeObjectSupport. As a starting point
I'm currently trying to implement
OutputStream writeToKey(K key),
which, for the time being, I chose to declare on AdvancedCache rather
than on Cache proper.
While thinking about the implications, I stumbled upon a few questions
which may well be owing to my lack of knowledge about Infinispan's inner
workings.
1. OutputStream writeToKey(K key) implies that the interaction between
user code and Infinispan happens in the OutputStream returned. Contrary
to existing methods, there would be no well defined *single* point where
control passes from user code to Infinispan. Instead, a user would write
a few bytes, passing control to Infinispan. Infinispan would buffer
those bytes and return control to the user until a preconfigured chunk
size is reached, whereupon Infinispan would probably issue a more or
less standard request to store that chunk on some node. Rinse and repeat.
This is certainly doable but leaves me wondering where that proposed
ChunkingInterceptor might come into play. It is my current understanding
that interceptors, well, intercept commands, and in this scenario there
could be no PutLargeObjectCommand since, as I said, there is no
single point where control passes from user code to Infinispan. Instead,
chunking would have to be done directly within the CacheOutputStream
returned, and the only commands involved would be the more or less
standard PutKeyValueCommands mentioned above. In this scenario there
wouldn't be any PutLargeObjectCommand to encapsulate the whole process.
Does this make sense? If so, I'd prefer to have
void writeToKey(K key, InputStream largeObject)
instead. That way, once this method is called, control would stay with
Infinispan until the large object is stored, and we could indeed
have some PutLargeObjectCommand to encapsulate the whole process.
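To make the OutputStream variant concrete, here is a rough sketch of what I have in mind. It is not Infinispan code: the class name ChunkingOutputStream is made up, plain Maps stand in for the caches, and keys are Strings for simplicity. It only shows that the chunking ends up inside the stream itself, with one ordinary put per chunk and the metadata written last.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch: plain Maps stand in for the caches; not an Infinispan API. */
class ChunkingOutputStream extends OutputStream {
    private final Map<String, byte[]> chunkStore;          // stand-in for the data cache
    private final Map<String, List<String>> metadataStore; // stand-in for Key -> LargeObjectMetadata
    private final String key;
    private final byte[] buffer;                           // holds at most one chunk
    private final List<String> chunkKeys = new ArrayList<>();
    private int filled = 0;
    private boolean closed = false;

    ChunkingOutputStream(String key, int chunkSize,
                         Map<String, byte[]> chunkStore,
                         Map<String, List<String>> metadataStore) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.key = key;
        this.buffer = new byte[chunkSize];
        this.chunkStore = chunkStore;
        this.metadataStore = metadataStore;
    }

    @Override
    public void write(int b) throws IOException {
        if (closed) {
            throw new IOException("stream closed");
        }
        buffer[filled++] = (byte) b;
        if (filled == buffer.length) {
            storeChunk(); // chunk size reached: one ordinary put per chunk
        }
    }

    // Stores the buffered bytes under a fresh UUID key and resets the buffer.
    private void storeChunk() {
        String chunkKey = UUID.randomUUID().toString();
        chunkStore.put(chunkKey, Arrays.copyOf(buffer, filled));
        chunkKeys.add(chunkKey);
        filled = 0;
    }

    @Override
    public void close() {
        if (closed) {
            return;
        }
        if (filled > 0) {
            storeChunk(); // flush the final, possibly partial chunk
        }
        // The metadata is written last, so the object only becomes visible once complete.
        metadataStore.put(key, new ArrayList<>(chunkKeys));
        closed = true;
    }
}
```

Note that nothing in this sketch looks like a single command an interceptor could intercept; every call to write() may or may not trigger a put.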
2. For the mapping Key -> LargeObjectMetadata I would intuitively choose
to use a dedicated "system" cache. Is this the Infinispan way of doing
things? If so, where can I find some code to use as a template? If not,
what would be an alternative approach that is in keeping with
Infinispan's architecture?
3. The design suggests using a fresh UUID as the key for each new
chunk. While this in all likelihood gives us a unique new key for each
chunk, I currently fail to see how it guarantees that this key maps to
a node that is different from all the nodes already used to store chunks
of the same Large Object. But then again, I know next to nothing about
Infinispan's consistent hashing algorithm.
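To illustrate the concern, here is a toy sketch. The placement function below is a simple hash-modulo that I made up for illustration; it is NOT Infinispan's consistent hash, and the class and method names are hypothetical. It merely counts how many distinct "nodes" a series of fresh UUID keys lands on; nothing in the key generation itself excludes nodes that already hold a chunk.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical toy model of key placement; not Infinispan's hashing. */
class UuidPlacementSketch {

    // Toy placement: hash code modulo node count (an assumption for illustration only).
    static int nodeFor(String key, int numNodes) {
        return Math.floorMod(key.hashCode(), numNodes);
    }

    // Returns how many distinct nodes are hit by `chunks` fresh UUID keys.
    static int distinctNodesUsed(int chunks, int numNodes) {
        Set<Integer> used = new HashSet<>();
        for (int i = 0; i < chunks; i++) {
            used.add(nodeFor(UUID.randomUUID().toString(), numNodes));
        }
        return used.size();
    }

    public static void main(String[] args) {
        int nodes = 10;
        int chunks = 5;
        // The result varies from run to run; it can be smaller than `chunks`
        // whenever two keys happen to land on the same node.
        System.out.println(distinctNodesUsed(chunks, nodes)
                + " distinct nodes used for " + chunks + " chunks");
    }
}
```

Whether the real hashing algorithm behaves differently in this respect is exactly what I would like to understand.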
4. Finally, the problem regarding eager locking and transactions
mentioned in Manik's comment seems rather ... hairy. If we indeed forgo
transactions, readers of a key that is just being written shouldn't be
affected, provided we write the LargeObjectMetadata object only after all
chunks have been written. But what about writers?
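A small sketch of the writer problem as I understand it, again with plain Maps standing in for the caches and with hypothetical names throughout. Two writers store their chunks independently and then publish metadata for the same key; without any locking, the later publish simply wins and the earlier writer's chunks are left behind unreferenced.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of two unsynchronized writers; not Infinispan code. */
class ConcurrentWritersSketch {
    final Map<String, byte[]> chunks = new ConcurrentHashMap<>();
    final Map<String, List<String>> metadata = new ConcurrentHashMap<>();

    // Stores the chunks and returns their keys WITHOUT publishing any metadata.
    List<String> storeChunks(byte[][] parts) {
        List<String> keys = new ArrayList<>();
        for (byte[] part : parts) {
            String chunkKey = UUID.randomUUID().toString();
            chunks.put(chunkKey, part);
            keys.add(chunkKey);
        }
        return keys;
    }

    // Publishing the metadata is the single step that makes the object visible to readers.
    void publish(String key, List<String> chunkKeys) {
        metadata.put(key, chunkKeys);
    }

    // Returns the chunk keys that no metadata entry refers to.
    Set<String> orphanedChunkKeys() {
        Set<String> orphans = new HashSet<>(chunks.keySet());
        for (List<String> referenced : metadata.values()) {
            orphans.removeAll(referenced);
        }
        return orphans;
    }
}
```

If writer A calls storeChunks() and publish(), and writer B does the same for the same key afterwards, orphanedChunkKeys() returns A's chunk keys. Readers are fine in this model, but something would have to clean up after the losing writer or prevent the race in the first place.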
I hope this makes sense. Otherwise, don't hesitate to point out where I
went wrong and I will happily retreat to the drawing board.
Cheers,
Olaf