[infinispan-dev] [ISPN-78] Alternative interface for writing large objects

Mon Apr 4 04:09:39 EDT 2011

Hi Olaf,

See below for comments:

On Mar 29, 2011, at 12:49 PM, Olaf Bergner wrote:

> I've started working on ISPN-78 - Large Object Support - closely 
> following Manik's design document 
> http://community.jboss.org/wiki/LargeObjectSupport. As a starting point 
> I'm currently trying to implement
> 
> OutputStream writeToKey(K key),
> 
> which, for the time being, I chose to declare on AdvancedCache rather 
> than on Cache proper.
> 
> While thinking about the implications, I stumbled upon a few questions 
> which may well be owing to my lack of knowledge about Infinispan's inner 
> workings.
> 
> 1. OutputStream writeToKey(K key) implies that the interaction between 
> user code and Infinispan happens in the OutputStream returned. Contrary 
> to existing methods, there would be no well defined *single* point where 
> control passes from user code to Infinispan. Instead, a user would write 
> a few bytes, passing control to Infinispan. Infinispan would buffer 
> those bytes and return control to the user until a preconfigured chunk 
> size is reached, whereupon Infinispan would probably issue a more or 
> less standard request to store that chunk on some node. Rinse and repeat.
> 
> This is certainly doable but leaves me wondering where that proposed 
> ChunkingInterceptor might come into play. It is my current understanding 
> that interceptors, well, intercept commands, and in this scenario there 
> could not be such a PutLargeObjectCommand as, as I said, there is no 
> single point where control passes from user code to Infinispan. Instead, 
> chunking would have to be done directly within the CacheOutputStream 
> returned, and the only commands involved would be the more or less 
> standard PutKeyValueCommands mentioned above. In this scenario there 
> wouldn't be any PutLargeObjectCommand to encapsulate the whole process.

Hmmmm, the initial step in writeToKey() is to create a map entry for the metadata, so the internal writeToKey() could indeed create a PutLargeObjectMetadataCommand and pass that down the interceptor stack. Or, more simply, you could have a ChunkingInterceptor that implements visitPutKeyValue...() and keeps an eye out for a transaction call that puts a LargeObjectMetadata; at that point, the interceptor could return a new specialised output stream, etc.

The first suggestion would be more useful if you expect other normal cache commands, such as get, to deal with large-object-related cache calls in a different way, but I don't think that's the case here, since all the interaction would be via the output/input stream.
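
To make the stream-based flow a bit more concrete, here is a rough, self-contained sketch of an output stream that buffers bytes until a chunk is full, stores each chunk under a fresh UUID key, and writes the metadata entry last. Everything in it is an assumption made for illustration: the class name ChunkingOutputStream is hypothetical, a plain java.util.Map stands in for the cache, the "metadata" is just the ordered list of chunk keys, and writing the metadata on close() is only one possible ordering. It is not Infinispan API.

```java
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch, not Infinispan API: a plain Map stands in for the cache.
class ChunkingOutputStream extends OutputStream {
    private final Map<Object, Object> store;      // stand-in for the cache
    private final Object largeObjectKey;          // key the user writes to
    private final byte[] buffer;                  // holds at most one chunk
    private final List<String> chunkKeys = new ArrayList<>();
    private int count;                            // bytes currently buffered
    private boolean closed;

    ChunkingOutputStream(Map<Object, Object> store, Object largeObjectKey, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.store = store;
        this.largeObjectKey = largeObjectKey;
        this.buffer = new byte[chunkSize];
    }

    @Override
    public void write(int b) {
        if (closed) {
            throw new IllegalStateException("stream already closed");
        }
        buffer[count++] = (byte) b;
        if (count == buffer.length) {
            storeChunk();                         // chunk is full: hand it to the store
        }
    }

    @Override
    public void write(byte[] data, int off, int len) {
        for (int i = off; i < off + len; i++) {
            write(data[i]);
        }
    }

    // Stores the buffered bytes as one chunk under a fresh UUID key.
    private void storeChunk() {
        if (count == 0) {
            return;
        }
        String chunkKey = UUID.randomUUID().toString();
        store.put(chunkKey, Arrays.copyOf(buffer, count));
        chunkKeys.add(chunkKey);
        count = 0;
    }

    // Flushes the last partial chunk, then publishes the metadata entry.
    @Override
    public void close() {
        if (closed) {
            return;
        }
        storeChunk();
        store.put(largeObjectKey, new ArrayList<>(chunkKeys));
        closed = true;
    }
}
```

In a real cache, each store.put(...) above would be an ordinary put travelling down the interceptor stack, which is where an interceptor watching for the metadata entry could hook in.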

> 
> Does this make sense? If so, I'd prefer to have
> 
> void writeToKey(K key, InputStream largeObject)
> 
> instead. Thus, after calling this method control would be handed over to 
> Infinispan until that LargeObject is stored, and we could indeed 
> have some PutLargeObject command to encapsulate the whole process.
> 
> 2. For the mapping Key -> LargeObjectMetadata I would intuitively choose 
> to use a dedicated "system" cache. Is this the Infinispan way of doing 
> things? If so, where can I find some code to use as a template? If not, 
> what would be an alternative approach that is in keeping with 
> Infinispan's architecture?

Yeah, this information would be stored in an internal cache. There are several examples of such caches, such as the topology cache for Hot Rod servers. When the server is started, it creates a configuration for this type of cache (i.e. REPL_SYNC....) and then gives the cache a particular name, etc.

> 
> 3. The design suggests using a fresh UUID as the key for each new 
> chunk. While this in all likelihood gives us a unique new key for each 
> chunk I currently fail to see how that guarantees that this key maps to 
> a node that is different from all the nodes already used to store chunks 
> of the same Large Object. But then again I know next to nothing about 
> Infinispan's consistent hashing algorithm.

I think there's a service that will generate a key mapped to a particular node, so that might be a better option here to avoid all chunks going to the same node. I think Mircea might be able to help further with this.
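
As a rough illustration of the idea only (this is not the actual service, whose API is not shown in this thread), the sketch below keeps generating random UUID keys until one maps to the wanted node, and uses that to spread chunk keys round-robin over the nodes. The names AffinityKeySketch, ownerOf, keyForNode and keysForChunks are hypothetical, and ownerOf is a toy modulo hash standing in for the real consistent hash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical sketch: nodes are plain indexes 0..nodeCount-1.
class AffinityKeySketch {

    // Toy stand-in for the consistent hash: maps a key to a node index.
    static int ownerOf(String key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // Generates random keys until one is owned by targetNode.
    static String keyForNode(int targetNode, int nodeCount) {
        if (targetNode < 0 || targetNode >= nodeCount) {
            throw new IllegalArgumentException("targetNode out of range");
        }
        while (true) {
            String candidate = UUID.randomUUID().toString();
            if (ownerOf(candidate, nodeCount) == targetNode) {
                return candidate;
            }
        }
    }

    // Picks one key per chunk so that chunk i lands on node i % nodeCount.
    static List<String> keysForChunks(int chunkCount, int nodeCount) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < chunkCount; i++) {
            keys.add(keyForNode(i % nodeCount, nodeCount));
        }
        return keys;
    }
}
```

A plain random UUID per chunk gives no such placement guarantee, which is the concern raised in point 3.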

> 
> 4. Finally, the problem regarding eager locking and transactions 
> mentioned in Manik's comment seems rather ... hairy. If we indeed forgo 
> transactions, readers of a key just being written shouldn't be affected 
> provided we write the LargeObjectMetadata object only after all chunks 
> have been written. But what about writers?

Hmmmmm, I don't understand your question.

> 
> I hope this makes sense. Otherwise, don't hesitate to point out where I 
> went wrong and I will happily retreat to the drawing board.
> 
> Cheers,
> Olaf
> 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache