[infinispan-dev] [ISPN-78] Alternative interface for writing large objects

Olaf Bergner olaf.bergner at gmx.de
Thu Mar 31 01:46:03 EDT 2011


On 30.03.11 02:32, Elias Ross wrote:
> I think it'd be BEST if you could support both models. I would add:
>
> interface Cache {
>    /**
>     * Returns a new or existing LargeObject object for the following key.
>     * @throws ClassCastException if the key exists and is not a LargeObject.
>     */
>    LargeObject largeObject(K key);
> }
OK, I'll keep that on my todo list, yet for the time being I've opted to 
start by implementing void writeToKey(K key, InputStream largeObject).
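To make the call shape concrete, here is a minimal sketch of that streaming write method. Only the name writeToKey comes from this thread; the interface name, the generic parameter, and the toy in-memory implementation are purely illustrative, not Infinispan API:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed streaming write API. Only the
// writeToKey signature is from the thread; everything else is illustrative.
interface LargeObjectSupport<K> {
    void writeToKey(K key, InputStream largeObject) throws IOException;
}

// Toy in-memory implementation, just to show the call shape. A real
// implementation would chunk the stream across cache nodes instead of
// buffering the whole object in one byte array.
class InMemoryLargeObjectSupport<K> implements LargeObjectSupport<K> {
    final Map<K, byte[]> store = new HashMap<>();

    @Override
    public void writeToKey(K key, InputStream largeObject) throws IOException {
        store.put(key, largeObject.readAllBytes());
    }
}
```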
>> This is certainly doable but leaves me wondering where that proposed
>> ChunkingInterceptor might come into play.
> I would think ideally you don't need to create any new commands. Fewer
> protocol messages are better.
It is my understanding that PutKeyValueCommand will *always* attempt to 
read the current value stored under the given key first. I'm not sure we 
want that in our situation, where the current value may be several GB in 
size. Anyway, it should be easy to refactor later if reusing 
PutKeyValueCommand proves viable.
>> 3. The design suggests to use a fresh UUID as the key for each new
>> chunk. While this in all likelihood gives us a unique new key for each
>> chunk I currently fail to see how that guarantees that this key maps to
>> a node that is different from all the nodes already used to store chunks
>> of the same Large Object. But then again I know next to nothing about
>> Infinispan's consistent hashing algorithm.
> I wouldn't use UUID. I'd just store (K, #) where # is the chunk.
>
Since this is important and might reveal a fundamental misunderstanding 
on my part, I need to sort this out before moving on. These are my 
assumptions, please point out any errors:

1. We want to partition a large object into chunks since, by definition, 
a large object is too big to be stored in a single node in the cluster. 
It follows that it is paramount that no two chunks be stored in the same 
node, correct?

2. Consistent hashing guarantees that any given key maps to *some* node in 
the cluster. There is, however, no way for a key's creator to know exactly 
which node its key maps to. In other words, there is no inverse to the 
hash function, correct?

3. The current design mandates that each chunk be stored via the existing 
put(key, value), correct?

It follows that we have no way whatsoever of generating a set of keys 
that guarantees that no two keys are mapped to the same node. In the 
pathological case, *all* keys map to the same node, correct?
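For reference, Elias's (K, #) suggestion might look like the sketch below: a deterministic composite key, so chunk n of a large object can always be re-derived from the object's key, unlike a fresh UUID per chunk. The class name and fields are my own illustration. Note that, consistent with assumption 2, this controls nothing about placement; all chunk keys could still hash to the same node:

```java
import java.util.Objects;

// Illustrative composite chunk key in the spirit of "(K, #) where # is the
// chunk": deterministic and re-derivable, unlike a fresh UUID per chunk.
// It does NOT influence which node the key hashes to.
final class ChunkKey<K> {
    final K largeObjectKey;
    final int chunkIndex;

    ChunkKey(K largeObjectKey, int chunkIndex) {
        this.largeObjectKey = largeObjectKey;
        this.chunkIndex = chunkIndex;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ChunkKey)) return false;
        ChunkKey<?> other = (ChunkKey<?>) o;
        return chunkIndex == other.chunkIndex
              && Objects.equals(largeObjectKey, other.largeObjectKey);
    }

    @Override
    public int hashCode() {
        return Objects.hash(largeObjectKey, chunkIndex);
    }
}
```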
>> I would think a use case for this API would be streaming audio or
>> video, maybe something like access logs even?
>>
>> In which case, you would want to read while you're writing. So,
>> locking shouldn't be imposed. I would say, rely on the transaction
>> manager to keep a consistent view. If transactions aren't being used,
>> then the user might see some unexpected behavior. The API could
>> compensate for that.
>>
If I understand you correctly you propose two alternatives:

1. Use transactions, thus delegating all consistency requirements to the 
transaction manager.

2. Don't use transactions and change the API so that readers may be told 
that a large object they are interested in is currently being written.

Further, to support streaming use cases you propose that it should be 
possible to read a large object while it is being written.

Is that correct?
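As an aside, the read-while-write pattern itself is easy to demonstrate with plain JDK piped streams (no Infinispan involved; this is only an analogy for the decoupling a cache-backed streaming API would need between one writer and concurrent readers):

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// JDK-only illustration of the streaming use case under discussion:
// a reader consumes bytes while the writer is still producing them.
class StreamWhileWriting {
    static byte[] writeAndReadConcurrently(byte[] data) throws Exception {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        Thread writer = new Thread(() -> {
            try (out) {
                out.write(data); // the reader may consume these bytes immediately
            } catch (IOException ignored) {
            }
        });
        writer.start();

        byte[] result = in.readAllBytes(); // reads while the writer runs
        writer.join();
        return result;
    }
}
```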

Hmm, I need to think about this. If I understand Manik's comment and the 
tx subsystem correctly, each transaction holds its *entire* associated 
state in memory. Thus, if we write all chunks of a given large object 
within the scope of a single transaction, we will blow up the originator 
node's heap. Correct?
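The heap concern can be made concrete: if one transaction spans all chunks, the originator holds every chunk until commit; committing per chunk bounds memory to roughly one chunk. The sketch below only shows the fixed-size chunking loop; the chunk size and the per-chunk commit are my assumptions, and the list collecting chunks stands in for the "put + commit, then discard" step a real implementation would do instead:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Illustrative chunking loop. A real implementation would put each chunk
// into the cache and commit (then discard it) instead of accumulating
// chunks in a list -- accumulation is only here so the sketch is testable.
class ChunkedWriter {
    static List<byte[]> splitIntoChunks(InputStream in, int chunkSize) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        byte[] buf = new byte[chunkSize];
        int read;
        while ((read = in.read(buf)) != -1) {
            byte[] chunk = new byte[read];
            System.arraycopy(buf, 0, chunk, 0, read);
            chunks.add(chunk); // real impl: put + commit, then discard
        }
        return chunks;
    }
}
```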

So many questions ...

Cheers,
Olaf
