[
https://issues.jboss.org/browse/ISPN-78?page=com.atlassian.jira.plugin.sy...
]
Trustin Lee commented on ISPN-78:
---------------------------------
1) Is it a good idea to put the chunks into {{Cache}}? What about storing them in a
separate store specialized for large objects? For example, a user might want to store
ordinary entries on the heap while keeping large objects on disk. Maybe a user could
specify two store implementations when creating a {{Cache}} - one for non-large objects
and the other for large objects.
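To make the two-store idea concrete, here is a minimal sketch of a cache that routes entries above a size threshold to a separate large-object store. This is purely illustrative: the class name, the threshold, and the use of in-memory maps (the large store would really be disk-backed) are all my assumptions, not part of any Infinispan API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: small entries go to the ordinary (heap) store,
// entries at or above a size threshold go to a dedicated large-object store.
public class TieredStore {
    private static final int LARGE_THRESHOLD = 1024; // bytes; illustrative only

    private final Map<String, byte[]> heapStore = new ConcurrentHashMap<>();
    // In a real implementation this store would be disk-backed.
    private final Map<String, byte[]> largeObjectStore = new ConcurrentHashMap<>();

    public void put(String key, byte[] value) {
        if (value.length >= LARGE_THRESHOLD) {
            largeObjectStore.put(key, value);
        } else {
            heapStore.put(key, value);
        }
    }

    public byte[] get(String key) {
        byte[] v = heapStore.get(key);
        return v != null ? v : largeObjectStore.get(key);
    }

    // Exposed only so the routing decision can be observed.
    public boolean isInLargeStore(String key) {
        return largeObjectStore.containsKey(key);
    }
}
```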
2) What about an interface like this to put a large object?
{code}
public interface LargeObjectWriter {
    void write(byte[] data, int offset, int length) throws IOException;
    void transferFrom(InputStream in) throws IOException;
    void cancel();
    void finish();
}

public interface Cache {
    ...
    LargeObjectWriter writeToKey(K key);
}

Cache c = ...;
LargeObjectWriter writer = c.writeToKey("largeObject");
boolean success = false;
try {
    writer.transferFrom(new FileInputStream("largeObject.iso"));
    success = true;
} finally {
    if (success) {
        writer.finish(); // Upload is final
    } else {
        // Discard the stream - writeToKey has no effect on the cache, i.e.
        // any chunks and entries written so far are removed from the cache.
        writer.cancel();
    }
}
{code}
3) Because a user can upload the large object later as long as the current transaction is
valid, the method name '{{writeToKey}}' doesn't sound right - the method
itself does not write anything. So, I think it might be a good idea to give a better name
to {{writeToKey()}}. Maybe {{Cache.newLargeObject(K key)}}?
4) A user might want to access the list of chunks and access them individually (e.g. range
access). Therefore, {{Cache.get(K key)}} should not throw CacheException but return a
{{LargeObjectMetadata}}, and the {{LargeObjectMetadata}} should provide a way to retrieve
the list of chunks. For example:
{code}
interface LargeObjectMetadata {
    LargeObjectChunkRange[] getChunkKeys(int offset, int length);
    String[] getChunkKeys();
}

interface LargeObjectChunkRange {
    String getChunkKey();
    int getOffset(); // first chunk might have a non-zero offset
    int getLength(); // last chunk might have a smaller value than the actual chunk length
}
{code}
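To illustrate what {{getChunkKeys(offset, length)}} would have to compute, here is a hedged sketch of mapping a byte range onto fixed-size chunks. The "key#index" chunk-key scheme and the fixed chunk size are my assumptions for the sake of the example, not part of the proposal above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the range-to-chunk mapping implied by getChunkKeys(offset, length):
// given a byte range, find the chunks that cover it, plus the offset and
// length to read within each chunk.
public class ChunkRangeResolver {

    public static final class ChunkRange {
        public final String chunkKey;
        public final int offset; // first chunk of a range may start mid-chunk
        public final int length; // last chunk of a range may be read only partially

        ChunkRange(String chunkKey, int offset, int length) {
            this.chunkKey = chunkKey;
            this.offset = offset;
            this.length = length;
        }
    }

    // Returns the chunks covering [rangeStart, rangeStart + rangeLength).
    public static List<ChunkRange> resolve(String key, int chunkSize,
                                           long rangeStart, long rangeLength) {
        List<ChunkRange> result = new ArrayList<>();
        long pos = rangeStart;
        long end = rangeStart + rangeLength;
        while (pos < end) {
            long chunkIndex = pos / chunkSize;
            int offsetInChunk = (int) (pos % chunkSize);
            int take = (int) Math.min(chunkSize - offsetInChunk, end - pos);
            result.add(new ChunkRange(key + "#" + chunkIndex, offsetInChunk, take));
            pos += take;
        }
        return result;
    }
}
```

For example, with a chunk size of 100, the range starting at byte 150 with length 120 resolves to part of chunk 1 and part of chunk 2.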
Large object support
--------------------
Key: ISPN-78
URL: https://issues.jboss.org/browse/ISPN-78
Project: Infinispan
Issue Type: Feature Request
Components: Core API
Reporter: Manik Surtani
Assignee: Olaf Bergner
Fix For: 5.1.0.BETA1, 5.1.0.Final
If each VM is allocated a 2GB heap and you have 100 nodes in a grid with 1 redundant
copy for each key, you have a theoretical addressable heap of 100GB. But you are limited
by (half) the heap of a single VM per entry, since entries are stored whole.
E.g., cache.put(k, my2GBObject) will fail since you need at least 2GB for the object +
another 2GB for its serialized form.
This gets worse when you try cache.put(k, my10GBObject). This *should* be possible if we
have a theoretical 100GB heap.
Potential solutions here are to fragment large objects and store each fragment under a
separate key. Another approach would be to stream objects directly to disk. This needs
thought and design, possibly a separate API to prevent 'pollution' of the simpler API.
(JumboCache?)
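The fragmenting approach can be sketched in a few lines: stream the large value into fixed-size byte[] chunks stored under derived keys, so the whole object never has to fit in one heap at once. This is a sketch under stated assumptions, not a proposed implementation: a Map stands in for the cache, and the "key#n" naming scheme is mine.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

// Sketch of fragmenting a large object: read the stream in chunkSize pieces
// and store each piece under a derived key ("<key>#<n>").
public class ChunkingWriter {

    // Returns the number of chunks written; a real implementation would
    // record this count in a metadata entry under the original key.
    public static int writeChunks(Map<String, byte[]> cache, String key,
                                  InputStream in, int chunkSize) throws IOException {
        byte[] buf = new byte[chunkSize];
        int index = 0;
        int read;
        while ((read = in.read(buf)) != -1) {
            // Copy only the bytes actually read (the last chunk may be short).
            byte[] chunk = new byte[read];
            System.arraycopy(buf, 0, chunk, 0, read);
            cache.put(key + "#" + index++, chunk);
        }
        return index;
    }
}
```

Note that only one chunkSize buffer is resident at a time, which is exactly what makes put(k, my10GBObject) feasible on a 2GB heap.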
Re: fragmenting, issues to overcome:
How many chunks to fragment into? The max size of each key could be configured, but how
do we determine the size of an Object? VM instrumentation? Or perhaps the JumboCache only
stores byte[]s?
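If the JumboCache only stores byte[]s, the object-sizing question is sidestepped and the chunk count reduces to ceiling division over the byte length. A trivial sketch (the method and parameter names are mine; maxChunkSize would be the configured limit mentioned above):

```java
// Chunk count for a byte[]-only JumboCache: plain ceiling division,
// no VM instrumentation needed to size the object.
public class ChunkMath {
    public static int chunkCount(long totalBytes, int maxChunkSize) {
        if (totalBytes == 0) {
            return 0;
        }
        return (int) ((totalBytes + maxChunkSize - 1) / maxChunkSize);
    }
}
```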
--
This message is automatically generated by JIRA.
For more information on JIRA, see:
http://www.atlassian.com/software/jira