[infinispan-issues] [JBoss JIRA] Issue Comment Edited: (ISPN-78) Large object support

Tuesday, 29 March 2011

    [ https://issues.jboss.org/browse/ISPN-78?page=com.atlassian.jira.plugin.sy... ]

Olaf Bergner edited comment on ISPN-78 at 3/29/11 8:00 PM:
-----------------------------------------------------------

Wouldn't a non-streaming API somehow defeat this feature's sole purpose? It is my
understanding that it is driven by the desire to store objects whose size exceeds the
heap of any single JVM in the cluster, correct?

If that is indeed the case, it would be *impossible* to store such an object without first
fragmenting it into pieces, each of which fits into a single node's heap. So we need a
streaming API to read a large object from some place outside the current JVM's address
space (think of a file) so as not to blow up that JVM's heap. There is no question
that this approach considerably complicates matters, but I don't see a way around it.

Ah, I see. Stupid me. Streaming to disk means writing a large object straight through to
the cache store, without actually caching it in memory. That might make sense in some
cases, though one could argue about whether Infinispan would really act as a "cache" in
that case. We would still need a streaming API, though.

Anyway, IMHO we should first concentrate on implementing the "fragmentation
approach" and then check back whether streaming to disk might make sense. One step at
a time.
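
To make the "fragmentation approach" a bit more concrete, here is a rough sketch of what a streaming write could look like. Everything in it is an assumption for illustration only: the class name FragmentingWriter, the "<key>#<index>" key scheme and the use of a plain java.util.Map in place of a cache are all hypothetical, and none of it is existing Infinispan API. The point is merely that only one chunk-sized buffer is held on the heap while the source stream is consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not Infinispan API. A plain Map stands in for the cache. */
final class FragmentingWriter {

    /** Hypothetical key scheme: "<key>#<index>". */
    static String chunkKey(String key, int index) {
        return key + "#" + index;
    }

    /**
     * Reads the stream in pieces of at most chunkSize bytes and stores each
     * piece under its own key. Returns the number of chunks written.
     */
    static int putLargeObject(Map<String, byte[]> cache, String key,
                              InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize];
        int chunks = 0;
        while (true) {
            // Fill the buffer as far as possible; read() may return fewer bytes than requested.
            int filled = 0;
            while (filled < chunkSize) {
                int n = in.read(buffer, filled, chunkSize - filled);
                if (n < 0) {
                    break; // end of stream
                }
                filled += n;
            }
            if (filled == 0) {
                break; // nothing left to store
            }
            // Copy, so that the stored chunk does not alias the reusable buffer.
            cache.put(chunkKey(key, chunks), Arrays.copyOf(buffer, filled));
            chunks++;
            if (filled < chunkSize) {
                break; // a short chunk means the stream is exhausted
            }
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> cache = new HashMap<>();
        InputStream source = new ByteArrayInputStream(new byte[10]);
        int chunks = putLargeObject(cache, "k", source, 4);
        System.out.println("chunks written: " + chunks);
    }
}
```

In a real grid each chunk would presumably end up on whichever node owns its key, and some metadata entry would have to record the number of chunks so that a reader can find them again. Both are left out here.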

 Large object support
 --------------------

                 Key: ISPN-78
                 URL: https://issues.jboss.org/browse/ISPN-78
             Project: Infinispan
          Issue Type: Feature Request
          Components: Core API
            Reporter: Manik Surtani
            Assignee: Manik Surtani
             Fix For: 5.1.0.BETA1, 5.1.0.Final

 If each VM is allocated a 2GB heap and you have 100 nodes in a grid with 1 redundant
copy for each key, you have a theoretical addressable heap of 100GB.  But you are limited
by (half) the heap of a single VM per entry, since entries are stored whole.
 E.g., cache.put(k, my2GBObject) will fail since you need at least 2GB for the object +
another 2GB for its serialized form.
 This gets worse when you try cache.put(k, my10GBObject).  This *should* be possible if we
have a theoretical 100GB heap.
 Potential solutions here are to fragment large objects and store each fragment under
a separate key.  Another approach would be to stream objects directly to disk, etc.  Needs
thought and design, possibly a separate API to prevent "pollution" of the simpler
API.  (JumboCache?)
 Re: fragmenting, issues to overcome:
 How many chunks to fragment into?  Max size stored under each key could be configured,
but how do we determine the size of an Object?  VM instrumentation?  Or perhaps the
JumboCache only stores byte[]s?
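
 If values were restricted to byte[], the "how many chunks" question would reduce to a ceiling division, since the length is known up front. A minimal sketch under that assumption follows. ByteArrayChunker, its method names and the "<key>#<index>" key scheme are hypothetical and not part of any existing API, and a plain Map stands in for the cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not Infinispan API. A plain Map stands in for the cache. */
final class ByteArrayChunker {

    /** Number of chunks needed: ceil(totalBytes / maxChunkBytes). */
    static int chunkCount(long totalBytes, int maxChunkBytes) {
        if (totalBytes < 0 || maxChunkBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return Math.toIntExact((totalBytes + maxChunkBytes - 1) / maxChunkBytes);
    }

    /** Splits the value and stores each fragment under "<key>#<index>". */
    static int put(Map<String, byte[]> cache, String key, byte[] value, int maxChunkBytes) {
        int count = chunkCount(value.length, maxChunkBytes);
        for (int i = 0; i < count; i++) {
            long from = (long) i * maxChunkBytes;
            long to = Math.min((long) value.length, from + maxChunkBytes);
            cache.put(key + "#" + i, Arrays.copyOfRange(value, (int) from, (int) to));
        }
        return count;
    }

    /**
     * Reassembles the value from its fragments. Only sensible for values that
     * fit into the local heap; a truly large object would have to be read
     * back chunk by chunk instead.
     */
    static byte[] get(Map<String, byte[]> cache, String key, int count) {
        int total = 0;
        for (int i = 0; i < count; i++) {
            total += cache.get(key + "#" + i).length;
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            byte[] chunk = cache.get(key + "#" + i);
            System.arraycopy(chunk, 0, out, pos, chunk.length);
            pos += chunk.length;
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, byte[]> cache = new HashMap<>();
        byte[] value = new byte[10];
        int count = put(cache, "k", value, 4);
        System.out.println("chunks: " + count
                + ", round trip ok: " + Arrays.equals(value, get(cache, "k", count)));
    }
}
```

 For arbitrary Objects the size question stays open, which is presumably why the byte[]-only restriction is attractive.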
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
