Re: [infinispan-dev] Memory is the new Disk, Disk is the new Tape

Monday, 27 April 2009

Definitely something cool and interesting.  And an impl could be
fairly easy to do as well.  Perhaps this needs to be a separate
module; I wouldn't want to add this to infinispan-core, as it would
bloat the jar.  infinispan-fs?  :-)

The requisite FUSE plugins could also be written to allow native  
mounting of such an FS.  Great potential if backed with an Amazon S3  
cache store.

On 24 Apr 2009, at 15:45, Bela Ban wrote:

...
 Do we plan to provide some sort of IO interface to Infinispan? The
 idea is to let users write large files to the grid and read them as
 well.

 We could think of implementing all of the classes of java.io.* (NIO  
 as well ?) and layering them on top of Infinispan.

 I'm thinking an implementation e.g. of a file stream could chunk the  
 stream up into blocks of 1000 bytes. Each block has an ID and the  
 IDs for all blocks of a given file are stored in the inode.

 The inode is referenced by the filename and contains the list of  
 block IDs. Each block could potentially be stored on a different  
 cluster node and - depending on repl-count - be stored multiple  
 times in the cluster.
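
A minimal sketch of the chunking idea described above, not an Infinispan API. The class name `GridFileWriter`, the `filename#index` block ID scheme, and the two plain `Map`s standing in for grid caches are all assumptions made for illustration; in a real grid the maps would be distributed caches and the replication count would be a cache setting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Maps stand in for the distributed caches.
class GridFileWriter {
    static final int BLOCK_SIZE = 1000; // block size suggested in the thread

    // filename -> ordered list of block IDs (the "inode")
    final Map<String, List<String>> inodes = new HashMap<>();
    // block ID -> block contents
    final Map<String, byte[]> blocks = new HashMap<>();

    // Split the data into BLOCK_SIZE chunks, store each chunk under its own
    // ID, then store the ordered ID list under the filename.
    void write(String filename, byte[] data) {
        List<String> blockIds = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += BLOCK_SIZE) {
            int end = Math.min(offset + BLOCK_SIZE, data.length);
            // Assumed ID scheme: filename plus block index.
            String blockId = filename + "#" + (offset / BLOCK_SIZE);
            blocks.put(blockId, Arrays.copyOfRange(data, offset, end));
            blockIds.add(blockId);
        }
        inodes.put(filename, blockIds);
    }
}
```

Writing the inode last means a reader never sees a block list that points at blocks not yet stored, at least in this single-threaded sketch.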

 Example: reading a file:
 - The impl of the stream locates the inode based on a consistent  
 hash over the filename
 - The file ptr is at 0
 - When reading 100 bytes, the first ID is fetched from the inode and  
 its block is read from a cluster node, again based on the consistent  
 hash(block-ID)
 - When all the data has been read from the block, we have to fetch  
 the next block and so on
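
The read steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `GridInputStream` is a hypothetical name, and the two `Map` lookups stand in for the consistent-hash lookups of the inode and of each block on the cluster.

```java
import java.io.InputStream;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Map.get() stands in for a consistent-hash lookup.
class GridInputStream extends InputStream {
    private final List<String> blockIds;      // taken from the inode
    private final Map<String, byte[]> blocks; // block ID -> contents
    private int nextBlock = 0;                // index of the next block to fetch
    private int posInBlock = 0;               // read position in the current block
    private byte[] current;                   // block currently being read

    GridInputStream(String filename,
                    Map<String, List<String>> inodes,
                    Map<String, byte[]> blocks) {
        List<String> ids = inodes.get(filename); // locate the inode by filename
        if (ids == null) {
            throw new IllegalArgumentException("no inode for " + filename);
        }
        this.blockIds = ids;
        this.blocks = blocks;
    }

    @Override
    public int read() {
        // Fetch the next block once the current one is exhausted.
        while (current == null || posInBlock == current.length) {
            if (nextBlock == blockIds.size()) {
                return -1; // end of file
            }
            String id = blockIds.get(nextBlock++);
            current = blocks.get(id);
            if (current == null) {
                throw new IllegalStateException("missing block " + id);
            }
            posInBlock = 0;
        }
        return current[posInBlock++] & 0xFF;
    }
}
```

Only one block is held by the stream at a time, which is what would allow reading files larger than the memory of any single box.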

 For random access files, we'd compute the block index as
 fileptr / block-size and the offset within that block as
 fileptr % block-size; this would be simple to implement.
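
One way to map a file pointer to a block for random access, as a sketch: integer division selects the block, and the remainder gives the position inside it. `BlockAddress` is a hypothetical helper, not part of any existing API.

```java
// Hypothetical helper: translates a file pointer into (block index, offset).
class BlockAddress {
    final long blockIndex;   // which block of the file holds the byte
    final int offsetInBlock; // position of the byte inside that block

    private BlockAddress(long blockIndex, int offsetInBlock) {
        this.blockIndex = blockIndex;
        this.offsetInBlock = offsetInBlock;
    }

    static BlockAddress of(long filePointer, int blockSize) {
        if (filePointer < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("invalid pointer or block size");
        }
        return new BlockAddress(filePointer / blockSize,
                                (int) (filePointer % blockSize));
    }
}
```

With 1000-byte blocks, a file pointer of 2500 lands in the third block (index 2) at offset 500; the block ID itself would then be looked up at that index in the inode's list.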

 If we did this, we could truly have an in-memory cluster filesystem  
 and handle files larger than the physical memory on any given box !  
 As a second line of defense, e.g. when the entire cluster crashes,  
 we could still stream the modified blocks back to disk, but this  
 could be done asynchronously.

 Note that the Gridblocks project once did this, but they stored all  
 blocks on disk, and their project was not about in-memory file  
 systems.

 Google File System does something similar, but their data is also  
 stored on disk.

 I don't know what Google's BigTable does, have to read up on it.

 WDYT ?

 -- 
 Bela Ban
 Lead JGroups / Clustering Team
 JBoss - a division of Red Hat

 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev 
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org
