On 7/12/11 4:30 PM, Yuri de Wit wrote:
Hi Bela,
> Interesting... I guess that loader would have to know the mapping of
> files to chunks, e.g. if a file is 10K, and the chunk size 2k, then a
> get("/home/bela/dump.txt.#3") would mean 'read the 3rd chunk from
> /home/bela/dump.txt' from the file system and return it, unless it's in
> the local cache.
Correct. This is exactly how I implemented the store after looking
into the existing FileCacheStore. It is basically a base
FileCacheStore extending LockSupportCacheStore, plus two subclasses:
FileMetadataCacheStore and FileDataCacheStore. The first returns
Metadata entries and the second returns byte[] chunks.
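
For illustration, a rough sketch of how such a chunk key could be resolved
against the file system on a cache miss; the class and method names below
are made up, not the actual store code:

import java.io.IOException;
import java.io.RandomAccessFile;

// Illustrative only: resolves a key like "/home/bela/dump.txt.#3" to the
// corresponding byte range of the underlying file.
public class ChunkResolver {
    private final int chunkSize;

    public ChunkResolver(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    public byte[] loadChunk(String key) throws IOException {
        int sep = key.lastIndexOf(".#");
        String path = key.substring(0, sep);                        // /home/bela/dump.txt
        int chunkIndex = Integer.parseInt(key.substring(sep + 2));  // 3

        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long offset = (long) chunkIndex * chunkSize;
            long fileLen = raf.length();
            if (offset >= fileLen)
                return null;                                        // chunk beyond end of file
            int toRead = (int) Math.min(chunkSize, fileLen - offset);
            raf.seek(offset);
            byte[] buf = new byte[toRead];                          // last chunk may be shorter
            raf.readFully(buf);
            return buf;
        }
    }
}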
OK
> This requires that your loader knows the chunk size and the
> mapping/naming between files and chunks...
Right now I am setting the preferred chunk size in the
FileDataCacheStore properties in my config file, and when I instantiate
the GridFilesystem (traversing the configs in
getCacheLoaderManagerConfig) I pass in the same value as the default
chunk size there.
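
For illustration, keeping the two values in sync might look roughly like
this; the config file name, cache names and property key are placeholders,
and the GridFilesystem constructor taking a default chunk size is assumed:

import org.infinispan.Cache;
import org.infinispan.io.GridFile;
import org.infinispan.io.GridFilesystem;
import org.infinispan.manager.DefaultCacheManager;

public class GridFsBootstrap {
    public static void main(String[] args) throws Exception {
        // Must match the chunk size configured on the FileDataCacheStore
        int chunkSize = 2048;

        DefaultCacheManager cm = new DefaultCacheManager("grid-config.xml");
        Cache<String, byte[]> data = cm.getCache("data");
        Cache<String, GridFile.Metadata> metadata = cm.getCache("metadata");

        // Pass the same value as the filesystem's default chunk size
        GridFilesystem fs = new GridFilesystem(data, metadata, chunkSize);
    }
}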
OK, makes sense.
> Hmm. Perhaps the mapping can be more intuitive ? Maybe instead of the
> chunk number, the suffix should incorporate the index (in bytes), e.g.
> /home/bela/dump.txt.#6000 ?
Interesting. This could be a bit more reliable, but it wouldn't
eliminate the need to define the chunk size. In theory, the OutputStream
chunk size could be different from the input chunking, and the former
could be client driven and the latter loader driven. However, I am not
sure of the actual benefits, and maybe a single chunk size for the
cluster could be good enough.
Yes, I agree. If you wanted different chunk sizes for different files,
you could always store the chunk size in the metadata for a given file
though.
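
A minimal sketch of what the two naming schemes and a per-file chunk size
could look like; FileMetadata is a stand-in here, not the actual
GridFile.Metadata:

public class ChunkKeys {
    // Stand-in for per-file metadata carrying its own chunk size
    public static class FileMetadata {
        final long length;
        final int chunkSize;
        FileMetadata(long length, int chunkSize) {
            this.length = length;
            this.chunkSize = chunkSize;
        }
    }

    // Index-based suffix: "/home/bela/dump.txt.#3"
    static String indexKey(String path, int chunkIndex) {
        return path + ".#" + chunkIndex;
    }

    // Offset-based suffix: "/home/bela/dump.txt.#6000"
    static String offsetKey(String path, int chunkIndex, FileMetadata md) {
        return path + ".#" + ((long) chunkIndex * md.chunkSize);
    }
}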
Chunking writes is a bit more complex, since you don't want to write
chunk #5 and then have the client node stop writing to the OutputStream,
for instance (or multiple clients writing at the same time). For now I
have disabled write chunking (worst case, the slaves are read-only and
writes go only through the master), but I could envision a protocol
where the chunks are written to a temp file based on a unique client
stream id and triggered by an OutputStream.close(). A close would push a
'closing' chunk, with or without actual data, that would replace the
original file on disk. The locking scheme in LockSupportCacheStore would
make sure there is some FS protection, and the last client closing the
stream would win if multiple clients are writing to the same file at
once (or maybe an explicit lock using the Cache API?).
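
A rough sketch of that close-driven protocol (all names are hypothetical,
and the LockSupportCacheStore locking and multi-writer coordination are
left out):

import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

// Chunks are staged in a temp file keyed by a unique stream id; close()
// plays the role of the 'closing' chunk and replaces the original file.
public class StagedChunkOutputStream extends OutputStream {
    private final Path target;   // e.g. /home/bela/dump.txt
    private final Path staging;  // temp file named after the stream id
    private final RandomAccessFile out;

    public StagedChunkOutputStream(Path target) throws IOException {
        this.target = target;
        this.staging = target.resolveSibling(
                target.getFileName() + ".stream-" + UUID.randomUUID());
        this.out = new RandomAccessFile(staging.toFile(), "rw");
    }

    @Override public void write(int b) throws IOException { out.write(b); }
    @Override public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
    }

    @Override public void close() throws IOException {
        out.close();
        // Last writer to close wins if several clients wrote the same file
        Files.move(staging, target, StandardCopyOption.REPLACE_EXISTING,
                   StandardCopyOption.ATOMIC_MOVE);
    }
}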
OK
Another issue I found is with creating directories, though it is most
likely due to my rewrite. A new GridFile could become either a folder or
a file, so the Metadata must have no flags set, in order to (1) mimic
the behavior of a real File and (2) make sure the impl can properly
implement mkdir(), mkdirs() and exists().
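
A minimal sketch of the flag handling this implies; constants and structure
are illustrative, not the actual GridFile code:

public class GridEntry {
    static final int FILE = 1;      // entry is a plain file
    static final int DIR  = 1 << 1; // entry is a directory

    private int flags; // 0 until mkdir() or createNewFile() decides

    boolean exists()      { return flags != 0; }
    boolean isDirectory() { return (flags & DIR) != 0; }
    boolean isFile()      { return (flags & FILE) != 0; }

    boolean mkdir() {
        if (exists()) return false; // mimics java.io.File: fails if it exists
        flags = DIR;
        return true;
    }

    boolean createNewFile() {
        if (exists()) return false;
        flags = FILE;
        return true;
    }
}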
Yep, as I said the impl is incomplete...
I have cloned the Infinispan project on GitHub and would be happy to
commit the changes somewhere there so you could take a peek, if
interested.
Interested yes, but I have no time to look at this... :-( I'm busy
working on JGroups 3.0, which should be beta1 soon...
I hope though that your changes go into the Infinispan Git repo, and
maybe you should publish an article about this on InfoQ... ?
One last note is regarding configuration. It seems that the metadata
cache has to use full replication (or at least it would make the most
sense) and the data cache has to use distribution mode.
Yes, that's the idea. Metadata should be small(ish), so full replication
is warranted. This of course also depends on what we cram into metadata;
if it becomes too big, or we have many small files, then it might make
sense to switch to distribution. Anyway, at the end of the day, this is
a configuration issue and doesn't require code changes.
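
For illustration, the split could look roughly like this with Infinispan's
programmatic configuration (a newer API than the XML config of this thread;
cache names and numOwners are placeholders):

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class GridFsCaches {
    public static void main(String[] args) {
        DefaultCacheManager cm = new DefaultCacheManager(
                GlobalConfigurationBuilder.defaultClusteredBuilder().build());

        // Metadata is small(ish): replicate it to every node
        cm.defineConfiguration("metadata", new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.REPL_SYNC)
                .build());

        // Chunk data can be large: distribute it across the cluster
        cm.defineConfiguration("data", new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.DIST_SYNC)
                .hash().numOwners(2)
                .build());

        cm.stop();
    }
}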
--
Bela Ban
Lead JGroups (http://www.jgroups.org)
JBoss / Red Hat