[infinispan-issues] [JBoss JIRA] Issue Comment Edited: (ISPN-1103) Soft schema-based storage

Friday, 13 May 2011

    [ https://issues.jboss.org/browse/ISPN-1103?page=com.atlassian.jira.plugin.... ]

Randall Hauch edited comment on ISPN-1103 at 5/13/11 6:04 PM:
--------------------------------------------------------------

I completely agree that storing metadata as a JSON structure is a brilliant approach.
I've added several features to the list in the description.

Storing the jsonObject as a string might work, especially if it is simply loaded and
stored as an atomic unit. However, there are several advantages to storing it in a BSON
(or BSON-like) representation, and all boil down to the fact that any schema-aware service
will need to access the document contents for indexing, validation, computing differences
(for DeltaAware functionality), and even applying changes (e.g., something similar to
MongoDB's [atomic operations|http://www.mongodb.org/display/DOCS/Atomic+Operations]).
Using a BSON representation will also make it more natural to handle binary values within
the document.
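
To illustrate why structured access matters for the DeltaAware case, here is a minimal sketch of computing the differences between two versions of a document. It assumes documents are held as nested {{java.util.Map}} instances, as a map-backed BSON implementation would hold them; the {{DocumentDiff}} class and its dotted-path output are hypothetical and not part of Infinispan or the BSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper, not part of Infinispan or the BSON library. */
final class DocumentDiff {
    private DocumentDiff() {}

    /**
     * Returns the dotted paths whose values differ between two document trees,
     * mapped to the new value (null when the field was removed).
     */
    static Map<String, Object> diff(Map<String, Object> before, Map<String, Object> after) {
        Map<String, Object> changes = new LinkedHashMap<>();
        collect("", before, after, changes);
        return changes;
    }

    @SuppressWarnings("unchecked")
    private static void collect(String prefix, Map<String, Object> before,
                                Map<String, Object> after, Map<String, Object> changes) {
        // Fields that were changed or removed.
        for (Map.Entry<String, Object> e : before.entrySet()) {
            String path = prefix + e.getKey();
            Object oldValue = e.getValue();
            Object newValue = after.get(e.getKey());
            if (oldValue instanceof Map && newValue instanceof Map) {
                // Both sides are nested documents: recurse instead of replacing wholesale.
                collect(path + ".", (Map<String, Object>) oldValue,
                        (Map<String, Object>) newValue, changes);
            } else if (!Objects.equals(oldValue, newValue)) {
                changes.put(path, newValue);
            }
        }
        // Fields that were added.
        for (Map.Entry<String, Object> e : after.entrySet()) {
            if (!before.containsKey(e.getKey())) {
                changes.put(prefix + e.getKey(), e.getValue());
            }
        }
    }
}
```

With a string representation, the same comparison would require parsing both versions first. Note that this sketch compares leaf values with {{equals}}, so {{byte[]}} values would need {{Arrays.equals}} instead.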

Now, if we did store the document internally as BSON, we actually don't need to store
the metadata separately. For example, we can store a single document with this structure:

{code}
  {
    "document" : /* user's document */,
    "metadata" : {
      "schema-ref" : blah-blah
      ...
    }
  }
{code}

If represented as BSON, then the user's document is actually a nested BSON object
stored under the "document" property name. This approach means that all the
differencing, atomic operations, serialization, etc. functionality will work without
having to distinguish between a user's document and a system metadata document. The
resulting value class would be:

{code}
class SchematicValue {
    org.bson.BSONObject json;
}
{code}
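
As a rough sketch of the single-document idea, the following wraps a user's document and its metadata under one root and then reads both through the same lookup code. Plain {{java.util.Map}} instances stand in for BSON objects so that the example compiles without the BSON library; the {{SchematicDocuments}} class and its {{wrap}} and {{resolve}} methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; maps stand in for BSON objects. */
final class SchematicDocuments {
    private SchematicDocuments() {}

    /** Builds the combined structure: { "document": ..., "metadata": { "schema-ref": ... } }. */
    static Map<String, Object> wrap(Map<String, Object> userDocument, String schemaRef) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("schema-ref", schemaRef);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("document", userDocument);
        root.put("metadata", metadata);
        return root;
    }

    /** Resolves a dotted path such as "metadata.schema-ref"; returns null if a segment is missing. */
    @SuppressWarnings("unchecked")
    static Object resolve(Map<String, Object> root, String dottedPath) {
        Object current = root;
        for (String segment : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<String, Object>) current).get(segment);
        }
        return current;
    }
}
```

The same {{resolve}} call works for {{"document.name"}} and for {{"metadata.schema-ref"}}, which is the point: nothing in the traversal needs to know which part is the user's and which is the system's.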

The downside of doing this is the increase in size of the {{SchematicValue}} class, from
two references to one reference plus an extra BSONObject implementation.
{{org.bson.BasicBSONObject}} extends {{LinkedHashMap<String,Object>}}, but we could
probably provide an optimized implementation for the top level that still implements the
{{org.bson.BSONObject}} interface, only more directly. Is it worth the apparent simplicity?
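
One possible shape for such an optimized top level is sketched below: two plain fields behind map-style accessors, so no hash table is allocated for the root. The {{TopLevelDocument}} class is hypothetical. Its method names mirror a subset of {{org.bson.BSONObject}} ({{get}}, {{put}}, {{containsField}}, {{keySet}}), but it deliberately does not implement that interface here so that the sketch compiles without the BSON library.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical fixed-shape root document holding only "document" and "metadata". */
final class TopLevelDocument {
    static final String DOCUMENT = "document";
    static final String METADATA = "metadata";

    private Map<String, Object> document;
    private Map<String, Object> metadata;

    TopLevelDocument(Map<String, Object> document, Map<String, Object> metadata) {
        this.document = document;
        this.metadata = metadata;
    }

    /** Returns the nested document stored under the key, or null for unknown keys. */
    Object get(String key) {
        if (DOCUMENT.equals(key)) return document;
        if (METADATA.equals(key)) return metadata;
        return null;
    }

    /** Replaces one of the two known fields and returns the previous value. */
    @SuppressWarnings("unchecked")
    Object put(String key, Object value) {
        if (value != null && !(value instanceof Map)) {
            throw new IllegalArgumentException("Top-level values must be nested documents");
        }
        Map<String, Object> nested = (Map<String, Object>) value;
        Object previous;
        if (DOCUMENT.equals(key)) {
            previous = document;
            document = nested;
        } else if (METADATA.equals(key)) {
            previous = metadata;
            metadata = nested;
        } else {
            throw new IllegalArgumentException("Unsupported top-level field: " + key);
        }
        return previous;
    }

    boolean containsField(String key) {
        return get(key) != null;
    }

    /** Lists only the fields that are currently set, in a fixed order. */
    Set<String> keySet() {
        Set<String> keys = new LinkedHashSet<>();
        if (document != null) keys.add(DOCUMENT);
        if (metadata != null) keys.add(METADATA);
        return keys;
    }
}
```

The per-value cost is then the same two references as the original proposal plus one object header, while callers still see a uniform document-style API.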

      was (Author: rhauch):
    I completely agree that storing metadata as a JSON structure is a brilliant approach.
I've added several features to the list in the description.

Storing the jsonObject as a string might work, especially if it is simply loaded and
stored as an atomic unit. However, there are several advantages to storing it in a BSON
(or BSON-like) representation, and all boil down to the fact that any schema-aware service
will need to access the document contents for indexing, validation, computing differences
(for DeltaAware functionality), and even applying changes (e.g., something similar to
MongoDB's [atomic operations|http://www.mongodb.org/display/DOCS/Atomic+Operations]).
Using a BSON representation will also make it more natural to handle binary values within
the document.

Now, if we did store the document internally as BSON, we actually don't need to store
the metadata separately. For example, we can store a single document with this structure:

{code}
  {
    "document" : /* user's document */
    "metadata" : {
      "schema-ref" : blah-blah
      ...
    }
{code}

If represented as BSON, then the user's document is actually a nested BSON object
stored under the "document" property name. This approach means that all the
differencing, atomic operations, serialization, etc. functionality will work without
having to distinguish between a user's document and a system metadata document. The
resulting value class would be:

{code}
class SchematicValue {
    BSON json;
}
{code}

...
 Soft schema-based storage
 -------------------------

                 Key: ISPN-1103
                 URL: https://issues.jboss.org/browse/ISPN-1103
             Project: Infinispan
          Issue Type: Feature Request
          Components: Core API
            Reporter: Manik Surtani
            Assignee: Manik Surtani
             Fix For: 5.1.0.BETA1, 5.1.0.Final

 This JIRA is about storing metadata alongside values, perhaps by encapsulating values as
SchematicValues, which could be described as:
 {code}
   class SchematicValue {
     String jsonMetadata;
     String jsonObject;
   }
 {code}
 Metadata would allow for a few interesting features:
 * Extraction of lifespan and timestamp data if manipulated over a remote protocol (REST,
HotRod, etc.)
 * Content type for REST responses
 * Timestamps and SHA-1 hashes, useful for HTTP headers (e.g., ETag, Cache-Control,
etc.)
 * Validation information (may not be processed by Infinispan, but can be used by client
libs)
 * Classloader/marshaller/classdef version info
 * General structure of the information stored
 * Reference to the schema for this document
 * Storage of older versions 
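
 As an illustration of the SHA-1 item in the list above, here is a minimal sketch of deriving an HTTP ETag value from a stored JSON string. The {{EntityTags}} class and {{sha1ETag}} method are hypothetical names, and hashing the UTF-8 bytes of the JSON text is an assumption rather than anything this issue specifies.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, not part of Infinispan. */
final class EntityTags {
    private EntityTags() {}

    /** Returns a quoted, lower-case hex SHA-1 digest suitable for an ETag header value. */
    static String sha1ETag(String jsonObject) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(jsonObject.getBytes(StandardCharsets.UTF_8));
            StringBuilder tag = new StringBuilder(hash.length * 2 + 2);
            tag.append('"');
            for (byte b : hash) {
                tag.append(String.format("%02x", b));
            }
            tag.append('"');
            return tag.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("SHA-1 is unavailable", e);
        }
    }
}
```

 Storing the hash in the metadata would let a REST endpoint answer conditional requests without rehashing the value on every read.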
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
