[https://issues.jboss.org/browse/ISPN-1103?page=com.atlassian.jira.plugin....]
Randall Hauch commented on ISPN-1103:
-------------------------------------
The design has been evolving, and I've been pushing (overwriting) new versions of the
branch. Here's a summary of the basic design:
The primary goal is to enable storing dynamically-structured values with metadata, and to
also enable describing the structure of each value (and metadata) using a schema-based
approach. [JSON|http://json.org] documents provide an excellent way to offer structure
that is extremely flexible, while [JSON Schema|http://json-schema.org/] offers a way to
define the structure of JSON documents in a way that can be easily validated. (Note that a
JSON Schema is just a JSON document that conforms to the JSON meta-schema, which is rich
enough to be self-describing. It's actually a very nice specification.)
Manik originally suggested storing the metadata and value (henceforth referred to as
'content') as strings, but doing so would mean that in order to access any
information within the metadata or content, the JSON strings would first need to be parsed
into an in-memory representation. Plus, if the content is to be modified, the JSON
document would need to be modified and written back as a string before being stored. The
cost of this repeated parsing and writing would become prohibitive.
Since Infinispan is essentially a large heap of memory, it makes far more sense to
represent the content and metadata as _in-memory documents_, as long as the in-memory
representation were compatible with JSON, were easy to use, and could be validated using
JSON Schemas. Additionally, if the representation also supported
[BSON|http://bsonspec.org] data types (e.g., binary values, UUIDs, dates, regular
expressions, etc.), more types of user-content could be supported (including just raw
binary data). These in-memory documents could at any time be read from or written to JSON
or BSON formats. Having the schematic values be _delta-aware_ with _fine-grained locking_
(see ISPN-1115) would provide significant advantages w/r/t performance and concurrency.
(Note that efficient delta-aware support means that the schematic value can capture
the changes made to the documents by the client application and use those changes as the
delta, rather than having to compare the changed document to a prior version to compute
the changes.)
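To make the delta-aware idea concrete, here is a minimal sketch of a document wrapper that records each edit as it is made, so the delta is simply the edit log rather than the result of a comparison. All names (DeltaRecordingSketch, RecordingDocument, Change) are hypothetical and are not taken from the branch or from Infinispan's own delta interfaces.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Infinispan branch.
class DeltaRecordingSketch {

    /** One recorded change: a field was either set to a value or removed. */
    static final class Change {
        final String field;
        final Object value;
        final boolean removed;

        Change(String field, Object value, boolean removed) {
            this.field = field;
            this.value = value;
            this.removed = removed;
        }
    }

    /** A mutable document that logs every edit at the moment it is made. */
    static final class RecordingDocument {
        private final Map<String, Object> fields = new LinkedHashMap<>();
        private final List<Change> changes = new ArrayList<>();

        void set(String field, Object value) {
            fields.put(field, value);
            changes.add(new Change(field, value, false));
        }

        void remove(String field) {
            fields.remove(field);
            changes.add(new Change(field, null, true));
        }

        /** Returns the edits recorded so far and clears the log; this list is the delta. */
        List<Change> drainDelta() {
            List<Change> delta = new ArrayList<>(changes);
            changes.clear();
            return delta;
        }

        /** Copies the current state, e.g. to seed a replica. */
        Map<String, Object> snapshot() {
            return new LinkedHashMap<>(fields);
        }
    }

    /** Replays a delta against another copy of the document (for example, on another node). */
    static void apply(Map<String, Object> target, List<Change> delta) {
        for (Change change : delta) {
            if (change.removed) {
                target.remove(change.field);
            } else {
                target.put(change.field, change.value);
            }
        }
    }

    public static void main(String[] args) {
        RecordingDocument doc = new RecordingDocument();
        doc.set("title", "draft");
        doc.set("pages", 12);
        Map<String, Object> replica = doc.snapshot();
        doc.drainDelta(); // treat the state so far as already replicated

        doc.set("title", "final");
        doc.remove("pages");
        List<Change> delta = doc.drainDelta(); // only the two edits above

        apply(replica, delta);
        System.out.println(replica.equals(doc.snapshot()));
    }
}
```

Only the two recorded edits travel to the replica, and no prior version of the document has to be kept around for comparison.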
Using an in-memory representation also means that the content and metadata need not be
stored as separate objects, but could instead be represented by a single document that is
conceptually:
{code}
{
"metadata" : {
/* metadata as a nested document */
},
"content" : /* user's content, as a nested document or binary value */
}
{code}
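As a rough illustration of that conceptual shape, the sketch below builds the single document from plain Java maps, with the content being either a nested document or a binary value. The class and method names and the "contentType" field are assumptions made for the example; the actual design uses its own Document types rather than java.util.Map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch using plain maps in place of the real in-memory document types.
class SingleDocumentSketch {

    /** Builds one document that holds both the metadata and the user's content. */
    static Map<String, Object> entry(Map<String, Object> metadata, Object content) {
        // Content is either a nested document (a Map here) or a raw binary value.
        if (!(content instanceof Map) && !(content instanceof byte[])) {
            throw new IllegalArgumentException(
                    "content must be a nested document or a binary value");
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("metadata", metadata);
        doc.put("content", content);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("contentType", "application/json"); // assumed field name

        Map<String, Object> content = new LinkedHashMap<>();
        content.put("name", "example");

        Map<String, Object> structured = entry(metadata, content);
        Map<String, Object> binary =
                entry(new LinkedHashMap<String, Object>(), new byte[] {1, 2, 3});

        System.out.println(structured.keySet());
        System.out.println(binary.get("content") instanceof byte[]);
    }
}
```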
This is the approach taken by the current design. The primary packages are:
* org.infinispan.schematic
* org.infinispan.schematic.document
* org.infinispan.schematic.internal.*
The first two packages contain the public API, whereas all implementation-specific classes
are contained within the "internal" packages.
The primary API interfaces are:
* SchematicDb - similar to Cache but tailored to make it easy for users to store a content
document (or binary value) with a metadata document. Each SchematicDb has a JSON Schema
library and provides a map-reduce-based validation mechanism. Internally this uses a
Cache<String,SchematicEntry>.
* SchematicEntry - the value actually stored within Infinispan, and which contains a
content object (that is a Document or a Binary value) and a metadata Document. There are
methods for getting a mutable interface to the content and metadata documents.
Since tracking the MIME type of the content is likely very common, the SchematicEntry
interface provides methods for getting and setting the MIME type (which is actually stored
in the metadata).
* Document - an immutable interface to an in-memory document
* EditableDocument - a mutable interface to an in-memory document
* Json - utility class for parsing JSON formatted streams/files into Document instances,
and for writing Document instances as JSON
* Bson - utility class for parsing BSON formatted streams/files into Document instances,
and for writing Document instances as BSON
* JsonSchema - utility class for working with JSON Schemas
* Various interfaces for representing JSON/BSON values: Array, Binary, Symbol, Timestamp,
Code, CodeWithScope
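The split between read-only and mutable document views, and the idea of MIME type accessors that simply read and write a metadata field, can be sketched with plain maps as follows. This is not the actual SchematicEntry, Document, or EditableDocument API: the class EntrySketch, its method names, and the "contentType" metadata key are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in; this is not the SchematicEntry interface from the branch.
class EntrySketch {
    // Assumed key: the description does not name the metadata field holding the MIME type.
    static final String MIME_TYPE_FIELD = "contentType";

    private final Map<String, Object> metadata = new LinkedHashMap<>();
    private Object content; // a nested document (Map) or a binary value (byte[])

    /** Read-only view of the metadata, similar in spirit to Document. */
    Map<String, Object> readMetadata() {
        return Collections.unmodifiableMap(metadata);
    }

    /** Mutable view of the metadata, similar in spirit to EditableDocument. */
    Map<String, Object> editMetadata() {
        return metadata;
    }

    /** Convenience accessors: the MIME type is just a field in the metadata. */
    String getMimeType() {
        return (String) metadata.get(MIME_TYPE_FIELD);
    }

    void setMimeType(String mimeType) {
        metadata.put(MIME_TYPE_FIELD, mimeType);
    }

    Object getContent() {
        return content;
    }

    void setContent(Object content) {
        this.content = content;
    }

    public static void main(String[] args) {
        EntrySketch entry = new EntrySketch();
        entry.setContent(new byte[] {0x25, 0x50, 0x44, 0x46});
        entry.setMimeType("application/pdf");
        entry.editMetadata().put("author", "someone");

        System.out.println(entry.readMetadata());
        try {
            entry.readMetadata().put("author", "other");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only view rejected the write");
        }
    }
}
```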
The current status is that this works for LOCAL mode, but additional work is required
before DISTRIBUTED and REPLICATED modes will work correctly with delta-aware values and
[fine-grained locking|ISPN-1115].
As always, feedback is appreciated.
Soft schema-based storage
-------------------------
Key: ISPN-1103
URL: https://issues.jboss.org/browse/ISPN-1103
Project: Infinispan
Issue Type: Feature Request
Components: Core API
Reporter: Manik Surtani
Assignee: Randall Hauch
Priority: Critical
Fix For: 5.1.0.BETA2, 5.1.0.FINAL
This JIRA is about storing metadata alongside values. Perhaps encapsulating values as
SchematicValues, which could be described as:
{code}
class SchematicValue {
String jsonMetadata;
String jsonObject;
}
{code}
Metadata would allow for a few interesting features:
* Extraction of lifespan and timestamp data if manipulated over a remote protocol (REST,
HotRod, etc)
* Content type for REST responses
* Timestamps and SHA-1 hashes, useful for HTTP headers (e.g., ETag, Cache-Control,
etc.)
* Validation information (may not be processed by Infinispan, but can be used by client
libs)
* Classloader/marshaller/classdef version info
* General structure of the information stored
* Reference to the schema for this document
* Storage of older versions
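As one illustration of the hash-related item in the list above, the following sketch derives a quoted ETag value from the SHA-1 digest of a stored value using the standard java.security.MessageDigest class. The class and method names are hypothetical, and using the bare hex digest as the ETag is an assumption for the example, not something the issue specifies.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: how a SHA-1 hash kept in metadata could back an HTTP ETag.
class EtagSketch {

    /** Returns the SHA-1 digest of the data as lowercase hex. */
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Not expected on a standard JVM, where SHA-1 is normally available.
            throw new IllegalStateException(e);
        }
    }

    /** Wraps the hash in double quotes, the form an ETag header value takes. */
    static String etag(byte[] content) {
        return "\"" + sha1Hex(content) + "\"";
    }

    public static void main(String[] args) {
        byte[] value = "{\"name\":\"example\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println("ETag: " + etag(value));
    }
}
```

Storing the hash in the metadata would let a REST endpoint answer conditional requests without rehashing the content on every call.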