[JBoss JIRA] (ARTIF-683) Switch to RDBMS, Hibernate ORM, Hibernate Search, and Hibernate 2nd Level Cache as the persistence solution
by Randall Hauch (JIRA)
[ https://issues.jboss.org/browse/ARTIF-683?page=com.atlassian.jira.plugin.... ]
Randall Hauch updated ARTIF-683:
--------------------------------
Description:
Artificer currently uses ModeShape + Infinispan + JDBC as its storage. Back when Artificer was a simple S-RAMP impl, JCR made a lot of sense. The S-RAMP spec is essentially a hierarchical artifact repo that maintains the node metadata and relationships between them. However, the "hierarchical" bit is overstated -- it's limited to a primary artifact and its derived artifact (ex -- primary: XSD, derived: type declarations). So, the hierarchy is at most 2 levels and could be represented by a simple relationship or one-to-one foreign key. The only time the hierarchical structure is helpful is when we look up an artifact by its UUID (due to a specific tree structure we use). But otherwise, I think it's a bit of a misnomer.
We're now extending well beyond S-RAMP. In addition to an artifact/metadata/info repo, we're trying to position the project as a more general repo for multiple projects and service information. Most importantly, the relationship requirements will expand the most. As such, I'm thinking we'd be better served by alternatives.
Note that this is essentially a read-intensive system. Writes do of course occur, but they're almost always *additions*. Nodes are rarely updated once created. Locking and isolation should be used, but can be extremely optimistic. Also note that most artifacts have files with them. That currently uses a local filesystem store through ISPN, but could certainly be NAS.
Additional fuel for the fire: many enterprise-level development shops have millions of artifacts, exponentially higher once derivation kicks in. Further, many have multiple relationships defined.
Ideas:
1.) Switch to RDBMS + Hibernate ORM + Hibernate Search + Hibernate 2nd Level Caching. Although the structure originally looked JCR-specific, it may make a lot more sense as a relational DB. HSearch is a no-brainer -- the full-text search capability would be vastly improved, right out of the box. And the RDBMS + in-memory-cache would be perfect for the read-intensive environment and scalability.
2.) Graph databases: Neo4j (w/ or w/o Hibernate OGM), OrientDB, RocksDB, etc. The concern here is mainly horizontal scaling and, from what I understand, their (lack of adequate) clustering support. But, it's definitely an option.
3.) Distributed but strongly consistent database: RocksDB, CockroachDB. These are newer, but can (theoretically) scale larger than relational, and because they replicate data it might be more durable or at least recover faster in the event of failure. On the other hand, this may be more difficult for enterprises to adopt.
4.) Stick with MS + ISPN, but use Cassandra behind it (instead of JDBC). Arguably, this wouldn't really change things and could potentially end up worse.
5.) Tinkerpop/Blueprints (graph API). Hawkular is using this. However, from what I've heard elsewhere, it's a horrible standard. Solutions that attempt to implement it end up in a state of twisted adaptation, resulting in performance hits.
In the end, I'd argue that #1 is the best from enterprise-level, scalability, reliability, and configurability standpoints.
was:
Artificer currently uses ModeShape + Infinispan + JDBC as its storage. Back when Artificer was a simple S-RAMP impl, JCR made a lot of sense. The S-RAMP spec is essentially a hierarchical artifact repo that maintains the node metadata and relationships between them. However, the "hierarchical" bit is overstated -- it's limited to a primary artifact and its derived artifact (ex -- primary: XSD, derived: type declarations). So, the hierarchy is at most 2 levels and could be represented by a simple relationship or one-to-one foreign key. The only time the hierarchical structure is helpful is when we look up an artifact by its UUID (due to a specific tree structure we use). But otherwise, I think it's a bit of a misnomer.
We're now extending well beyond S-RAMP. In addition to an artifact/metadata/info repo, we're trying to position the project as a more general repo for multiple projects and service information. Most importantly, the relationship requirements will expand the most. As such, I'm thinking we'd be better served by alternatives.
Note that this is essentially a read-intensive system. Writes do of course occur, but they're almost always *additions*. Nodes are rarely updated once created. Locking and isolation should be used, but can be extremely optimistic. Also note that most artifacts have files with them. That currently uses a local filesystem store through ISPN, but could certainly be NAS.
Additional fuel for the fire: many enterprise-level development shops have millions of artifacts, exponentially higher once derivation kicks in. Further, many have multiple relationships defined.
Ideas:
1.) Switch to RDBMS + Hibernate ORM + Hibernate Search + Hibernate 2nd Level Caching. Although the structure originally looked JCR-specific, it may make a lot more sense as a relational DB. HSearch is a no-brainer -- the full-text search capability would be vastly improved, right out of the box. And the RDBMS + in-memory-cache would be perfect for the read-intensive environment and scalability.
2.) Graph databases: Neo4j (w/ or w/o Hibernate OGM), OrientDB, RocksDB, CockroachDB, etc. The concern here is mainly horizontal scaling and, from what I understand, their (lack of adequate) clustering support. But, it's definitely an option.
3.) Stick with MS + ISPN, but use Cassandra behind it (instead of JDBC). Arguably, this wouldn't really change things and could potentially end up worse.
4.) Tinkerpop/Blueprints (graph API). Hawkular is using this. However, from what I've heard elsewhere, it's a horrible standard. Solutions that attempt to implement it end up in a state of twisted adaptation, resulting in performance hits.
In the end, I'd argue that #1 is the best from enterprise-level, scalability, reliability, and configurability standpoints.
> Switch to RDBMS, Hibernate ORM, Hibernate Search, and Hibernate 2nd Level Cache as the persistence solution
> -----------------------------------------------------------------------------------------------------------
>
> Key: ARTIF-683
> URL: https://issues.jboss.org/browse/ARTIF-683
> Project: Artificer
> Issue Type: Feature Request
> Reporter: Brett Meyer
> Assignee: Brett Meyer
>
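The point in the description about the hierarchy being at most two levels deep can be illustrated with a small relational sketch. This is only an illustration, not Artificer's actual schema: the table name, column names, and artifact type strings below are assumptions, and SQLite stands in for whatever RDBMS would be chosen. A derived artifact simply carries a foreign key back to its primary artifact, and lookup by UUID becomes a primary-key lookup.

```python
import sqlite3
import uuid


def create_schema(conn):
    """Create a hypothetical single-table layout for primary and derived artifacts."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute(
        """
        CREATE TABLE artifact (
            uuid          TEXT PRIMARY KEY,
            name          TEXT NOT NULL,
            artifact_type TEXT NOT NULL,
            derived_from  TEXT REFERENCES artifact(uuid)  -- NULL for primary artifacts
        )
        """
    )


def add_artifact(conn, name, artifact_type, derived_from=None):
    """Insert an artifact and return its generated UUID."""
    artifact_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO artifact (uuid, name, artifact_type, derived_from) VALUES (?, ?, ?, ?)",
        (artifact_id, name, artifact_type, derived_from),
    )
    return artifact_id


def derived_names(conn, primary_uuid):
    """Return the names of all artifacts derived from the given primary artifact."""
    rows = conn.execute(
        "SELECT name FROM artifact WHERE derived_from = ? ORDER BY name",
        (primary_uuid,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)

# Primary: an XSD. Derived: its type declarations (names are made up).
xsd = add_artifact(conn, "orders.xsd", "XsdDocument")
add_artifact(conn, "OrderType", "ComplexTypeDeclaration", derived_from=xsd)
add_artifact(conn, "ItemType", "ComplexTypeDeclaration", derived_from=xsd)

print(derived_names(conn, xsd))
```

With this layout no tree traversal is needed: the "hierarchy" is one nullable column, and the database rejects a derived artifact that points at a primary artifact that does not exist.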
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
9 years, 8 months
[JBoss JIRA] (ARTIF-683) Switch to RDBMS, Hibernate ORM, Hibernate Search, and Hibernate 2nd Level Cache as the persistence solution
by Randall Hauch (JIRA)
[ https://issues.jboss.org/browse/ARTIF-683?page=com.atlassian.jira.plugin.... ]
Randall Hauch updated ARTIF-683:
--------------------------------
Description:
Artificer currently uses ModeShape + Infinispan + JDBC as its storage. Back when Artificer was a simple S-RAMP impl, JCR made a lot of sense. The S-RAMP spec is essentially a hierarchical artifact repo that maintains the node metadata and relationships between them. However, the "hierarchical" bit is overstated -- it's limited to a primary artifact and its derived artifact (ex -- primary: XSD, derived: type declarations). So, the hierarchy is at most 2 levels and could be represented by a simple relationship or one-to-one foreign key. The only time the hierarchical structure is helpful is when we look up an artifact by its UUID (due to a specific tree structure we use). But otherwise, I think it's a bit of a misnomer.
We're now extending well beyond S-RAMP. In addition to an artifact/metadata/info repo, we're trying to position the project as a more general repo for multiple projects and service information. Most importantly, the relationship requirements will expand the most. As such, I'm thinking we'd be better served by alternatives.
Note that this is essentially a read-intensive system. Writes do of course occur, but they're almost always *additions*. Nodes are rarely updated once created. Locking and isolation should be used, but can be extremely optimistic. Also note that most artifacts have files with them. That currently uses a local filesystem store through ISPN, but could certainly be NAS.
Additional fuel for the fire: many enterprise-level development shops have millions of artifacts, exponentially higher once derivation kicks in. Further, many have multiple relationships defined.
Ideas:
1.) Switch to RDBMS + Hibernate ORM + Hibernate Search + Hibernate 2nd Level Caching. Although the structure originally looked JCR-specific, it may make a lot more sense as a relational DB. HSearch is a no-brainer -- the full-text search capability would be vastly improved, right out of the box. And the RDBMS + in-memory-cache would be perfect for the read-intensive environment and scalability.
2.) Graph databases: Neo4j (w/ or w/o Hibernate OGM), OrientDB, etc. The concern here is mainly horizontal scaling and, from what I understand, their (lack of adequate) clustering support. But, it's definitely an option.
3.) Distributed but strongly consistent database: RocksDB (a variant of LevelDB), CockroachDB. These are newer, but can (theoretically) scale larger than relational, and because they replicate data it might be more durable or at least recover faster in the event of failure. On the other hand, this may be more difficult for enterprises to adopt.
4.) Stick with MS + ISPN, but use Cassandra behind it (instead of JDBC). Arguably, this wouldn't really change things and could potentially end up worse.
5.) Tinkerpop/Blueprints (graph API). Hawkular is using this. However, from what I've heard elsewhere, it's a horrible standard. Solutions that attempt to implement it end up in a state of twisted adaptation, resulting in performance hits.
In the end, I'd argue that #1 is the best from enterprise-level, scalability, reliability, and configurability standpoints.
was:
Artificer currently uses ModeShape + Infinispan + JDBC as its storage. Back when Artificer was a simple S-RAMP impl, JCR made a lot of sense. The S-RAMP spec is essentially a hierarchical artifact repo that maintains the node metadata and relationships between them. However, the "hierarchical" bit is overstated -- it's limited to a primary artifact and its derived artifact (ex -- primary: XSD, derived: type declarations). So, the hierarchy is at most 2 levels and could be represented by a simple relationship or one-to-one foreign key. The only time the hierarchical structure is helpful is when we look up an artifact by its UUID (due to a specific tree structure we use). But otherwise, I think it's a bit of a misnomer.
We're now extending well beyond S-RAMP. In addition to an artifact/metadata/info repo, we're trying to position the project as a more general repo for multiple projects and service information. Most importantly, the relationship requirements will expand the most. As such, I'm thinking we'd be better served by alternatives.
Note that this is essentially a read-intensive system. Writes do of course occur, but they're almost always *additions*. Nodes are rarely updated once created. Locking and isolation should be used, but can be extremely optimistic. Also note that most artifacts have files with them. That currently uses a local filesystem store through ISPN, but could certainly be NAS.
Additional fuel for the fire: many enterprise-level development shops have millions of artifacts, exponentially higher once derivation kicks in. Further, many have multiple relationships defined.
Ideas:
1.) Switch to RDBMS + Hibernate ORM + Hibernate Search + Hibernate 2nd Level Caching. Although the structure originally looked JCR-specific, it may make a lot more sense as a relational DB. HSearch is a no-brainer -- the full-text search capability would be vastly improved, right out of the box. And the RDBMS + in-memory-cache would be perfect for the read-intensive environment and scalability.
2.) Graph databases: Neo4j (w/ or w/o Hibernate OGM), OrientDB, RocksDB, etc. The concern here is mainly horizontal scaling and, from what I understand, their (lack of adequate) clustering support. But, it's definitely an option.
3.) Distributed but strongly consistent database: RocksDB, CockroachDB. These are newer, but can (theoretically) scale larger than relational, and because they replicate data it might be more durable or at least recover faster in the event of failure. On the other hand, this may be more difficult for enterprises to adopt.
4.) Stick with MS + ISPN, but use Cassandra behind it (instead of JDBC). Arguably, this wouldn't really change things and could potentially end up worse.
5.) Tinkerpop/Blueprints (graph API). Hawkular is using this. However, from what I've heard elsewhere, it's a horrible standard. Solutions that attempt to implement it end up in a state of twisted adaptation, resulting in performance hits.
In the end, I'd argue that #1 is the best from enterprise-level, scalability, reliability, and configurability standpoints.
> Switch to RDBMS, Hibernate ORM, Hibernate Search, and Hibernate 2nd Level Cache as the persistence solution
> -----------------------------------------------------------------------------------------------------------
>
> Key: ARTIF-683
> URL: https://issues.jboss.org/browse/ARTIF-683
> Project: Artificer
> Issue Type: Feature Request
> Reporter: Brett Meyer
> Assignee: Brett Meyer
>
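The remark that locking "can be extremely optimistic" can be sketched with a version column: a writer updates a row only if the version it originally read is still current, and no lock is held in between. This is a minimal sketch under assumptions: the table layout, the `StaleWriteError` name, and the `update_description` helper are hypothetical, and SQLite stands in for the real database. JPA's `@Version` attribute expresses the same idea in Hibernate ORM.

```python
import sqlite3


class StaleWriteError(Exception):
    """Raised when another writer changed the row after it was read."""


def update_description(conn, artifact_id, new_description, expected_version):
    """Update the row only if its version still equals expected_version."""
    cursor = conn.execute(
        "UPDATE artifact SET description = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_description, artifact_id, expected_version),
    )
    if cursor.rowcount != 1:
        # Zero rows matched: the row is missing or its version has moved on.
        raise StaleWriteError(f"artifact {artifact_id} is not at version {expected_version}")
    return expected_version + 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE artifact ("
    "id INTEGER PRIMARY KEY, description TEXT, version INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO artifact (id, description) VALUES (1, 'initial')")

# First writer read version 0 and succeeds, bumping the row to version 1.
new_version = update_description(conn, 1, "first edit", expected_version=0)

# Second writer also read version 0, so its update is rejected.
try:
    update_description(conn, 1, "conflicting edit", expected_version=0)
    conflict_detected = False
except StaleWriteError:
    conflict_detected = True
```

Because writes are described as almost always additions and nodes are rarely updated, conflicts of this kind should be rare, which is the case where optimistic checks are cheapest.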
9 years, 8 months
[JBoss JIRA] (ARTIF-683) Switch to RDBMS, Hibernate ORM, Hibernate Search, and Hibernate 2nd Level Cache as the persistence solution
by Brett Meyer (JIRA)
Brett Meyer created ARTIF-683:
---------------------------------
Summary: Switch to RDBMS, Hibernate ORM, Hibernate Search, and Hibernate 2nd Level Cache as the persistence solution
Key: ARTIF-683
URL: https://issues.jboss.org/browse/ARTIF-683
Project: Artificer
Issue Type: Feature Request
Reporter: Brett Meyer
Assignee: Brett Meyer
Artificer currently uses ModeShape + Infinispan + JDBC as its storage. Back when Artificer was a simple S-RAMP impl, JCR made a lot of sense. The S-RAMP spec is essentially a hierarchical artifact repo that maintains the node metadata and relationships between them. However, the "hierarchical" bit is overstated -- it's limited to a primary artifact and its derived artifact (ex -- primary: XSD, derived: type declarations). So, the hierarchy is at most 2 levels and could be represented by a simple relationship or one-to-one foreign key. The only time the hierarchical structure is helpful is when we look up an artifact by its UUID (due to a specific tree structure we use). But otherwise, I think it's a bit of a misnomer.
We're now extending well beyond S-RAMP. In addition to an artifact/metadata/info repo, we're trying to position the project as a more general repo for multiple projects and service information. Most importantly, the relationship requirements will expand the most. As such, I'm thinking we'd be better served by alternatives.
Note that this is essentially a read-intensive system. Writes do of course occur, but they're almost always *additions*. Nodes are rarely updated once created. Locking and isolation should be used, but can be extremely optimistic. Also note that most artifacts have files with them. That currently uses a local filesystem store through ISPN, but could certainly be NAS.
Additional fuel for the fire: many enterprise-level development shops have millions of artifacts, exponentially higher once derivation kicks in. Further, many have multiple relationships defined.
Ideas:
1.) Switch to RDBMS + Hibernate ORM + Hibernate Search + Hibernate 2nd Level Caching. Although the structure originally looked JCR-specific, it may make a lot more sense as a relational DB. HSearch is a no-brainer -- the full-text search capability would be vastly improved, right out of the box. And the RDBMS + in-memory-cache would be perfect for the read-intensive environment and scalability.
2.) Graph databases: Neo4j (w/ or w/o Hibernate OGM), OrientDB, RocksDB, CockroachDB, etc. The concern here is mainly horizontal scaling and, from what I understand, their (lack of adequate) clustering support. But, it's definitely an option.
3.) Stick with MS + ISPN, but use Cassandra behind it (instead of JDBC). Arguably, this wouldn't really change things and could potentially end up worse.
4.) Tinkerpop/Blueprints (graph API). Hawkular is using this. However, from what I've heard elsewhere, it's a horrible standard. Solutions that attempt to implement it end up in a state of twisted adaptation, resulting in performance hits.
In the end, I'd argue that #1 is the best from enterprise-level, scalability, reliability, and configurability standpoints.
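To make the "RDBMS + in-memory cache" argument for a read-intensive workload concrete, here is a rough read-through cache sketch. It is not how Hibernate's 2nd level cache is implemented; the `CachedArtifactStore` class, its method names, and the table layout are assumptions made for illustration, with SQLite standing in for the RDBMS. Reads go to the database once and are then served from memory; additions need no invalidation, and the rare update evicts a single entry.

```python
import sqlite3


class CachedArtifactStore:
    """Hypothetical read-through cache in front of a relational table."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}    # artifact id -> name
        self.db_reads = 0  # counts how often a read actually reached the database

    def add(self, artifact_id, name):
        # Additions are the common write; a new row cannot be stale in the cache.
        self.conn.execute(
            "INSERT INTO artifact (id, name) VALUES (?, ?)", (artifact_id, name)
        )

    def get(self, artifact_id):
        if artifact_id in self.cache:
            return self.cache[artifact_id]
        self.db_reads += 1
        row = self.conn.execute(
            "SELECT name FROM artifact WHERE id = ?", (artifact_id,)
        ).fetchone()
        if row is None:
            return None
        self.cache[artifact_id] = row[0]
        return row[0]

    def rename(self, artifact_id, name):
        # Updates are rare; evict the cached entry so the next read reloads it.
        self.conn.execute(
            "UPDATE artifact SET name = ? WHERE id = ?", (name, artifact_id)
        )
        self.cache.pop(artifact_id, None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifact (id TEXT PRIMARY KEY, name TEXT NOT NULL)")

store = CachedArtifactStore(conn)
store.add("a1", "orders.xsd")
first = store.get("a1")   # reaches the database
second = store.get("a1")  # served from the in-memory cache
reads_after_two_gets = store.db_reads
```

The sketch ignores clustering and cache size limits, which a real 2nd level cache provider would have to handle.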
9 years, 8 months
[JBoss JIRA] (ARTIF-657) Remove updateContent capability
by Brett Meyer (JIRA)
[ https://issues.jboss.org/browse/ARTIF-657?page=com.atlassian.jira.plugin.... ]
Brett Meyer updated ARTIF-657:
------------------------------
Fix Version/s: 1.1.0.Final, 1.0.0.Beta1
> Remove updateContent capability
> -------------------------------
>
> Key: ARTIF-657
> URL: https://issues.jboss.org/browse/ARTIF-657
> Project: Artificer
> Issue Type: Bug
> Reporter: Brett Meyer
> Assignee: Brett Meyer
> Fix For: 1.1.0.Final, 1.0.0.Beta1
>
>
> In the middle of ImpactAnalysisDemo, after all artifacts and relationships have been created, run updateContent on the XSD, using the same .xsd file. The demo will still run successfully. But when finished, pull up the UI and hit one of the Part "parameter" artifacts, then hit its Relationship tab. It fails, most likely because the relationship wasn't re-created.
> Not sure how to handle that. Should the relationships really be re-generated? If so, what happens if, for example, a WSDL Part uses a type that's been modified or removed?
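One way the failure described above could arise is sketched below as a toy in-memory model. It rests on an assumption the report does not confirm: that updating content discards the old derived artifacts and derives new ones with fresh identifiers, while existing relationships keep pointing at the old identifiers. The `ToyRepo` class and every name in it are hypothetical and are not Artificer's API.

```python
import uuid


class ToyRepo:
    """Hypothetical model of derived artifacts and relationships targeting them."""

    def __init__(self):
        self.derived = {}        # derived artifact id -> (primary name, type name)
        self.relationships = []  # (source label, target derived artifact id)

    def upload(self, primary, type_names):
        """Derive one artifact per type name and return {type name: new id}."""
        ids = {}
        for type_name in type_names:
            new_id = str(uuid.uuid4())
            self.derived[new_id] = (primary, type_name)
            ids[type_name] = new_id
        return ids

    def update_content(self, primary, type_names):
        """Assumed behaviour: drop the old derived artifacts, then re-derive."""
        stale = [d for d, (owner, _) in self.derived.items() if owner == primary]
        for derived_id in stale:
            del self.derived[derived_id]
        return self.upload(primary, type_names)

    def dangling(self):
        """Relationships whose target no longer exists."""
        return [(s, t) for s, t in self.relationships if t not in self.derived]


repo = ToyRepo()
types = repo.upload("orders.xsd", ["OrderType"])

# A WSDL Part that refers to a type declared in the XSD.
repo.relationships.append(("Part 'parameter'", types["OrderType"]))
before = repo.dangling()

# Re-uploading the very same content still replaces the derived artifacts.
repo.update_content("orders.xsd", ["OrderType"])
after = repo.dangling()
```

In this model the relationship is intact before the update and dangling afterwards, even though the XSD content did not change. Re-generating relationships would repair this case, but it would not answer the open question of what to do when the referenced type was actually modified or removed.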
9 years, 8 months
[JBoss JIRA] (ARTIF-674) Separate from the S-RAMP spec
by Brett Meyer (JIRA)
[ https://issues.jboss.org/browse/ARTIF-674?page=com.atlassian.jira.plugin.... ]
Brett Meyer closed ARTIF-674.
-----------------------------
Resolution: Won't Fix
> Separate from the S-RAMP spec
> -----------------------------
>
> Key: ARTIF-674
> URL: https://issues.jboss.org/browse/ARTIF-674
> Project: Artificer
> Issue Type: Task
> Reporter: Brett Meyer
> Assignee: Brett Meyer
> Fix For: 2.0
>
>
> For 2.0, we might consider separating from the S-RAMP spec. IMO, the spec should be largely considered dead and abandoned. RH seems to be the only entity still "using" it. Further, many pieces are starting to hold things back, in addition to adding needless complexity. Instead of focusing on conformance, use only "the good parts". Claim that the project is "loosely based on S-RAMP".
> Ideas:
> REMOVE
> - Atom binding: Instead, use pure JSON REST
> - The web UI services: With the server services using JSON REST, we'd no longer need an additional service layer specifically for the web UI.
> - All "s-ramp" namespaces
> KEEP
> - Model schemas and bindings.
> - Query core syntax
9 years, 8 months