Re: [hibernate-dev] [OGM] Mapping of component types in document stores

Tuesday, 19 July 2016

Assuming we switch to the new mapping approach by default in 6, is
there a way via mapping to use the old approach? Or would that
require some new annotation?

I understand the evolution arguments, but frankly, in a plain MongoDB
document I would have skipped the single property name. It's forced
upon me by Java here (at least in the Enum case).
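
To make the two shapes concrete, here is a minimal sketch in plain Java. It
does not use any OGM or MongoDB driver API; the class and method names
(EmbeddedShapes, compact, explicit, read) are made up for illustration. It
builds both representations of a collection of single-property embeddables
and shows a reader that accepts either one, which is roughly what a
compatibility mode for the old format would have to do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hibernate OGM.
final class EmbeddedShapes {

    // Current shape: the single property name is omitted,
    // e.g. [ "great", "mediocre" ].
    public static List<Object> compact(List<String> values) {
        return new ArrayList<>(values);
    }

    // Proposed shape: each element is a sub-document keyed by the property
    // name, e.g. [ { "name" : "great" }, { "name" : "mediocre" } ].
    public static List<Object> explicit(String property, List<String> values) {
        List<Object> result = new ArrayList<>();
        for (String value : values) {
            Map<String, Object> element = new LinkedHashMap<>();
            element.put(property, value);
            result.add(element);
        }
        return result;
    }

    // Reads one stored element back, accepting either shape.
    public static Object read(String property, Object storedElement) {
        if (storedElement instanceof Map) {
            return ((Map<?, ?>) storedElement).get(property);
        }
        // Legacy compact element: the stored value is the property value.
        return storedElement;
    }
}
```

With the explicit shape, adding a second property later only adds a key to
each sub-document; with the compact shape, the existing elements would have
to be rewritten or handled by a fallback like the one in read.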

On Tue 2016-07-12 13:35, Guillaume Smet wrote:
...
 Hi,

 For the sake of completeness, here is the mapping obtained with Morphia:
 {
     "_id" : ObjectId("5784ca2612d0226cb309666d"),
     "className" : "TestEntity",
     "embeddeds" : [ { "singleProperty" : "value1" }, { "singleProperty" : "value2" } ],
     "embedded" : { "singleProperty" : "value" },
     "collectionOfStrings" : [ "string1", "string2" ]
 }
 They basically follow the POLA (principle of least astonishment): the
 MongoDB mapping mirrors the Java mapping.

 Btw, to be complete, here are the reasons why I would like to change
 it (I agree we have to maintain compatibility with older databases
 but, as Sanne, I think it should be the new default):
 1/ POLA: I would expect my datastore mapping to follow my Java mapping
 2/ related to 1/: I wouldn't expect to have to migrate my data when I
 simply add a property to an existing embeddable
 3/ remove special cases in our code, especially special cases present
 in the dialects
 4/ I don't think we are completely consistent with this behavior.
 Typically, if I take StoryGame from our tests and remove all the
 properties but one from OptionalStoryBranch, I end up with the
 following:
 - in the datastore: "chaoticBranches" : [ "[VENDETTA] assassinate the
 leader of the party", "[ARTIFACT] Search for the evil artifact" ] -
 this is what we expect, only one property, we remove the property
 level
 - in the native query generated by our JPA query "FROM StoryGame story
 JOIN story.chaoticBranches c WHERE c.evilText = '[ARTIFACT] Search for
 the evil artifact'": where={ "chaoticBranches.evilText" : "[ARTIFACT]
 Search for the evil artifact"}
 -> so our JPQL queries don't work if we only have one property in the
 embedded. We might also want to special case this but I really don't
 think it's a good idea.
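
 The mismatch in 4/ can be illustrated with a simplified sketch in plain
 Java. It only imitates an equality match on a dotted path for the case
 discussed here; it is not MongoDB's implementation, and DottedPathMatch and
 its methods are hypothetical names. A path such as
 "chaoticBranches.evilText" finds nothing when the array holds bare strings,
 but matches once each element is a sub-document.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified imitation of an equality match on a dotted path.
final class DottedPathMatch {

    // True if any value reachable through the path equals the expected value.
    // Lists are traversed element by element; scalars have no sub-fields.
    private static boolean walk(Object node, List<String> path, Object expected) {
        if (node instanceof List) {
            for (Object element : (List<?>) node) {
                if (walk(element, path, expected)) {
                    return true;
                }
            }
            return false;
        }
        if (path.isEmpty()) {
            return expected.equals(node);
        }
        if (node instanceof Map) {
            Object child = ((Map<?, ?>) node).get(path.get(0));
            return child != null && walk(child, path.subList(1, path.size()), expected);
        }
        // A bare string element cannot contain a field such as "evilText".
        return false;
    }

    public static boolean matches(Map<String, Object> document, String dottedPath, Object expected) {
        return walk(document, Arrays.asList(dottedPath.split("\\.")), expected);
    }
}
```

 Against the compact document the query path has to stop at
 "chaoticBranches"; against the explicit one the path generated from the
 JPQL query works unchanged.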

 While this discussion might seem to come out of the blue, it's in fact
 related to OGM-893 and another special casing we do. See my comment
 here:
https://hibernate.atlassian.net/browse/OGM-893?focusedCommentId=79245&...
 . The mapping is changing when we add a @Column with a name for a
 property of an embedded in a collection element.

 -- 
 Guillaume

 On Tue, Jul 12, 2016 at 12:18 PM, Sanne Grinovero <sanne(a)hibernate.org> wrote:
 > On 12 July 2016 at 11:13, Gunnar Morling <gunnar(a)hibernate.org> wrote:
 >>> I'd be concerned about schema evolution:
 >>
 >> Yes, that's the main argument; as said, I can see that.
 >>
 >>> I'd see more value in making this the default, and have an "higher
 >>> level" configuration property which is like "read like OGM 5.0 used
 >>> to store it".
 >>
 >> I wouldn't like changing such default in a 5.x release. For 6, ok, why not,
 >> if you all think that's better.
 >
 > ok
 >
 >>
 >>> Even better, we'd provide tooling which migrates an existing database.
 >>
 >> Sure, migration support is on the roadmap ;)
 >>
 >>
 >>
 >>
 >>
 >> 2016-07-12 11:06 GMT+01:00 Sanne Grinovero <sanne(a)hibernate.org>:
 >>>
 >>> On 12 July 2016 at 10:55, Gunnar Morling <gunnar(a)hibernate.org> wrote:
 >>> > Hi,
 >>> >
 >>> > We had an interesting discussion on how to map element collections of
 >>> > component types with a single column to document stores such as
 >>> > MongoDB.
 >>> >
 >>> > E.g. assume we have
 >>> >
 >>> >     @Entity
 >>> >     public class Person {
 >>> >
 >>> >         public String name;
 >>> >
 >>> >         @ElementCollection
 >>> >         public List<Status> statusHistory;
 >>> >     }
 >>> >
 >>> >     @Embeddable
 >>> >     public class Status {
 >>> >         public String name;
 >>> >     }
 >>> >
 >>> >
 >>> > Currently, that's mapped to documents like this:
 >>> >
 >>> >     {
 >>> >         "name"  : "Bob",
 >>> >         "statusHistory" : [
 >>> >             "great",
 >>> >             "mediocre",
 >>> >             "splendid"
 >>> >         ]
 >>> >     }
 >>>
 >>> "great", "mediocre", etc. are values of the `name` property?
 >>>
 >>> >
 >>> > I.e. if the component type has a single column, we omit the field name
 >>> > in the persistent structure. Whereas if there are multiple columns, it's
 >>> > added so we can properly read back such documents:
 >>> >
 >>> >
 >>> >     {
 >>> >         "name"  : "Bob",
 >>> >         "statusHistory" : [
 >>> >             { "name" : "great", "date" : "22.06.2016" },
 >>> >             { "name" : "mediocre", "date" : "15.05.2016" },
 >>> >             { "name" : "splendid", "date" : "12.04.2016" }
 >>> >         ]
 >>> >     }
 >>> >
 >>> > The question now is, should we also create such array of
 >>> > sub-documents, each containing the field name, in the case where there
 >>> > only is a single column. As far as I remember, the current structure
 >>> > has been chosen for the sake of efficiency but also simplicity (why
 >>> > deal with sub-documents if there only is a single field?).
 >>> >
 >>> > Guillaume is questioning the sanity of that, arguing that mapping this
 >>> > as an element collection of a component type rather than string should
 >>> > mandate the persistent structure to always contain the field name.
 >>>
 >>> I agree, but maybe for other reasons.
 >>> I'd be concerned about schema evolution: if I add a new attribute to
 >>> the `Status` class, say a "long timestampOfChance" for the sake of the
 >>> example, as a developer I might want to consider this a nullable value
 >>> as I'm aware that my existing database didn't define this property so
 >>> far.
 >>>
 >>> I wouldn't be happy to see failures on loading existing stored values
 >>> for Status#name : such mapping choices have to be very consistent.
 >>>
 >>> >
 >>> > We cannot change the default as we are committed to the MongoDB
 >>> > format, but if there is agreement that it's useful, we could add an
 >>> > option to enable this mapping.
 >>>
 >>> So many mapping options :-/
 >>>
 >>> I'd see more value in making this the default, and have an "higher
 >>> level" configuration property which is like "read like OGM 5.0 used
 >>> to store it".
 >>> Even better, we'd provide tooling which migrates an existing database.
 >>>
 >>> >
 >>> > I kind of see how this format simplifies migration (in case another
 >>> > field is added after a while), but personally I still like the more
 >>> > compact looks of the current approach. Having an option for it works
 >>> > for me.
 >>> >
 >>> > Any thoughts?
 >>> >
 >>> > --Gunnar
 >>> > _______________________________________________
 >>> > hibernate-dev mailing list
 >>> > hibernate-dev(a)lists.jboss.org
 >>> > https://lists.jboss.org/mailman/listinfo/hibernate-dev
 >>
 >>
