[hibernate-dev] [OGM] Mapping of component types in document stores

Tue Jul 19 13:27:32 EDT 2016

On 19 July 2016 at 18:14, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
> Assuming we switch to the new mapping approach by default in 6.
> Is there a way via mapping to use the old approach? Or would that
> require some new annotation?
>
> I understand the evolution arguments but frankly in a plain mongodb
> document, I would have skipped the single property name. It's forced
> upon me by Java here (at least the Enum case).

Good point, I guess I would have as well.

So the problem is not the mapping per se, but the fact that OGM makes
its own decisions and doesn't "see" the evolution of the schema over
time.

When developers make this choice explicitly and later figure out they
need to add new properties, they have nobody to blame but themselves
and will have to deal with their own problem. The problem here is that
they might not be aware of the problem.
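
To make that concrete, here is a rough sketch of what a reader would have
to do to cope with both layouts once a second property shows up. It uses
plain Java collections as a stand-in for a stored document;
StatusElementReader and readElement are names invented for this example
and are not OGM API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an OGM API: normalises both element layouts.
final class StatusElementReader {

    // Returns the element as a property map, whether it was stored as a
    // bare value (compact layout) or as a sub-document.
    static Map<String, Object> readElement(Object stored, String singleProperty) {
        Map<String, Object> element = new LinkedHashMap<>();
        if (stored instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) stored).entrySet()) {
                element.put(String.valueOf(entry.getKey()), entry.getValue());
            }
        } else {
            // Compact layout: the value can only belong to the one property
            // that existed when the document was written.
            element.put(singleProperty, stored);
        }
        return element;
    }

    public static void main(String[] args) {
        Map<String, Object> newer = new LinkedHashMap<>();
        newer.put("name", "mediocre");
        newer.put("date", "15.05.2016");
        // One element written before "date" existed, one written after.
        List<Object> statusHistory = Arrays.asList("great", newer);

        List<Map<String, Object>> elements = new ArrayList<>();
        for (Object stored : statusHistory) {
            elements.add(readElement(stored, "name"));
        }
        System.out.println(elements);
        // prints [{name=great}, {name=mediocre, date=15.05.2016}]
    }
}
```

Note that the reader has to be told which property the bare values
belonged to ("name" here). That is exactly the piece of history that is
lost if nobody recorded it.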

Rails has always encouraged dealing with app evolution explicitly,
adding schema migrations to the app changesets in version control (and
helping to apply them as needed). I wonder whether Hibernate should
help more with this too, and consequently be able to automatically do
something smarter in situations like this one.
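
As an illustration only, an explicit migration step for the single-column
case could look roughly like the sketch below. It works on a plain
Map standing in for a document rather than on a real datastore, and
SingleColumnMigration and wrapBareElements are made-up names, not
existing tooling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical migration step, not existing Hibernate OGM tooling.
final class SingleColumnMigration {

    // Wraps every bare element of the given collection field into a
    // sub-document keyed by propertyName. Elements that already are
    // sub-documents are left alone, so the step can be re-run safely.
    // Returns the number of elements that were converted.
    static int wrapBareElements(Map<String, Object> document,
                                String collectionField,
                                String propertyName) {
        Object value = document.get(collectionField);
        if (!(value instanceof List)) {
            return 0; // field missing or not a collection: nothing to do
        }
        List<Object> migrated = new ArrayList<>();
        int converted = 0;
        for (Object element : (List<?>) value) {
            if (element instanceof Map) {
                migrated.add(element);
            } else {
                Map<String, Object> wrapped = new LinkedHashMap<>();
                wrapped.put(propertyName, element);
                migrated.add(wrapped);
                converted++;
            }
        }
        document.put(collectionField, migrated);
        return converted;
    }

    public static void main(String[] args) {
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "Bob");
        person.put("statusHistory", Arrays.asList("great", "mediocre", "splendid"));

        int converted = wrapBareElements(person, "statusHistory", "name");
        System.out.println(converted + " " + person);
        // prints 3 {name=Bob, statusHistory=[{name=great}, {name=mediocre}, {name=splendid}]}
    }
}
```

Checking such a step into version control next to the mapping change is
the Rails-style part; applying it to an actual database is left out of
the sketch.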

Schema evolution is concerning me a bit with the Hot Rod dialect as
well: we're generating a protobuf schema automatically, however really
the strong point of protobuf is evolution, so my "protobuf generator"
feels naive as it doesn't take a pre-existing schema into account.
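
To sketch what "taking a pre-existing schema into account" might mean:
protobuf compatibility depends on field numbers staying stable, so a
generator would have to keep the numbers already handed out and only
allocate fresh ones for new fields. The snippet below is an assumption
about how that could be done, not how the Hot Rod dialect works;
FieldNumberAllocator and allocate are invented names, and the reserved
range 19000-19999 is ignored for brevity.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Hot Rod dialect.
final class FieldNumberAllocator {

    // existing: field name -> number taken from the previously generated schema
    // currentFields: field names of the current mapping, in declaration order
    static Map<String, Integer> allocate(Map<String, Integer> existing,
                                         List<String> currentFields) {
        // Start above every number ever used, so that the number of a
        // removed field is never handed out again.
        int next = 1;
        for (int used : existing.values()) {
            next = Math.max(next, used + 1);
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String field : currentFields) {
            Integer known = existing.get(field);
            if (known != null) {
                result.put(field, known);   // keep the old number
            } else {
                result.put(field, next++);  // new field, fresh number
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> existing = new LinkedHashMap<>();
        existing.put("name", 1);
        existing.put("date", 2);

        // "date" was dropped from the mapping, "timestampOfChange" was added.
        System.out.println(allocate(existing, Arrays.asList("name", "timestampOfChange")));
        // prints {name=1, timestampOfChange=3}
    }
}
```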

-- Sanne

>
> On Tue 2016-07-12 13:35, Guillaume Smet wrote:
>> Hi,
>>
>> For the sake of completeness, here is the mapping obtained with Morphia:
>> {
>>     "_id" : ObjectId("5784ca2612d0226cb309666d"),
>>     "className" : "TestEntity",
>>     "embeddeds" : [ { "singleProperty" : "value1" }, { "singleProperty" : "value2" } ],
>>     "embedded" : { "singleProperty" : "value" },
>>     "collectionOfStrings" : [ "string1", "string2" ]
>> }
>> They basically follow the POLA (principle of least astonishment): the
>> MongoDB mapping follows the Java mapping.
>>
>> Btw, to be complete, here are the reasons why I would like to change
>> it (I agree we have to maintain compatibility with older databases
>> but, as Sanne, I think it should be the new default):
>> 1/ POLA: I would expect my datastore mapping to follow my Java mapping
>> 2/ related to 1/: I wouldn't expect to have to migrate my data when I
>> simply add a property to an existing embeddable
>> 3/ remove special cases in our code, especially special cases present
>> in the dialects
>> 4/ I don't think we are completely consistent with this behavior.
>> Typically, if I take StoryGame from our tests and remove all the
>> properties but one from OptionalStoryBranch, I end up with the
>> following:
>> - in the datastore:
>>   "chaoticBranches" : [ "[VENDETTA] assassinate the leader of the party", "[ARTIFACT] Search for the evil artifact" ]
>>   (this is what we expect: with only one property, we remove the
>>   property level)
>> - in the native query generated by our JPA query
>>   "FROM StoryGame story JOIN story.chaoticBranches c WHERE c.evilText = '[ARTIFACT] Search for the evil artifact'":
>>   where={ "chaoticBranches.evilText" : "[ARTIFACT] Search for the evil artifact"}
>> -> so our JPQL queries don't work if we only have one property in the
>> embedded. We might also want to special case this but I really don't
>> think it's a good idea.
>>
>> While this discussion might seem to come out of the blue, it's in fact
>> related to OGM-893 and another special casing we do. See my comment
>> here: https://hibernate.atlassian.net/browse/OGM-893?focusedCommentId=79245&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-79245
>> The mapping changes when we add a @Column with a name for a
>> property of an embedded in a collection element.
>>
>> --
>> Guillaume
>>
>> On Tue, Jul 12, 2016 at 12:18 PM, Sanne Grinovero <sanne at hibernate.org> wrote:
>> > On 12 July 2016 at 11:13, Gunnar Morling <gunnar at hibernate.org> wrote:
>> >>> I'd be concerned about schema evolution:
>> >>
>> >> Yes, that's the main argument; as said, I can see that.
>> >>
>> >>> I'd see more value in making this the default, and having a
>> >>> "higher-level" configuration property which is like "read like OGM
>> >>> 5.0 used to store it".
>> >>
>> >> I wouldn't like changing such default in a 5.x release. For 6, ok, why not,
>> >> if you all think that's better.
>> >
>> > ok
>> >
>> >>
>> >>> Even better, we'd provide tooling which migrates an existing database.
>> >>
>> >> Sure, migration support is on the roadmap ;)
>> >>
>> >> 2016-07-12 11:06 GMT+01:00 Sanne Grinovero <sanne at hibernate.org>:
>> >>>
>> >>> On 12 July 2016 at 10:55, Gunnar Morling <gunnar at hibernate.org> wrote:
>> >>> > Hi,
>> >>> >
>> >>> > We had an interesting discussion on how to map element collections of
>> >>> > component types with a single column to document stores such as MongoDB.
>> >>> >
>> >>> > E.g. assume we have
>> >>> >
>> >>> >     @Entity
>> >>> >     public class Person {
>> >>> >
>> >>> >         public String name;
>> >>> >
>> >>> >         @ElementCollection
>> >>> >         public List<Status> statusHistory;
>> >>> >     }
>> >>> >
>> >>> >     @Embeddable
>> >>> >     public class Status {
>> >>> >         public String name;
>> >>> >     }
>> >>> >
>> >>> >
>> >>> > Currently, that's mapped to documents like this:
>> >>> >
>> >>> >     {
>> >>> >         "name" : "Bob",
>> >>> >         "statusHistory" : [
>> >>> >             "great",
>> >>> >             "mediocre",
>> >>> >             "splendid"
>> >>> >         ]
>> >>> >     }
>> >>>
>> >>> "great", "mediocre", etc.. are values of the `name` property?
>> >>>
>> >>> >
>> >>> > I.e. if the component type has a single column, we omit the field name in
>> >>> > the persistent structure. Whereas if there are multiple columns, it's added
>> >>> > so we can properly read back such documents:
>> >>> >
>> >>> >
>> >>> >     {
>> >>> >         "name" : "Bob",
>> >>> >         "statusHistory" : [
>> >>> >             { "name" : "great", "date" : "22.06.2016" },
>> >>> >             { "name" : "mediocre", "date" : "15.05.2016" },
>> >>> >             { "name" : "splendid", "date" : "12.04.2016" }
>> >>> >         ]
>> >>> >     }
>> >>> >
>> >>> > The question now is, should we also create such an array of sub-documents,
>> >>> > each containing the field name, in the case where there is only a single
>> >>> > column? As far as I remember, the current structure was chosen for the
>> >>> > sake of efficiency but also simplicity (why deal with sub-documents if
>> >>> > there is only a single field?).
>> >>> >
>> >>> > Guillaume is questioning the sanity of that, arguing that mapping this as
>> >>> > an element collection of a component type rather than string should mandate
>> >>> > the persistent structure to always contain the field name.
>> >>>
>> >>> I agree, but maybe for other reasons.
>> >>> I'd be concerned about schema evolution: if I add a new attribute to
>> >>> the `Status` class, say a "long timestampOfChange" for the sake of the
>> >>> example, as a developer I might want to consider this a nullable value as
>> >>> I'm aware that my existing database didn't define this property so far.
>> >>>
>> >>> I wouldn't be happy to see failures on loading existing stored values
>> >>> for Status#name : such mapping choices have to be very consistent.
>> >>>
>> >>> >
>> >>> > We cannot change the default as we are committed to the MongoDB format, but
>> >>> > if there is agreement that it's useful, we could add an option to enable
>> >>> > this mapping.
>> >>>
>> >>> So many mapping options :-/
>> >>>
>> >>> I'd see more value in making this the default, and having a
>> >>> "higher-level" configuration property which is like "read like OGM
>> >>> 5.0 used to store it".
>> >>> Even better, we'd provide tooling which migrates an existing database.
>> >>>
>> >>> >
>> >>> > I kind of see how this format simplifies migration (in case another field
>> >>> > is added after a while), but personally I still like the more compact looks
>> >>> > of the current approach. Having an option for it works for me.
>> >>> >
>> >>> > Any thoughts?
>> >>> >
>> >>> > --Gunnar
>> >>> > _______________________________________________
>> >>> > hibernate-dev mailing list
>> >>> > hibernate-dev at lists.jboss.org
>> >>> > https://lists.jboss.org/mailman/listinfo/hibernate-dev
>> >>
>> >>