I would like to discuss the problem of _id in MongoDB and how to map that in Hibernate
OGM.
MongoDB is a bit psycho-rigid in how it uniquely identifies a document. A special property
named _id is used for that and must be unique across a collection. It is also strongly
recommended to let MongoDB generate this id (an ObjectId by default, not strictly a UUID).
In the MongoDB dialect we have not settled on how to use _id, and I would like to clarify
that. Today we use `dbObject.put(ID_FIELDNAME, key.getColumnValues()[0])` but that is only
correct if the id property is mapped to a single column (i.e. `key.getColumnNames().length
== 1`).
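To make the limitation concrete, here is a minimal sketch of a guarded version of that write. It uses a plain `Map` as a stand-in for the driver's `DBObject`, and `IdWriter` and `writeSingleColumnId` are hypothetical names for illustration, not existing OGM code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: a Map stands in for the MongoDB DBObject.
final class IdWriter {
    static final String ID_FIELDNAME = "_id";

    // Stores the id in _id, and rejects ids spanning several columns
    // instead of silently keeping only the first column value.
    static Map<String, Object> writeSingleColumnId(String[] columnNames, Object[] columnValues) {
        if (columnNames.length != 1 || columnValues.length != 1) {
            throw new IllegalArgumentException(
                "id spans " + columnNames.length + " columns; only one is supported");
        }
        Map<String, Object> document = new LinkedHashMap<>();
        document.put(ID_FIELDNAME, columnValues[0]);
        return document;
    }
}
```

With a two-column id, the current code would store only the first value under _id; the guard above at least makes that case fail loudly.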
## Use _id as an Oracle-style rowid
We could decide to use _id as a purely internal identifier for a document and basically
never rely on it. All queries and lookups will use the identifier columns and their
values to find a document.
That has the benefit of not having to deal with _id, but I don't know whether that is an
accepted practice in MongoDB or whether it is discouraged because it would lead to costly
lookups. Can anybody familiar with MongoDB chime in?
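As a rough sketch of what the rowid approach implies, a lookup would be built from the id columns alone and never mention _id. The class and method names below are hypothetical, and a `Map` again stands in for the query document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a lookup filter that ignores _id entirely.
final class RowIdStyleLookup {

    // Builds an equality filter from the id columns only.
    static Map<String, Object> filterFor(String[] columnNames, Object[] columnValues) {
        if (columnNames.length != columnValues.length) {
            throw new IllegalArgumentException("column names and values must have the same length");
        }
        Map<String, Object> filter = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.length; i++) {
            filter.put(columnNames[i], columnValues[i]);
        }
        return filter;
    }
}
```

Since MongoDB only indexes _id automatically, a filter like this would presumably need an explicit index on the id columns to avoid scanning the collection.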
## Map _id when we have an identifier mapped on one column
In this case, I will only discuss the case where an id is mapped to a single column.
We could decide to map the id column value to both the id column and to _id. That creates
some duplication but OGM would be happy and MongoDB's queries could be efficient.
Alternatively, we could decide to completely ignore the id column name and use _id for
this. The TupleSnapshot would then be responsible for binding the id column name to the
value stored in _id. My concern with the alternative is that someone reading the data from
the MongoDB store will not find the JPA id column but will see _id instead. On the other
hand, it seems to be the norm in MongoDB land.
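The two single-column options can be sketched side by side. This is only an illustration under assumptions: `SingleColumnIdMapping`, `duplicate` and `readColumn` are hypothetical names, and the second method merely mimics what the TupleSnapshot would have to do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of both single-column options; a Map stands in for the document.
final class SingleColumnIdMapping {
    static final String ID_FIELDNAME = "_id";

    // Option 1: write the id under both _id and its own column name.
    static Map<String, Object> duplicate(String idColumn, Object idValue) {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put(ID_FIELDNAME, idValue);
        document.put(idColumn, idValue);
        return document;
    }

    // Option 2: only _id is stored; reads of the id column are redirected to _id.
    static Object readColumn(Map<String, Object> document, String column, String idColumn) {
        return column.equals(idColumn) ? document.get(ID_FIELDNAME) : document.get(column);
    }
}
```

In the second option the stored document never contains the JPA column name, which is exactly the readability concern raised above.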
## Identifiers mapped on several columns
In this case, we have the following approaches, which can be combined:

1. treat _id as a rowid (see above)
2. map the id values as a complex object and put that in _id, e.g.
   `{ "_id": { "firstname": "Emmanuel", "lastname": "Bernard" } }`

Note that we can then decide to bind the id columns as top-level attributes of the
document as well.
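A minimal sketch of the composite case, assuming nested `Map`s as a stand-in for nested documents; `CompositeIdMapping` and `toDocument` are hypothetical names, and the `alsoTopLevel` flag models the optional top-level binding mentioned in the note.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: composite id stored as an embedded object under _id.
final class CompositeIdMapping {

    static Map<String, Object> toDocument(String[] columnNames, Object[] columnValues,
                                          boolean alsoTopLevel) {
        if (columnNames.length != columnValues.length) {
            throw new IllegalArgumentException("column names and values must have the same length");
        }
        // LinkedHashMap keeps the column order stable inside the embedded id.
        Map<String, Object> id = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.length; i++) {
            id.put(columnNames[i], columnValues[i]);
        }
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("_id", id);
        if (alsoTopLevel) {
            // Optionally repeat the id columns as top-level attributes.
            document.putAll(id);
        }
        return document;
    }
}
```

The stable ordering is deliberate: as far as I know, exact matches on embedded documents in MongoDB are sensitive to field order, so the id columns should always be written in the same order.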
Do you guys have any thoughts on the best approach?