[hibernate-dev] [OGM] storing the column names in the entity keys for K/V stores

Gunnar Morling gunnar at hibernate.org
Wed Nov 26 05:19:33 EST 2014


2014-11-25 14:30 GMT+01:00 Emmanuel Bernard <emmanuel at hibernate.org>:

> Hi,
>
> With OGM-452 behind us which brings one cache per “table”, we now have
> another decision in front of us.
>
> Should we use a synthetic key for the cache key (say a
> PersistentEntityKey class containing the array of column names and the
> array of column values)?
> Or should we use the natural object key?
>
> == Natural entity key
>
> In the latter, things get complicated quickly, let me explain:
>
> === Simple case
>
> For simple cases, the id is a simple property and the fit is very
> natural
>
> [source]
> --
> @Entity
> class User {
>     @Id String name;
>     ...
> }
>
> //corresponds to
> cache.put(name, mapRepresentingUser);
> --
>
> === Embedded id
>
> If the identifier is an embedded id, you have several choices that all have
> drawbacks.
>
> 1. use the embedded id class as key `cache.put( new Name("Emmanuel",
> "Bernard"), mapRepresentingUser );`
> 2. use an array of property values `cache.put( new Object[] {"Emmanuel",
> "Bernard"}, mapRepresentingUser );`
>

Will that work at all? Does ISPN really work with value equality for
array-typed keys?

In a normal hash map you wouldn't get the value back as new Object[] {
"Emmanuel", "Bernard"}.equals( new Object[] {"Emmanuel", "Bernard"} ) is
false. So you would have to put the key into a wrapper whose equals method
uses Arrays.equals() internally.
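
To illustrate the point about array equality, here is a minimal sketch using a plain java.util.HashMap. It says nothing about how Infinispan treats array keys, which is exactly the open question; `ArrayKey` and `ArrayKeyDemo` are hypothetical names made up for this example, not OGM or Infinispan classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper (not an OGM or Infinispan class): gives an Object[] value semantics.
final class ArrayKey {
    private final Object[] values;

    ArrayKey(Object... values) {
        this.values = values.clone();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ArrayKey && Arrays.equals(values, ((ArrayKey) o).values);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(values);
    }
}

class ArrayKeyDemo {
    // Puts a raw array as key, then looks it up with a fresh array holding the same values.
    static boolean rawArrayLookupWorks() {
        Map<Object, String> cache = new HashMap<>();
        cache.put(new Object[] {"Emmanuel", "Bernard"}, "user");
        return cache.containsKey(new Object[] {"Emmanuel", "Bernard"});
    }

    // Same lookup, but with the array wrapped so equals/hashCode compare contents.
    static boolean wrappedLookupWorks() {
        Map<Object, String> cache = new HashMap<>();
        cache.put(new ArrayKey("Emmanuel", "Bernard"), "user");
        return cache.containsKey(new ArrayKey("Emmanuel", "Bernard"));
    }

    public static void main(String[] args) {
        System.out.println("raw array key found: " + rawArrayLookupWorks());
        System.out.println("wrapped key found:   " + wrappedLookupWorks());
    }
}
```

The raw array lookup fails because arrays inherit identity-based equals() from Object, while the wrapped lookup succeeds.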


> 3. use a Map<String,Object> corresponding to the array `cache.put( new
> HashMap<String,Object>( {{ "firstname" -> "Emmanuel", "lastname"->"Bernard"
> }} ), mapRepresentingUser );`
> 4. use a synthetic key `cache.put( new PersistentEntityKey( new String[]
> {"firstname", "lastname" }, new String[] { "Emmanuel", "Bernard" } ),
> mapRepresentingUser);`
>
> In 1, the problem is that we lose the proper data type abstraction
> between the object model and the data stored. `Name` is a user class.
>
> In 2, I think the model is somewhat acceptable but a bit arbitrary.
>
> In 3, I suspect the map is pretty horrific to serialize - that could be
> solved by an externalizer. But more importantly the order of the id
> columns is lost - even though it might be recoverable with
> EntityKeyMetadata?
>
> In 4, we expose the person querying the grid to our OGM specific type.
>

The current implementation puts a PersistentEntityKey designed as you
describe into the cache, but the externalizer only writes the column name
and value arrays. This should be readable without knowing the PEK type,
right? Of course you need to know the structure of the persisted key in
order to read it back.

Now Davide's idea was to only write the column value array, as the column
names are not really needed (assuming that one cache never contains entries
from several tables). This seems sensible to me unless I'm missing some
special case. The persisted form would be basically the one from 2., only
that there is a wrapper used at the API level.
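
A rough sketch of what "only write the column values" could look like, using plain java.io serialization rather than the Infinispan externalizer API. `SketchEntityKey` and `ValuesOnlyKeyCodec` are hypothetical stand-ins, not the actual OGM classes, and the sketch assumes the column names can be supplied per cache when reading back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Hypothetical stand-in for the key wrapper; NOT the actual OGM PersistentEntityKey.
final class SketchEntityKey {
    final String[] columnNames;
    final Object[] columnValues;

    SketchEntityKey(String[] columnNames, Object[] columnValues) {
        this.columnNames = columnNames.clone();
        this.columnValues = columnValues.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchEntityKey)) {
            return false;
        }
        SketchEntityKey other = (SketchEntityKey) o;
        return Arrays.equals(columnNames, other.columnNames)
                && Arrays.equals(columnValues, other.columnValues);
    }

    @Override
    public int hashCode() {
        return 31 * Arrays.hashCode(columnNames) + Arrays.hashCode(columnValues);
    }
}

// Hypothetical codec bound to one cache, i.e. one table.
// Assumption: the column names are known per cache, so only the values are persisted.
final class ValuesOnlyKeyCodec {
    private final String[] columnNames;

    ValuesOnlyKeyCodec(String... columnNames) {
        this.columnNames = columnNames.clone();
    }

    // Writes the column value array only; the names never reach the stream.
    byte[] write(SketchEntityKey key) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(key.columnValues);
        }
        return bytes.toByteArray();
    }

    // Reads the value array back and re-attaches the names known for this cache.
    SketchEntityKey read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Object[] values = (Object[]) in.readObject();
            return new SketchEntityKey(columnNames, values);
        }
    }

    // Round-trips a sample key; checked exceptions are wrapped for brevity.
    static boolean roundTripPreservesKey() {
        ValuesOnlyKeyCodec codec = new ValuesOnlyKeyCodec("firstname", "lastname");
        SketchEntityKey key = new SketchEntityKey(
                new String[] {"firstname", "lastname"},
                new Object[] {"Emmanuel", "Bernard"});
        try {
            return key.equals(codec.read(codec.write(key)));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("round trip preserves key: " + roundTripPreservesKey());
    }
}
```

The wrapper stays at the API level, while someone reading the grid directly would only see the value array.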


> Aside from this, it is essentially like 4.
>
> === Entity key approach
>
> I really like the idea of the simple case being mapped directly, it makes
> for *the* natural mapping one would have chosen. But as I explained, it
> does not scale.
> In the composite id case, I don't really know what to choose between 2, 3
> and 4.
>
> So, should we go for the simple case if we can? Or favor consistency
> between the simple and complex case?
>
> And which of the complex cases do we favor?
>

My preference would be 4, with the proposed change of only writing the
column values. For the "simple case" we could then either store an array
of size 1 or just the single value itself, wrapping it into an array when
reading it back. I guess that'd require an instanceof call during read
back. Not sure whether that's good or bad, probably I'd just always store
the array.
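
The two variants for the simple case can be sketched as follows. `KeyValueForms` and its method names are hypothetical, made up for illustration only; nothing here is existing OGM code.

```java
import java.util.Arrays;

// Hypothetical helper contrasting the two storage variants for single-column ids.
final class KeyValueForms {

    // Variant A: a single-column id is stored bare, a composite id as an array.
    static Object toStoredForm(Object[] columnValues) {
        return columnValues.length == 1 ? columnValues[0] : columnValues;
    }

    // Variant A needs an instanceof check to re-wrap bare values on read back.
    static Object[] fromStoredForm(Object stored) {
        return stored instanceof Object[] ? (Object[]) stored : new Object[] {stored};
    }

    // Variant B: always store the array, so reading back is a plain cast.
    static Object[] fromAlwaysArrayForm(Object stored) {
        return (Object[]) stored;
    }

    public static void main(String[] args) {
        Object bare = toStoredForm(new Object[] {"emmanuel"});
        System.out.println(Arrays.toString(fromStoredForm(bare)));

        Object composite = toStoredForm(new Object[] {"Emmanuel", "Bernard"});
        System.out.println(Arrays.toString(fromStoredForm(composite)));
    }
}
```

One wrinkle with variant A in this sketch: a single id value that is itself an Object[] would be indistinguishable from a composite key on read back, which is another argument for always storing the array.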


>
> == Association
>
> In the case of associations, it becomes a bit trickier because the
> "simple case" where the association key is made of a single column is
> quite uncommon. Association keys are one of these combinations:
>
> * the fk to the owning entity + the index or key of the List or Map
> * the fk to the owning entity + the fk to the target entity (Set)
> * the fk to the owning entity + the list of columns of the simple or
>   embedded type (Set)
> * the fk to the owning entity + the surrogate id of the Bag
> * all columns in case of a non id backed bag
>
> All that to say that we are most of the time in the complex case of
> EntityKey with one of the 4 choices.
>
> Any thoughts and preferences?
>
> Emmanuel

