[hibernate-dev] [OGM] Ogm mass indexer, how to convert Tuple/EntityKey to Entity/Id?

Mon Mar 4 09:33:41 EST 2013

The Hibernate Search / ORM approach does iterate on the primary keys to get a
consistent snapshot of the state to be reindexed, but subsequent phases avoid
the "iterator" approach as it makes parallel execution very hard.

With OGM/Infinispan I think the natural solution is to use Map/Reduce, and
that would be simpler than the multiple-phases (stream) approach we
are forced to use on ORM.

Depending on the underlying OGM backend, some might be able to support an
efficient Map/Reduce operation while others might need a different approach,
so the interface proposed by Davide is meant to be something that each backend
can implement "optimally": we avoid expecting all backends to support
Map/Reduce directly, but require them to provide at least some form of
"iteration" (which is not an Iterator) over all data.
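
To make the shape of that contract concrete, here is a minimal sketch. All
type names (SketchTuple, TupleConsumer, TupleScanner, InMemoryScanner) are
hypothetical stand-ins and not existing OGM classes; only the
forEachTuple(consumer, tableNames) shape comes from Davide's proposal. The
in-memory backend simply loops, but a Map/Reduce-capable backend could
implement the same method by running the consumer where the data lives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an OGM Tuple: column name -> value.
final class SketchTuple {
    private final Map<String, Object> columns;

    SketchTuple(Map<String, Object> columns) {
        this.columns = new LinkedHashMap<>(columns);
    }

    Object get(String column) {
        return columns.get(column);
    }
}

// Callback shape taken from the proposal in this thread.
interface TupleConsumer {
    void consume(SketchTuple tuple);
}

// Hypothetical contract: the backend pushes tuples to the consumer,
// so it stays free to choose how (and where) the data is visited.
interface TupleScanner {
    void forEachTuple(TupleConsumer consumer, String... tableNames);
}

// Toy backend used only to show the contract being satisfied.
final class InMemoryScanner implements TupleScanner {
    private final Map<String, List<SketchTuple>> tables = new LinkedHashMap<>();

    void put(String table, SketchTuple tuple) {
        tables.computeIfAbsent(table, t -> new ArrayList<>()).add(tuple);
    }

    @Override
    public void forEachTuple(TupleConsumer consumer, String... tableNames) {
        for (String table : tableNames) {
            List<SketchTuple> rows =
                    tables.getOrDefault(table, Collections.<SketchTuple>emptyList());
            for (SketchTuple tuple : rows) {
                consumer.consume(tuple);
            }
        }
    }
}
```

Because the caller never holds an Iterator, nothing in this contract forces
the backend to hand out tuples one at a time or from a single thread.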

Indeed the GridDialect would need to work on "Tuples", while Hibernate Search
only digests entities, so the consumer of this GridDialect would need to use
the OGM mapping engine itself to perform the transformation; but this is again
code that needs to be written only once and can be shared across backends.
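
As a rough illustration of "written once, shared across backends", the sketch
below wraps an entity-level consumer so it can be handed to any tuple-level
scan. Everything here is an assumption for illustration: a tuple is modelled
as a plain Map, and TupleAdapter, SketchBook and the "hydrator" function are
hypothetical names. The real conversion would go through the OGM mapping
engine and is considerably more involved.

```java
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper: written once, independent of any backend.
final class TupleAdapter {
    private TupleAdapter() {
    }

    // Returns a tuple consumer that converts each tuple to an entity
    // and forwards the entity to the given entity consumer.
    static <E> Consumer<Map<String, Object>> adapt(
            Function<Map<String, Object>, E> hydrator,
            Consumer<? super E> entityConsumer) {
        return tuple -> entityConsumer.accept(hydrator.apply(tuple));
    }
}

// Hypothetical entity used only to exercise the adapter.
final class SketchBook {
    final Long id;
    final String title;

    SketchBook(Long id, String title) {
        this.id = id;
        this.title = title;
    }

    // Stand-in for the mapping engine: reads two columns by name.
    static SketchBook fromTuple(Map<String, Object> tuple) {
        return new SketchBook((Long) tuple.get("id"), (String) tuple.get("title"));
    }
}
```

A backend would only ever see the Consumer returned by adapt(...), so it needs
no knowledge of entities or of Hibernate Search.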

Davide needs advice on transforming Tuples into entities; he could use a
Session and transform keys into entities, but given the nature of our backends
it seems better to iterate on the data directly rather than on the keys only.

Our idea is not to "feed" the existing MassIndexer implementation but to
implement a new one, which shares the same last phase (consumption of Lucene
Documents from multithreaded producers); this would be an extremely trivial
one-phase processor invoking the DocumentBuilder, provided we have some way to
have the GridDialect expose (to avoid "iterate") all data.
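
A minimal sketch of such a one-phase processor, under stated assumptions: the
documentBuilder function is a stand-in for the real DocumentBuilder, the
returned queue is a stand-in for the shared last phase that consumes Lucene
Documents, and OnePhaseIndexer is a hypothetical name. Error handling is
deliberately naive; a failing task is not reported back to the caller here.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical one-phase processor: entity in, document out, many threads.
final class OnePhaseIndexer<E, D> {
    private final Function<E, D> documentBuilder; // stand-in for DocumentBuilder
    private final int threads;

    OnePhaseIndexer(Function<E, D> documentBuilder, int threads) {
        this.documentBuilder = documentBuilder;
        this.threads = threads;
    }

    // Builds one document per entity on a worker pool and collects them
    // in a thread-safe queue once all workers have finished.
    Queue<D> index(Iterable<E> entities) {
        Queue<D> documents = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (E entity : entities) {
                pool.execute(() -> documents.add(documentBuilder.apply(entity)));
            }
        } finally {
            pool.shutdown(); // already submitted tasks still run to completion
        }
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return documents;
    }
}
```

The order of documents in the queue is not defined, which is acceptable as
long as the consuming phase does not depend on ordering.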

Sanne

On 4 March 2013 10:50, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
> The mass indexer does not work at the resultset level so mixing tuples
> and mass indexer seems wrong to me.
>
> Have you considered something like
>
>     Iterator<Tuple> getAllTuplesFrom(String... tableNames);
>
> And then expose an Iterator<Object> (i.e. entities) to the mass indexer?
> I mean we could make it work with your proposed consumer scheme but I
> find it unnecessarily complex and it might make stream / flow style
> processing impossible. I can be wrong but I'd like to see your arguments
> first.
>
> OgmLoader.getRowFromResultSet shows how to get an Object[] from a Tuple.
> OgmLoader.getRow is at the heart of it.
>
> But the process of initializing an entity involves several phases, so
> the best bet is to look at OgmLoader.load and look at what happens
> globally.
>
> In the end, to answer your question, there is no method to do what
> you want today, it's more or less the bottom half of OgmLoader.load.
>
> What about associations BTW?
>
> On Fri 2013-03-01 15:00, Davide D'Alto wrote:
>> Hello,
>> I'm trying to create a mass indexer that could work with OGM.
>> The idea is to have a way to scan all the elements of a certain type in
>> the data store and index them; this way it would be possible to create
>> an index starting from an existing populated data store.
>>
>> The first prototype idea is to add a method to the GridDialect, something like:
>>
>> GridDialect#forEachTuple(Consumer consumer, String... tableName)
>>
>> Where the Consumer is an interface with a method Consumer#consume(Tuple tuple)
>>
>> The consumer will execute the indexing of the found tuple.
>>
>> The problem that I have now is how to convert the Tuple to the
>> corresponding entity so that I can index it using Hibernate Search.
>> An alternative idea would be to use the EntityKey and obtain the id
>> instead of using the Tuple.
>>
>> Is there a method somewhere that I can use to obtain an entity from a Tuple?
>>
>> Thanks,
>> Davide
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev