Hi Jonathan,
some help would be awesome!
I briefly discussed this problem with the team on IRC, and our general feeling was that implementing the StatelessSession for OGM is neither sufficient (indexing would still fail right after opening one, because Criteria scrolling is required too) nor the best approach.
In one form or another, it's clear that to implement this we need some way to list or scroll through all entities without relying on the Lucene indexes. Hibernate OGM drives the different NoSQL dialects through an internal contract, "org.hibernate.ogm.dialect.GridDialect"; you can find a ConcurrentHashMap-based implementation in the core module, an Ehcache-based implementation in the ehcache module, and so on.
Now the main problem is that GridDialect doesn't provide any form of "list all entities" functionality: clearly we should add one. This limitation exists because we started out by targeting key/value stores, in which a "list all" is often inefficient. Conceptually, a pure key/value store would not implement it at all; in practice that is very limiting, so each implementation I'm aware of provides some way to force it to list all entries.
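To make the idea concrete, here is a minimal sketch of what a "list all entries" capability could look like on a map-backed store. The names ScrollableStore, MapStore and forEachEntry are hypothetical and chosen only for illustration; this is not the actual GridDialect contract, just one possible shape for it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical stand-in for the proposed "list all entities" addition;
// this is NOT the real GridDialect contract.
interface ScrollableStore<K, V> {
    // Visits every stored entry once, without consulting any index.
    void forEachEntry(BiConsumer<K, V> consumer);
}

// Map-backed sketch, similar in spirit to a hashmap-based dialect.
final class MapStore<K, V> implements ScrollableStore<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    void put(K key, V value) {
        entries.put(key, value);
    }

    @Override
    public void forEachEntry(BiConsumer<K, V> consumer) {
        // For an in-memory map, "list all" is a plain iteration.
        entries.forEach(consumer);
    }
}
```

For a hashmap the method is a one-liner, which is why the in-memory case is the easy one; a real key/value store would have to map this onto whatever "list all entries" mechanism it exposes.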
The second step for this issue would be to implement a custom MassIndexer which makes use of this new GridDialect functionality. I think a proof of concept of such a MassIndexer is needed before the new GridDialect functionality is set in stone, so I would suggest implementing the MassIndexer first, and implementing the GridDialect's new method only for the core module (which is trivial, being a hashmap). Just yesterday I finished a new MassIndexer for Infinispan which is much simpler than the one in Hibernate Search; I'd suggest you take a look at this commit for inspiration: https://github.com/Sanne/infinispan/commit/f1cd3f8cf411d976ae537ecf6f815342309211fa
Notable differences with Hibernate Search:
- It doesn't use pipelines but a specific map/reduce service available in Infinispan. In your case, a simple iteration over the new GridDialect method (scroll()?) should be a good starting point.
- The ids are encoded differently in the index. In your case, you had better keep that code as it is in Hibernate Search.
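As a rough illustration of the "simple iteration" starting point, the sketch below walks every entry returned by an assumed scroll() method and hands it to an index sink. EntitySource, IndexSink and SimpleMassIndexer are hypothetical names, not Hibernate Search or OGM types, and purging the index before rebuilding is my assumption about what a full reindex should do.

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical view of the datastore: the assumed scroll() iterates over
// every stored entry without using the Lucene indexes.
interface EntitySource {
    Iterator<Map.Entry<String, Object>> scroll();
}

// Hypothetical stand-in for the index backend; not a Hibernate Search type.
interface IndexSink {
    void purgeAll();
    void add(String id, Object entity);
}

// Single-threaded mass indexer: no pipelines, no batching.
final class SimpleMassIndexer {
    private final EntitySource source;
    private final IndexSink sink;

    SimpleMassIndexer(EntitySource source, IndexSink sink) {
        this.source = source;
        this.sink = sink;
    }

    // Rebuilds the index from scratch and returns how many entities were indexed.
    int reindex() {
        sink.purgeAll(); // assumption: drop stale entries before rebuilding
        int count = 0;
        Iterator<Map.Entry<String, Object>> it = source.scroll();
        while (it.hasNext()) {
            Map.Entry<String, Object> entry = it.next();
            sink.add(entry.getKey(), entry.getValue());
            count++;
        }
        return count;
    }
}
```

This deliberately ignores threading, batching and id encoding; those would come from the existing Hibernate Search code or from later optimization rounds.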
We can then review your MassIndexer, and then I'll be glad to help implement the new GridDialect contract on the other dialects.
I think in the first round we'd be happy with a very simple MassIndexer; we can try to improve its efficiency later on.
If you have any questions, you're very welcome on IRC (http://hibernate.org/community/irc), the dev mailing list (https://lists.jboss.org/mailman/listinfo/hibernate-dev), or this same JIRA.