On 1 March 2012 13:46, Guillaume Smet <guillaume.smet(a)gmail.com> wrote:
Hi Sanne,
On Thu, Mar 1, 2012 at 1:09 PM, Sanne Grinovero <sanne(a)hibernate.org> wrote:
> I'm aware of the issue, and I wouldn't mind some help on it: it would
> be better to fix the MassIndexer design to use a limited set of
> threads than to expose more metadata information, don't you agree?
>
> The reasons I have not given a high priority to HSEARCH-598 are:
> 1) there are easy workarounds
I agree *IF* we provide an API to get the root indexed entities. The
simplest workaround so far is to work entity by entity, and it would
be nice to be able to do so easily.
This is how we do it currently:
for (Class<?> clazz : getIndexedRootEntities(
        fullTextEntityManager.getSearchFactory(), Object.class)) {
    LOGGER.debug(String.format("Reindexing %1$s.", clazz));

    ProgressMonitor progressMonitor = new ProgressMonitor();
    Thread t = new Thread(progressMonitor);
    t.start();

    MassIndexer indexer = fullTextEntityManager.createIndexer(clazz);
    indexer.batchSizeToLoadObjects(batchSize)
            .threadsForSubsequentFetching(fetchingThreads)
            .threadsToLoadObjects(loadThreads)
            .cacheMode(CacheMode.NORMAL)
            .progressMonitor(progressMonitor)
            .startAndWait();

    progressMonitor.stop();
    t.interrupt();

    LOGGER.debug(String.format("Reindexing %1$s done.", clazz));
}
with getIndexedRootEntities(...):
/**
 * @see MassIndexerImpl#toRootEntities
 */
protected Set<Class<?>> getIndexedRootEntities(SearchFactory searchFactory,
        Class<?>... selection) {
    if (searchFactory instanceof SearchFactoryImplementor) {
        SearchFactoryImplementor searchFactoryImplementor =
                (SearchFactoryImplementor) searchFactory;

        Set<Class<?>> entities = new HashSet<Class<?>>();

        // first build the "entities" set containing all indexed subtypes of "selection".
        for (Class<?> entityType : selection) {
            Set<Class<?>> targetedClasses = searchFactoryImplementor
                    .getIndexedTypesPolymorphic(new Class[] { entityType });
            if (targetedClasses.isEmpty()) {
                String msg = entityType.getName()
                        + " is not an indexed entity or a subclass of an indexed entity";
                throw new IllegalArgumentException(msg);
            }
            entities.addAll(targetedClasses);
        }

        Set<Class<?>> cleaned = new HashSet<Class<?>>();
        Set<Class<?>> toRemove = new HashSet<Class<?>>();

        // now remove all repeated types to avoid duplicate loading by polymorphic query loading
        for (Class<?> type : entities) {
            boolean typeIsOk = true;
            for (Class<?> existing : cleaned) {
                if (existing.isAssignableFrom(type)) {
                    typeIsOk = false;
                    break;
                }
                if (type.isAssignableFrom(existing)) {
                    toRemove.add(existing);
                }
            }
            if (typeIsOk) {
                cleaned.add(type);
            }
        }
        cleaned.removeAll(toRemove);

        if (LOGGER.isInfoEnabled()) {
            LOGGER.info("Targets for indexing job: {}", cleaned);
        }
        return cleaned;
    } else {
        throw new IllegalArgumentException(
                "searchFactory should be a SearchFactoryImplementor");
    }
}
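To make the reduction step above easier to follow outside of Hibernate Search, here is a minimal, self-contained sketch of the same idea: given a set of types, keep only those that have no supertype in the set. It is not the MassIndexerImpl code; it uses a simpler quadratic comparison over the whole set, and the class names (RootTypes, Animal, Dog, Puppy, Book) are made up for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RootTypes {

    // Hypothetical sample hierarchy, used only for this illustration.
    static class Animal {}
    static class Dog extends Animal {}
    static class Puppy extends Dog {}
    static class Book {}

    /** Keeps only the types that have no other supertype in the same set. */
    static Set<Class<?>> toRootTypes(Set<Class<?>> types) {
        Set<Class<?>> roots = new HashSet<Class<?>>();
        for (Class<?> candidate : types) {
            boolean hasSupertypeInSet = false;
            for (Class<?> other : types) {
                // other.isAssignableFrom(candidate) is true when "other" is
                // the same class as, or a supertype of, "candidate".
                if (other != candidate && other.isAssignableFrom(candidate)) {
                    hasSupertypeInSet = true;
                    break;
                }
            }
            if (!hasSupertypeInSet) {
                roots.add(candidate);
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        Set<Class<?>> indexed = new HashSet<Class<?>>(
                Arrays.<Class<?>>asList(Puppy.class, Animal.class, Book.class, Dog.class));
        // Dog and Puppy are dropped because Animal is already in the set.
        System.out.println(toRootTypes(indexed));
    }
}
```

With this sample hierarchy the result contains Animal and Book only, so a polymorphic query on each remaining type would load every instance exactly once.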
I understood you were doing something crazy like that :D
My question is: would this additional API be pointless, assuming we fix
HSEARCH-598?
Or we add an option like:
fullTextSession.createIndexer().processTypesSequentially().startAndWait();
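Note that processTypesSequentially() is only a proposal at this point; it does not exist in the MassIndexer API. As a rough sketch of what "types one after another, on a limited set of threads" could mean, here is a plain java.util.concurrent example with no Hibernate Search classes. All names in it (SequentialTypes, BatchWork, the parameters) are hypothetical, and the real MassIndexer internals may well end up looking different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SequentialTypes {

    /** Hypothetical stand-in for "index one batch of one entity type". */
    interface BatchWork {
        void index(String typeName, int batch);
    }

    /**
     * Processes the types one after another. The batches of the current type
     * run on a single shared pool of at most maxThreads threads, so the
     * thread count no longer grows with the number of indexed types.
     */
    static void processTypesSequentially(List<String> typeNames, int batchesPerType,
            int maxThreads, BatchWork work) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            for (String typeName : typeNames) {
                List<Future<?>> pending = new ArrayList<>();
                for (int i = 0; i < batchesPerType; i++) {
                    final int batch = i;
                    pending.add(pool.submit(() -> work.index(typeName, batch)));
                }
                // Wait for every batch of this type before starting the next type.
                for (Future<?> future : pending) {
                    try {
                        future.get();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while indexing " + typeName, e);
                    } catch (ExecutionException e) {
                        throw new IllegalStateException("Indexing failed for " + typeName, e.getCause());
                    }
                }
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> types = List.of("Author", "Book");
        processTypesSequentially(types, 3, 2,
                (type, batch) -> System.out.println(type + " batch " + batch));
    }
}
```

The order of batches within a type is not guaranteed, but all batches of one type finish before the next type starts.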
> 2) I have no tests
>
> In this case I think writing an automated test is 90% of the
> complexity; if you would like to do that, I think I could fix it
> quickly or at least propose several possible solutions.
> Ideally, for this to be included in 4.1.0.Final we should finish it
> next week. Do you think you could help with it?
Not at the moment. You might have seen that I'm far less active than
last month, mostly because I have a lot of work at my company - and at
the moment on a project using Solr instead of Hibernate Search - but
still based on Hibernate. We might be able to help on this front and
others later this year as Hibernate Search is an important part of our
work.
Ok, thanks anyway.
--
Guillaume