[infinispan-issues] [JBoss JIRA] (ISPN-3767) MassIndexer breaks search feature with one node cluster
Romain Pelisse (JIRA)
jira-events at lists.jboss.org
Tue Dec 3 11:52:06 EST 2013
[ https://issues.jboss.org/browse/ISPN-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928265#comment-12928265 ]
Romain Pelisse commented on ISPN-3767:
--------------------------------------
>Romain, you do not say on which version you're running. The first branch you sent was based on an old 6 (mid October) and the second on 5.2.4. I would recommend migrating to 6.0.0.Final.
Sorry, I missed that field when creating the issue. Fixed (5.2.4).
>Also, let's keep the mass indexing on a one-node cluster being broken and indexing slowness as separate issues.
Yes, those are the two issues.
>Looking at the first branch, it's quite old, so I applied the same changes on the current version of DistributedMassIndexingTest. Running it with 1 or 2 nodes fails the asserts right after the index rebuild. Running with 3, 4, 5 ... nodes works. A quick analysis leads me to believe it's caused by indexing not being fully synchronous. MassIndexer.start() should only return after the index was rebuilt, but unfortunately it seems to return right after reindexing work was performed but without any guarantee it was actually flushed. I'll fix this during this week.
Awesome!
>Right now, a quick workaround to confirm this: replace enqueueAsyncWork with doWorkInSync at https://github.com/infinispan/infinispan/blob/master/query/src/main/java/org/infinispan/query/impl/massindex/IndexingReducer.java#L37
>This is not the clean fix, just a workaround to confirm the root of the problem.
Ok, good to know, I may need that workaround later this week.
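To make the diagnosis quoted above concrete, here is a toy sketch in plain Java. It is not Infinispan code: ToyIndex, enqueueAsync and writeSync are hypothetical names that only mimic the idea behind enqueueAsyncWork and doWorkInSync, under the assumption that the async path queues work without waiting for it to be written. It shows why a rebuild that returns right after queuing can leave a search with fewer hits than expected, while a blocking rebuild cannot.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FlushDemo {

    /** Hypothetical stand-in for an index backend; not an Infinispan class. */
    static final class ToyIndex {
        private final AtomicInteger written = new AtomicInteger();
        private final ExecutorService writer = Executors.newSingleThreadExecutor();

        /** Queues a write and returns immediately, before the write has happened. */
        Future<?> enqueueAsync(String doc) {
            return writer.submit(() -> {
                pause(1);                  // simulate write latency
                written.incrementAndGet(); // the document is now searchable
            });
        }

        /** Queues a write and blocks until it has completed. */
        void writeSync(String doc) {
            try {
                enqueueAsync(doc).get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }

        int visibleDocs() {
            return written.get();
        }

        /** Drains the queue and stops the writer thread. */
        void close() {
            writer.shutdown();
            try {
                writer.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Documents visible right after an async "rebuild" returns; may be fewer than n. */
    static int visibleAfterAsyncRebuild(int n) {
        ToyIndex index = new ToyIndex();
        for (int i = 0; i < n; i++) {
            index.enqueueAsync("doc-" + i);
        }
        int visible = index.visibleDocs(); // read before the queue has drained
        index.close();
        return visible;
    }

    /** Documents visible right after a sync "rebuild" returns; always n. */
    static int visibleAfterSyncRebuild(int n) {
        ToyIndex index = new ToyIndex();
        for (int i = 0; i < n; i++) {
            index.writeSync("doc-" + i);
        }
        int visible = index.visibleDocs(); // every write has completed by now
        index.close();
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("async: " + visibleAfterAsyncRebuild(50) + " of 50 visible on return");
        System.out.println("sync:  " + visibleAfterSyncRebuild(50) + " of 50 visible on return");
    }
}
```

The async count depends on timing, so it can differ from run to run; the sync count is always the full number of documents. This says nothing about why 3 or more nodes pass in the real test, only why an unflushed rebuild can make the asserts fail.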
>Should I also look at branch number 2 or is this clarified? The commit message being "modify DistributedMassIndexingTest to show slowlyness of indexing" makes me think it's about a different issue
Well, this was intended to show the "slow import with indexing" issue, but it turns out that the test case shows very, very bad indexing performance. Perhaps run it and see how it behaves on your system (it might just be my machine).
> MassIndexer breaks search feature with one node cluster
> -------------------------------------------------------
>
> Key: ISPN-3767
> URL: https://issues.jboss.org/browse/ISPN-3767
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.4.Final
> Reporter: Romain Pelisse
> Assignee: Adrian Nistor
> Priority: Minor
>
> Hi,
> Trying to cope with the extreme slowness of put() operations with indexing [1], I've tried to use the MassIndexer to create the index AFTER adding all the data to the grid. This actually works pretty well, but when running in a "single node" grid, it "breaks" the search, which always returns 0 results for any kind of query.
> I've modified one of Infinispan's test suites to reproduce the issue:
> https://github.com/rpelisse/infinispan/tree/mass-indexer-breaks-search-with-mono-instance
> Once this branch is checked out, just run:
> $ cd ./query
> $ mvn clean -Dtest=org.infinispan.query.distributed.DistributedMassIndexingTest test
> Note: since MassIndexer is implemented using the Map/Reduce algorithm, it needs to run within a cluster (i.e. several instances of ISPN).
> [1] http://stackoverflow.com/questions/10090361/infinispan-very-slow-for-loading-data-with-indexing-can-it-be-made-faster
> If run within a single node cluster, the MassIndexer