[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-867) input stream support

adam (JIRA) noreply at atlassian.com
Sun Aug 28 12:13:05 EDT 2011


    [ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=43365#comment-43365 ] 

adam commented on HSEARCH-867:
------------------------------

Revisiting this (and I'm happy to open a separate ticket): there seems to be another, deeper issue with the LazyField model.  If you pass in readers, there is no control over how many readers are open at a given moment.  Even if I limit the following:

* hibernate.search.worker.batch_size
* hibernate.search.worker.thread_pool.size
* hibernate.search.worker.buffer_queue.max

I still get "too many open files" errors from Lucene. Things I have tried:
* adding a reader.close() immediately after adding the field to the document: this produces closed-stream errors instead
* adding a finalize() method that closes the stream: I still get the "too many open files" error
* changing from lazy=true to lazy=false: no effect
* disabling async mode: no effect

Is there another parameter that can be used?
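
One possible workaround, sketched here under assumptions rather than taken from Hibernate Search or Lucene: hand the indexer a reader that does not open its file until the first read() and releases the handle as soon as the content is exhausted. Documents waiting in a queue then hold no file descriptors. LazyFileReader and isOpen() are hypothetical names, UTF-8 is an assumed encoding, and the sketch assumes the consumer reads each reader to the end (or closes it).

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; not part of Hibernate Search or Lucene. */
public class LazyFileReader extends Reader {
    private final Path path;
    private Reader delegate;   // null until the first read()
    private boolean finished;  // true after end of stream or close()

    public LazyFileReader(Path path) {
        this.path = path;
    }

    /** True only while a file handle is actually held. */
    public boolean isOpen() {
        return delegate != null;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (finished) {
            // Lenient choice: report end of stream instead of throwing.
            return -1;
        }
        if (delegate == null) {
            // Open the file only when the consumer starts reading.
            delegate = Files.newBufferedReader(path, StandardCharsets.UTF_8);
        }
        int n = delegate.read(buf, off, len);
        if (n == -1) {
            // Release the handle as soon as the content is exhausted.
            close();
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        finished = true;
        if (delegate != null) {
            delegate.close();
            delegate = null;
        }
    }
}
```

With this shape, the number of open files is bounded by the number of readers being consumed at that moment, not by the number of documents that have been built. Whether that is enough depends on how many documents the indexer reads concurrently.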

> input stream support
> --------------------
>
>                 Key: HSEARCH-867
>                 URL: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-867
>             Project: Hibernate Search
>          Issue Type: Improvement
>          Components: analyzer, integration
>    Affects Versions: 3.4.0.Final
>            Reporter: adam
>            Priority: Minor
>
> The current Hibernate Search functionality is not optimized for dealing with large text content.  Two use cases:
> 1. indexing an external PDF that's 100MB, where an @Field is set on a getter
> 2. indexing a @Lob field
> In both cases, the method must return a string, or a base class. That might mean you have an InputStream that's 50MB, which gets concatenated into a string and then passed to an analyzer wrapped in a Reader object.  I'm unclear what Hibernate Search is doing when the getter for the @Field annotation is called, but it would be ideal if it could use a reader instead of a string.
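
The memory concern in the description above can be illustrated without Lucene. A consumer that pulls from a Reader needs only a fixed-size buffer regardless of how large the content is, whereas building a String first requires the whole content in memory. In this minimal sketch, StreamingWordCount and countWords are hypothetical names, and whitespace splitting stands in for a real analyzer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical stand-in for an analyzer that consumes a Reader. */
public class StreamingWordCount {

    /** Counts whitespace-separated words using a fixed 8K char buffer. */
    public static long countWords(Reader reader) throws IOException {
        char[] buf = new char[8192];
        long words = 0;
        boolean inWord = false; // carries over between buffer refills
        int n;
        while ((n = reader.read(buf, 0, buf.length)) != -1) {
            for (int i = 0; i < n; i++) {
                if (Character.isWhitespace(buf[i])) {
                    inWord = false;
                } else if (!inWord) {
                    inWord = true;
                    words++;
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader is used here only to keep the example self-contained;
        // a file or LOB character stream would be consumed the same way.
        System.out.println(countWords(new StringReader("indexing a large PDF file")));
    }
}
```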


        

