[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-867) input stream support
adam (JIRA)
noreply at atlassian.com
Mon Aug 29 17:27:05 EDT 2011
[ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=43386#comment-43386 ]
adam commented on HSEARCH-867:
------------------------------
I've been testing on a server. Moving the reader and stream initialization directly into the LazyField, and passing in a List of URIs, does appear to solve the "too many open files" issue, at least in this case. I haven't profiled this yet, but it seems like this is going to give the "best" performance possible.
thanks
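
For illustration, here is a minimal sketch of the idea described above: a Reader that takes a List of URIs and opens each underlying stream only when reading reaches it, so at most one file handle is held at a time. The class name LazyUriReader is hypothetical and is not part of Hibernate Search or Lucene; the actual LazyField code is not shown in this thread, so this is only an assumption about how such lazy initialization could look.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.nio.charset.Charset;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical helper (not a Hibernate Search or Lucene class):
 * exposes several URIs as one character stream, opening each lazily.
 */
final class LazyUriReader extends Reader {
    private final Iterator<URI> remaining;
    private final Charset charset;
    private Reader current;   // at most one underlying stream is open at a time
    private boolean closed;

    LazyUriReader(List<URI> uris, Charset charset) {
        // Nothing is opened here; construction is cheap even for many URIs.
        this.remaining = uris.iterator();
        this.charset = charset;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (closed) {
            throw new IOException("Reader is closed");
        }
        if (len == 0) {
            return 0;
        }
        while (true) {
            if (current == null) {
                if (!remaining.hasNext()) {
                    return -1; // every URI has been consumed
                }
                // Open the next stream only when it is actually needed.
                current = new InputStreamReader(
                        remaining.next().toURL().openStream(), charset);
            }
            int n = current.read(buf, off, len);
            if (n != -1) {
                return n;
            }
            // Current source is exhausted: release its handle before moving on.
            current.close();
            current = null;
        }
    }

    @Override
    public void close() throws IOException {
        closed = true;
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

A reader like this could be handed to anything that consumes a java.io.Reader, so the content is streamed in chunks rather than first being concatenated into a String.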
> input stream support
> --------------------
>
> Key: HSEARCH-867
> URL: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-867
> Project: Hibernate Search
> Issue Type: Improvement
> Components: analyzer, integration
> Affects Versions: 3.4.0.Final
> Reporter: adam
> Priority: Minor
>
> The current Hibernate Search functionality is not optimized for dealing with large text content. Two use cases:
> 1. indexing an external PDF that's 100MB where an @Field is set on a getter
> 2. indexing a @Lob field
> In both cases, the method must return a String or a base class. This might mean that you have a 50MB InputStream that gets concatenated into a String, which is then wrapped in a Reader object and passed to an analyzer. I'm unclear what Hibernate Search is doing when the getter for the @Field annotation is called, but it would be ideal if it could use a Reader instead of a String.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira