[
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-867?pag...
]
adam commented on HSEARCH-867:
------------------------------
Sanne,
thanks for your comments. I'm trying to optimize a case locally where we will be
pulling in multiple files (sometimes large), so I want to avoid string concatenation
because of the memory cost. What I've found is that the Fieldable interface (according to the
documentation) should happily work with a Reader if it's given one instead of a
String. Documenting what I did (as this gets indexed by Google):
# changing the stringValue to return null and implementing a reader works well
# using a SequenceInputStream allows me to wrap the FileInputStreams into a single reader
# changing the FieldBridge to process a reader and pass it to the LazyField
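
The second step above can be sketched with plain JDK classes. This is only an illustration under assumptions, not the code from my bridge: the class name MultiFileReader, the method open and the charset parameter are made up for the example, and nothing here touches Hibernate Search or Lucene.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.SequenceInputStream;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: exposes several files as one continuous Reader.
public class MultiFileReader {

    public static Reader open(List<String> paths, Charset charset) throws IOException {
        List<InputStream> streams = new ArrayList<>();
        try {
            for (String path : paths) {
                streams.add(new FileInputStream(path));
            }
        } catch (IOException e) {
            // One file failed to open: release the ones already opened.
            for (InputStream s : streams) {
                try {
                    s.close();
                } catch (IOException ignored) {
                    // nothing more to do for this stream
                }
            }
            throw e;
        }
        // SequenceInputStream reads each stream to its end, closes it,
        // then moves on to the next one.
        InputStream joined = new SequenceInputStream(Collections.enumeration(streams));
        // Bytes are decoded on demand, so no file is held in memory as a String.
        return new InputStreamReader(joined, charset);
    }
}
```

Closing the returned Reader closes the remaining underlying streams. Note that the files are joined byte for byte with no separator, so the last word of one file and the first word of the next can run together unless each file ends with whitespace.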
input stream support
--------------------
Key: HSEARCH-867
URL: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-867
Project: Hibernate Search
Issue Type: Improvement
Components: analyzer, integration
Affects Versions: 3.4.0.Final
Reporter: adam
Priority: Minor
The current Hibernate Search functionality is not optimized for dealing with large text
contents. Two use cases:
1. indexing an external PDF that's 100MB where an @Field is set on a getter
2. indexing a @Lob field
In both cases, the method must return a String, or a base class, which might mean that
you have an InputStream that's 50MB, which gets read into a String in full, and then
passed to an analyzer wrapped in a Reader object. I'm unclear what Hibernate Search
is doing when the getter for the @Field annotation is called, but it would be ideal if it
could use a Reader instead of a String.
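
To illustrate why a Reader helps, here is a minimal sketch of a consumer that works through a character stream with a fixed-size buffer, so memory use does not grow with the size of the input. The class StreamingTokenCounter and its whitespace splitting are hypothetical stand-ins for an analyzer; this is not how Lucene tokenizes and not what Hibernate Search does internally.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical stand-in for an analyzer: counts whitespace-separated tokens.
public class StreamingTokenCounter {

    public static long countTokens(Reader reader) throws IOException {
        char[] buffer = new char[8192]; // the only text held in memory at once
        long tokens = 0;
        boolean inToken = false;        // carries token state across buffer refills
        int read;
        while ((read = reader.read(buffer)) != -1) {
            for (int i = 0; i < read; i++) {
                if (Character.isWhitespace(buffer[i])) {
                    inToken = false;
                } else if (!inToken) {
                    inToken = true;
                    tokens++;
                }
            }
        }
        return tokens;
    }
}
```

With a String-returning getter, the whole 50MB has to exist as one object before anything like this can start; with a Reader, the consumer only ever sees one buffer's worth at a time.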