I see,
Do you keep your IndexReader open while the data is moved to your
client?
I ask because the document id is not guaranteed to stay constant.
It seems to me that we can avoid such a feature for now. You can
return getResultSize() plus a list of the top 20 objects with their
scores, for example. Such a structure would be serializable.
Let's see how it goes.
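A minimal sketch of the kind of structure described above: a total result size plus a short list of objects with their scores, all plain serializable values. The class names ScoredResult and SearchResultPage and the roundTrip helper are hypothetical and are not part of Hibernate Search or Lucene; the sketch also assumes the returned objects are themselves serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical name: one matched object plus its relevance score. */
final class ScoredResult implements Serializable {
    private static final long serialVersionUID = 1L;
    final Serializable entity; // the loaded object; assumed to be serializable
    final float score;

    ScoredResult(Serializable entity, float score) {
        this.entity = entity;
        this.score = score;
    }
}

/** Hypothetical name: total hit count plus the top results, detached from the index. */
final class SearchResultPage implements Serializable {
    private static final long serialVersionUID = 1L;
    final int resultSize;               // total matches, as getResultSize() would report
    final ArrayList<ScoredResult> top;  // e.g. the 20 best matches

    SearchResultPage(int resultSize, List<ScoredResult> top) {
        this.resultSize = resultSize;
        this.top = new ArrayList<>(top); // defensive copy into a serializable list
    }

    /** Writes the page to bytes and reads it back, as a remote call would. */
    static SearchResultPage roundTrip(SearchResultPage page) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(page);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SearchResultPage) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the page carries no document ids and no reference to the IndexReader, nothing in it depends on the index staying unchanged after the search returns.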
On 11 June 07, at 23:46, John Griffin wrote:
-----Original Message-----
From: Emmanuel Bernard [mailto:emmanuel@hibernate.org]
Sent: Monday, June 11, 2007 10:06 AM
To: John Griffin
Cc: hibernate-dev(a)lists.jboss.org
Subject: Re: [hibernate-dev] hsearch-6 serializable Hits object
How do you deal with serialized Hits?
Do you raise a "LazyInitializationException", as when a user accesses an
unavailable document?
Same for the serializable HitIterator?
--------------
I don't deal with lazy initialization. I use the low-level TopDocs
object, which gives me a document number and a score only. Hence it
gives me good response times even across a 1-million-document index,
and the size of the documents is not a consideration for memory
constraints.
Once I have the TopDocs object I just retrieve the paged number of
documents as the user asks for them: 10, 20, 30, etc. at a time. I also
maintain state with this TopDocs object by passing it back and forth
between client and server, with pointers to the current location. We do
limit the size of the TopDocs object to 200 (somewhat arbitrary).
TopDocs also contains a count of the actual number of hits that
resulted from the search. I use this to warn the user that his search
should be narrowed if necessary. Here's an example. Our
employeecomments table has 1 million records. A search for the word
'labor' returns a count of 157,000+ hits, but we only return the top 20
along with the TopDocs object (hidden, for state) and a statement that
'maybe you should refine your search a little' ;>) (we do tell them the
number of hits, though). Also, we limit the actual size of the TopDocs
object to the top 200 records.
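The paging scheme described above might be sketched roughly as follows. PagedTopDocs is a hypothetical stand-in that holds only what the message says is kept from TopDocs (doc numbers, scores, and the total hit count); it is not the Lucene class itself, and the method names page and shouldNarrowSearch are assumptions made for illustration.

```java
import java.io.Serializable;
import java.util.Arrays;

/** Hypothetical stand-in for a TopDocs-style result: doc numbers and scores only. */
final class PagedTopDocs implements Serializable {
    private static final long serialVersionUID = 1L;
    static final int MAX_KEPT = 200;  // the cap mentioned in the message (somewhat arbitrary)

    final int totalHits;     // actual number of matches found by the search
    final int[] docNumbers;  // at most MAX_KEPT entries, best match first
    final float[] scores;    // parallel to docNumbers

    PagedTopDocs(int totalHits, int[] docNumbers, float[] scores) {
        int kept = Math.min(MAX_KEPT, Math.min(docNumbers.length, scores.length));
        this.totalHits = totalHits;
        this.docNumbers = Arrays.copyOf(docNumbers, kept);
        this.scores = Arrays.copyOf(scores, kept);
    }

    /** Doc numbers for the page starting at offset; empty once past the kept window. */
    int[] page(int offset, int pageSize) {
        if (offset < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and pageSize > 0");
        }
        int from = Math.min(offset, docNumbers.length);
        int to = (int) Math.min((long) from + pageSize, docNumbers.length);
        return Arrays.copyOfRange(docNumbers, from, to);
    }

    /** True when more documents matched than are kept, so the user should narrow the search. */
    boolean shouldNarrowSearch() {
        return totalHits > docNumbers.length;
    }
}
```

With the 'labor' example, totalHits would be 157,000+ while at most 200 entries are kept, so shouldNarrowSearch() returns true and the client can page through the kept entries 20 at a time by advancing the offset.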
--------------
Which approach are you following to expose the data? Christian's
proposal?
In Christian's proposal, what is really needed for serialization is
Hit, but then Hit is nothing but a Document and a score, which can be
projected as well.
--------------
My SerializableHits object is really nothing more than a rebuild of a
Hit and score into serializable components. I use the doc number from
the TopDocs to obtain the document and use the TopDocs score. The entire
reason behind this is to allow Lucene classes that originally lived
locally in our application to be moved to a remote server that deals
solely with Lucene. Now we have our application on clustered JBoss,
which accesses remote servers that deal solely with Lucene (add, update,
delete, search).
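As a rough illustration of that rebuild step, the sketch below resolves each doc number to its field values on the search server and copies them, together with the score, into a serializable value. StoredFieldLookup, SerializableHit, and HitRebuilder are hypothetical names, and the lookup interface is an assumed stand-in for whatever index access the server actually uses; this is not the SerializableHits code mentioned in the message.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lookup; on the search server it would wrap the open index. */
interface StoredFieldLookup {
    Map<String, String> fieldsOf(int docNumber);
}

/** Hypothetical shape of one rebuilt hit: copied field values plus the score. */
final class SerializableHit implements Serializable {
    private static final long serialVersionUID = 1L;
    final LinkedHashMap<String, String> fields; // copied, so no link back to the index
    final float score;

    SerializableHit(Map<String, String> fields, float score) {
        this.fields = new LinkedHashMap<>(fields);
        this.score = score;
    }
}

final class HitRebuilder {
    /** Resolves each doc number while the index is still open and pairs it with its score. */
    static List<SerializableHit> rebuild(int[] docNumbers, float[] scores,
                                         StoredFieldLookup lookup) {
        if (docNumbers.length != scores.length) {
            throw new IllegalArgumentException("docNumbers and scores must be parallel");
        }
        List<SerializableHit> hits = new ArrayList<>(docNumbers.length);
        for (int i = 0; i < docNumbers.length; i++) {
            hits.add(new SerializableHit(lookup.fieldsOf(docNumbers[i]), scores[i]));
        }
        return hits;
    }
}
```

The rebuilt hit deliberately leaves out the doc number: it is only used on the server to fetch the field values, so the client never depends on a document id that may change later.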
--------------
I don't see the need for a lazy Hit in this scheme, hence no need for
a serializable version.
BTW, who is consuming Hit(s) aside from the user's application? Does
Lucene have (public) APIs consuming Hits?
--------------
In my case the application is the only consumer. I've never looked at
Lucene that way, i.e. as the consumer itself. I hope this helps.
Let me know what you think.
--------------
Emmanuel
On 7 June 07, at 22:47, John Griffin wrote:
> Do we want to convert the Lucene Hits object to a serializable
> format so it could be accessed remotely? I have developed this at
> work and have a 'SerializableHits' object for this purpose. Is
> this overkill right now? Change later? Thoughts?
>
>
>
> John G.
>
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev