OK, if we want to do all of this we will have to start very quickly. In fairness, I'm
not sure we can even do all of this, so let's make sure these are prioritized
accordingly.
I could not find the expected deadlines for AS 7 / Core 4, but we are probably talking about
June here, i.e. very soon.
Some more comments inline.
On 20 Apr. 2011, at 10:21, Sanne Grinovero wrote:
Hi,
About changing contracts, we don't get this chance very often so we
should make sure we don't miss any.
I have some favourites I'd like to discuss:
- work list sent to backend
-- As you know Lucene dropped all guarantees about serializability,
supporting stuff like JMS requires a format change; especially the
NumericField is not working right now as it was never serializable
(HSEARCH-681)
+1
-- Lucene is being more flexible about updates, I don't think we
should keep remapping an "update" operation as a delete+add operation,
but transmit the "update operation" and let the backend figure out
what's best.
I guess we could do that. We need to make sure collection "updates" play well
in that mix.
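A minimal sketch of forwarding the "update" operation untouched, so the backend decides how to apply it. Every name here (WorkType, IndexWork, RecordingBackend) is hypothetical and not taken from the Hibernate Search code base; the backend only records what it would do, and collection updates are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical work types sent to the backend; UPDATE is no longer remapped upstream.
enum WorkType { ADD, DELETE, UPDATE }

final class IndexWork {
    final WorkType type;
    final String entityId;

    IndexWork(WorkType type, String entityId) {
        this.type = type;
        this.entityId = entityId;
    }
}

// Hypothetical backend that records the operations it would run against the index.
final class RecordingBackend {
    private final boolean supportsAtomicUpdate;
    final List<String> applied = new ArrayList<>();

    RecordingBackend(boolean supportsAtomicUpdate) {
        this.supportsAtomicUpdate = supportsAtomicUpdate;
    }

    void apply(IndexWork work) {
        if (work.type == WorkType.UPDATE && !supportsAtomicUpdate) {
            // Fallback equivalent to today's behaviour: delete, then add.
            applied.add("delete:" + work.entityId);
            applied.add("add:" + work.entityId);
        } else {
            // The backend handles the operation as a single step.
            applied.add(work.type.name().toLowerCase(Locale.ROOT) + ":" + work.entityId);
        }
    }
}
```

The point of the design is that the delete+add decision moves from the code that builds the work list into the backend, which knows what the underlying index can do.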
- DirectoryProvider
-- make a "DirectoryManager" instead, which is able to provide
factories for both IndexReaders and IndexWriters
-- add utility methods like "getName()", wish I had that in some
cases to provide better error messages. This leads me to think that
instead of trying to foresee all needed methods, the extension point
should not be the DirectoryManager interface directly, but have people
plug in different aspects.
That might also be better: since it reduces the scope, it's easier to design the
contract.
-- this is needed to support both Instantiated indexes and to make
good use of all new so called "Near-Real-Time" Lucene improvements.
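One possible shape for a "DirectoryManager" with pluggable aspects, as a rough sketch only. The interfaces below are assumptions made for illustration (including getAspect, plug and requireAspect), and the factories return plain strings instead of Lucene readers and writers so the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical contract: a named manager whose capabilities are looked up by type.
interface DirectoryManager {
    String getName();

    // Returns the plugged-in aspect, or null when none was registered for this type.
    <T> T getAspect(Class<T> aspectType);
}

// Stand-ins for reader and writer factories; a real version would return Lucene types.
interface ReaderFactory { String openReader(); }
interface WriterFactory { String openWriter(); }

final class SimpleDirectoryManager implements DirectoryManager {
    private final String name;
    private final Map<Class<?>, Object> aspects = new HashMap<>();

    SimpleDirectoryManager(String name) {
        this.name = name;
    }

    <T> SimpleDirectoryManager plug(Class<T> aspectType, T implementation) {
        aspects.put(aspectType, implementation);
        return this;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public <T> T getAspect(Class<T> aspectType) {
        return aspectType.cast(aspects.get(aspectType));
    }

    // getName() pays off here: the error message says which index is misconfigured.
    <T> T requireAspect(Class<T> aspectType) {
        T aspect = getAspect(aspectType);
        if (aspect == null) {
            throw new IllegalStateException(
                "Index '" + name + "' has no " + aspectType.getSimpleName() + " plugged in");
        }
        return aspect;
    }
}
```

With this shape, new capabilities can be added as new aspect types without growing the DirectoryManager interface itself.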
- ReaderProvider
-- (assuming such a thing would still exist): I think it would be
very nice if the responsibility of such a provider would be to provide
the IndexReader for a single index. Currently it has to provide a
"multiReader" on each different index, making some implementations
very tricky (seems I got it right in SharingBufferReaderProvider, but
I recently had some other interesting ideas which turned out to be quite
daunting after a draft: take responsibility of the FieldCache expiry
directly, to be able to plug different cache implementations, we
control the lifecycle and we can be much smarter).
ok, we might be able to do that in a 4.1 if need be.
- backends and workers
-- I'd like to make it possible to configure different backends per
index. Currently a backend is global, while in some (extreme) cases it
would have been handy to configure even single shards to different
backends. So really a backend should be something coupled to the
"DirectoryManager" mentioned before. Question is, at what level is
sharding going to work, it could work as a multiplexing
DirectoryManager.
Can you remind us of the use case behind heterogeneous backends? There was one but I forgot.
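To make the per-index (and per-shard) backend idea concrete, here is a small sketch of how a backend could be resolved, with the most specific setting winning. BackendResolver and the backend names are hypothetical. The sketch also assumes that shards are named "<index>.<n>" and that index names contain no other dots.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver; backends and indexes are identified by plain strings here.
final class BackendResolver {
    private final String defaultBackend;
    private final Map<String, String> overrides = new HashMap<>();

    BackendResolver(String defaultBackend) {
        this.defaultBackend = defaultBackend;
    }

    BackendResolver override(String indexOrShardName, String backend) {
        overrides.put(indexOrShardName, backend);
        return this;
    }

    // Lookup order: shard ("books.1"), then index ("books"), then the global default.
    String backendFor(String indexOrShardName) {
        String name = indexOrShardName;
        while (true) {
            String backend = overrides.get(name);
            if (backend != null) {
                return backend;
            }
            int lastDot = name.lastIndexOf('.');
            if (lastDot < 0) {
                return defaultBackend;
            }
            // Drop the last segment and retry with the parent name.
            name = name.substring(0, lastDot);
        }
    }
}
```

For example, with a default of "lucene", an override of "jms" for "books" and an override of "lucene-local" for "books.1", shard "books.0" resolves to "jms" while "books.1" resolves to "lucene-local".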
-- defaults to change:
- remove the notions of transactional / batch IndexWriter setting,
was deprecated since long enough.
ok easy
- make the FullTextEventListener final (people still extend and replace
it to better control when an entity is to be indexed, but I hope we
can solve that as well)
Well it will be in a private package anyways
- default to NumericField for numeric properties
- set exclusive_index_use=true by default, benefits are far too high
and some optimizations I was thinking of are impossible if this is
disabled.
I'm not sure I agree with that. It seems that such a default would bite a non careful
user too easily.
-- bridges
- It happened many times that we couldn't do X or optimize Y as "user
bridge might read/write any field"; I think we should stop exposing
the o.a.lucene.Document - especially since we change the format of
messages to the backend - and make sure to expose something as good
and as flexible. Need some thinking on this: we can't expose Document
but we want to make sure people won't ever miss advanced features for
which such a bridge was a nice "advanced api". Or we split the
concepts, having a less-powerful API and a more advanced one, which
could be named, and operate on the Document itself but inside the
backend rather than in the DocumentBuilder (so the name could be used
in the message to the backend to point to some transformer to apply
for final touches - it could be a customization of the implementation
which applies the message in our own format to the
o.a.lucene.Document)
I don't think I follow you. Can you expand on what you have in mind?
BTW I'm a bit concerned about the "serializability" of what would be needed
to be passed around if you move FieldBridge operations in the backend.
- at some point, we'll need to track also which entity properties are
being "read" by a custom ClassBridge/DynamicBoost, to better check for
index dirtiness. Might be done by proxying the entity, or just having
the implementation declare by which properties it's affected: in this
case, an API change is needed but this can possibly be postponed.
Proxying does not solve all use cases. If a user has a transient getter that reads data
from two other getters, you don't get that info via proxying.
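The "declare which properties it is affected by" option could look roughly like the sketch below. PropertyAwareBridge, FullNameBridge and DirtinessChecker are made-up names for illustration, not existing Hibernate Search contracts. The example bridge stands for the transient-getter case: a full name derived from two other properties.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical contract: a bridge lists the entity properties it reads.
interface PropertyAwareBridge {
    Set<String> readProperties();
}

// Example bridge indexing a derived value built from firstName and lastName.
final class FullNameBridge implements PropertyAwareBridge {
    @Override
    public Set<String> readProperties() {
        return Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("firstName", "lastName")));
    }
}

final class DirtinessChecker {
    private DirtinessChecker() {
    }

    // The index entry is stale only if a property the bridge reads has changed.
    static boolean needsReindex(PropertyAwareBridge bridge, Set<String> dirtyProperties) {
        return !Collections.disjoint(bridge.readProperties(), dirtyProperties);
    }
}
```

Because the dependency is declared explicitly, it does not matter whether the value is reached through a proxied getter or computed from other getters.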
this is just off the top of my head, I'm sure I forgot to break some
interface ;)
I'll give you some time to think about it, then I'll insert the
proposals which survived in the wiki & JIRA.
(needless to say, no objections to your proposals)
Cheers,
Sanne
2011/4/20 Emmanuel Bernard <emmanuel(a)hibernate.org>:
> Hi,
>
> We have had in our road map a Hibernate Search 3.5 before Hibernate 4. Hibernate 4
is the release where the following should happen:
> - split packages into API, SPI and private packages
> - use JBoss Logging
> - be compliant with Core 4
> - break whatever contract we need to break to open up the future
> - split dependency between the core of Hibernate Search and Hibernate Core
>
> Do you see more tasks for 4?
>
> Since Hibernate Core 4 seems to be doing alright and the time pressure will be
strong to get Hibernate Search aligned, I propose to skip 3.5 entirely and focus on 4. We
did not have that many new features planned anyways for 3.5, it was more a consolidation
release.
>
> Even with skipping 3.5, the 4 release will be a lot of work. We should start early.
Any objection or comment?
>
> Changing contracts
> We have had a few contracts that we wanted to change to make way for future
improvements:
> - should a bridge know about the field it changes (make the optimization more
efficient)
> - rework the backend to let IndexReader and IndexWriter communicate
> - rework the backend to support instantiated IndexReaders
>
> Can you help collect the list of changes you would like to see happening?
>
> I would like to get this work started asap, this is really the unknown quantity and
we tend to be slow to converge on these things.
>
> Split packages in API/SPI/private packages
> Hibernate 4 is the ideal time to properly split stuff into API, SPI, private. Moving
classes to private packages is the least impacting move for users as these should not be
used. The API / SPI split is sometimes difficult to do so if you have a doubt in an area,
ask on the ML or on IRC and we can discuss it together. If you need an example, check out
the query engine. It is relatively clean now.
>
> We might have to break a few user APIs which is fine but I don't expect too many
will be necessary:
> - make sure to discuss it when you plan to do one
> - list them in the migration guide
>
> I'd say that the package splitting should be done when you have a change and when
you work in a specific area. It's more a background task.
>
> Be compliant with Core 4
> We can do this one a bit later in the cycle to give time for core to mature.
>
> Split dependency between Hibernate Search and Hibernate Core
> I think in practice we are not too far. This work should be done in parallel to the
package splitting. If you look at the query engine, we do have specific hibernate
packages. We also have a HibernateHelper class for all low-level Hibernate contracts like
unproxying, initializing etc. We should use that class everywhere instead of relying on
the direct Hibernate Core contracts. That will help us to move this class as an
implementable contract.
> The next step potentially is to actually move Hibernate Core specific code into a
separate package.
>
> I don't have much opinion on this but we should definitely discuss it.
>
> Use JBoss Logging
> I tend to think we should do this migration late in the game. WDYT?
>
> New features
> Do you want any new feature per se? I think this would be a great time to get the
community involved to back new features and fix bugs while we do the grunt work for 4. So
if you know some shy but motivated people, or if you are one of them, stand up :)
>
> Note: I have created a vague copy of this email in
http://community.jboss.org/wiki/PlansforHibernateSearch4
> We can discuss via email but be sure to add the feedback or list of todos in the wiki
as well for posterity.
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/hibernate-dev
>