API changes in FieldBridge
by Sanne Grinovero
We have often had trouble applying strong optimisations because of the
flexibility of the FieldBridge API. Since 4.0 is coming, it's time to
decide whether these limitations are going to stay or whether we have to
change the contract.
As far as I remember, we can classify these limitations into two main areas:
1# when extracting stored values, we need to know which fields are needed
(more details on HSEARCH-213), in short:
- On entity-targeting queries we can't use a FieldSelector if the ID
is using an unknown TwoWayFieldBridge, which might use multiple
Lucene fields.
- On projection queries we can't use a FieldSelector if ANY field
uses a TwoWayFieldBridge
- In the same scenarios, we can't use a FieldCache either.
For these cases we have
HSEARCH-904 - Have TwoWayFieldBridge report which fields should be
loaded from a Document
=== Proposals ?
Shall we add an "Iterable<String> getNeededFieldNames()" method?
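To make the proposal concrete, here is a minimal sketch of a bridge that reports the stored fields it reads back. Only the method name getNeededFieldNames() comes from the proposal above; FieldReportingBridge, DateSplitBridge and FieldSelection are hypothetical names invented for illustration, and a plain Map stands in for the Lucene Document so the sketch stays self-contained.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a TwoWayFieldBridge that reports its stored fields.
interface FieldReportingBridge {
    // Rebuilds the property value from stored fields
    // (assumption: a Map replaces the Lucene Document in this sketch).
    Object get(String name, Map<String, String> storedFields);

    // Method proposed in this mail: the stored field names the bridge reads back.
    Iterable<String> getNeededFieldNames();
}

// Hypothetical bridge that spreads one property over three Lucene fields.
class DateSplitBridge implements FieldReportingBridge {
    private final String prefix;

    DateSplitBridge(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Object get(String name, Map<String, String> storedFields) {
        return storedFields.get(prefix + ".year") + "-"
                + storedFields.get(prefix + ".month") + "-"
                + storedFields.get(prefix + ".day");
    }

    @Override
    public Iterable<String> getNeededFieldNames() {
        return Arrays.asList(prefix + ".year", prefix + ".month", prefix + ".day");
    }
}

// Collects the union of needed field names: what a field selector would have to load.
final class FieldSelection {
    private FieldSelection() {
    }

    static Set<String> neededFields(List<? extends FieldReportingBridge> bridges) {
        Set<String> names = new LinkedHashSet<>();
        for (FieldReportingBridge bridge : bridges) {
            for (String fieldName : bridge.getNeededFieldNames()) {
                names.add(fieldName);
            }
        }
        return names;
    }
}
```

With something along these lines, the query engine could build a selector from the union of the reported names instead of loading every stored field whenever a TwoWayFieldBridge is involved.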
2# when creating an o.a.l.Document, we want to know
- which entity properties are going to affect the end result
- whether external input (current timestamp?) is going to affect it too
This is important to make better use of the dirtiness information of the
data, to see whether we can skip the indexing operations altogether,
especially if re-indexing is going to reload a significant subgraph via
@IndexedEmbedded and similar.
N.B. this same optimisation should apply to @DynamicBoost and
@AnalyzerDiscriminator
For this one we have:
HSEARCH-764 ClassBridge, DynamicBoost and AnalyzerDiscriminator should
report affecting properties
=== Proposals ?
Define on the interface as well:
a)
boolean allowSkipOnUnchangedFields() // can return false to disable
any optimisation, like to add the current date to the index or
externally loaded data
Iterable<String> getInputPropertyNames() //this is the hard point:
very unsafe stuff.
b) alternatively, we could avoid passing the entity directly and instead
intercept it to see which properties are being used in practice:
- considerably more powerful if the FieldBridge has some branching logic
which leads it to use different fields according to other state
- hard to intercept field accessors: would we limit it to entities
using getters only?
- would still need the global toggle "boolean allowSkipOnUnchangedFields()"
- since it won't really change the API (besides the boolean switch), it
could be implemented later.
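As a rough illustration of option (a), the sketch below shows how an engine might combine the two proposed methods with dirty-property information. The method names allowSkipOnUnchangedFields() and getInputPropertyNames() are taken from the proposal; InputReportingBridge, FullNameBridge, TimestampingBridge and DirtyCheck are hypothetical names, and the set of dirty property names is assumed to be supplied by the caller.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;

// Hypothetical contract carrying the two methods proposed in option (a).
interface InputReportingBridge {
    // Returning false disables the optimisation, e.g. when the current date
    // or externally loaded data ends up in the index.
    boolean allowSkipOnUnchangedFields();

    // Entity properties the bridge reads when building the Document.
    Iterable<String> getInputPropertyNames();
}

// Hypothetical bridge whose output depends only on two entity properties.
class FullNameBridge implements InputReportingBridge {
    @Override
    public boolean allowSkipOnUnchangedFields() {
        return true;
    }

    @Override
    public Iterable<String> getInputPropertyNames() {
        return Arrays.asList("firstName", "lastName");
    }
}

// Hypothetical bridge that indexes a timestamp, so skipping is never safe.
class TimestampingBridge implements InputReportingBridge {
    @Override
    public boolean allowSkipOnUnchangedFields() {
        return false;
    }

    @Override
    public Iterable<String> getInputPropertyNames() {
        return Collections.<String>emptyList();
    }
}

final class DirtyCheck {
    private DirtyCheck() {
    }

    // True only if the bridge opts in and none of its inputs is dirty.
    static boolean canSkipIndexing(InputReportingBridge bridge, Set<String> dirtyProperties) {
        if (!bridge.allowSkipOnUnchangedFields()) {
            return false;
        }
        for (String property : bridge.getInputPropertyNames()) {
            if (dirtyProperties.contains(property)) {
                return false;
            }
        }
        return true;
    }
}
```

The unsafe part mentioned above remains: the check is only as good as the list a bridge author declares, so a forgotten property would silently produce a stale index.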
Please comment. We have had these ideas around for some time, but the
time has come to write down how it's going to work.
Sanne
13 years, 3 months
HHH-3244 supporting unicode literal type
by Strong Liu
Hi there,
As you know, JDBC 4.0 now supports Unicode literal types such as NCHAR, NVARCHAR, NCLOB and LONGNVARCHAR,
and there are lots of JIRAs asking us to support these new types.
It is easy to create Hibernate types for these JDBC types. For example, I have a UnicodeStringType, so
@Entity
class Person {
    @Id
    Long id;
    @Type(type = "unicode-string-type")
    String name;
}
Query query = session.createQuery("from Person p where p.name = :name");
query.setUnicodeString("name", "刘少壮"); // proposed new setter
Is this okay? Any suggestions?
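For reference, a minimal sketch of what the binding side of such a type might do at the JDBC level. PreparedStatement.setNString, PreparedStatement.setNull and Types.NVARCHAR are real JDBC 4.0 API; NationalStringBinder and RecordingStatements are hypothetical helpers written for this illustration and are not the UnicodeStringType mentioned above. The recording proxy exists only so the sketch can be exercised without a database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.List;

// Hypothetical helper: binds a String through the JDBC 4.0 national character API.
final class NationalStringBinder {
    private NationalStringBinder() {
    }

    // The JDBC type code such a type would report for its column.
    static int sqlType() {
        return Types.NVARCHAR;
    }

    static void bind(PreparedStatement statement, int index, String value) throws SQLException {
        if (value == null) {
            statement.setNull(index, Types.NVARCHAR);
        } else {
            // setNString sends the value as NCHAR/NVARCHAR/LONGNVARCHAR data.
            statement.setNString(index, value);
        }
    }
}

// Test double: records the names of the PreparedStatement methods invoked.
final class RecordingStatements {
    private RecordingStatements() {
    }

    static PreparedStatement recording(final List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());
            return null; // fine for the void setters used in this sketch
        };
        return (PreparedStatement) Proxy.newProxyInstance(
                RecordingStatements.class.getClassLoader(),
                new Class<?>[] { PreparedStatement.class },
                handler);
    }
}
```

Whether the query side needs a dedicated setter, or whether the type can be picked up from the mapped property, is exactly the open question of this mail; the sketch only covers the parameter binding.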
-----------
Strong Liu <stliu(a)hibernate.org>
http://hibernate.org
http://github.com/stliu
13 years, 3 months
IdentifierGeneratorFactory as a service?
by Emmanuel Bernard
Hi all,
I have the need to implement https://hibernate.onjira.com/browse/HHH-6091 (Let people customize identifier generator strategy mappings programmatically in Hibernate 4)
and I am wondering why IdentifierGeneratorFactory is not a service and if it would make sense for me to implement it as a service. The alternatives would be:
- some hardcoded property based class like in 3.6
- some service dedicated to overriding default id generators that would be called by DefaultIdentifierGeneratorFactory
Any preference?
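To illustrate the second alternative, here is a minimal sketch of a dedicated override service consulted by the factory. Everything in it is hypothetical: IdGeneratorStrategyContributor and SketchIdentifierGeneratorFactory are invented names, the class names in the map are placeholders, and the real DefaultIdentifierGeneratorFactory is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical service contract: lets users contribute or override strategy mappings.
interface IdGeneratorStrategyContributor {
    // Maps strategy names (e.g. "uuid") to generator class names.
    Map<String, String> strategyOverrides();
}

// Hypothetical, simplified factory; NOT the real DefaultIdentifierGeneratorFactory.
class SketchIdentifierGeneratorFactory {
    private final Map<String, String> strategies = new HashMap<>();

    SketchIdentifierGeneratorFactory(IdGeneratorStrategyContributor contributor) {
        // Built-in defaults (placeholder class names).
        strategies.put("uuid", "example.DefaultUuidGenerator");
        strategies.put("sequence", "example.DefaultSequenceGenerator");
        // User-provided mappings win over the defaults.
        if (contributor != null) {
            strategies.putAll(contributor.strategyOverrides());
        }
    }

    // Assumption of this sketch: unknown strategy names are returned as-is.
    String resolve(String strategy) {
        String generatorClassName = strategies.get(strategy);
        return generatorClassName != null ? generatorClassName : strategy;
    }
}
```

The trade-off compared with turning the whole factory into a service is that users can only add or replace mappings; they cannot change how generators are instantiated.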
I also need to implement [HHH-6091](https://hibernate.onjira.com/browse/HHH-6091) Let people customize identifier generator strategy mappings programmatically in Hibernate 4
for which I am considering using the service approach as well.
Let me know if I'm on track
Emmanuel
13 years, 3 months
Developer meetings day/time
by Steve Ebersole
Due to a change in schedule I can no longer do the meetings on Monday at
the same time. To keep it on Monday we would need to move it back an
hour which really is not reasonable for Strong or anyone in Europe for
that matter. Assuming we keep it at about the same time, Wednesday is
also out for me. Which leaves Tuesday, Thursday and Friday. So far the
consensus from talking to people on IRC was Tuesday. Does that work for
everyone?
--
steve(a)hibernate.org
http://hibernate.org
13 years, 3 months
[Search] Sharding and access to (subsets) of index readers and Lucene directories in HS 4.0
by Elmer van Chastelet
Hi all,
Yesterday I had a discussion with Sanne on irc [3] about the new api to
access index readers in HS4.0. We couldn't complete our discussion
yesterday, so let's continue here.
As explained in the forum [1], there is currently no good solution for
getting a reader with a subset of the indexes in a sharded environment.
Currently two basic ideas came to mind:
A - Have a SearchFactory.openIndexReader(Class<?> c,
FullTextFilterImplementor...): This is similar to how the IndexManagers
are gathered at query time, and is therefore probably easy to understand.
B - (to be further reviewed) Have something like
searchFactory.indexReaders().withShardingOptions( X, Y
).includeType(Class<?> z).openIndexReader(). This also adds the ability
to get an IndexReader for multiple classes. But we need to think about
.withShardingOptions (or something similar): what input should we
support here? Sharding is mostly based on one or more entity
properties, which are probably easy to encode as a String. The (custom)
sharding strategy may use such a String to select the proper index
managers. Using a String to identify which index managers to use looks
fine to me. It will be compatible with the current implementation of
custom sharding strategies, where one might use the Lucene document at
addition time. And if an entity instance is also passed (see
discussion [2]), the properties of that entity can probably be encoded
to some String. If HS covered the mapping, i.e. supported Strings as
identifiers for sharding instead of a user-defined mapping to the
(integer) index in the array of IndexManagers, that would be awesome :)
(It would relieve the pain of having to store such a mapping somewhere,
which I currently do.)
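To give idea B something concrete to react to, below is a minimal sketch of such a fluent context driven by String shard keys. Only the method names withShardingOptions and includeType come from the discussion; StringKeyShardingStrategy, IndexReaderContext and resolveIndexManagers are hypothetical, and instead of opening a Lucene IndexReader the sketch merely resolves the names of the index managers that would back it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical strategy: selects index manager names from a String-encoded shard key.
interface StringKeyShardingStrategy {
    List<String> indexManagersFor(Class<?> entityType, String shardKey);
}

// Hypothetical fluent context, private to a single invocation chain.
class IndexReaderContext {
    private final StringKeyShardingStrategy strategy;
    private final List<String> shardKeys = new ArrayList<>();
    private final Set<Class<?>> types = new LinkedHashSet<>();

    IndexReaderContext(StringKeyShardingStrategy strategy) {
        this.strategy = strategy;
    }

    IndexReaderContext withShardingOptions(String... keys) {
        shardKeys.addAll(Arrays.asList(keys));
        return this;
    }

    IndexReaderContext includeType(Class<?> type) {
        types.add(type);
        return this;
    }

    // A real openIndexReader() would return a Lucene IndexReader; this sketch
    // only resolves which index managers the reader would span.
    Set<String> resolveIndexManagers() {
        Set<String> selected = new LinkedHashSet<>();
        for (Class<?> type : types) {
            for (String key : shardKeys) {
                selected.addAll(strategy.indexManagersFor(type, key));
            }
        }
        return selected;
    }
}
```

Because the strategy itself turns the String key into index managers, no separate mapping from keys to array positions has to be stored by the application.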
Still, we need to know the use cases there might be, i.e. which
flexibility the API should offer.
As is also mentioned in [1], there is currently no direct access to the
index managers, so getting an FSDirectory is not possible in
4.0alpha1. I think HS should support this to offer the flexibility to
work on the Lucene indexes directly (for example, to build an
auto-completion/spell-check index from an existing index).
Let's start by setting up some requirements?
---------
*1 Have access to IndexReader for one class
*2 Have access to IndexReader with a subset of IndexManagers based on
sharding strategy. Sharding strategies are mostly based on some
propert(y/ies) of an entity instance, which can likely be encoded to
some String.
*3 Have access to index directories (FSDirectory/...). Unlike previous
versions (< HS4.0) it would be nice if this uses the ShardingStrategy
instance in use, so mapping is completely and exclusively done in a
ShardingStrategy
* ...
---------
Please extend/modify the list of requirements if you think something is
missing/incorrect and drop your ideas/thoughts about the mentioned
ideas.
Elmer
[1] https://forum.hibernate.org/viewtopic.php?p=2448000#p2448000
[2]
http://www.mailinglistarchive.com/html/hibernate-dev@lists.jboss.org/2011...
[3] IRC log:
<elmervc> sannegrinovero, have you read/did you have time to think about
https://forum.hibernate.org/viewtopic.php?p=2448000#p2448000
<sannegrinovero> hi elmervc , yes I've read it. my next thing on the
todo is to make some prototype, as I'm not happy with the current ideas:
<sannegrinovero> elmervc, are you blocked by this? the workaround is
very simple
<sannegrinovero> generally, I'm wondering if we can avoid having to
expose the DirectoryProviders. I would want them gone from the public
API, but of course limitations like this are not acceptable.
<elmervc> sannegrinovero, I'm branching this migration, so it's not
really blocking. But I would like to try the new H core/search, so for
that to work I need access to the subset of indices
<elmervc> What workaround were you thinking about ?
<elmervc> Just construct an index reader/FSDirs myself using 'hardcoded'
paths ?
<sannegrinovero> nono that's ugly..
<sannegrinovero> elmervc, all logic to open this IR is in
org.hibernate.search.impl.ImmutableSearchFactory.openIndexReader(Class<?>...)
<sannegrinovero> elmervc, and it's just a couple of lines to change ;)
<sannegrinovero> the problem is more how to make it easy to consume
<elmervc> Ok, I'll look into that :)
<elmervc> Using filters is not a good idea?
<sannegrinovero> yes I liked your suggestion. but is it enough ?
<sannegrinovero> and how would the methods look like?
<sannegrinovero> (i.e. the signature)
<elmervc> SearchFactory.openIndexReader(Class<?> c,
FullTextFilterImplementor[] filters) , or what do you mean?
<sannegrinovero> I'd prefer SearchFactory.openIndexReader(Class<?> c,
FullTextFilterImplementor... filters)
<elmervc> But I'm not sure if this covers all use cases of sharding
<sannegrinovero> elmervc, the methods don't need necessarily be defined
on the SearchFactory. We can think of something like
searchFactory.indexReaders().withShardingOptions( X, Y
).includeType(Class<?> z).openIndexReader() .. how does that look like?
<sannegrinovero> I'm just tossing out some ideas, but then we should
bring this up to the mailing list.
<elmervc> the .includeType , do you mean that multiple classes can be
included?
<sannegrinovero> yes
<sannegrinovero> basically the indexReaders() method would open a
context, private to this invocation chain only. (i.e. not affecting
other threads invoking .indexReaders() )
<elmervc> Sounds cool. But then we need to think about
the .withShardingOptions, or something similar. For transparency it's
best to have something similar to the methods in the ShardingStrategy
interface
<elmervc> Or something similar to what is done @ querytime, i.e.
FullTextFilterImplementors
<elmervc> The point is, we need to know what other use cases one might
have
<elmervc> That's related to how sharding is done, i.e. ... might be a
field in the doc , full text filter, ...
<elmervc> (doc = doc to be added)
<sannegrinovero> yes exactly I need use cases to understand this, that's
why your feedback is very much appreciated :)
<elmervc> sannegrinovero, For example, our sharding strategy is based on
some field in an entity that is added to the Lucene Document (actually,
it has a @Field anno, and this field is removed from the Lucene Document
in the shardingstrategy.getDirectoryProviderForAddition(...)
<sannegrinovero> elmervc, lol that proves another discussion I had
recently in proposing that we should pass the entity instance and not
the document to the sharding strategy.
<elmervc> It might be useful indeed, but in our case it's easier to use
a Field in the doc, because that field will always have the same name,
i.e. we can reuse the same sharding strategy.
<sannegrinovero> elmervc, this discussion is very interesting but I'm
busy in other chats now which I can't postpone. Could you please
synthesize this and send a mail to the developer list?
13 years, 3 months
[HSEARCH] 4.0.0.Beta 1 planning
by Emmanuel Bernard
Hi Sanne, Hardy and all,
With Alpha 2 out last week it's time to enter the beta cycle.
To get a CR1 as planned, we need to get a Beta1 out to users by September 14th. That's going to be the only Beta.
Can you take the lead on this? During this period I will be off.
I trust you can decide what should be in and what should be left out. The only hard line is the date (14/09, max 15/09). I am still online for two days, we can chat about the feature list.
Emmanuel
13 years, 3 months
AS7 management console and Hibernate management statistics/operations...
by Scott Marlow
I'm wondering what Hibernate related information would be good to
include in the AS7 management console. Perhaps for each
SessionFactory/EntityManagerFactory, we could show statistics and some
factory/EntityManagerFactory operations.
I'm leaning towards not using JMX internally to collect the Hibernate
statistics/invoke factory operations.
Please respond here or on IRC, if you have feedback or are interested in
helping to get Hibernate/JPA in the AS7 management console.
Scott
13 years, 3 months