January 2015 - hibernate-dev - Jboss List Archives

[Search] Native Java serialization support

by Hardy Ferentschik

Hi,

I would like to summarize a discussion we had on IRC to get some more feedback and come to a decision on how to move forward.

I currently need to extend our serialization support for the distributed Search deployment scenarios. In these scenarios we serialize our different LuceneWork instances and send them from slave to master. This includes things like Lucene's Document instances, which are part of add/update operations.

Historically, this need arose when Lucene dropped all serialization support for its classes, which forced us to implement our own custom serialization. To do so we defined an SPI (org.hibernate.search.indexes.serialization.spi.*) and provided two implementations, one based on native Java serialization and one based on Avro [1]. The two implementations are provided as separate artifacts (the serialization/java and serialization/avro modules in our build), and theoretically it should be possible to switch between them by exchanging jar files.

I say theoretically because I found out during my recent work that the Java serialization module is broken in several places. In its current state it would not work. (I guess we never noticed since the default is Avro and we do not even document the possibility of changing the implementation. However, it also shows that no one has even tried.)

The question is, what do we do now? Do we want two implementations, with the Java serialization fixed and then extended with the new functionality (btw, I need to serialize DocValues now)? Or is it time to drop this module, reducing the amount of code we have to maintain and making it a bit easier to implement new serialization requirements?

By dropping the module I mean removing the serialization/java module and leaving everything else in place. So you can still write your own serialization implementation; however, we would provide no alternative to our preferred choice of Avro (which is, afaik, considerably faster than native Java serialization, and that was one of the driving factors for using it).

I think on IRC we already "kind of" agreed that we should drop native Java serialization. I just wanted to put it out once more for everyone to comment/vote.

--Hardy

[1] http://avro.apache.org/
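As a rough illustration of what "native Java serialization" means in the message above, the following sketch round-trips a work item through java.io object streams. It is not taken from the serialization/java module: the class AddWorkSketch and its fields are hypothetical stand-ins for a LuceneWork that carries document fields, and the real SPI is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an "add document" work item; NOT a Hibernate Search class.
final class AddWorkSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String entityId;
    // Assumption: document fields are copied into a plain serializable map,
    // because (per the message above) Lucene's own classes are not serializable.
    final LinkedHashMap<String, String> fields;

    AddWorkSketch(String entityId, Map<String, String> fields) {
        this.entityId = entityId;
        this.fields = new LinkedHashMap<>(fields);
    }

    // Sending side: turn the work item into a byte array.
    static byte[] toBytes(AddWorkSketch work) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(work);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and therefore flushed) before the bytes are read.
        return buffer.toByteArray();
    }

    // Receiving side: rebuild the work item from the byte array.
    static AddWorkSketch fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (AddWorkSketch) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Work class missing on the receiving side", e);
        }
    }
}
```

Calling AddWorkSketch.fromBytes(AddWorkSketch.toBytes(work)) should yield an equal entityId and field map. An Avro-based implementation would instead write the same data against an explicit schema, which is not sketched here.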

9 years, 10 months

[Search] Deprecating the @Key annotation

by Sanne Grinovero

Gunnar made a nice patch [1] to simplify usage of parameterized, cacheable FulltextFilters: these no longer require the user to create a custom key to identify the parameter set.

Documentation-wise, we're struggling to find a good explanation of why someone might still want to use the @Key annotation, besides backward compatibility. It would be nice to simply remove the references to @Key in the documentation, and avoid trying to explain when to choose between the two alternatives.

This would imply deprecating the annotation. Any objections?

Sanne

1 - https://github.com/hibernate/hibernate-search/pull/775

9 years, 10 months

Some migration pains HSearch 5

by Marc Schipperheyn

So, I've started migrating our production environment to HSearch 5 at long last. Some of the initial pains that may warrant some documentation love:

* @IndexedEmbedded basically inverts the default: before HSearch 5, the default was essentially @IndexedEmbedded(includeEmbeddedObjectId=true), whereas now it's essentially @IndexedEmbedded(includeEmbeddedObjectId=false). Inverting defaults seems like a dangerous upgrade choice to me.

* I use a lot of @IndexedEmbedded(includePaths="id") style includes.

    public class MyClass {
        private Long id;

        @Id
        @GeneratedValue(strategy = GenerationType.AUTO)
        @Column(name="userId", nullable=false, insertable=false, updatable=false)
        @DocumentId
        public Long getId() {
            return id;
        }
    }

  I always queried these as follows:

    qb.keyword().onField("id").matching(myLongId).createQuery()

  where the Long would be implicitly converted to a String, and the term query would be +id:1. Now it becomes a NumericRangeQuery, based on the fact that I'm passing a Long. But DocumentIds apparently are still strings by default, so this query fails to deliver results. It makes most sense to me to convert DocumentId to NumericFields, and adding @NumericField to it seems to fix it, but I'm not sure if this could create problems in other areas since this is the documentId. Anyway, this is undocumented.

* The way to access a MutableSearchFactory has changed and is not documented. This is more of an edge case.

That's it for now.

Cheers,
Marc

9 years, 10 months

Hibernate ORM 4.2.18.Final released

by Gail Badner

I am having problems creating a weblog entry on in.relation.to. I will send another announcement with details when that is resolved. Gail Badner Red Hat, Hibernate ORM

9 years, 10 months

[HSEARCH] *.next Jira versions

by Gunnar Morling

Hi, There are many (new?) versions in Jira such as 3.1.next, 3.2.next etc. Are those all needed? 5.x makes sense to me, and maybe 4.5.next, but all the old ones? All these unreleased versions make it a bit unwieldy when assigning a fix version to an issue. Thx, --Gunnar

9 years, 10 months

[Hibernate Search] Donating some of our source code to Infinispan

by Sanne Grinovero

All,

we've discussed several times the issues we have because of the circular dependency between Hibernate Search and Infinispan.

Normally the pain point we aim to address in such discussions is the need to carefully coordinate releases [of Search and Infinispan] in our quest for a final stable release, which brings some release pressure. More recently it has also become a compatibility problem, as the two projects target different platforms/environments and have different life-cycle expectations, creating more maintenance work such as lots of back-porting of patches, which distracts from our goals.

We already discussed some solutions, but none was too convincing. I have a fresh proposal which I feel is more interesting:

# New plan

1) the module "/infinispan" from the Hibernate Search source tree is moved to the Infinispan project into some "hibernate-search-directory" Maven module.
2) Hibernate Search drops any dependency on Infinispan

Reminder: this "/infinispan" module we have contains just a couple of classes, and represents the "DirectoryProvider" implementation which integrates with the Hibernate Search autodiscovery and creates a Directory instance (whose implementation always lived in Infinispan). Most of the code is about applying configuration properties, plus integration tests.

It's a very simple plan, but has some valuable consequences:

- Search can move on without ever needing to wait for Infinispan - and vice versa.
- There is no longer a circular dependency
- Search would be in control of its Lucene dependency, and can upgrade as needed. We could experiment with different Lucene versions without necessarily waiting for Infinispan to solve compatibility issues first. That wait is often made more complex because Infinispan then has to guarantee compatibility with a *range* of Lucene versions, including both the target Lucene and the version currently consumed via Infinispan Query / Hibernate Search Engine.
- Infinispan can *opt* to stick with an older version of Search - or update - provided it can satisfy both a) integration with possible changes to the Search SPI b) an update for possible new requirements of the new Lucene version (the Lucene Directory might need compatibility fixes)
- The dependency structure would better match the one provided by our Enterprise Products. For example, it's the JDG distribution - not Hibernate Search - that provides these integration bits, so it makes more sense to build the Directory against the specific version of Infinispan than against the specific version of Search.
- It will be easier to have new Infinispan code take advantage of features exposed on the Infinispan Lucene Directory - if all changed parts are in the same [Infinispan] repository. This is actually a problem right now, as it's holding back a POC meant to deliver great improvements with the Directory implementation.

And not least we'll have faster builds ;-)

# Drawbacks

First one is that our users will probably need to change some details of how they build.

Infinispan might need to fix some SPI related code before being able to upgrade, but historically our SPI has been extremely stable. The real problem is that when such a thing happens, after we've released a version of Search there might be some time before a compatible version of Infinispan is made available as well. In practice this means the gap of time in which we have to catch up on API changes is "exposed" to end users who want to use our latest version and are possibly blocked. But while they would then see the tip of the iceberg of our integration, I believe it would still take the same amount of calendar time before the working duo is available to them, since with the current model in such a situation we need to wait for the same Infinispan release to happen before we can release ours. So: same time, but we'd have a leaner process, and possibly quicker releases for all users not interested in that - or just benefits in all those scenarios in which we don't break APIs, which is very common.

I've not identified other problems, so my opinion is that these are well worth the benefits.

# Consequences for our users

Not much. Even today we expect our users to depend on several jar files provided by the Infinispan team; this would be just one more. It opens some questions though:

A) Should the Maven group id be changed? I'd expect it to be transferred to "org.infinispan" group at least, and probably need a better artifact id too.

B) License. Our code is LGPL, most of Infinispan is ASL - but not all of it. So I expect it would be possible to keep the existing license at least for now, and defer eventual license changes as a separate step (if people feel need for any change at all).

C) Documentation. Besides the needed updates in Maven coordinates / download sources, I don't expect much of a difference: we'd still explain how to set this integration up.

D) Distribution. Today we distribute this module, and its dependencies, in our release bundles, which implies we distribute a copy of various Infinispan jars. I think we should drop these from our distribution, even though it might seem counter-intuitive: while it might seem convenient to have these included, the whole point of the change would be that there would be more flexibility in which versions of Infinispan would work with Search. And actually the integration tests and this specific knowledge would be the responsibility of Infinispan.

Am I failing to see a more critical issue? How would you all feel about our code being transferred to the different project?

Sanne

9 years, 11 months

Search 5 migration pains: MultiFieldQueryParser and Numeric Fields

by Sanne Grinovero

As reported on SO [1], it's not a straightforward migration for those who embraced the convenience of using a MultiFieldQueryParser, or one of the other Lucene-provided parsers.

In the specific example, I think the right answer would be to use the programmatic API or our DSL. But let's consider the use case in which you want to parse user input from a text input in your application. Using the parser is quite convenient in such a case, as people can express boolean operators and field names, while keeping the UI very simple.

Wouldn't it be nice to have a custom "parser" in our DSL, which essentially mimics the functionality of the MultiFieldQueryParser but takes advantage of our indexing metadata - like we do for the HQL Parser?

Sanne

1 - http://stackoverflow.com/questions/28138308/hibernate-search-5-0-numeric-...

9 years, 11 months

Re: [hibernate-dev] [Hibernate Search] Donating some of our source code to Infinispan

by Sanne Grinovero

On 26 January 2015 at 09:52, Tristan Tarrant <tristan(a)infinispan.org> wrote:
> On 26/01/2015 10:45, Hardy Ferentschik wrote:
>>
>> A) Should the Maven group id be changed? I'd expect it to be
>> transferred to "org.infinispan" group at least, and probably need a
>> better artifact id too.
>> +1 At least group id needs to change.
>
> Yes.
>>
>>> B) License. Our code is LGPL, most of Infinispan is ASL - but not all
>>> of it. So I expect it would be possible to keep the existing license
>>> at least for now, and defer eventual license changes as a separate
>>> step (if people feel need for any change at all).
>>
>> Would it not be easier to change to ASL?
>
> All of the embeddable Infinispan code is ASL. The exception is for the
> server itself which is LGPL (being based on WildFly which is LGPL itself).
> So I'd rather not confuse the issues any more.
>
> Tristan

As long as we're clear that wherever you expose/include Search and parser code it's LGPL ;-) And JPACacheStore, and ..

So I don't think it's a very simple story to explain to users today; one more module wouldn't be a significant change, especially since it's not public API! And remember you're already using the same code today, with the same strings attached. If anything, this should make it easier to change license.

I'm fine to change this to ASL, but I'd prefer we could move forward with it without needing to involve legal matters as a blocking process. As mentioned above, this would make it far easier to develop some improvements which would highly benefit both projects in the short term.

Sanne

9 years, 11 months

Search: upgrade to Apache Lucene 5

by Sanne Grinovero

Apache Lucene 5 is at the release candidate stage now, and might be released before the end of the month. I've been testing it this weekend... and it's still Java7 compatible!

Sorry, I got confused when I previously mentioned it would require Java8: they did indeed switch the "trunk" branch to require Java8, but "trunk" is meant to become 6.0, while branch "5.0" was already branched before that change in requirements. So we could propose a timely update to Lucene 5 without necessarily waiting for it to be a good time for Hibernate Search to upgrade to Java8 / major release.

The initial "damage" amounts to about 600 compile errors; I could resolve approx 230 already as they were trivial. The remaining ones require some more care & investigation, but it seems like we could get it done in 1-2 weeks, if we were to put our focus on this... it definitely looks better than the migration 3 -> 4.

Changes would again (obviously) affect users wherever they use "native" Lucene APIs, but it seems that this time we could get by with zero changes in our own APIs. So we could do this in a minor release without violating our "backwards compatibility policy", but I guess that's arguable: while our APIs would be "drop-in" compatible, the user application wouldn't work with all the changes in Lucene code.

We would need to rewrite areas relating to:

- Faceting
- Filters and filter stacking
- Custom Collectors (i.e. most code of Spatial)
- FieldCache
- Some IndexWriter code related to I/O error handling and locking
- Analyzer

A more extensive preview of changes is documented here: http://people.apache.org/~anshum/staging_area/lucene-solr-5.0.0-RC1-rev16...

We might want to start working soon on a new major already? Whatever we do, we can't make our users wait as long as we did for 4, especially since the upgrade is not as nasty.

Sanne

9 years, 11 months

[OGM] Transaction type "RESOURCE_LOCAL", still JTA is required

by Gunnar Morling

Hi,

For a demo I have an OGM application which defines a persistence unit with transaction type RESOURCE_LOCAL. I thus assumed I wouldn't have to add a JTA implementation to the class path, but actually I'm getting a CNFE (see [1] for the complete trace):

    ClassNotFoundException: Could not load requested class : com.arjuna.ats.jta.TransactionManager

Indeed Arjuna is what we use as TM by default. It is set by OGM's JtaPlatform implementation, which in turn is used by transactions created by OGM's default TransactionFactory [2].

Unless I'm doing something wrong configuration-wise, I feel that requiring a JTA implementation for a non-transactional backend such as MongoDB is confusing and may make users ask whether OGM is doing the right thing.

Would it be feasible to provide an "OGM local" TransactionImplementor + TransactionFactory to be used in such cases where the store does not support transactions (so no rollbacks etc.), but we'd "only" need a trigger for writing out changes to the datastore?

Any thoughts?

--Gunnar

[1] https://gist.github.com/gunnarmorling/ba193caecb7d5cdbd0a4
[2] https://github.com/hibernate/hibernate-ogm/blob/master/core/src/main/java...
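For anyone reproducing this, one quick diagnostic is to probe the class path for the transaction manager class named in the exception before bootstrapping the persistence unit. The sketch below uses only Class.forName; the helper name JtaClasspathProbe is hypothetical, and it only reports whether a class can be located. It says nothing about how OGM selects its JtaPlatform.

```java
// Hypothetical diagnostic helper; not part of Hibernate OGM.
final class JtaClasspathProbe {

    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized, so no static initializers run.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, JtaClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name copied from the exception message quoted above.
        String tm = "com.arjuna.ats.jta.TransactionManager";
        System.out.println(tm + " on class path: " + isOnClasspath(tm));
    }
}
```

If the probe reports false, the ClassNotFoundException above is expected as soon as the default transaction setup described in the message is used.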

9 years, 11 months
