March 2015 - hibernate-dev - Jboss List Archives

The current Hibernate Search sprint: lots of topics!

by Sanne Grinovero

All,

let me clarify the general goal of this sprint. I don't expect to celebrate with a 5.2.0.Final this time, but I'd aim at getting some of the long-standing big tasks done, and finish these three weeks with a 5.2.Beta1.

We need to organize in several parallel significant themes. There are some "big" themes going on which you need to be aware of beyond the granularity of JIRA. Your help in properly inspecting these with experiments and then breaking them down into smaller tasks is what I need the most right now. I'd highly appreciate it if each of you could take on leadership of one of these themes, and get at least one other team member as primary reviewer and brainstorming mate.

These are the primary themes:
- the Faceting refactoring - led by Hardy
- the dynamic types work - led by me
- Hibernate ORM 5 compatibility and testing - almost done
- getting rid of the Infinispan module - led by Gustavo
- a discussion with the WildFly team about how to share the module structure / build / definitions (more on this soon)
- Lucene 5
- R&D: explore better clustering strategies, better master election (or no-master architectures)
- Better integration with ORM's Multi-tenancy - being quite requested recently - Davide?

If we really could upgrade both ORM and Lucene to 5, then we could promote this to a new major release. Of course I'm dreaming and that's not going to happen in practice - not least because that would require an ORM 5.0.0.Final. So what I'm expecting is that we explore the needs for these, and you help me identify which steps are needed to get both of these upgraded in the near future. That means we might be raising more issues than solving them, but that's good as it clarifies which atomic, self-contained and consistent steps we then need to perform to get there.
I'm currently working on ORM5 tasks, and will soon share some PRs of things which could already be merged, but of course the final step won't be applied as we're not really going to upgrade yet - unless we agree we're only releasing betas until ORM is final too.

For Lucene 5, the work which Hardy is doing is essential:
- update the Faceting code
- move our code to use the new FieldDocs
After that, the upgrade won't be that bad (not as hard as Lucene 4).

I just created some JIRAs as "containers" for these larger themes; just please keep in mind that I'm not setting the version to be "5.2" as they will probably span multiple releases. The goal should not be to resolve them, but to start them and split them up into subtasks which can be merged already. I'm pretty sure that several resulting sub-tasks can be merged already. There is a new label in JIRA: "current_sprint", so we can identify them all even though they are not marked to be fixed for version 5.2.

The "R&D" tasks are not in JIRA at all, I'm still gathering requirements - still we'll need to dedicate some time to experimentation and brainstorming.

I realize these are many parallel paths to work on; we're many experienced devs though, and these should be workable in parallel. If each of you can take some leadership on an area I hope we can close them all by the next iteration (except probably the R&D task).

===

That said on the larger themes, there is of course a list of traditional tasks which will shape the 5.2 improvements. These are marked "5.2" on JIRA; some are trivial, like missing javadoc or a paragraph of documentation, but need some figuring out to craft the right docs. Let me comment on these briefly to see if any piques your interest.

# HSEARCH-1848 Replace the Infinispan Directory provider with the one distributed by the Infinispan project
As discussed: we'll remove the module, but need to make sure we can plug in the one distributed by Infinispan. Needs Infinispan to release it first.
# HSEARCH-1214 Review SearchFactory initialization
For our own peace of mind.. the boot process is hard to understand. I have some ideas, and there are many things to keep in mind so I'll probably try to take this myself, but otherwise I'll transfer my brain dump.. best over voice.

# HSEARCH-1472 Broaden collection of built in IndexManager implementations to simplify choice of sensible configurations
As discussed at the team meeting. The goal is to simplify configuration and documentation, and prevent sick configuration choices.

# HSEARCH-1474 MassIndexer needs to avoid being timed out by the TransactionManager
This is high value and long standing, but complex. Gunnar started working on a test.

# HSEARCH-1536 Improve the test suite around MoreLikeThis (association, custom fieldbridge, class bridges)
There are several open tasks around MLT. This is the warm-up point to finalize MLT... I didn't schedule the other tasks for this sprint.

# HSEARCH-1589 ServiceManager closes services too aggressively
A sensible optimisation, probably easy. Beware: concurrency and bootstrap related.

# HSEARCH-1654 Disable merge policy during Massindexing
A great performance optimisation for people doing mass indexing. I think it's trivial, but to verify it you'll need to set up a relatively long run - we have a repository with instructions to reindex the Wikipedia.

# HSEARCH-1681 Index optimisation should commit to publish the performed optimisation
Trivial to do - a one-liner - but not so trivial to test for.

# HSEARCH-1684 ResultTransformer ignores transformList on tuples
No idea, needs to be looked at to make Marc S. happy.

# HSEARCH-1708 Using DistanceSortField does not verify the field parameter passed to the constructor

# HSEARCH-1711 EntityIndexingInterceptor executes on different part of the hierarchy

# HSEARCH-1729 Document the Infinispan configuration property `metadata_writes_async`
This was not documented as it's a highly experimental property.
I was hoping we could run some more tests, but I won't have the time for that at the moment, so either someone volunteers for the test, or we keep it a secret, or decide to document it with warnings.

# HSEARCH-1762 Improve javadocs of builtin bridges

# HSEARCH-1773 org.hibernate.search.backend.impl.WorkVisitor not exported by engine osgi bundle
Or find some alternative way... but whatever the solution, we need to get OSGi to "done" status.

# HSEARCH-1783 Reproduce transaction timeouts during mass indexing
Gunnar is already on it.

# HSEARCH-1793 CriteriaObjectInitializer causes too many object loads in cross hierarchy queries
This one is nasty, we should get rid of it.

# HSEARCH-1803 Infinispan integration test search in the wrong node
Since we're removing the code, we need to apply this as https://issues.jboss.org/browse/ISPN-5339

# HSEARCH-1804 Boost on IndexedEmbedded properties
This really should just work as the user requests.

# HSEARCH-1811 Wildcard with multiple fields
Another sensible usability improvement.

# HSEARCH-1812 Documentation doesn't clearly explain how one obtains the existing SearchIntegrator
Start a documentation section "integrators and framework developers"?

# HSEARCH-1815 Clarify the need to depend on an implementation of SerializationProvider
Apparently we don't state one will be needed ;)

# HSEARCH-1816 Explicitly validate the version of Hibernate ORM
A usability improvement, as proposed on the mailing list. +1 for Gunnar's ideas.

# HSEARCH-1826 Make it possible to test Hibernate Search with preview builds of Hibernate ORM 5
I'm working on this one.

# HSEARCH-1828 Clarify documentation about ways to disable Hibernate Search

# HSEARCH-1839 FieldBridge instance initialization might use reference access to the booting framework
This is needed by the jBPM / Drools teams. At least the programmatic configuration should be trivial.
# HSEARCH-1844 Review which components should no longer be tagged as experimental

# HSEARCH-1847 Create a FSDirectory extension which doesn't ever sync to disk
Requested by Infinispan - might become an urgent requirement soon, better have this ready.


Hibernate ORM 5.0.0.Beta1 release

by Steve Ebersole

Just released 5.0.0.Beta1 : http://in.relation.to/Bloggers/HibernateORM500Beta1Release


[Hibernate Search] Repository Notice: migrating Infinispan integration

by Sanne Grinovero

All,

please do not make changes (or propose patches) to any sources for the Maven module org.hibernate:hibernate-search-infinispan. In other words, anything under the path /infinispan in the repository.

We're currently working to move this module to the Infinispan project, at the following repository: https://github.com/infinispan/infinispan

Of course we'll still maintain and love our integration, it's just much easier to maintain if it is released together with the Infinispan core modules. We'll also migrate some of the integration tests.

Thanks,
Sanne


ORM Jenkins Builds

by Steve Ebersole

I was curious why it took so long to run the master ORM jobs on the CI machine compared to running the job locally. Locally I run `clean test` at the root project quite often and it takes roughly 9-10 minutes. The master CI jobs generally take 45-50 minutes to complete.

So I enabled "Gradle build profiling" in our job. The results were surprising in terms of ratios. I figured findBugs, checkStyle etc. probably added significant time to the build. But I was shocked how much they added. BTW, you can view these profile reports in {root}/build/reports/profile...

So hibernate-core overall took 17m22.19s to run for one job. Of that, 12.5 minutes was findBugs! checkStyle was actually "reasonable" at just under 30s. The ratios were similar across all modules. The aggregatedJavadoc task took a shade over 2m.

Considering that these jobs are run on every check-in (and eventually it would be great to auto-run them against PRs too), plus the fact that we aren't even failing the build for the majority of findBugs/checkStyle hits, I think we should define these jobs a little differently. It's not just the time it takes. Yes, we all hate to wait. But it's also the CI resources taken up.

I'm going to put some thought into this after the 5.0 Beta release, but I wanted to get some thoughts and feedback in the meantime. Things to consider.


Re: [hibernate-dev] Bytecode enhanced, Reference Cached immutable Entities

by Sanne Grinovero

[adding the mailing list]

Generally speaking, it looks like we agree on the direction: EntityEntry needs to be an interface, with some clever logic to select the appropriate implementation.

In your draft you have a single EntityEntryFactory as a global service; I'm wondering if we shouldn't have the possibility to have a different factory implementation per Entry type.. more on this below.

What is your primary differentiator between 'SharedEntityEntry' and 'StatefulEntityEntry'? For our purposes I'd have used different names, but since there's no javadoc yet I wonder if you had different intentions. Personally I'd have chosen something like "ImmutableEntityEntry" and "MutableEntityEntry", where the Mutable one is a rename of the existing implementation, and the Immutable one would be a slimmed-down version which might not need fields such as:
- loadedState (not needed for readonly)
- version (what would be the point)
- ..

A concern I have is to avoid ever needing to "promote" an ImmutableEntityEntry into a MutableEntityEntry: it's easy to mark an existing instance of ImmutableEntityEntry as READ_ONLY, but there is no going back if the entity entry was initially loaded as READ_ONLY. One could think of swapping the existing entity entry, but that could get hairy and defeats the point of optimising object allocations.

Is there a strong guarantee which we can rely on, that if an EntityEntry is marked READ_ONLY at first load, no one will ever need to re-mark it as mutable? If not, the current check in DefaultEntityEntryFactory basing the choice on the current status of the Entity might not be enough; we might need to be a bit more conservative and only base that on getPersister().isMutable()?
The READ_ONLY point which you're leveraging seems to be key for the specific optimisation we have in mind at this point; but generalizing the concept, it seems to me that the choice of EE implementation to use for a specific Entity type will be a consistent choice for the lifecycle of the EntityPersister, depending on immutable flags on the EntityPersister. Which is why I'm suggesting that the EntityPersister should have a dedicated EntityEntryFactory.

Making the EntityEntryFactory a global service would force us to go through all the checks of the EE implementation choice each time, while the choice should always be the same. I wouldn't argue to save a couple of simple "if" evaluations, but it's very possible that some more clever EntityEntryFactory implementations than this current draft might need to do more work, for example consult more Services to call back into OGM metadata. Not least, having a per-type EntityEntryFactory would make it possible to refer to it from some EntityEntry implementations and save some memory around the common state.

Concurrency

Since the goal is to share the ImmutableEntityEntry instance among multiple threads reading it, I'd rather make sure that it is really immutable. For example it now holds onto potentially lazily initialized fields, such as getEntityKey(). If it's not possible to make it really immutable (all final fields), we'll need to make it threadsafe and question the name I'm proposing.

LockMode

From a logical perspective of users, one might think that an entity being "immutable" doesn't necessarily imply I can't lock on it.. right? I'm not sure how far it could be a valid use case, but these patches are implying that one can't lock an immutable entity anymore, as the lock state would be as immutable as anything else on the EntityEntry. Are we good with that? Alternatively one might need to think about separating the lock state handling from the EntityEntry.
On smaller details:

# org.hibernate.engine.internal.SharedEntityEntry is hosted in an .internal package, I don't think it's right to refactor all the public API javadoc which was referring to EntityEntry to now refer to the internal implementation.

# things like EntityEntryExtraState should probably get moved to .internal packages as well now - we couldn't do that before without breaking either encapsulation or APIs.

In terms of git patches, the complexity of the changeset risks getting a bit out of hand. What about we focus on creating a clean pull request which focuses exclusively on making EntityEntry an interface, and moves things to the right packages and javadoc? You'd have a trivial EntityEntryFactory, and we can then build the evolution on top of that, not least maybe helping out by challenging some points in parallel work.

These are the things I'd leave for a second iteration:
- add various implementations of EntityEntry iteratively, as needed
- the strategy such a Factory would use to pick an implementation
- ultimately, make it possible for an integrator to override such a Factory

For example with Hibernate OGM we might want to override / reconfigure the factories to use custom EntityEntry implementations - requirements are not fully clear at this point but it seems likely.

The priority being to define the API, as that would be a blocker for 5.0, we then have better choices to leave smarter and more advanced EntityEntry implementations for the future; we'd still need to implement at least the essential ones to make sure the API of the EntityEntryFactory has all the context it needs.

Thanks,
Sanne

On 24 March 2015 at 09:27, John O'Hara <johara(a)redhat.com> wrote:
> Steve,
>
> Have you had chance to look at this? Do you have any comments/observations?
>
> Thanks
>
> John
>
>
> On 17/03/15 09:24, John O'Hara wrote:
>
> Steve,
>
> I have been having a think about the EntityEntry interface, and have forked a branch here:
>
> https://github.com/johnaoahra80/hibernate-orm/tree/EntityEntryInterface
>
> I know it is nowhere near complete, but was this the sort of idea you had in mind?
>
> Thanks
>
> John
>
>
> On 13/03/15 09:44, John O'Hara wrote:
>
> EntityEntry retains a reference to a persistenceContext internally that org.hibernate.engine.spi.EntityEntry#setReadOnly makes calls to, is this where the session reference is kept? As org.hibernate.engine.spi.PersistenceContext is an interface could we have a different implementation for this use case? e.g. an ImmutablePersistenceContext that could be shared across sessions?
>
> For the bytecode enhancement, could we change the enhancer so that it adds an EntityEntry interface with javassist.ClassPool.makeInterface() as opposed to adding a class with javassist.ClassPool.makeClass()? I need to have a look at javassist to confirm what javassist.ClassPool.makeInterface() does.
>
> Thanks
>
> John
>
> On 12/03/15 18:52, Steve Ebersole wrote:
>
> It is possible. Although some of the changes are particularly painful. Most of EntityEntry, if it is an interface, can be made to work with your use case. org.hibernate.engine.spi.EntityEntry#setReadOnly I think is the one exception, because:
> 1) your use case needs it
> 2) it expects the Session to be available internally (it's not passed)
>
> The bigger thing I am worried about for you is the bytecode stuff, as that ties very tightly with EntityEntry.
>
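The per-type factory idea discussed in this thread can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Hibernate ORM code: `SimplePersister` is a hypothetical stand-in for EntityPersister, the interfaces are reduced to a few methods, and the only input to the choice is a mutability flag fixed at construction time.

```java
// Hypothetical, heavily simplified types for illustration only;
// they do not mirror the actual Hibernate ORM classes.
interface EntityEntry {
    Object getId();
    boolean isReadOnly();
    Object[] getLoadedState(); // null when no snapshot is kept
}

// Slimmed-down entry: all fields final, no loaded state, always read-only.
final class ImmutableEntityEntry implements EntityEntry {
    private final Object id;

    ImmutableEntityEntry(Object id) {
        this.id = id;
    }

    public Object getId() { return id; }
    public boolean isReadOnly() { return true; }
    public Object[] getLoadedState() { return null; }
}

// Full entry: keeps the loaded state and a read-only flag that can change.
final class MutableEntityEntry implements EntityEntry {
    private final Object id;
    private final Object[] loadedState;
    private boolean readOnly;

    MutableEntityEntry(Object id, Object[] loadedState) {
        this.id = id;
        this.loadedState = loadedState;
    }

    public Object getId() { return id; }
    public boolean isReadOnly() { return readOnly; }
    public Object[] getLoadedState() { return loadedState; }
    void setReadOnly(boolean readOnly) { this.readOnly = readOnly; }
}

interface EntityEntryFactory {
    EntityEntry createEntityEntry(Object id, Object[] loadedState);
}

// Stand-in for EntityPersister (an assumption): the factory is picked once,
// from the mutability flag, and reused for every entry of this entity type.
final class SimplePersister {
    private final EntityEntryFactory entityEntryFactory;

    SimplePersister(boolean mutable) {
        this.entityEntryFactory = mutable
                ? (id, state) -> new MutableEntityEntry(id, state)
                : (id, state) -> new ImmutableEntityEntry(id);
    }

    EntityEntryFactory getEntityEntryFactory() {
        return entityEntryFactory;
    }
}
```

Because the decision is made in the constructor, creating an entry never re-evaluates the selection logic, and an immutable type can never hand out an entry that would later need to be "promoted".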


SessionFactory building APIs

by Steve Ebersole

I had not heard anything back in regards to this, so I wanted to ask one more time before I get ready to start cutting 5.0 pre-releases in a week or 2. I'd love to hear feedback of any kind about the new APIs, but specific things I know I personally question:

1) What do you think of the split in MetadataSources and MetadataBuilder? Does the split make sense? Or does it make more sense to combine them into one contract?

2) What do you think of all the overloaded methods named #with taking different argument types, versus distinctly named methods? E.g. MetadataBuilder#with(ImplicitNamingStrategy), MetadataBuilder#with(PhysicalNamingStrategy), etc. rather than MetadataBuilder#withImplicitNamingStrategy(ImplicitNamingStrategy), MetadataBuilder#withPhysicalNamingStrategy(PhysicalNamingStrategy)

Also, I am not so sure about the term "with" anymore. I had chosen that at the time because I thought it flowed nicely with method chaining.
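To make question (2) concrete, here is a small self-contained sketch of the two naming styles. The two strategy interfaces are empty stand-ins declared locally and the builder classes are hypothetical; none of this is the real Hibernate API. It shows one practical difference: with overloads, an argument whose type matches more than one overload needs a cast at the call site.

```java
// Empty stand-in types for illustration only; the real contracts are richer.
interface ImplicitNamingStrategy {}
interface PhysicalNamingStrategy {}

// Style A: a single overloaded method name, resolved by argument type.
final class OverloadedBuilder {
    ImplicitNamingStrategy implicit;
    PhysicalNamingStrategy physical;

    OverloadedBuilder with(ImplicitNamingStrategy strategy) {
        this.implicit = strategy;
        return this;
    }

    OverloadedBuilder with(PhysicalNamingStrategy strategy) {
        this.physical = strategy;
        return this;
    }
}

// Style B: distinctly named methods, no overload resolution involved.
final class NamedBuilder {
    ImplicitNamingStrategy implicit;
    PhysicalNamingStrategy physical;

    NamedBuilder withImplicitNamingStrategy(ImplicitNamingStrategy strategy) {
        this.implicit = strategy;
        return this;
    }

    NamedBuilder withPhysicalNamingStrategy(PhysicalNamingStrategy strategy) {
        this.physical = strategy;
        return this;
    }
}

// Hypothetical user class that happens to implement both contracts.
final class CombinedNamingStrategy implements ImplicitNamingStrategy, PhysicalNamingStrategy {}

final class BuilderStyleDemo {
    public static void main(String[] args) {
        CombinedNamingStrategy combined = new CombinedNamingStrategy();

        // new OverloadedBuilder().with(combined);
        // would not compile: both overloads apply and neither is more specific.
        OverloadedBuilder overloaded = new OverloadedBuilder()
                .with((ImplicitNamingStrategy) combined)
                .with((PhysicalNamingStrategy) combined);

        // The named style needs no cast.
        NamedBuilder named = new NamedBuilder()
                .withImplicitNamingStrategy(combined)
                .withPhysicalNamingStrategy(combined);

        System.out.println(overloaded.implicit == named.implicit);
    }
}
```

The same ambiguity appears when passing a bare `null` to the overloaded form, whereas the named form reads slightly longer but is unambiguous.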


Date/Time Support and timezones

by Steve Ebersole

As I start work on supporting Java 8 Date/Time types, I wanted to get everyone's opinion on handling OffsetDateTime, OffsetTime and ZonedDateTime with regards to timezone. Each represents a date/time in a particular timezone/offset (much like a Calendar). A few options:

1) Forego OffsetDateTime, OffsetTime and ZonedDateTime support and just stick with LocalDateTime, LocalDate and LocalTime.

2) Use the timezone/offset to pass along to the driver (for proper conversion); when reading back we'd have to read back based on the default timezone. This is essentially the old strategy used in CalendarType which I never really liked because it's not reflexive.

3) Break them into a tuple and store each piece. E.g., for OffsetDateTime the tuple is a LocalDateTime (the Timestamp) and a TZ offset. So we'd store each individually in the database and be able to rebuild them in a fully reflexive manner.

4) Handle them using UTC or GMT at the JDBC level. This is essentially the same as (2)
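The difference between option (3) and option (4) can be shown with plain java.time, without any database. The sketch below is only an illustration: `OffsetDateTimeStorage` and `StoredTuple` are hypothetical names, and storing the offset as total seconds is an assumption about the column layout.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, not part of Hibernate: contrasts options (3) and (4).
final class OffsetDateTimeStorage {

    // Option 3: the two values that would be stored in separate columns.
    static final class StoredTuple {
        final LocalDateTime localDateTime;
        final int offsetSeconds;

        StoredTuple(LocalDateTime localDateTime, int offsetSeconds) {
            this.localDateTime = localDateTime;
            this.offsetSeconds = offsetSeconds;
        }
    }

    static StoredTuple toTuple(OffsetDateTime value) {
        return new StoredTuple(value.toLocalDateTime(), value.getOffset().getTotalSeconds());
    }

    static OffsetDateTime fromTuple(StoredTuple stored) {
        return OffsetDateTime.of(stored.localDateTime, ZoneOffset.ofTotalSeconds(stored.offsetSeconds));
    }

    // Option 4: normalize to UTC. The instant survives, the original offset does not.
    static OffsetDateTime normalizeToUtc(OffsetDateTime value) {
        return value.withOffsetSameInstant(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        OffsetDateTime original = OffsetDateTime.of(2015, 3, 24, 9, 27, 0, 0, ZoneOffset.ofHours(2));

        OffsetDateTime viaTuple = fromTuple(toTuple(original));
        OffsetDateTime viaUtc = normalizeToUtc(original);

        System.out.println(original.equals(viaTuple)); // same local date-time and same offset
        System.out.println(original.isEqual(viaUtc));  // same instant on the timeline
        System.out.println(original.equals(viaUtc));   // offsets differ, so not equal
    }
}
```

For ZonedDateTime an offset alone would not be enough to rebuild the value; the zone ID would have to be stored as well.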


GSOC15 OGM

by Adam Stawicki

Do we still have enough time to suggest an additional idea? I am thinking about picking some already existing JIRA issues that are interesting to me, like: [1] [2] [3] [4]. What do you think?

[1] https://hibernate.atlassian.net/browse/OGM-728
[2] https://hibernate.atlassian.net/browse/OGM-768
[3] https://hibernate.atlassian.net/browse/OGM-785
[4] https://hibernate.atlassian.net/browse/OGM-786


"Unified commit"

by Emmanuel Bernard

Interesting, Neo4J 2.2 now does unified commits like we do in Hibernate Search to get several transactions per disk flush in Lucene. http://neo4j.com/blog/neo4j-2-2-0-scalability-performance/
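The general idea of getting several transactions per disk flush can be sketched as below. This is a toy model under stated assumptions, not the code of Hibernate Search, Lucene or Neo4j: `GroupCommitLog` is a hypothetical class, change sets are plain strings, and the disk flush is simulated by a counter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a group ("unified") commit; the flush is simulated.
final class GroupCommitLog {
    private final List<String> pending = new ArrayList<>();
    private final List<String> durable = new ArrayList<>();
    private int flushCount;

    // Called by each transaction: cheap, it only queues the change set.
    synchronized void append(String transactionChanges) {
        pending.add(transactionChanges);
    }

    // Called by a single committer: one simulated disk flush covers every
    // change set queued so far. Returns how many transactions it covered.
    synchronized int flush() {
        if (pending.isEmpty()) {
            return 0; // nothing queued, so no flush is performed
        }
        int flushed = pending.size();
        durable.addAll(pending);
        pending.clear();
        flushCount++; // stands in for one expensive sync to disk
        return flushed;
    }

    synchronized int getFlushCount() {
        return flushCount;
    }

    synchronized int getDurableCount() {
        return durable.size();
    }
}
```

With three calls to `append` followed by one `flush`, all three change sets become durable at the cost of a single simulated sync, which is the amortization the message refers to.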


Gradle build and multiple JDKs

by Steve Ebersole

Ran across an interesting proof-of-concept project for setting up a Gradle build to use multiple JDKs: https://github.com/rwinch/gradle-multi-jdk

Curious what y'all think of this approach versus what we do now with AnimalSniffer...

