[hibernate-dev] Hibernate ORM SQL generation

Steve Ebersole steve at hibernate.org
Sun Aug 23 23:55:13 EDT 2015


Another point I want to discuss up front because it affects tree structure.
Specifically the idea of an "unbounded implicit inheritance" query.  These
are queries like "from java.lang.Object", where the from clause pulls in
"unmapped inheritance".  These are fine, to an extent.  Hibernate has
natively supported these since way back[1].

What is problematic is cases where we have more than one "unmapped
inheritance" reference.  E.g. "from java.lang.Object o1, java.lang.Object
o2".  In fact it's the same difficulty as an unbounded cartesian product,
but here in terms of the number of SQL queries we need to produce/execute.

So I propose that we allow just one "unmapped inheritance" reference per
query.
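
To make the blow-up concrete, here is a minimal sketch.  It is not Hibernate
code: the class name, the method name and the three entity names are made up
for illustration, and it assumes that each combination of concrete mapped
entities corresponds to one SQL query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hibernate.
public class UnmappedInheritanceSketch {

    // Each element of the input lists the mapped entities that one
    // from-clause reference expands to.  The result holds every combination,
    // where one combination stands for one SQL query (an assumption).
    static List<List<String>> expand(List<List<String>> concreteTypesPerReference) {
        List<List<String>> combinations = new ArrayList<>();
        combinations.add(new ArrayList<>());
        for (List<String> concreteTypes : concreteTypesPerReference) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : combinations) {
                for (String type : concreteTypes) {
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(type);
                    next.add(extended);
                }
            }
            combinations = next;
        }
        return combinations;
    }

    public static void main(String[] args) {
        // Pretend java.lang.Object resolves to three mapped entities.
        List<String> mapped = Arrays.asList("Customer", "Order", "Product");

        // "from java.lang.Object o1"
        System.out.println(expand(Arrays.asList(mapped)).size());         // 3

        // "from java.lang.Object o1, java.lang.Object o2"
        System.out.println(expand(Arrays.asList(mapped, mapped)).size()); // 9
    }
}
```

With a single reference the number of queries grows linearly with the number
of mapped entities; with two references it grows with the square, and so on
for each additional reference.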

[1] Reminder to self... another "strict JPQL compliance" consideration.

On Sat, Aug 22, 2015 at 1:16 PM Steve Ebersole <steve at hibernate.org> wrote:

> I got that initial refactoring pushed to my fork...
>
> On Fri, Aug 21, 2015 at 3:51 PM Steve Ebersole <steve at hibernate.org>
> wrote:
>
>> Just a heads up that I started a major refactoring of the antlr4 poc
>> project in preparation for starting to look at this next sql-gen step.
>>
>> First I am making it into a multi-module project.  We will have the
>> hql-parser module, but then also an orm-sql-gen module to be able to play
>> with that part.  This makes sure we are not blending orm concerns into the
>> pure hql parser.
>>
>> Also, I started working on splitting the "semantic query" model out into
>> a separate module as well.  There are a few reasons for this.  I won't go
>> into them all here.  The main one being that HQL is just one producer of
>> this semantic model.  Rather than another long name I went with the
>> acronym SQM (Semantic Query Model) here.  The top package being
>> org.hibernate.sqm.
>>
>> These changes already illustrated some tighter couplings than I had
>> intended, so it was a good exercise.  I'll push once I get those couplings
>> cleaned up.
>>
>> On Fri, Aug 21, 2015 at 2:35 PM andrea boriero <dreborier at gmail.com>
>> wrote:
>>
>>> I haven't seen it, I'm going to read it.
>>>
>>> On 21 August 2015 at 16:54, Steve Ebersole <steve at hibernate.org> wrote:
>>>
>>>> http://www.antlr2.org/article/1170602723163/treewalkers.html
>>>>
>>>> Not sure if y'all have seen this.  It's an old article advocating manual
>>>> tree walking (what we are facing here) over using generated tree walkers.
>>>>
>>>>
>>>>
>>>> On Wed, Aug 19, 2015 at 12:27 PM Steve Ebersole <steve at hibernate.org>
>>>> wrote:
>>>>
>>>>> I agree.  It's my biggest hang up with regard to using Antlr 4.
>>>>> Actually, it's my only hang up with Antlr 4, but it's a huge one.
>>>>>
>>>>> On Tue, Aug 18, 2015 at 9:30 AM andrea boriero <dreborier at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Yes Steve, I'm more familiar with Antlr 4 (but not 3) and I had a
>>>>>> look at your poc.
>>>>>>
>>>>>> Apart from some problems fully understanding the semantic model (due
>>>>>> to my incomplete knowledge of the problem domain),
>>>>>> I agree with you about the simplicity and elegance of the grammar
>>>>>> for HQL recognition and semantic model building.
>>>>>>
>>>>>> What I don't like is the necessity to build our own semantic model
>>>>>> walker/s in order to produce the final SQL.
>>>>>>
>>>>>> On 14 August 2015 at 16:32, Steve Ebersole <steve at hibernate.org>
>>>>>> wrote:
>>>>>>
>>>>>>> We've had a few discussions about this in the past.  As 5.0 is
>>>>>>> getting close to Final (next week), it's time to start contemplating
>>>>>>> our next major tasks.  The consensus pick for that has been the idea
>>>>>>> of a "unified SQL generation engine" along with a shared project for
>>>>>>> the semantic analysis of HQL/JPQL (and recently it was decided to
>>>>>>> include JPA Criteria interpretation here as well).
>>>>>>>
>>>>>>> The central premise is this.  Take the roughly 6 or 7 different
>>>>>>> top-level ways Hibernate generates SQL and combine that into one
>>>>>>> "engine" based on the input of a "semantic tree".  The mentioned
>>>>>>> HQL/JPQL/Criteria shared project will be one producer of such
>>>>>>> semantic trees.  Others would include persisters (for
>>>>>>> insert/update/delete requests) and loaders (for load requests).
>>>>>>>
>>>>>>> We have a lot of tasks for this overall goal still remaining.
>>>>>>>
>>>>>>> We still have to finalize the design for the HQL/JPQL/Criteria to
>>>>>>> semantic tree translator.  One option is to proceed with the Antlr 4
>>>>>>> based approach I started a PoC for.  John has been helping me some
>>>>>>> lately with that.  The first task here is to come to a consensus
>>>>>>> whether Antlr 4 is the way we want to proceed here.  We've been over
>>>>>>> the pros and cons before in detail.  In summary, there is a lot to
>>>>>>> love with Antlr 4.  Our grammar for HQL recognition and semantic
>>>>>>> tree building is very simple and elegant imo.  The drawback is
>>>>>>> clearly the lack of tree walking, meaning that we are responsible
>>>>>>> for writing by hand our walker for the semantic tree.  In fact
>>>>>>> multiple, since each consumer (orm, ogm, search) would need to write
>>>>>>> their own.  And if we decide to build another AST while walking the
>>>>>>> semantic tree, we'd end up having to hand-write yet another walker
>>>>>>> for those.
>>>>>>>
>>>>>>> What I mean by that last part is that there are 2 ways we might
>>>>>>> choose to deal with the semantic tree.  For the purpose of
>>>>>>> discussion, let's look at the ORM case.  The first approach is to
>>>>>>> simply generate the SQL as we walk the semantic tree; this would be
>>>>>>> a 2-phase interpretation approach (input -> semantic tree -> SQL).
>>>>>>> That works in many cases.  However it breaks down in other cases.
>>>>>>> This is exactly the approach our existing HQL translator uses.  The
>>>>>>> other approach is to use a 3-phase translation (input ->
>>>>>>> semantic-tree -> semantic-SQL-tree(s) -> SQL).  This gives a hint to
>>>>>>> one of the major problems.  One source "semantic" query will often
>>>>>>> correspond to multiple SQL queries; that is hard to manage in the
>>>>>>> 2-phase approach.  And not to mention integrating things like
>>>>>>> follow-on fetches and other enhancements we want to gain from this.
>>>>>>> My vote is definitely for 3 or more phases of interpretation.  The
>>>>>>> problem is that this is exactly where Antlr 4 sort of falls down.
>>>>>>>
>>>>>>> So first things first... we need to decide on Antlr 3 versus Antlr 4
>>>>>>> (versus some other parser solution).
>>>>>>>
>>>>>>> Next, on the ORM side (every "backend" can decide this individually)
>>>>>>> we need to decide on the approach for semantic-tree to SQL
>>>>>>> translation, which somewhat depends on the Antlr 3 versus Antlr 4
>>>>>>> decision.
>>>>>>>
>>>>>>> We really need to decide these things ASAP and get moving on them as
>>>>>>> soon as ORM 5.0 is finished.
>>>>>>>
>>>>>>> Also, this is a massive undertaking with huge gain potentials for
>>>>>>> not just ORM.  As such we need to understand who will be working on
>>>>>>> this.  Sanne, Gunnar... I know y'all have a vested interest and a
>>>>>>> desire to work on it.  John, I know the same is true for you.
>>>>>>> Andrea?  Have you had a chance to look over the poc and/or get more
>>>>>>> familiar with Antlr?

