Joining the party late; I've just read it all up.
I'm highly +1 on multiple phases; I think they're a requirement for
several of the use cases below.
About ANTLR4: it sounds better than ANTLR3 for maintenance reasons,
but I don't think we can grasp the full picture of all the things we
want to do and declare "it's going to work" without moving ahead
quickly. Failure shouldn't scare us away from experimenting: in the
worst case we'll end up with a better picture of what needs to be done.
My vote is to commit to ANTLR4 and try hard to make it work. We could
try to keep it isolated so that it could potentially be replaced, but
I'd rather focus on implementing it all than on making it a
replaceable engine; once we have all ORM entry points pointing to the
new engine we'll be in a better position to consider replacing it. I
realize it's a risk, but I'm confident that at worst it will take more
work than expected: since it expects hand-coding of the walker, it's
hardly going to be a dead end.
Could we summarize the requirements? These are the ones I'm aware of,
listed without too much detail since we've all discussed them:
- Let the database-specific Dialects customize the SQL (way more than today)
- Let Criteria & Loaders generate the same structures (to get the
Dialect customization and caching capabilities)
- Let OGM generate non-SQL output (significantly different walkers,
including for Criteria output & co)
- Be able to cache the parsing phase: e.g. reuse an intermediate
tree as an immutable, threadsafe structure, BUT w/o baking the
parameter values into the cached tree
- Validate such a query w/o having parameter values (tooling, bootstrap
validation of named queries)
- Diagnostics: attach comments & purpose explanations to nodes,
pretty-printing in the logs, etc., as with today's parser.
I think that's the requirements summary?
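As a side note, here's a minimal sketch of the "cacheable, parameter-free tree" requirement: the cache is keyed on the query string alone, so the same immutable tree is reused no matter which parameter values get bound later. All class and method names below are invented for illustration, not actual Hibernate APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for the real immutable, threadsafe semantic tree.
final class SemanticTree {
    final String normalizedHql;
    SemanticTree(String normalizedHql) { this.normalizedHql = normalizedHql; }
}

// Hypothetical plan cache: parse once per HQL string, reuse the tree.
final class QueryPlanCache {
    private final Map<String, SemanticTree> cache = new ConcurrentHashMap<>();

    SemanticTree getOrParse(String hql) {
        // computeIfAbsent guarantees a single parse per distinct query string.
        return cache.computeIfAbsent(hql, SemanticTree::new);
    }
}

class PlanCacheSketch {
    public static void main(String[] args) {
        QueryPlanCache cache = new QueryPlanCache();
        SemanticTree first  = cache.getOrParse("from Order o where o.id = :id");
        SemanticTree second = cache.getOrParse("from Order o where o.id = :id");
        // Same immutable tree instance, regardless of the :id value bound later.
        System.out.println(first == second); // prints "true"
    }
}
```

The point of the sketch is that parameter *names* live in the cached tree while parameter *values* stay outside it, which is also what enables validating named queries at bootstrap without any values at hand.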
A comment on implementation:
Could we consider the need for multiple SQL statements as a detail of
one of the last phases?
My concern with that is that different, non-SQL technologies (OGM, but
even ORM if we eventually map to custom stored procedures) will produce
very different concrete execution plans, yet those concrete execution
plans should also be cacheable.
So I think that highlights the need for either an additional phase, or
simply for "the bottom phase" to be able to generate more phases in
turn and reuse the Query Plan Caching SPI as needed, i.e. a variable
number of phases.
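A rough sketch of the "variable phases" idea, assuming each phase is simply a function from one representation to the next and the bottom phase may fan out into several executable statements (all names here are illustrative, not a proposed API):

```java
import java.util.List;
import java.util.function.Function;

// A phase maps one query representation to the next.
interface Phase<I, O> extends Function<I, O> {}

class VariablePhases {
    public static void main(String[] args) {
        // Upstream phase: HQL text -> (stand-in for) the semantic tree.
        Phase<String, String> parse = hql -> "semantic(" + hql + ")";

        // Bottom phase fans out: one semantic tree, several statements.
        Phase<String, List<String>> render =
            tree -> List.of(tree + "->stmt1", tree + "->stmt2");

        List<String> statements = render.apply(parse.apply("from A"));
        System.out.println(statements.size()); // prints "2"
    }
}
```

Because each phase is just a function, the pipeline's depth doesn't have to be fixed: a backend could insert extra phases, and each phase's output is a natural candidate for the plan caching SPI.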
Finally, the holy grail of persistence would be to switch between
different storage technologies depending on the query semantics. A
simple example: the interaction with 2nd-level caches could be
smarter. A more complex one: an OGM application using a hybrid storage
setup such as an RDBMS plus an in-memory data grid; when you know some
data can be found in either store, you have to choose where to load it
from, and that choice will depend on what kind of query we're dealing
with.
Another example is the interaction with Hibernate Search: people often
want to filter a result with both Criteria and a full-text query.
Not sure if we want to think about hybrid storage in the early phases already..
On 26 August 2015 at 08:09, Gunnar Morling <gunnar(a)hibernate.org> wrote:
> The other approach is to use a 3-phase translation (input
> -> semantic-tree -> semantic-SQL-tree(s) -> SQL). This gives a hint to one
> of the major problems. One source "semantic" query will often correspond
> to multiple SQL queries; that is hard to manage in the 2-phase approach.
In which situations will this happen? I can see inheritance where a
HQL query targeting a super-type needs to be translated into a SQL
query per sub-type table. What others are there?
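To make the inheritance case concrete, here's a toy illustration (entity and table names invented) of one HQL query against a super-type fanning out into one SQL statement per sub-type table, as with a table-per-concrete-class mapping:

```java
import java.util.List;

// Illustrative only: the HQL targets the Payment super-type, but with
// table-per-class inheritance each sub-type lives in its own table,
// so the single semantic query yields one SQL statement per table.
class InheritanceFanOut {
    static List<String> sqlFor(String hql) {
        // Hard-coded stand-in for the real translation of this one query.
        return List.of(
            "select ... from credit_card_payment where amount > 100",
            "select ... from bank_transfer_payment where amount > 100");
    }

    public static void main(String[] args) {
        List<String> statements = sqlFor("from Payment p where p.amount > 100");
        System.out.println(statements.size()); // prints "2"
    }
}
```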
For the purposes of OGM this phase ideally would not be tied to SQL,
as we face the same task with non-SQL backends. I.e. it'd be
beneficial to have input -> semantic-tree ->
semantic-output-query-tree(s) -> (SQL|non-SQL query). The
"semantic-output-query-tree(s)" would be an abstract representation of
the queries to be executed, e.g. referencing the table name(s), but it
would be unaware of SQL specifics.
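A minimal sketch of such an SQL-agnostic output tree with per-backend renderers. All names are invented, and the MongoDB-style output is only there for contrast with SQL:

```java
// Abstract output query: knows the table and filter, but no SQL syntax.
record TableQuery(String table, String filterColumn, String paramName) {}

// Each backend supplies its own renderer for the abstract tree.
interface Renderer { String render(TableQuery q); }

class BackendRenderers {
    static final Renderer SQL = t ->
        "select * from " + t.table()
            + " where " + t.filterColumn() + " = :" + t.paramName();

    static final Renderer MONGO = t ->
        "db." + t.table()
            + ".find({" + t.filterColumn() + ": $" + t.paramName() + "})";

    public static void main(String[] args) {
        TableQuery q = new TableQuery("orders", "id", "id");
        System.out.println(SQL.render(q));   // SQL flavour
        System.out.println(MONGO.render(q)); // non-SQL flavour
    }
}
```

The same `TableQuery` instance (and hence the same cached upstream phases) feeds both renderers; only the last, backend-specific step differs.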
hibernate-dev mailing list