Steven Hawkins commented on TEIID-5680:
---------------------------------------
This isn't really an expand issue, it's just a general planning question. At
least on master I see this is planned as expected if cardinality metadata is added to
FDBProducts and Diary.
The odata layer generates the query:
/*+ cache(ttl:300000 scope:USER) */ SELECT g0.idDiaryEntry, g0.AmountInG, /*+ MJ */
(SELECT ARRAY_AGG((g1.idCode, g1.product_name, g1.brands, g1.energy_100g) ORDER BY
g1.idCode) FROM my_nutri_diary.FDBProducts AS g1 WHERE g1.idCode = g0.fkIdProductCode)
FROM my_nutri_diary.Diary AS g0 WHERE ((g0.AddedDateTime >= CONVERT(?, timestamp)) AND
(g0.AddedDateTime <= CONVERT(?, timestamp))) AND (g0.MealNumber = CONVERT(?, string))
ORDER BY g0.idDiaryEntry LIMIT 0, 100 /* gKjgnveX1UoB */
For 12.0 (TEIID-4498) that is rewritten as a join:
/*+ cache(ttl:300000 scope:USER) */ SELECT g0.idDiaryEntry, g0.AmountInG, X__1.expr1 AS
expr3 FROM my_nutri_diary.Diary AS g0 LEFT OUTER JOIN (SELECT ARRAY_AGG((g1.idCode,
g1.product_name, g1.brands, g1.energy_100g) ORDER BY g1.idCode) AS expr1, g1.idCode FROM
my_nutri_diary.FDBProducts AS g1 GROUP BY g1.idCode) AS X__1 ON g0.fkIdProductCode =
X__1.idCode WHERE (g0.AddedDateTime >= ?) AND (g0.AddedDateTime <= ?) AND
(g0.MealNumber = ?) ORDER BY g0.idDiaryEntry LIMIT 100
Note that the aggregation is still being performed in the inline view rather than above
the join. This is because the logic allows for multiple such aggregations (such as would
be used with sibling expands) rather than recognizing that the aggregation can be raised
higher with more grouping - there is room for improvement there, as it would allow the
join to be pushed.
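As a rough sketch of what that improvement could look like - this is hypothetical planner output, not what is currently produced, and it glosses over null handling for unmatched Diary rows - the grouping would be raised above the join so the whole statement becomes a single pushable query:

```sql
SELECT g0.idDiaryEntry, g0.AmountInG,
       ARRAY_AGG((g1.idCode, g1.product_name, g1.brands, g1.energy_100g)
                 ORDER BY g1.idCode) AS expr3
FROM my_nutri_diary.Diary AS g0
LEFT OUTER JOIN my_nutri_diary.FDBProducts AS g1
  ON g0.fkIdProductCode = g1.idCode
WHERE (g0.AddedDateTime >= ?) AND (g0.AddedDateTime <= ?)
  AND (g0.MealNumber = ?)
GROUP BY g0.idDiaryEntry, g0.AmountInG
ORDER BY g0.idDiaryEntry LIMIT 100
```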
With cardinality metadata on both Diary and FDBProducts, for example:
CREATE FOREIGN TABLE FDBProducts (
...
) OPTIONS(NAMEINSOURCE '`FDBProducts`', UPDATABLE 'TRUE', CARDINALITY 600000);
The value doesn't have to be exact, it just needs to convey an approximate row count.
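If the schema is already deployed, the option can also be added without rewriting the CREATE statement - a sketch using Teiid's ALTER OPTIONS syntax, with the approximate counts mentioned in this issue:

```sql
ALTER FOREIGN TABLE FDBProducts OPTIONS (ADD CARDINALITY 600000);
ALTER FOREIGN TABLE Diary OPTIONS (ADD CARDINALITY 20);
```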
With that I do see a good plan being created - it will use a dependent join to take the
key values from Diary and supply them as a predicate to FDBProducts:
SELECT g_0.idCode, (g_0.idCode, g_0.product_name, g_0.brands, g_0.energy_100g) FROM
my_nutri_diary.FDBProducts AS g_0 WHERE g_0.idCode IN (<dependent values>)
There is still an improvement to be had there though: the rule to push limit values
isn't run until after the join type has been determined - we should instead see that
Diary is limited to 100 rows and choose a dependent join regardless of the FDBProducts
size.
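To illustrate: the independent (Diary) side of the join is already restricted to at most 100 rows by a source query of roughly this shape, so at most 100 key values would ever feed the dependent predicate, no matter how large FDBProducts is:

```sql
-- roughly the left-hand source query; at most 100 fkIdProductCode values result,
-- which bounds the size of the IN (<dependent values>) list above
SELECT g_0.idDiaryEntry, g_0.AmountInG, g_0.fkIdProductCode
FROM my_nutri_diary.Diary AS g_0
WHERE (g_0.AddedDateTime >= ?) AND (g_0.AddedDateTime <= ?)
  AND (g_0.MealNumber = ?)
ORDER BY g_0.idDiaryEntry LIMIT 100
```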
I'd first rerun your scenario with the cardinality metadata, then we can determine if
you need additional changes.
Improve performance of odata expand operations
----------------------------------------------
Key: TEIID-5680
URL:
https://issues.jboss.org/browse/TEIID-5680
Project: Teiid
Issue Type: Enhancement
Components: OData
Reporter: Christoph John
Assignee: Steven Hawkins
Priority: Major
Attachments: test2.txt
Hello Ramesh and Steven,
this is a follow-up regarding an observation in the discussion from TEIID-5643. I thought
I would open a separate issue for the topic, as it does not seem to be related to
TEIID-5500.
As you already know, I am using SAPUI5 as the frontend for OData requests. SAPUI5
supports binding a user interface control group (such as a list with its list items) to
only a single OData path at a time. If the control group items require additional
information that is stored in a different table in the database, I have to expand those
properties in the OData query.
When doing so, I am running into a serious performance issue with Teiid, which would
render the approach of using SAPUI5 with Teiid infeasible if we cannot find a way to
speed it up. At the moment I have a small table (Diary, with about 20 records) from which
the query extracts several items (just a single one in the example given below). The
filtered item is then expanded with data from a larger table in the database
(FDBProducts, with about 680,000 records). The whole query takes about 15 s to process.
The query is given as:
https://morpheus.fritz.box/odata4/svc/my_nutri_diary/Diary?$select=Amount...
I checked the output when using:
<logger category="org.teiid.CONNECTOR"><level name="TRACE"/></logger>
This shows the problem. It seems the join operation is not pushed down to the database;
instead the data are joined within Teiid. Teiid therefore downloads the entire dataset
of the large FDBProducts table, which makes the expand approach infeasible for
real-world datasets beyond a certain size. So my question is: can you modify Teiid to
push the entire join operation down to the underlying database (I assume this would be
the most efficient approach), or, if that is not possible, to query only those items
from the joined table that match the rows filtered from the first table?
Thanks for your help.
Christoph
--
This message was sent by Atlassian Jira
(v7.12.1#712002)