[JBoss JIRA] (TEIID-5680) Improve performance of odata expand operations

Monday, 11 March 2019

    [ https://issues.jboss.org/browse/TEIID-5680?page=com.atlassian.jira.plugin... ]

Steven Hawkins commented on TEIID-5680:
---------------------------------------

With 11.2.x the rewritten OData query is also expressed as a join:

/*+ cache(ttl:300000 scope:USER) */ SELECT g0.idDiaryEntry, g0.AmountInG, X__1.expr1 AS
expr3 FROM my_nutri_diary.Diary AS g0 LEFT OUTER JOIN (SELECT ARRAY_AGG((g1.idCode,
g1.product_name, g1.brands, g1.energy_100g) ORDER BY g1.idCode) AS expr1, g1.idCode FROM
my_nutri_diary.FDBProducts AS g1 GROUP BY g1.idCode) AS X__1 ON g0.fkIdProductCode =
X__1.idCode WHERE (g0.AddedDateTime >= ?) AND (g0.AddedDateTime <= ?) AND
(g0.MealNumber = ?) ORDER BY g0.idDiaryEntry LIMIT 100

And with the cardinality hints, it does produce an acceptable plan.

...
  I mean does Teiid behave differently with different orders of
magnitude for the cardinality or is there just something like small or large table
depending on a given threshold?  
There is some behavioral difference for "small" sizes - less than a single
batch, typically 256 rows.  Beyond that, relative approximate sizes are all that is needed.
Costing routines above the source node level are not based upon full column-level
histograms, but you can refine things further by setting the column-level DISTINCT_VALUES,
NULL_VALUE_COUNT, MAX_VALUE, and MIN_VALUE values as well.  Generally, just setting the
table cardinality is sufficient to correctly influence join planning.
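
As a rough illustration of the above, the table cardinality and the optional column-level
values can be given as OPTIONS in Teiid DDL. This is only a sketch: the column types and
the key definition are assumptions, and the row counts are the approximate figures from the
issue description (about 20 Diary rows, about 680.000 FDBProducts rows).

```sql
-- Sketch only: column types and the primary key are assumed, not taken from the real schema.
CREATE FOREIGN TABLE FDBProducts (
    -- Column-level statistics are optional refinements.
    idCode long OPTIONS (DISTINCT_VALUES 680000, NULL_VALUE_COUNT 0),
    product_name string,
    brands string,
    energy_100g double,
    PRIMARY KEY (idCode)
) OPTIONS (CARDINALITY 680000);  -- approximate row count of the large table

CREATE FOREIGN TABLE Diary (
    idDiaryEntry long,
    AmountInG double,
    fkIdProductCode long,
    AddedDateTime timestamp,
    MealNumber integer,
    PRIMARY KEY (idDiaryEntry)
) OPTIONS (CARDINALITY 20);      -- small table, below a single batch
```

Only the relative sizes matter here, so rounded estimates should be enough.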

...
  Is this something I need to update in the lifecycle when tables grow
larger or do I just make an educated guess what the future will look like? 
If the VDB imports the metadata itself, then it will pick up fresh estimates at import
time - which can be triggered by either not caching the source metadata or deleting the
metadata and reloading the vdb.

If the VDB specifies the metadata, there is not yet a built-in facility that will attempt
to update its costing statistics automatically at runtime.  There are several ways to do
it yourself, including a custom metadata repository and the alter statement, which can be run
on an ephemeral basis without a metadata repository to set the cardinality of a table.  In our
OpenShift environment we will likely implement runtime update of costing metadata from
source and from query results, as we'll have a well-defined persistent store handy.
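
A hedged sketch of what setting the cardinality could look like follows. The first form is
the DDL-style ALTER with OPTIONS, and the second uses the SYSADMIN.setTableStats system
procedure. Which of these is available, and whether the option must be given with ADD or
SET, depends on the Teiid version and on whether the option is already defined, so treat
both as assumptions to check against the documentation for your release.

```sql
-- DDL form: use ADD instead of SET if CARDINALITY has not been defined on the table yet.
ALTER FOREIGN TABLE FDBProducts OPTIONS (SET CARDINALITY 680000);

-- Runtime form via the system procedure; without a metadata repository
-- the change does not survive a restart.
CALL SYSADMIN.setTableStats(tableName => 'my_nutri_diary.FDBProducts', cardinality => 680000);
```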

So if you fully specify the metadata as DDL, then it may need to be updated if the relative
sizes are no longer representative.

...
 Improve performance of odata expand operations
 ----------------------------------------------

                 Key: TEIID-5680
                 URL: https://issues.jboss.org/browse/TEIID-5680
             Project: Teiid
          Issue Type: Enhancement
          Components: OData
            Reporter: Christoph John
            Assignee: Steven Hawkins
            Priority: Major
         Attachments: test2.txt

 Hello Ramesh and Steven,
 this is a follow-up regarding an observation in the discussion from TEIID-5643. I thought
I would open a separate issue for the topic, as it does not seem to be related to TEIID-5500. 
 As you already know, I am using SAPUI5 as frontend for ODATA requests. SAPUI5 supports
binding of a user interface control group (like a list with its list items) to a single
ODATA path at a time only. If the control group items require additional information which
is stored in a different table in the database, I have to expand those parameters in the
odata query.
 When doing so, I am running into a serious performance issue with Teiid, which would render
the approach of using SAPUI5 with Teiid infeasible if we cannot find a way to speed it up.
At the moment I have a small table with entries (table Diary with about 20 records)
for which the query extracts several items (just a single one in the example given below).
The filtered item is then expanded with data from a larger table in the database
(FDBProducts with about 680.000 records). The whole query takes about 15s to be processed.
The query is given as:

https://morpheus.fritz.box/odata4/svc/my_nutri_diary/Diary?$select=Amount...
 I checked the output when using
  <logger category="org.teiid.CONNECTOR"><level name="TRACE"/></logger>
 This shows the problem. It seems the join operation is not pushed down to the database;
instead, the data are joined within Teiid. Teiid therefore downloads the entire dataset
of the large FDBProducts table, which makes the expand approach infeasible for real-world
datasets of a certain size. So my question is whether you can modify Teiid to push down the
entire join operation to the underlying database (I assume this would be the most
efficient approach) or, if that is not possible, to query only those rows of the joined
table that match the rows filtered from the first table.
 Thanks for your help.
  Christoph 
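
For illustration, the second alternative asked for above (fetching only the product rows
that match the filtered Diary rows) would correspond to source queries roughly like the
following. This is a sketch built from the column names in the rewritten query in the
comment, not actual Teiid plan output; the number of IN-list placeholders is arbitrary.

```sql
-- Step 1: filter the small table first, using the predicates from the OData request.
SELECT g0.idDiaryEntry, g0.AmountInG, g0.fkIdProductCode
FROM my_nutri_diary.Diary AS g0
WHERE g0.AddedDateTime >= ? AND g0.AddedDateTime <= ? AND g0.MealNumber = ?
ORDER BY g0.idDiaryEntry
LIMIT 100;

-- Step 2: fetch only the matching rows of the large table.
-- The IN list is filled with the fkIdProductCode values returned by step 1.
SELECT g1.idCode, g1.product_name, g1.brands, g1.energy_100g
FROM my_nutri_diary.FDBProducts AS g1
WHERE g1.idCode IN (?, ?, ?);
```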

--
This message was sent by Atlassian Jira
(v7.12.1#712002)
