[teiid-issues] [JBoss JIRA] (TEIID-5680) Improve performance of odata expand operations

Christoph John (Jira) issues at jboss.org
Mon Mar 11 15:35:00 EDT 2019


    [ https://issues.jboss.org/browse/TEIID-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706816#comment-13706816 ] 

Christoph John commented on TEIID-5680:
---------------------------------------

Hello Steven,
great. The query planning works as soon as I do it right :) The query now executes in 77 ms rather than 15 s. For the moment this is fast enough. I hope the times stay similar once my Diary table grows.

I have a further question regarding automatic updates of the cardinality values in the VDB. In the Teiid Designer Importer->Teiid Connection->Source Model I have found the option "Import Statistics". Its tooltip says

"true will use database dependent logic to determine the cardinality if non is determined. Not yet supported by all databases - currently only supported by oracle and mysql"

I have used this feature now. It does create cardinality values for the VDB. However, it does not give me the cardinality value of 600,000 that you proposed, but a value of 100 instead. The other tables get values between 1 and 4. These values also do not represent the exact row counts of the tables.

1. Question: Is the cardinality value just a relative/normalized value that has to be seen in relation to the other tables, so that it does not matter if it is set to 100 as long as the difference to the other tables is large? Or should I rather set it to 600,000?
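For reference, Teiid's DDL metadata allows stating a cardinality explicitly as a table option, so a fixed value of 600,000 could also be set by hand in a dynamic VDB. A minimal sketch, using the FDBProducts columns mentioned in this thread; the column types here are assumptions, not taken from the actual schema:

```sql
-- Sketch of Teiid DDL metadata for a dynamic VDB.
-- Column types are assumed; CARDINALITY is the row-count hint
-- the planner uses when costing joins.
CREATE FOREIGN TABLE FDBProducts (
    idCode string PRIMARY KEY,
    product_name string,
    brands string,
    energy_100g double
) OPTIONS (CARDINALITY 600000);
```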

>> If the VDB imports the metadata itself, then it will pick up fresh estimates at import time 

2. Question: If I use the "Import Statistics" feature and create a dynamic VDB that Teiid loads on WildFly startup, will Teiid then update the cardinality value each time the VDB is loaded, or do I need to set some additional flags in Teiid Designer for this to happen? And which flags would those be?
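For what it's worth, in a dynamic VDB the JDBC importer is configured through model properties prefixed with "importer.", which are applied each time the metadata is loaded. A sketch, assuming a placeholder data source JNDI name and model name:

```xml
<vdb name="my_nutri_diary" version="1">
  <model name="SourceModel" type="PHYSICAL">
    <!-- importer.* properties drive metadata import at VDB load time;
         importStatistics mirrors the Designer "Import Statistics" option -->
    <property name="importer.importStatistics" value="true"/>
    <!-- the JNDI name below is a placeholder, not from this thread -->
    <source name="mysql" translator-name="mysql5"
            connection-jndi-name="java:/MyNutriDiaryDS"/>
  </model>
</vdb>
```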

3. Or am I hoping for too much in my second question, and the exported cardinality just becomes a static value in the VDB that is never updated? At least that would explain the OpenShift approach you have sketched. In the latter case, would the OpenShift implementation also become open source, so that it would be possible to add such a metadata store as a service to my docker-compose setup?



> Improve performance of odata expand operations
> ----------------------------------------------
>
>                 Key: TEIID-5680
>                 URL: https://issues.jboss.org/browse/TEIID-5680
>             Project: Teiid
>          Issue Type: Enhancement
>          Components: OData
>            Reporter: Christoph John
>            Assignee: Steven Hawkins
>            Priority: Major
>         Attachments: test2.txt
>
>
> Hello Ramesh and Steven,
> this is a follow up regarding an observation in the discussion from TEIID-5643. I thought I would open an extra issue for this topic, as it does not seem to be related to TEIID-5500.
> As you already know, I am using SAPUI5 as the frontend for OData requests. SAPUI5 supports binding a user interface control group (like a list with its list items) to only a single OData path at a time. If the control group items require additional information that is stored in a different table in the database, I have to expand those properties in the OData query.
> When doing so, I am running into a serious performance issue with Teiid, which would render the approach of using SAPUI5 with Teiid infeasible if we cannot find a way to speed it up. At the moment I have a small table (Diary, with about 20 records) from which the query extracts several items (just a single one in the example given below). The filtered item is then expanded with data from a larger table in the database (FDBProducts, with about 680,000 records). The whole query takes about 15 s to process. The query is:
> https://morpheus.fritz.box/odata4/svc/my_nutri_diary/Diary?$select=AmountInG,idDiaryEntry&$expand=fkDiaryToFDBProducts($select=brands,energy_100g,idCode,product_name)&$filter=AddedDateTime%20ge%202019-03-06T00:00:00%2B01:00%20and%20AddedDateTime%20le%202019-03-07T00:00:00%2B01:00%20and%20MealNumber%20eq%20%270%27&$skip=0&$top=100
> I checked the output when using
>  <logger category="org.teiid.CONNECTOR"><level name="TRACE"/></logger>
> This shows the problem. It seems the join operation is not pushed down to the database; the data are instead joined within Teiid. Teiid therefore downloads the entire dataset of the large FDBProducts table, which makes the expand approach infeasible for real-world datasets of any significant size. So my question is: can you modify Teiid to push down the entire join operation to the underlying database (I assume this would be the most efficient approach), or alternatively, if that is not possible, query only those items from the joined table that match the rows filtered from the first table?
> Thanks for your help.
>  Christoph



--
This message was sent by Atlassian Jira
(v7.12.1#712002)

