]
Steven Hawkins resolved TEIID-5331.
-----------------------------------
Fix Version/s: 10.3
10.1.4
10.2.2
Resolution: Done
lead/lag were incorrectly implemented as row value functions (row_number), rather than
normal window functions, which caused the logic to compute the wrong offset index for the
lead/lag row.
LEAD/LAG ignores ORDER BY in the OVER clause
--------------------------------------------
Key: TEIID-5331
URL:
https://issues.jboss.org/browse/TEIID-5331
Project: Teiid
Issue Type: Bug
Components: Query Engine
Affects Versions: 8.12.13.6_4
Reporter: Andrej Šmigala
Assignee: Steven Hawkins
Priority: Blocker
Fix For: 10.3, 10.1.4, 10.2.2
Consider the following query:
{code:sql}
select charvalue, intkey, lead(intkey, 1, NULL) over (order by charvalue, intkey) from
smalla order by intkey
{code}
This is expected to mean that the database orders the rows by charvalue, then intkey,
then for each row finds the lead value in the ordered data set, _then_ sorts all the
results by intkey.
Indeed, when run against e.g. postgres or oracle datasource, the results are as follows:
||charvalue||intkey||lead||
|1|0|2|
|0|1|3|
|1|2|4|
|0|3|5|
|1|4|6|
|...|...|...|
The results are also the same when running the query against teiid with e.g. postgres
datasource, as the lead/lag is pushed down.
However, when run against teiid with datasource that does not support lead/lag pushdown
(such as csv), the results are different:
||charvalue||intkey||lead||
|1|0|1|
|0|1|2|
|1|2|3|
|0|3|4|
|1|4|5|
|...|...|...|
I. e. it appears that the ORDER BY in the OVER clause is ignored and the LEAD/LAG uses
the (possibly arbitrary) ordering returned from the datasource (which in this case is
actually sorted by intkey, but we've seen random ordering from other datasources such
as couchbase).
Please note that the same result comes from teiid even when removing either of the
columns from the OVER (ORDER BY ...) clause, as well when removing the overall ORDER BY
clause.
Processor plan for the above query:
{code}
OPTIMIZATION COMPLETE:
PROCESSOR PLAN:
ProjectNode(0) output=[A.CHARVALUE AS charvalue, A.INTKEY AS intkey, lead(A.INTKEY, 1,
null) OVER (ORDER BY A.CHARVALUE, A.INTKEY)] [A.CHARVALUE AS charvalue, A.INTKEY AS
intkey, lead(A.INTKEY, 1, null) OVER (ORDER BY A.CHARVALUE, A.INTKEY)]
WindowFunctionProjectNode(1) output=[lead(A.INTKEY, 1, null) OVER (ORDER BY
A.CHARVALUE, A.INTKEY), A.CHARVALUE, A.INTKEY]
LimitNode(2) output=[A.CHARVALUE, A.INTKEY] limit 100
SortNode(3) output=[A.CHARVALUE, A.INTKEY] [SORT] [A.INTKEY]
JoinNode(4) [NESTED TABLE JOIN] [CROSS JOIN] output=[A.CHARVALUE, A.INTKEY]
ProjectNode(5) output=[file] [sourceModel.getTextFiles.file]
AccessNode(6) output=[sourceModel.getTextFiles.file,
sourceModel.getTextFiles.filePath] EXEC sourceModel.getTextFiles('smallaCsv.csv')
TextTableNode(7) output=[A.CHARVALUE, A.INTKEY]
{code}