issues related to TEIID-1237

Thursday, 7 October 2010

Hello all,

In looking at expanding our support for stored procedure handling in our procedure
language there are a couple of other cleanups that would be good to discuss.

1. We should not allow the direct usage of stored procedures as subqueries.  Instead of
"WHERE col = (EXEC foo())", we should use "WHERE col = (SELECT * FROM (EXEC
foo()) x)".  I propose that we deprecate this usage and allow the parser to
automatically convert to the latter syntax.

2. To be more concise/closer to standard syntax we should allow dynamic SQL statements to
have the form "EXECUTE [(STRING|IMMEDIATE)] ...".  The choice to use the STRING
keyword in 5.5 was due to not having IMMEDIATE as a reserved word.

Our assignment syntax has several issues.  I propose that the parser continue to allow the
existing syntax, but with the following changes.

3. The plsql assignment operator := will also be accepted.  We do not yet need to make
this the preferred syntax, but it may be a good idea as it disambiguates an assignment
from the equality predicate boolean expression.   

4. We should deprecate the direct assignment of commands to the variable.  For example:
a = select ...;
b = insert into ...;
c = update ...;
d = delete ...;

The proper syntax for query expressions is a scalar subquery, thus "a = select
...;" should be "a = (select ...);".

For nearly every other database, DML statements do not project the rows updated.  They
also do have the possibility of providing a proper result set, such as with the
PostgreSQL/Oracle insert returning clause.  Both of these are reasons why you don't
typically see syntax such as "b = insert into...;".  The more common method for
getting the update count is from an implicit variable.  One possibility is to use our
existing "ROWS_UPDATED" as that implicit variable.  So rather than
"ROWS_UPDATED = insert into ...; x = ROWS_UPDATED;", you would just have
"insert into ...; x = ROWS_UPDATED;".  For most update procedures this obviates
the need to explicitly set the ROWS_UPDATED value as it will be set implicitly by the last
DML statement.  However this could potentially be a breaking change in an existing
procedure, such as "...; ROWS_UPDATED = ...; insert into ...; END".  Another
option is to use more explicit syntax, like PostgreSQL, with "GET DIAGNOSTICS x =
ROWCOUNT"  

This should also change the meaning of "e = exec foo...;".  As per TEIID-1237,
what we would like is "e = exec foo...;" to be a command statement with the side
affect of setting e to the return value of the procedure.  This syntax would only be valid
if the procedure has a return parameter (currently it would be valid if the procedure has
a resultset that has only a single column or no result set and only one out/inout or
return parameter).  The other change would be that the this statement would also set the
implicit return cursor to the result set of the procedure if it has one.  Since these
would be potentially breaking changes (albeit in very narrow circumstances) we could put a
validation in 7.1.1 that only allows that assignment if the procedure returns a result
value and if there is a subsequent command statement.  Otherwise we would have to use a
slightly less intuitive syntax, like SQL Server, such as "exec e = foo...;".

Any thoughts?

Steve

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009