[teiid-users] issues related to TEIID-1237

Thu Oct 7 11:00:30 EDT 2010

Hello all,

In looking at expanding our support for stored procedure handling in our procedure language there are a couple of other cleanups that would be good to discuss.

1. We should not allow the direct usage of stored procedures as subqueries.  Instead of "WHERE col = (EXEC foo())", we should use "WHERE col = (SELECT * FROM (EXEC foo()) x)".  I propose that we deprecate this usage and allow the parser to automatically convert to the latter syntax.

2. To be more concise/closer to standard syntax we should allow dynamic SQL statements to have the form "EXECUTE [(STRING|IMMEDIATE)] ...".  The choice to use the STRING keyword in 5.5 was due to not having IMMEDIATE as a reserved word.

Our assignment syntax has several issues.  I propose that the parser continue to allow the existing syntax, but with the following changes.

3. The plsql assignment operator := will also be accepted.  We do not yet need to make this the preferred syntax, but it may be a good idea as it disambiguates an assignment from the equality predicate boolean expression.   

4. We should deprecate the direct assignment of commands to the variable.  For example:
a = select ...;
b = insert into ...;
c = update ...;
d = delete ...;

The proper syntax for query expressions is a scalar subquery, thus "a = select ...;" should be "a = (select ...);".

For nearly every other database, DML statements do not project the rows updated.  They also do have the possibility of providing a proper result set, such as with the PostgreSQL/Oracle insert returning clause.  Both of these are reasons why you don't typically see syntax such as "b = insert into...;".  The more common method for getting the update count is from an implicit variable.  One possibility is to use our existing "ROWS_UPDATED" as that implicit variable.  So rather than "ROWS_UPDATED = insert into ...; x = ROWS_UPDATED;", you would just have "insert into ...; x = ROWS_UPDATED;".  For most update procedures this obviates the need to explicitly set the ROWS_UPDATED value as it will be set implicitly by the last DML statement.  However this could potentially be a breaking change in an existing procedure, such as "...; ROWS_UPDATED = ...; insert into ...; END".  Another option is to use more explicit syntax, like PostgreSQL, with "GET DIAGNOSTICS x = ROWCOUNT"  

This should also change the meaning of "e = exec foo...;".  As per TEIID-1237, what we would like is "e = exec foo...;" to be a command statement with the side affect of setting e to the return value of the procedure.  This syntax would only be valid if the procedure has a return parameter (currently it would be valid if the procedure has a resultset that has only a single column or no result set and only one out/inout or return parameter).  The other change would be that the this statement would also set the implicit return cursor to the result set of the procedure if it has one.  Since these would be potentially breaking changes (albeit in very narrow circumstances) we could put a validation in 7.1.1 that only allows that assignment if the procedure returns a result value and if there is a subsequent command statement.  Otherwise we would have to use a slightly less intuitive syntax, like SQL Server, such as "exec e = foo...;".

Any thoughts?

Steve