Thread: proposal: SQL parser integration to PL/pgSQL

proposal: SQL parser integration to PL/pgSQL

From
Pavel Stehule
Date:
==Steps==
1. add hook to analyser (transform stage) to substitute unknown
columnref by param - when analyser detect unknown columnref, then call
callback, that returns possible para node or NULL (when external
environment doesn't have a variable). Returned param should be typed
or unknown (for polymorphic params).

2. add special modes to sql parser:
* that allows identify unknown columnref, but don't try to identify
functions, operators - but SRF function's should be identified,
because generate values (same with relations).

* that allows identify potential conflict's identifiers (maybe hook on
transformColumnRef?)

3. with this we should ensure some levels of SQL validation in
PL/pgSQL function:

* Full with warnings [FWW]- this identify identifier's collision (SQL,
PL/pgSQL),
* Full without warnings [FWOW] - this identify unknown functions,
unknown relations,
* Syntax - this identify correct syntax and unknown relation (ignore
check functions and operators),
* None - some "heuristic" minimalistic validation (like current) isn't
compatible and isn't usable.

4. For validation function user could to choose validation's level (GUC1),
5. For run - system is in [FWW] or [FWOW] mode (GUC2) - plans are
generate in compilation time for first run (?? we have to store
parseTree to protect from double query parsing).

==Issues==
* invasive patch - any relation changes (alter table add [drop]
column) should need "recompilation" (not only plan invalidation), new
dependency
* maybe little bit slower first run plpgsql functions,
* some new bugs
* developers have to change some habits - for full validatin should be
necessary creating "skeleton" functions.

==Benefits==
* identifier collisions should be detected clearly and early,
* SQL statements should be fully checked,
* some bugs will be displayed with clean messaage,
* more natural behave for people from Oracle, DB2
* allows named params for SQL

Note: this proposal is related to
http://archives.postgresql.org/pgsql-patches/2007-11/msg00253.php

Notes, comments?

Regards
Pavel Stehule


Re: proposal: SQL parser integration to PL/pgSQL

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> ==Steps==
> 1. add hook to analyser (transform stage) to substitute unknown
> columnref by param - when analyser detect unknown columnref, then call
> callback, that returns possible para node or NULL (when external
> environment doesn't have a variable). Returned param should be typed
> or unknown (for polymorphic params).

IMHO the hook definition should support both the case of external
variables taking precedence over query variables and vice versa.
I don't think the core parser should be forcing that decision.  In any
case we'd probably need both options for plpgsql, so as to be able to
support both traditional and Oracle-compatible behavior.

I'd be inclined to do that by letting the hook function interpose
itself between the parser and the regular transformColumnRef processing,
so that it can call the regular transformColumnRef processing either
before or after doing its external lookups.  Giving it control only
after the regular processing fails would mean there's no way to let
external variables take precedence.

> 2. add special modes to sql parser:

None of those seem like a good idea to me.  The only part that seems
useful is warning about conflicts between external variables and query
variables.  That can be implemented by the hook function itself, if we
define the hook behavior as above.
        regards, tom lane


Re: proposal: SQL parser integration to PL/pgSQL

From
Pavel Stehule
Date:
2009/5/24 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> ==Steps==
>> 1. add hook to analyser (transform stage) to substitute unknown
>> columnref by param - when analyser detect unknown columnref, then call
>> callback, that returns possible para node or NULL (when external
>> environment doesn't have a variable). Returned param should be typed
>> or unknown (for polymorphic params).
>
> IMHO the hook definition should support both the case of external
> variables taking precedence over query variables and vice versa.
> I don't think the core parser should be forcing that decision.  In any
> case we'd probably need both options for plpgsql, so as to be able to
> support both traditional and Oracle-compatible behavior.

good idea

>
> I'd be inclined to do that by letting the hook function interpose
> itself between the parser and the regular transformColumnRef processing,
> so that it can call the regular transformColumnRef processing either
> before or after doing its external lookups.  Giving it control only
> after the regular processing fails would mean there's no way to let
> external variables take precedence.
>

ok

>> 2. add special modes to sql parser:
>
> None of those seem like a good idea to me.  The only part that seems
> useful is warning about conflicts between external variables and query
> variables.  That can be implemented by the hook function itself, if we
> define the hook behavior as above.
>

there is minimal one necessary - for polymorphic variables, we know
name, but we don't know type. And without types, we can't to transform
correctly functions.

regards
Pavel Stehule


>                        regards, tom lane
>


Re: proposal: SQL parser integration to PL/pgSQL

From
Pavel Stehule
Date:
2009/5/25 Pavel Stehule <pavel.stehule@gmail.com>:
> 2009/5/24 Tom Lane <tgl@sss.pgh.pa.us>:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> ==Steps==
>>> 1. add hook to analyser (transform stage) to substitute unknown
>>> columnref by param - when analyser detect unknown columnref, then call
>>> callback, that returns possible para node or NULL (when external
>>> environment doesn't have a variable). Returned param should be typed
>>> or unknown (for polymorphic params).
>>
>> IMHO the hook definition should support both the case of external
>> variables taking precedence over query variables and vice versa.
>> I don't think the core parser should be forcing that decision.  In any
>> case we'd probably need both options for plpgsql, so as to be able to
>> support both traditional and Oracle-compatible behavior.
>
> good idea
>

the problem is place for hook variable - it should be in ParseState
because we need call hooked functions from environments that install
hook.

Pavel

>>
>> I'd be inclined to do that by letting the hook function interpose
>> itself between the parser and the regular transformColumnRef processing,
>> so that it can call the regular transformColumnRef processing either
>> before or after doing its external lookups.  Giving it control only
>> after the regular processing fails would mean there's no way to let
>> external variables take precedence.
>>
>
> ok
>
>>> 2. add special modes to sql parser:
>>
>> None of those seem like a good idea to me.  The only part that seems
>> useful is warning about conflicts between external variables and query
>> variables.  That can be implemented by the hook function itself, if we
>> define the hook behavior as above.
>>
>
> there is minimal one necessary - for polymorphic variables, we know
> name, but we don't know type. And without types, we can't to transform
> correctly functions.
>
> regards
> Pavel Stehule
>
>
>>                        regards, tom lane
>>
>