Thread: plpgsql_check_function - rebase for 9.3

plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
Hello

I am sending lightly refreshed patch for checking plpgsql functions..

I checked different implementation, but without success: a) enhancing
of SPI to some fake mode can has negative impact on application, and
patch was not clear, b) generic plpgsql walker doesn't save lines too.

I invite any ideas how to improve this patch

Regards

Pavel

Attachment

Re: plpgsql_check_function - rebase for 9.3

From
Selena Deckelmann
Date:
Hi!

On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> I am sending lightly refreshed patch for checking plpgsql functions..
>
> I checked different implementation, but without success: a) enhancing
> of SPI to some fake mode can has negative impact on application, and
> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>
> I invite any ideas how to improve this patch

I reviewed this and did a clean up for bitrot and a little whitespace.
In particular, it needed to learn a little about event triggers.

This patch is a follow on from an earlier review thread I found:
http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at

I dug through that thread a bit, and I believe issues raised by
Laurenz, Petr and Alvaro were resolved by Pavel over time.

All tests pass, and after a read-through, the code seems fine.

This also represents my inaugural use of pg_bsd_indent. I ran it on
pl_check.c - which made things mostly better. Happy to try and fix it
up more if someone can explain to me what (if anything) I did
incorrectly when using it.

-selena

--
http://chesnok.com

Attachment

Re: plpgsql_check_function - rebase for 9.3

From
Tom Lane
Date:
Selena Deckelmann <selena@chesnok.com> writes:
> This also represents my inaugural use of pg_bsd_indent. I ran it on
> pl_check.c - which made things mostly better. Happy to try and fix it
> up more if someone can explain to me what (if anything) I did
> incorrectly when using it.

It looks like you didn't give it a typedef list?  Fetch the file from
http://buildfarm.postgresql.org/cgi-bin/typedefs.pl
and pass it with --typedefs=filename.  If you're dealing with something
that adds new typedef names, you can add them to the list file manually.
        regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
Andrew Dunstan
Date:
On 10/07/2012 11:25 AM, Tom Lane wrote:
> Selena Deckelmann <selena@chesnok.com> writes:
>> This also represents my inaugural use of pg_bsd_indent. I ran it on
>> pl_check.c - which made things mostly better. Happy to try and fix it
>> up more if someone can explain to me what (if anything) I did
>> incorrectly when using it.
> It looks like you didn't give it a typedef list?  Fetch the file from
> http://buildfarm.postgresql.org/cgi-bin/typedefs.pl
> and pass it with --typedefs=filename.  If you're dealing with something
> that adds new typedef names, you can add them to the list file manually.
>
>             


You's not supposed to be calling pg_bsd_indent directly at all. You's 
supposed to be calling pgindent - it will call pg_bsd_indent as needed 
and adjust the results. See src/tools/pgindent/README.

cheers

andrew




Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
Hello

I am sending a updated version - now it is prepared for event triggers
and it is little bit more robust

I run pgindent, but I have no experience with it, so I am not sure about success

Regards

Pavel Stehule


2012/10/7 Selena Deckelmann <selena@chesnok.com>:
> Hi!
>
> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>> I am sending lightly refreshed patch for checking plpgsql functions..
>>
>> I checked different implementation, but without success: a) enhancing
>> of SPI to some fake mode can has negative impact on application, and
>> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>>
>> I invite any ideas how to improve this patch
>
> I reviewed this and did a clean up for bitrot and a little whitespace.
> In particular, it needed to learn a little about event triggers.
>
> This patch is a follow on from an earlier review thread I found:
> http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at
>
> I dug through that thread a bit, and I believe issues raised by
> Laurenz, Petr and Alvaro were resolved by Pavel over time.
>
> All tests pass, and after a read-through, the code seems fine.
>
> This also represents my inaugural use of pg_bsd_indent. I ran it on
> pl_check.c - which made things mostly better. Happy to try and fix it
> up more if someone can explain to me what (if anything) I did
> incorrectly when using it.
>
> -selena
>
> --
> http://chesnok.com

Attachment

Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
Hello

a some updated version

* possibility to raise (and filter) performance warnings - detects IO castings
* detects assign composite value to scalar variable

Regards

Pavel Stehule

2012/11/27 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> I am sending a updated version - now it is prepared for event triggers
> and it is little bit more robust
>
> I run pgindent, but I have no experience with it, so I am not sure about success
>
> Regards
>
> Pavel Stehule
>
>
> 2012/10/7 Selena Deckelmann <selena@chesnok.com>:
>> Hi!
>>
>> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>
>>> I am sending lightly refreshed patch for checking plpgsql functions..
>>>
>>> I checked different implementation, but without success: a) enhancing
>>> of SPI to some fake mode can has negative impact on application, and
>>> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>>>
>>> I invite any ideas how to improve this patch
>>
>> I reviewed this and did a clean up for bitrot and a little whitespace.
>> In particular, it needed to learn a little about event triggers.
>>
>> This patch is a follow on from an earlier review thread I found:
>> http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at
>>
>> I dug through that thread a bit, and I believe issues raised by
>> Laurenz, Petr and Alvaro were resolved by Pavel over time.
>>
>> All tests pass, and after a read-through, the code seems fine.
>>
>> This also represents my inaugural use of pg_bsd_indent. I ran it on
>> pl_check.c - which made things mostly better. Happy to try and fix it
>> up more if someone can explain to me what (if anything) I did
>> incorrectly when using it.
>>
>> -selena
>>
>> --
>> http://chesnok.com

Attachment

Re: plpgsql_check_function - rebase for 9.3

From
"Petr Jelinek"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Pavel Stehule
> Sent: 28 November 2012 11:07
> To: Selena Deckelmann
> Cc: PostgreSQL Hackers
> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
>
> Hello
>
> a some updated version
>
> * possibility to raise (and filter) performance warnings - detects IO castings
> * detects assign composite value to scalar variable
>

Hello,

I was wondering if maybe this could be split to 2 separate patches, one would be all the changes to the existing
plpgsqlcode - rename delete_function  to plpgsql_delete_function and extern plpgsql_delete_function,
copy_plpgsql_datum,plpgsql_estate_setup, plpgsql_destroy_econtext and the other patch would be the actual checker. 

Reasoning for this is that the first patch (exporting some of plpgsql internals) should be very safe to commit and
wouldbe useful by itself even if the checker does not get in 9.3 since it would enable users to write
lints/checkers/analysersfor plpgsql as standalone extensions (for my use case this is actually way more useful than
havingthe checker as part of core). 

Regards
Petr




Re: plpgsql_check_function - rebase for 9.3

From
Tom Lane
Date:
"Petr Jelinek" <pjmodos@pjmodos.net> writes:
> I was wondering if maybe this could be split to 2 separate patches, one would be all the changes to the existing
plpgsqlcode - rename delete_function  to plpgsql_delete_function and extern plpgsql_delete_function,
copy_plpgsql_datum,plpgsql_estate_setup, plpgsql_destroy_econtext and the other patch would be the actual checker.
 

> Reasoning for this is that the first patch (exporting some of plpgsql internals) should be very safe to commit and
wouldbe useful by itself even if the checker does not get in 9.3 since it would enable users to write
lints/checkers/analysersfor plpgsql as standalone extensions (for my use case this is actually way more useful than
havingthe checker as part of core).
 

What exactly do you have in mind there?  The way we load extensions,
they can't (AFAIK) see each other's defined symbols, so you couldn't
really make an independent extension that would call functions in
plpgsql.  This is not really open for debate, either, as changing that
would risk creating symbol collisions between modules that never had to
worry about polluting global namespace before.

I would note also that we absolutely, positively do not guarantee not
to change plpgsql's private data structures in minor releases.
        regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
"Petr Jelinek"
Date:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> owner@postgresql.org] On Behalf Of Tom Lane
> Sent: 26 January 2013 18:16
> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
> 
> "Petr Jelinek" <pjmodos@pjmodos.net> writes:
> > I was wondering if maybe this could be split to 2 separate patches, one
> would be all the changes to the existing plpgsql code - rename
> delete_function  to plpgsql_delete_function and extern
> plpgsql_delete_function, copy_plpgsql_datum, plpgsql_estate_setup,
> plpgsql_destroy_econtext and the other patch would be the actual checker.
> 
> > Reasoning for this is that the first patch (exporting some of plpgsql
> internals) should be very safe to commit and would be useful by itself
even if
> the checker does not get in 9.3 since it would enable users to write
> lints/checkers/analysers for plpgsql as standalone extensions (for my use
> case this is actually way more useful than having the checker as part of
core).
> 
> What exactly do you have in mind there?  The way we load extensions, they
> can't (AFAIK) see each other's defined symbols, so you couldn't really
make
> an independent extension that would call functions in plpgsql.  This is
not
> really open for debate, either, as changing that would risk creating
symbol
> collisions between modules that never had to worry about polluting global
> namespace before.

I can call functions that are exported by plpgsql.so just fine from
different extension now, I just have to preload the plpgsql.so (via LOAD or
guc) first, so I don't see what is the problem here.
> I would note also that we absolutely, positively do not guarantee not to
> change plpgsql's private data structures in minor releases.

That didn't stop people from from writing plpgsql extensions before, don't
see why it would now, the problem is that they have to use hooks now and
those require the function to be executed, making those plpgsql interfaces
external would help with setting up things so that extension can work with
function without function being executed or duplicating a lot of plpgsql
code. As an example of all of this you can see
https://github.com/okbob/plpgsql_lint which is the original module Pavel
wrote before writing this patch.

The other thing is this is something the patch in current form does anyway
so I don't see the harm.

Regards
Petr




Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
2013/1/26 Petr Jelinek <pjmodos@pjmodos.net>:
>> -----Original Message-----
>> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
>> owner@postgresql.org] On Behalf Of Tom Lane
>> Sent: 26 January 2013 18:16
>> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
>>
>> "Petr Jelinek" <pjmodos@pjmodos.net> writes:
>> > I was wondering if maybe this could be split to 2 separate patches, one
>> would be all the changes to the existing plpgsql code - rename
>> delete_function  to plpgsql_delete_function and extern
>> plpgsql_delete_function, copy_plpgsql_datum, plpgsql_estate_setup,
>> plpgsql_destroy_econtext and the other patch would be the actual checker.
>>
>> > Reasoning for this is that the first patch (exporting some of plpgsql
>> internals) should be very safe to commit and would be useful by itself
> even if
>> the checker does not get in 9.3 since it would enable users to write
>> lints/checkers/analysers for plpgsql as standalone extensions (for my use
>> case this is actually way more useful than having the checker as part of
> core).

A significant improvement in this are can be placing plpgsql.h to
other header files. Now plpgsql extensions have to use private copy of
this file - what is really ugly

Pavel



>>
>> What exactly do you have in mind there?  The way we load extensions, they
>> can't (AFAIK) see each other's defined symbols, so you couldn't really
> make
>> an independent extension that would call functions in plpgsql.  This is
> not
>> really open for debate, either, as changing that would risk creating
> symbol
>> collisions between modules that never had to worry about polluting global
>> namespace before.
>
> I can call functions that are exported by plpgsql.so just fine from
> different extension now, I just have to preload the plpgsql.so (via LOAD or
> guc) first, so I don't see what is the problem here.
>
>> I would note also that we absolutely, positively do not guarantee not to
>> change plpgsql's private data structures in minor releases.
>
> That didn't stop people from from writing plpgsql extensions before, don't
> see why it would now, the problem is that they have to use hooks now and
> those require the function to be executed, making those plpgsql interfaces
> external would help with setting up things so that extension can work with
> function without function being executed or duplicating a lot of plpgsql
> code. As an example of all of this you can see
> https://github.com/okbob/plpgsql_lint which is the original module Pavel
> wrote before writing this patch.
>
> The other thing is this is something the patch in current form does anyway
> so I don't see the harm.
>
> Regards
> Petr
>



Re: plpgsql_check_function - rebase for 9.3

From
Tom Lane
Date:
"Petr Jelinek" <pjmodos@pjmodos.net> writes:
>> What exactly do you have in mind there?  The way we load extensions, they
>> can't (AFAIK) see each other's defined symbols, so you couldn't really make
>> an independent extension that would call functions in plpgsql.  This is not
>> really open for debate, either, as changing that would risk creating symbol
>> collisions between modules that never had to worry about polluting global
>> namespace before.

> I can call functions that are exported by plpgsql.so just fine from
> different extension now, I just have to preload the plpgsql.so (via LOAD or
> guc) first, so I don't see what is the problem here.

[ pokes around... ]  Hm, it appears that that does work on Linux,
because for some reason we're specifying RTLD_GLOBAL to dlopen().
TBH that seems like a truly horrid idea that we should reconsider.
Aside from the danger of unexpected symbol collisions between
independent loadable modules, I seriously doubt that it works like
that on every platform we support --- so I'd be very strongly against
accepting any code that depends on this working.
        regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
Tom Lane
Date:
I wrote:
> [ pokes around... ]  Hm, it appears that that does work on Linux,
> because for some reason we're specifying RTLD_GLOBAL to dlopen().
> TBH that seems like a truly horrid idea that we should reconsider.

A bit of research in the archives revealed that we're using it because
back in 2001, the lame hack that then passed for a shared-library
version of python required it:
http://www.postgresql.org/message-id/Pine.LNX.4.30.0105121914200.757-100000@peter.localdomain

There was subsequent discussion of removing it, because reportedly now
(a) that's no longer the case, and (b) we need to get rid of it to allow
plpython2 and plpython3 to coexist in one session.  See for instance:
http://www.postgresql.org/message-id/1277506674.5356.27.camel@vanquo.pezone.net

Nothing's been done about that yet, but I think that assuming that we'll
be using RTLD_GLOBAL forever would be foolish.
        regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
"Petr Jelinek"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 26 January 2013 20:12
> Subject: Re: [HACKERS] plpgsql_check_function - rebase for 9.3
> 
> I wrote:
> > [ pokes around... ]  Hm, it appears that that does work on Linux,
> > because for some reason we're specifying RTLD_GLOBAL to dlopen().
> > TBH that seems like a truly horrid idea that we should reconsider.
> 
> A bit of research in the archives revealed that we're using it because
back in
> 2001, the lame hack that then passed for a shared-library version of
python
> required it:
> http://www.postgresql.org/message-id/Pine.LNX.4.30.0105121914200.757-
> 100000@peter.localdomain
> 
> There was subsequent discussion of removing it, because reportedly now
> (a) that's no longer the case, and (b) we need to get rid of it to allow
> plpython2 and plpython3 to coexist in one session.  See for instance:
> http://www.postgresql.org/message-
> id/1277506674.5356.27.camel@vanquo.pezone.net
> 
> Nothing's been done about that yet, but I think that assuming that we'll
be
> using RTLD_GLOBAL forever would be foolish.
> 

Ok then it was a bad idea after all.

Regards
Petr




Re: plpgsql_check_function - rebase for 9.3

From
Peter Eisentraut
Date:
On 1/26/13 1:53 PM, Tom Lane wrote:
> [ pokes around... ]  Hm, it appears that that does work on Linux,
> because for some reason we're specifying RTLD_GLOBAL to dlopen().
> TBH that seems like a truly horrid idea that we should reconsider.
> Aside from the danger of unexpected symbol collisions between
> independent loadable modules, I seriously doubt that it works like
> that on every platform we support --- so I'd be very strongly against
> accepting any code that depends on this working.

Well, that would kill a lot of potentially useful features, including
the transforms feature I've been working on and any kind of hook or
debugger or profiler on an existing module.  (How do plpgsql plugins
work?)  We also couldn't transparently move functionality out of the
postgres binary into a module.

I see the concern about symbol collisions.  But you can normally work
around that by prefixing exported symbols.



Re: plpgsql_check_function - rebase for 9.3

From
Andrew Dunstan
Date:
On 01/28/2013 10:11 AM, Peter Eisentraut wrote:
> On 1/26/13 1:53 PM, Tom Lane wrote:
>> [ pokes around... ]  Hm, it appears that that does work on Linux,
>> because for some reason we're specifying RTLD_GLOBAL to dlopen().
>> TBH that seems like a truly horrid idea that we should reconsider.
>> Aside from the danger of unexpected symbol collisions between
>> independent loadable modules, I seriously doubt that it works like
>> that on every platform we support --- so I'd be very strongly against
>> accepting any code that depends on this working.
> Well, that would kill a lot of potentially useful features, including
> the transforms feature I've been working on and any kind of hook or
> debugger or profiler on an existing module.  (How do plpgsql plugins
> work?)  We also couldn't transparently move functionality out of the
> postgres binary into a module.
>
> I see the concern about symbol collisions.  But you can normally work
> around that by prefixing exported symbols.
>

Yeah, I was just writing something a couple of days ago that leveraged 
stuff in an extension, so it looks like this is wanted functionality. In 
general we want to be able to layer addon modules, ISTM.

cheers

andrew



Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
Hello

I though about any possibility how to reduce a size of this patch.

I see a one solution - if we would not support a detection of multiple
errors - it stops on first error, we can push this code to compilation
stage.

a plpgsql_validator can be overloaded by function

plpgsql_validator(oid, reloid, level)

reloid - is requested by triggers - it is related class oid
level - it is level of checking

0 - same as current implementation - check only syntax errors
1 - stop on first object error - no table, no field, no expected data type
2 - stop on first performance issue - implicit casting identification
(should be removed or moved to next releases)

This proposal reduces functionality proposed for
plpgsql_check_function - but basic functionality is there.

It has not a impact on performance and it allow check all paths in
compile time.

It will not change behave of current CREATE OR REPLACE function -
there is not a problem with back compatibility

It is good for you?

I can have this modification to end of this week.

Regards

Pavel



2012/11/28 Pavel Stehule <pavel.stehule@gmail.com>:
> Hello
>
> a some updated version
>
> * possibility to raise (and filter) performance warnings - detects IO castings
> * detects assign composite value to scalar variable
>
> Regards
>
> Pavel Stehule
>
> 2012/11/27 Pavel Stehule <pavel.stehule@gmail.com>:
>> Hello
>>
>> I am sending a updated version - now it is prepared for event triggers
>> and it is little bit more robust
>>
>> I run pgindent, but I have no experience with it, so I am not sure about success
>>
>> Regards
>>
>> Pavel Stehule
>>
>>
>> 2012/10/7 Selena Deckelmann <selena@chesnok.com>:
>>> Hi!
>>>
>>> On Tue, Jun 26, 2012 at 1:19 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>
>>>> I am sending lightly refreshed patch for checking plpgsql functions..
>>>>
>>>> I checked different implementation, but without success: a) enhancing
>>>> of SPI to some fake mode can has negative impact on application, and
>>>> patch was not clear, b) generic plpgsql walker doesn't save lines too.
>>>>
>>>> I invite any ideas how to improve this patch
>>>
>>> I reviewed this and did a clean up for bitrot and a little whitespace.
>>> In particular, it needed to learn a little about event triggers.
>>>
>>> This patch is a follow on from an earlier review thread I found:
>>> http://archives.postgresql.org/message-id/D960CB61B694CF459DCFB4B0128514C2072DF447@exadv11.host.magwien.gv.at
>>>
>>> I dug through that thread a bit, and I believe issues raised by
>>> Laurenz, Petr and Alvaro were resolved by Pavel over time.
>>>
>>> All tests pass, and after a read-through, the code seems fine.
>>>
>>> This also represents my inaugural use of pg_bsd_indent. I ran it on
>>> pl_check.c - which made things mostly better. Happy to try and fix it
>>> up more if someone can explain to me what (if anything) I did
>>> incorrectly when using it.
>>>
>>> -selena
>>>
>>> --
>>> http://chesnok.com



Re: plpgsql_check_function - rebase for 9.3

From
Josh Berkus
Date:
Folks,

Where are we with this patch?  I'm a bit unclear from the discussion on
whether it passes muster or not.  Things seem to have petered out.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: plpgsql_check_function - rebase for 9.3

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Where are we with this patch?  I'm a bit unclear from the discussion on
> whether it passes muster or not.  Things seem to have petered out.

I took another look at this patch tonight.  I think the thing that is
bothering everybody (including Pavel) is that as submitted, pl_check.c
involves a huge amount of duplication of knowledge and code from
pl_exec.c, and to a lesser extent pl_comp.c.  It certainly looks like a
maintenance disaster in the making.  It doesn't bother me so much that
pl_check.c knows about each syntactic structure in plpgsql: there are
already four or five places you have to touch when adding new syntax.
Rather, it's that there's so much duplication of knowledge about
semantic details, which is largely expressed by copied-and-pasted code
from pl_exec.c.  It seems like a safe bet that we'll sometimes miss the
need to fix pl_check.c when we fix something in pl_exec.c.  There are
annoying duplications from pl_comp.c as well, eg knowledge of exactly
which magic variables are defined in trigger functions.

Having said all that, it's not clear that we can really do any better.
The only obvious alternative is pushing support for a checking mode
directly into pl_exec.c, which would obfuscate and slow down that code
to an unacceptable degree if Pavel's results at
http://www.postgresql.org/message-id/CAFj8pRAKuJmVjPjzfSryE7+uB8jF8Wtz5rkxK-0ykXme-k81kA@mail.gmail.com
are any indication.  (In that message, Pavel proposes shoveling the
problem into the compile code instead, but that seems unlikely to work
IMO: the main problem in pl_check.c as it stands is duplication of code
from pl_exec.c not pl_comp.c.  So I think that path could only lead to
duplicating the same code into pl_comp.c.)

So question 1 is do we want to accept that this is the implementation
pathway we're going to settle for, or are we going to hold out for a
better one?  I'd be the first in line to hold out if I had a clue how
to move towards a better implementation, but I have none.  Are we
prepared to reject this type of feature entirely because of the
code-duplication problem?  That doesn't seem attractive either.

But, even granting that this implementation approach is acceptable,
the patch does not seem close to being committable as-is: there's
a lot of fit-and-finish work yet to be done.  To make my point I will
just quote from one of the regression test additions:

create or replace function f1()
returns void as $$
declare a1 int; a2 int;
begin select 10,20 into a1;
end;
$$ language plpgsql;
-- raise warning
select plpgsql_check_function('f1()');                        plpgsql_check_function                          
-------------------------------------------------------------------------warning:00000:4:SQL statement:too many
attributiesfor target variablesDetail: There are less target variables than output columns in query.Hint: Check target
variablesin SELECT INTO statement
 
(3 rows)

Do we like this output format?  I don't.  The unlabeled, undocumented
fields crammed into a single line with colon separators are neither
readable nor useful.  If we actually need these fields, why aren't we
splitting the output into multiple columns?  (I'm also wondering why
the patch bothers with an option to emit this same info in XML.  Surely
there is vanishingly small use-case for mechanical examination of this
output.)

This example also shows that the user-visible messages have had neither
editing from a native speaker of English, nor even attention from a
spell checker.  I don't fault Pavel for that (his English is far better
than my Czech) but this patch has been passed by at least one reviewer
who is a native speaker.  Personally I'd also say that neither the
Detail nor Hint messages in this example are worth the electrons they
take up, as they add nothing useful to the basic error message.  I'd be
embarrassed to be presenting output like this to end users of Postgres.

(The code comments are in even worse shape than the user-visible
messages, by the by, where they exist at all.)

A somewhat related point is that it doesn't look like any thought
has been taken about localized message output.

This stuff is certainly all fixable if we agree on the basic
implementation approach; and I can't really fault Pavel for not
worrying much about such details while the implementation approach
hasn't been agreed to.  But I'd judge that there's a week or more's
worth of work here to get to a patch I'd be happy about committing,
even without any change in the basic approach.  That's not time I'm
willing to put in personally right now.
        regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
Hello all

2013/3/26 Tom Lane <tgl@sss.pgh.pa.us>:
> Josh Berkus <josh@agliodbs.com> writes:
>> Where are we with this patch?  I'm a bit unclear from the discussion on
>> whether it passes muster or not.  Things seem to have petered out.
>
> I took another look at this patch tonight.  I think the thing that is
> bothering everybody (including Pavel) is that as submitted, pl_check.c
> involves a huge amount of duplication of knowledge and code from
> pl_exec.c, and to a lesser extent pl_comp.c.  It certainly looks like a
> maintenance disaster in the making.  It doesn't bother me so much that
> pl_check.c knows about each syntactic structure in plpgsql: there are
> already four or five places you have to touch when adding new syntax.
> Rather, it's that there's so much duplication of knowledge about
> semantic details, which is largely expressed by copied-and-pasted code
> from pl_exec.c.  It seems like a safe bet that we'll sometimes miss the
> need to fix pl_check.c when we fix something in pl_exec.c.  There are
> annoying duplications from pl_comp.c as well, eg knowledge of exactly
> which magic variables are defined in trigger functions.
>
> Having said all that, it's not clear that we can really do any better.
> The only obvious alternative is pushing support for a checking mode
> directly into pl_exec.c, which would obfuscate and slow down that code
> to an unacceptable degree if Pavel's results at
> http://www.postgresql.org/message-id/CAFj8pRAKuJmVjPjzfSryE7+uB8jF8Wtz5rkxK-0ykXme-k81kA@mail.gmail.com
> are any indication.  (In that message, Pavel proposes shoveling the
> problem into the compile code instead, but that seems unlikely to work
> IMO: the main problem in pl_check.c as it stands is duplication of code
> from pl_exec.c not pl_comp.c.  So I think that path could only lead to
> duplicating the same code into pl_comp.c.)
>
> So question 1 is do we want to accept that this is the implementation
> pathway we're going to settle for, or are we going to hold out for a
> better one?  I'd be the first in line to hold out if I had a clue how
> to move towards a better implementation, but I have none.  Are we
> prepared to reject this type of feature entirely because of the
> code-duplication problem?  That doesn't seem attractive either.

I wrote lot of versions and proposed code is redundant, but most
simple and clean.

I am really against to pushing check to pl_exec, because it
significantly decrease code readability and increase some bottle neck
in CPU extensive tests. More, there are too less place for some future
feature - performance advising, more verbose error messages, etc

In PL/pgPSM I used a little bit different architecture - necessary for
PSM and maybe better for PL/pgSQL too. It is two stage compiler -
parsing to AST, and AST compilation. It simplify gram.y and processing
order depends on AST deep iteration and not on bizon rules. It can
have a impact on speed of very large procedures probably - I don't see
different disadvantages. With this architecture I was able do lot of
controls to compile stage without problems.

Most complexity in current code is related to detecting record types
from expressions without expression evaluation. Maybe this code can be
in core or pl_compile.c. Code for Describe (F) message is not too
reusable. It is necessary for

DECLARE r RECORD;
FOR r IN SELECT ...
LOOP  RAISE NOTICE '>>%<<', r.xx;
END LOOP;

>
> But, even granting that this implementation approach is acceptable,
> the patch does not seem close to being committable as-is: there's
> a lot of fit-and-finish work yet to be done.  To make my point I will
> just quote from one of the regression test additions:
>
> create or replace function f1()
> returns void as $$
> declare a1 int; a2 int;
> begin
>   select 10,20 into a1;
> end;
> $$ language plpgsql;
> -- raise warning
> select plpgsql_check_function('f1()');
>                          plpgsql_check_function
> -------------------------------------------------------------------------
>  warning:00000:4:SQL statement:too many attributies for target variables
>  Detail: There are less target variables than output columns in query.
>  Hint: Check target variables in SELECT INTO statement
> (3 rows)
>
> Do we like this output format?  I don't.  The unlabeled, undocumented
> fields crammed into a single line with colon separators are neither
> readable nor useful.  If we actually need these fields, why aren't we
> splitting the output into multiple columns?  (I'm also wondering why
> the patch bothers with an option to emit this same info in XML.  Surely
> there is vanishingly small use-case for mechanical examination of this
> output.)

This format can be reduced, redesigned, changed. It is designed like
gcc output and optimized for using from psql console.

I tested table output - in original CHECK statement implementation,
but it is not too friendly for showing in monitor - it is just too
wide. There are similar arguments like using tabular output for
EXPLAIN, although there are higher complexity and nested structures
(in EXPLAIN). Personally, I can live with tabular output, it is not
user friendly, but it can be used for machine processing simply, and
any advanced user can write some transform functions to any output
format.


>
> This example also shows that the user-visible messages have had neither
> editing from a native speaker of English, nor even attention from a
> spell checker.  I don't fault Pavel for that (his English is far better
> than my Czech) but this patch has been passed by at least one reviewer
> who is a native speaker.  Personally I'd also say that neither the
> Detail nor Hint messages in this example are worth the electrons they
> take up, as they add nothing useful to the basic error message.  I'd be
> embarrassed to be presenting output like this to end users of Postgres.
>
> (The code comments are in even worse shape than the user-visible
> messages, by the by, where they exist at all.)
>
> A somewhat related point is that it doesn't look like any thought
> has been taken about localized message output.
>
> This stuff is certainly all fixable if we agree on the basic
> implementation approach; and I can't really fault Pavel for not
> worrying much about such details while the implementation approach
> hasn't been agreed to.  But I'd judge that there's a week or more's
> worth of work here to get to a patch I'd be happy about committing,
> even without any change in the basic approach.  That's not time I'm
> willing to put in personally right now.

Yes. A details should be precised after final decision. I am hope so
this code doesn't contains significant bug - I have not any reported
issue related to this functionality - it is based on plpgsql_lint -
and this code is used in production by some very large companies. Of
course, there are not too large community - it should be manually
compiled.

Regards

Pavel

>
>                         regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
Hello

I redesigned output from plpgsql_check_function. Now, it returns table
everytime.
Litlle bit code simplification.

postgres=# \sf fx
CREATE OR REPLACE FUNCTION public.fx(integer)
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
BEGIN
  RETURN (SELECT a FROM t1 WHERE b < $1);
END;
$function$
postgres=# select * from plpgsql_check_function('fx(int)');
-[ RECORD 1 ]--------------------------------------
functionid | fx
lineno     | 3
statement  | RETURN
sqlstate   | 42703
message    | column "b" does not exist
detail     | [null]
hint       | [null]
level      | error
position   | 32
query      | SELECT (SELECT a FROM t1 WHERE b < $1)
context    | [null]

Regards

Pavel


2013/3/26 Tom Lane <tgl@sss.pgh.pa.us>:
> Josh Berkus <josh@agliodbs.com> writes:
>> Where are we with this patch?  I'm a bit unclear from the discussion on
>> whether it passes muster or not.  Things seem to have petered out.
>
> I took another look at this patch tonight.  I think the thing that is
> bothering everybody (including Pavel) is that as submitted, pl_check.c
> involves a huge amount of duplication of knowledge and code from
> pl_exec.c, and to a lesser extent pl_comp.c.  It certainly looks like a
> maintenance disaster in the making.  It doesn't bother me so much that
> pl_check.c knows about each syntactic structure in plpgsql: there are
> already four or five places you have to touch when adding new syntax.
> Rather, it's that there's so much duplication of knowledge about
> semantic details, which is largely expressed by copied-and-pasted code
> from pl_exec.c.  It seems like a safe bet that we'll sometimes miss the
> need to fix pl_check.c when we fix something in pl_exec.c.  There are
> annoying duplications from pl_comp.c as well, eg knowledge of exactly
> which magic variables are defined in trigger functions.
>
> Having said all that, it's not clear that we can really do any better.
> The only obvious alternative is pushing support for a checking mode
> directly into pl_exec.c, which would obfuscate and slow down that code
> to an unacceptable degree if Pavel's results at
> http://www.postgresql.org/message-id/CAFj8pRAKuJmVjPjzfSryE7+uB8jF8Wtz5rkxK-0ykXme-k81kA@mail.gmail.com
> are any indication.  (In that message, Pavel proposes shoveling the
> problem into the compile code instead, but that seems unlikely to work
> IMO: the main problem in pl_check.c as it stands is duplication of code
> from pl_exec.c not pl_comp.c.  So I think that path could only lead to
> duplicating the same code into pl_comp.c.)
>
> So question 1 is do we want to accept that this is the implementation
> pathway we're going to settle for, or are we going to hold out for a
> better one?  I'd be the first in line to hold out if I had a clue how
> to move towards a better implementation, but I have none.  Are we
> prepared to reject this type of feature entirely because of the
> code-duplication problem?  That doesn't seem attractive either.
>
> But, even granting that this implementation approach is acceptable,
> the patch does not seem close to being committable as-is: there's
> a lot of fit-and-finish work yet to be done.  To make my point I will
> just quote from one of the regression test additions:
>
> create or replace function f1()
> returns void as $$
> declare a1 int; a2 int;
> begin
>   select 10,20 into a1;
> end;
> $$ language plpgsql;
> -- raise warning
> select plpgsql_check_function('f1()');
>                          plpgsql_check_function
> -------------------------------------------------------------------------
>  warning:00000:4:SQL statement:too many attributies for target variables
>  Detail: There are less target variables than output columns in query.
>  Hint: Check target variables in SELECT INTO statement
> (3 rows)
>
> Do we like this output format?  I don't.  The unlabeled, undocumented
> fields crammed into a single line with colon separators are neither
> readable nor useful.  If we actually need these fields, why aren't we
> splitting the output into multiple columns?  (I'm also wondering why
> the patch bothers with an option to emit this same info in XML.  Surely
> there is vanishingly small use-case for mechanical examination of this
> output.)
>
> This example also shows that the user-visible messages have had neither
> editing from a native speaker of English, nor even attention from a
> spell checker.  I don't fault Pavel for that (his English is far better
> than my Czech) but this patch has been passed by at least one reviewer
> who is a native speaker.  Personally I'd also say that neither the
> Detail nor Hint messages in this example are worth the electrons they
> take up, as they add nothing useful to the basic error message.  I'd be
> embarrassed to be presenting output like this to end users of Postgres.
>
> (The code comments are in even worse shape than the user-visible
> messages, by the by, where they exist at all.)
>
> A somewhat related point is that it doesn't look like any thought
> has been taken about localized message output.
>
> This stuff is certainly all fixable if we agree on the basic
> implementation approach; and I can't really fault Pavel for not
> worrying much about such details while the implementation approach
> hasn't been agreed to.  But I'd judge that there's a week or more's
> worth of work here to get to a patch I'd be happy about committing,
> even without any change in the basic approach.  That's not time I'm
> willing to put in personally right now.
>
>                         regards, tom lane

Attachment

Re: plpgsql_check_function - rebase for 9.3

From
Peter Eisentraut
Date:
On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
> I redesigned output from plpgsql_check_function. Now, it returns table
> everytime.
> Litlle bit code simplification.

This patch is in the 2013-09 commitfest but needs a rebase.





Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
rebased

Regards

Pavel


2013/8/22 Peter Eisentraut <peter_e@gmx.net>
On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
> I redesigned output from plpgsql_check_function. Now, it returns table
> everytime.
> Litlle bit code simplification.

This patch is in the 2013-09 commitfest but needs a rebase.



Attachment

Re: plpgsql_check_function - rebase for 9.3

From
Steve Singer
Date:
On 08/22/2013 02:02 AM, Pavel Stehule wrote:
> rebased
>
> Regards
>
> Pavel
>
This patch again needs a rebase, it no longer applies cleanly. 
plpgsql_estate_setup now takes a shared estate parameter and the call in 
pl_check needs that.   I passed NULL in and things seem to work.



I think this is another example where are processes aren't working as 
we'd like.

The last time this patch got  a review was in March when Tom said 
http://www.postgresql.org/message-id/27661.1364267665@sss.pgh.pa.us

Shortly after Pavel responded with a new patch that changes the output 
format. 
http://www.postgresql.org/message-id/CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_JjoP21FJmrQ@mail.gmail.com

I don't much has progress in the following  9 months on this patch.

In Tom's review the issue of code duplication and asks:
"do we want to accept that this is the implementation pathway we're 
going to settle for, or are we going to hold out for a better one? I'd 
be the first in line to hold out if I had a clue how to move towards a 
better implementation, but I have none."

This question is still outstanding.

Here are my comments on this version of the patch:



This version of the patch gives output as a table with the structure:

functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

I like this much better than the previous format.

I think the patch needs more documentation.

The extent of the documentation is
 <title>Checking of embedded SQL</title>
+    <para>
+     The SQL statements inside <application>PL/pgSQL</> functions are
+     checked by validator for semantic errors. These errors
+     can be found by <function>plpgsql_check_function</function>:



I a don't think that this adequately describes the function.  It isn't clear to me what we mean by 'embedded' SQL.
and 'semantic errors'.


I think this would read better as
  <sect2 id="plpgsql-check-function">   <title>Checking SQL Inside Functions</title>   <para>     The SQL Statements
insideof <application>PL/pgsql</> functions can be     checked by a validation function for semantic errors. You can
check    <application>PL/pgsl</application> functions by passing the name of the     function to
<function>plpgsql_check_function</function>.  </para>   <para>   The <function>plpgsql_check_function</function> will
checkthe SQL   inside of a <application>PL/pgsql</application> function for certain   types of errors and warnings.
</para>

but that needs to be expanded on.

I was expecting something like:

create function x() returns void as $$ declare a int4; begin a='f'; end $$ language plpgsql;

to return an error but it doesn't

but

create temp table x(a int4);
create or replace function x() returns void as $$ declare a int4; begin insert into x values('h'); end $$ language
plpgsql;

does give an error when I pass it to the validator.   Is this the intended behavior of the patch? If so we need to
explainwhy the first function doesn't generate an error.
 


I feel we need to document what each of the columns in the output means.  What is the difference between statement,
queryand context?
 

Some random comments on the messages you return:

Line 816:
if (is_expression && tupdesc->natts != 1)        ereport(ERROR,                (errcode(ERRCODE_SYNTAX_ERROR),
      errmsg("qw",                        query->query,                        tupdesc->natts)));
 

Should this be "query \"%s\" returned %d columns but only 1 is allowed"  or something similar?
 Line 837
else        /* XXX: should be a warning? */        ereport(ERROR,                (errmsg("cannot to identify real type
forrecord type variable")));
 

In what situation does this come  up?  Should it be a warning or error?  
Do we mean "the real type for record variable" ?

Line 1000    ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+                                  errmsg("function does not return 
composite type, is not possible to identify composite type")));

Should this be "function does not return a composite type, it is not 
possible to identify the composite type" ?

Line 1109:
    appendStringInfo(&str, "parameter \"%s\" is overlapped",
+                                                      refname);

gives warnings like:

select message,detail FROM  plpgsql_check_function('b(int)');           message           | detail
-----------------------------+-------------------------------------------- parameter "a" is overlapped | Local variable
overlapfunction parameter.
 


How about instead
"parameter "a" is duplicated" | Local variable is named the same as the 
function parameter


Line 1278:

+                         checker_error(cstate,
+                                       0, 0,
+                                 "cannot determinate a result of 
dynamic SQL",
+                                       "Cannot to contine in check.",
+                       "Don't use dynamic SQL and record type together, 
when you would check function.",
+                                       "warning",
+                                       0, NULL, NULL);

How about
"cannot determine the result of dynamic SQL" , "Cannot continue 
validating the function", "Do not use plpgsql_check_function on 
functions with dynamic SQL"
Also this limitation should be explained in the documentation.

I also thing we need to distinguish between warnings generated because 
of problems in the function versus warnings generated because of 
limitations in the validator.    This implies that there is maybe 
something wrong with my function but there is nothing wrong with using 
dynamic SQL in functions this is just telling users about a runtime 
warning of the validator itself.

Same thing around line 1427


I have not done an in-depth read of the code.

I'm sending this out this patch at least gets some review.  I don't 
think that I will  have a lot more time in the next week to do a more 
thorough review or follow-up review


If we aren't happy with the overal approach of this patch then we need 
to tell Pavel.

My vote would be to try to get the patch (documentation, error messages, 
'XXX' items, etc) into a better state so it can eventually be committed


Steve

>
> 2013/8/22 Peter Eisentraut <peter_e@gmx.net <mailto:peter_e@gmx.net>>
>
>     On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
>     > I redesigned output from plpgsql_check_function. Now, it returns
>     table
>     > everytime.
>     > Litlle bit code simplification.
>
>     This patch is in the 2013-09 commitfest but needs a rebase.
>
>
>
>
>




Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
Hello

thank you for review


2013/12/7 Steve Singer <steve@ssinger.info>
On 08/22/2013 02:02 AM, Pavel Stehule wrote:
rebased

Regards

Pavel

This patch again needs a rebase, it no longer applies cleanly. plpgsql_estate_setup now takes a shared estate parameter and the call in pl_check needs that.   I passed NULL in and things seem to work.



I think this is another example where are processes aren't working as we'd like.

The last time this patch got  a review was in March when Tom said http://www.postgresql.org/message-id/27661.1364267665@sss.pgh.pa.us

Shortly after Pavel responded with a new patch that changes the output format. http://www.postgresql.org/message-id/CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_JjoP21FJmrQ@mail.gmail.com

I don't much has progress in the following  9 months on this patch.

In Tom's review the issue of code duplication and asks:

"do we want to accept that this is the implementation pathway we're going to settle for, or are we going to hold out for a better one? I'd be the first in line to hold out if I had a clue how to move towards a better implementation, but I have none."

yes, I am waiting still to a) have some better idea, or b) have significantly more free time for larger plpgsql refactoring. Nothing of it happens :(


This question is still outstanding.

Here are my comments on this version of the patch:



This version of the patch gives output as a table with the structure:

functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

I like this much better than the previous format.

I think the patch needs more documentation.

The extent of the documentation is

 <title>Checking of embedded SQL</title>
+    <para>
+     The SQL statements inside <application>PL/pgSQL</> functions are
+     checked by validator for semantic errors. These errors
+     can be found by <function>plpgsql_check_function</function>:



I a don't think that this adequately describes the function.  It isn't clear to me what we mean by 'embedded' SQL.
and 'semantic errors'.


I think this would read better as

  <sect2 id="plpgsql-check-function">
   <title>Checking SQL Inside Functions</title>
   <para>
     The SQL Statements inside of <application>PL/pgsql</> functions can be
     checked by a validation function for semantic errors. You can check
     <application>PL/pgsl</application> functions by passing the name of the
     function to <function>plpgsql_check_function</function>.
   </para>
   <para>
   The <function>plpgsql_check_function</function> will check the SQL
   inside of a <application>PL/pgsql</application> function for certain
   types of errors and warnings.
   </para>

but that needs to be expanded on.

I was expecting something like:

create function x() returns void as $$ declare a int4; begin a='f'; end $$ language plpgsql;

to return an error but it doesn't

This is expected. PL/pgSQL use a late casting - so a := '10'; is acceptable. And in this moment plpgsql check doesn't evaluate constant and doesn't try to cast to target type. But this check can be implemented sometime. In this moment, we have no simple way how to identify constant expression in plpgsql. So any constants are expressions from plpgsql perspective.
 

but

create temp table x(a int4);
create or replace function x() returns void as $$ declare a int4; begin insert into x values('h'); end $$ language plpgsql;

it is expected too. SQL doesn't use late casting - casting is controlled by available casts and constants are evaluated early - due possible optimization.
 

does give an error when I pass it to the validator.   Is this the intended behavior of the patch? If so we need to explain why the first function doesn't generate an error.


I feel we need to document what each of the columns in the output means.  What is the difference between statement, query and context?

Some random comments on the messages you return:

Line 816:

        if (is_expression && tupdesc->natts != 1)
                        ereport(ERROR,
                                        (errcode(ERRCODE_SYNTAX_ERROR),
                                         errmsg("qw",
                                                        query->query,
                                                        tupdesc->natts)));

Should this be "query \"%s\" returned %d columns but only 1 is allowed"  or something similar?

 Line 837

        else
                        /* XXX: should be a warning? */
                        ereport(ERROR,
                                        (errmsg("cannot to identify real type for record type variable")));

In what situation does this come  up?  Should it be a warning or error?  Do we mean "the real type for record variable" ?

a most difficult part of plpgsql check function is about transformation dynamic types to know row types. It is necessary for checking usable functions and operators.

typical pattern is:

declare r record;
begin
  for r in select a, b from some_tab
  loop
    raise notice '%', extract(day from r.a);
  end loop;

and we should to detect type of r.a. Sometimes we cannot to do it due using dynamic SQL - plpgsql check doesn't try to evaluate expressions (as protection against side efects).


 

Line 1000
    ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+                                  errmsg("function does not return composite type, is not possible to identify composite type")));

Should this be "function does not return a composite type, it is not possible to identify the composite type" ?

Line 1109:

    appendStringInfo(&str, "parameter \"%s\" is overlapped",
+                                                      refname);

gives warnings like:

select message,detail FROM  plpgsql_check_function('b(int)');
           message           | detail
-----------------------------+--------------------------------------------
 parameter "a" is overlapped | Local variable overlap function parameter.


How about instead
"parameter "a" is duplicated" | Local variable is named the same as the function parameter

I have no idea about good sentence, but "duplicate" probably is not best. Any variable can be locally overlapped. Probably any overlapping should be signalized - it is a bad design always.
 


Line 1278:

+                         checker_error(cstate,
+                                       0, 0,
+                                 "cannot determinate a result of dynamic SQL",
+                                       "Cannot to contine in check.",
+                       "Don't use dynamic SQL and record type together, when you would check function.",
+                                       "warning",
+                                       0, NULL, NULL);

How about
"cannot determine the result of dynamic SQL" , "Cannot continue validating the function", "Do not use plpgsql_check_function on functions with dynamic SQL"
Also this limitation should be explained in the documentation.

I also thing we need to distinguish between warnings generated because of problems in the function versus warnings generated because of limitations in the validator.    This implies that there is maybe something wrong with my function but there is nothing wrong with using dynamic SQL in functions this is just telling users about a runtime warning of the validator itself.

Same thing around line 1427


I have not done an in-depth read of the code.

I'm sending this out this patch at least gets some review.  I don't think that I will  have a lot more time in the next week to do a more thorough review or follow-up review


If we aren't happy with the overal approach of this patch then we need to tell Pavel.

My vote would be to try to get the patch (documentation, error messages, 'XXX' items, etc) into a better state so it can eventually be committed


Thank you

Regards

Pavel

 

Steve


2013/8/22 Peter Eisentraut <peter_e@gmx.net <mailto:peter_e@gmx.net>>


    On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
    > I redesigned output from plpgsql_check_function. Now, it returns
    table
    > everytime.
    > Litlle bit code simplification.

    This patch is in the 2013-09 commitfest but needs a rebase.








--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
There is still valid a variant to move check function to contrib for fist moment.

Regards

Pavel


2013/12/7 Pavel Stehule <pavel.stehule@gmail.com>
Hello

thank you for review


2013/12/7 Steve Singer <steve@ssinger.info>
On 08/22/2013 02:02 AM, Pavel Stehule wrote:
rebased

Regards

Pavel

This patch again needs a rebase, it no longer applies cleanly. plpgsql_estate_setup now takes a shared estate parameter and the call in pl_check needs that.   I passed NULL in and things seem to work.



I think this is another example where are processes aren't working as we'd like.

The last time this patch got  a review was in March when Tom said http://www.postgresql.org/message-id/27661.1364267665@sss.pgh.pa.us

Shortly after Pavel responded with a new patch that changes the output format. http://www.postgresql.org/message-id/CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_JjoP21FJmrQ@mail.gmail.com

I don't much has progress in the following  9 months on this patch.

In Tom's review the issue of code duplication and asks:

"do we want to accept that this is the implementation pathway we're going to settle for, or are we going to hold out for a better one? I'd be the first in line to hold out if I had a clue how to move towards a better implementation, but I have none."

yes, I am waiting still to a) have some better idea, or b) have significantly more free time for larger plpgsql refactoring. Nothing of it happens :(


This question is still outstanding.

Here are my comments on this version of the patch:



This version of the patch gives output as a table with the structure:

functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

I like this much better than the previous format.

I think the patch needs more documentation.

The extent of the documentation is

 <title>Checking of embedded SQL</title>
+    <para>
+     The SQL statements inside <application>PL/pgSQL</> functions are
+     checked by validator for semantic errors. These errors
+     can be found by <function>plpgsql_check_function</function>:



I a don't think that this adequately describes the function.  It isn't clear to me what we mean by 'embedded' SQL.
and 'semantic errors'.


I think this would read better as

  <sect2 id="plpgsql-check-function">
   <title>Checking SQL Inside Functions</title>
   <para>
     The SQL Statements inside of <application>PL/pgsql</> functions can be
     checked by a validation function for semantic errors. You can check
     <application>PL/pgsl</application> functions by passing the name of the
     function to <function>plpgsql_check_function</function>.
   </para>
   <para>
   The <function>plpgsql_check_function</function> will check the SQL
   inside of a <application>PL/pgsql</application> function for certain
   types of errors and warnings.
   </para>

but that needs to be expanded on.

I was expecting something like:

create function x() returns void as $$ declare a int4; begin a='f'; end $$ language plpgsql;

to return an error but it doesn't

This is expected. PL/pgSQL use a late casting - so a := '10'; is acceptable. And in this moment plpgsql check doesn't evaluate constant and doesn't try to cast to target type. But this check can be implemented sometime. In this moment, we have no simple way how to identify constant expression in plpgsql. So any constants are expressions from plpgsql perspective.
 

but

create temp table x(a int4);
create or replace function x() returns void as $$ declare a int4; begin insert into x values('h'); end $$ language plpgsql;

it is expected too. SQL doesn't use late casting - casting is controlled by available casts and constants are evaluated early - due possible optimization.
 

does give an error when I pass it to the validator.   Is this the intended behavior of the patch? If so we need to explain why the first function doesn't generate an error.


I feel we need to document what each of the columns in the output means.  What is the difference between statement, query and context?

Some random comments on the messages you return:

Line 816:

        if (is_expression && tupdesc->natts != 1)
                        ereport(ERROR,
                                        (errcode(ERRCODE_SYNTAX_ERROR),
                                         errmsg("qw",
                                                        query->query,
                                                        tupdesc->natts)));

Should this be "query \"%s\" returned %d columns but only 1 is allowed"  or something similar?

 Line 837

        else
                        /* XXX: should be a warning? */
                        ereport(ERROR,
                                        (errmsg("cannot to identify real type for record type variable")));

In what situation does this come  up?  Should it be a warning or error?  Do we mean "the real type for record variable" ?

a most difficult part of plpgsql check function is about transformation dynamic types to know row types. It is necessary for checking usable functions and operators.

typical pattern is:

declare r record;
begin
  for r in select a, b from some_tab
  loop
    raise notice '%', extract(day from r.a);
  end loop;

and we should to detect type of r.a. Sometimes we cannot to do it due using dynamic SQL - plpgsql check doesn't try to evaluate expressions (as protection against side efects).


 

Line 1000
    ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+                                  errmsg("function does not return composite type, is not possible to identify composite type")));

Should this be "function does not return a composite type, it is not possible to identify the composite type" ?

Line 1109:

    appendStringInfo(&str, "parameter \"%s\" is overlapped",
+                                                      refname);

gives warnings like:

select message,detail FROM  plpgsql_check_function('b(int)');
           message           | detail
-----------------------------+--------------------------------------------
 parameter "a" is overlapped | Local variable overlap function parameter.


How about instead
"parameter "a" is duplicated" | Local variable is named the same as the function parameter

I have no idea about good sentence, but "duplicate" probably is not best. Any variable can be locally overlapped. Probably any overlapping should be signalized - it is a bad design always.
 


Line 1278:

+                         checker_error(cstate,
+                                       0, 0,
+                                 "cannot determinate a result of dynamic SQL",
+                                       "Cannot to contine in check.",
+                       "Don't use dynamic SQL and record type together, when you would check function.",
+                                       "warning",
+                                       0, NULL, NULL);

How about
"cannot determine the result of dynamic SQL" , "Cannot continue validating the function", "Do not use plpgsql_check_function on functions with dynamic SQL"
Also this limitation should be explained in the documentation.

I also thing we need to distinguish between warnings generated because of problems in the function versus warnings generated because of limitations in the validator.    This implies that there is maybe something wrong with my function but there is nothing wrong with using dynamic SQL in functions this is just telling users about a runtime warning of the validator itself.

Same thing around line 1427


I have not done an in-depth read of the code.

I'm sending this out this patch at least gets some review.  I don't think that I will  have a lot more time in the next week to do a more thorough review or follow-up review


If we aren't happy with the overal approach of this patch then we need to tell Pavel.

My vote would be to try to get the patch (documentation, error messages, 'XXX' items, etc) into a better state so it can eventually be committed


Thank you

Regards

Pavel

 

Steve


2013/8/22 Peter Eisentraut <peter_e@gmx.net <mailto:peter_e@gmx.net>>


    On Wed, 2013-03-27 at 23:25 +0100, Pavel Stehule wrote:
    > I redesigned output from plpgsql_check_function. Now, it returns
    table
    > everytime.
    > Litlle bit code simplification.

    This patch is in the 2013-09 commitfest but needs a rebase.








--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: plpgsql_check_function - rebase for 9.3

From
Peter Eisentraut
Date:
In my opinion, the idea of having a separate lint checker for a language
is antiquated.  If there are problems, they should be diagnosed at
compile time or run time.  You can add options about warning levels or
strictness if there are concerns about which diagnostics are
appropriate.
>         
>         
>         
>         
> 
> 
> 




Re: plpgsql_check_function - rebase for 9.3

From
Peter Eisentraut
Date:
On Sun, 2013-12-08 at 09:45 +0100, Pavel Stehule wrote:
> There is still valid a variant to move check function to contrib for
> fist moment. 

If we are not happy with the code or the interface, then moving it to
contrib is not a solution.  We are still committed to main things in
contrib indefinitely.





Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:



2013/12/8 Peter Eisentraut <peter_e@gmx.net>
In my opinion, the idea of having a separate lint checker for a language
is antiquated.  If there are problems, they should be diagnosed at
compile time or run time.  You can add options about warning levels or
strictness if there are concerns about which diagnostics are
appropriate.

There are two points, that should be solved

a) introduction a dependency to other object in schema - now CREATE FUNCTION is fully independent on others

b) slow start - if we check all paths on start, then start can be slower - and some functions should not work due dependency on temporary tables.

I am thinking about possible marking a function by #option (we have same idea)

some like

#option check_on_first_start
#option check_on_create
#option check_newer

But still I have no idea, how to push check without possible slowdown execution with code duplication

Pavel

 
>
>
>
>
>
>
>


Re: plpgsql_check_function - rebase for 9.3

From
Amit Kapila
Date:
On Sun, Dec 8, 2013 at 10:31 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> 2013/12/8 Peter Eisentraut <peter_e@gmx.net>
>>
>> In my opinion, the idea of having a separate lint checker for a language
>> is antiquated.  If there are problems, they should be diagnosed at
>> compile time or run time.  You can add options about warning levels or
>> strictness if there are concerns about which diagnostics are
>> appropriate.
>
>
> There are two points, that should be solved
>
> a) introduction a dependency to other object in schema - now CREATE FUNCTION
> is fully independent on others
>
> b) slow start - if we check all paths on start, then start can be slower -
> and some functions should not work due dependency on temporary tables.
>
> I am thinking about possible marking a function by #option (we have same
> idea)
>
> some like
>
> #option check_on_first_start
> #option check_on_create
> #option check_newer

what exactly check_newer means, does it mean whenever a function is
replaced (changed)?

> But still I have no idea, how to push check without possible slowdown
> execution with code duplication


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:



2013/12/9 Amit Kapila <amit.kapila16@gmail.com>
On Sun, Dec 8, 2013 at 10:31 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> 2013/12/8 Peter Eisentraut <peter_e@gmx.net>
>>
>> In my opinion, the idea of having a separate lint checker for a language
>> is antiquated.  If there are problems, they should be diagnosed at
>> compile time or run time.  You can add options about warning levels or
>> strictness if there are concerns about which diagnostics are
>> appropriate.
>
>
> There are two points, that should be solved
>
> a) introduction a dependency to other object in schema - now CREATE FUNCTION
> is fully independent on others
>
> b) slow start - if we check all paths on start, then start can be slower -
> and some functions should not work due dependency on temporary tables.
>
> I am thinking about possible marking a function by #option (we have same
> idea)
>
> some like
>
> #option check_on_first_start
> #option check_on_create
> #option check_newer

what exactly check_newer means, does it mean whenever a function is
replaced (changed)?


no, it means, so request for check will be ignored ever - some functions cannot be deeply checked due using dynamic SQL or dynamic created data types - temporary tables created in functions.

Regards

Pavel
 
> But still I have no idea, how to push check without possible slowdown
> execution with code duplication


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: plpgsql_check_function - rebase for 9.3

From
Jim Nasby
Date:
On 12/8/13 11:24 PM, Pavel Stehule wrote:
>      > #option check_on_first_start
>      > #option check_on_create
>      > #option check_newer
>
>     what exactly check_newer means, does it mean whenever a function is
>     replaced (changed)?
>
>
> no, it means, so request for check will be ignored ever - some functions cannot be deeply checked due using dynamic
SQLor dynamic created data types - temporary tables created in functions.
 

So presumably it would be check_never, not check_newer... :) BTW, it's not terribly hard to work around the temp table
issue;you just need to create the expected table in the session when you create the function. But even in this case, I
thinkit would still be good to check what we can, like at least basic plpgsql syntax.
 

Do we really need first_start? ISTM that if you're dependent on run state then you're basically out of luck.
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:



2013/12/9 Jim Nasby <jim@nasby.net>
On 12/8/13 11:24 PM, Pavel Stehule wrote:
     > #option check_on_first_start
     > #option check_on_create
     > #option check_newer

    what exactly check_newer means, does it mean whenever a function is
    replaced (changed)?


no, it means, so request for check will be ignored ever - some functions cannot be deeply checked due using dynamic SQL or dynamic created data types - temporary tables created in functions.

So presumably it would be check_never, not check_newer... :) BTW, it's not terribly hard to work around the temp table issue; you just need to create the expected table in the session when you create the function. But even in this case, I think it would still be good to check what we can, like at least basic plpgsql syntax.

I sorry.

You cannot to create temporary table - this check should not have any side effect - and creating temporary table can run some event trigger.

But there should be some hints for check like annotations or some similar. Or you can minimize a area where check will be disabled.
 

Do we really need first_start? ISTM that if you're dependent on run state then you're basically out of luck.


I afraid so checking on creation time is not enough for plpgsql.

and I have a very good experience with check on start from plpgsql_lint usage when I wrote a regression tests.

A "first start" doesn't create dependency on state - but just more preciously define a time, when checking will be done. Probably a option check_create_and_start can be useful.

Regards

Pavel
 
--
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

Re: plpgsql_check_function - rebase for 9.3

From
Jim Nasby
Date:
On 12/9/13 1:08 PM, Pavel Stehule wrote:
>     So presumably it would be check_never, not check_newer... :) BTW, it's not terribly hard to work around the temp
tableissue; you just need to create the expected table in the session when you create the function. But even in this
case,I think it would still be good to check what we can, like at least basic plpgsql syntax.
 
>
>
> I sorry.
>
> You cannot to create temporary table - this check should not have any side effect - and creating temporary table can
runsome event trigger.
 
>
> But there should be some hints for check like annotations or some similar. Or you can minimize a area where check
willbe disabled.
 

Sorry, I meant that the user can work around it by creating the table. I didn't mean to imply that we would magically
createa temp table to do the checking.
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql_check_function - rebase for 9.3

From
Peter Eisentraut
Date:
On 12/8/13, 12:01 PM, Pavel Stehule wrote:
> But still I have no idea, how to push check without possible slowdown
> execution with code duplication

Create a GUC parameter plpgsql.slow_checks or whatever (perhaps more
specific), which people can turn on when they run their test suites.

This doesn't really have to be all that much different from what we are
currently doing in C with scan-build and address sanitizer, for example.




Re: plpgsql_check_function - rebase for 9.3

From
Amit Kapila
Date:
On Mon, Dec 9, 2013 at 10:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2013/12/9 Amit Kapila <amit.kapila16@gmail.com>
>>
>> >
>> > There are two points, that should be solved
>> >
>> > a) introduction a dependency to other object in schema - now CREATE
>> > FUNCTION
>> > is fully independent on others
>> >
>> > b) slow start - if we check all paths on start, then start can be slower
>> > -
>> > and some functions should not work due dependency on temporary tables.
>> >
>> > I am thinking about possible marking a function by #option (we have same
>> > idea)
>> >
>> > some like
>> >
>> > #option check_on_first_start
>> > #option check_on_create
>> > #option check_newer
>>
>> what exactly check_newer means, does it mean whenever a function is
>> replaced (changed)?
>>
>
> no, it means, so request for check will be ignored ever - some functions
> cannot be deeply checked due using dynamic SQL or dynamic created data types
> - temporary tables created in functions.

Thanks for clarification, the part of name 'newer' has created
confusion. I understand
that creating/identifying dependency in some of the cases will be
quite tricky, does other
similar languages for other databases does that for all cases (objects
in dynamic statements).

Is the main reason for identifying/creating dependency is to mark
function as invalid when
any dependent object is Dropped/Altered?

This thread is from quite some time, so please excuse me if I had
asked anything which has
been discussed previously.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
<div dir="ltr"><br /><div class="gmail_extra"><br /><br /><div class="gmail_quote">2013/12/9 Peter Eisentraut <span
dir="ltr"><<ahref="mailto:peter_e@gmx.net" target="_blank">peter_e@gmx.net</a>></span><br /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 12/8/13,
12:01PM, Pavel Stehule wrote:<br /> > But still I have no idea, how to push check without possible slowdown<br />
>execution with code duplication<br /><br /></div>Create a GUC parameter plpgsql.slow_checks or whatever (perhaps
more<br/> specific), which people can turn on when they run their test suites.<br /><br /> This doesn't really have to
beall that much different from what we are<br /> currently doing in C with scan-build and address sanitizer, for
example.<br/><br /></blockquote></div><br /></div><div class="gmail_extra">A main issue is placing these tests on
criticalpath. <br /><br /></div><div class="gmail_extra">You have to check it in every expression, in every internal
switch<br /><br /></div><div class="gmail_extra">There are two main purposes<br /><br /></div><div
class="gmail_extra">a)ensure a expression/query is valid (not only syntax valid)<br /></div><div class="gmail_extra">b)
searchall expressions/queries - visit all possible paths in code.<br /><br /></div><div class="gmail_extra">so you
shouldto place new switch everywhere in plpgsql executor, where is entry to some path.<br /></div><div
class="gmail_extra"><br/></div><div class="gmail_extra">Second issue - these check decrease a readability of plpgsql
statementexecutor handlers. This code is relative very readable now. With new switch is little bit (more) less
clean.<br/><br /></div><div class="gmail_extra">I think so fact we use a two other large statement switch (printing,
freeexpressions) is natural, and it hard to write it better.<br /><br /></div><div class="gmail_extra">Regards<br /><br
/></div><divclass="gmail_extra">Pavel<br /></div></div> 

Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:



2013/12/10 Amit Kapila <amit.kapila16@gmail.com>
On Mon, Dec 9, 2013 at 10:54 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2013/12/9 Amit Kapila <amit.kapila16@gmail.com>
>>
>> >
>> > There are two points, that should be solved
>> >
>> > a) introduction a dependency to other object in schema - now CREATE
>> > FUNCTION
>> > is fully independent on others
>> >
>> > b) slow start - if we check all paths on start, then start can be slower
>> > -
>> > and some functions should not work due dependency on temporary tables.
>> >
>> > I am thinking about possible marking a function by #option (we have same
>> > idea)
>> >
>> > some like
>> >
>> > #option check_on_first_start
>> > #option check_on_create
>> > #option check_newer
>>
>> what exactly check_newer means, does it mean whenever a function is
>> replaced (changed)?
>>
>
> no, it means, so request for check will be ignored ever - some functions
> cannot be deeply checked due using dynamic SQL or dynamic created data types
> - temporary tables created in functions.

Thanks for clarification, the part of name 'newer' has created
confusion. I understand
that creating/identifying dependency in some of the cases will be
quite tricky, does other
similar languages for other databases does that for all cases (objects
in dynamic statements).

I am sorry

PL/pgSQL is really specific due invisible SQL base and mixing two environments and languages in user visible area.

Is the main reason for identifying/creating dependency is to mark
function as invalid when
any dependent object is Dropped/Altered?

Now, PG has no any tool for checking dependency between functions and other objects. What has positive effects - we have very simply deploying, that works in almost use cases very well - and works with our temporary tables implementation. There is simple rule - depended object must living before they are >>used in runtime<<. But checking should not be runtime event and we would to decrease a false alarms. So we have to expect so almost all necessary object are created - it is reason, why we decided don't do check in create function statement time. We don't would to introduce new dependency if it will be possible.
 
Regards

Pavel


This thread is from quite some time, so please excuse me if I had
asked anything which has
been discussed previously.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: plpgsql_check_function - rebase for 9.3

From
Robert Haas
Date:
On Tue, Dec 10, 2013 at 1:45 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Now, PG has no any tool for checking dependency between functions and other
> objects. What has positive effects - we have very simply deploying, that
> works in almost use cases very well - and works with our temporary tables
> implementation. There is simple rule - depended object must living before
> they are >>used in runtime<<. But checking should not be runtime event and
> we would to decrease a false alarms. So we have to expect so almost all
> necessary object are created - it is reason, why we decided don't do check
> in create function statement time. We don't would to introduce new
> dependency if it will be possible.

This is a very good point.  Annotating the function itself with
markers that cause it to be more strictly checked will create a
dump/reload problem that we won't enjoy solving.  The decision to
check the function more strictly or not would need to be based on some
kind of session state that users could establish but dump restore
would not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: plpgsql_check_function - rebase for 9.3

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> This is a very good point.  Annotating the function itself with
> markers that cause it to be more strictly checked will create a
> dump/reload problem that we won't enjoy solving.  The decision to
> check the function more strictly or not would need to be based on some
> kind of session state that users could establish but dump restore
> would not.

One would hope that turning off check_function_bodies would be sufficient
to disable any added checking, though, so I don't see this being a problem
for pg_dump.  But there might be other scenarios where an additional knob
would be useful.
        regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
Josh Berkus
Date:
On 12/10/2013 12:50 PM, Tom Lane wrote:
> One would hope that turning off check_function_bodies would be sufficient
> to disable any added checking, though, so I don't see this being a problem
> for pg_dump.  But there might be other scenarios where an additional knob
> would be useful.

I can't think of one, offhand.  And +1 for NOT adding a new knob.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: plpgsql_check_function - rebase for 9.3

From
Amit Kapila
Date:
On Tue, Dec 10, 2013 at 12:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2013/12/10 Amit Kapila <amit.kapila16@gmail.com>
>> On Mon, Dec 9, 2013 at 10:54 AM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>> > 2013/12/9 Amit Kapila <amit.kapila16@gmail.com>
>> >> > There are two points, that should be solved
>> >> >
>> >> > a) introduction a dependency to other object in schema - now CREATE
>> >> > FUNCTION
>> >> > is fully independent on others
>> >> >
>> >> > b) slow start - if we check all paths on start, then start can be
>> >> > slower
>>>>
>> >> > some like
>> >> >
>> >> > #option check_on_first_start
>> >> > #option check_on_create
>> >> > #option check_newer
>> >>
>> >> what exactly check_newer means, does it mean whenever a function is
>> >> replaced (changed)?
>> >>
>> >
>> > no, it means, so request for check will be ignored ever - some functions
>> > cannot be deeply checked due using dynamic SQL or dynamic created data
>> > types
>> > - temporary tables created in functions.
>>
>
>
> Now, PG has no any tool for checking dependency between functions and other
> objects. What has positive effects - we have very simply deploying, that
> works in almost use cases very well - and works with our temporary tables
> implementation. There is simple rule - depended object must living before
> they are >>used in runtime<<. But checking should not be runtime event and
> we would to decrease a false alarms. So we have to expect so almost all
> necessary object are created - it is reason, why we decided don't do check
> in create function statement time.

Isn't that already done for SQL function's (fmgr_sql_validator)?

postgres=# CREATE FUNCTION clean_emp() RETURNS void AS
postgres'#     DELETE FROM emp
postgres'#         WHERE salary < 0;
postgres'# ' LANGUAGE SQL;
ERROR:  relation "emp" does not exist
LINE 2:     DELETE FROM emp                       ^

I mean to say that the above rule stated by you ("There is simple rule
- depended object must living before
they are >>used in runtime<<") doesn't seem to be true for SQL functions.
So isn't it better to do same for plpgsql functions as well?

For doing at runtime (during first execution of function) are you
planing to add it as a extra step
such that if parameter check_on_first_start is set, then do it.

>We don't would to introduce new dependency if it will be possible. In that case what exactly you mean to say in point
a)("introduction
 
a dependency to other object..") above in you mail.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: plpgsql_check_function - rebase for 9.3

From
Tom Lane
Date:
Amit Kapila <amit.kapila16@gmail.com> writes:
> On Tue, Dec 10, 2013 at 12:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> Now, PG has no any tool for checking dependency between functions and other
>> objects.

> Isn't that already done for SQL function's (fmgr_sql_validator)?

Pavel's point is that the only way to find out if the validator will fail
is to run it and see if it fails; and even if it does, how much will you
know about why?  That's not particularly helpful for pg_dump, which needs
to understand dependencies in a fairly deep fashion.  It not only needs to
figure out how to dump the database objects in a dependency-safe order,
but what to do to break dependency loops.

Right now I believe that pg_dump has a workable strategy for every
possible case of circular dependencies, because they are all caused by
secondary attributes of objects that can be split out and applied later,
for example applying a column default via ALTER TABLE ALTER COLUMN SET
DEFAULT rather than listing the default right in the CREATE TABLE command.

However, if function A depends on B and also vice-versa (mutual recursion
is not exactly an unheard-of technique), there is no way to load them both
if the function bodies are both checked at creation time.

I guess we could invent some equivalent of a forward declaration, but
that still leaves us short of understanding what the function body is
depending on.
        regards, tom lane



Re: plpgsql_check_function - rebase for 9.3

From
Amit Kapila
Date:
On Wed, Dec 11, 2013 at 10:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> On Tue, Dec 10, 2013 at 12:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> Now, PG has no any tool for checking dependency between functions and other
>>> objects.
>
>> Isn't that already done for SQL function's (fmgr_sql_validator)?
>
> Pavel's point is that the only way to find out if the validator will fail
> is to run it and see if it fails; and even if it does, how much will you
> know about why?
  One of the important thing at time of function creation, users are
interested in knowing  is that if there are any objects (table/view/sequence ..) that are
used in function body  and are missing and the reason is I think they don't want such
things to come up during execution.
  Similar thing happens for prepared statements in PostgreSQL, like
at time of parse message  only it checks both syntax errors and semantic check (which ensures
statement is meaningful,  for ex. whether objects and columns used in the statements exist)
  Like we do checks other than syntax check at time of creation of
prepared statement, same  thing should be considered meaning full at time of function creation.
  As you mentioned, there are checks (like dependency, mutual
recursion) which are difficult or not  feasible in current design to perform, but so will be the case for
them to execute during first execution  of function. So is it not better to do what is more feasible during
function creation rather than leaving  most of the things at execution phase?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: plpgsql_check_function - rebase for 9.3

From
Pavel Stehule
Date:
<p dir="ltr">I dislike it. a too early check means a issues with temporary tables and mainy new dependency between
functionsin complex projects. It is some what we don't want.<div class="gmail_quote">Dne 12. 12. 2013 5:30 "Amit
Kapila"<<a href="mailto:amit.kapila16@gmail.com">amit.kapila16@gmail.com</a>> napsal(a):<br type="attribution"
/><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On Wed, Dec 11,
2013at 10:40 AM, Tom Lane <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> wrote:<br /> > Amit
Kapila<<a href="mailto:amit.kapila16@gmail.com">amit.kapila16@gmail.com</a>> writes:<br /> >> On Tue, Dec
10,2013 at 12:15 PM, Pavel Stehule <<a href="mailto:pavel.stehule@gmail.com">pavel.stehule@gmail.com</a>>
wrote:<br/> >>> Now, PG has no any tool for checking dependency between functions and other<br /> >>>
objects.<br/> ><br /> >> Isn't that already done for SQL function's (fmgr_sql_validator)?<br /> ><br />
>Pavel's point is that the only way to find out if the validator will fail<br /> > is to run it and see if it
fails;and even if it does, how much will you<br /> > know about why?<br /><br />    One of the important thing at
timeof function creation, users are<br /> interested in knowing<br />    is that if there are any objects
(table/view/sequence..) that are<br /> used in function body<br />    and are missing and the reason is I think they
don'twant such<br /> things to come up during execution.<br /><br />    Similar thing happens for prepared statements
inPostgreSQL, like<br /> at time of parse message<br />    only it checks both syntax errors and semantic check (which
ensures<br/> statement is meaningful,<br />    for ex. whether objects and columns used in the statements exist)<br
/><br/>    Like we do checks other than syntax check at time of creation of<br /> prepared statement, same<br />  
 thingshould be considered meaning full at time of function creation.<br /><br />    As you mentioned, there are checks
(likedependency, mutual<br /> recursion) which are difficult or not<br />    feasible in current design to perform, but
sowill be the case for<br /> them to execute during first execution<br />    of function. So is it not better to do
whatis more feasible during<br /> function creation rather than leaving<br />    most of the things at execution
phase?<br/><br /><br /> With Regards,<br /> Amit Kapila.<br /> EnterpriseDB: <a href="http://www.enterprisedb.com"
target="_blank">http://www.enterprisedb.com</a><br/></blockquote></div>