Thread: Review: prepare plans of embedded sql on function start

Review: prepare plans of embedded sql on function start

From
Andy Colson
Date:
Pavel, this patch:

https://commitfest.postgresql.org/action/patch_view?id=624

It applied clean and compiled ok, but I cannot get it to work at all.

$ psql
Timing is on.
psql (9.2devel)
Type "help" for help.

andy=# set plpgsql.prepare_plans to on_start;
ERROR:  unrecognized configuration parameter "plpgsql.prepare_plans"

It was also really upset when I added it to my postgresql.conf file.

I hate to split hairs, but the GUC having option on_start and on_demand seems weird.  Most everything else is a yes/no.
How'd you feel about renaming it to: prepare_plans_on_start = yes/no
 

But really its not start (start might imply you call the function and it starts executing), its on create, so maybe:
prepare_plans_on_create= yes/no
 

-Andy


Re: Review: prepare plans of embedded sql on function start

From
Andrew Dunstan
Date:

On 09/05/2011 05:03 PM, Andy Colson wrote:
> Pavel, this patch:
>
> https://commitfest.postgresql.org/action/patch_view?id=624
>
> It applied clean and compiled ok, but I cannot get it to work at all.
>
> $ psql
> Timing is on.
> psql (9.2devel)
> Type "help" for help.
>
> andy=# set plpgsql.prepare_plans to on_start;
> ERROR:  unrecognized configuration parameter "plpgsql.prepare_plans"
>
>

Did you add plpgsql to custom_variable_classes? It looks like you might 
not have. (I'm not sure why plpgsql switch should require one, though, 
especially since we now load plpgsql by default. It might be better just 
to call it plpgsql_prepare_on_start.)

cheers

andrew



Re: Review: prepare plans of embedded sql on function start

From
Andy Colson
Date:
On 09/05/2011 05:04 PM, Andrew Dunstan wrote:
>
>
> On 09/05/2011 05:03 PM, Andy Colson wrote:
>> Pavel, this patch:
>>
>> https://commitfest.postgresql.org/action/patch_view?id=624
>>
>> It applied clean and compiled ok, but I cannot get it to work at all.
>>
>> $ psql
>> Timing is on.
>> psql (9.2devel)
>> Type "help" for help.
>>
>> andy=# set plpgsql.prepare_plans to on_start;
>> ERROR: unrecognized configuration parameter "plpgsql.prepare_plans"
>>
>>
>
> Did you add plpgsql to custom_variable_classes? It looks like you might not have. (I'm not sure why plpgsql switch
shouldrequire one, though, especially since we now load plpgsql by default. It might be better just to call it
plpgsql_prepare_on_start.)
>
> cheers
>
> andrew
>
>

Ah, yep, that was the problem, thank you.

-Andy


Re: Review: prepare plans of embedded sql on function start

From
Andy Colson
Date:
On 09/05/2011 05:27 PM, Andy Colson wrote:
> On 09/05/2011 05:04 PM, Andrew Dunstan wrote:
>>
>>
>> On 09/05/2011 05:03 PM, Andy Colson wrote:
>>> Pavel, this patch:
>>>
>>> https://commitfest.postgresql.org/action/patch_view?id=624
>>>
>>> It applied clean and compiled ok, but I cannot get it to work at all.
>>>
>>> $ psql
>>> Timing is on.
>>> psql (9.2devel)
>>> Type "help" for help.
>>>
>>> andy=# set plpgsql.prepare_plans to on_start;
>>> ERROR: unrecognized configuration parameter "plpgsql.prepare_plans"
>>>
>>>
>>
>> Did you add plpgsql to custom_variable_classes? It looks like you might not have. (I'm not sure why plpgsql switch
shouldrequire one, though, especially since we now load plpgsql by default. It might be better just to call it
plpgsql_prepare_on_start.)
>>
>> cheers
>>
>> andrew
>>
>>
>
> Ah, yep, that was the problem, thank you.
>
> -Andy
>


However I still cannot get it to work.

andy=# set plpgsql.prepare_plans to on_start;
SET
Time: 0.123 ms
andy=# show plpgsql.prepare_plans; plpgsql.prepare_plans
----------------------- on_start
(1 row)


andy=# create or replace function test1(a integer) returns integer as $$
andy$# begin
andy$# return b+1;
andy$# end;
andy$# $$ language plpgsql;
CREATE FUNCTION
Time: 16.926 ms
andy=#


Oh... shoot, having gone back and read more closely I realize I didnt understand.  I thought the sql would be checked
oncreate.  That's not the case.
 

This is what I'd hopped it was:

create table junk1 (id serial,code1 integer,
);

create or replace function test2() returns integer as $$
declarex integer;
beginselect bob into x from junk1 where id = 4;return x;
end;
$$ language plpgsql;

I was thinking the create function would immediately return saying, unknown column bob, and not create the function.

So now with the function above, this patch has not helped me at all.  I wont get an error until I exec the function.
Justlike without the patch.
 

I'm not so sure how helpful that is.  What is you use the "if false then ... end if" trick to comment out some old
code? You're sill going to check the tables and fields on every exec?
 

Pavel, is there any way to move all that code to the create function?  But, then that would create a dependency where
thereis not one now.  So that would be bad.
 

How about a new "check function test2()" type of call?  I think having the tables/fields checked just once would be
betterthan checking them over and over on ever single execute.
 

-Andy


Re: Review: prepare plans of embedded sql on function start

From
Andy Colson
Date:
Hi Pavel,

I can get:

ERROR:  permission denied to set parameter "plpgsql.prepare_plans"

with this script:


set plpgsql.prepare_plans to on_start;

create or replace function test1(a integer) returns integer as $$
begin
return a+1;
end;
$$ language plpgsql;


If test1() exists, then this script works fine:

select * from test1(1);

set plpgsql.prepare_plans to on_start;

create or replace function test1(a integer) returns integer as $$
begin
return a+1;
end;
$$ language plpgsql;


-Andy


Re: [REVIEW] prepare plans of embedded sql on function start

From
Andy Colson
Date:
Purpose
========
Better test coverage of functions.  On first call of a function, all sql statements will be prepared, even those not
directlycalled.  Think:
 

create function test() returns void as $$
begin  if false then    select badcolumn from badtable;  end if;
end; $$ language plpgsql;


At first I thought this patch would check sql on create, but that would create a dependency, which would be bad.
Before, if you called this function, you'd get no error.  With this patch, and with postgresql.conf settings enabled,
youget:
 

select * from test();

ERROR:  relation "badtable" does not exist
LINE 1: select badcolumn from badtable                              ^
QUERY:  select badcolumn from badtable
CONTEXT:  PL/pgSQL function "test" line 4 at SQL statement



The Patch
=========
Applied ok, compile and make check ran ok.  It seems to add/edit regression tests, but no documentation.

I tried several different things I could think of, but it always found my bugs.  Its disabled by default so wont cause
unexpectedchanges.  Its easy to enable, and to have individual functions exempt themselves.
 



Performance
===========
No penalty.  At first I was concerned every function call would have overhead of extra preparing, but that is not the
case. It's prepared on first call but not subsequent calls.  But that made me worry that the prepare would exist too
longand use old outdated stats, that as well is not a problem.  I was able to setup a test where a bad index was
chosen. I used two different psql sessions.  In one I started a transaction, and selected from my function several
times(and it was slow because it was using a bad index).  In the other psql session I ran analyze on my table.  Back in
myfirst psql session, I just waited, running my function ever once and a while.  Eventually it picked up the new stats
andstart running quick again.
 




Code Review
===========
I am not qualified


Problems
========
I like the idea of this patch.  I think it will help catch more bugs in functions sooner.  However, a function like:

create function test5() returns integer as $$
begin  create temp table junk(id integer);  insert into junk(id) values(100);  drop table temp;  return 1;
end;
$$ language plpgsql;

Will always throw an error because at prepare time, the temp junk table wont exist.  This patch implements new syntax
todisable the check:
 

create function test5() returns integer as $$
#prepare_plans on_demand
begin
...

Was it Tom Lane that said, "if we add new syntax, we have to support it forever"?  As a helpful feature I can see
people(myself included) enabling this system wide.  So what happens to all the plpgsql on pgxn that this happens to
break? Well it needs updated, no problem, but the fix will be to add "#prepare_plans on_demand" in the magic spot.
Thatbreaks it for prior versions of PG.  Is this the syntax we want?  What if we add more "compiler flags" in the
future:

create function test5() returns integer as $$
#prepare_plans on_demand
#disable_xlog
#work_mem 10MB
begin  create temp table junk(id integer);  insert into junk(id) values(100);  drop table temp;  return 1;
end;
$$ language plpgsql;

I don't have an answer to that.  Other sql implement via OPTION(...).

One option I'd thought about, was to extended ANALYZE to support functions.  It would require no additional plpgsql
syntaxchanges, no postgresql.conf settings.  If you wanted to prepare (prepare for a testing purpose, not a performance
purpose)all the sql inside your function, youd:
 

analyze test5();

I'd expect to get errors from that, because the junk table doesn't exist.  I'd expect it, and just never analyze it.



Summary
=======
Its a tough one.  I see benefit here.  I can see myself using it.  If I had to put my finger on it, I'm not 100% sold
onthe syntax.  It is usable though, it does solve problems, so I'd use it.  (I'm not 100% sure ANALYZE is better,
either).

I'm going to leave this patch as "needs review", I think more eyes might be helpful.

-Andy


Re: [REVIEW] prepare plans of embedded sql on function start

From
Tom Lane
Date:
Andy Colson <andy@squeakycode.net> writes:
> [ Andy's dubious about adding plpgsql syntax to control this feature ]

Yeah, that bothers me a lot too.

> One option I'd thought about, was to extended ANALYZE to support functions.

That's actually quite a good idea, not least because the extra checking
happens only when you ask for it and not every time the function is
loaded into a new session.

I'm not that happy with overloading the ANALYZE keyword to mean this
(especially not since there is already meaning attached to the syntax
"ANALYZE x(y)").  But we could certainly use some other name --- I'm
inclined to suggest CHECK:
CHECK FUNCTION function_name(arglist);

People would want some sort of wild card capability; at the very least
"check all plpgsql functions owned by me".  Not sure what that ought
to look like syntactically.

It might also be a good idea to make sure there's room in the syntax to
specify different checking options.  We already would have reason to
want "just do the existing style of validation check" versus this more
intensive check.  And it's not hard to foresee other sorts of checking
in future.

Also, this would force us to invent PL-independent infrastructure for
doing the checking.  I'm envisioning an additional argument to the
existing PL validator function that tells it what checking options to
use.
        regards, tom lane


Re: [REVIEW] prepare plans of embedded sql on function start

From
Marti Raudsepp
Date:
On Sun, Sep 11, 2011 at 01:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not that happy with overloading the ANALYZE keyword to mean this
> But we could certainly use some other name --- I'm
> inclined to suggest CHECK:
>        CHECK FUNCTION function_name(arglist);

Just a thought: pg_check_function(oid)?

> People would want some sort of wild card capability; at the very least
> "check all plpgsql functions owned by me".

SELECT pg_check_function(p.oid) FROM pg_proc p
JOIN pg_user ON (usesysid=proowner) WHERE usename=current_user;

Regards,
Marti


Re: [REVIEW] prepare plans of embedded sql on function start

From
Tom Lane
Date:
Marti Raudsepp <marti@juffo.org> writes:
> On Sun, Sep 11, 2011 at 01:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm not that happy with overloading the ANALYZE keyword to mean this
>> But we could certainly use some other name --- I'm
>> inclined to suggest CHECK:
>>        CHECK FUNCTION function_name(arglist);

> Just a thought: pg_check_function(oid)?

>> People would want some sort of wild card capability; at the very least
>> "check all plpgsql functions owned by me".

> SELECT pg_check_function(p.oid) FROM pg_proc p
> JOIN pg_user ON (usesysid=proowner) WHERE usename=current_user;

Hmm, there's something in what you say --- it gets us out from under the
need to anticipate what wildcard rules people would want.  I think it
loses something in ease-of-use, but we could have the utility command
too for the simple check-one-function case, and direct people to the
function as soon as they want something fancier.

What about check-strength options?
        regards, tom lane


Re: [REVIEW] prepare plans of embedded sql on function start

From
Pavel Stehule
Date:
Hello

thank you very much for review

> Will always throw an error because at prepare time, the temp junk table wont
> exist.  This patch implements new syntax to disable the check:
>
> create function test5() returns integer as $$
> #prepare_plans on_demand
> begin
> ...
>
> Was it Tom Lane that said, "if we add new syntax, we have to support it
> forever"?  As a helpful feature I can see people (myself included) enabling
> this system wide.  So what happens to all the plpgsql on pgxn that this
> happens to break?  Well it needs updated, no problem, but the fix will be to
> add "#prepare_plans on_demand" in the magic spot.  That breaks it for prior
> versions of PG.  Is this the syntax we want?  What if we add more "compiler
> flags" in the future:
>
> create function test5() returns integer as $$
> #prepare_plans on_demand
> #disable_xlog
> #work_mem 10MB
> begin
>  create temp table junk(id integer);
>  insert into junk(id) values(100);
>  drop table temp;
>  return 1;
> end;
> $$ language plpgsql;
>

I am sure, so we will support a plan based statements inside PL very
long time. But this is not only one way, how to change a behave. You
can use a plpgsql.prepare_plans variable too. Theoretically We can
live without prepare_plans option - it doesn't modify a function's
behave - so it must not be part of function body. But I think, so it
is more readable - and it stronger warning for developers -
"attention, this function was not checked deeply". I have no problem
to remove this option, and use only a custom GUC

> I don't have an answer to that.  Other sql implement via OPTION(...).
>
> One option I'd thought about, was to extended ANALYZE to support functions.
>  It would require no additional plpgsql syntax changes, no postgresql.conf
> settings.  If you wanted to prepare (prepare for a testing purpose, not a
> performance purpose) all the sql inside your function, youd:
>
> analyze test5();

I am not against to some analogy to what you mean - just dislike a use
of ANALYZE keyword. Just static check has a one significant
disadvantage - we should not to identify types of NEW and OLD
variables in triggers.



>
> I'd expect to get errors from that, because the junk table doesn't exist.
>  I'd expect it, and just never analyze it.
>
>
>
> Summary
> =======
> Its a tough one.  I see benefit here.  I can see myself using it.  If I had
> to put my finger on it, I'm not 100% sold on the syntax.  It is usable
> though, it does solve problems, so I'd use it.  (I'm not 100% sure ANALYZE
> is better, either).
>
> I'm going to leave this patch as "needs review", I think more eyes might be
> helpful.
>
> -Andy
>

Thank you very much

Pavel Stehule


Re: [REVIEW] prepare plans of embedded sql on function start

From
Pavel Stehule
Date:
Hello

2011/9/11 Tom Lane <tgl@sss.pgh.pa.us>:
> Andy Colson <andy@squeakycode.net> writes:
>> [ Andy's dubious about adding plpgsql syntax to control this feature ]
>
> Yeah, that bothers me a lot too.
>

I like to discussion about syntax - a name "prepare_plans" and
following list is just one (for me - practical) shot. I am sure so
preparing all plans on function start is one functionality what we
want - because it can to do early warnings when some in environments
is not well. And if I remember well, there was one tool that does it
too, but a goal was different - they wanted a faster function
execution in production usage - without "slower" first call. The
overhead of check walker is minimal.

>> One option I'd thought about, was to extended ANALYZE to support functions.
>
> That's actually quite a good idea, not least because the extra checking
> happens only when you ask for it and not every time the function is
> loaded into a new session.
>
> I'm not that happy with overloading the ANALYZE keyword to mean this
> (especially not since there is already meaning attached to the syntax
> "ANALYZE x(y)").  But we could certainly use some other name --- I'm
> inclined to suggest CHECK:
>
>        CHECK FUNCTION function_name(arglist);
>

I proposed a stored procedure "check_function(name, arglist)", but
CHECK FUNCTION is ok for me too. Is easy implement it. Maybe there is
issue - "CHECK" will be a keyword :(

> People would want some sort of wild card capability; at the very least
> "check all plpgsql functions owned by me".  Not sure what that ought
> to look like syntactically.

I don't think. Now (when I looking around me) a owner of functions are
some abstract role. It is terrible to maintain a system where database
objects has a different roles. I can expect a request for check all
functions from some schema or all functions related to some PL.

>
> It might also be a good idea to make sure there's room in the syntax to
> specify different checking options.  We already would have reason to
> want "just do the existing style of validation check" versus this more
> intensive check.  And it's not hard to foresee other sorts of checking
> in future.
>

+1

There is possible check of RAISE statement params and maybe similar.

> Also, this would force us to invent PL-independent infrastructure for
> doing the checking.  I'm envisioning an additional argument to the
> existing PL validator function that tells it what checking options to
> use.
>

+1

yup. But there is query - need we a new special statement?

cannot we enhance a CREATE OR REPLACE FUNCTION statement? Cannot we
enhance of syntax or cannot we enhance of behave.

or has sense to have two statements CREATE FUNCTION and CHECK
FUNCTION? I see a sense. CHECK FUNCTION should be ??parametrized?? One
idea - CHECK FUNCTION can have a own independent hooks on PL hooks.

Resume:

* We want to deep check plans when function is started

* I like a CHECK FUNCTION statement - see a possibilities - but I am
not sure if PL developers will like it too. It means start two
statements - it just idea - what about CREATE OR REPLACE FUNCTION
blabla() ... IMMUTABLE STRICT >>CHECK<< - so CREATE STATEMENT can
optionally to call CHECK statement

>                        regards, tom lane
>

there is still task - what with trigger's functions

Regards

Pavel Stehule


Re: [REVIEW] prepare plans of embedded sql on function start

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I like to discussion about syntax - a name "prepare_plans" and
> following list is just one (for me - practical) shot. I am sure so
> preparing all plans on function start is one functionality what we
> want - because it can to do early warnings when some in environments
> is not well.

I don't think it *is* functionality that we want.  From a performance
standpoint it's certainly not a win: there is no savings from preparing
all the plans at the same time, and you can lose if there are code paths
that are never executed.  The only argument for doing it is more
extensive checking.

And on the checking front, I like Andy's idea better than the original.
There is not value in repeating a checking effort in every single
process when nothing is changing.  Furthermore, because of the problem
of the checks breaking functions that contain DDL, you have to have a
way of disabling it and you have to worry about setting up permission
mechanisms to restrict who can break whose functions, something that
will never work very smoothly IMO.  (Certainly, having to be superuser
to enable checking is not a point in favor of your design.)  Pushing the
checking out as a separately invokable operation neatly bypasses both of
those problems.  Right offhand I'm not sure we need any permission
restrictions at all on a CHECK operation, but at the worst, permission
to call the function ought to be enough.

> there is still task - what with trigger's functions

"CHECK TRIGGER name ON tablename" could be syntactic sugar for calling
the validator on the trigger's function and passing it sufficient
information to set up the trigger arguments properly.
        regards, tom lane


Re: [REVIEW] prepare plans of embedded sql on function start

From
Pavel Stehule
Date:
>>        CHECK FUNCTION function_name(arglist);
>>
>
> I proposed a stored procedure "check_function(name, arglist)", but
> CHECK FUNCTION is ok for me too. Is easy implement it. Maybe there is
> issue - "CHECK" will be a keyword :(
>

CHECK is reserved keyword now, so this is issue.

sorry for noise

Pavel


Re: [REVIEW] prepare plans of embedded sql on function start

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm not that happy with overloading the ANALYZE keyword to mean this
> (especially not since there is already meaning attached to the syntax
> "ANALYZE x(y)").  But we could certainly use some other name --- I'm
> inclined to suggest CHECK:
>
>     CHECK FUNCTION function_name(arglist);

This looks as good as it gets, but as we proposed some new behaviors for
ANALYZE in the past, I though I would bounce them here again for you to
decide about the overall picture.

The idea (that didn't get much traction at the time) was to support
ANALYZE on VIEWS so that we have statistics support for multi-columns or
any user given join.  The very difficult part about that is to be able
to match those stats we would have against a user SQL query.

But such a matching has been talked about in other contexts, it seems to
me, so the day we have that capability we might want to add ANALYZE
support to VIEWS.  ANALYZE could then support tables, indexes, views and
functions, and maybe some more database objects in the future.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: [REVIEW] prepare plans of embedded sql on function start

From
Pavel Stehule
Date:
Hello

I started work on proposed check statement option and there are a few questions?

what is sense of this statement for others PL? When we solve a mainly
PL/pgSQL issue, has sense to implement new statement? Isn't a some
problem in our CREATE FUNCTION design? A separation to two steps
should has a little bit strange behave - we cannot to check a function
before their registration (we can, but we should to do a some game
with names) - there is necessary some a conditional CREATE FUNCTION
statement - some like "CREATE CHECKED FUNCTION " or CHECK FUNCTION
with function body.

comments?

Regards

Pavel Stehule


2011/9/11 Tom Lane <tgl@sss.pgh.pa.us>:
> Andy Colson <andy@squeakycode.net> writes:
>> [ Andy's dubious about adding plpgsql syntax to control this feature ]
>
> Yeah, that bothers me a lot too.
>
>> One option I'd thought about, was to extended ANALYZE to support functions.
>
> That's actually quite a good idea, not least because the extra checking
> happens only when you ask for it and not every time the function is
> loaded into a new session.
>
> I'm not that happy with overloading the ANALYZE keyword to mean this
> (especially not since there is already meaning attached to the syntax
> "ANALYZE x(y)").  But we could certainly use some other name --- I'm
> inclined to suggest CHECK:
>
>        CHECK FUNCTION function_name(arglist);
>
> People would want some sort of wild card capability; at the very least
> "check all plpgsql functions owned by me".  Not sure what that ought
> to look like syntactically.
>
> It might also be a good idea to make sure there's room in the syntax to
> specify different checking options.  We already would have reason to
> want "just do the existing style of validation check" versus this more
> intensive check.  And it's not hard to foresee other sorts of checking
> in future.
>
> Also, this would force us to invent PL-independent infrastructure for
> doing the checking.  I'm envisioning an additional argument to the
> existing PL validator function that tells it what checking options to
> use.
>
>                        regards, tom lane
>


Re: [REVIEW] prepare plans of embedded sql on function start

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I started work on proposed check statement option and there are a few questions?

> what is sense of this statement for others PL?

IMO you should design this as a call to the PL's validator function.
It's not necessary to make other PLs do anything more than their
existing validators do (at least for now).
        regards, tom lane


Re: [REVIEW] prepare plans of embedded sql on function start

From
Pavel Stehule
Date:
Hello

this is initial version of CHECK FUNCTION | CHECK TRIGGER statement

usage is simple


postgres=# CHECK FUNCTION f();
CHECK FUNCTION
Time: 3,411 ms
postgres=# CHECK TRIGGER foo ON omega ;
NOTICE:  checking function "trg()"
CHECK TRIGGER
Time: 73,139 ms
postgres=# select plpgsql_checker('f()'::regprocedure, 0);
 plpgsql_checker
─────────────────

(1 row)

Time: 0,861 ms

second parameter of plpgsql_checker function is relation oid that is
used for trigger checking. A possibility batch checking is reason why
I used new PL function.

when function has a bug, then CHECK FUNCTION show it


postgres=# CHECK FUNCTION fx();
ERROR:  column "z" does not exist
LINE 1: SELECT exists(select * from omega where z = 1)
                                                ^
QUERY:  SELECT exists(select * from omega where z = 1)
CONTEXT:  PL/pgSQL function "fx" line 4 at IF
postgres=#

autocomplete in psql is supported

Regards

Pavel Stehule

Attachment