Thread: plpgsql.consistent_into

plpgsql.consistent_into

From
Marko Tiikkaja
Date:
Greetings fellow elephants,

I would humbly like to submit for your consideration my proposal for
alleviating pain caused by one of the most annoying footguns in
PL/PgSQL: the behaviour of SELECT .. INTO when the query returns more
than one row.  Some of you might know that no exception is raised in
this case (as opposed to INSERT/UPDATE/DELETE .. INTO, all of them
yielding TOO_MANY_ROWS), which can hide subtle bugs in queries if during
testing the query always returns only one row or the "correct" one
happens to be picked up every time.  Additionally, the row_count() after
execution is always going to be either 0 or 1, so even if you want to
explicitly guard against potentially broken queries, you can't do so!

So I added the following compile-time option:


set plpgsql.consistent_into to true;

create or replace function footest() returns void as $$
declare
x int;
begin
   -- too many rows
   select 1 from foo into x;
end$$ language plpgsql;

select footest();
ERROR:  query returned more than one row

It defaults to false to preserve full backwards compatibility.  Also
turning it on makes the executor try and find two rows, so it might have
an effect on performance as well.  The patch, as currently written, also
changes the behaviour of EXECUTE .. INTO, but I don't feel strongly
about whether that should be affected as well or not.


Regards,
Marko Tiikkaja

Attachment

Re: plpgsql.consistent_into

From
Pavel Stehule
Date:
Hello


2014/1/12 Marko Tiikkaja <marko@joh.to>
Greetings fellow elephants,

I would humbly like to submit for your consideration my proposal for alleviating pain caused by one of the most annoying footguns in PL/PgSQL: the behaviour of SELECT .. INTO when the query returns more than one row.  Some of you might know that no exception is raised in this case (as opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the query always returns only one row or the "correct" one happens to be picked up every time.  Additionally, the row_count() after execution is always going to be either 0 or 1, so even if you want to explicitly guard against potentially broken queries, you can't do so!

It is not bad and, sure, - it is very useful and important

but - it is a redundant to INTO STRICT clause. When you use it, then you change a INTO behaviour. Is not better to ensure STRICT option than hidden redefining INTO?

Option INTO (without STRICT clause) is not safe and we should to disallow. I see a three states (not only two)

a) disallow INTO without STRICT (as preferred for new code)
b) implicit check after every INTO without STRICT
c) without check

these modes should be: "strict_required", "strict_default", "strict_legacy"



So I added the following compile-time option:


set plpgsql.consistent_into to true;

This name is not best (there is not clean with it a into should be consistent)

Is question, if this functionality should be enabled by GUC to be used for legacy code (as protection against some kind of hidden bugs)

This topic is interesting idea for me - some checks can be pushed to plpgsql_check (as errors or warnings) too.

Generally I like proposed functionality, just I am not sure, so hidden redefining INTO clause (to INTO STRICT) is what we want. We can do it (but explicitly). I don't know any situation where INTO without STRICT is valid. Introduction of STRICT option was wrong idea - and now is not way to back.

Regards

Pavel

 

create or replace function footest() returns void as $$
declare
x int;
begin
  -- too many rows
  select 1 from foo into x;
end$$ language plpgsql;

select footest();
ERROR:  query returned more than one row

It defaults to false to preserve full backwards compatibility.  Also turning it on makes the executor try and find two rows, so it might have an effect on performance as well.  The patch, as currently written, also changes the behaviour of EXECUTE .. INTO, but I don't feel strongly about whether that should be affected as well or not.


Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/12/14, 7:47 AM, Pavel Stehule wrote:
> 2014/1/12 Marko Tiikkaja <marko@joh.to>
>
>> Greetings fellow elephants,
>>
>> I would humbly like to submit for your consideration my proposal for
>> alleviating pain caused by one of the most annoying footguns in PL/PgSQL:
>> the behaviour of SELECT .. INTO when the query returns more than one row.
>>   Some of you might know that no exception is raised in this case (as
>> opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding
>> TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the
>> query always returns only one row or the "correct" one happens to be picked
>> up every time.  Additionally, the row_count() after execution is always
>> going to be either 0 or 1, so even if you want to explicitly guard against
>> potentially broken queries, you can't do so!
>>
>
> It is not bad and, sure, - it is very useful and important
>
> but - it is a redundant to INTO STRICT clause. When you use it, then you
> change a INTO behaviour. Is not better to ensure STRICT option than hidden
> redefining INTO?

That only works if the query should never return 0 rows either.  If you 
want to allow for missing rows, STRICT is out of the question.

> Option INTO (without STRICT clause) is not safe and we should to disallow.
> I see a three states (not only two)
>
> a) disallow INTO without STRICT (as preferred for new code)
> b) implicit check after every INTO without STRICT
> c) without check
>
> these modes should be: "strict_required", "strict_default", "strict_legacy"

I can't get excited about this.  Mostly because it doesn't solve the 
problem I'm having.

It is important to be able to execute queries with INTO which might not 
return a row.  That's what FOUND is for.

>> So I added the following compile-time option:
>>
>>
>> set plpgsql.consistent_into to true;
>>
>
> This name is not best (there is not clean with it a into should be
> consistent)

I agree, but I had to pick something.  One of the three hard problems in 
CS..

> Is question, if this functionality should be enabled by GUC to be used for
> legacy code (as protection against some kind of hidden bugs)
>
> This topic is interesting idea for me - some checks can be pushed to
> plpgsql_check (as errors or warnings) too.
>
> Generally I like proposed functionality, just I am not sure, so hidden
> redefining INTO clause (to INTO STRICT) is what we want. We can do it (but
> explicitly). I don't know any situation where INTO without STRICT is valid.
> Introduction of STRICT option was wrong idea - and now is not way to back.

Note that this is different from implicitly STRICTifying every INTO, 
like I said above.


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/12 Marko Tiikkaja <marko@joh.to>
On 1/12/14, 7:47 AM, Pavel Stehule wrote:
2014/1/12 Marko Tiikkaja <marko@joh.to>

Greetings fellow elephants,

I would humbly like to submit for your consideration my proposal for
alleviating pain caused by one of the most annoying footguns in PL/PgSQL:
the behaviour of SELECT .. INTO when the query returns more than one row.
  Some of you might know that no exception is raised in this case (as
opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding
TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the
query always returns only one row or the "correct" one happens to be picked
up every time.  Additionally, the row_count() after execution is always
going to be either 0 or 1, so even if you want to explicitly guard against
potentially broken queries, you can't do so!


It is not bad and, sure, - it is very useful and important

but - it is a redundant to INTO STRICT clause. When you use it, then you
change a INTO behaviour. Is not better to ensure STRICT option than hidden
redefining INTO?

That only works if the query should never return 0 rows either.  If you want to allow for missing rows, STRICT is out of the question.

hmm - you have true.

try to find better name.

Other questions is using a GUC for legacy code. I am for this checked mode be default (or can be simply activated for new code)

Regards

Pavel
 


Option INTO (without STRICT clause) is not safe and we should to disallow.
I see a three states (not only two)

a) disallow INTO without STRICT (as preferred for new code)
b) implicit check after every INTO without STRICT
c) without check

these modes should be: "strict_required", "strict_default", "strict_legacy"

I can't get excited about this.  Mostly because it doesn't solve the problem I'm having.

It is important to be able to execute queries with INTO which might not return a row.  That's what FOUND is for.


So I added the following compile-time option:


set plpgsql.consistent_into to true;


This name is not best (there is not clean with it a into should be
consistent)

I agree, but I had to pick something.  One of the three hard problems in CS..


Is question, if this functionality should be enabled by GUC to be used for
legacy code (as protection against some kind of hidden bugs)

This topic is interesting idea for me - some checks can be pushed to
plpgsql_check (as errors or warnings) too.

Generally I like proposed functionality, just I am not sure, so hidden
redefining INTO clause (to INTO STRICT) is what we want. We can do it (but
explicitly). I don't know any situation where INTO without STRICT is valid.
Introduction of STRICT option was wrong idea - and now is not way to back.

Note that this is different from implicitly STRICTifying every INTO, like I said above.


Regards,
Marko Tiikkaja

Re: plpgsql.consistent_into

From
Florian Pflug
Date:
On Jan12, 2014, at 06:51 , Marko Tiikkaja <marko@joh.to> wrote:
> I would humbly like to submit for your consideration my proposal for alleviating pain caused by one of the most
annoyingfootguns in PL/PgSQL: the behaviour of SELECT .. INTO when the query returns more than one row.  Some of you
mightknow that no exception is raised in this case (as opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding
TOO_MANY_ROWS),which can hide subtle bugs in queries if during testing the query always returns only one row or the
"correct"one happens to be picked up every time.  Additionally, the row_count() after execution is always going to be
either0 or 1, so even if you want to explicitly guard against potentially broken queries, you can't do so! 
>
> So I added the following compile-time option:
>
> set plpgsql.consistent_into to true;

I don't think a GUC is the best way to handle this. Handling
this via a per-function setting similar to #variable_conflict would
IMHO be better.So a function containing

#into_surplus_rows error

would complain whereas

#into_surplus_rows ignore_for_select

would leave the behaviour unchanged.

best regards,
Florian Pflug




Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/12/14, 10:19 PM, Florian Pflug wrote:
> On Jan12, 2014, at 06:51 , Marko Tiikkaja <marko@joh.to> wrote:
>>
>> set plpgsql.consistent_into to true;
>
> I don't think a GUC is the best way to handle this. Handling
> this via a per-function setting similar to #variable_conflict would
> IMHO be better.

This is exactly what the patch does.  A global default in the form of a 
GUC, and then a per-function option for overriding that default.


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/12 Florian Pflug <fgp@phlo.org>
On Jan12, 2014, at 06:51 , Marko Tiikkaja <marko@joh.to> wrote:
> I would humbly like to submit for your consideration my proposal for alleviating pain caused by one of the most annoying footguns in PL/PgSQL: the behaviour of SELECT .. INTO when the query returns more than one row.  Some of you might know that no exception is raised in this case (as opposed to INSERT/UPDATE/DELETE .. INTO, all of them yielding TOO_MANY_ROWS), which can hide subtle bugs in queries if during testing the query always returns only one row or the "correct" one happens to be picked up every time.  Additionally, the row_count() after execution is always going to be either 0 or 1, so even if you want to explicitly guard against potentially broken queries, you can't do so!
>
> So I added the following compile-time option:
>
> set plpgsql.consistent_into to true;

I don't think a GUC is the best way to handle this. Handling
this via a per-function setting similar to #variable_conflict would
IMHO be better.So a function containing

#into_surplus_rows error

would complain whereas

#into_surplus_rows ignore_for_select

would leave the behaviour unchanged.

There is  GUC for variable_conflict already too. In this case I would to enable this functionality everywhere (it is tool how to simply eliminate some kind of strange bugs) so it needs a GUC.

We have GUC for plpgsql.variable_conflict three years and I don't know about any problem.

Regards

Pavel
 

best regards,
Florian Pflug



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: plpgsql.consistent_into

From
Florian Pflug
Date:
On Jan12, 2014, at 22:37 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
> There is  GUC for variable_conflict already too. In this case I would to
> enable this functionality everywhere (it is tool how to simply eliminate
> some kind of strange bugs) so it needs a GUC. 
> 
> We have GUC for plpgsql.variable_conflict three years and I don't know
> about any problem.

I must say I hate behaviour-changing GUCs with quite some passion. IMHO
they tend to cause bugs, not avoid them, in the long run. The pattern
usually is
 1) Code gets written, depends on some particular set of settings    to work correctly
 2) Code gets reused, with little further testing since it's supposed    to be battle-proven anyway. Settings get
dropped.
 3) Code blows up for those corner-cases where the setting actually    matter. Debugging is hell, because you
effectivelyhave to go    over the code line-by-line and check if it might be affected by    some GUC or another.
 

Only a few days ago I spent more than an hour tracking down a bug
which, as it turned out, was caused by a regex which subtly changed its
meaning depending on whether standard_conforming_strings is on or off.

Some GUCs are unavoidable - standard_conforming_strings, for example
probably still was a good idea, since the alternative would have been
to stick with the historical, non-standard behaviour forever. 

But in this case, my feeling is that the trouble such a GUC may cause
out-weights the potential benefits. I'm all for having a directive like
#consistent_into (though I feel that the name could convey the
meaning better). If we *really* think that this ought to be the default
from 9.4 onward, then we should 
 *) Change it to always complain, except if the function explictly    specifies "#consistent_into on" or whatever.
 *) Have pg_dump add that to all plpgsql functions if the server    version is < 9.4 or whatever major release this
endsup in
 

That's all just my opinion of course.

best regards,
Florian Pflug




Re: plpgsql.consistent_into

From
Gavin Flower
Date:
On 13/01/14 11:44, Florian Pflug wrote:
> On Jan12, 2014, at 22:37 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> There is  GUC for variable_conflict already too. In this case I would to
>> enable this functionality everywhere (it is tool how to simply eliminate
>> some kind of strange bugs) so it needs a GUC.
>>
>> We have GUC for plpgsql.variable_conflict three years and I don't know
>> about any problem.
> I must say I hate behaviour-changing GUCs with quite some passion. IMHO
> they tend to cause bugs, not avoid them, in the long run. The pattern
> usually is
>
>    1) Code gets written, depends on some particular set of settings
>       to work correctly
>
>    2) Code gets reused, with little further testing since it's supposed
>       to be battle-proven anyway. Settings get dropped.
>
>    3) Code blows up for those corner-cases where the setting actually
>       matter. Debugging is hell, because you effectively have to go
>       over the code line-by-line and check if it might be affected by
>       some GUC or another.
>
> Only a few days ago I spent more than an hour tracking down a bug
> which, as it turned out, was caused by a regex which subtly changed its
> meaning depending on whether standard_conforming_strings is on or off.
>
> Some GUCs are unavoidable - standard_conforming_strings, for example
> probably still was a good idea, since the alternative would have been
> to stick with the historical, non-standard behaviour forever.
>
> But in this case, my feeling is that the trouble such a GUC may cause
> out-weights the potential benefits. I'm all for having a directive like
> #consistent_into (though I feel that the name could convey the
> meaning better). If we *really* think that this ought to be the default
> from 9.4 onward, then we should
>
>    *) Change it to always complain, except if the function explictly
>       specifies "#consistent_into on" or whatever.
>
>    *) Have pg_dump add that to all plpgsql functions if the server
>       version is < 9.4 or whatever major release this ends up in
>
> That's all just my opinion of course.
>
> best regards,
> Florian Pflug
>
>
>
Possibly there should be a warning put out, whenever someone uses some 
behaviour that requires a GUC set to a non-default value?


Cheers,
Gavin



Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/12 Florian Pflug <fgp@phlo.org>
On Jan12, 2014, at 22:37 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
> There is  GUC for variable_conflict already too. In this case I would to
> enable this functionality everywhere (it is tool how to simply eliminate
> some kind of strange bugs) so it needs a GUC.
>
> We have GUC for plpgsql.variable_conflict three years and I don't know
> about any problem.

I must say I hate behaviour-changing GUCs with quite some passion. IMHO
they tend to cause bugs, not avoid them, in the long run. The pattern
usually is

  1) Code gets written, depends on some particular set of settings
     to work correctly

  2) Code gets reused, with little further testing since it's supposed
     to be battle-proven anyway. Settings get dropped.

  3) Code blows up for those corner-cases where the setting actually
     matter. Debugging is hell, because you effectively have to go
     over the code line-by-line and check if it might be affected by
     some GUC or another.

Only a few days ago I spent more than an hour tracking down a bug
which, as it turned out, was caused by a regex which subtly changed its
meaning depending on whether standard_conforming_strings is on or off.

Some GUCs are unavoidable - standard_conforming_strings, for example
probably still was a good idea, since the alternative would have been
to stick with the historical, non-standard behaviour forever.

But in this case, my feeling is that the trouble such a GUC may cause
out-weights the potential benefits. I'm all for having a directive like
#consistent_into (though I feel that the name could convey the
meaning better). If we *really* think that this ought to be the default
from 9.4 onward, then we should

  *) Change it to always complain, except if the function explictly
     specifies "#consistent_into on" or whatever.

  *) Have pg_dump add that to all plpgsql functions if the server
     version is < 9.4 or whatever major release this ends up in

I disagree - automatic code injection can is strange. Is not problem inject code from simple DO statement, but I dislike append lines to source code without any specific user request.

Maybe this problem with GUC can be solved in 9.4. Some specific GUC can be ported with database.

Pavel
 

That's all just my opinion of course.

best regards,
Florian Pflug


Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/13 Gavin Flower <GavinFlower@archidevsys.co.nz>
On 13/01/14 11:44, Florian Pflug wrote:
On Jan12, 2014, at 22:37 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
There is  GUC for variable_conflict already too. In this case I would to
enable this functionality everywhere (it is tool how to simply eliminate
some kind of strange bugs) so it needs a GUC.

We have GUC for plpgsql.variable_conflict three years and I don't know
about any problem.
I must say I hate behaviour-changing GUCs with quite some passion. IMHO
they tend to cause bugs, not avoid them, in the long run. The pattern
usually is

   1) Code gets written, depends on some particular set of settings
      to work correctly

   2) Code gets reused, with little further testing since it's supposed
      to be battle-proven anyway. Settings get dropped.

   3) Code blows up for those corner-cases where the setting actually
      matter. Debugging is hell, because you effectively have to go
      over the code line-by-line and check if it might be affected by
      some GUC or another.

Only a few days ago I spent more than an hour tracking down a bug
which, as it turned out, was caused by a regex which subtly changed its
meaning depending on whether standard_conforming_strings is on or off.

Some GUCs are unavoidable - standard_conforming_strings, for example
probably still was a good idea, since the alternative would have been
to stick with the historical, non-standard behaviour forever.

But in this case, my feeling is that the trouble such a GUC may cause
out-weights the potential benefits. I'm all for having a directive like
#consistent_into (though I feel that the name could convey the
meaning better). If we *really* think that this ought to be the default
from 9.4 onward, then we should

   *) Change it to always complain, except if the function explictly
      specifies "#consistent_into on" or whatever.

   *) Have pg_dump add that to all plpgsql functions if the server
      version is < 9.4 or whatever major release this ends up in

That's all just my opinion of course.

best regards,
Florian Pflug



Possibly there should be a warning put out, whenever someone uses some behaviour that requires a GUC set to a non-default value?

It is a good idea. I though about it. A worry about GUC are legitimate, but we are most static and sometimes we try to design bigger creatures, than we try to solve.

I am thinking so dump can contain a serialized GUC values, and can raises warnings when GUC are different (not only different from default).

Similar problems are with different FROM_COLAPS_LIMIT, JOIN_COLAPS_LIMIT, WORK_MEM, ...

Regards

Pavel


 


Cheers,
Gavin

Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/12 Florian Pflug <fgp@phlo.org>
On Jan12, 2014, at 22:37 , Pavel Stehule <pavel.stehule@gmail.com> wrote:
> There is  GUC for variable_conflict already too. In this case I would to
> enable this functionality everywhere (it is tool how to simply eliminate
> some kind of strange bugs) so it needs a GUC.
>
> We have GUC for plpgsql.variable_conflict three years and I don't know
> about any problem.

I must say I hate behaviour-changing GUCs with quite some passion. IMHO
they tend to cause bugs, not avoid them, in the long run. The pattern
usually is

  1) Code gets written, depends on some particular set of settings
     to work correctly

  2) Code gets reused, with little further testing since it's supposed
     to be battle-proven anyway. Settings get dropped.

  3) Code blows up for those corner-cases where the setting actually
     matter. Debugging is hell, because you effectively have to go
     over the code line-by-line and check if it might be affected by
     some GUC or another.

Only a few days ago I spent more than an hour tracking down a bug
which, as it turned out, was caused by a regex which subtly changed its
meaning depending on whether standard_conforming_strings is on or off.

Some GUCs are unavoidable - standard_conforming_strings, for example
probably still was a good idea, since the alternative would have been
to stick with the historical, non-standard behaviour forever.

But in this case, my feeling is that the trouble such a GUC may cause
out-weights the potential benefits. I'm all for having a directive like
#consistent_into (though I feel that the name could convey the
meaning better). If we *really* think that this ought to be the default
from 9.4 onward, then we should

  *) Change it to always complain, except if the function explictly
     specifies "#consistent_into on" or whatever.

  *) Have pg_dump add that to all plpgsql functions if the server
     version is < 9.4 or whatever major release this ends up in

That's all just my opinion of course.

I am thinking so GUC and plpgsql option can live together. If you like to accent a some behave, then you can use a plpgsql option. On second hand, I would to use a some functionality, that is safe, but I don't would to dirty source code by using repeated options. But I have to check (and calculate with risk) a GUC settings.

One idea: required GUC? Can be nice a possibility to ensure some GUC setting, and restore ensure these values or raises warning.

Back to main topic. Required and described feature doesn't change a behave of INTO clause. I can enable or disable this functionality and well written code should to work without change (and problems). When check is disabled, then execution is just less safe. So in this case, a impact of GUC is significantly less than by you described issues. Does know anybody a use case where this check should be disabled?

Probably we have a different experience about GUC. I had a problem with  standard_conforming_strings and bytea format some years ago. Now I prepare document about required setting. But I can see (from my experience from Czech area) more often  problems related to effective_cache_size or from_collapse_limit and similar GUC. These parameters are behind knowledge (and visibility) typical user.

Best regards

Pavel
 

best regards,
Florian Pflug


Re: plpgsql.consistent_into

From
Jim Nasby
Date:
On 1/13/14, 1:44 AM, Pavel Stehule wrote:
>
>
>
> 2014/1/12 Florian Pflug <fgp@phlo.org <mailto:fgp@phlo.org>>
>
>     On Jan12, 2014, at 22:37 , Pavel Stehule <pavel.stehule@gmail.com <mailto:pavel.stehule@gmail.com>> wrote:
>      > There is  GUC for variable_conflict already too. In this case I would to
>      > enable this functionality everywhere (it is tool how to simply eliminate
>      > some kind of strange bugs) so it needs a GUC.
>      >
>      > We have GUC for plpgsql.variable_conflict three years and I don't know
>      > about any problem.
>
>     I must say I hate behaviour-changing GUCs with quite some passion. IMHO
>     they tend to cause bugs, not avoid them, in the long run. The pattern
>     usually is
>
>        1) Code gets written, depends on some particular set of settings
>           to work correctly
>
>        2) Code gets reused, with little further testing since it's supposed
>           to be battle-proven anyway. Settings get dropped.
>
>        3) Code blows up for those corner-cases where the setting actually
>           matter. Debugging is hell, because you effectively have to go
>           over the code line-by-line and check if it might be affected by
>           some GUC or another.
>
>     Only a few days ago I spent more than an hour tracking down a bug
>     which, as it turned out, was caused by a regex which subtly changed its
>     meaning depending on whether standard_conforming_strings is on or off.
>
>     Some GUCs are unavoidable - standard_conforming_strings, for example
>     probably still was a good idea, since the alternative would have been
>     to stick with the historical, non-standard behaviour forever.
>
>     But in this case, my feeling is that the trouble such a GUC may cause
>     out-weights the potential benefits. I'm all for having a directive like
>     #consistent_into (though I feel that the name could convey the
>     meaning better). If we *really* think that this ought to be the default
>     from 9.4 onward, then we should
>
>        *) Change it to always complain, except if the function explictly
>           specifies "#consistent_into on" or whatever.
>
>        *) Have pg_dump add that to all plpgsql functions if the server
>           version is < 9.4 or whatever major release this ends up in
>
>     That's all just my opinion of course.
>
>
> I am thinking so GUC and plpgsql option can live together. If you like to accent a some behave, then you can use a
plpgsqloption. On second hand, I would to use a some functionality, that is safe, but I don't would to dirty source
codeby using repeated options. But I have to check (and calculate with risk) a GUC settings.
 
>
> One idea: required GUC? Can be nice a possibility to ensure some GUC setting, and restore ensure these values or
raiseswarning.
 
>
> Back to main topic. Required and described feature doesn't change a behave of INTO clause. I can enable or disable
thisfunctionality and well written code should to work without change (and problems). When check is disabled, then
executionis just less safe. So in this case, a impact of GUC is significantly less than by you described issues. Does
knowanybody a use case where this check should be disabled?
 
>
> Probably we have a different experience about GUC. I had a problem with  standard_conforming_strings and bytea format
someyears ago. Now I prepare document about required setting. But I can see (from my experience from Czech area) more
often problems related to effective_cache_size or from_collapse_limit and similar GUC. These parameters are behind
knowledge(and visibility) typical user.
 

ISTM that in this case, it should be safe to make the new default behavior STRICT; if you forget to set the GUC to
disablethan you'll get an error that points directly at the problem, at which point you'll go "Oh, yeah... I forgot to
setX..."
 

Outside of the GUC, I believe the default should definitely be STRICT. If your app is relying on non-strict then you
needto be made aware of that. We should be able to provide a DO block that will change this setting for every function
you'vegot if someone isn't happy with STRICT mode.
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql.consistent_into

From
Florian Pflug
Date:
On Jan13, 2014, at 22:49 , Jim Nasby <jim@nasby.net> wrote:
> ISTM that in this case, it should be safe to make the new default behavior STRICT;
> if you forget to set the GUC to disable than you'll get an error that points directly
> at the problem, at which point you'll go "Oh, yeah... I forgot to set X..."

What do you mean by STRICT? STRICT (which we already support) complains if the
query doesn't return *exactly* one row. What Marko wants is to raise an error
for a plain SELECT ... INTO if more than one row is returned, but to still
convert zero rows to NULL.

> Outside of the GUC, I believe the default should definitely be STRICT. If your app is
> relying on non-strict then you need to be made aware of that. We should be able to
> provide a DO block that will change this setting for every function you've got if
> someone isn't happy with STRICT mode.

If you mean that we should change SELECT ... INTO to always behave as if STRICT
had been specified - why on earth would we want to do that? That would break
perfectly fine code for no good reason whatsoever.

In fact, after reading the documentation on SELECT ... INTO, I'm convinced the
the whole consistent_into thing is a bad idea. The documentation states clearly
that
 For INSERT/UPDATE/DELETE with RETURNING, PL/pgSQL reports an error for more than one returned row, even when STRICT is
notspecified. This is because there is no option such as ORDER BY with which to determine which affected row should be
returned.

It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
for a reason. We shouldn't be second-guessing ourselves by changing that later -
not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.

(And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
IMHO just too late for that)

best regards,
Florian Pflug




Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/14/14, 12:41 AM, Florian Pflug wrote:
> In fact, after reading the documentation on SELECT ... INTO, I'm convinced the
> the whole consistent_into thing is a bad idea. The documentation states clearly
> that
>
>    For INSERT/UPDATE/DELETE with RETURNING, PL/pgSQL reports an error for more than
>    one returned row, even when STRICT is not specified. This is because there is no
>    option such as ORDER BY with which to determine which affected row should be
>    returned.
>
> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
> for a reason.

Yeah, it does state that.  But it's a BS reason.  In addition to ORDER 
BY, SELECT also has a LIMIT which you can use to get the "first row" 
behaviour.  There's no way to go to the more sane behaviour from what we 
have right now.

When I've worked with PL/PgSQL, this has been a source of a few bugs 
that would have been noticed during testing if the behaviour of INTO 
wasn't as dangerous as it is right now.  Yes, it breaks backwards 
compatibility, but that's why there's a nice GUC.  If we're not going to 
scrap PL/PgSQL and start over again, we are going to have to do changes 
like this to make the language better.  Also I think that out of all the 
things we could do to break backwards compatibility, this is closer to 
"harmless" than "a pain in the butt".


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Josh Berkus
Date:
On 01/13/2014 03:41 PM, Florian Pflug wrote:
> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
> for a reason. We shouldn't be second-guessing ourselves by changing that later -
> not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.
> 
> (And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
> IMHO just too late for that)

I *really* don't want to go through all my old code to find places where
I used SELECT ... INTO just to pop off the first row, and ignored the
rest.  I doubt anyone else does, either.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: plpgsql.consistent_into

From
Florian Pflug
Date:
On Jan14, 2014, at 00:52 , Marko Tiikkaja <marko@joh.to> wrote:
> When I've worked with PL/PgSQL, this has been a source of a few bugs that
> would have been noticed during testing if the behaviour of INTO wasn't as
> dangerous as it is right now.

The question is, how many bugs stemmed from wrong SQL queries, and what
percentage of those would have been caught by this? The way I see it, there
are thousands of ways to screw up a query, and having it return multiple 
rows instead of one is just one of them.

> Yes, it breaks backwards compatibility, but that's why there's a nice GUC.

Which doesn't help, because the GUC isn't tied to the code. This *adds*
an error case, not remove one - now, instead of getting your code correct,
you *also* have to get the GUC correct. If you even *know* that such a GUC
exists.

> If we're not going to scrap PL/PgSQL and
> start over again, we are going to have to do changes like this to make the
> language better.  Also I think that out of all the things we could do to
> break backwards compatibility, this is closer to "harmless" than "a pain
> in the butt".

I very strongly believe that languages don't get better by adding a thousand
little knobs which subtly change semantics. Look at the mess that is PHP -
we absolutely, certainly don't want to go there. The most important rule in
language design is in my opinion "stick with your choices". C, C++ and JAVA
all seem to follow this, and it's one of the reasons these languages are
popular for big projects, I think.

The way I see it, the only OK way to change existing behaviour is to have
the concept of a "language version", and force code to indicate the language
version it expects. The important thing is that the language version is an
attribute of code, not some global setting that you can change without ever
looking at the code it'd affect.

So if we really want to change this, I think we need to have a
LANGUAGE_VERSION attribute on functions. Each time a major postgres release
changes the behaviour of one of the procedural languages, we'd increment
that language's version, and enable the old behaviour for all functions
tagged with an older one.

best regards,
Florian Pflug









Re: plpgsql.consistent_into

From
Jim Nasby
Date:
On 1/13/14, 5:57 PM, Josh Berkus wrote:
> On 01/13/2014 03:41 PM, Florian Pflug wrote:
>> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
>> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
>> for a reason. We shouldn't be second-guessing ourselves by changing that later -
>> not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.
>>
>> (And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
>> IMHO just too late for that)
>
> I *really* don't want to go through all my old code to find places where
> I used SELECT ... INTO just to pop off the first row, and ignored the
> rest.  I doubt anyone else does, either.

Do you regularly have use cases where you actually want just one RANDOM row? I suspect the far more likely scenario is
thatpeople write code assuming they'll get only one row and they'll end up with extremely hard to trace bugs if that
assumptionis ever wrong.
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql.consistent_into

From
Jim Nasby
Date:
On 1/13/14, 6:16 PM, Florian Pflug wrote:
> On Jan14, 2014, at 00:52 , Marko Tiikkaja <marko@joh.to> wrote:
>> When I've worked with PL/PgSQL, this has been a source of a few bugs that
>> would have been noticed during testing if the behaviour of INTO wasn't as
>> dangerous as it is right now.
>
> The question is, how many bugs stemmed from wrong SQL queries, and what
> percentage of those would have been caught by this? The way I see it, there
> are thousands of ways to screw up a query, and having it return multiple
> rows instead of one is just one of them.

A query that's simply wrong is more likely to fail consistently. Non-strict use of INTO is going to fail in very subtle
ways(unless you actually DO want just the first row, in which case you should explicitly use LIMIT 1).
 

>> If we're not going to scrap PL/PgSQL and
>> start over again, we are going to have to do changes like this to make the
>> language better.  Also I think that out of all the things we could do to
>> break backwards compatibility, this is closer to "harmless" than "a pain
>> in the butt".
>
> I very strongly believe that languages don't get better by adding a thousand
> little knobs which subtly change semantics. Look at the mess that is PHP -
> we absolutely, certainly don't want to go there. The most important rule in
> language design is in my opinion "stick with your choices". C, C++ and JAVA
> all seem to follow this, and it's one of the reasons these languages are
> popular for big projects, I think.
>
> The way I see it, the only OK way to change existing behaviour is to have
> the concept of a "language version", and force code to indicate the language
> version it expects. The important thing is that the language version is an
> attribute of code, not some global setting that you can change without ever
> looking at the code it'd affect.
>
> So if we really want to change this, I think we need to have a
> LANGUAGE_VERSION attribute on functions. Each time a major postgres release
> changes the behaviour of one of the procedural languages, we'd increment
> that language's version, and enable the old behaviour for all functions
> tagged with an older one.

I like that idea. It allows us to fix past decisions that were ill considered without hosing all existing code.

BTW, have we always had support for STRICT, or was it added at some point? It's in 8.4, but I don't know how far back
itgoes.
 

And if we've always had it, why on earth didn't we make STRICT the default behavior?
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql.consistent_into

From
Florian Pflug
Date:
(Responding to both of your mails here)

On Jan14, 2014, at 01:20 , Jim Nasby <jim@nasby.net> wrote:
> On 1/13/14, 5:57 PM, Josh Berkus wrote:
>> On 01/13/2014 03:41 PM, Florian Pflug wrote:
>>> It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
>>> but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
>>> for a reason. We shouldn't be second-guessing ourselves by changing that later -
>>> not, at least, unless we have a *very* good reason for it. Which, AFAICS, we don't.
>>>
>>> (And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
>>> IMHO just too late for that)
>>
>> I *really* don't want to go through all my old code to find places where
>> I used SELECT ... INTO just to pop off the first row, and ignored the
>> rest.  I doubt anyone else does, either.
>
> Do you regularly have use cases where you actually want just one RANDOM row?
> I suspect the far more likely scenario is that people write code assuming they'll
> get only one row and they'll end up with extremely hard to trace bugs if that
> assumption is ever wrong.

One case that immediatly comes to mind is a JOIN which sometimes returns
multiple rows, and a projection clause that only uses one of the tables
involved in the join.

Another are queries including an ORDER BY - I don't think the patch makes an
exception for those, and even if it did, it probably wouldn't catch all
cases, like e.g. an SRF which returns the rows in a deterministic order.

Or maybe you're picking a row to process next, and don't really care about
the order in which you work through them.

>> The question is, how many bugs stemmed from wrong SQL queries, and what
>> percentage of those would have been caught by this? The way I see it, there
>> are thousands of ways to screw up a query, and having it return multiple
>> rows instead of one is just one of them.
>
> A query that's simply wrong is more likely to fail consistently. Non-strict
> use of INTO is going to fail in very subtle ways (unless you actually DO want
> just the first row, in which case you should explicitly use LIMIT 1).

How so? Say you expect "SELECT * FROM view WHERE c=<n>" to only ever return
one row. Then "SELECT sum(f) FROM table JOIN view ON table.c = view.c" is
just as subtly wrong as the first query is.

> And if we've always had it, why on earth didn't we make STRICT the default
> behavior?

Dunno, but AFAIK pl/pgsql mimics Oracle's PL/SQL, at least in some aspects,
so maybe this is one of the areas where we just do what oracle does.

best regards,
Florian Pflug




Re: plpgsql.consistent_into

From
Tom Lane
Date:
Florian Pflug <fgp@phlo.org> writes:
> On Jan14, 2014, at 01:20 , Jim Nasby <jim@nasby.net> wrote:
>> And if we've always had it, why on earth didn't we make STRICT the default
>> behavior?

> Dunno, but AFAIK pl/pgsql mimics Oracle's PL/SQL, at least in some aspects,
> so maybe this is one of the areas where we just do what oracle does.

STRICT is a relatively recent addition (8.2, looks like).  One of the
reasons it got in was that the author didn't have any grandiose ideas
about making it the new default.  I think this patch would have a lot
better chance if it was just adding another keyword as an alternative
to STRICT, and not hoping to impose the author's opinions on the rest
of the world.  Whatever your opinion of the default behavior, the
fact that it's been that way for upwards of fifteen years without any
mass protests should give you pause about changing it.
        regards, tom lane



Re: plpgsql.consistent_into

From
Josh Berkus
Date:
On 01/13/2014 04:20 PM, Jim Nasby wrote:
> On 1/13/14, 5:57 PM, Josh Berkus wrote:
>> I *really* don't want to go through all my old code to find places where
>> I used SELECT ... INTO just to pop off the first row, and ignored the
>> rest.  I doubt anyone else does, either.
> 
> Do you regularly have use cases where you actually want just one RANDOM
> row? I suspect the far more likely scenario is that people write code
> assuming they'll get only one row and they'll end up with extremely hard
> to trace bugs if that assumption is ever wrong.

Regularly?  No.  But I've seen it, especially as part of a "does this
query return any rows?" test.  That's not the best way to test that, but
that doesn't stop a lot of people doing it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: plpgsql.consistent_into

From
Jim Nasby
Date:
On 1/13/14, 6:36 PM, Florian Pflug wrote:
> On Jan14, 2014, at 01:20 , Jim Nasby<jim@nasby.net>  wrote:
>> >On 1/13/14, 5:57 PM, Josh Berkus wrote:
>>> >>On 01/13/2014 03:41 PM, Florian Pflug wrote:
>>>> >>>It therefor isn't an oversight that SELECT ... INTO allows multiple result rows
>>>> >>>but INSERT/UPDATE/DELETE forbids them, it's been done that way on purpose and
>>>> >>>for a reason. We shouldn't be second-guessing ourselves by changing that later -
>>>> >>>not, at least, unless we have a*very*  good reason for it. Which, AFAICS, we don't.
>>>> >>>
>>>> >>>(And yeah, personally I'd prefer if we'd complain about multiple rows. But it's
>>>> >>>IMHO just too late for that)
>>> >>
>>> >>I*really*  don't want to go through all my old code to find places where
>>> >>I used SELECT ... INTO just to pop off the first row, and ignored the
>>> >>rest.  I doubt anyone else does, either.
>> >
>> >Do you regularly have use cases where you actually want just one RANDOM row?
>> >I suspect the far more likely scenario is that people write code assuming they'll
>> >get only one row and they'll end up with extremely hard to trace bugs if that
>> >assumption is ever wrong.
> One case that immediatly comes to mind is a JOIN which sometimes returns
> multiple rows, and a projection clause that only uses one of the tables
> involved in the join.

Which is just bad coding IMHO.

> Another are queries including an ORDER BY - I don't think the patch makes an
> exception for those, and even if it did, it probably wouldn't catch all
> cases, like e.g. an SRF which returns the rows in a deterministic order.

Sure, but if you've gone to the trouble of putting the ORDER BY in, how hard is it to add LIMIT 1?

> Or maybe you're picking a row to process next, and don't really care about
> the order in which you work through them.

Again, how hard is LIMIT 1?

BTW, my issue here isn't with typing out " STRICT". I'd be fine if the way things worked was you had to specify INTO
STRICTor INTO LOOSE (or whatever you want to call the opposite of strict).
 

My problem is that the default here is plain and simple a bad choice for default behavior. In light of Tom's research
showingthis was added in 8.2 I understand why we went that route, but I'd really like a way to change the default. Or
evendisallow it (IE: force the user to state which route they're going here).
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/14/14, 1:57 AM, Tom Lane wrote:
> Whatever your opinion of the default behavior, the
> fact that it's been that way for upwards of fifteen years without any
> mass protests should give you pause about changing it.

For what it's worth, my patch does not change the default behaviour.  I 
don't think I've ever publicly said that it should.


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Jim Nasby
Date:
On 1/13/14, 7:06 PM, Josh Berkus wrote:
> On 01/13/2014 04:20 PM, Jim Nasby wrote:
>> On 1/13/14, 5:57 PM, Josh Berkus wrote:
>>> I *really* don't want to go through all my old code to find places where
>>> I used SELECT ... INTO just to pop off the first row, and ignored the
>>> rest.  I doubt anyone else does, either.
>>
>> Do you regularly have use cases where you actually want just one RANDOM
>> row? I suspect the far more likely scenario is that people write code
>> assuming they'll get only one row and they'll end up with extremely hard
>> to trace bugs if that assumption is ever wrong.
>
> Regularly?  No.  But I've seen it, especially as part of a "does this
> query return any rows?" test.  That's not the best way to test that, but
> that doesn't stop a lot of people doing it.

Right, and I certainly don't want to force anyone to rewrite all their code. But I'd certainly like a safer default so
peopledon't mistakenly go the "multiple rows is OK" route without doing so very intentionally.
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql.consistent_into

From
Josh Berkus
Date:
On 01/13/2014 05:10 PM, Jim Nasby wrote:
> On 1/13/14, 7:06 PM, Josh Berkus wrote:
>> Regularly?  No.  But I've seen it, especially as part of a "does this
>> query return any rows?" test.  That's not the best way to test that, but
>> that doesn't stop a lot of people doing it.
> 
> Right, and I certainly don't want to force anyone to rewrite all their
> code. But I'd certainly like a safer default so people don't mistakenly
> go the "multiple rows is OK" route without doing so very intentionally.

The problem is that if you change the default, you're creating an
unexpected barrier to upgrading.  I just don't think that it's worth
doing so in order to meet some standard of code neatness, especially in
plpgsql, the unwanted bastard child of SQL and ADA.

For people who want to enable this in order to prevent stupid query bugs
from creeping into their plpgsql, that's great, let's have an easy
option to turn on.  But it's hard enough to get people to upgrade as it
is.  If we're going to add an upgrade landmine, it better be for
something really important.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: plpgsql.consistent_into

From
Marti Raudsepp
Date:
On Sun, Jan 12, 2014 at 7:51 AM, Marko Tiikkaja <marko@joh.to> wrote:
> the behaviour of SELECT .. INTO when the query returns more than one row.
> Some of you might know that no exception is raised in this case

Agreed. But I also agree with the rest of the thread about changing
current INTO behavior and introducing new GUC variables.

But PL/pgSQL already has an assignment syntax with the behavior you want:

DECLARE foo int;
BEGIN foo = generate_series(1,1); -- this is OK foo = generate_series(1,2); -- fails foo = 10 WHERE FALSE; -- sets foo
toNULL -- And you can actually do: foo = some_col FROM some_table WHERE bar=10;
 
END;

So if we extend this syntax to support multiple columns, it should
satisfy the use cases you care about.
 foo, bar = col1, col2 FROM some_table WHERE bar=10;

It's ugly without the explicit SELECT though. Perhaps make the SELECT optional:
 foo, bar = SELECT col1, col2 FROM some_table WHERE bar=10;

I think that's more aesthetically pleasing than INTO and also looks
more familiar to other languages.

Plus, now you can copy-paste the query straight to an SQL shell
without another footgun involving creating new tables in your
database.

Regards,
Marti



Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 2014-01-14 02:54, Marti Raudsepp wrote:
> On Sun, Jan 12, 2014 at 7:51 AM, Marko Tiikkaja <marko@joh.to> wrote:
>> the behaviour of SELECT .. INTO when the query returns more than one row.
>> Some of you might know that no exception is raised in this case
>
> Agreed. But I also agree with the rest of the thread about changing
> current INTO behavior and introducing new GUC variables.
>
> But PL/pgSQL already has an assignment syntax with the behavior you want:

According to the docs, that doesn't set FOUND which would make this a 
pain to deal with..


Regards,
Marko Tiikkaja




Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



I am thinking so GUC and plpgsql option can live together. If you like to accent a some behave, then you can use a plpgsql option. On second hand, I would to use a some functionality, that is safe, but I don't would to dirty source code by using repeated options. But I have to check (and calculate with risk) a GUC settings.

One idea: required GUC? Can be nice a possibility to ensure some GUC setting, and restore ensure these values or raises warning.

Back to main topic. Required and described feature doesn't change a behave of INTO clause. I can enable or disable this functionality and well written code should to work without change (and problems). When check is disabled, then execution is just less safe. So in this case, a impact of GUC is significantly less than by you described issues. Does know anybody a use case where this check should be disabled?

Probably we have a different experience about GUC. I had a problem with  standard_conforming_strings and bytea format some years ago. Now I prepare document about required setting. But I can see (from my experience from Czech area) more often  problems related to effective_cache_size or from_collapse_limit and similar GUC. These parameters are behind knowledge (and visibility) typical user.

ISTM that in this case, it should be safe to make the new default behavior STRICT; if you forget to set the GUC to disable than you'll get an error that points directly at the problem, at which point you'll go "Oh, yeah... I forgot to set X..."


I disagree - STRICT can be too restrictive - and a combination SELECT INTO and IF FOUND can be significantly faster - our exception handling is not cheap.

Regards

Pavel
 
Outside of the GUC, I believe the default should definitely be STRICT. If your app is relying on non-strict then you need to be made aware of that. We should be able to provide a DO block that will change this setting for every function you've got if someone isn't happy with STRICT mode.
--
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/14 Florian Pflug <fgp@phlo.org>
On Jan14, 2014, at 00:52 , Marko Tiikkaja <marko@joh.to> wrote:
> When I've worked with PL/PgSQL, this has been a source of a few bugs that
> would have been noticed during testing if the behaviour of INTO wasn't as
> dangerous as it is right now.

The question is, how many bugs stemmed from wrong SQL queries, and what
percentage of those would have been caught by this? The way I see it, there
are thousands of ways to screw up a query, and having it return multiple
rows instead of one is just one of them.

> Yes, it breaks backwards compatibility, but that's why there's a nice GUC.

Which doesn't help, because the GUC isn't tied to the code. This *adds*
an error case, not remove one - now, instead of getting your code correct,
you *also* have to get the GUC correct. If you even *know* that such a GUC
exists.

> If we're not going to scrap PL/PgSQL and
> start over again, we are going to have to do changes like this to make the
> language better.  Also I think that out of all the things we could do to
> break backwards compatibility, this is closer to "harmless" than "a pain
> in the butt".

I very strongly believe that languages don't get better by adding a thousand
little knobs which subtly change semantics. Look at the mess that is PHP -
we absolutely, certainly don't want to go there. The most important rule in
language design is in my opinion "stick with your choices". C, C++ and JAVA
all seem to follow this, and it's one of the reasons these languages are
popular for big projects, I think.

The way I see it, the only OK way to change existing behaviour is to have
the concept of a "language version", and force code to indicate the language
version it expects. The important thing is that the language version is an
attribute of code, not some global setting that you can change without ever
looking at the code it'd affect.

So if we really want to change this, I think we need to have a
LANGUAGE_VERSION attribute on functions. Each time a major postgres release
changes the behaviour of one of the procedural languages, we'd increment
that language's version, and enable the old behaviour for all functions
tagged with an older one.

I dislike this proposal

too enterprise, too complex, too bad - after ten years we can have a ten language versions and it helps nothing.

return back to topic

a) there is agreement so we like this functionality as plpgsql option

b) there is no agreement so we would to see this functionality as default (or in near future)

@b is a topic, that we should not to solve and it is back again and again. Sometimes we have success, sometimes not. Temporal GUC is not enough solution for some people.

So my result - @a can be implement, @b not

and we can continue in other thread about SELECT INTO redesign and about possibilities that we have or have not. It is about personal perspective - someone can thinking about missing check after INTO as about feature, other one can see it as bug. My opinion - it is a bug with possible ambiguous result and possible negative performance impacts. Probably we can precise a doc about wrong and good usage SELECT INTO clause.

I am working on plpgsql_debug extension and I am thinking so I am able (with small change in plpgsql executor) implement this check as extension.  So it can be available for advanced users (that will has a knowledge about additional plpgsql extensions).

Regards

Pavel


regards

Pavel
 

best regards,
Florian Pflug







Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/14/14 10:16 AM, Pavel Stehule wrote:
> 2014/1/14 Florian Pflug <fgp@phlo.org>
>
>> So if we really want to change this, I think we need to have a
>> LANGUAGE_VERSION attribute on functions. Each time a major postgres release
>> changes the behaviour of one of the procedural languages, we'd increment
>> that language's version, and enable the old behaviour for all functions
>> tagged with an older one.
>>
>
> I dislike this proposal
>
> too enterprise, too complex, too bad - after ten years we can have a ten
> language versions and it helps nothing.

At this point I'm almost tempted to agree with Florian -- I'm really 
hoping we could change PL/PgSQL for the better over the next 10 years, 
but given the backwards compatibility requirements we have, this seems 
like an absolute nightmare.  Not saying we need a version number we can 
keep bumping every release, but something similar.

> return back to topic
>
> a) there is agreement so we like this functionality as plpgsql option
>
> b) there is no agreement so we would to see this functionality as default
> (or in near future)
>
> @b is a topic, that we should not to solve and it is back again and again.
> Sometimes we have success, sometimes not. Temporal GUC is not enough
> solution for some people.
>
> So my result - @a can be implement, @b not

FWIW, I would like to have this behaviour even if it's not the default 
(that was my original proposal anyway).  It could help catch a lot of 
bugs in testing, and for important code it could be even turned on in 
production (perhaps on a per-function basis).  Maybe even globally, idk.

> I am working on plpgsql_debug extension and I am thinking so I am able
> (with small change in plpgsql executor) implement this check as extension.
> So it can be available for advanced users (that will has a knowledge about
> additional plpgsql extensions).

While this sounds cool, running a modified postgres locally seems a lot 
easier.  I already have to do that for PL/PgSQL development because of 
the reasons I outlined in the thread "PL/PgSQL, RAISE and error context".


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Marti Raudsepp
Date:
I've always hated INTO in procedures since it makes the code harder to
follow and has very different behavior on the SQL level, in addition
to the multi-row problem you bring up. If we can make assignment
syntax more versatile and eventually replace INTO, then that solves
multiple problems in the language without breaking backwards
compatibility.

On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja <marko@joh.to> wrote:
> On 2014-01-14 02:54, Marti Raudsepp wrote:
>> But PL/pgSQL already has an assignment syntax with the behavior you want:
>
> According to the docs, that doesn't set FOUND which would make this a pain
> to deal with..

Right you are. If we can extend the syntax then we could make it such
that "= SELECT" sets FOUND and other diagnostics, and a simple
assignment doesn't. Which makes sense IMO:

a = 10; -- simple assignments really shouldn't affect FOUND

With explicit SELECT, clearly the intent is to perform a query: a = SELECT foo FROM table;
And this could also work: a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;

AFAICT the fact that this works is more of an accident and should be
discouraged. We can leave it as is for compatibility's sake: a = foo FROM table;

Now, another question is whether it's possible to make the syntax
work. Is this an assignment from the result of a subquery, or is it a
query by itself? a = (SELECT foo FROM table);

Does anyone consider this proposal workable?

Regards,
Marti



Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/14/14 12:28 PM, Marti Raudsepp wrote:
> I've always hated INTO in procedures since it makes the code harder to
> follow and has very different behavior on the SQL level, in addition
> to the multi-row problem you bring up. If we can make assignment
> syntax more versatile and eventually replace INTO, then that solves
> multiple problems in the language without breaking backwards
> compatibility.

I don't personally have a problem with INTO other than the behaviour 
that started this thread.  But I'm willing to consider other options.

> On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja <marko@joh.to> wrote:
>> On 2014-01-14 02:54, Marti Raudsepp wrote:
>>> But PL/pgSQL already has an assignment syntax with the behavior you want:
>>
>> According to the docs, that doesn't set FOUND which would make this a pain
>> to deal with..
>
> Right you are. If we can extend the syntax then we could make it such
> that "= SELECT" sets FOUND and other diagnostics, and a simple
> assignment doesn't. Which makes sense IMO:
>
> a = 10; -- simple assignments really shouldn't affect FOUND

With you so far.

> With explicit SELECT, clearly the intent is to perform a query:
>    a = SELECT foo FROM table;
> And this could also work:
>    a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;

I'm not sure that would work with the grammar.  Basically what PL/PgSQL 
does right now is for a statement like:
  a = 1;

It parses the "a =" part itself, and then just reads until the next 
unquoted semicolon without actually looking at it, and slams a "SELECT " 
in front of it.  With this approach we'd have to look into the query and 
try and guess what it does.  That might be possible, but I don't like 
the idea.

> AFAICT the fact that this works is more of an accident and should be
> discouraged. We can leave it as is for compatibility's sake:
>    a = foo FROM table;

I've always considered that ugly (IIRC it's still undocumented as well), 
and would encourage people not to do that.

> Now, another question is whether it's possible to make the syntax
> work. Is this an assignment from the result of a subquery, or is it a
> query by itself?
>    a = (SELECT foo FROM table);

That looks like a scalar subquery, which is wrong because they can't 
return more than one column (nor can they be INSERT etc., obviously).

How about:
  (a) = SELECT 1;  (a, b) = SELECT 1, 2;  (a, b) = INSERT INTO foo RETURNING col1, col2;

Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count. 
AFAICT this can be parsed unambiguously, too, and we don't need to look 
at the query string because this is new syntax.


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/14 Marko Tiikkaja <marko@joh.to>
On 1/14/14 12:28 PM, Marti Raudsepp wrote:
I've always hated INTO in procedures since it makes the code harder to
follow and has very different behavior on the SQL level, in addition
to the multi-row problem you bring up. If we can make assignment
syntax more versatile and eventually replace INTO, then that solves
multiple problems in the language without breaking backwards
compatibility.

I don't personally have a problem with INTO other than the behaviour that started this thread.  But I'm willing to consider other options.


On Tue, Jan 14, 2014 at 4:30 AM, Marko Tiikkaja <marko@joh.to> wrote:
On 2014-01-14 02:54, Marti Raudsepp wrote:
But PL/pgSQL already has an assignment syntax with the behavior you want:

According to the docs, that doesn't set FOUND which would make this a pain
to deal with..

Right you are. If we can extend the syntax then we could make it such
that "= SELECT" sets FOUND and other diagnostics, and a simple
assignment doesn't. Which makes sense IMO:

a = 10; -- simple assignments really shouldn't affect FOUND

With you so far.


With explicit SELECT, clearly the intent is to perform a query:
   a = SELECT foo FROM table;
And this could also work:
   a = INSERT INTO table (foo) VALUES (10) RETURNING foo_id;

I'm not sure that would work with the grammar.  Basically what PL/PgSQL does right now is for a statement like:

  a = 1;

It parses the "a =" part itself, and then just reads until the next unquoted semicolon without actually looking at it, and slams a "SELECT " in front of it.  With this approach we'd have to look into the query and try and guess what it does.  That might be possible, but I don't like the idea.


AFAICT the fact that this works is more of an accident and should be
discouraged. We can leave it as is for compatibility's sake:
   a = foo FROM table;

I've always considered that ugly (IIRC it's still undocumented as well), and would encourage people not to do that.


Now, another question is whether it's possible to make the syntax
work. Is this an assignment from the result of a subquery, or is it a
query by itself?
   a = (SELECT foo FROM table);

only this form is allowed in SQL/PSM - and it has some logic - you can assign result of subquery (should be one row only) to variable.
 

That looks like a scalar subquery, which is wrong because they can't return more than one column (nor can they be INSERT etc., obviously).

How about:

  (a) = SELECT 1;
  (a, b) = SELECT 1, 2;
  (a, b) = INSERT INTO foo RETURNING col1, col2;


I prefer subquery only syntax - a := (some) or (a,b,c) = (some a,b,c) with possible enhancing for statements with RETURNING

a advance is compatibility with DB2 (SQL/PSM) syntax - and this code is written now - it is done in my sql/psm implementation

Regards

Pavel

 
Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count. AFAICT this can be parsed unambiguously, too, and we don't need to look at the query string because this is new syntax.


Regards,
Marko Tiikkaja



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/14/14 1:28 PM, Pavel Stehule wrote:
> I prefer subquery only syntax - a := (some) or (a,b,c) = (some a,b,c) with
> possible enhancing for statements with RETURNING
>
> a advance is compatibility with DB2 (SQL/PSM) syntax - and this code is
> written now - it is done in my sql/psm implementation

Are you saying that DB2 supports the multi-variable assignment?  I 
couldn't find any reference suggesting that, can you point me to one?


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Pavel Stehule
Date:
Hello


2014/1/14 Marko Tiikkaja <marko@joh.to>
On 1/14/14 1:28 PM, Pavel Stehule wrote:
I prefer subquery only syntax - a := (some) or (a,b,c) = (some a,b,c) with
possible enhancing for statements with RETURNING

a advance is compatibility with DB2 (SQL/PSM) syntax - and this code is
written now - it is done in my sql/psm implementation

Are you saying that DB2 supports the multi-variable assignment?  I couldn't find any reference suggesting that, can you point me to one?

I was wrong - it is different syntax - it is SQL/PSM form - db2 supports SET var=val, var=val, ...

 http://postgres.cz/wiki/Basic_statements_SQL/PSM_samples

Regards

Pavel



Regards,
Marko Tiikkaja

Re: plpgsql.consistent_into

From
Tom Lane
Date:
Marko Tiikkaja <marko@joh.to> writes:
> On 1/14/14 12:28 PM, Marti Raudsepp wrote:
>> Now, another question is whether it's possible to make the syntax
>> work. Is this an assignment from the result of a subquery, or is it a
>> query by itself?
>> a = (SELECT foo FROM table);

> That looks like a scalar subquery, which is wrong because they can't 
> return more than one column (nor can they be INSERT etc., obviously).

Yeah, it's a scalar subquery, which means that plpgsql already assigns
a non-error meaning to this syntax.

> How about:

>    (a) = SELECT 1;
>    (a, b) = SELECT 1, 2;
>    (a, b) = INSERT INTO foo RETURNING col1, col2;

> Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count. 
> AFAICT this can be parsed unambiguously, too, and we don't need to look 
> at the query string because this is new syntax.

The idea of inventing new syntax along this line seems like a positive
direction to pursue.  Since assignment already rejects multiple rows
from the source expression, this wouldn't be weirdly inconsistent.

It might be worth thinking about the <multiple column assignment> UPDATE
syntax that's in recent versions of the SQL standard:
UPDATE targettab SET (a, b, c) = row-valued-expression [ , ... ] [ WHERE ... ]

We don't actually implement this in PG yet, except for trivial cases, but
it will certainly happen eventually. I think your sketch above deviates
unnecessarily from what the standard says for UPDATE.  In particular
I think it'd be better to write things like
(a, b) = ROW(1, 2);(a, b, c) = (SELECT x, y, z FROM foo WHERE id = 42);

which would exactly match what you'd write in a multiple-assignment
UPDATE, and it has the same rejects-multiple-rows semantics too.

Also note that the trivial cases we do already implement in UPDATE
look like
UPDATE targettab SET (a, b, c) = (1, 2, 3) [ WHERE ... ]

that is, we allow a row constructor where the optional keyword ROW has
been omitted.  I think people would expect to be able to write this in
plpgsql:
(a, b) = (1, 2);

Now, this doesn't provide any guidance for INSERT/UPDATE/DELETE RETURNING,
but frankly I don't feel any need to invent new syntax for those, since
RETURNING INTO already works the way you want.

I'm not too sure what it'd take to make this work.  Right now,
SELECT (SELECT x, y FROM foo WHERE id = 42);

would generate "ERROR:  subquery must return only one column", but
I think it's mostly a historical artifact that it does that rather than
returning a composite value (of an anonymous record type).  If we were
willing to make that change then it seems like it'd be pretty
straightforward to teach plpgsql to handle
(a, b, ...) = row-valued-expression

where there wouldn't actually be any need to parse the RHS any differently
from the way plpgsql parses an assignment RHS right now.  Which would be
a good thing IMO.  If we don't generalize the behavior of scalar
subqueries then plpgsql would have to jump through a lot of hoops to
support the subselect case.
        regards, tom lane



Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/14/14, 6:15 PM, Tom Lane wrote:
> Marko Tiikkaja <marko@joh.to> writes:
>> How about:
>
>>     (a) = SELECT 1;
>>     (a, b) = SELECT 1, 2;
>>     (a, b) = INSERT INTO foo RETURNING col1, col2;
>
>> Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
>> AFAICT this can be parsed unambiguously, too, and we don't need to look
>> at the query string because this is new syntax.
>
> The idea of inventing new syntax along this line seems like a positive
> direction to pursue.  Since assignment already rejects multiple rows
> from the source expression, this wouldn't be weirdly inconsistent.
>
> It might be worth thinking about the <multiple column assignment> UPDATE
> syntax that's in recent versions of the SQL standard:
>
>     UPDATE targettab SET (a, b, c) = row-valued-expression [ , ... ] [ WHERE ... ]
>
> We don't actually implement this in PG yet, except for trivial cases, but
> it will certainly happen eventually. I think your sketch above deviates
> unnecessarily from what the standard says for UPDATE.  In particular
> I think it'd be better to write things like
>
>     (a, b) = ROW(1, 2);
>     (a, b, c) = (SELECT x, y, z FROM foo WHERE id = 42);
>
> which would exactly match what you'd write in a multiple-assignment
> UPDATE, and it has the same rejects-multiple-rows semantics too.

Hmm.  That's a fair point I did not consider.

> Also note that the trivial cases we do already implement in UPDATE
> look like
>
>     UPDATE targettab SET (a, b, c) = (1, 2, 3) [ WHERE ... ]
>
> that is, we allow a row constructor where the optional keyword ROW has
> been omitted.  I think people would expect to be able to write this in
> plpgsql:
>
>     (a, b) = (1, 2);
>
> Now, this doesn't provide any guidance for INSERT/UPDATE/DELETE RETURNING,
> but frankly I don't feel any need to invent new syntax for those, since
> RETURNING INTO already works the way you want.

Yeah, I don't feel strongly about having to support them with this 
syntax.  The inconsistency is a bit ugly, but it's not the end of the world.

> I'm not too sure what it'd take to make this work.  Right now,
>
>     SELECT (SELECT x, y FROM foo WHERE id = 42);
>
> would generate "ERROR:  subquery must return only one column", but
> I think it's mostly a historical artifact that it does that rather than
> returning a composite value (of an anonymous record type).  If we were
> willing to make that change then it seems like it'd be pretty
> straightforward to teach plpgsql to handle
>
>     (a, b, ...) = row-valued-expression
>
> where there wouldn't actually be any need to parse the RHS any differently
> from the way plpgsql parses an assignment RHS right now.  Which would be
> a good thing IMO.  If we don't generalize the behavior of scalar
> subqueries then plpgsql would have to jump through a lot of hoops to
> support the subselect case.

You can already do the equivalent of  (a,b,c) = (1,2,3)  with SELECT .. 
INTO.  Would you oppose to starting the work on this by only supporting 
the subquery syntax, with the implementation being similar to how we 
currently handle SELECT .. INTO?  If we could also not flat out reject 
including that into 9.4, I would be quite happy.


Regards,
Marko Tiikkaja



Re: plpgsql.consistent_into

From
Tom Lane
Date:
Marko Tiikkaja <marko@joh.to> writes:
> On 1/14/14, 6:15 PM, Tom Lane wrote:
>> I'm not too sure what it'd take to make this work.  Right now,
>> 
>> SELECT (SELECT x, y FROM foo WHERE id = 42);
>> 
>> would generate "ERROR:  subquery must return only one column", but
>> I think it's mostly a historical artifact that it does that rather than
>> returning a composite value (of an anonymous record type).  If we were
>> willing to make that change then it seems like it'd be pretty
>> straightforward to teach plpgsql to handle
>> 
>> (a, b, ...) = row-valued-expression
>> 
>> where there wouldn't actually be any need to parse the RHS any differently
>> from the way plpgsql parses an assignment RHS right now.  Which would be
>> a good thing IMO.  If we don't generalize the behavior of scalar
>> subqueries then plpgsql would have to jump through a lot of hoops to
>> support the subselect case.

> You can already do the equivalent of  (a,b,c) = (1,2,3)  with SELECT .. 
> INTO.  Would you oppose to starting the work on this by only supporting 
> the subquery syntax, with the implementation being similar to how we 
> currently handle SELECT .. INTO?

You can try if you want, but I suspect it will result in writing a lot of
basically throwaway code, ie, not something that would be a precursor
of code that could support the generic-row-valued-expression case.
        regards, tom lane



Re: plpgsql.consistent_into

From
Jim Nasby
Date:
On 1/14/14, 11:15 AM, Tom Lane wrote:
>> How about:
>> >    (a) = SELECT 1;
>> >    (a, b) = SELECT 1, 2;
>> >    (a, b) = INSERT INTO foo RETURNING col1, col2;
>> >Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
>> >AFAICT this can be parsed unambiguously, too, and we don't need to look
>> >at the query string because this is new syntax.
> The idea of inventing new syntax along this line seems like a positive
> direction to pursue.  Since assignment already rejects multiple rows
> from the source expression, this wouldn't be weirdly inconsistent.

Do we actually support = right now? We already support

v_field := field FROM table ... ;

and I think it's a bad idea to have different meaning for = and :=.

> I'm not too sure what it'd take to make this work.  Right now,
>
>     SELECT (SELECT x, y FROM foo WHERE id = 42);
>
> would generate "ERROR:  subquery must return only one column", but
> I think it's mostly a historical artifact that it does that rather than
> returning a composite value (of an anonymous record type).  If we were
> willing to make that change then it seems like it'd be pretty
> straightforward to teach plpgsql to handle
>
>     (a, b, ...) = row-valued-expression
>
> where there wouldn't actually be any need to parse the RHS any differently
> from the way plpgsql parses an assignment RHS right now.  Which would be
> a good thing IMO.  If we don't generalize the behavior of scalar
> subqueries then plpgsql would have to jump through a lot of hoops to
> support the subselect case.

I have no idea if this is related or not, but I would REALLY like for this to work (doesn't in 8.4, AFAIK not in 9.1
either...)

CREATE FUNCTION f(int) RETURNS text STABLE LANGUAGE sql AS ( SELECT field FROM table WHERE table_id = $1 );
SELECT f(blah_id) FROM ...

to be equivalent to

SELECT ( SELECT field FROM table WHERE table_id = blah_id ) FROM ...

That would make it very easy to do a lot of code simplification with no performance loss.
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql.consistent_into

From
Tom Lane
Date:
Jim Nasby <jim@nasby.net> writes:
> Do we actually support = right now? We already support
> v_field := field FROM table ... ;
> and I think it's a bad idea to have different meaning for = and :=.

That ship sailed a *very* long time ago.  See other thread about
documenting rather than ignoring this more-or-less-aboriginal
behavior of plpgsql.

> I have no idea if this is related or not, but I would REALLY like for this to work (doesn't in 8.4, AFAIK not in 9.1
either...)

Hm ... too tired to be sure, but I think the issue about inlining a
function of this kind has to do with whether you get the same answers
in corner cases such as subselect fetching no rows.
        regards, tom lane



Re: plpgsql.consistent_into

From
Pavel Stehule
Date:



2014/1/15 Jim Nasby <jim@nasby.net>
On 1/14/14, 11:15 AM, Tom Lane wrote:
How about:
>    (a) = SELECT 1;
>    (a, b) = SELECT 1, 2;
>    (a, b) = INSERT INTO foo RETURNING col1, col2;
>Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
>AFAICT this can be parsed unambiguously, too, and we don't need to look
>at the query string because this is new syntax.
The idea of inventing new syntax along this line seems like a positive
direction to pursue.  Since assignment already rejects multiple rows
from the source expression, this wouldn't be weirdly inconsistent.

Do we actually support = right now? We already support

v_field := field FROM table ... ;

and I think it's a bad idea to have different meaning for = and :=.

It is probably second the most ugly thing on plpgsql. I don't know about other crippled embedded SQL statement in any stored procedure language.

Regards

Pavel
 


I'm not too sure what it'd take to make this work.  Right now,

        SELECT (SELECT x, y FROM foo WHERE id = 42);

would generate "ERROR:  subquery must return only one column", but
I think it's mostly a historical artifact that it does that rather than
returning a composite value (of an anonymous record type).  If we were
willing to make that change then it seems like it'd be pretty
straightforward to teach plpgsql to handle

        (a, b, ...) = row-valued-expression

where there wouldn't actually be any need to parse the RHS any differently
from the way plpgsql parses an assignment RHS right now.  Which would be
a good thing IMO.  If we don't generalize the behavior of scalar
subqueries then plpgsql would have to jump through a lot of hoops to
support the subselect case.

I have no idea if this is related or not, but I would REALLY like for this to work (doesn't in 8.4, AFAIK not in 9.1 either...)

CREATE FUNCTION f(int) RETURNS text STABLE LANGUAGE sql AS ( SELECT field FROM table WHERE table_id = $1 );
SELECT f(blah_id) FROM ...

to be equivalent to

SELECT ( SELECT field FROM table WHERE table_id = blah_id ) FROM ...

That would make it very easy to do a lot of code simplification with no performance loss.

--
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: plpgsql.consistent_into

From
Marti Raudsepp
Date:
On Wed, Jan 15, 2014 at 8:23 AM, Jim Nasby <jim@nasby.net> wrote:
> Do we actually support = right now? We already support
>
> v_field := field FROM table ... ;
>
> and I think it's a bad idea to have different meaning for = and :=.

That was already discussed before. Yes, we support both = and := and
they have exactly the same behavior; I agree, we should keep them
equivalent.

Regards,
Marti



Re: plpgsql.consistent_into

From
Jim Nasby
Date:
On 1/15/14, 12:35 AM, Tom Lane wrote:
> Jim Nasby <jim@nasby.net> writes:
>> Do we actually support = right now? We already support
>> v_field := field FROM table ... ;
>> and I think it's a bad idea to have different meaning for = and :=.
>
> That ship sailed a *very* long time ago.  See other thread about
> documenting rather than ignoring this more-or-less-aboriginal
> behavior of plpgsql.

Yeah, I had no idea that was even supported...

>> I have no idea if this is related or not, but I would REALLY like for this to work (doesn't in 8.4, AFAIK not in 9.1
either...)
>
> Hm ... too tired to be sure, but I think the issue about inlining a
> function of this kind has to do with whether you get the same answers
> in corner cases such as subselect fetching no rows.

There was some discussion about this a few years ago and I think that was essentially the issue.

What I think would work is essentially a macro that would dump the function definition right into the query and then
letthe planner deal with it. So
 

SELECT blah, ( SELECT status_code FROM status_code WHERE status_code_id = blah_status_code_id ) FROM blah;

can become simply

SELECT blah, status_code__get_text( blah_status_code_id ) FROM blah;

but have it translate to the same raw SQL, same as views.
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net



Re: plpgsql.consistent_into

From
Marko Tiikkaja
Date:
On 1/14/14, 6:15 PM, Tom Lane wrote:
> We don't actually implement this in PG yet, except for trivial cases, but
> it will certainly happen eventually. I think your sketch above deviates
> unnecessarily from what the standard says for UPDATE.  In particular
> I think it'd be better to write things like
>
>     (a, b) = ROW(1, 2);
>     (a, b, c) = (SELECT x, y, z FROM foo WHERE id = 42);
>
> which would exactly match what you'd write in a multiple-assignment
> UPDATE, and it has the same rejects-multiple-rows semantics too.

Just in case someone's interested: I won't be working on this for 9.5. 
If someone feels like picking this patch up, be my guest.


.marko