Thread: proposal - plpgsql - all plpgsql auto variables should be constant
Hi
plpgsql generate lot of auto variables - FOUND, SQLERRM, cycle's control variable, TG_WHEN, TG_OP, ..
Currently these variables are not protected, what can be source of problems, mainly for not experienced users. I propose mark these variables as constant.
-- today
postgres=# do $$ begin for i in 1..10 loop raise notice 'i=%', i; i := 20; end loop; end; $$;
NOTICE: i=1
NOTICE: i=2
NOTICE: i=3
NOTICE: i=4
NOTICE: i=5
NOTICE: i=6
NOTICE: i=7
NOTICE: i=8
NOTICE: i=9
NOTICE: i=10
DO
NOTICE: i=1
NOTICE: i=2
NOTICE: i=3
NOTICE: i=4
NOTICE: i=5
NOTICE: i=6
NOTICE: i=7
NOTICE: i=8
NOTICE: i=9
NOTICE: i=10
DO
-- after patch
postgres=# do $$ begin for i in 1..10 loop raise notice 'i=%', i; i := 20; end loop; end; $$;
ERROR: variable "i" is declared CONSTANT
LINE 1: ... begin for i in 1..10 loop raise notice 'i=%', i; i := 20; e...
ERROR: variable "i" is declared CONSTANT
LINE 1: ... begin for i in 1..10 loop raise notice 'i=%', i; i := 20; e...
These variables are protected in PL/SQL too.
Comments, notes?
Regards
Pavel
p.s. this is simple implementation - just for function demo. Maybe can be better to introduce new plpgsql_variable's flag like is_protected or similar than using isconst.
Attachment
On Fri, Apr 24, 2020 at 12:24 PM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Hi > > plpgsql generate lot of auto variables - FOUND, SQLERRM, cycle's control variable, TG_WHEN, TG_OP, .. > > Currently these variables are not protected, what can be source of problems, mainly for not experienced users. I proposemark these variables as constant. > +1 for general idea. > -- today > postgres=# do $$ begin for i in 1..10 loop raise notice 'i=%', i; i := 20; end loop; end; $$; > NOTICE: i=1 > NOTICE: i=2 > NOTICE: i=3 > NOTICE: i=4 > NOTICE: i=5 > NOTICE: i=6 > NOTICE: i=7 > NOTICE: i=8 > NOTICE: i=9 > NOTICE: i=10 > DO > > -- after patch > postgres=# do $$ begin for i in 1..10 loop raise notice 'i=%', i; i := 20; end loop; end; $$; > ERROR: variable "i" is declared CONSTANT CONSTANT looks odd in this context since i's value changes. But you already have a proposal to change that. > > p.s. this is simple implementation - just for function demo. Maybe can be better to introduce new plpgsql_variable's flaglike is_protected or similar than using isconst. Yes, I think that will help. In this case PL/SQL says that "i" can not be used as an assignment target. That's not very clear but something on those lines will help. -- Best Wishes, Ashutosh Bapat
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes: > On Fri, Apr 24, 2020 at 12:24 PM Pavel Stehule <pavel.stehule@gmail.com> wrote: >> plpgsql generate lot of auto variables - FOUND, SQLERRM, cycle's control variable, TG_WHEN, TG_OP, .. >> Currently these variables are not protected, what can be source of problems, mainly for not experienced users. I proposemark these variables as constant. > +1 for general idea. I'm skeptical. If we'd marked them that way from day one, it would have been fine, but to change it now is a whole different discussion. I think the odds that anybody will thank us are much smaller than the odds that there will be complaints. In particular, I'd be just about certain that there are people out there who are changing FOUND and loop control variables manually, and they will not appreciate us breaking their code. As for the trigger variables specifically, what is the rationale for marking TG_OP read-only but not OLD and NEW? But it is dead certain that we won't get away with making the latter two read-only. In short, -1. This ship sailed about twenty years ago. regards, tom lane
pá 24. 4. 2020 v 16:07 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:
> On Fri, Apr 24, 2020 at 12:24 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> plpgsql generate lot of auto variables - FOUND, SQLERRM, cycle's control variable, TG_WHEN, TG_OP, ..
>> Currently these variables are not protected, what can be source of problems, mainly for not experienced users. I propose mark these variables as constant.
> +1 for general idea.
I'm skeptical. If we'd marked them that way from day one, it would have
been fine, but to change it now is a whole different discussion. I think
the odds that anybody will thank us are much smaller than the odds that
there will be complaints. In particular, I'd be just about certain that
there are people out there who are changing FOUND and loop control
variables manually, and they will not appreciate us breaking their code.
This is not black/white issue. Maybe can sense to modify the FOUND variable, but modification of control variable has not any sense. The updated value is rewriten by runtime any iteration. You cannot to use modification of control variable to skip some iterations like in C.
As for the trigger variables specifically, what is the rationale
for marking TG_OP read-only but not OLD and NEW? But it is dead
certain that we won't get away with making the latter two read-only.
For before triggers the NEW have to be updated. Any other maybe should be protected, but there is little bit different kind of informations.
In short, -1. This ship sailed about twenty years ago.
regards, tom lane
At Fri, 24 Apr 2020 16:47:28 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in > pá 24. 4. 2020 v 16:07 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal: > > > Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes: > > > On Fri, Apr 24, 2020 at 12:24 PM Pavel Stehule <pavel.stehule@gmail.com> > > wrote: > > >> plpgsql generate lot of auto variables - FOUND, SQLERRM, cycle's > > control variable, TG_WHEN, TG_OP, .. > > >> Currently these variables are not protected, what can be source of > > problems, mainly for not experienced users. I propose mark these variables > > as constant. > > > > > +1 for general idea. > > > > I'm skeptical. If we'd marked them that way from day one, it would have > > been fine, but to change it now is a whole different discussion. I think > > the odds that anybody will thank us are much smaller than the odds that > > there will be complaints. In particular, I'd be just about certain that > > there are people out there who are changing FOUND and loop control > > variables manually, and they will not appreciate us breaking their code. > > > > This is not black/white issue. Maybe can sense to modify the FOUND > variable, but modification of control variable has not any sense. The > updated value is rewriten by runtime any iteration. You cannot to use > modification of control variable to skip some iterations like in C. It seems to me, the loop structure is not a parallel of for() in C. It is rather a parallel of foreach of Perl or "for in range()" in Python. So it is natural to me that the i is assignable and reset with the next value at every iteration. I believe that there are many existing cases where the control variable is modified in a loop. On the other hand, I'm not sure about FOUND and the similars and I don't have a firm opinion them. I don't see a use case where they need to be assignable. However, I don't see a clear reason they mustn't be assignable, too. (And the behavior is documented at least for FOUND.) > > As for the trigger variables specifically, what is the rationale > > for marking TG_OP read-only but not OLD and NEW? But it is dead > > certain that we won't get away with making the latter two read-only. > > > > For before triggers the NEW have to be updated. Any other maybe should be > protected, but there is little bit different kind of informations. > > > In short, -1. This ship sailed about twenty years ago. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
po 27. 4. 2020 v 5:02 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com> napsal:
At Fri, 24 Apr 2020 16:47:28 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in
> pá 24. 4. 2020 v 16:07 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
>
> > Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:
> > > On Fri, Apr 24, 2020 at 12:24 PM Pavel Stehule <pavel.stehule@gmail.com>
> > wrote:
> > >> plpgsql generate lot of auto variables - FOUND, SQLERRM, cycle's
> > control variable, TG_WHEN, TG_OP, ..
> > >> Currently these variables are not protected, what can be source of
> > problems, mainly for not experienced users. I propose mark these variables
> > as constant.
> >
> > > +1 for general idea.
> >
> > I'm skeptical. If we'd marked them that way from day one, it would have
> > been fine, but to change it now is a whole different discussion. I think
> > the odds that anybody will thank us are much smaller than the odds that
> > there will be complaints. In particular, I'd be just about certain that
> > there are people out there who are changing FOUND and loop control
> > variables manually, and they will not appreciate us breaking their code.
> >
>
> This is not black/white issue. Maybe can sense to modify the FOUND
> variable, but modification of control variable has not any sense. The
> updated value is rewriten by runtime any iteration. You cannot to use
> modification of control variable to skip some iterations like in C.
It seems to me, the loop structure is not a parallel of for() in C. It
is rather a parallel of foreach of Perl or "for in range()" in
Python. So it is natural to me that the i is assignable and reset with
the next value at every iteration. I believe that there are many
existing cases where the control variable is modified in a loop.
it is based on PL/SQL language and this language is based on ADA.
There loop parameter is constant
Regards
Pavel
On the other hand, I'm not sure about FOUND and the similars and I
don't have a firm opinion them. I don't see a use case where they need
to be assignable. However, I don't see a clear reason they mustn't be
assignable, too. (And the behavior is documented at least for FOUND.)
> > As for the trigger variables specifically, what is the rationale
> > for marking TG_OP read-only but not OLD and NEW? But it is dead
> > certain that we won't get away with making the latter two read-only.
> >
>
> For before triggers the NEW have to be updated. Any other maybe should be
> protected, but there is little bit different kind of informations.
>
> > In short, -1. This ship sailed about twenty years ago.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
po 27. 4. 2020 v 16:26 odesílatel Greg Stark <stark@mit.edu> napsal:
On Fri, 24 Apr 2020 at 10:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I'm skeptical. If we'd marked them that way from day one, it would have
> been fine, but to change it now is a whole different discussion. I think
> the odds that anybody will thank us are much smaller than the odds that
> there will be complaints. In particular, I'd be just about certain that
> there are people out there who are changing FOUND and loop control
> variables manually, and they will not appreciate us breaking their code.
I kind of doubt it would break anybody's code. But I also doubt it's
actually going to help anybody. It's not exactly an easy bug to write,
so meh, I can't really get worked up either way about this.
> As for the trigger variables specifically, what is the rationale
> for marking TG_OP read-only but not OLD and NEW? But it is dead
> certain that we won't get away with making the latter two read-only.
But, uh, this actually seems like it might help people. Obviously we
can't make NEW constant for BEFORE triggers, but for AFTER triggers it
would actually be catching quite an easy-to-write bug. I bet plenty of
people accidentally define triggers as AFTER triggers which are
intending to modify the columns being stored and then don't understand
why they aren't working.
They might not even find out right away if the trigger only modifies
the columns sometimes so it could be the kind of latent bug that
catches people in production (which wouldn't be improved by the patch
but at least it would produce an error rather than silent data
corruption).
The only valid use cases that maybe would cause some pain would be
people using the same function for BEFORE *and* AFTER triggers and
where that code is written to just assign to NEW in both cases. That
seems like it would be odd though since we're not talking about an
audit function or something like that if the function is assigning to
NEW...
this behave can be dynamic and it can be active only for AFTER trigger
--
greg
On Mon, Apr 27, 2020 at 7:56 PM Greg Stark <stark@mit.edu> wrote: > > On Fri, 24 Apr 2020 at 10:08, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > I'm skeptical. If we'd marked them that way from day one, it would have > > been fine, but to change it now is a whole different discussion. I think > > the odds that anybody will thank us are much smaller than the odds that > > there will be complaints. In particular, I'd be just about certain that > > there are people out there who are changing FOUND and loop control > > variables manually, and they will not appreciate us breaking their code. > > I kind of doubt it would break anybody's code. But I also doubt it's > actually going to help anybody. It's not exactly an easy bug to write, > so meh, I can't really get worked up either way about this. We could retain the old behaviour by using a GUC which defaults to old behaviour. More GUCs means more confusion, this once guc under plpgsql extension might actually help. -- Best Wishes, Ashutosh Bapat
út 28. 4. 2020 v 13:35 odesílatel Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> napsal:
On Mon, Apr 27, 2020 at 7:56 PM Greg Stark <stark@mit.edu> wrote:
>
> On Fri, 24 Apr 2020 at 10:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > I'm skeptical. If we'd marked them that way from day one, it would have
> > been fine, but to change it now is a whole different discussion. I think
> > the odds that anybody will thank us are much smaller than the odds that
> > there will be complaints. In particular, I'd be just about certain that
> > there are people out there who are changing FOUND and loop control
> > variables manually, and they will not appreciate us breaking their code.
>
> I kind of doubt it would break anybody's code. But I also doubt it's
> actually going to help anybody. It's not exactly an easy bug to write,
> so meh, I can't really get worked up either way about this.
We could retain the old behaviour by using a GUC which defaults to old
behaviour. More GUCs means more confusion, this once guc under plpgsql
extension might actually help.
I am not sure if other GUC can help (in this case). Probably it cannot be default, and beginners has zero knowledge to enable this or similar GUC.
This week I enhanced plpgsql_check about new check https://github.com/okbob/plpgsql_check related to this feature.
I afraid so people who needs these checks and some help probably doesn't know about this extension.
--
Best Wishes,
Ashutosh Bapat