Thread: Re: proposal: schema variables
Hi folks, Thanks for continuing this work. As a side note, I would like to mention how strange the situation in this CF item is. If I understand correctly, there are two hackers threads discussing the same patch, with recent patches posted in both of them. This is obviously confusing, e.g. the main concern from another thread, about names clashing, wasn't even mentioned in this one. Is it possible to reconcile development in one thread?
ne 10. 11. 2024 v 16:24 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
Hi folks,
Thanks for continuing this work. As a side note, I would like to mention
how strange the situation in this CF item is. If I understand correctly,
there are two hackers threads discussing the same patch, with recent
patches posted in both of them. This is obviously confusing, e.g. the
main concern from another thread, about names clashing, wasn't even
mentioned in this one. Is it possible to reconcile development in one
thread?
This is probably my error. I don't try to organize threads, just I'll try to reply in the thread where I got a question.
Regards
Pavel
> On Sun, Nov 10, 2024 at 05:19:09PM GMT, Pavel Stehule wrote: > ne 10. 11. 2024 v 16:24 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> > napsal: > > > Hi folks, > > > > Thanks for continuing this work. As a side note, I would like to mention > > how strange the situation in this CF item is. If I understand correctly, > > there are two hackers threads discussing the same patch, with recent > > patches posted in both of them. This is obviously confusing, e.g. the > > main concern from another thread, about names clashing, wasn't even > > mentioned in this one. Is it possible to reconcile development in one > > thread? > > > > This is probably my error. I don't try to organize threads, just I'll try > to reply in the thread where I got a question. It's fine. I just would appreciate some clarity about which patch to look at. To confirm, the series in this thread contains everything from the other one, plus some improvements, right?
ne 10. 11. 2024 v 17:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
ne 10. 11. 2024 v 16:24 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:Hi folks,
Thanks for continuing this work. As a side note, I would like to mention
how strange the situation in this CF item is. If I understand correctly,
there are two hackers threads discussing the same patch, with recent
patches posted in both of them. This is obviously confusing, e.g. the
main concern from another thread, about names clashing, wasn't even
mentioned in this one. Is it possible to reconcile development in one
thread?This is probably my error. I don't try to organize threads, just I'll try to reply in the thread where I got a question.
I thought a lot of time about better solutions for identifier collisions and I really don't think so there is some consistent user friendly syntax. Personally I think there is an easy already implemented solution - convention - just use a dedicated schema for variables and this schema should not be in the search path. Or use secondary convention - like using prefix "__" for session variables. Common convention is using "_" for PLpgSQL variables. I searched how this issue is solved in other databases, or in standard, and I found nothing special. The Oracle and SQL/PSM has a concept of visibility - the variables are not visible outside packages or modules, but Postgres has nothing similar. It can be emulated by a dedicated schema without inserting a search path, but it is less strong.
I think we can introduce an alternative syntax, that will not be user friendly or readable friendly, but it can be without collisions - or can decrease possible risks.
It is nothing new - SQL does it with old, "new" syntax of inner joins, or in Postgres we can
where salary < 40000
or
where pg_catalog.int4lt(salary, 40000);
or some like we use for operators
OPERATOR(
schema
.
operatorname
)
So introducing VARIABLE(schema.variablename) syntax as an alternative syntax for accessing variables I really like. I strongly prefer to use this as only alternative (secondary) syntax, because I don't think it is friendly syntax or writing friendly, but it is safe, and I can imagine tools that can replace generic syntax to this special, or that detects generic syntax and shows some warning. Then users can choose what they prefer. Two syntaxes - generic and special can be good enough for all - and this can be perfectly consistent with current Postgres.
Regards
Pavel
RegardsPavel
ne 10. 11. 2024 v 18:41 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sun, Nov 10, 2024 at 05:19:09PM GMT, Pavel Stehule wrote:
> ne 10. 11. 2024 v 16:24 odesílatel Dmitry Dolgov <9erthalion6@gmail.com>
> napsal:
>
> > Hi folks,
> >
> > Thanks for continuing this work. As a side note, I would like to mention
> > how strange the situation in this CF item is. If I understand correctly,
> > there are two hackers threads discussing the same patch, with recent
> > patches posted in both of them. This is obviously confusing, e.g. the
> > main concern from another thread, about names clashing, wasn't even
> > mentioned in this one. Is it possible to reconcile development in one
> > thread?
> >
>
> This is probably my error. I don't try to organize threads, just I'll try
> to reply in the thread where I got a question.
It's fine. I just would appreciate some clarity about which patch to
look at. To confirm, the series in this thread contains everything from
the other one, plus some improvements, right?
I don't remember any change that can be visible for users in this thread. Laurens does very precious code review (thank you again) - there are bigger changes in comments, and less changes in code - some parts of code are moved between patches, some lines were redundant and removed, some lines were artefacts of some git work and were cleaned. Some redundant tests were removed. There is no new code.
Regards
Pavel
ne 10. 11. 2024 v 18:51 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
ne 10. 11. 2024 v 17:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:ne 10. 11. 2024 v 16:24 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:Hi folks,
Thanks for continuing this work. As a side note, I would like to mention
how strange the situation in this CF item is. If I understand correctly,
there are two hackers threads discussing the same patch, with recent
patches posted in both of them. This is obviously confusing, e.g. the
main concern from another thread, about names clashing, wasn't even
mentioned in this one. Is it possible to reconcile development in one
thread?This is probably my error. I don't try to organize threads, just I'll try to reply in the thread where I got a question.I thought a lot of time about better solutions for identifier collisions and I really don't think so there is some consistent user friendly syntax. Personally I think there is an easy already implemented solution - convention - just use a dedicated schema for variables and this schema should not be in the search path. Or use secondary convention - like using prefix "__" for session variables. Common convention is using "_" for PLpgSQL variables. I searched how this issue is solved in other databases, or in standard, and I found nothing special. The Oracle and SQL/PSM has a concept of visibility - the variables are not visible outside packages or modules, but Postgres has nothing similar. It can be emulated by a dedicated schema without inserting a search path, but it is less strong.
There can be more collisions in Oracle, because the functions without arguments don't need parentheses. Postgres is safer, because this syntax is not allowed.
I think we can introduce an alternative syntax, that will not be user friendly or readable friendly, but it can be without collisions - or can decrease possible risks.It is nothing new - SQL does it with old, "new" syntax of inner joins, or in Postgres we canwhere salary < 40000or
where pg_catalog.int4lt(salary, 40000);or some like we use for operatorsOPERATOR(
schema
.
operatorname
)
So introducing VARIABLE(schema.variablename) syntax as an alternative syntax for accessing variables I really like. I strongly prefer to use this as only alternative (secondary) syntax, because I don't think it is friendly syntax or writing friendly, but it is safe, and I can imagine tools that can replace generic syntax to this special, or that detects generic syntax and shows some warning. Then users can choose what they prefer. Two syntaxes - generic and special can be good enough for all - and this can be perfectly consistent with current Postgres.RegardsPavelRegardsPavel
> On Sun, Nov 10, 2024 at 06:51:40PM GMT, Pavel Stehule wrote: > ne 10. 11. 2024 v 17:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com> > napsal: > I thought a lot of time about better solutions for identifier collisions > and I really don't think so there is some consistent user friendly syntax. > Personally I think there is an easy already implemented solution - > convention - just use a dedicated schema for variables and this schema > should not be in the search path. Or use secondary convention - like using > prefix "__" for session variables. Common convention is using "_" for > PLpgSQL variables. I searched how this issue is solved in other databases, > or in standard, and I found nothing special. The Oracle and SQL/PSM has a > concept of visibility - the variables are not visible outside packages or > modules, but Postgres has nothing similar. It can be emulated by a > dedicated schema without inserting a search path, but it is less strong. > > I think we can introduce an alternative syntax, that will not be user > friendly or readable friendly, but it can be without collisions - or can > decrease possible risks. > > It is nothing new - SQL does it with old, "new" syntax of inner joins, or > in Postgres we can > > where salary < 40000 > > or > > where pg_catalog.int4lt(salary, 40000); > > > or some like we use for operators OPERATOR(*schema*.*operatorname*) > > So introducing VARIABLE(schema.variablename) syntax as an alternative > syntax for accessing variables I really like. I strongly prefer to use this > as only alternative (secondary) syntax, because I don't think it is > friendly syntax or writing friendly, but it is safe, and I can imagine > tools that can replace generic syntax to this special, or that detects > generic syntax and shows some warning. Then users can choose what they > prefer. Two syntaxes - generic and special can be good enough for all - and > this can be perfectly consistent with current Postgres. As far as I recall, last time this topic was discussed in hackers, two options were proposed: the one with VARIABLE(name), what you mention here; and another one with adding variables to the FROM clause. The VARIABLE(...) syntax didn't get much negative feedback, so I guess why not -- if you find it fitting, it would be interesting to see the implementation. I'm afraid it should not be just an alternative syntax, but the only one allowed, because otherwise I don't see how scenarious like "drop a column with the same name" could be avoided. As in the previous thread: -- we've got a variable b at the same time SELECT a, b FROM table1; Then dropping the column b, but everything still works beause the variable b got silently picked up. But if it would be required to say VARIABLE(b), then all fine. And to make sure we're on the same page, could you post couple of examples from curretly existing tests in the patch, how are they going to look like with this proposal? About adding variables to the FROM clause. Looks like this option was quite popular, and you've mentioned some technical challenges implementing that. If you'd like to go with another approach, it would be great to elaborate on that -- maybe even with a PoC, to make a convincing point here.
st 13. 11. 2024 v 17:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sun, Nov 10, 2024 at 06:51:40PM GMT, Pavel Stehule wrote:
> ne 10. 11. 2024 v 17:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
> I thought a lot of time about better solutions for identifier collisions
> and I really don't think so there is some consistent user friendly syntax.
> Personally I think there is an easy already implemented solution -
> convention - just use a dedicated schema for variables and this schema
> should not be in the search path. Or use secondary convention - like using
> prefix "__" for session variables. Common convention is using "_" for
> PLpgSQL variables. I searched how this issue is solved in other databases,
> or in standard, and I found nothing special. The Oracle and SQL/PSM has a
> concept of visibility - the variables are not visible outside packages or
> modules, but Postgres has nothing similar. It can be emulated by a
> dedicated schema without inserting a search path, but it is less strong.
>
> I think we can introduce an alternative syntax, that will not be user
> friendly or readable friendly, but it can be without collisions - or can
> decrease possible risks.
>
> It is nothing new - SQL does it with old, "new" syntax of inner joins, or
> in Postgres we can
>
> where salary < 40000
>
> or
>
> where pg_catalog.int4lt(salary, 40000);
>
>
> or some like we use for operators OPERATOR(*schema*.*operatorname*)
>
> So introducing VARIABLE(schema.variablename) syntax as an alternative
> syntax for accessing variables I really like. I strongly prefer to use this
> as only alternative (secondary) syntax, because I don't think it is
> friendly syntax or writing friendly, but it is safe, and I can imagine
> tools that can replace generic syntax to this special, or that detects
> generic syntax and shows some warning. Then users can choose what they
> prefer. Two syntaxes - generic and special can be good enough for all - and
> this can be perfectly consistent with current Postgres.
As far as I recall, last time this topic was discussed in hackers, two
options were proposed: the one with VARIABLE(name), what you mention
here; and another one with adding variables to the FROM clause. The
VARIABLE(...) syntax didn't get much negative feedback, so I guess why
not -- if you find it fitting, it would be interesting to see the
implementation.
I'm afraid it should not be just an alternative syntax, but the only one
allowed, because otherwise I don't see how scenarious like "drop a
column with the same name" could be avoided. As in the previous thread:
-- we've got a variable b at the same time
SELECT a, b FROM table1;
I am sorry, but I am in very strong opposition against this idea. Nobody did reply to my questions, that can change my opinion.
1. This introduces possible inconsistency between LET syntax and SELECT syntax.
What will be the syntax of LET?
LET var = var FROM var
PLpgSQL does something, and it is really strange, and source of some unwanted bugs. See https://commitfest.postgresql.org/50/5044/
With current design I can support
LET var = expr with wars
or
LET var = (QUERY with vars)
It is perfectly consistent. The expressions will be expressions.
2. I don't know of any product in the world that introduced the same requirement. So this syntax will be proprietary (SQL/PSM it really doesn't require) and shock for any users that know other databases. Proprietary syntax in this area far from syntaxes of other products is hell. Try to explain to users the working with OUT variables of Postgres's procedures and functions. And there is some deeper logic.
3. There is a simple solution - convention. Use your own schema like vars, and use session variables in this schema, When this schema will not be on the search path, then there is not a collision.
Variables living in schema. Nobody without CREATE right can create it. So it is safe. Or use prefix in __ for variables in the search path.
4. this requirement introduces syntax inconsistency between plpgsql variables and session variables - which breaks one goal of the patch - introducing some global variables for plpgsql (and for all PL).
5. Using more variables and FROM clauses decrease readability of FROM clause
SELECT v1, v2, a, b, c FROM t1, v1, v2, t2, ...
6. Usually composite variables don't want to unpack. When you should use FROM clause, then composite variables will be unpacked. Then all fields can be possibly in collision with all other column name
Example
CREATE TYPE t1 AS (id int, name varchar)
CREATE TABLE tab(id int, name varchar)
CREATE VARIABLE var1 AS t1;
SELECT id, name, foo(var1) FROM tab, var1;
Now I have a collision in columns id, name, and everywhere I need to use aliases. Without necessity to use var in FROM clause, I can just write
SELECT id, name, foo(var) FROM tab
and there is not any risk of collision
Then dropping the column b, but everything still works beause the
variable b got silently picked up. But if it would be required to say
VARIABLE(b), then all fine.
but same risk you have any time in plpgsql - all time. I don't remember any bug report related to this issue.
And to make sure we're on the same page, could you post couple of
examples from curretly existing tests in the patch, how are they going
to look like with this proposal?
About adding variables to the FROM clause. Looks like this option was
quite popular, and you've mentioned some technical challenges
implementing that. If you'd like to go with another approach, it would
be great to elaborate on that -- maybe even with a PoC, to make a
convincing point here.
There is not any problem with implementation. I see the main problem with usability, and I really don't want to implement some like LET var = var FROM var; I am sorry
It fixes one issue, but it increases possible collisions - so the variables will be unusable.
Regards
Pavel
st 13. 11. 2024 v 17:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sun, Nov 10, 2024 at 06:51:40PM GMT, Pavel Stehule wrote:
> ne 10. 11. 2024 v 17:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
> I thought a lot of time about better solutions for identifier collisions
> and I really don't think so there is some consistent user friendly syntax.
> Personally I think there is an easy already implemented solution -
> convention - just use a dedicated schema for variables and this schema
> should not be in the search path. Or use secondary convention - like using
> prefix "__" for session variables. Common convention is using "_" for
> PLpgSQL variables. I searched how this issue is solved in other databases,
> or in standard, and I found nothing special. The Oracle and SQL/PSM has a
> concept of visibility - the variables are not visible outside packages or
> modules, but Postgres has nothing similar. It can be emulated by a
> dedicated schema without inserting a search path, but it is less strong.
>
> I think we can introduce an alternative syntax, that will not be user
> friendly or readable friendly, but it can be without collisions - or can
> decrease possible risks.
>
> It is nothing new - SQL does it with old, "new" syntax of inner joins, or
> in Postgres we can
>
> where salary < 40000
>
> or
>
> where pg_catalog.int4lt(salary, 40000);
>
>
> or some like we use for operators OPERATOR(*schema*.*operatorname*)
>
> So introducing VARIABLE(schema.variablename) syntax as an alternative
> syntax for accessing variables I really like. I strongly prefer to use this
> as only alternative (secondary) syntax, because I don't think it is
> friendly syntax or writing friendly, but it is safe, and I can imagine
> tools that can replace generic syntax to this special, or that detects
> generic syntax and shows some warning. Then users can choose what they
> prefer. Two syntaxes - generic and special can be good enough for all - and
> this can be perfectly consistent with current Postgres.
As far as I recall, last time this topic was discussed in hackers, two
options were proposed: the one with VARIABLE(name), what you mention
here; and another one with adding variables to the FROM clause. The
VARIABLE(...) syntax didn't get much negative feedback, so I guess why
not -- if you find it fitting, it would be interesting to see the
implementation.
I'm afraid it should not be just an alternative syntax, but the only one
allowed, because otherwise I don't see how scenarious like "drop a
column with the same name" could be avoided. As in the previous thread:
-- we've got a variable b at the same time
SELECT a, b FROM table1;
Then dropping the column b, but everything still works beause the
variable b got silently picked up. But if it would be required to say
VARIABLE(b), then all fine.
In this scenario you will get a warning related to variable shadowing (before you drop a column).
I think this issue can be partially similar to creating two equally named tables in different schemas (both schemas are in search path). When you drop one table, the query will work, but the result is different. It is the same issue. The SQL has no concept of shadowing and on the base line it is not necessary. But when you integrate SQL with some procedural code then you should solve this issue (or accept). This issue is real, and it is in every procedural enhancement of SQL that I know with the same syntax. On the other hand I doubt this is a real issue. The changes of system catalogue are tested before production - so probably you will read a warning about a shadowed variable, and probably you will get different results, because variable b has the same value for all rows, and probably will have different value than column b. I can imagine the necessity of disabling this warning on production systems. Shadowing by self is not an issue, probably, but it is a signal of code quality problems.
But this scenario is real, and then it is a question if the warning about shadowed variables should be only optional and if it can be disabled. Maybe not. Generally the shadowing is a strange concept - it is safeguard against serious issues, but it should not be used generally and everywhere the developer should rename the conflict identifiers.
Regards
Pavel
And to make sure we're on the same page, could you post couple of
examples from curretly existing tests in the patch, how are they going
to look like with this proposal?
About adding variables to the FROM clause. Looks like this option was
quite popular, and you've mentioned some technical challenges
implementing that. If you'd like to go with another approach, it would
be great to elaborate on that -- maybe even with a PoC, to make a
convincing point here.
čt 14. 11. 2024 v 8:41 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
st 13. 11. 2024 v 17:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:> On Sun, Nov 10, 2024 at 06:51:40PM GMT, Pavel Stehule wrote:
> ne 10. 11. 2024 v 17:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
> I thought a lot of time about better solutions for identifier collisions
> and I really don't think so there is some consistent user friendly syntax.
> Personally I think there is an easy already implemented solution -
> convention - just use a dedicated schema for variables and this schema
> should not be in the search path. Or use secondary convention - like using
> prefix "__" for session variables. Common convention is using "_" for
> PLpgSQL variables. I searched how this issue is solved in other databases,
> or in standard, and I found nothing special. The Oracle and SQL/PSM has a
> concept of visibility - the variables are not visible outside packages or
> modules, but Postgres has nothing similar. It can be emulated by a
> dedicated schema without inserting a search path, but it is less strong.
>
> I think we can introduce an alternative syntax, that will not be user
> friendly or readable friendly, but it can be without collisions - or can
> decrease possible risks.
>
> It is nothing new - SQL does it with old, "new" syntax of inner joins, or
> in Postgres we can
>
> where salary < 40000
>
> or
>
> where pg_catalog.int4lt(salary, 40000);
>
>
> or some like we use for operators OPERATOR(*schema*.*operatorname*)
>
> So introducing VARIABLE(schema.variablename) syntax as an alternative
> syntax for accessing variables I really like. I strongly prefer to use this
> as only alternative (secondary) syntax, because I don't think it is
> friendly syntax or writing friendly, but it is safe, and I can imagine
> tools that can replace generic syntax to this special, or that detects
> generic syntax and shows some warning. Then users can choose what they
> prefer. Two syntaxes - generic and special can be good enough for all - and
> this can be perfectly consistent with current Postgres.
As far as I recall, last time this topic was discussed in hackers, two
options were proposed: the one with VARIABLE(name), what you mention
here; and another one with adding variables to the FROM clause. The
VARIABLE(...) syntax didn't get much negative feedback, so I guess why
not -- if you find it fitting, it would be interesting to see the
implementation.
I'm afraid it should not be just an alternative syntax, but the only one
allowed, because otherwise I don't see how scenarious like "drop a
column with the same name" could be avoided. As in the previous thread:
-- we've got a variable b at the same time
SELECT a, b FROM table1;
Then dropping the column b, but everything still works beause the
variable b got silently picked up. But if it would be required to say
VARIABLE(b), then all fine.
In this scenario you will get a warning related to variable shadowing (before you drop a column).I think this issue can be partially similar to creating two equally named tables in different schemas (both schemas are in search path). When you drop one table, the query will work, but the result is different. It is the same issue. The SQL has no concept of shadowing and on the base line it is not necessary. But when you integrate SQL with some procedural code then you should solve this issue (or accept). This issue is real, and it is in every procedural enhancement of SQL that I know with the same syntax. On the other hand I doubt this is a real issue. The changes of system catalogue are tested before production - so probably you will read a warning about a shadowed variable, and probably you will get different results, because variable b has the same value for all rows, and probably will have different value than column b. I can imagine the necessity of disabling this warning on production systems. Shadowing by self is not an issue, probably, but it is a signal of code quality problems.But this scenario is real, and then it is a question if the warning about shadowed variables should be only optional and if it can be disabled. Maybe not. Generally the shadowing is a strange concept - it is safeguard against serious issues, but it should not be used generally and everywhere the developer should rename the conflict identifiers.
There can be another example against usage of the FROM clause for variables. Because it solves just one special case, but others are not covered.
Theoretically, variables can have the same names as tables. The table overshadows the variable, so it can work. But when somebody drops the variable, then the query still can work. So requirement of usage variable in FROM clause protects us just against drop column, but not against dropping table. In Postgres the dropping table is possibly risky due search_path (that introduces shadowing concept) without introduction variables. There is a possibility of this issue, but how common is this issue?
Regards
Pavel
RegardsPavel
And to make sure we're on the same page, could you post couple of
examples from curretly existing tests in the patch, how are they going
to look like with this proposal?
About adding variables to the FROM clause. Looks like this option was
quite popular, and you've mentioned some technical challenges
implementing that. If you'd like to go with another approach, it would
be great to elaborate on that -- maybe even with a PoC, to make a
convincing point here.
st 13. 11. 2024 v 17:35 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sun, Nov 10, 2024 at 06:51:40PM GMT, Pavel Stehule wrote:
> ne 10. 11. 2024 v 17:19 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
> napsal:
> I thought a lot of time about better solutions for identifier collisions
> and I really don't think so there is some consistent user friendly syntax.
> Personally I think there is an easy already implemented solution -
> convention - just use a dedicated schema for variables and this schema
> should not be in the search path. Or use secondary convention - like using
> prefix "__" for session variables. Common convention is using "_" for
> PLpgSQL variables. I searched how this issue is solved in other databases,
> or in standard, and I found nothing special. The Oracle and SQL/PSM has a
> concept of visibility - the variables are not visible outside packages or
> modules, but Postgres has nothing similar. It can be emulated by a
> dedicated schema without inserting a search path, but it is less strong.
>
> I think we can introduce an alternative syntax, that will not be user
> friendly or readable friendly, but it can be without collisions - or can
> decrease possible risks.
>
> It is nothing new - SQL does it with old, "new" syntax of inner joins, or
> in Postgres we can
>
> where salary < 40000
>
> or
>
> where pg_catalog.int4lt(salary, 40000);
>
>
> or some like we use for operators OPERATOR(*schema*.*operatorname*)
>
> So introducing VARIABLE(schema.variablename) syntax as an alternative
> syntax for accessing variables I really like. I strongly prefer to use this
> as only alternative (secondary) syntax, because I don't think it is
> friendly syntax or writing friendly, but it is safe, and I can imagine
> tools that can replace generic syntax to this special, or that detects
> generic syntax and shows some warning. Then users can choose what they
> prefer. Two syntaxes - generic and special can be good enough for all - and
> this can be perfectly consistent with current Postgres.
As far as I recall, last time this topic was discussed in hackers, two
options were proposed: the one with VARIABLE(name), what you mention
here; and another one with adding variables to the FROM clause. The
VARIABLE(...) syntax didn't get much negative feedback, so I guess why
not -- if you find it fitting, it would be interesting to see the
implementation.
I'm afraid it should not be just an alternative syntax, but the only one
allowed, because otherwise I don't see how scenarious like "drop a
column with the same name" could be avoided. As in the previous thread:
-- we've got a variable b at the same time
SELECT a, b FROM table1;
Then dropping the column b, but everything still works beause the
variable b got silently picked up. But if it would be required to say
VARIABLE(b), then all fine.
And to make sure we're on the same page, could you post couple of
examples from curretly existing tests in the patch, how are they going
to look like with this proposal?
What do you think about the following design? I can implement a warning "variable_usage_guard" when the variable is accessed without using VARIABLE() syntax. We can discuss later if this warning can be enabled by default or not. There I am open to any variant.
So for variable public.a and table public.foo(a, b)
I can write
LET a = 10; -- there is not possible collision
LET a = a + 1; -- there is not possible collision, no warning
SELECT a, b FROM foo; -- there is a collision - and warning "variable a is shadowed"
SELECT VARIABLE(a), b FROM foo; -- no collision, no warning
After ALTER TABLE foo DROP COLUMN a;
SELECT a, b FROM foo; -- possible warning "the usage in variable without safe syntax",
SELECT VARIABLE(a), b FROM foo; -- no warning
I think this design can be good for all. variable_usage_guard can be enabled by default. If somebody uses conventions for collision protection, then he can safely disable it.
Comments, notes?
Regards
Pavel
About adding variables to the FROM clause. Looks like this option was
quite popular, and you've mentioned some technical challenges
implementing that. If you'd like to go with another approach, it would
be great to elaborate on that -- maybe even with a PoC, to make a
convincing point here.
> On Sat, Nov 16, 2024 at 07:10:31AM GMT, Pavel Stehule wrote: Sorry, got distracted. Let me try to answer step by step. > > As far as I recall, last time this topic was discussed in hackers, two > > options were proposed: the one with VARIABLE(name), what you mention > > here; and another one with adding variables to the FROM clause. The > > VARIABLE(...) syntax didn't get much negative feedback, so I guess why > > not -- if you find it fitting, it would be interesting to see the > > implementation. > > > > I'm afraid it should not be just an alternative syntax, but the only one > > allowed, because otherwise I don't see how scenarious like "drop a > > column with the same name" could be avoided. As in the previous thread: > > > > -- we've got a variable b at the same time > > SELECT a, b FROM table1; > > > > I am sorry, but I am in very strong opposition against this idea. Nobody > did reply to my questions, that can change my opinion. From your reply it's not quite clear, are you opposed to have a mandatory VARIABLE syntax, or having variables in the FROM clause? My main proposal was about the former, but the points that are following seems to talk about the latter. I think it's fine to reject the idea about the FROM clause, as long as you got some reasonable arguments. > > Then dropping the column b, but everything still works beause the > > variable b got silently picked up. But if it would be required to say > > VARIABLE(b), then all fine. > > but same risk you have any time in plpgsql - all time. I don't remember any > bug report related to this issue. Which exactly scenario about plpgsql do you have in mind? Just have tried to declare a variable inside a plpgsql function with the same name as a table column, and got an error about an ambiguous reference. > Theoretically, variables can have the same names as tables. The table > overshadows the variable, so it can work. But when somebody drops the > variable, then the query still can work. So requirement of usage variable > in FROM clause protects us just against drop column, but not against > dropping table. In Postgres the dropping table is possibly risky due > search_path (that introduces shadowing concept) without introduction > variables. There is a possibility of this issue, but how common is this > issue? This sounds to me like an argument against allowing name clashing between variables and tables. It makes even more sense, since session variables are in many ways similar to tables. > I think this issue can be partially similar to creating two equally named > tables in different schemas (both schemas are in search path). When you > drop one table, the query will work, but the result is different. It is the > same issue. The SQL has no concept of shadowing and on the base line it is > not necessary. The point is that most of users are aware about schemas and search path dangers. But to me such a precedent is not an excuse to introduce a new feature with similar traps, which folks would have to learn by making mistakes. Judging from the feedback to this patch over time, I've got an impression that lots of people are also not fans of that. > > Then dropping the column b, but everything still works beause the > > variable b got silently picked up. But if it would be required to say > > VARIABLE(b), then all fine. > > > > In this scenario you will get a warning related to variable shadowing > (before you drop a column). > > [...] > > What do you think about the following design? I can implement a warning > "variable_usage_guard" when the variable is accessed without using > VARIABLE() syntax. We can discuss later if this warning can be enabled by > default or not. There I am open to any variant. I don't follow what are you winning by that? In the context of problem above (i.e. dropping a column), such a warning is functionally equivalend to a warning about shadowing. The problem is that it doesn't sound very appealing to have a feature, which requires some extra efforts to be used in a right way (e.g. put everything into a special vars schema, or keep an eye on logs). Most certainly there are such bits in PostgreSQL today, with all the best practices, crowd wisdom, etc. But the bar for new features in this sense is much higher, you can see it from the feedback to this patch. Thus I believe it makes sense, from purely tactical reasons, to not try to convince half of the community to lower the bar, but instead try to modify the feature to make it more acceptable, even if some parts you might not like. Btw, could you repeat, what was exactly your argument against mandatory VARIABLE() syntax? It's somehow scattered across many replies, would be great to summarize it in a couple of phrases. > Shadowing by self is not an issue, probably, but it is a signal of code > quality problems. Agree, but I'm afraid code quality of an average application using PostgreSQL is quite low, so here we are. As a side note, I've recently caught myself thinking "it would be cool to have session variables here". The use case was preparing a policy for RLS, based on some session-level data set by an application. This session-level data is of a composite data type, so simple current_setting is cumbersome to use, and a temporary table will be dropped at the end, taking the policy with it due to the recorded dependency between them. Thus a session variable of some composite type sounds like a good fit.
Hi
As a side note, I've recently caught myself thinking "it would be cool to have
session variables here". The use case was preparing a policy for RLS, based on
some session-level data set by an application. This session-level data is of a
composite data type, so simple current_setting is cumbersome to use, and a
temporary table will be dropped at the end, taking the policy with it due to
the recorded dependency between them. Thus a session variable of some composite
type sounds like a good fit.
yes, RLS support is one mentioned use case, and strong reason the access rights are implemented there.
Regards
Pavel
Dmitry Dolgov: > This sounds to me like an argument against allowing name clashing between > variables and tables. It makes even more sense, since session variables are in > many ways similar to tables. +1 My mental model of a session variable is similar to a single-row, optionally global temporary, table. Is there any substantial difference that I am not aware of? Best, Wolfgang
so 16. 11. 2024 v 15:56 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:
Dmitry Dolgov:
> This sounds to me like an argument against allowing name clashing between
> variables and tables. It makes even more sense, since session variables are in
> many ways similar to tables.
+1
It doesn't help too much, because the unique tuple (schema, name), and there is a search path.
Secondly, the pg_class is not good enough for description of scalar variables, and enhancing pg_class for scalar variables can be messy.
My mental model of a session variable is similar to a single-row,
optionally global temporary, table.
Is there any substantial difference that I am not aware of?
What I know, the variables are used as query parameters, not as relations - Oracle, DB2, MSSQL, MySQL, ...
semantically, yes - it is a global temporary object, but it can be scalar or composite value - it is not row.
(global (temp)) table can hold 0, 1 or more rows (and rows are always composite of 0..n fields). The variable holds a value of some type. Proposed session variables are like plpgsql variables (only with different scope). In Postgres there is a difference between a scalar variable and composite variable with one field.
Regards
Pavel
Best,
Wolfgang
Pavel Stehule: > (global (temp)) table can hold 0, 1 or more rows (and rows are always > composite of 0..n fields). The variable holds a value of some type. > Proposed session variables are like plpgsql variables (only with > different scope). In Postgres there is a difference between a scalar > variable and composite variable with one field. I can store composite values in table columns, too. A table column can either be scalar or composite in that sense. So, maybe rephrase: Single-row, single-column (global (temp)) table = variable. One "cell" of that table. For me, the major difference between a variable and a table is, that the table has 0...n rows and 0...m columns, while the variable has *exactly* one in both cases, not 0 either. I must put tables into FROM, why not those nice mini-tables called variables as well? Because they are potentially scalar, you say! But: I can already put functions returning scalar values into FROM: SELECT * FROM format('hello'); The function returns a plain string only. I don't know. This just "fits" for me. Or to put it differently: I don't really care whether I have to write "(SELECT variable FROM variable)" instead of just "variable". I don't want session variables for the syntax, I want session variables, because they are **so much better** than custom GUCs. Best, Wolfgang
so 16. 11. 2024 v 18:13 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:
Pavel Stehule:
> (global (temp)) table can hold 0, 1 or more rows (and rows are always
> composite of 0..n fields). The variable holds a value of some type.
> Proposed session variables are like plpgsql variables (only with
> different scope). In Postgres there is a difference between a scalar
> variable and composite variable with one field.
I can store composite values in table columns, too. A table column can
either be scalar or composite in that sense.
So, maybe rephrase: Single-row, single-column (global (temp)) table =
variable. One "cell" of that table.
the tables are tables and variables are variables. For tables you have INSERT, UPDATE, DELETE commands, for variables you have a LET command.
and scalar is not a single column composite.
The session variables can in some particular use cases replace global temp tables, but this is not the goal. I would like to see global temp tables in Postgres too. Maybe session variables prepare a field for this, because some people better understand global temp objects. But again my proposal is not related to global temp tables. This is a different feature.
For me, the major difference between a variable and a table is, that the
table has 0...n rows and 0...m columns, while the variable has *exactly*
one in both cases, not 0 either.
I must put tables into FROM, why not those nice mini-tables called
variables as well? Because they are potentially scalar, you say!
But: I can already put functions returning scalar values into FROM:
SELECT * FROM format('hello');
The function returns a plain string only.
I don't know. This just "fits" for me.
There are more issues - one - when you use some composite in FROM clause, then you expect an unpacked result. But there are a lot of uses, when unpackaging is not wanted. There is a syntax for this but it is really not intuitive and not well readable.
Or to put it differently: I don't really care whether I have to write
"(SELECT variable FROM variable)" instead of just "variable". I don't
want session variables for the syntax, I want session variables, because
they are **so much better** than custom GUCs.
session variables are better than GUC for the proposed purpose. But it should look like variables. The software should respect standards or common typical usage when it is possible. If we introduce fully proprietary design, then it will be hell for all people who know any other databases. And I don't see a strong benefit from this syntax. It solves just one case, it doesn't solve other possible issues, and it introduces another possible risk.
Regards
Pavel
Best,
Wolfgang
so 16. 11. 2024 v 15:27 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Sat, Nov 16, 2024 at 07:10:31AM GMT, Pavel Stehule wrote:
Sorry, got distracted. Let me try to answer step by step.
> > As far as I recall, last time this topic was discussed in hackers, two
> > options were proposed: the one with VARIABLE(name), what you mention
> > here; and another one with adding variables to the FROM clause. The
> > VARIABLE(...) syntax didn't get much negative feedback, so I guess why
> > not -- if you find it fitting, it would be interesting to see the
> > implementation.
> >
> > I'm afraid it should not be just an alternative syntax, but the only one
> > allowed, because otherwise I don't see how scenarious like "drop a
> > column with the same name" could be avoided. As in the previous thread:
> >
> > -- we've got a variable b at the same time
> > SELECT a, b FROM table1;
> >
>
> I am sorry, but I am in very strong opposition against this idea. Nobody
> did reply to my questions, that can change my opinion.
From your reply it's not quite clear, are you opposed to have a mandatory
VARIABLE syntax, or having variables in the FROM clause? My main proposal was
about the former, but the points that are following seems to talk about the
latter. I think it's fine to reject the idea about the FROM clause, as long as
you got some reasonable arguments.
I am against a requirement to specify a variable in the FROM clause.
> > Then dropping the column b, but everything still works beause the
> > variable b got silently picked up. But if it would be required to say
> > VARIABLE(b), then all fine.
>
> but same risk you have any time in plpgsql - all time. I don't remember any
> bug report related to this issue.
Which exactly scenario about plpgsql do you have in mind? Just have tried to
declare a variable inside a plpgsql function with the same name as a table
column, and got an error about an ambiguous reference.
Until you execute the query, you cannot know if there is a conflict or not. So you can change table structure and you can change the procedure's code, and there can be an invisible conflict until execution and query evaluation. The conflict between PL/pgSQL and SQL raises an error. The conflict between session variables and SQL raises warnings. The issue is detected.
> Theoretically, variables can have the same names as tables. The table
> overshadows the variable, so it can work. But when somebody drops the
> variable, then the query still can work. So requirement of usage variable
> in FROM clause protects us just against drop column, but not against
> dropping table. In Postgres the dropping table is possibly risky due
> search_path (that introduces shadowing concept) without introduction
> variables. There is a possibility of this issue, but how common is this
> issue?
This sounds to me like an argument against allowing name clashing between
variables and tables. It makes even more sense, since session variables are in
many ways similar to tables.
It doesn't help too much. It can fix just one issue. But you can have tables with the same name in different schemas inside schemas from search_path. Unique table names solve nothing.
> I think this issue can be partially similar to creating two equally named
> tables in different schemas (both schemas are in search path). When you
> drop one table, the query will work, but the result is different. It is the
> same issue. The SQL has no concept of shadowing and on the base line it is
> not necessary.
The point is that most of users are aware about schemas and search path
dangers. But to me such a precedent is not an excuse to introduce a new feature
with similar traps, which folks would have to learn by making mistakes. Judging
from the feedback to this patch over time, I've got an impression that lots of
people are also not fans of that.
Unfortunately - I don't believe so there is some syntax without traps. You can check all implementations in other databases. These designs are very different, and all have some issues and all have some limits. It is native - you are trying to join the procedural and functional world.
I understand the risks. These risks are there. But there is no silver bullet - all proposed designs fixed just one case, and not others, and then I don't see a strong enough benefit to introduce design that is far from common usage. Maybe I have a different experience, because I am a man from the stored procedures area, and the risk of collisions is a known issue well solved by common conventions and in postgres by plpgsql.variable_conflict setting. The proposed patch set has very similar functionality. I think the introduction of VARIABLE(xx) syntax and safe syntax guard warning the usage of variables can be safe in how it is possible. But still I want to allow "short" "usual" usage to people who use a safe convention. There is no risk when you use a safe prefix or safe schema.
> > Then dropping the column b, but everything still works beause the
> > variable b got silently picked up. But if it would be required to say
> > VARIABLE(b), then all fine.
> >
>
> In this scenario you will get a warning related to variable shadowing
> (before you drop a column).
>
> [...]
>
> What do you think about the following design? I can implement a warning
> "variable_usage_guard" when the variable is accessed without using
> VARIABLE() syntax. We can discuss later if this warning can be enabled by
> default or not. There I am open to any variant.
I don't follow what are you winning by that? In the context of problem above
(i.e. dropping a column), such a warning is functionally equivalend to a
warning about shadowing.
The problem is that it doesn't sound very appealing to have a feature, which
requires some extra efforts to be used in a right way (e.g. put everything into
a special vars schema, or keep an eye on logs). Most certainly there are such
bits in PostgreSQL today, with all the best practices, crowd wisdom, etc. But
the bar for new features in this sense is much higher, you can see it from the
feedback to this patch. Thus I believe it makes sense, from purely tactical
reasons, to not try to convince half of the community to lower the bar, but
instead try to modify the feature to make it more acceptable, even if some
parts you might not like.
Btw, could you repeat, what was exactly your argument against mandatory
VARIABLE() syntax? It's somehow scattered across many replies, would be great
to summarize it in a couple of phrases.
> Shadowing by self is not an issue, probably, but it is a signal of code
> quality problems.
Agree, but I'm afraid code quality of an average application using PostgreSQL
is quite low, so here we are.
As a side note, I've recently caught myself thinking "it would be cool to have
session variables here". The use case was preparing a policy for RLS, based on
some session-level data set by an application. This session-level data is of a
composite data type, so simple current_setting is cumbersome to use, and a
temporary table will be dropped at the end, taking the policy with it due to
the recorded dependency between them. Thus a session variable of some composite
type sounds like a good fit.
so 16. 11. 2024 v 23:49 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
so 16. 11. 2024 v 15:27 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:> On Sat, Nov 16, 2024 at 07:10:31AM GMT, Pavel Stehule wrote:
Sorry, got distracted. Let me try to answer step by step.
> > As far as I recall, last time this topic was discussed in hackers, two
> > options were proposed: the one with VARIABLE(name), what you mention
> > here; and another one with adding variables to the FROM clause. The
> > VARIABLE(...) syntax didn't get much negative feedback, so I guess why
> > not -- if you find it fitting, it would be interesting to see the
> > implementation.
> >
> > I'm afraid it should not be just an alternative syntax, but the only one
> > allowed, because otherwise I don't see how scenarious like "drop a
> > column with the same name" could be avoided. As in the previous thread:
> >
> > -- we've got a variable b at the same time
> > SELECT a, b FROM table1;
> >
>
> I am sorry, but I am in very strong opposition against this idea. Nobody
> did reply to my questions, that can change my opinion.
From your reply it's not quite clear, are you opposed to have a mandatory
VARIABLE syntax, or having variables in the FROM clause? My main proposal was
about the former, but the points that are following seems to talk about the
latter. I think it's fine to reject the idea about the FROM clause, as long as
you got some reasonable arguments.I am against a requirement to specify a variable in the FROM clause.
> > Then dropping the column b, but everything still works beause the
> > variable b got silently picked up. But if it would be required to say
> > VARIABLE(b), then all fine.
>
> but same risk you have any time in plpgsql - all time. I don't remember any
> bug report related to this issue.
Which exactly scenario about plpgsql do you have in mind? Just have tried to
declare a variable inside a plpgsql function with the same name as a table
column, and got an error about an ambiguous reference.Until you execute the query, you cannot know if there is a conflict or not. So you can change table structure and you can change the procedure's code, and there can be an invisible conflict until execution and query evaluation. The conflict between PL/pgSQL and SQL raises an error. The conflict between session variables and SQL raises warnings. The issue is detected.
> Theoretically, variables can have the same names as tables. The table
> overshadows the variable, so it can work. But when somebody drops the
> variable, then the query still can work. So requirement of usage variable
> in FROM clause protects us just against drop column, but not against
> dropping table. In Postgres the dropping table is possibly risky due
> search_path (that introduces shadowing concept) without introduction
> variables. There is a possibility of this issue, but how common is this
> issue?
This sounds to me like an argument against allowing name clashing between
variables and tables. It makes even more sense, since session variables are in
many ways similar to tables.It doesn't help too much. It can fix just one issue. But you can have tables with the same name in different schemas inside schemas from search_path. Unique table names solve nothing.
the combination of pg_class and pg_attribute cannot describe scalar variables (without hacks). Then you need to enhance pg_class, which can be confusing. And on the second hand almost all columns in pg_class have no sense for variables. And when variables and tables are in different tables, you cannot ensure a unique name. Variables are similar to tables only in possibility to hold a value. That is all. But variables don't store data to file, don't store data in pages, don't allow usage of other storages or formats, and don't support foreign storage. The similarity between variables and tables is like the similarity between horses and cars. Both can help with moving.
> I think this issue can be partially similar to creating two equally named
> tables in different schemas (both schemas are in search path). When you
> drop one table, the query will work, but the result is different. It is the
> same issue. The SQL has no concept of shadowing and on the base line it is
> not necessary.
The point is that most of users are aware about schemas and search path
dangers. But to me such a precedent is not an excuse to introduce a new feature
with similar traps, which folks would have to learn by making mistakes. Judging
from the feedback to this patch over time, I've got an impression that lots of
people are also not fans of that.Unfortunately - I don't believe so there is some syntax without traps. You can check all implementations in other databases. These designs are very different, and all have some issues and all have some limits. It is native - you are trying to join the procedural and functional world.I understand the risks. These risks are there. But there is no silver bullet - all proposed designs fixed just one case, and not others, and then I don't see a strong enough benefit to introduce design that is far from common usage. Maybe I have a different experience, because I am a man from the stored procedures area, and the risk of collisions is a known issue well solved by common conventions and in postgres by plpgsql.variable_conflict setting. The proposed patch set has very similar functionality. I think the introduction of VARIABLE(xx) syntax and safe syntax guard warning the usage of variables can be safe in how it is possible. But still I want to allow "short" "usual" usage to people who use a safe convention. There is no risk when you use a safe prefix or safe schema.
> > Then dropping the column b, but everything still works beause the
> > variable b got silently picked up. But if it would be required to say
> > VARIABLE(b), then all fine.
> >
>
> In this scenario you will get a warning related to variable shadowing
> (before you drop a column).
>
> [...]
>
> What do you think about the following design? I can implement a warning
> "variable_usage_guard" when the variable is accessed without using
> VARIABLE() syntax. We can discuss later if this warning can be enabled by
> default or not. There I am open to any variant.
I don't follow what are you winning by that? In the context of problem above
(i.e. dropping a column), such a warning is functionally equivalend to a
warning about shadowing.
The problem is that it doesn't sound very appealing to have a feature, which
requires some extra efforts to be used in a right way (e.g. put everything into
a special vars schema, or keep an eye on logs). Most certainly there are such
bits in PostgreSQL today, with all the best practices, crowd wisdom, etc. But
the bar for new features in this sense is much higher, you can see it from the
feedback to this patch. Thus I believe it makes sense, from purely tactical
reasons, to not try to convince half of the community to lower the bar, but
instead try to modify the feature to make it more acceptable, even if some
parts you might not like.
Btw, could you repeat, what was exactly your argument against mandatory
VARIABLE() syntax? It's somehow scattered across many replies, would be great
to summarize it in a couple of phrases.
> Shadowing by self is not an issue, probably, but it is a signal of code
> quality problems.
Agree, but I'm afraid code quality of an average application using PostgreSQL
is quite low, so here we are.
As a side note, I've recently caught myself thinking "it would be cool to have
session variables here". The use case was preparing a policy for RLS, based on
some session-level data set by an application. This session-level data is of a
composite data type, so simple current_setting is cumbersome to use, and a
temporary table will be dropped at the end, taking the policy with it due to
the recorded dependency between them. Thus a session variable of some composite
type sounds like a good fit.
so 16. 11. 2024 v 23:07 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
so 16. 11. 2024 v 18:13 odesílatel Wolfgang Walther <walther@technowledgy.de> napsal:Pavel Stehule:
> (global (temp)) table can hold 0, 1 or more rows (and rows are always
> composite of 0..n fields). The variable holds a value of some type.
> Proposed session variables are like plpgsql variables (only with
> different scope). In Postgres there is a difference between a scalar
> variable and composite variable with one field.
I can store composite values in table columns, too. A table column can
either be scalar or composite in that sense.
So, maybe rephrase: Single-row, single-column (global (temp)) table =
variable. One "cell" of that table.the tables are tables and variables are variables. For tables you have INSERT, UPDATE, DELETE commands, for variables you have a LET command.and scalar is not a single column composite.
example
assignment to scalar versus single composite
LET a = 10
LET a.a = 10 or LET a = ROW(10)
Single column tables can be the result of some ALTERS - or sometimes you can use a table type. But for example, plpgsql, very strongly differs between scalar and composite variables. So introducing the "new" concept - single field composite is scalar introduces strong inconsistency there.
Regards
Pavel
The session variables can in some particular use cases replace global temp tables, but this is not the goal. I would like to see global temp tables in Postgres too. Maybe session variables prepare a field for this, because some people better understand global temp objects. But again my proposal is not related to global temp tables. This is a different feature.
For me, the major difference between a variable and a table is, that the
table has 0...n rows and 0...m columns, while the variable has *exactly*
one in both cases, not 0 either.
I must put tables into FROM, why not those nice mini-tables called
variables as well? Because they are potentially scalar, you say!
But: I can already put functions returning scalar values into FROM:
SELECT * FROM format('hello');
The function returns a plain string only.
I don't know. This just "fits" for me.There are more issues - one - when you use some composite in FROM clause, then you expect an unpacked result. But there are a lot of uses, when unpackaging is not wanted. There is a syntax for this but it is really not intuitive and not well readable.
Or to put it differently: I don't really care whether I have to write
"(SELECT variable FROM variable)" instead of just "variable". I don't
want session variables for the syntax, I want session variables, because
they are **so much better** than custom GUCs.session variables are better than GUC for the proposed purpose. But it should look like variables. The software should respect standards or common typical usage when it is possible. If we introduce fully proprietary design, then it will be hell for all people who know any other databases. And I don't see a strong benefit from this syntax. It solves just one case, it doesn't solve other possible issues, and it introduces another possible risk.RegardsPavel
Best,
Wolfgang
Em ter., 19 de nov. de 2024 às 16:15, Pavel Stehule <pavel.stehule@gmail.com> escreveu:
I wrote POC of VARIABLE(varname) syntax support
Not related with POC of VARIABLE but seeing your patches ...
Wouldn't it be better to use just one syntax and message for what to do ON COMMIT ?
When creating a new variable you use
CREATE VARIABLE ... ON COMMIT DROP | ON TRANSACTION END RESET
On PSQL \dV+ you show
Transactional end action
Maybe all them could be just ON COMMIT
CREATE VARIABLE ... [ON COMMIT {NO ACTION, DROP, RESET}] and \dV+ just "on commit" on title column
regards
Marcos
Hi
st 20. 11. 2024 v 14:29 odesílatel Marcos Pegoraro <marcos@f10.com.br> napsal:
Em ter., 19 de nov. de 2024 às 16:15, Pavel Stehule <pavel.stehule@gmail.com> escreveu:I wrote POC of VARIABLE(varname) syntax supportNot related with POC of VARIABLE but seeing your patches ...Wouldn't it be better to use just one syntax and message for what to do ON COMMIT ?When creating a new variable you useCREATE VARIABLE ... ON COMMIT DROP | ON TRANSACTION END RESETOn PSQL \dV+ you showTransactional end actionMaybe all them could be just ON COMMITCREATE VARIABLE ... [ON COMMIT {NO ACTION, DROP, RESET}] and \dV+ just "on commit" on title column
ON COMMIT DROP is related to temporary objects. In this case, you don't need to think about ROLLBACK, because in this case, the temp objects are removed implicitly.
ON TRANSACTION END RESET can be used for non temporary objects too. So this is a little bit of a different feature. But the reset is executed if the transaction is ended by ROLLBACK too. So using a syntax just ON COMMIT can be a little bit messy. TRANSACTION END is more intuitive, I think. If I remember there was a proposal ON COMMIT OR ROLLBACK, but I think TRANSACTION END is better and more intuitive, and better describes what is implemented. I can imagine to support clauses ON COMMIT RESET or ON ROLLBACK RESET that can be used independently, but for this time, I don't want to increase a complexity now - reset is just at transaction end without dependency if the transaction was committed or rollbacked.
Regards
Pavel
regardsMarcos
Em qua., 20 de nov. de 2024 às 10:52, Pavel Stehule <pavel.stehule@gmail.com> escreveu:
COMMIT can be a little bit messy. TRANSACTION END is more intuitive, I think.
Exactly to be not messy I would just ON COMMIT for all, and DOCs can explain that this option is ignored for temp objects and do the same at the end of transaction, independently if commited or rolled back
st 20. 11. 2024 v 15:15 odesílatel Marcos Pegoraro <marcos@f10.com.br> napsal:
Em qua., 20 de nov. de 2024 às 10:52, Pavel Stehule <pavel.stehule@gmail.com> escreveu:COMMIT can be a little bit messy. TRANSACTION END is more intuitive, I think.Exactly to be not messy I would just ON COMMIT for all, and DOCs can explain that this option is ignored for temp objects and do the same at the end of transaction, independently if commited or rolled back
I feel it differently - when I read ON COMMIT, then I expect really just a COMMIT event, not ROLLBACK. Attention - the metadata about variables are transactional, but the variables are not transactional (by default).
But this feeling can be subjective. The objective argument against using ON COMMIT like ON TRANSACTION END is fact, so we lost a possibility for a more precious setting.
I can imagine scenarios with ON COMMIT RESET - and variable can hold some last value from transaction, or ON ROLLBACK RESET - and variable can hold some computed value from successfully ended transaction - last inserted id.
In this case, I don't see a problem to reopen a discussion about syntax or postpone this feature. I think the possibility to reset variables at some specified time can be an interesting feature (that can increase safety, because the application doesn't need to solve initial setup), but from the list of implemented features for session variables, this is not too far from the end. If TRANSACTION END is not intuitive - with exactly the same functionality can be RESET AT TRANSACTION START - so the ON TRANSACTION END can be transformed to ON BEGIN RESET, and this syntax can be maybe better, because it signalize, for every transaction, the variable will be initialized to default. What do you think? Can be ON BEGIN RESET acceptable for you.
Regards
Pavel
> On Tue, Nov 19, 2024 at 08:14:01PM +0100, Pavel Stehule wrote: > Hi > > I wrote POC of VARIABLE(varname) syntax support Thanks, the results look good. I'm still opposed the idea of having a warning, and think it has to be an error -- but it's my subjective opinion. Lets see if there are more votes on that topic.
st 20. 11. 2024 v 21:14 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Tue, Nov 19, 2024 at 08:14:01PM +0100, Pavel Stehule wrote:
> Hi
>
> I wrote POC of VARIABLE(varname) syntax support
Thanks, the results look good. I'm still opposed the idea of having a
warning, and think it has to be an error -- but it's my subjective
opinion. Lets see if there are more votes on that topic.
The error breaks the possibility to use variables (session variables) like Oracle's package variables easily. It increases effort for transformation or porting because you should identify variables inside queries and you should wrap it to fence. On the other hand, extensions that can read a query after transformation can easily detect unwrapped variables and they can raise an error. It can be very easy to implement this check to plpgsql_check for example or like plpgsql.extra_check.
In my ideal world, the shadowing warning should be enabled by default, and an unfencing warning disabled by default. But I have not a problem with activating both warnings by default. I think warnings are enough, because if there is some risk then a shadowing warning is activated. And my experience is more positive about warnings, people read it.
Regards
Pavel
On Thu, Dec 5, 2024 at 2:52 PM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Hi > > only rebase hi. disclaimer, i *only* applied v20241205-0001-Enhancing-catalog-for-support-session-variables-and-.patch. create variable v2 as text; alter variable v2 rename to v2; ERROR: session variable "v2" already exists in schema "public" the above is coverage tests for report_namespace_conflict: case VariableRelationId: Assert(OidIsValid(nspOid)); msgfmt = gettext_noop("session variable \"%s\" already exists in schema \"%s\""); break; create type t1 as (a int, b int); CREATE VARIABLE var1 AS t1; alter type t1 drop attribute a; should "alter type t1 drop attribute a;" not allowed? GetCommandLogLevel also needs to deal with case T_CreateSessionVarStmt? there are no regress tests for the change we made in find_composite_type_dependencies? It looks like it will be reachable for sure. CreateVariable, print out error position: if (get_typtype(typid) == TYPTYPE_PSEUDO) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("session variable cannot be pseudo-type %s", format_type_be(typid)), parser_errposition(pstate, stmt->typeName->location))); Alvaro Herrera told me actually, you don't need the extra parentheses around errcode. so you can: if (get_typtype(typid) == TYPTYPE_PSEUDO) ereport(ERROR, errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("session variable cannot be pseudo-type %s", format_type_be(typid)), parser_errposition(pstate, stmt->typeName->location)) pg_variable_is_visible seems to have done twice the system cache call. maybe follow through with the pg_table_is_visible, pg_type_is_visible code pattern. IN src/bin/psql/describe.c + appendPQExpBufferStr(&buf, "\nWHERE true\n"); this is not needed? ------------------------------------------------ some of the `foreach` can change to foreach_oid, foreach_node see: https://git.postgresql.org/cgit/postgresql.git/commit/?id=14dd0f27d7cd56ffae9ecdbe324965073d01a9ff ------------------------------------------------ doc/src/sgml/ref/create_variable.sgml <programlisting> CREATE VARIABLE var1 AS date; LET var1 = current_date; SELECT var1; </programlisting> v20241205-0001-Enhancing-catalog-for-support-session-variables-and-.patch alone cannot do `LET var1 = current_date;`, `SELECT var1;` maybe the following patch can do it. but that makes we cannot commit v20241205-0001-Enhancing-catalog-for-support-session-variables-and-.patch alone. ------------------------------------------------ since CREATE VARIABLE is an extension to standard, so in create_schema.sgml Compatibility section, do we need to mention CREATE SCHEMA CREATE VARIABLE is an extension from standard ? ----------------------------------------------- /* * Drop variable by OID, and register the needed session variable * cleanup. */ void DropVariableById(Oid varid) i am not sure of the meaning of "register the needed session variable cleanup".
On Mon, Dec 9, 2024 at 2:33 AM Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Hi > again. only applied v20241208-0001-Enhancing-catalog-for-support-session-variables-and-.patch. /* we want SessionVariableCreatePostprocess to see the catalog changes. */ 0001 doesn't have SessionVariableCreatePostprocess, so this comment is wrong in the context of 0001. typo: As above, but if the variable isn't found and is_mussing is not NULL is_mussing should be is_missing. ---------------------------------------------- issue with grant.sgml and revoke.sgml. * there are no regress tests for WITH GRANT OPTION but it seems to work; there are no REVOKE CASCADE tests. the following tests show REVOKE CASCADE works. create variable v1 as int; GRANT select on VARIABLE v1 to alice with grant option; set session authorization alice; GRANT select on VARIABLE v1 to bob with grant option; reset session authorization; select varacl from pg_variable where varname = 'v1'; --output {jian=rw/jian,alice=r*/jian,bob=r*/alice} revoke all privileges on variable v1 from alice cascade; select varacl from pg_variable where varname = 'v1'; --output {jian=rw/jian} revoke GRANT OPTION FOR all privileges on variable v1 from alice cascade; also works. * in revoke.sgml and grant.sgml. if you look closely, " | ALL VARIABLES IN SCHEMA schema_name [, ...] }" is wrong because there is no left-curly-bracket, but there is a right-curly-bracket. * in revoke.sgml. <phrase>where <replaceable class="parameter">role_specification</replaceable> can be:</phrase> [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC | CURRENT_ROLE | CURRENT_USER | SESSION_USER should be at the end of the synopsis section? otherwise it looks weird, maybe we can put the REVOKE VARIABLE code upper. grant.sgml changes position looks fine to me. * <para> The <command>GRANT</command> command has two basic variants: one that grants privileges on a database object (table, column, view, foreign table, sequence, database, foreign-data wrapper, foreign server, function, procedure, procedural language, large object, configuration parameter, schema, tablespace, or type), and one that grants membership in a role. These variants are similar in many ways, but they are different enough to be described separately. </para> this <para> in grant.sgml needs to also mention "variable"? * <para> Privileges on databases, tablespaces, schemas, languages, and configuration parameters are <productname>PostgreSQL</productname> extensions. </para> this <para> in grant.sgml needs to also mention "variable"? ---------------------------------------------- * + <para> + Inside a query or an expression, a session variable can be + <quote>shadowed</quote> by a column with the same name. Similarly, the + name of a function or procedure argument or a PL/pgSQL variable (see PL/pgSQL should decorated as <application>PL/pgSQL</application> * we already support \dV and \dV+ in v20241208-0001-Enhancing-catalog-for-support-session-variables-and-.patch so we should update doc/src/sgml/ref/psql-ref.sgml also. I briefly searched \dV in v20241208-0002-Storage-for-session-variables-and-SQL-interface.patch, no entry. * in doc/src/sgml/ddl.sgml <table id="privilege-abbrevs-table"> and <table id="privileges-summary-table"> need to change for variable? <varlistentry id="ddl-priv-select">, <varlistentry id="ddl-priv-update"> need to change for variable? it's kind of tricky. if we only apply v20241208-0001-Enhancing-catalog-for-support-session-variables-and-.patch we can not SELECT or UPDATE variable. so how are we going to mention these privileges for variable? * we can add some tests for EVENT TRIGGER test, since event trigger support CREATE|DROP variable. atually, I think there is a bug. create variable v1 as int; CREATE OR REPLACE FUNCTION event_trigger_report_dropped() RETURNS event_trigger LANGUAGE plpgsql AS $$ DECLARE r record; BEGIN FOR r IN SELECT * from pg_event_trigger_dropped_objects() LOOP IF NOT r.normal AND NOT r.original THEN CONTINUE; END IF; RAISE NOTICE 'NORMAL: orig=% normal=% istemp=% type=% identity=% name=% args=%', r.original, r.normal, r.is_temporary, r.object_type, r.object_identity, r.address_names, r.address_args; END LOOP; END; $$; CREATE EVENT TRIGGER regress_event_trigger_report_dropped ON sql_drop WHEN TAG IN ('DROP VARIABLE') EXECUTE PROCEDURE event_trigger_report_dropped(); --output: src9=# drop variable v1; NOTICE: test_event_trigger: ddl_command_start DROP VARIABLE NOTICE: NORMAL: orig=t normal=f istemp=f type=session variable identity=public.v1 name={public,$} args={} DROP VARIABLE should i expect NOTICE: NORMAL: orig=t normal=f istemp=f type=session variable identity=public.v1 name={public,$} args={} to be NOTICE: NORMAL: orig=t normal=f istemp=f type=session variable identity=public.v1 name={public,v1} args={} ?
Hi
st 20. 11. 2024 v 21:14 odesílatel Dmitry Dolgov <9erthalion6@gmail.com> napsal:
> On Tue, Nov 19, 2024 at 08:14:01PM +0100, Pavel Stehule wrote:
> Hi
>
> I wrote POC of VARIABLE(varname) syntax support
Thanks, the results look good. I'm still opposed the idea of having a
warning, and think it has to be an error -- but it's my subjective
opinion. Lets see if there are more votes on that topic.
Maybe the warning of usage of unfenced variables can be changed (enhanced) to some different mechanism that can be more restrictive (and safer), but I think it can still be user friendly.
My idea is based on assumption so users with knowledge of stored procedures know variables and related risks (and know tools how to minimize risks), and for other people the risk is higher and we should force usage of variable fences. I think we can force usage of variable fences at query runtime, when query is not executed from the SPI environment. This behaviour can be enabled by default, but can be optionally disabled.
CREATE VARIABLE s.x AS int; -- allowed when user has create right on schema s
CREATE VIEW v1 AS SELECT x; -- no problem, the check is dynamic (execution), not static
CREATE VIEW v2 AS SELECT VARIABLE(x); -- no problem
SELECT x; -- fails on access to unfenced variable
SELECT * FROM v1; -- fails on access to unfenced variable
SELECT * FROM v2; -- ok
but inside pl, this check will not be active, and then with default setting I can write an code like
LET var = 10; -- fencing is not allowed there, and there is not any collision
DO $$
BEGIN
RAISE NOTICE 'var=%', var;
RAISE NOTICE 'var=%', (SELECT * FROM v1); --is ok here too
END;
$$;
Outside PL the fencing can be required, inside PL the fencing can be optional. With this proposal we can limit the possible risk usage of unfenced variables only in PL context, and the behaviour can be very similar to PL/SQL or SQL/PSM. This check is only a runtime check, so it has no impact on any DDL statement. It doesn't change the syntax or behavior, so it can be implemented subsequently - it is just a safeguard against unwanted usage of variables in an environment, where users have no possibility to use variables already. I can imagine that this check "allow_unfenced_variables" can have three values (everywhere, pl, nowhere) and the default can be pl. The results of queries don't depend on the current setting of this check. For all values for all possible queries and situations, the result is the same (when it is finished). But sometimes, the check can break the execution - in similar meaning like access rights. All previous proposed warnings can be unchanged.
Comments, notes?
Regards
Pavel
hi. GRANT|REVOKE ALL VARIABLES IN SCHEMA schema_name [, ...] } seems to work. might be better to add tests. also src/bin/psql/tab-complete.in.c COMPLETE_WITH_SCHEMA_QUERY_PLUS(Query_for_list_of_grantables, we can also add "ALL VARIABLES IN SCHEMA " also need change this <para> in grant.sgml: <para> There is also an option to grant privileges on all objects of the same type within one or more schemas. This functionality is currently supported only for tables, sequences, functions, and procedures. <literal>ALL TABLES</literal> also affects views and foreign tables, just like the specific-object <command>GRANT</command> command. <literal>ALL FUNCTIONS</literal> also affects aggregate and window functions, but not procedures, again just like the specific-object <command>GRANT</command> command. Use <literal>ALL ROUTINES</literal> to include procedures. </para> revoke.sgml, we should use <replaceable class="parameter">role_specification</replaceable>? so it will become like: REVOKE [ GRANT OPTION FOR ] { { SELECT | UPDATE } [, ...] | ALL [ PRIVILEGES ] } ON { VARIABLE variable_name [, ...] | ALL VARIABLES IN SCHEMA schema_name [, ...] } FROM role_specification [, ...] maybe also add [ GRANTED BY role_specification ] but I didn't test "REVOKE [ GRANTED BY role_specification ]". Speaking of acl tests, similar to has_table_privilege I am afraid we need to have a function like has_variable_privilege for acl tests. has_table_privilege has 6 function signatures. so there will be more code. ------------------------------------------------------ doc/src/sgml/ref/create_variable.sgml <synopsis> section: CREATE VARIABLE [ IF NOT EXISTS ] name [ AS ] data_type ] [ COLLATE collation ] redundant right square bracket after "data_type".
hi. /* * has_session_variable_privilege variants * These are all named "has_session_variable_privilege" at the SQL level. * They take various combinations of variable name, variable OID, * user name, user OID, or implicit user = current_user. * * The result is a boolean value: true if user has the indicated * privilege, false if not. The variants that take a relation OID * return NULL if the OID doesn't exist. */ /* * has_session_variable_privilege_name_name * Check user privileges on a session variable given * name username, text sessin variable name, and text priv name. */ "The variants that take a relation OID return NULL if the OID doesn't exist." should it be "The variants that take an OID type return NULL if the OID doesn't exist." ? typo, "sessin" should be "session". ----------------<<<>>>>------------------- <sect1 id="ddl-session-variables"> <title>Session Variables</title> only mentioned that "Session variables themselves are persistent, but their values are neither persistent nor shared (like the content of temporary tables). " I feel like this sentence is not that explicit. we actually want to say "Once a session exits, the variable value is reset to NULL, one session cannot see another session variable value." + <para> + A persistent database object that holds a value in session memory. This + value is private to each session and is released when the session ends. + Read or write access to session variables is controlled by privileges, + similar to other database objects. + </para> i do like this description in glossary.sgml. maybe we can copy it and put it to ddl.sgml "<sect1 id="ddl-session-variables"> ----------------<<<>>>>------------------- REVOKE [ GRANT OPTION FOR ] { { SELECT | UPDATE } [, ...] | ALL [ PRIVILEGES ] } ON { VARIABLE <replaceable>variable_name</replaceable> [, ...] | ALL VARIABLES IN SCHEMA <replaceable class="parameter">schema_name</replaceable> [, ...] } FROM { [ GROUP ] <replaceable class="parameter">role_specification</replaceable> | PUBLIC } [, ...] [ GRANTED BY <replaceable class="parameter">role_specification</replaceable> ] [ CASCADE | RESTRICT ] revoke, seems still not right. since with this, we can say: REVOKE ALL PRIVILEGES ON VARIABLE v1 FROM group group alice CASCADE; i think the correct one should be: REVOKE [ GRANT OPTION FOR ] { { SELECT | UPDATE } [, ...] | ALL [ PRIVILEGES ] } ON { VARIABLE <replaceable>variable_name</replaceable> [, ...] | ALL VARIABLES IN SCHEMA <replaceable class="parameter">schema_name</replaceable> [, ...] } FROM <replaceable class="parameter">role_specification</replaceable> [, ...] [ GRANTED BY <replaceable class="parameter">role_specification</replaceable> ] [ CASCADE | RESTRICT ] ----------------<<<>>>>------------------- <programlisting> CREATE VARIABLE public.current_user_id AS integer; GRANT READ ON VARIABLE public.current_user_id TO PUBLIC; LET current_user_id = (SELECT id FROM users WHERE usename = session_user); SELECT current_user_id; </programlisting> "GRANT READ" should be "GRANT SELECT". ----------------<<<>>>>------------------- doc/src/sgml/ref/alter_default_privileges.sgml GRANT { SELECT | UPDATE | ALL [ PRIVILEGES ] } ON VARIABLES TO { [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC } [, ...] [ WITH GRANT OPTION ] the above part is wrong? should be: GRANT { { SELECT | UPDATE } [,...] | ALL [ PRIVILEGES ] } ON VARIABLES TO { [ GROUP ] <replaceable class="parameter">role_name</replaceable> | PUBLIC } [, ...] [ WITH GRANT OPTION ] since we can: ALTER DEFAULT PRIVILEGES FOR ROLE alice IN SCHEMA svartest GRANT SELECT, UPDATE ON VARIABLES TO bob; ----------------<<<>>>>----------------------------- CREATE VARIABLE IF NOT EXISTS v2 AS comp; grant update on variable v2 to alice; set role alice; LET v2.a = 12; --acl permission error LET v2.b = 12; --acl permission error LET v2 = (11,12); --ok. not sure this is the desired behavior, for composite type variables, you are allowed to change all the values, but you are not allowed to update the field value of the composite. The following are normal table test update cases. create type comp as (a int, b int); create table t2(a comp); insert into t2 select '(11,12)'; grant update (a ) on t2 to alice; set role alice; update t2 set a.a = 13; --ok update t2 set a.b = 13; --ok update t2 set a = '(11,13)'; --ok ----------------<<<>>>>----------------------------- domain seems to have an issue. CREATE domain d1 AS int; CREATE VARIABLE var1 AS d1; let var1 = 3; --this should fail?. alter domain d1 add check (value <> 3); select var1; ERROR: value for domain d1 violates check constraint "d1_check" ----------------<<<>>>>----------------------------- doc/src/sgml/ref/alter_variable.sgml <title>Parameters</title> section, the order should be: name, new_owner, new_name, new_schema? I am beginning to look around 0002.
hi. review is based on v20241219-0002-Storage-for-session-variables-and-SQL-interface.patch and v20241219-0001-Enhancing-catalog-for-support-session-variables-and-.patch. in doc/src/sgml/catalogs.sgml <row> <entry role="catalog_table_entry"><para role="column_definition"> <structfield>defaclobjtype</structfield> <type>char</type> </para> <para> Type of object this entry is for: <literal>r</literal> = relation (table, view), <literal>S</literal> = sequence, <literal>f</literal> = function, <literal>T</literal> = type, <literal>n</literal> = schema </para></entry> </row> this need updated for session variable? psql meta command \ddp src/bin/psql/describe.c listDefaultACLs also need to change. ----------------<<>>------------------------- +-- show variable +SELECT public.svar; +SELECT public.svar.c; +SELECT (public.svar).*; + +-- the variable is shadowed, raise error +SELECT public.svar.c FROM public.svar; + +-- can be fixed by alias +SELECT public.svar.c FROM public.svar x The above tests are duplicated in session_variables.sql. ----------------<<>>------------------------- --- a/src/include/nodes/plannodes.h +++ b/src/include/nodes/plannodes.h @@ -49,7 +49,7 @@ typedef struct PlannedStmt NodeTag type; - CmdType commandType; /* select|insert|update|delete|merge|utility */ + CmdType commandType; /* select|let|insert|update|delete|merge|utility */ since we don't have CMD_LET CmdType. so this comment change is not necessary? ----------------<<>>------------------------- the following are simple tests that triggers makeParamSessionVariable error messages. we don't have related tests, we can add it on session_variables.sql. create type t1 as (a int, b int); CREATE VARIABLE v1 text; CREATE VARIABLE v2 as t1; select v1.a; select v2.c; also select v2.c; ERROR: could not identify column "c" in variable LINE 1: select v2.c; ^ the error message is no good. i think we want: ERROR: could not identify column "c" in variable "public.v1" ----------------<<>>------------------------- + /* + * Check for duplicates. Note that this does not really prevent + * duplicates, it's here just to provide nicer error message in common + * case. The real protection is the unique key on the catalog. + */ + if (SearchSysCacheExists2(VARIABLENAMENSP, + PointerGetDatum(varName), + ObjectIdGetDatum(varNamespace))) + { + } I am confused by these comments. i think the SearchSysCacheExists2 do actually prevent duplicates. I noticed that publication_add_relation also has similar comments. ----------------<<>>------------------------- typedef struct LetStmt { NodeTag type; List *target; /* target variable */ Node *query; /* source expression */ int location; } LetStmt; here, location should be a type of ParseLoc. ----------------<<>>------------------------- in 0001, 0002, function SetSessionVariableWithSecurityCheck never being used. ----------------<<>>------------------------- +/* + * transformLetStmt - + * transform an Let Statement + */ +static Query * +transformLetStmt(ParseState *pstate, LetStmt *stmt) ... + /* + * Save target session variable ID. This value is used multiple times: by + * the query rewriter (for getting related defexpr), by planner (for + * acquiring an AccessShareLock on target variable), and by the executor + * (we need to know the target variable ID). + */ + query->resultVariable = varid; "defexpr", do you mean "default expression"? Generally letsmt is like: "let variable = (SelectStmt)" is there nothing related to the DEFAULT expression? "(we need to know the target variable ID)." in ExecuteLetStmt that is true. but I commented out the following lines, the regress test still succeeded. extract_query_dependencies_walker /* record dependency on the target variable of a LET command */ // if (OidIsValid(query->resultVariable)) // record_plan_variable_dependency(context, query->resultVariable); ScanQueryForLocks // /* process session variables */ // if (OidIsValid(parsetree->resultVariable)) // { // if (acquire) // LockDatabaseObject(VariableRelationId, parsetree->resultVariable, // 0, AccessShareLock); // else // UnlockDatabaseObject(VariableRelationId, parsetree->resultVariable, // 0, AccessShareLock); // } ----------------<<>>------------------------- in transformLetStmt, we already did ACL privilege check, we don't need do it again at ExecuteLetStmt?